Use of the MULTINEST algorithm for gravitational wave
data analysis
Farhan Feroz1, Jonathan R. Gair2, Michael P. Hobson1 and Edward K. Porter3
1 Astrophysics Group, Cavendish Laboratory, JJ Thomson Avenue, Cambridge CB3 0HE, UK
2 Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
3 APC, UMR 7164, Université Paris 7 Denis Diderot, 10, rue Alice Domon et Léonie Duquet,
75205 Paris Cedex 13, France
Abstract.
We describe an application of the MULTINEST algorithm to gravitational wave
data analysis. MULTINEST is a multimodal nested sampling algorithm designed to efficiently
evaluate the Bayesian evidence and return posterior probability densities for likelihood
surfaces containing multiple secondary modes. The algorithm employs a set of ‘live’ points
which are updated by partitioning the set into multiple overlapping ellipsoids and sampling
uniformly from within them. This set of ‘live’ points climbs up the likelihood surface through
nested iso-likelihood contours and the evidence and posterior distributions can be recovered
from the point set evolution. The algorithm is model-independent in the sense that the specific
problem being tackled enters only through the likelihood computation, and does not change
how the ‘live’ point set is updated. In this paper, we consider the use of the algorithm for
gravitational wave data analysis by searching a simulated LISA data set containing two non-
spinning supermassive black hole binary signals. The algorithm is able to rapidly identify all
the modes of the solution and recover the true parameters of the sources to high precision.
PACS numbers: 04.25.Nx, 04.30.Db, 04.80.Cc
arXiv:0904.1544v2 [gr-qc] 23 Jul 2009
MULTINEST for gravitational wave data analysis2
1. Introduction
There is currently much active research into data analysis algorithms for gravitational wave
detectors, both on the ground and for the proposed space-based gravitational wave detector,
the Laser Interferometer Space Antenna (LISA) [1]. Research into LISA data analysis is
being encouraged by the Mock LISA Data Challenges [2] (MLDC), which so far have
included data sets containing individual and multiple white-dwarf binaries, a realisation of the
whole galaxy of compact binaries, single and multiple non-spinning supermassive black hole
(SMBH) binaries, either isolated or on top of a galactic confusion background and isolated
extreme-mass-ratio inspiral (EMRI) sources in purely instrumental noise. While most of
these Challenges have been solved in the sense that several groups returned solutions that
matched the injected parameters, the EMRI case caused particular difficulty [3, 4, 5, 6]. These
problems arose primarily because the likelihood surface contains many secondary maxima,
and so algorithms such as Markov Chain Monte Carlo (MCMC) tended to become stuck
on secondary maxima and were unable to find the primary mode. In Challenge 3 of the
MLDC, two new sources were introduced that have similar multi-modal likelihood surfaces:
spinning black hole binaries and cosmic string cusps [7]. The final analysis of the LISA data
will also have to contend with the presence of multiple overlapping sources simultaneously
present in the data stream. For these reasons, it is desirable to have algorithms that are able
to simultaneously identify and characterize all the modes of the likelihood surface. One
approach is to use an evolutionary algorithm, and a recent implementation of these ideas
is described in [8]. However, there are also existing tools that have been developed for other
applications that have the necessary features to be useful for the gravitational wave case. One
such algorithm is MULTINEST [9, 10], which is a multi-modal nested sampling algorithm.
The nested sampling algorithm [11] was developed as a tool for evaluating the Bayesian
evidence. It employs a set of ‘live’ points, each of which represents a particular set of
parameters in the multi-dimensional search space. These points move as the algorithm
progresses, and climb together through nested contours of increasing likelihood. At each step
the algorithm works by finding a point of higher likelihood than the lowest likelihood point in
the ‘live’ point set and then replacing the lowest likelihood point with the new point. Nested
sampling has been applied to the issue of model selection for ground based observations of
gravitational waves [12] and forms part of the evolutionary algorithm for LISA data analysis
discussed in [8]. However, the primary difficulty in using nested sampling is to efficiently
sample points of higher likelihood to allow the points to climb up the likelihood surface.
MULTINEST solves the problem of efficient sampling by using the current set of ‘live’
points as a model of the shape of the likelihood surface. The idea is to decompose the
set of ‘live’ points into a sequence of overlapping ellipsoids, and then sample from within
one of these ellipsoids which is chosen at random from the current set. The algorithm is
model-free in the sense that the evolution of the ‘live’ point set does not make use of any
knowledge about the properties of the likelihood surface. The only model-specific element
is the subroutine which computes the likelihood at a given point in parameter space. This
approach reduces much of the computational overhead associated with evaluating Fisher
Matrices etc. in order to move the ‘live’ points on the likelihood surface. MULTINEST
has proven to be able to efficiently find and separate modes of a likelihood surface in many
different applications [13, 14, 15, 16]. This suggests it may be a powerful tool for gravitational
wave data analysis as well, and in this paper we explore these possibilities using a search for
non-spinning SMBH binaries as a test case. Such sources are known to possess a degeneracy
in the likelihood, since at low frequency the response of the detector to the true sky position
and the point antipodal to it on the sky cannot be distinguished. We can therefore use these
sources as a test case to investigate the ability of MULTINEST to explore a multi-modal
likelihood in the context of gravitational wave detection.
The coalescences of supermassive black holes (SMBHs) in binaries are expected to be
a major source of gravitational waves (GW) for LISA, and we should be able to detect such
systems out to a redshift of z ∼ 10 with intrinsic parameter errors of a fraction of a percent
to a few percent and extrinsic parameter errors of a few to a few tens of percent [17, 18].
The detection problem for non-spinning SMBH binaries has already been solved, in the sense
that several groups have been able to successfully detect and recover parameters for such
systems both in controlled studies [18, 19] and in blind tests in the context of the MLDC.
Several algorithms have proven to be successful: Metropolis Hastings Monte Carlo
(MHMC) [18, 19], template-bank based searches [20, 21], time-frequency methods [22] and
hierarchical searches mixing time-frequency and MHMC techniques [23]. More recently,
the evolutionary algorithm has also been shown to be effective in searches for non-spinning
SMBH binaries [8]. Thus, although the application of MULTINEST to these systems will be
an interesting test case, it is not essential to the success of LISA. However, the real power of
the algorithm will be in its application to multi-modal likelihood surfaces for systems such
as EMRIs. The present work is an indication of the power of the technique and a stimulus
for further research on this algorithm in the context of gravitational wave data analysis. In
addition, the evidence values returned by MULTINEST can be used for model selection,
which will also be important for LISA. Bayesian model selection using MCMC techniques
has already been explored in the LISA context [24] for the case of white-dwarf binaries.
MULTINEST provides an alternative approach to addressing similar questions, although we
will not discuss that application of the algorithm in the present paper.
The paper is organised as follows. In section 2 we give a brief introduction to Bayesian
inference and then give details of the MULTINEST algorithm in section 3, including an
overview of nested sampling and a description of the ellipsoidal sampling scheme. In section 4
we briefly describe the waveform and noise models used in this analysis. In section 5 we
describe an application of the algorithm to a simultaneous search for two non-spinning black
hole binaries in a single data set. We provide results on the parameter recovery achieved by
the algorithm and compare these to the one sigma error estimates from the Fisher Matrix. We
finish in section 6 with a summary and a discussion of possible future applications of this
work.
2. Bayesian Inference
Our detection methodology is built upon the principles of Bayesian inference, and so we
begin by giving a brief summary of this framework. Bayesian inference methods provide a
consistent approach to the estimation of a set of parameters Θ in a model (or hypothesis) H
for the data D. Bayes’ theorem states that
$$\Pr(\Theta|D,H) = \frac{\Pr(D|\Theta,H)\,\Pr(\Theta|H)}{\Pr(D|H)}, \tag{1}$$
where Pr(Θ|D,H) ≡ P(Θ) is the posterior probability distribution of the parameters,
Pr(D|Θ,H) ≡ L(Θ) is the likelihood, Pr(Θ|H) ≡ π(Θ) is the prior distribution, and
Pr(D|H) ≡ Z is the Bayesian evidence.
Bayesian evidence is simply the factor required to normalise the posterior over Θ and is
given by:
$$Z = \int L(\Theta)\,\pi(\Theta)\,d^N\Theta, \tag{2}$$
where N is the dimensionality of the parameter space. Since the Bayesian evidence is
independent of the parameter values Θ, it is usually ignored in parameter estimation problems
and the posterior inferences are obtained by exploring the un–normalized posterior using
standard MCMC sampling methods.
Bayesian parameter estimation has been used quite extensively in a variety of
astronomical applications, including gravitational wave astronomy, although standard MCMC
methods, such as the basic Metropolis–Hastings algorithm or the Hamiltonian sampling
technique (see e.g. [27]), can experience problems in sampling efficiently from a multi–
modal posterior distribution or one with large (curving) degeneracies between parameters.
Moreover, MCMC methods often require careful tuning of the proposal distribution to sample
efficiently, and testing for convergence can be problematic.
In order to select between two models H_0 and H_1 one needs to compare their respective
posterior probabilities given the observed data set D, as follows:
$$\frac{\Pr(H_1|D)}{\Pr(H_0|D)} = \frac{\Pr(D|H_1)\Pr(H_1)}{\Pr(D|H_0)\Pr(H_0)} = \frac{Z_1}{Z_0}\,\frac{\Pr(H_1)}{\Pr(H_0)}, \tag{3}$$
where Pr(H1)/Pr(H0) is the prior probability ratio for the two models, which can often
be set to unity but occasionally requires further consideration (see, for example, [13, 14] for
the cases where the prior probability ratios should not be set to unity). It can be seen from
Eq. (3) that the Bayesian evidence plays a central role in Bayesian model selection. As the
average of likelihood over the prior, the evidence automatically implements Occam’s razor:
a simpler theory which agrees well enough with the empirical evidence is preferred. A more
complicated theory will only have a higher evidence if it is significantly better at explaining
the data than a simpler theory.
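The Occam penalty described above can be seen in a small numerical sketch. The setup is a toy assumption and not taken from this paper: a one-parameter Gaussian likelihood of width 0.1 is integrated against two hypothetical uniform priors of different widths, following Eq. (2), and the ratio Z_1/Z_0 of Eq. (3) is formed. The function name `evidence_uniform_prior` and all numerical values are illustrative only.

```python
import numpy as np

def evidence_uniform_prior(half_width, sigma=0.1, n_grid=200001):
    """Z = integral of L(theta) * pi(theta) for a uniform prior on [-half_width, half_width].

    Toy assumption: L(theta) is a unit-peak Gaussian of width sigma centred on 0.
    """
    theta = np.linspace(-half_width, half_width, n_grid)
    likelihood = np.exp(-0.5 * (theta / sigma) ** 2)
    prior = np.full_like(theta, 1.0 / (2.0 * half_width))
    integrand = likelihood * prior
    # Trapezium rule written out explicitly to avoid NumPy version differences.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(theta))

z_narrow = evidence_uniform_prior(half_width=1.0)   # hypothetical model H0, tight prior
z_wide = evidence_uniform_prior(half_width=10.0)    # hypothetical model H1, ten times wider prior
bayes_factor = z_wide / z_narrow                    # Z1/Z0 in Eq. (3)
```

Both toy models fit the data equally well at their best-fit point, so the evidence ratio is set entirely by how much prior volume each model spreads over parameter values the data rule out.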
Unfortunately, evaluation of Bayesian evidence involves the multidimensional integral
(Eq. (2)) and thus presents a challenging numerical task. Standard techniques like
thermodynamic integration [28, 24] are extremely computationally expensive which makes
evidence evaluation typically at least an order of magnitude more costly than parameter
estimation. Some fast approximate methods have been used for evidence evaluation, such
as treating the posterior as a multivariate Gaussian centred at its peak (see, for example,
[25]), but this approximation is clearly a poor one for highly non-Gaussian and multi–modal
posteriors. Various alternative information criteria for model selection are discussed in [26],
but the evidence remains the preferred method.
3. Nested Sampling and the MULTINEST Algorithm
Nested sampling [11] is a Monte Carlo method targeted at the efficient calculation of the
evidence, but also produces posterior inferences as a by-product. It calculates the evidence
by transforming the multi–dimensional evidence integral into a one–dimensional integral that
is easy to evaluate numerically. This is accomplished by defining the prior volume X as
dX = π(Θ) d^NΘ, so that
$$X(\lambda) = \int_{L(\Theta)>\lambda} \pi(\Theta)\,d^N\Theta, \tag{4}$$
where the integral extends over the region(s) of parameter space contained within the iso-
likelihood contour L(Θ) = λ. The evidence integral, Eq. (2), can then be written as
$$Z = \int_0^1 L(X)\,dX, \tag{5}$$
where L(X), the inverse of Eq. (4), is a monotonically decreasing function of X. Thus, if one
can evaluate the likelihoods L_i = L(X_i), where X_i is a sequence of decreasing values,
$$0 < X_M < \cdots < X_2 < X_1 < X_0 = 1, \tag{6}$$
as shown schematically in Fig. 1, the evidence can be approximated numerically using
standard quadrature methods as a weighted sum
$$Z = \sum_{i=1}^{M} L_i w_i, \tag{7}$$
where the weights w_i for the simple trapezium rule are given by w_i = (X_{i−1} − X_{i+1})/2. An
example of a posterior in two dimensions and its associated function L(X) is shown in Fig. 1.
Figure 1. Cartoon illustrating (a) the posterior of a two dimensional problem; and (b) the
transformed L(X) function where the prior volumes X_i are associated with each likelihood
L_i.
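The weighted sum of Eq. (7) is short enough to write out directly. The sketch below is an illustration under stated assumptions: the closing value X_{M+1} = 0 is a choice made here so that the last trapezium weight is defined, and the test function L(X) = 1 − X (whose integral over [0, 1] is exactly 1/2) and the geometric sequence of prior volumes are hypothetical inputs, not quantities from the paper.

```python
import numpy as np

def evidence_from_sequence(likelihoods, prior_volumes):
    """Eq. (7) with trapezium weights w_i = (X_{i-1} - X_{i+1}) / 2.

    likelihoods   : L_1 .. L_M
    prior_volumes : X_0 .. X_M (decreasing, X_0 = 1)
    Assumption: X_{M+1} = 0 closes the last interval.
    """
    x = np.append(np.asarray(prior_volumes, dtype=float), 0.0)   # X_0 .. X_{M+1}
    weights = 0.5 * (x[:-2] - x[2:])                              # w_1 .. w_M
    return np.sum(np.asarray(likelihoods, dtype=float) * weights)

# Hypothetical test sequence: geometrically shrinking prior volumes.
n_shrink, n_iter = 100, 2000
x_seq = np.exp(-np.arange(n_iter + 1) / n_shrink)   # X_0 .. X_M
l_seq = 1.0 - x_seq[1:]                             # toy L(X) = 1 - X at X_1 .. X_M
z_toy = evidence_from_sequence(l_seq, x_seq)        # close to the exact value 0.5
```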
3.1. Evidence Evaluation
In order to evaluate the evidence value given in Eq. (7), a set of N ‘live’ points is drawn
uniformly from the full prior π(Θ) with the initial prior volume X_0 set to 1 (here N denotes
the number of ‘live’ points rather than the dimensionality of the parameter space). The likelihood
value for each of the N ‘live’ points is calculated and the point with the minimum likelihood
value L_0 is removed from the ‘live’ point set and is replaced by another point sampled
uniformly from the prior with likelihood L > L_0. This results in the reduction in the prior
volume within the iso-likelihood contour by a factor t_1, i.e. X_1 = t_1 X_0, where t_1 follows the
distribution of the largest of N samples drawn uniformly from the interval [0,1], which is
given by Pr(t) = N t^{N−1}. This procedure is repeated at each subsequent iteration i, at which
the point with the lowest likelihood value L_i is replaced by a new point sampled uniformly
from the prior with L > L_i, until the entire prior volume has been traversed. The expected
value and the standard deviation of log t, which dominates the geometrical exploration, are
given by:
$$E[\log t] = -1/N, \qquad \sigma[\log t] = 1/N. \tag{8}$$
With the values of log t at different iterations being independent of each other, the prior
volume at iteration i is expected to be:
$$\log X_i \approx -(i \pm \sqrt{i})/N. \tag{9}$$
Thus, one takes
$$X_i = \exp(-i/N). \tag{10}$$
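The shrinkage statistics of Eq. (8) can be checked by direct simulation. The following sketch is only a sanity check under illustrative settings (50 ‘live’ points, 50000 trials, a fixed random seed, all chosen here): it draws the largest of N uniform deviates many times and compares the sample mean and standard deviation of log t with −1/N and 1/N.

```python
import numpy as np

rng = np.random.default_rng(0)
n_live, n_trials = 50, 50000

# t is the largest of n_live uniform deviates, i.e. Pr(t) = N t^(N-1).
t = rng.uniform(size=(n_trials, n_live)).max(axis=1)
log_t = np.log(t)

mean_log_t = log_t.mean()   # theory, Eq. (8): -1/N = -0.02
std_log_t = log_t.std()     # theory, Eq. (8):  1/N =  0.02
```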
3.2. Stopping Criterion
The primary aim of the nested sampling algorithm is to calculate the evidence value and so it
should be terminated when the evidence value has been calculated to the required accuracy. One
way to achieve this is by checking whether the change in the evidence value from one iteration
to the next is smaller than a specified tolerance, but in cases where the posterior contains
narrow peaks close to its maximum, this can result in an underestimated evidence value.
Skilling [11] provides a robust stopping condition by calculating an upper limit on the
evidence that can be determined from the set of current ‘live’ points. This limit is calculated by
assuming that all the ‘live’ points have the likelihood value equal to the maximum–likelihood
L_max, so that the largest evidence contribution that can be made by the remaining portion of
the posterior is ∆Z_i = L_max X_i. L_max can be taken to be the maximum likelihood value
amongst the ‘live’ points. At the beginning of the nested sampling algorithm, it is possible for
the high likelihood regions of the posterior not to have been explored adequately, resulting in
the maximum–likelihood value L_max amongst the ‘live’ points being a lot smaller than the
true maximum–likelihood value. The remaining prior volume X_i, on the other hand, would
be quite high and therefore the largest evidence contribution ∆Z_i = L_max X_i would be
high as well, allowing the algorithm to proceed. At subsequent stages, the remaining prior
volume X_i decreases exponentially as given by Eq. (10) and the high likelihood regions of
the posterior are explored in greater detail, with the estimated L_max getting closer and closer to
the true maximum–likelihood value. The largest evidence contribution ∆Z_i = L_max X_i thus
provides a robust stopping criterion.
We choose to stop when this quantity would no longer change the final evidence estimate
by 0.5 in log–evidence.
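The evidence accumulation and the stopping rule can be combined into a compact toy sampler. This is a minimal sketch under loudly stated assumptions, not the MULTINEST implementation: the likelihood is a hypothetical two-dimensional unit Gaussian, the prior is uniform on a square, new points are found by naive rejection sampling from the full prior, and the final ‘live’ points are assumed to share the remaining prior volume equally. With this setup the exact evidence is very close to 1/100.

```python
import numpy as np

def log_likelihood(theta):
    # Toy assumption: normalised unit-variance Gaussian in two dimensions.
    return -0.5 * np.sum(theta ** 2) - np.log(2.0 * np.pi)

def nested_sampling(loglike, ndim=2, half_width=5.0, n_live=200, tol=0.5, seed=1):
    """Basic nested sampling with a uniform prior on [-half_width, half_width]^ndim."""
    rng = np.random.default_rng(seed)
    live = rng.uniform(-half_width, half_width, size=(n_live, ndim))
    live_logl = np.array([loglike(p) for p in live])
    z = 0.0
    i = 0
    while True:
        i += 1
        worst = int(np.argmin(live_logl))
        # Trapezium weight with X_i = exp(-i / n_live), Eq. (10).
        w_i = 0.5 * (np.exp(-(i - 1) / n_live) - np.exp(-(i + 1) / n_live))
        z += np.exp(live_logl[worst]) * w_i
        # Naive replacement: draw from the full prior until L > L_i.
        while True:
            candidate = rng.uniform(-half_width, half_width, size=ndim)
            candidate_logl = loglike(candidate)
            if candidate_logl > live_logl[worst]:
                break
        live[worst] = candidate
        live_logl[worst] = candidate_logl
        # Stop once L_max * X_i can no longer shift log Z by more than tol.
        x_i = np.exp(-i / n_live)
        if np.log1p(np.exp(live_logl.max()) * x_i / z) < tol:
            break
    # Assumption: the remaining live points share the final prior volume equally.
    z += x_i * np.mean(np.exp(live_logl))
    return z, i

z_est, n_iter = nested_sampling(log_likelihood)
```

The rejection step here is exactly the naive approach whose acceptance rate falls with the prior volume; it is tolerable only because the toy problem stops at a modest compression.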
3.3. Posterior Inferences
Although the nested sampling algorithm has been designed to calculate the evidence value,
accurate posterior inferences can also be generated as a by-product. Once the evidence Z is
found, each one of the final ‘live’ points and the discarded points from the nested sampling
process, i.e., the points with the lowest likelihood value at each iteration i of the algorithm, is
simply assigned the probability weight
$$p_i = \frac{L_i w_i}{Z}. \tag{11}$$
These samples can then be used to calculate inferences of posterior parameters such as
means, standard deviations, covariances and so on, or to construct marginalised posterior
distributions.
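As a small illustration of Eq. (11), the sketch below turns a set of nested-sampling points into a weighted mean and standard deviation. The function name `posterior_summary` and the three-point input are hypothetical; the evidence is taken to be the sum of L_i w_i over the supplied points.

```python
import numpy as np

def posterior_summary(samples, likelihoods, weights):
    """Weighted mean and standard deviation using p_i = L_i w_i / Z, Eq. (11)."""
    samples = np.asarray(samples, dtype=float)
    lw = np.asarray(likelihoods, dtype=float) * np.asarray(weights, dtype=float)
    p = lw / lw.sum()                      # lw.sum() plays the role of the evidence Z
    mean = np.sum(p[:, None] * samples, axis=0)
    var = np.sum(p[:, None] * (samples - mean) ** 2, axis=0)
    return p, mean, np.sqrt(var)

# Hypothetical one-parameter example with three nested-sampling points.
samples = np.array([[0.0], [1.0], [2.0]])
p, mean, std = posterior_summary(samples, likelihoods=[1.0, 2.0, 1.0],
                                 weights=[0.5, 0.25, 0.5])
```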
3.4. MULTINEST Algorithm
The most challenging task in implementing the nested sampling algorithm is drawing samples
from the prior within the hard constraint L > L_i at each iteration i. Employing a
naive approach that draws blindly from the prior would result in a steady decrease in the
acceptance rate of new samples with decreasing prior volume (and increasing likelihood). The
MULTINEST algorithm [9, 10] tackles this problem through an ellipsoidal rejection sampling
scheme: the ‘live’ point set is enclosed within a set of (possibly overlapping) ellipsoids
and a new point is then drawn uniformly from the region enclosed by these ellipsoids. The
number of points in an individual ellipsoid and the total number of ellipsoids are decided by
an ‘expectation–maximization’ algorithm so that the total sampling volume, which is equal
to the sum of volumes of the ellipsoids, is minimized. This allows maximum flexibility and
efficiency by breaking up a mode resembling a Gaussian into a relatively small number of
ellipsoids, and if the posterior mode possesses a pronounced curving degeneracy so that it
more closely resembles a (multi–dimensional) ‘banana’ then it is broken into a relatively
large number of small ‘overlapping’ ellipsoids (see Fig. 2). With enough ‘live’ points, this
approach allows the detection of all the modes simultaneously, resulting in typically two
orders of magnitude improvement in efficiency and accuracy over standard methods for
inference problems in cosmology and particle physics phenomenology (see, for example,
[13, 14, 15, 16]).
Figure 2. Illustrations of the ellipsoidal decompositions performed by MULTINEST. The
points given as input are overlaid on the resulting ellipsoids. 1000 points were sampled
uniformly from: (a) two non-intersecting ellipsoids; and (b) a torus.
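The basic building block of the ellipsoidal scheme, drawing uniformly from one ellipsoid that encloses a set of points, can be sketched as follows. This is a simplified illustration and not the MULTINEST partitioning: it fits a single ellipsoid from the sample covariance, with an enlargement factor of 1.1 chosen here as an assumption, and it omits the expectation–maximization step that splits the points among several ellipsoids. The function names and the correlated Gaussian test cloud are hypothetical.

```python
import numpy as np

def bounding_ellipsoid(points, enlarge=1.1):
    """Return (centre, A) such that every point obeys (x-c)^T A^{-1} (x-c) <= 1."""
    centre = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    diff = points - centre
    # Squared Mahalanobis distance of every point from the centre.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    # Scale the covariance so the farthest point is enclosed, then enlarge.
    return centre, cov * d2.max() * enlarge ** 2

def sample_ellipsoid(centre, a, n, rng):
    """Draw n points uniformly from the ellipsoid defined by (centre, a)."""
    ndim = len(centre)
    z = rng.normal(size=(n, ndim))
    z /= np.linalg.norm(z, axis=1, keepdims=True)       # uniform directions
    z *= rng.uniform(size=(n, 1)) ** (1.0 / ndim)       # uniform radii in the unit ball
    chol = np.linalg.cholesky(a)                        # a = chol @ chol.T
    return centre + z @ chol.T

rng = np.random.default_rng(0)
# Hypothetical 'live' points: a correlated two-dimensional Gaussian cloud.
live = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], size=300)
centre, a = bounding_ellipsoid(live)
new_points = sample_ellipsoid(centre, a, 1000, rng)
```

In a nested sampling step, a point drawn this way would still have to pass the hard constraint L > L_i before replacing the lowest-likelihood ‘live’ point.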
The ellipsoidal decomposition scheme described above also provides a mechanism for
mode identification. By forming chains of overlapping ellipsoids (enclosing the ‘live’ points),
the algorithm can identify distinct modes with distinct ellipsoidal chains, e.g., in Fig. 2 panel
(a) the algorithm identifies two distinct modes while in panel (b) the algorithm identifies only
one mode as all the ellipsoids are linked with each other because of the overlap between them.
Once distinct modes have been identified, they are evolved independently.
Another feature of the MULTINEST algorithm is the evaluation of the global as well as
the ‘local’ evidence values associated with each mode. These evidence values can be used
in calculating the probability that an identified ‘local’ peak in the posterior corresponds to a
real object (see, for instance, [12, 13, 14]). We defer the discussion of quantifying the SMBH
detection to a later work.
4. Problem Description
4.1. Waveform model
The waveform from a binary composed of two non-spinning black holes depends on nine
parameters: $\vec{\lambda}$ = {ln(M_c), ln(µ), θ, φ, ln(t_c), ι, ϕ_c, ln(D_L), ψ}, where M_c is the chirp mass,
µ is the reduced mass, (θ, φ) are the sky location of the source, t_c is the time-to-coalescence,
ι is the inclination of the orbit of the binary, ϕ_c is the phase of the GW at coalescence, D_L
is the luminosity distance and ψ is the polarization of the GW measured in the Solar System
barycentre (SSB). ψ is the angle between the principal polarisation axes of the source and the
“natural” polarisation axes to use in the SSB, which are perpendicular to the line of sight to
the source and the orbital angular momentum of the binary. We model the waveform using the
restricted post-Newtonian approximation, in which the two polarizations of the GW are [29]
$$h_+ = \frac{2Gm\eta}{c^2 D_L}\left(1+\cos^2\iota\right)\left(\frac{Gm\omega}{c^3}\right)^{2/3}\cos(\Phi), \tag{12}$$
$$h_\times = -\frac{4Gm\eta}{c^2 D_L}\cos\iota\left(\frac{Gm\omega}{c^3}\right)^{2/3}\sin(\Phi), \tag{13}$$
where m = m_1 + m_2 is the total mass of the binary, η = m_1 m_2/m^2 is the reduced mass
ratio, G is Newton’s constant and c is the speed of light. The chirp mass and reduced mass
are given in terms of m and η as M_c = m η^{3/5} and µ = mη. The function ω = dΦ_orb/dt is
the orbital frequency, and Φ = ϕ_c − ϕ(t) = 2Φ_orb is the gravitational wave phase. We take
these to 2PN order
$$\omega(t) = \frac{c^3}{8Gm}\left[\Theta^{-3/8} + \left(\frac{743}{2688}+\frac{11}{32}\eta\right)\Theta^{-5/8} - \frac{3\pi}{10}\Theta^{-3/4} + \left(\frac{1855099}{14450688}+\frac{56975}{258048}\eta+\frac{371}{2048}\eta^2\right)\Theta^{-7/8}\right], \tag{14}$$
$$\Phi(t) = \varphi_c - \frac{2}{\eta}\left[\Theta^{5/8} + \left(\frac{3715}{8064}+\frac{55}{96}\eta\right)\Theta^{3/8} - \frac{3\pi}{4}\Theta^{1/4} + \left(\frac{9275495}{14450688}+\frac{284875}{258048}\eta+\frac{1855}{2048}\eta^2\right)\Theta^{1/8}\right], \tag{15}$$
where
$$\Theta(t;t_c) = \frac{c^3\eta}{5Gm}(t_c - t). \tag{16}$$
The above model describes the inspiral only and not the merger or ringdown phase. To avoid
artifacts when taking Fourier transforms, we follow the approach introduced in [18] and now
used in the MLDC [2] and continue the inspiral until the orbit reaches the innermost stable
circular orbit at 6M, but employ a hyperbolic taper from 7M to ensure the waveform goes
smoothly to zero as the end of the inspiral approaches. The taper has the value of 1 for
r > 7M and then smoothly goes to zero at r = 6M, where the waveform finishes. To
implement the LISA response, we use the low-frequency approximation, as described in [30].
In the low-frequency limit we expect there to be bright modes in the likelihood at both the
true sky position and at a position approximately antipodal to it, as mentioned earlier. The
antipodal position corresponds to the shift θ → π − θ, φ → φ + π. The ‘antipodal’ mode
of the likelihood is strictly antipodal at low frequencies, but moves as the gravitational wave
frequency increases.
The four parameters {D_L, ι, ϕ_c, ψ} are extrinsic parameters which only affect how the
gravitational waveform phase at the detector projects into a detector response. It is possible to
search over these automatically using a generalisation of the F-statistic [31]. Details of how
the F-statistic is computed for non-spinning SMBH binaries may be found in [18]. We make
use of the F-statistic maximization in the first stage of our search.
A gravitational wave search is sensitive to redshifted masses, $\bar{m}_1 = (1 + z)m_1$, rather
than the intrinsic masses. All our subsequent results will be quoted for redshifted mass
quantities, which we will denote with an overbar for clarity.
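The 2PN expressions of Eqs. (14)–(16) translate directly into code. The sketch below is an illustration, not the authors' implementation: masses are given in solar masses and times in seconds, the constant G M_sun/c^3 ≈ 4.925491 × 10^−6 s and the Julian year are values assumed here, and the function names are hypothetical. The example evaluates source-1-like redshifted masses (2 × 10^7 and 2 × 10^6 solar masses) and checks numerically that dΦ/dt = 2ω, as required by Φ = 2Φ_orb.

```python
import numpy as np

T_SUN = 4.925491e-6    # G * M_sun / c^3 in seconds (assumed value)
YEAR = 3.15576e7       # Julian year in seconds (assumed convention)

def theta_pn(t, tc, m, eta):
    """Dimensionless time variable of Eq. (16); m in solar masses, times in seconds."""
    return eta * (tc - t) / (5.0 * m * T_SUN)

def omega_2pn(t, tc, m, eta):
    """Orbital angular frequency in rad/s to 2PN order, Eq. (14)."""
    th = theta_pn(t, tc, m, eta)
    series = (th ** -0.375
              + (743.0 / 2688.0 + 11.0 / 32.0 * eta) * th ** -0.625
              - 0.3 * np.pi * th ** -0.75
              + (1855099.0 / 14450688.0 + 56975.0 / 258048.0 * eta
                 + 371.0 / 2048.0 * eta ** 2) * th ** -0.875)
    return series / (8.0 * m * T_SUN)

def phase_2pn(t, tc, m, eta, phi_c=0.0):
    """Gravitational wave phase to 2PN order, Eq. (15)."""
    th = theta_pn(t, tc, m, eta)
    series = (th ** 0.625
              + (3715.0 / 8064.0 + 55.0 / 96.0 * eta) * th ** 0.375
              - 0.75 * np.pi * th ** 0.25
              + (9275495.0 / 14450688.0 + 284875.0 / 258048.0 * eta
                 + 1855.0 / 2048.0 * eta ** 2) * th ** 0.125)
    return phi_c - 2.0 / eta * series

# Redshifted masses resembling source 1 of this study.
m1, m2 = 2.0e7, 2.0e6
m_tot = m1 + m2
eta = m1 * m2 / m_tot ** 2
tc = 0.90 * YEAR
t_mid = 0.50 * YEAR

omega_mid = omega_2pn(t_mid, tc, m_tot, eta)
omega_late = omega_2pn(0.80 * YEAR, tc, m_tot, eta)
# Central finite difference of the phase, to be compared with 2 * omega.
dt = 10.0
dphase_dt = (phase_2pn(t_mid + dt, tc, m_tot, eta)
             - phase_2pn(t_mid - dt, tc, m_tot, eta)) / (2.0 * dt)
```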
4.2. Likelihood evaluation
The gravitational waveform signals can be thought of as occupying a vector space, on which
there is a natural scalar product [32, 33]
$$\langle h|s\rangle = 2\int_0^\infty \frac{df}{S_n(f)}\left[\tilde{h}(f)\tilde{s}^*(f) + \tilde{h}^*(f)\tilde{s}(f)\right], \tag{17}$$
where
$$\tilde{h}(f) = \int_{-\infty}^{\infty} dt\, h(t)\, e^{2\pi\imath f t} \tag{18}$$
is the Fourier transform of the time domain waveform h(t). The quantity S_n(f) is the one-
sided noise spectral density of the detector. For a family of sources with waveforms $h(t;\vec{\lambda})$
that depend on parameters $\vec{\lambda}$, the output of the detector, $s(t) = h(t;\vec{\lambda}_0) + n(t)$, consists of
the true signal $h(t;\vec{\lambda}_0)$ and a particular realisation of the noise, n(t). Assuming that the noise
is stationary and Gaussian, the logarithm of the likelihood that the parameter values are given
by $\vec{\lambda}$ is
$$\log L(\vec{\lambda}) = C - \left\langle s - h(\vec{\lambda}) \,\middle|\, s - h(\vec{\lambda}) \right\rangle / 2, \tag{19}$$
and it is this log-likelihood that is evaluated at each ‘live’ point in the search. The constant,
C, depends on the dimensionality of the search space, but its value is not important as we are
only interested in the relative likelihoods of different points.
In section 5 we will compare the errors in our recovered parameters to the theoretical
noise-induced errors. The latter can be computed as $\sigma_i = \sqrt{(\Gamma^{-1})_{ii}}$ where
$$\Gamma_{ij} = \left\langle \frac{\partial h}{\partial\lambda_i} \,\middle|\, \frac{\partial h}{\partial\lambda_j} \right\rangle \tag{20}$$
is the Fisher Information Matrix (FIM).
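A discrete version of Eqs. (17), (19) and (20) can be sketched with a fast Fourier transform. Everything specific in this example is an assumption made for illustration: the signal is a toy sinusoid with amplitude and phase as its only parameters, the noise spectral density is white, derivatives are taken by central finite differences, and the function names are hypothetical. Since the bracket in Eq. (17) equals twice the real part of the product of one transform with the conjugate of the other, the sum carries an overall factor of 4.

```python
import numpy as np

def inner_product(h, s, dt, psd):
    """Discrete form of Eq. (17): 4 * df * sum( Re[h~ s~*] / S_n ).

    Approximation: the zero-frequency and Nyquist bins are weighted like all others.
    """
    h_f = dt * np.fft.rfft(h)
    s_f = dt * np.fft.rfft(s)
    df = 1.0 / (len(h) * dt)
    return 4.0 * df * np.sum((h_f * np.conj(s_f)).real / psd)

def log_likelihood(s, h, dt, psd):
    """Eq. (19) with the constant C dropped."""
    r = s - h
    return -0.5 * inner_product(r, r, dt, psd)

def fisher_matrix(model, params, dt, psd, step=1e-5):
    """Eq. (20), with waveform derivatives from central finite differences."""
    params = np.asarray(params, dtype=float)
    derivs = []
    for i in range(len(params)):
        delta = np.zeros_like(params)
        delta[i] = step
        derivs.append((model(params + delta) - model(params - delta)) / (2.0 * step))
    return np.array([[inner_product(a, b, dt, psd) for b in derivs] for a in derivs])

# Toy setup (assumptions): 1024 samples, unit cadence, white noise with S_n = 2.
n, dt = 1024, 1.0
t = np.arange(n) * dt
f0 = 16.0 / (n * dt)
psd = np.full(n // 2 + 1, 2.0)

def model(p):
    # Hypothetical two-parameter signal: amplitude p[0] and phase p[1].
    return p[0] * np.sin(2.0 * np.pi * f0 * t + p[1])

true_params = [2.0, 0.3]
h_true = model(true_params)
gamma = fisher_matrix(model, true_params, dt, psd)
sigma = np.sqrt(np.diag(np.linalg.inv(gamma)))   # sigma_i = sqrt((Gamma^-1)_ii)
```

For this toy signal the Fisher errors have closed forms, 1/sqrt(512) for the amplitude and 1/sqrt(2048) for the phase, which makes the sketch easy to verify.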
4.3. Noise model
The noise consists of an instrumental part, which we take from [34]
$$S_n^{\rm inst}(f) = \frac{1}{4L^2}\left[2S_n^{\rm pos}(f)\left(2+\left(\frac{f}{f_*}\right)^2\right) + 8S_n^{\rm acc}(f)\left(1+\cos^2\left(\frac{f}{f_*}\right)\right)\left(\frac{1}{(2\pi f)^4}+\frac{(2\pi\,10^{-4}\,{\rm Hz})^2}{(2\pi f)^6}\right)\right], \tag{21}$$
where L = 5 × 10^6 km is the arm-length for LISA, S_n^pos(f) = 4 × 10^−22 m^2/Hz and
S_n^acc(f) = 9 × 10^−30 m^2/s^4/Hz are the position and acceleration noise respectively. S_n^inst(f)
has units of Hz^−1. The quantity f_* = 1/(2πL) is the mean transfer frequency for the LISA
arms. In addition, there is a noise contribution from the confusion foreground of galactic
compact binaries. We represent this using the model of [35, 36]
$$S_n^{\rm conf}(f) = \begin{cases}
10^{-44.62}\,(f/{\rm Hz})^{-2.3}, & 10^{-4} < (f/{\rm Hz}) \le 10^{-3} \\
10^{-50.92}\,(f/{\rm Hz})^{-4.4}, & 10^{-3} < (f/{\rm Hz}) \le 10^{-2.7} \\
10^{-62.8}\,(f/{\rm Hz})^{-8.8}, & 10^{-2.7} < (f/{\rm Hz}) \le 10^{-2.4} \\
10^{-89.68}\,(f/{\rm Hz})^{-20}, & 10^{-2.4} < (f/{\rm Hz}) \le 10^{-2}
\end{cases}\ {\rm Hz}^{-1}, \tag{22}$$
Source  m1/M⊙     m2/M⊙     tc/yrs  θ       φ       z    ι       ψ       ϕc
1       1 × 10^7  1 × 10^6  0.90    0.6283  4.7124  1.0  1.1120  1.2330  2.2220
2       4 × 10^6  1 × 10^6  1.02    2.1206  3.9429  0.8  0.6565  1.0646  1.5116
Table 1. Parameter values for the two SMBHBs considered in this study. The dual-TDI
channel SNRs for the sources are 200 and 131 respectively.
The total noise is the sum of these contributions. At low frequencies, the LISA detector can
be thought to consist of two independent right-angle interferometers with independent noise
described by the same power spectral density. The total likelihood is obtained by summing
the scalar product (17) over the two detectors, and the FIM of the network is similarly given
by the sum of the two FIMs.
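The two noise contributions can be coded as plain functions of frequency. The sketch below follows the instrumental and confusion expressions of this section as read from the text, with three assumptions made here: SI units are used so that the transfer frequency becomes f_* = c/(2πL) with the factor of c restored, the confusion term is set to zero outside the tabulated band, and frequencies are strictly positive. Function names are hypothetical.

```python
import numpy as np

C_LIGHT = 299792458.0                              # speed of light, m/s
ARM_LENGTH = 5.0e9                                 # L = 5e6 km, in metres
F_STAR = C_LIGHT / (2.0 * np.pi * ARM_LENGTH)      # transfer frequency, c restored (assumption)
S_POS = 4.0e-22                                    # position noise, m^2 / Hz
S_ACC = 9.0e-30                                    # acceleration noise, m^2 / s^4 / Hz

def s_inst(f):
    """Instrumental noise, Eq. (21), in 1/Hz for frequencies f > 0 in Hz."""
    f = np.atleast_1d(np.asarray(f, dtype=float))
    x = f / F_STAR
    acc_shape = 1.0 / (2.0 * np.pi * f) ** 4 \
        + (2.0 * np.pi * 1.0e-4) ** 2 / (2.0 * np.pi * f) ** 6
    pos_term = 2.0 * S_POS * (2.0 + x ** 2)
    acc_term = 8.0 * S_ACC * (1.0 + np.cos(x) ** 2) * acc_shape
    return (pos_term + acc_term) / (4.0 * ARM_LENGTH ** 2)

def s_conf(f):
    """Galactic confusion noise, Eq. (22), in 1/Hz; zero outside the band (assumption)."""
    f = np.atleast_1d(np.asarray(f, dtype=float))
    conditions = [
        (f > 1.0e-4) & (f <= 1.0e-3),
        (f > 1.0e-3) & (f <= 10.0 ** -2.7),
        (f > 10.0 ** -2.7) & (f <= 10.0 ** -2.4),
        (f > 10.0 ** -2.4) & (f <= 1.0e-2),
    ]
    choices = [
        10.0 ** -44.62 * f ** -2.3,
        10.0 ** -50.92 * f ** -4.4,
        10.0 ** -62.8 * f ** -8.8,
        10.0 ** -89.68 * f ** -20.0,
    ]
    return np.select(conditions, choices, default=0.0)

def s_total(f):
    """Total noise: sum of the instrumental and confusion contributions."""
    return s_inst(f) + s_conf(f)

freqs = np.logspace(-4.5, -1.5, 7)
total = s_total(freqs)
```

The exponents in Eq. (22) are such that the four power-law segments join continuously at the three interior break frequencies, which gives a quick consistency check on the coefficients.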
5. Results of MULTINEST Search
To test the MULTINEST algorithm we generated a data set containing two SMBH binaries,
one of which coalesced during the observation time, and one of which coalesced a few days
after the end of the observation. We took these sources to have the same parameters as the
two sources searched for by the evolutionary algorithm in [8] to facilitate direct comparison
between the two techniques. The parameters of the two sources are summarised in Table 1.
We used a standard flat WMAP cosmology, with radiation, matter and dark energy densities
(Ω_R, Ω_M, Ω_Λ) = (4.9 × 10^−5, 0.27, 0.73) and H_0 = 71 km/s/Mpc [37]. The redshifts of the
sources therefore correspond to luminosity distances of 6.634 Gpc and 5.024 Gpc respectively.
The redshifted chirp mass and reduced mass are ($\bar{M}_c$, $\bar{\mu}$) = (4.9289 × 10^6, 1.8182 × 10^6) M⊙
for source 1, and (2.997 × 10^6, 1.44 × 10^6) M⊙ for source 2, and the full, dual-TDI channel,
signal-to-noise ratios (SNRs) are 200 and 131 respectively.
For the first stage of the search, we included the F-statistic maximization in the
likelihood evaluation and used MULTINEST with 1000 ‘live’ points to search only the five
dimensional space of intrinsic parameters ($\bar{m}_1$, $\bar{m}_2$, θ, φ, t_c). We used wide priors for this
search of $\bar{m}_1$, $\bar{m}_2$ ∈ [5 × 10^5, 3 × 10^7] M⊙, t_c ∈ [0.85, 1.1] yrs, θ ∈ [0, π] and φ ∈ [0, 2π].
In general, the number of ‘live’ points required for a search is problem specific, depending
on the dimensionality of the search space, and on the complexity of the likelihood surface.
Therefore, the ideal number must be found by trial and error. Using too few ‘live’ points leads
to under-sampling of the posterior. It is normally clear from the posterior distributions whether
the likelihood has been properly sampled or not. Using more ‘live’ points than necessary does
not affect the results, but is computationally more expensive. The choice of 1000 in this
case was based on previous experience in other contexts, but appeared to work well. The
algorithm returned 11 possible modes and, based on the time of coalescence, it was clear that
five were associated with the coalescing source, and six with the non-coalescing source. Four
modes were the same secondaries found using the evolutionary algorithm [8], each of which
corresponded to mass and spin values close to those of one of the sources, but had sky position
in the vicinity of one of the two sky solutions of the other source. Of the three additional
secondaries identified in the likelihood, one was associated with the coalescing source, and
was very close to the true values of the masses and t_c, but was also in the wrong position on the
sky. The other two secondaries were associated with the non-coalescing source and had mass
and t_c parameters that were relatively close to the true values, although many Fisher Matrix
σ away, but had incorrect sky locations. It is likely that all of these secondaries correspond
to parameter space points for which the waveforms match the brighter waveform cycles that
come toward the end of the inspiral, but not the earlier part of the waveform. This stage of the
search took approximately one and a half hours to run on a single 3.0 GHz Intel Woodcrest
processor, and used ∼ 170,000 likelihood evaluations.
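The redshifted chirp and reduced masses quoted above follow from the Table 1 entries through the relations M_c = m η^{3/5} and µ = mη together with the redshift scaling (1 + z) of the masses. The short sketch below reproduces that arithmetic; the function name is hypothetical.

```python
def redshifted_mass_parameters(m1, m2, z):
    """Return the redshifted chirp mass and reduced mass, in the units of m1 and m2."""
    m1_z, m2_z = (1.0 + z) * m1, (1.0 + z) * m2
    m_tot = m1_z + m2_z
    eta = m1_z * m2_z / m_tot ** 2
    return m_tot * eta ** 0.6, m_tot * eta

# Table 1 values, in solar masses.
mc1, mu1 = redshifted_mass_parameters(1.0e7, 1.0e6, 1.0)   # source 1
mc2, mu2 = redshifted_mass_parameters(4.0e6, 1.0e6, 0.8)   # source 2
```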
In Table 2 we list the mean and standard deviation of the posteriors recovered by the
algorithm in this stage, for the two true and the two antipodal modes of the sources. In Table 3
we assess the error in parameter recovery as a multiple of the theoretical error, computed from
the Fisher matrix. This is one of the standard approaches used to assess entries to the LISA
Mock Data Challenge and so we include it for easier comparison to other algorithms discussed
in the literature. The posterior distributions are not Gaussian, which is assumed for the Fisher
matrix estimate, so we do not necessarily expect these to be an accurate indication of the
true error. In practice, we will estimate the uncertainty in the parameter estimation from
the recovered posterior and, in all cases, the true solution lies within 1-σ of the peak of the
recovered posterior, when σ is computed from the posterior in that way. We quote results in
Table 3 for the two true solutions, and for the two antipodal sky solutions. Errors are quoted
for the redshifted chirp mass, ¯
Mc, and reduced mass, ¯ µ, rather than ¯
former are used more conventionally in the literature and this therefore facilitates comparison
to other work, in particular the evolutionary algorithm [8].
We see that the algorithm recovered all of the intrinsic parameters to extremely high
precision, being within 1σ of the true parameters for all but the antipodal sky solution of the
second source. As mentioned earlier, the ‘antipodal’ solution is no longer exactly antipodal
on the sky at higher frequencies. Thus, the apparently larger errors in the parameters for the
antipodal solution do not mean that the algorithm failed to find the true peak of the likelihood,
but merely that the secondary is shifted from the precisely antipodal position. These results
can be directly compared to results obtained using the evolutionary algorithm in [8]. The
MULTINEST results are slightly better than those of the evolutionary algorithm, although in
both cases the results were within the error we would expect from noise fluctuations and so
this difference could easily be due to a difference in the noise realisation used to generate
the data. We note that the evolutionary algorithm also recovered parameters for the antipodal
solution of the non-coalescing source that were about 4σ from the true parameters, which is
consistent with the prior explanation for that difference. Presently, the MULTINEST algorithm
runs about five to ten times faster than the evolutionary algorithm, but the latter has not yet
been optimized and further work is needed to understand how the efficiency and speed of both
algorithms will scale when they are applied to other types of gravitational wave source.
The fact that the intrinsic parameters of the coalescing source are all so well determined
is due to a combination of the particular noise realisation used and the strong correlations
between the intrinsic parameters. The correlations ensure that if one of the parameters is well-
determined then all of the parameters will be well-determined. The results seem surprising
because the error is so low in all of the parameters, which, if the parameters were uncorrelated,
would require an unusually low error in five independent variables. However, in reality we only
require an unusually low error in one parameter and then correlations force all the errors to
be small. Over many noise realisations we would expect the errors in the intrinsic parameters
to wander over the range shown in the posteriors, but this wandering would be correlated
between the different intrinsic parameters.
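The correlation argument above can be illustrated with a small Monte Carlo sketch. This is a toy model only: it assumes five unit-variance Gaussian errors sharing a single common pairwise correlation `rho`, and the threshold of 0.3σ is an arbitrary choice made for illustration. Neither the correlation value nor the threshold is taken from the analysis in this paper.

```python
import random

def fraction_all_small(rho, n_params=5, threshold=0.3, n_trials=20000, seed=1):
    """Fraction of noise realisations in which every parameter error is
    below `threshold` (in units of sigma), for a common pairwise
    correlation `rho` between the errors (toy model)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # component common to all parameters
        errors = [
            (rho ** 0.5) * shared + ((1.0 - rho) ** 0.5) * rng.gauss(0.0, 1.0)
            for _ in range(n_params)
        ]
        if all(abs(e) < threshold for e in errors):
            hits += 1
    return hits / n_trials

uncorrelated = fraction_all_small(rho=0.0)
correlated = fraction_all_small(rho=0.99)
print(f"uncorrelated: {uncorrelated:.4f}, strongly correlated: {correlated:.4f}")
```

In this toy model, small errors in all five parameters at once are rare when the errors are independent, but far more common when they are strongly correlated, since a single favourable draw of the shared component is then enough.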
For the second stage of the search, we ran MULTINEST on the full nine-dimensional
parameter space. We ran a separate analysis, each with 500 ‘live’ points, in the vicinity of
each of the modes identified during the first stage of the search, taking priors on the intrinsic
parameters that were λ ∈ [λ̃ − 5σ, λ̃ + 5σ], where λ̃ and σ were the mean and standard deviation
Solution   tc/yrs              m̄1/M⊙                     m̄2/M⊙                     θ                  φ
1          0.9000 ± 1 × 10⁻⁵   (2.0005 ± 0.0016) × 10⁷   (1.9995 ± 0.0014) × 10⁶   0.6269 ± 0.0047    4.7312 ± 0.0092
1A         0.9000 ± 1 × 10⁻⁵   (2.0015 ± 0.0014) × 10⁷   (1.9985 ± 0.0012) × 10⁶   2.5131 ± 0.0048    1.5892 ± 0.0087
2          1.0200 ± 2 × 10⁻⁴   (7.2743 ± 0.0740) × 10⁶   (1.7852 ± 0.0147) × 10⁶   2.1267 ± 0.0060    3.9480 ± 0.0076
2A         1.0200 ± 2 × 10⁻⁴   (7.5472 ± 0.0727) × 10⁶   (1.7329 ± 0.0134) × 10⁶   1.0156 ± 0.0059    0.8076 ± 0.0077
Table 2. Parameter inferences derived from the first, F-statistic, stage of the search. The
quoted uncertainties are the one sigma errors from the posterior recovered by MULTINEST.
We list the parameter inferences for both the true and antipodal (‘A’) sky solutions for each
source.
Solution   Row     M̄c/M⊙         µ̄/M⊙          tc/yrs          θ               φ
1          σFIM    1.289 × 10³   4.719 × 10³   1.209 × 10⁻⁵    6.315 × 10⁻³    1.159 × 10⁻²
           ∆λ      0.1164        0.0763        0.0511          0.0019          0.2036
1A         σFIM    1.198 × 10³   4.405 × 10³   1.128 × 10⁻⁵    9.683 × 10⁻³    8.529 × 10⁻³
           ∆λ      0.1336        0.0795        0.9247          0.1532          0.6597
2          σFIM    6.986 × 10²   8.642 × 10³   3.292 × 10⁻⁵    6.283 × 10⁻³    7.854 × 10⁻³
           ∆λ      0.7587        1.198         1.1562          0.9742          1.0675
2A         σFIM    7.025 × 10²   8.683 × 10³   3.301 × 10⁻⁵    6.446 × 10⁻³    7.631 × 10⁻³
           ∆λ      3.3452        4.0735        2.7418          0.3818          1.0907
Table 3. Errors in the intrinsic parameter estimation from the first stage of the search. We
quote errors for both the true and antipodal (‘A’) sky solutions for each source. The first row
for each solution gives the one sigma estimation from the Fisher matrix, σFIM, while the
second row gives ∆λ = |(λT − λR)/σFIM|, where λT denotes the true parameter value,
and λR denotes the recovered value of the parameter.
in that parameter as estimated from the posterior recovered in the first stage of the search. In
addition, we took natural priors on the four extrinsic parameters: ι ∈ [0,π], φc ∈ [0,π],
ψ ∈ [0,π] and DL ∈ [2,15] Gpc. If the population of supermassive black hole mergers
was expected to be uniformly distributed in space, then the natural prior on DL would be
proportional to DL². However, in practice the distribution of events will trace the hierarchical
assembly of structure and will therefore be complicated and very model-dependent. For that
reason, we preferred to use a minimal prior, i.e., a flat prior. This is consistent with the
approach used for the MLDC, in which the source SNRs are drawn from a flat prior [2]. The
prior on φc is [0,π] rather than [0,2π] since there is an angle degeneracy in the extrinsic
subspace corresponding to the shift ψ → ψ + π/2, φc → φc + π.
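The second-stage priors described above can be sketched as a map from the unit hypercube to the nine physical parameters, which is the form in which nested sampling codes commonly take their priors. This is a minimal sketch, not the code used for this work: the function and key names are hypothetical, the intrinsic priors are assumed to be flat within the ±5σ range (the text specifies only the range), and the first-stage means and standard deviations are those quoted for source 1 in Table 2. The last function shows the uniform-in-space alternative, p(DL) ∝ DL², which is mentioned in the text but was not adopted.

```python
import math

# Posterior mean and standard deviation from the first stage for source 1
# (values as quoted in Table 2); the dictionary keys are illustrative names.
FIRST_STAGE_SOURCE_1 = {
    "t_c": (0.9000, 1e-5),        # years
    "m1": (2.0005e7, 0.0016e7),   # solar masses
    "m2": (1.9995e6, 0.0014e6),   # solar masses
    "theta": (0.6269, 0.0047),
    "phi": (4.7312, 0.0092),
}

# Flat extrinsic prior ranges quoted in the text.
EXTRINSIC_BOUNDS = {
    "iota": (0.0, math.pi),
    "phi_c": (0.0, math.pi),
    "psi": (0.0, math.pi),
    "D_L": (2.0, 15.0),           # Gpc
}

def prior_bounds(first_stage, n_sigma=5.0):
    """Ranges [mean - 5 sigma, mean + 5 sigma] plus the extrinsic ranges."""
    bounds = {
        name: (mean - n_sigma * sigma, mean + n_sigma * sigma)
        for name, (mean, sigma) in first_stage.items()
    }
    bounds.update(EXTRINSIC_BOUNDS)
    return bounds

def prior_transform(unit_cube, bounds):
    """Map a point in [0, 1]^9 to physical parameters under flat priors."""
    return {
        name: low + u * (high - low)
        for u, (name, (low, high)) in zip(unit_cube, bounds.items())
    }

def distance_uniform_in_volume(u, d_min=2.0, d_max=15.0):
    """Inverse CDF for p(D_L) proportional to D_L^2 (alternative, not used)."""
    return (d_min ** 3 + u * (d_max ** 3 - d_min ** 3)) ** (1.0 / 3.0)

bounds = prior_bounds(FIRST_STAGE_SOURCE_1)
centre = prior_transform([0.5] * len(bounds), bounds)
print(centre)
```

The centre of the unit hypercube maps to the first-stage posterior means for the intrinsic parameters and to the midpoints of the extrinsic ranges.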
In the second stage of the search, only one mode of the likelihood was found within
the tight priors on the intrinsic parameters provided by the first stage. The second stage of
the search required ∼ 50,000 likelihood evaluations and took ∼ 30 minutes on a single 3.0
GHz Intel Woodcrest processor for each mode. The marginalized posterior distributions for
the true sky locations of the bright and faint sources are shown in Figs. 3 and 4 respectively.
The posteriors for the faint source are very broad in the extrinsic parameters. This is to
be expected, as we do not observe the coalescence for this source and so the error in the
estimate of the phase at coalescence, φc, is very large, and propagates into other parameters
due to correlations. In Table 4 we quote the values of the extrinsic parameters recovered
by the search for the two true modes and the two antipodal modes, and also quote errors as
multiples of the one sigma Fisher Matrix error estimate as before. The antipodal solutions
have different values for ι and ψ as well, which are obtained by the transformation ι → π−ι,
ψ → π − ψ [38], so the true values of the extrinsic parameters at the antipodal sky
location are (ι,ψ,θ,φ) = (2.0296,1.9086,2.5133,1.5708) for the coalescing source, “1”,
and (2.4851,0.5062,1.0210,0.8013) for the non-coalescing source, “2”. We do not include
the intrinsic parameters in this table as their values did not change significantly in the second
stage. The posteriors shown in Figs. 3–4 were computed during the second search stage for all
parameters. We see that the algorithm recovers the extrinsic parameters for both sources and
both sky solutions to the same high precision as the intrinsic parameters. The posteriors are
highly non-Gaussian for the extrinsic parameters of the non-coalescing source in particular, so
the Fisher matrix error would not be expected to be trustworthy. However, it appears to be
giving predictions that are largely consistent with the recovered posteriors.
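As a consistency check on the quantities just described, the following sketch applies the transformation ι → π − ι, ψ → π − ψ to the antipodal true values quoted above for the coalescing source, and evaluates ∆λ = |(λT − λR)/σFIM| using the recovered values and Fisher matrix errors listed in Table 4. The function names are illustrative only. Since the inputs are rounded to four decimal places, agreement with the tabulated ∆λ values can only be expected to a few parts in a thousand.

```python
import math

def antipodal_to_true(iota, psi):
    """Apply iota -> pi - iota, psi -> pi - psi (the map is its own inverse)."""
    return math.pi - iota, math.pi - psi

def delta_lambda(true_value, recovered, sigma_fim):
    """Error in units of the one sigma Fisher matrix estimate."""
    return abs((true_value - recovered) / sigma_fim)

# True values at the antipodal sky location of source 1, as quoted in the text.
iota_1a_true, psi_1a_true = 2.0296, 1.9086
iota_1_true, psi_1_true = antipodal_to_true(iota_1a_true, psi_1a_true)

# Recovered values and Fisher matrix errors are taken from Table 4.
results = {
    "1 iota": delta_lambda(iota_1_true, 1.1245, 0.0468),
    "1 psi": delta_lambda(psi_1_true, 1.2271, 0.0442),
    "1A iota": delta_lambda(iota_1a_true, 2.0251, 0.0468),
    "1A psi": delta_lambda(psi_1a_true, 1.9055, 0.0442),
}
for label, value in results.items():
    print(f"{label}: {value:.3f}")
```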
The MULTINEST algorithm also returns the evidence associated with each of the
recovered modes. These log-evidences were 19,948.5 and 19,948.2 for the true and antipodal
modes of the coalescing source and 8,551.3 and 8,555.1 for the true and antipodal modes of
the non-coalescing source. These evidence values do not give very strong reason to favour one
of the two sky-position modes over the other, but the sky position degeneracy is nearly perfect
for the sources we are considering and therefore we would not necessarily expect to be able
to distinguish them. However, the log-evidences of the 7 other modes identified by MULTINEST
were several hundred units lower than those of the true and antipodal modes of the source
to which they corresponded. There would therefore be good reason to favour the true and
antipodal modes over the other modes in the likelihood. If we were using full TDI waveforms
at higher frequency we would expect to be able to distinguish between the true and antipodal
solutions, and this should be reflected in the evidence values. More investigation is needed
to fully understand the utility of evidence for characterisation of modes in the likelihood for
these and other gravitational wave sources.
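To make the comparison of the quoted log-evidences concrete, the sketch below computes the log Bayes factor between the true and antipodal modes of each source, and the corresponding probability of the true mode if the two modes are treated as competing hypotheses with equal prior weight. It assumes the quoted log-evidences are natural logarithms; the helper names are illustrative.

```python
import math

# Log-evidences quoted in the text (assumed to be natural logarithms).
LOG_EVIDENCE = {
    "source 1": {"true": 19948.5, "antipodal": 19948.2},
    "source 2": {"true": 8551.3, "antipodal": 8555.1},
}

def log_bayes_factor(ln_z_a, ln_z_b):
    """ln(Z_a / Z_b) from the two log-evidences."""
    return ln_z_a - ln_z_b

def probability_of_first(ln_z_a, ln_z_b):
    """Probability of hypothesis a against b for equal prior weight."""
    return 1.0 / (1.0 + math.exp(-(ln_z_a - ln_z_b)))

summary = {}
for source, modes in LOG_EVIDENCE.items():
    ln_b = log_bayes_factor(modes["true"], modes["antipodal"])
    p_true = probability_of_first(modes["true"], modes["antipodal"])
    summary[source] = (ln_b, p_true)
    print(f"{source}: ln B = {ln_b:+.1f}, P(true mode) = {p_true:.3f}")
```

Differences in log-evidence of order unity, as found here, correspond to odds that are far from decisive, whereas a difference of several hundred would make the disfavoured mode negligible.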
6. Discussion
We have described the use of a multi-modal nested sampling algorithm, MULTINEST, as
a tool for gravitational wave data analysis, illustrating the algorithm by searching for the
signals from non-spinning supermassive black hole binaries in simulated LISA data. We used
MULTINEST to search a data stream containing two such mergers, and the algorithm was
able to simultaneously and successfully identify both the true sky position of each source, and
the antipodal sky solution, plus several other secondary modes of the likelihood surface. In
a first stage search using the F-statistic, the algorithm is able to recover the intrinsic source
parameters to very high precision, within 2σ of the true answer as measured by the theoretical
noise-induced error computed from the Fisher Matrix. Following up with a search on the full
physical, nine-dimensional, parameter space, the search is also able to recover the extrinsic
parameters to similar precision. In addition, the algorithm naturally recovers the posterior
Figure 3. Marginalized 1D and 2D posteriors for the primary mode of the bright, coalescing
source (source 1) determined in the second stage analysis using the full likelihood with priors
on the intrinsic parameters derived from the posteriors recovered in the first, F-statistic,
stage. Each plot shows the 2D posterior over the row (vertical axis) and column (horizontal
axis) parameters. At the top of each column is the 1D posterior for that column parameter.
The parameters and the parameter ranges shown in the plots, from top to bottom or left
to right, are 0.63 < θ < 0.64, 4.69 < φ < 4.74, 0.899992 < tc < 0.900008,
1.993 × 10⁷ M⊙ < m̄1 < 2.007 × 10⁷ M⊙, 1.993 × 10⁶ M⊙ < m̄2 < 2.007 × 10⁶ M⊙,
0.9 < ι < 1.2, 2 < φc < 6 and 1 < ψ < 3 respectively. The true parameter values are
shown by the blue vertical lines and red crosses overlaid on the 1D and 2D marginalized
posterior plots respectively.
Figure 4. As Figure 3, but now for the faint, non-coalescing source (source 2). The parameter
ranges shown in the plots, from top to bottom or left to right, are 2.11 < θ < 2.14,
3.93 < φ < 3.97, 1.01988 < tc < 1.02004, 7.1 × 10⁶ M⊙ < m̄1 < 7.5 × 10⁶ M⊙,
1.74 × 10⁶ M⊙ < m̄2 < 1.81 × 10⁶ M⊙, 0 < ι < 0.8, 0 < φc < 2π and 0 < ψ < π
respectively.
Solution   Row     ι         ψ         φc        DL/Gpc
1          λR      1.1245    1.2271    2.2203    6.5655
           σFIM    0.0468    0.0442    0.5657    3.3787
           ∆λ      0.2668    0.1340    0.0029    0.0203
1A         λR      2.0251    1.9055    2.2119    6.5973
           σFIM    0.0468    0.0442    0.5657    3.3787
           ∆λ      0.0948    0.0694    0.0109    0.0179
2          λR      0.7052    2.8670    4.7653    2.7038
           σFIM    0.1573    0.2201    1.4276    2.6170
           ∆λ      0.3096    1.0519    0.0785    0.8866
2A         λR      2.4844    0.3116    4.9330    5.7121
           σFIM    0.1573    0.2201    1.4276    2.6170
           ∆λ      0.0046    0.8839    0.2629    0.1960
Table 4. Maximum a posteriori values of the extrinsic parameters found in the second stage
of the search. We quote results for both the two true and the two antipodal sky solutions, as
for the intrinsic parameters. For each solution, the first row indicates the recovered value of
the parameter, λR, the second row lists the one sigma error estimate from the Fisher Matrix,
σFIM, and the third row gives ∆λ = |(λT − λR)/σFIM|, where λT is the true value of the
parameter.
distributions and the Bayesian evidence associated with each mode of the solution.
The algorithm works extremely quickly, taking about two and a half hours on a single
CPU in this case to complete the whole search. MULTINEST is fully parallel, which reduces
the run time for the whole search to ∼ 20 minutes when using 10 processors. More
importantly, it achieves these results without using any specific knowledge of the signals
for which it is searching: the waveform model enters only in the likelihood evaluation and
not in determining how the ‘live’ point set is updated. While many effective algorithms for
searching for non-spinning SMBH binaries are already known, these sources have a multi-
modal likelihood which provides a good test case with which to scope out the effectiveness
of this algorithm for gravitational wave data analysis. The real power of this algorithm is in
its ability to simultaneously identify and characterize all modes of a multi-modal likelihood
surface. For signals from EMRIs, the presence of many secondaries in the likelihood has
caused problems in data analysis [3, 4, 5, 6], and this is likely to be true in searches for
spinning black hole binaries as well. MULTINEST is perfectly adapted to tackling these types
of problem, and so these are the next challenges for the algorithm. Research is needed to
investigate how the run time, number of ‘live’ points needed etc. will scale for these other
problems. However, the fact that the algorithm performed so well in the non-spinning black
hole case, without using any knowledge of the properties of the waveforms, is reason to expect
MULTINEST will also perform strongly in other gravitational wave data analysis applications,
for both space-based and ground-based detectors.
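The statement that the waveform model enters only through the likelihood can be illustrated with a deliberately simple nested sampling loop. This sketch is not MULTINEST: it replaces the ellipsoidal sampling with naive rejection sampling from a flat prior on [0, 1], which is only practical for toy problems, and the one-dimensional Gaussian likelihood is a hypothetical stand-in chosen because its evidence is known analytically. The point of the example is that `nested_sampling` only ever calls `log_likelihood` and knows nothing else about the problem.

```python
import math
import random

def nested_sampling(log_likelihood, n_live=200, n_iter=1400, seed=0):
    """Basic nested sampling with a flat prior on [0, 1]; returns ln(Z)."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    live_logl = [log_likelihood(x) for x in live]
    evidence = 0.0
    x_prev = 1.0  # prior volume enclosed by the current likelihood contour
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        x_now = math.exp(-i / n_live)  # expected shrinkage of the prior volume
        evidence += math.exp(logl_min) * (x_prev - x_now)
        x_prev = x_now
        # Replace the worst live point by a prior draw with higher likelihood.
        while True:
            candidate = rng.random()
            cand_logl = log_likelihood(candidate)
            if cand_logl > logl_min:
                break
        live[worst], live_logl[worst] = candidate, cand_logl
    # Add the contribution from the remaining live points.
    evidence += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return math.log(evidence)

def gaussian_log_likelihood(x, centre=0.5, width=0.05):
    """Toy likelihood: unnormalised Gaussian, peak value 1 at `centre`."""
    return -0.5 * ((x - centre) / width) ** 2

ln_z = nested_sampling(gaussian_log_likelihood)
ln_z_exact = math.log(0.05 * math.sqrt(2.0 * math.pi))
print(f"estimated ln Z = {ln_z:.3f}, analytic ln Z = {ln_z_exact:.3f}")
```

Changing the problem means swapping `gaussian_log_likelihood` for another function; the loop that updates the live points is untouched. The estimate carries a statistical scatter that shrinks as the number of live points grows.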
A potentially even more important application of the algorithm in the context of
gravitational-wave data analysis will be for evidence evaluation. MULTINEST computes
evidence values as it progresses, which can be used for model selection, e.g., to determine the
number of a particular type of source that are present in a data stream or to test alternative
theories of gravity etc. Model selection questions have been explored in a LISA context using
other algorithms, including MCMC [24] and nested sampling [12]. The speed and efficiency
of MULTINEST in answering similar questions should be explored in the future, in order to
compare and contrast to these other techniques. The results described in this paper suggest
that MULTINEST will also perform extremely well in such a model evaluation context.
Acknowledgements
This work was performed using the Darwin Supercomputer of the University of Cambridge
High Performance Computing Service (http://www.hpc.cam.ac.uk/), provided by
Dell Inc. using Strategic Research Infrastructure Funding from the Higher Education Funding
Council for England and the authors would like to thank Dr. Stuart Rankin for computational
assistance. The SMBH waveform and F-Statistic codes used in this work were jointly
developed by EKP and Neil J. Cornish at Montana State University. FF is supported by
the Cambridge Commonwealth Trust, Isaac Newton and the Pakistan Higher Education
Commission Fellowships. JG’s work is supported by the Royal Society.
References
[1] K. Danzmann et al., LISA pre-phase A report, Max-Planck-Institut für Quantenoptik, Report MPQ 233 (1998).
[2] Babak S. et al., Class. Quantum Grav. 25 184026 (2008).
[3] Gair J R, Mandel I & Wen L, Class. Quantum Grav. 25 184031 (2008).
[4] Cornish N J, preprint arXiv:0804.3323 (2008)
[5] Gair J R, Porter E K, Babak S & Barack L Class. Quantum Grav. 25 184030 (2008).
[6] Babak S., Gair J. R. & Porter E. K. preprint arXiv:0902.4133 (2009).
[7] Key J. S. & Cornish N. J., Phys.Rev.D79 043014 (2009).
[8] Gair J. R. & Porter E. K., preprint arXiv:0903.3733 (2009).
[9] Feroz F., Hobson M. P., 2008, MNRAS 384, 449
[10] Feroz F., Hobson M. P. & Bridges M., preprint arXiv:0809.3437 (2008).
[11] Skilling J., 2004, in Fischer R., Preuss R., Toussaint U. V., eds, American Institute of Physics Conference Series
Nested Sampling. pp 395–405
[12] Veitch J. & Vecchio A., Class. Quantum Grav. 25 184010 (2008).
[13] Feroz F., Marshall P. J. & Hobson M. P., preprint arXiv:0810.0781 (2008).
[14] Feroz F., Hobson M. P., Zwart J. T. L., Saunders R. D. E. & Grainge K. J. B., preprint arXiv:0811.1199 (2008).
[15] Feroz F., Allanach B. C., Hobson M., Abdus Salam S. S., Trotta R., Weber A. M., 2008, Journal of High Energy
Physics, 10, 64
[16] Trotta R., Feroz F., Hobson M., Roszkowski L., Ruiz de Austri R., 2008, Journal of High Energy Physics, 12,
24
[17] Lang R. N. & Hughes S. A., Phys.Rev.D74 122001 (2006).
[18] Cornish N. J. & Porter, E. K., Class. Quantum Grav. 24 5729 (2007).
[19] Cornish N. J. & Porter, E. K., Phys.Rev.D75 021301 (2007).
[20] Babak S., Class. Quantum Grav. 25 195011 (2008).
[21] Harry I. W., Fairhurst S. & Sathyaprakash B. S., Class. Quantum Grav. 25 184027 (2008).
[22] Babak S. et al., Class. Quantum Grav. 25 114037 (2008).
[23] Brown D. A., Crowder J., Cutler C., Mandel I. & Vallisneri M., Class. Quantum Grav. 24 S595 (2007).
[24] Littenberg T. B. & Cornish N. J., preprint arXiv:0902.0368 (2009).
[25] Hobson M. P., Bridle S. L., Lahav O., 2002, MNRAS 335, 377
[26] Liddle A. R., MNRAS Lett. 377 L74 (2007).
[27] MacKay D. J. C., 2003, Information Theory, Inference and Learning Algorithms. Cambridge University
Press, Cambridge, UK
[28] Ó Ruanaidh J., Fitzgerald W., 1996, Numerical Bayesian Methods Applied to Signal Processing. Springer
Verlag:New York
[29] Blanchet L., Iyer B. R., Will C. M. & Wiseman A. G., Class. Quantum Grav. 13 575 (1996).
[30] Cutler C., Phys.Rev.D57 7089 (1998).
[31] Jaranowski P., Królak A. & Schutz B. F., Phys.Rev.D58 063001 (1998).
[32] Helstrom S. W., Statistical Theory of Signal Detection (London: Pergamon) (1968).
[33] Owen B. J. Phys.Rev.D53 6749 (1996).
[34] Cornish N. J. Phys.Rev.D65 022004 (2001).
[35] Nelemans G., Yungelson L. R. & Portegies-Zwart S. F., MNRAS 349 181 (2004).
[36] Timpano S., Rubbo L. & Cornish N. J., Phys.Rev.D73 122001 (2006).
[37] Verde L. et al. ApJS 148 195 (2003).
[38] Arnaud K. et al. Class. Quantum Grav. 24, S529 (2007).