Testing the minimum variance method for estimating large-scale velocity moments
ABSTRACT The estimation and analysis of large-scale bulk flow moments of peculiar
velocity surveys is complicated by non-spherical survey geometry, the
non-uniform sampling of the matter velocity field by the survey objects and the
typically large measurement errors of the measured line-of-sight velocities.
Previously, we have developed an optimal `minimum variance' (MV) weighting
scheme for using peculiar velocity data to estimate bulk flow moments for
idealized, dense and isotropic surveys with Gaussian radial distributions, that
avoids many of these complications. These moments are designed to be easy to
interpret and are comparable between surveys. In this paper, we test the
robustness of our MV estimators using numerical simulations. Using MV weights,
we estimate the bulk flow moments for various mock catalogues extracted from
the LasDamas and the Horizon Run numerical simulations and compare these
estimates to the moments calculated directly from the simulation boxes. We show
that the MV estimators are unbiased and negligibly affected by non-linear
flows.
Mon. Not. R. Astron. Soc. 000, 000–000 (0000) Printed 4 January 2012 (MN LATEX style file v2.2)
Testing the Minimum Variance Method for Estimating
Large Scale Velocity Moments
Shankar Agarwal1,⋆ & Hume A. Feldman1,† & Richard Watkins2,‡
1Department of Physics & Astronomy, University of Kansas, Lawrence, KS 66045, USA.
2Department of Physics, Willamette University, Salem, OR 97301, USA.
emails: ⋆sagarwal@ku.edu; †feldman@ku.edu; ‡rwatkins@willamette.edu
ABSTRACT
The estimation and analysis of large-scale bulk flow moments of peculiar velocity
surveys is complicated by non-spherical survey geometry, the non-uniform sampling
of the matter velocity field by the survey objects, and the typically large measure-
ment errors of the measured line-of-sight velocities. Previously we have developed an
optimal “minimum variance” (MV) weighting scheme for using peculiar velocity data
to estimate bulk flow moments for idealized dense and isotropic surveys with Gaus-
sian radial distributions that avoids many of these complications. These moments are
designed to be easy to interpret and are comparable between surveys. In this pa-
per, we test the robustness of our MV estimators using numerical simulations. Using
MV weights, we estimate the underlying bulk flow moments for DEEP, SFI++ and
COMPOSITE mock catalogues extracted from the LasDamas and the Horizon Run
numerical simulations and compare these estimates to the true moments calculated
directly from the simulation boxes. We show that the MV estimators are negligibly
affected by nonlinear flows; in particular they are unbiased and have errors that are
consistent with predictions from linear theory.
Subject headings: cosmology: distance scales; cosmology: large scale
structure of the universe; cosmology: observation; cosmology: theory;
galaxies: kinematics and dynamics; galaxies: statistics.
1 INTRODUCTION
Peculiar velocities are a sensitive probe of the underlying
large-scale matter density fluctuations in our Universe. In
particular, large, all-sky surveys of the peculiar velocities of
galaxies or clusters of galaxies can provide important con-
straints on cosmological parameters. However, studies of pe-
culiar velocities suffer from several drawbacks, including: (i)
The presence of small-scale, nonlinear flows, such as infall
into clusters, can potentially bias analyses which typically
rely on linear theory, (ii) Sparse, non-uniform sampling of
the peculiar velocity field can lead to aliasing of small-scale
power onto large scales and bias due to heavier sampling
of dense regions, (iii) Large measurement uncertainties of
individual peculiar velocity measurements, particularly for
distant galaxies or clusters, make it necessary to work with
large surveys in order to extract meaningful constraints.
These difficulties have often been addressed by calculat-
ing statistics from peculiar velocity surveys that are designed
to primarily reflect large-scale flows which are well described
by linear theory. The most common statistic used is the bulk
flow, which represents the average motion of the objects in
a survey. The bulk flow statistic has been investigated ex-
tensively by many groups (Dressler & Faber 1990; Kaiser
1991; Feldman & Watkins 1994; Watkins & Feldman 1995;
Strauss et al. 1995; Jaffe & Kaiser 1995; Hudson et al. 1999;
da Costa et al. 2000a; Hudson et al. 2004; Parnovsky & Tu-
gay 2004; Sarkar et al. 2007; Kashlinsky et al. 2008, 2010;
Macaulay et al. 2011; Ma et al. 2011; Nusser et al. 2011;
Nusser & Davis 2011; Abate & Feldman 2011). However,
bulk flow estimates can be difficult to interpret since how
they sample the peculiar velocity field depends strongly on
the characteristics of the particular survey being considered.
In addition, results from bulk flow analyses have often been
controversial, highlighting the importance of developing a
robust bulk flow statistic that is easy to interpret and that
can be compared between surveys with different geometries.
In Watkins et al. (2009) (hereafter Paper I) and Feld-
man et al. (2010) (hereafter Paper II), we developed the
“Minimum Variance” (MV) moments that were designed to
estimate the bulk flow of a volume of a given scale rather
than a particular peculiar velocity survey. We stress that the
MV moments do not represent the bulk motion of the galax-
ies in a survey, rather they are estimates of the bulk motion
of a given volume of space. The MV algorithm was designed
© 0000 RAS
arXiv:1201.0128v1 [astro-ph.CO] 30 Dec 2011
to make a clean estimate of the large-scale bulk flow as a
function of scale using the available peculiar velocity data.
Essentially, each velocity datum in a real survey is weighted
in a way that minimizes the variance of the difference be-
tween the MV-weighted bulk flow of the real survey and an
idealized survey bulk flow, on a characteristic scale R. The
MV analysis suggested bulk flow velocities well in excess
of expectations from the ΛCDM model with WMAP7 (Larson
et al. 2010) central parameters.
Indeed, several recent observations suggest that the
standard model may be incomplete: large-scale anomalies
in the maps of temperature anisotropies in the CMB
(Sarkar et al. 2010; Copi et al. 2010; Bennett et al. 2011);
a recent estimate (Lee & Komatsu 2010) that the occurrence
of high-velocity merging systems such as the Bullet Cluster
is unlikely at a ∼ 6σ level; a large excess of power in the
statistical clustering of luminous red galaxies (LRG) in the
photometric SDSS galaxy sample (Thomas et al. 2011); a
unique direction in the CMB sky determined by anomalous
mean temperature ring profiles, also centered about the
direction of the flow detected above (Kovetz et al. 2010);
a larger than expected cross correlation between samples of
galaxies and lensing of the CMB (Ho et al. 2008; Hirata
et al. 2008); type Ia supernovae (SNIa) that seem to be
brighter than expected at high redshift (Kowalski et al.
2008); small voids (∼ 10 Mpc) that are observed to be much
emptier than predicted (Gottlöber et al. 2003); and
observations indicating denser, high-concentration cluster
haloes than the shallow, low-concentration density profiles
predicted (de Blok 2005; Gentile et al. 2005).
In this paper, we use N-body simulations to investigate
the robustness of our MV scheme for estimating the bulk
flow moments of the underlying velocity field, over a volume
of a particular scale. First, we extract a mock catalogue (de-
scribed in Sec. 3) from N-body simulations. Given this mock
catalogue, we use our MV algorithm (described in Sec. 2) to
estimate the bulk flow moments {ux,uy,uz} of the underly-
ing velocity field over a volume of a particular scale. Second,
we position ourselves in the N-body simulation box at the
location of the center of the mock catalogue, and calculate
the “true” moments {Vx,Vy,Vz} by averaging the velocities
of all the galaxies in the simulation box, each galaxy
being weighted by a Gaussian radial distribution function
$f(r) = e^{-r^2/2R_G^2}$, where $R_G$ is the width of the
distribution. Note that a large number of particles in the simulation
box is preferable, as it makes the “true” moments a good
representation of the bulk of the underlying velocity field.
Finally, we compare the MV-weighted moments {ux,uy,uz}
with the Gaussian-weighted “true” moments {Vx,Vy,Vz}. A
close match between the two would indicate that the MV
scheme accurately estimates the underlying bulk flow.
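As an illustration of the Gaussian-weighted “true” moments described above, the following is a minimal sketch using only the Python standard library. The function name `gaussian_weighted_moments`, its argument layout and the default width are assumptions made for this example. It also assumes that full three-dimensional velocities are averaged and that periodic wrapping of the simulation box can be ignored; the text does not specify either point.

```python
import math

def gaussian_weighted_moments(positions, velocities, center, r_gauss=50.0):
    """Return (Vx, Vy, Vz): the mean velocity about `center`, with each
    galaxy weighted by f(r) = exp(-r^2 / (2 R_G^2)).

    positions, velocities: sequences of (x, y, z) tuples. Distances share
    the units of r_gauss (e.g. h^-1 Mpc); velocities are in km/s.
    Assumption: periodic wrapping of the simulation box is ignored.
    """
    weight_sum = 0.0
    moments = [0.0, 0.0, 0.0]
    for pos, vel in zip(positions, velocities):
        # Squared distance of this galaxy from the chosen mock center.
        r2 = sum((p - c) ** 2 for p, c in zip(pos, center))
        f = math.exp(-r2 / (2.0 * r_gauss ** 2))
        weight_sum += f
        for i in range(3):
            moments[i] += f * vel[i]
    return tuple(m / weight_sum for m in moments)
```

For a velocity field that is a pure uniform flow, this average returns the flow itself regardless of where the galaxies sit, which is a quick sanity check on any implementation.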
In Sec. 2 we review the formalism to construct estimators
for the idealized survey bulk flow moments. In Sec. 3
we describe the simulations we use and the surveys we model to
extract the mock catalogues. In Sec. 4 we compare the
MV-weighted bulk flow moments with the Gaussian-weighted
“true” moments. We discuss our results and conclude in
Sec. 5.
2 THE MINIMUM VARIANCE METHOD
Individual radial peculiar velocity measurements are
plagued by large uncertainties and contributions from
small-scale, nonlinear processes which are difficult to model
theoretically. Both of these problems can be greatly reduced if,
instead of considering individual velocities, we work with an
average velocity over a sample, commonly called the bulk flow.
The three components of the bulk flow $u_i$ can be written
as weighted averages of the measured radial peculiar
velocities of a survey

$$u_i = \sum_n w_{i,n} S_n, \tag{1}$$

where $S_n$ is the radial peculiar velocity of the n-th galaxy
of a survey, and $w_{i,n}$ is the weight assigned to this velocity
in the calculation of $u_i$. By far the most common weighting
scheme used in studies of the bulk flow, which we will call the
MLE (Maximum Likelihood Estimate) method, is obtained
from a maximum likelihood analysis introduced by Kaiser
(1988). By modeling galaxy motions as being due to a uniform
flow and assuming Gaussian distributed measurement
uncertainties, the likelihood function

$$L(u_i, \sigma_*) = \prod_n \frac{1}{\sqrt{\sigma_n^2 + \sigma_*^2}} \exp\left( -\frac{1}{2} \frac{(S_n - \hat{r}_{n,i} u_i)^2}{\sigma_n^2 + \sigma_*^2} \right) \tag{2}$$

is obtained, where $\hat{\mathbf{r}}_n$ is the unit position vector of the n-th
galaxy, $\sigma_n$ is the measurement uncertainty of the n-th galaxy
and $\sigma_*$ is a 1-D velocity dispersion accounting for
smaller-scale motions. Maximizing this likelihood gives a bulk flow
estimate of the form of Eq. 1, with weights

$$w_{i,n} = \sum_{j=1}^{3} A^{-1}_{ij} \frac{\hat{r}_{n,j}}{\sigma_n^2 + \sigma_*^2}, \tag{3}$$

where

$$A_{ij} = \sum_n \frac{\hat{r}_{n,i} \hat{r}_{n,j}}{\sigma_n^2 + \sigma_*^2}. \tag{4}$$
These weights play the dual roles of accounting for geomet-
rical factors, e.g. picking out the x component of velocities
in a calculation of ux, and down-weighting velocities with
large uncertainties. However, the fact that velocity uncer-
tainties are typically proportional to distance, together with
the sparseness of velocity catalogues at their outer edges,
means that nearby objects are greatly emphasized in calcu-
lations of the MLE bulk flow. Indeed, studies of the window
functions of these moments (Paper I) have shown that MLE
bulk flow moments of a survey are typically sensitive to flows
on scales much smaller than the survey’s physical diameter,
thus complicating their interpretation.
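The MLE weights of Eqs. 3 and 4 can be sketched numerically as follows. This is an illustrative NumPy example, not the authors' code: the function names `mle_weights` and `bulk_flow` and the array layout are assumptions, and the default $\sigma_*$ = 150 km/s is simply the value quoted later in the text.

```python
import numpy as np

def mle_weights(r_hat, sigma_n, sigma_star=150.0):
    """MLE weights w[i, n] of Eq. 3.

    r_hat      : (N, 3) array of unit position vectors.
    sigma_n    : (N,) array of measurement uncertainties in km/s.
    sigma_star : 1-D velocity dispersion in km/s (assumed default).
    """
    r_hat = np.asarray(r_hat, dtype=float)
    inv_var = 1.0 / (np.asarray(sigma_n, dtype=float) ** 2 + sigma_star ** 2)
    # B[j, n] = r_hat[n, j] / (sigma_n^2 + sigma_*^2)
    B = (r_hat * inv_var[:, None]).T
    # Eq. 4: A[i, j] = sum_n r_hat[n, i] r_hat[n, j] / (sigma_n^2 + sigma_*^2)
    A = B @ r_hat
    # Eq. 3: w = A^{-1} B, computed with a linear solve instead of an inverse.
    return np.linalg.solve(A, B)

def bulk_flow(weights, s):
    """Eq. 1: u_i = sum_n w[i, n] S_n for radial velocities s of shape (N,)."""
    return weights @ np.asarray(s, dtype=float)
```

By construction these weights satisfy $\sum_n w_{i,n} \hat{r}_{n,j} = \delta_{ij}$, so a noiseless pure bulk flow is recovered exactly. The emphasis on nearby objects discussed above enters through the inverse-variance factor, since $\sigma_n$ typically grows with distance.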
In Paper I, we introduced an alternative to the MLE
weights that yields bulk flow moments that are much easier
to interpret. First, we imagine an idealized survey
containing radial velocities that well sample the velocity field
in a region. This survey consists of a large number of
objects with zero measurement uncertainty. For simplicity, the
radial distribution of this idealized survey is taken to be a
Gaussian profile of the form $n(r) \propto e^{-r^2/2R_G^2}$, where $R_G$ gives
a measure of the depth of the survey. This idealized survey
has easily interpretable bulk flow components Ui that are
not affected by aliasing and which reflect the motion of a
well defined volume. Our goal is to construct estimators for
the idealized survey bulk flow components Ui, out of the
measured radial peculiar velocities Sn and positions rn con-
tained in a real survey. We assume that Sn can be expressed
as Sn = vn+ δn, where vn is the radial component of the
linear peculiar velocity field at the location of the object
and δn accounts for the measurement noise as well as any
nonlinear flow, e.g. infall into a cluster. In order to calculate
the weights to use for the bulk flow estimators, we minimize
the variance $\langle (u_i - U_i)^2 \rangle$, where the average is over different
realizations of a particular power spectrum. Expanding this
expression out using Eq. 1 for the bulk flow estimate, we
obtain

$$\langle (u_i - U_i)^2 \rangle = \sum_{n,m} w_{i,n} w_{i,m} \langle S_n S_m \rangle + \langle U_i^2 \rangle - 2 \sum_n w_{i,n} \langle U_i v_n \rangle, \tag{5}$$

where we have used the fact that the measurement error
included in $S_n$ is uncorrelated with the bulk flow $U_i$.
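For readers who want the intermediate steps, the expansion can be sketched as follows. It uses only Eq. 1, the decomposition $S_n = v_n + \delta_n$, and the stated assumption that $\delta_n$ is uncorrelated with $U_i$, so that $\langle \delta_n U_i \rangle = 0$:

```latex
\begin{align*}
\langle (u_i - U_i)^2 \rangle
  &= \langle u_i^2 \rangle - 2 \langle u_i U_i \rangle + \langle U_i^2 \rangle \\
  &= \sum_{n,m} w_{i,n} w_{i,m} \langle S_n S_m \rangle
     - 2 \sum_n w_{i,n} \langle (v_n + \delta_n) U_i \rangle
     + \langle U_i^2 \rangle \\
  &= \sum_{n,m} w_{i,n} w_{i,m} \langle S_n S_m \rangle
     - 2 \sum_n w_{i,n} \langle U_i v_n \rangle
     + \langle U_i^2 \rangle .
\end{align*}
```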
Before we minimize this expression with respect to the
weights $w_{i,n}$, we impose the following constraint introduced
in Paper II. Suppose that the velocity field were a pure bulk
flow, so that $S_n = U_i g_i(\mathbf{r}_n) + \delta_n$ (with an implied sum over $i$),
where $U_i$ are the 3 bulk moments $\{U_x, U_y, U_z\}$; $g_i(\mathbf{r}_n)$ are
the direction cosines of the n-th galaxy
$\{\hat{r}_{n,x}, \hat{r}_{n,y}, \hat{r}_{n,z}\}$ and $\delta_n$ is the noise due
to measurement error. We ask that the estimators $u_i$ give
the correct amplitude for the flow on average, namely that
$\langle u_i \rangle = U_i$. Plugging the expression for $S_n$ into Eq. 1 gives the
constraint that

$$\sum_n w_{i,n} g_j(\mathbf{r}_n) = \delta_{ij}. \tag{6}$$

This set of three constraints is implemented using
Lagrange multipliers, so that we derive the desired weights by
taking a derivative of the expression

$$\sum_{m,n} w_{i,m} w_{i,n} \langle S_m S_n \rangle + \langle U_i^2 \rangle - 2 \sum_n w_{i,n} \langle U_i v_n \rangle + \sum_j \lambda_{ij} \left[ \sum_n w_{i,n} g_j(\mathbf{r}_n) - \delta_{ij} \right] \tag{7}$$

with respect to $w_{i,n}$ and setting the resulting expression
equal to zero. Solving for the weights then gives

$$w_{i,n} = \sum_m G^{-1}_{mn} \left( \langle S_m U_i \rangle - \frac{1}{2} \sum_j \lambda_{ij} g_j(\mathbf{r}_m) \right), \tag{8}$$
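where $G$ is the covariance matrix of the individual measured
velocities, $G_{mn} = \langle S_m S_n \rangle$. The Lagrange multipliers can be
found by plugging Eq. 8 into Eq. 6 and solving for $\lambda_{ij}$,

$$\lambda_{ij} = \sum_k M^{-1}_{ik} \left( \sum_{m,n} G^{-1}_{mn} \langle S_m U_k \rangle g_j(\mathbf{r}_n) - \delta_{jk} \right), \tag{9}$$

where the matrix $M$ is given by

$$M_{ij} = \frac{1}{2} \sum_{m,n} G^{-1}_{mn} g_i(\mathbf{r}_n) g_j(\mathbf{r}_m). \tag{10}$$

In linear theory, the correlation $\langle S_m U_i \rangle$ and the
covariance matrix $G$ that appear in our expression for $w_{i,n}$ can
be calculated for a given density power spectrum $P(k)$ (for
details see Paper II):

$$\langle S_m U_i \rangle = \sum_{n'=1}^{N'} w'_{i,n'} \langle S_m v_{n'} \rangle = \sum_{n'=1}^{N'} w'_{i,n'} \frac{H_0^2 \Omega_m^{1.1}}{2\pi^2} \int dk\, P(k) f_{mn'}(k), \tag{11}$$

where

$$w'_{i,n'} = \sum_{j=1}^{3} A^{-1}_{ij} \frac{\hat{r}'_{n',j}}{N'}$$

are the weights of an ideal, isotropic survey consisting of
$N'$ exact radial velocities $v_{n'}$ measured at randomly selected
positions $\mathbf{r}'_{n'}$ with

$$A_{ij} = \sum_{n'=1}^{N'} \frac{\hat{r}'_{n',i} \hat{r}'_{n',j}}{N'}.$$

The covariance matrix is

$$G_{mn} = \langle \hat{\mathbf{r}}_n \cdot \mathbf{v}(\mathbf{r}_n)\; \hat{\mathbf{r}}_m \cdot \mathbf{v}(\mathbf{r}_m) \rangle + \delta_{mn} (\sigma_*^2 + \sigma_n^2) = \frac{H_0^2 \Omega_m^{1.1}}{2\pi^2} \int dk\, P(k) f_{mn}(k) + \delta_{mn} (\sigma_*^2 + \sigma_n^2), \tag{12}$$

where $f_{mn}(k)$ is the angle averaged window function,

$$f_{mn}(k) = \int \frac{d^2\hat{k}}{4\pi} \left( \hat{\mathbf{r}}_n \cdot \hat{\mathbf{k}} \right) \left( \hat{\mathbf{r}}_m \cdot \hat{\mathbf{k}} \right) \exp\left[ i k\, \hat{\mathbf{k}} \cdot (\mathbf{r}_n - \mathbf{r}_m) \right]. \tag{13}$$

Thus, given a peculiar velocity survey and a power spectrum
model $P(k)$ we can calculate the weights $w_{i,n}$ for
estimating the MV moments. We use the power spectrum model
given by Eisenstein & Hu (1998) with WMAP7 (Larson et
al. 2010) central parameters.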
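To make the constrained minimization concrete, the sketch below computes MV weights from a given covariance matrix $G$, cross-correlation $\langle S_m U_i \rangle$ and set of direction cosines. It is an illustrative NumPy example under stated assumptions, not the authors' implementation: the function name `mv_weights` and the array `Q` holding $\langle S_m U_i \rangle$ are hypothetical, and the inputs are taken as already computed (for instance from Eqs. 11 to 13). Rather than transcribing Eq. 9 index by index, the multipliers are obtained by substituting Eq. 8 into the constraint of Eq. 6 and solving the resulting 3×3 linear system with the matrix $M$ of Eq. 10.

```python
import numpy as np

def mv_weights(G, Q, g):
    """Weights W[i, n] = w_{i,n} of the form of Eq. 8 that satisfy Eq. 6.

    G : (N, N) covariance matrix, G[m, n] = <S_m S_n> (symmetric).
    Q : (N, 3) cross-correlations, Q[m, i] = <S_m U_i> (hypothetical name).
    g : (N, 3) direction cosines g_j(r_n) of the survey objects.
    """
    G = np.asarray(G, dtype=float)
    Q = np.asarray(Q, dtype=float)
    g = np.asarray(g, dtype=float)
    Ginv_Q = np.linalg.solve(G, Q)   # G^{-1} Q, shape (N, 3)
    Ginv_g = np.linalg.solve(G, g)   # G^{-1} g, shape (N, 3)
    M = 0.5 * g.T @ Ginv_g           # Eq. 10, shape (3, 3)
    # Column i of lam holds lambda_{ij} (j = 0, 1, 2), fixed by Eq. 6.
    lam = np.linalg.solve(M, g.T @ Ginv_Q - np.eye(3))
    # Eq. 8: w_i = G^{-1} (Q[:, i] - (1/2) g lambda_i)
    return (Ginv_Q - 0.5 * Ginv_g @ lam).T

# Toy inputs (random numbers, not a physical model) to exercise the function.
rng = np.random.default_rng(0)
n_obj = 40
g_toy = rng.normal(size=(n_obj, 3))
g_toy /= np.linalg.norm(g_toy, axis=1, keepdims=True)
B = rng.normal(size=(n_obj, n_obj))
G_toy = B @ B.T + n_obj * np.eye(n_obj)   # symmetric positive definite
Q_toy = rng.normal(size=(n_obj, 3))
W_toy = mv_weights(G_toy, Q_toy, g_toy)
```

Whatever the inputs, the returned weights should satisfy $\sum_n w_{i,n} g_j(\mathbf{r}_n) = \delta_{ij}$ to numerical precision, which gives a simple check that the constraint has been imposed correctly.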
3 MOCK CATALOGUES
The N-body simulations we use in our analysis are (i) Las-
Damas (hereafter LD) (McBride et al. 2011) and (ii) Horizon
Run (hereafter HR) (Kim et al. 2009). These are designed
to model the Sloan Digital Sky Survey (SDSS) observations.
The LD (HR) simulation parameters are: Ωm = 0.25 (0.26),
Ωb = 0.04 (0.044), ΩΛ = 0.75 (0.74), h = 0.7 (0.72), σ8 =
0.8 (0.794), ns = 1.0 (0.96) and LBox = 1 (6.592)h−1Gpc for
the matter, baryonic and cosmological constant normalized
densities, the Hubble constant, the normalization of matter
density fluctuations, the primordial spectral index and the
simulation box size respectively. The HR simulation samples
the density field at z = 0 and identifies galaxies using
subhalos (Kim et al. 2008), whereas the LD simulations have
information at z = 0.13, and bound groups of dark matter
particles (halos) are identified using a parallel
friends-of-friends (FOF) code (Gardner et al. 2007).
Figure 1. Top row: DEEP catalogue (left) and its radial distribution
(right). Bottom row: DEEP mock catalogue (left) and its
radial distribution (right).
The LD data we use consists of 41 independent realiza-
tions, each in a 1h−1Gpc box with the same initial power
spectrum. We extract 100 mock catalogues from each of the
41 LD boxes, for a total of 4100 mocks. The mock centers are
randomly chosen inside the box. The mocks are extracted in
a way that they come as close as possible to the radial dis-
tribution of real catalogues. The HR simulation is a single
realization in a much bigger 6.592h−1Gpc box. As such, we
extract 5000 randomly distributed mocks.
We create mocks of three different peculiar velocity surveys
from the simulations: i) The “DEEP” compilation includes
103 SNIa (Tonry et al. 2003), 70 SC Tully-Fisher
(TF) clusters (Giovanelli et al. 1998b; Dale et al. 1999a), 56
SMAC fundamental plane (FP) clusters (Hudson et al. 1999,
2004), 50 EFAR FP clusters (Colless et al. 2001) and 15 TF
clusters (Willick 1999). The DEEP catalogue consists of 294
data points with a characteristic MLE depth of 50 h−1Mpc,
calculated using $\sum w_n r_n / \sum w_n$, where the MLE weights are
$w_n = 1/(\sigma_n^2 + \sigma_*^2)$. We assume σ∗ = 150 km/s. ii) The SFI++
catalogue (Masters et al. 2006; Springob et al. 2007, 2009)
is the densest and most complete peculiar velocity survey
of field spirals to date. We use the data from the corrected
dataset (Springob et al. 2009); the sample consists of 2821
TF field galaxies. The characteristic depth is 34 h−1Mpc.
iii) The “COMPOSITE” catalogue is a compilation of the
DEEP and SFI++ catalogues as well as the group SFI++
catalogue (Springob et al. 2009), the ENEAR (da Costa
et al. 2000b; Bernardi et al. 2002; Wegner et al. 2003) survey
and a surface brightness fluctuations (SBF) survey (Tonry
et al. 2001). With 4481 data points, the COMPOSITE
catalogue has a characteristic depth of 33 h−1Mpc. The DEEP
and SFI++ catalogues are completely independent whereas
the COMPOSITE is a compilation of these and other
catalogues. For further details on these catalogues see Paper I
and II. We have used these particular catalogues to investigate
the effect of geometry and density on our results: we want
to compare the results from a very sparse catalogue (DEEP)
with those from the better sky coverage and higher density of
the COMPOSITE catalogue. We chose the SFI++ catalogue as an
intermediate case study. We tested our MV formalism on the DEEP,
SFI++ and COMPOSITE mocks extracted from the LD
and HR simulations. As we mentioned earlier, we extracted
4100 mocks from the LD simulations and 5000 from the
HR simulation. The results based on the 5000 mock surveys
from the HR simulation are virtually identical to the ones
for the LD simulations. As such, in the rest of this paper, we
display results only for the 4100 mocks extracted from the
LD simulations. Moreover, since our results for the SFI++
catalogue are very similar to the ones for the DEEP and
COMPOSITE catalogues, we do not display SFI++ results.
Figure 2. Top row: COMPOSITE catalogue (left) and its radial
distribution (right). Bottom row: COMPOSITE mock catalogue
(left) and its radial distribution (right). The mock does not have
as many nearby objects as there are in the COMPOSITE catalogue.
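The characteristic MLE depth quoted for each catalogue is a weighted mean distance, $\sum w_n r_n / \sum w_n$ with $w_n = 1/(\sigma_n^2 + \sigma_*^2)$. A minimal standard-library sketch is given below; the function name `mle_depth` and its argument names are assumptions made for this example, and the default $\sigma_*$ = 150 km/s is the value assumed in the text.

```python
def mle_depth(distances, sigma_n, sigma_star=150.0):
    """Characteristic MLE depth: sum(w_n r_n) / sum(w_n).

    distances  : radial distances r_n (e.g. in h^-1 Mpc).
    sigma_n    : velocity measurement uncertainties in km/s.
    sigma_star : 1-D velocity dispersion in km/s.
    """
    weights = [1.0 / (s ** 2 + sigma_star ** 2) for s in sigma_n]
    return sum(w * r for w, r in zip(weights, distances)) / sum(weights)
```

When all uncertainties are equal this reduces to the plain mean distance. When uncertainties grow with distance, as is typical for peculiar velocity data, the depth falls below the mean distance because distant objects are down-weighted.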
In Figs. 1 − 2, we show the DEEP and COMPOS-
ITE real catalogues (top rows) and a sample mock cata-
logue (bottom rows). The N-body simulations do not have
as many close by objects as there are in the COMPOSITE
catalogue, which is why the COMPOSITE mocks match the
radial distribution only beyond ∼ 50h−1Mpc.
Once we have identified a random point in the simula-
tion box, we extract a set of galaxies that has the same radial
selection function about this point as the catalogue we are
creating mocks of. To make the mocks more realistic, we also
impose a 10° latitude zone-of-avoidance cut. From the simulations
we find the angular position, the true line-of-sight
peculiar velocity vs and the redshift cz = ds + vs for each
mock galaxy, where ds is the true radial distance of the mock
galaxy from the random center we selected, all in km/s. We
then perturb the true radial distance ds of the mock galaxy
with a velocity error drawn from a Gaussian distribution of
width equal to the corresponding real galaxy’s velocity er-
ror, σn. Thus, dp = ds+δd, where dp is the perturbed radial
distance of the mock galaxy (in km/s) and δd is the veloc-
ity error. The mock galaxy’s measured line-of-sight peculiar
velocity vp is then assigned to be vp = cz − dp, where cz
is the redshift we found above. The reason for this proce-
dure is that the weight we assign to each galaxy in the mock
catalogues will then be similar to the weights of the real cat-
alogues, since these depend on the radial distribution errors
of the survey objects. However, since the weights also de-
pend on the angular distribution of objects, which are more
clustered in the real surveys than the mock surveys, we will
see below that the large scale flow estimators obtained from
the mock surveys are significantly better than those from
the real surveys.
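The perturbation step described above (cz = ds + vs, then dp = ds + δd and vp = cz − dp) can be sketched as follows using only the Python standard library. The function name `make_mock_measurements` and the tuple layout of its output are assumptions made for this example; the radial selection and the zone-of-avoidance cut are not included.

```python
import random

def make_mock_measurements(d_true, v_true, sigma_n, rng=None):
    """Return a list of (cz, d_p, v_p) tuples, one per mock galaxy.

    d_true  : true radial distances d_s, in km/s.
    v_true  : true line-of-sight peculiar velocities v_s, in km/s.
    sigma_n : velocity errors of the corresponding real galaxies, in km/s.
    rng     : optional random.Random instance, for reproducibility.
    """
    rng = rng or random.Random()
    mocks = []
    for d_s, v_s, sigma in zip(d_true, v_true, sigma_n):
        cz = d_s + v_s                   # redshift of the mock galaxy
        delta_d = rng.gauss(0.0, sigma)  # error drawn from a Gaussian of width sigma_n
        d_p = d_s + delta_d              # perturbed radial distance
        v_p = cz - d_p                   # measured line-of-sight peculiar velocity
        mocks.append((cz, d_p, v_p))
    return mocks
```

Since vp = cz − dp = vs − δd, the error on the mock peculiar velocity has the same Gaussian width as the distance perturbation, which is what ties the mock errors to those of the real galaxies.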
In Fig. 3, we show the probability distribution for the
4100 MV-weighted bulk flow moments ui (solid) and the
Gaussian-weighted moments Vi (dashed) for the LasDamas
simulations. The distributions of both the MV-estimated
bulk flow moments (solid histogram) and the true moments
(dashed histogram) are Gaussian. This is as expected for
large scale moments and reflects the fact that nonlinear
motions, which can lead to non-Gaussian tails in the velocity
distributions for individual galaxies, have been effectively
averaged out. The widths of the distributions match well
with the expectations of linear theory, which predicts a
110 km/s width, virtually identical to the ones shown in the
figure. Further, as predicted in Watkins et al. (2009) and
Feldman et al. (2010), the probability of getting a bulk flow
of the magnitude found is indeed ≈ 1%.
4 BULK FLOW MOMENTS
For each of the 4100 LD (5000 HR) mocks, we estimated
the bulk flow moments {ux,uy,uz} using our MV weighting
scheme (Sec. 2). We then compared the results to the
Gaussian-weighted bulk moments {Vx,Vy,Vz} calculated by
going to the same central points for each of the 4100 LD
(5000 HR) mock catalogues and averaging the velocities of
all the galaxies in the simulation box, each galaxy being
weighted by a Gaussian weight of width RG = 50h−1Mpc.
Here we present our results from the LasDamas and Horizon
Run simulations.
Figure 3. Histograms showing the normalized probability distribution
for the MV and the Gaussian-weighted bulk flow moments
for the directions x and z in the top and bottom rows respectively
for the two types of mock catalogs: DEEP (left column)
and COMPOSITE (right column) as in Fig. 5. The MV-weighted
bulk flow moments ui are shown as the solid histogram. The
Gaussian-weighted “true” moments Vi are shown as the dashed
histogram. We also superimpose a Gaussian centered at zero with
width equal to the calculated RMS. It is clear that both the MV
and Gaussian-weighted moments are Gaussian distributed. We do
not show the y direction since it is statistically identical to the x
direction. The SFI++ catalog shows very similar trends and so
was not displayed.
In Fig. 4, we show the bulk flow moments in the x and z
directions (in the top and bottom rows respectively) for the
4100 DEEP (left column) and COMPOSITE (right column)
mock catalogues, extracted from the 41 LasDamas simula-
tion boxes. The MV-weighted moments ui and the corre-
sponding Gaussian-weighted “true” moments Vi are plotted
against each other and the positive correlation between the
two is clearly visible. A perfect correlation would put all
4100 points on the diagonal.
In Fig. 5 we show the probability distribution for the
difference between the MV-weighted bulk flow moments ui
and the Gaussian-weighted ideal moments Vi for the 4100
mock surveys from the LasDamas simulations. A Gaussian
centered at zero and with the same width as the probabil-