
arXiv:1205.4240v2 [astro-ph.CO] 20 Jul 2012

Mon. Not. R. Astron. Soc. 000, 000–000 (0000). Printed 23 July 2012 (MN LaTeX style file v2.2)

How covariant is the galaxy luminosity function?

Robert E. Smith1,2⋆

1Institute for Theoretical Physics, University of Zurich, Zurich CH 8037

2Argelander-Institute for Astronomy, Auf dem Hügel 71, D-53121 Bonn, Germany

23 July 2012

ABSTRACT

We investigate the error properties of certain galaxy luminosity function (GLF) esti-

mators. Using a cluster expansion of the density field, we show how, for both volume

and flux limited samples, the GLF estimates are covariant. The covariance matrix

can be decomposed into three pieces: a diagonal term arising from Poisson noise; a

sample variance term arising from large-scale structure in the survey volume; an occu-

pancy covariance term arising due to galaxies of different luminosities inhabiting the

same cluster. To evaluate the theory one needs: the mass function and bias of clus-

ters, and the conditional luminosity function (CLF). We use a semi-analytic model

(SAM) galaxy catalogue from the Millennium run N-body simulation and the CLF of

Yang et al. (2003) to explore these effects. The GLF estimates from the SAM and the

CLF qualitatively reproduce results from the 2dFGRS. We also measure the luminos-

ity dependence of clustering in the SAM and find reasonable agreement with 2dFGRS

results for bright galaxies. However, for fainter galaxies, L < L∗, the SAM overpredicts

the relative bias by ∼10-20%. We use the SAM data to estimate the errors in the GLF

estimates for a volume limited survey of volume V ∼ 0.13 h⁻³ Gpc³. We find that

different luminosity bins are highly correlated: for L < L∗ the correlation coefficient

is r > 0.5. Our theory is in good agreement with these measurements. These strong

correlations can be attributed to sample variance. For a flux-limited survey of similar

volume, the estimates are only slightly less correlated. We explore the importance of

these effects for GLF model parameter estimation. We show that neglecting to take

into account the bin-to-bin covariances, induced by the large-scale structures in the

survey, can lead to significant systematic errors in best-fit parameters. For Schechter

function fits, the most strongly affected parameter is the characteristic luminosity L∗,

which can be significantly underestimated.

Key words: Cosmology: large-scale structure of Universe. Galaxies: abundances.

1 INTRODUCTION

The galaxy luminosity function (hereafter GLF) is one of

the central pillars of modern observational cosmology. Com-

monly denoted φ(L), it informs us about the comoving space

density of galaxies, per unit luminosity interval L to L+dL.

Its central importance originates through the following: it

enables one to quantify the mean space density of galaxies

in a patch of space; it provides a means for quantifying the

evolution over time of the galaxy population in the Universe;

it is one of the main tools for testing models of galaxy for-

mation; finally it plays a central role in large-scale structure

work, in the construction of mock galaxy catalogues and

sample weighting for clustering estimates.

⋆ E-mail: res@physik.unizh.ch

There is a vast and rich literature on this subject

that goes back to Hubble (1936); for developments through to the

mid-1990s see the reviews by Binggeli et al. (1988) and Strauss & Willick (1995), and

references therein. Over the past decade the invention of

massive multi-object spectrographs has revolutionised this

area of research and has led to an explosion in the num-

ber of available redshifts with which to estimate the GLF:

at low redshifts there has been the 2dFGRS (Folkes et al.

1999; Cole et al. 2001; Norberg et al. 2002; Croton et al.

2005), the SDSS (Blanton et al. 2001, 2003), and GAMA

(Loveday et al. 2012) surveys; and at higher redshifts the

VVDS (Ilbert et al. 2005), DEEP2 (Willmer et al. 2006;

Faber et al. 2007), and the zCOSMOS (Zucca et al. 2009).

Our current astrophysical understanding of what shapes

the GLF is evolving rapidly, as our understanding of how

galaxies form also rapidly improves (Kauffmann & Charlot

1998; Kauffmann et al. 1999; Cole et al. 2000; Benson et al.

2003; Croton et al. 2006; Bower et al. 2010). This in part

owes to the large spatial volumes that can now be simulated

with sufficiently high spatial resolution to follow

the growth of dark matter haloes which may host faint

galaxies (Springel et al. 2005). One important insight that

has emerged is that there is a quantity more fundamen-

tal than the GLF, and that is the conditional luminosity

function (hereafter CLF) (Yang et al. 2003; Cooray 2006).

This informs us that the probability of obtaining a galaxy

of luminosity L, is conditioned on the mass M of the host

halo. This idea is supported by the results that the GLF is

different between dense and void regions (see for example

Beijersbergen et al. 2002; Croton et al. 2005). This galaxy–

halo connection then provides us with a means for connect-

ing the estimates of the GLF with the underlying large-scale

structures (LSS).

Whilst the astrophysics that shapes the GLF has been

widely studied, our understanding of the statistical significance

of GLF estimates is far from complete. As we enter

an era where the parameterisations of ‘good galaxy forma-

tion models’ are to be compared one needs a more concrete

way of assessing the goodness of fit. Moreover, we would

also like to be able to compare results from different sur-

veys, to make conclusions about the evolution of the galaxy

population. Again, this requires us to have a more con-

crete method for interpreting features and differences. In

this paper we aim to provide a theoretical framework within

which one can calculate how large-scale structures affect

not only the shape of the GLF, but also the

statistical properties of its errors. In passing, we note that

Trenti & Stiavelli (2008) explored how cosmic variance im-

pacts the GLF parameters for deep high redshift surveys. We

also note that Robertson (2010) explored a Fisher matrix ap-

proach to forecasting the expected GLFs for future high red-

shift surveys. He showed that sample covariance could cor-

relate the galaxy counts in different magnitude bins. How-

ever, as we will show, these authors failed to capture the full

story. We believe that the formalism presented herein goes

some way beyond these earlier approaches.

The paper breaks down as follows: in §2 we present an

overview of some commonly used GLF estimators. In §3 we

examine the expectation and covariance of the GLF estimator

for volume limited samples. In §4 we do the same but

for flux limited samples. In §5 we describe empirical results

for the GLF. We also describe the SAM galaxy catalogues

that we use and also the CLF model that we employ. Here

we also explore the luminosity dependence of galaxy clus-

tering. In §6 we present our results for the error properties

of the GLF in volume and flux-limited surveys. In §7 we

explore the importance of including the full data covariance

matrix for model fitting and parameter estimation. Finally,

in §8 we summarise our findings and draw our conclusions.

2 ESTIMATING LUMINOSITY FUNCTIONS

2.1 ΛCDM paradigm

Let us begin our theoretical development by following the

standard paradigm for galaxy formation in a ΛCDM uni-

verse: we assert that galaxies can only form inside dark mat-

ter haloes, and that halo formation, and hence galaxy forma-

tion, takes place hierarchically. Thus, massive galaxies are

assembled through the accretion and merger of smaller ones.

Given a dark matter halo of mass M, the detailed theory

of galaxy formation will tell us important information

such as the number, luminosities and types of galaxies that

form inside such haloes. This of course will be a stochastic

process and the exact number will vary between haloes.

2.2 Overview of estimators

One of the most basic observational tools for testing our

understanding of galaxy formation models is through the

GLF. Over the years there have been many approaches to

constructing estimators for the GLF. The simplest is to com-

pute:

$$ \mathrm{E1}:\quad \hat{\phi}_1(L_\mu) = \frac{N_g(L_\mu)}{V_s\,\Delta L_\mu}\,, \tag{1} $$

where Ng(Lµ) is the number of galaxies of luminosity Lµ in

the bin ∆Lµ, and Vs is the total sampled survey volume.
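As an illustration of Eq. (1), the following is a minimal sketch of estimator E1 applied to a list of galaxy luminosities. The function name `glf_e1`, the half-open binning convention and the toy catalogue are assumptions made for this example only; they are not taken from the paper.

```python
def glf_e1(luminosities, bin_edges, survey_volume):
    """Estimator E1: counts per luminosity bin divided by V_s * Delta L."""
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for lum in luminosities:
        for mu in range(n_bins):
            # Half-open bins [edge_mu, edge_mu+1); galaxies outside all bins are ignored.
            if bin_edges[mu] <= lum < bin_edges[mu + 1]:
                counts[mu] += 1
                break
    return [
        counts[mu] / (survey_volume * (bin_edges[mu + 1] - bin_edges[mu]))
        for mu in range(n_bins)
    ]

# Hypothetical toy catalogue: five galaxies in a volume of 10 (arbitrary units).
phi_hat = glf_e1([0.5, 1.5, 1.7, 2.2, 3.9], [0.0, 1.0, 2.0, 4.0], 10.0)
print(phi_hat)
```

The three bins contain 1, 2 and 2 galaxies, and the last bin is twice as wide as the others, so the first and last estimates coincide even though their raw counts differ.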

For flux limited surveys this proves to be a biased es-

timator, since for faint galaxies the volume out to which

one may observe these objects is significantly smaller than

for the case of bright galaxies. This can be corrected for by

adopting the $V_{\max}$ estimator of Schmidt (1968):

$$ \mathrm{E2}:\quad \hat{\phi}_2(L_\mu) = \frac{1}{\Delta L_\mu}\sum_{\mu=1}^{N_g(L_\mu)} \frac{1}{V^{\max}_\mu}\,, \tag{2} $$

where $V^{\max}_\mu \equiv V_{\max}(L_\mu)$ is the maximum volume that a galaxy with a luminosity $L_\mu$ could have been found in, given the flux limit of the survey $m_{\rm lim}$ (for further details see §4). For a discussion of estimators E1 and E2 see Felten (1976) and references therein.

It was noted that for shallow and narrow surveys estimators E1 and E2 would be ‘biased’ by the presence of large-scale over/underdense regions. Subsequently, a further set of estimators was developed to try to remove this so-called bias (Turner 1979; Sandage et al. 1979; Kirshner et al. 1979; Efstathiou et al. 1988). At the heart of these approaches is the assumption that the joint probability of obtaining a galaxy with luminosity $L_\mu$ in interval $\Delta L_\mu$, and spatial position in the volume element $d^3x$, is the product of two independent probability density functions (PDFs):

$$ p(L_\mu,\mathbf{x})\,dL_\mu\,d^3x = p(L_\mu)\,p(\mathbf{x})\,dL_\mu\,d^3x\,, \tag{3} $$

where the 1-point luminosity PDF is

$$ p(L) = \frac{\phi(L)}{\Phi(L_{\min})}\,; \qquad \Phi(L) \equiv \int_{L}^{\infty} dL'\,\phi(L')\,, \tag{4} $$

where Lmin is the lowest luminosity galaxy detectable in the

sample volume, given selection criteria. If x is the location

of a random point then the probability of finding a galaxy

in a cell of volume δV is given by:

$$ P(\mathbf{x}) = p(\mathbf{x})\,d^3x = N\,\delta V/V_s = \bar{n}\,\delta V\,. \tag{5} $$

However, if one pre-selects a cluster region centred on $\mathbf{x}_c$, then the probability is enhanced, $P(\mathbf{x}|\mathbf{x}_c) = \bar{n}\,\delta V\,[1+\xi_{gc}(r)]$, where $r = |\mathbf{x}-\mathbf{x}_c|$ and $\xi_{gc}(r)$ is the cross-correlation function between the cluster centre and galaxies in the cluster (Peebles 1980). Then, for example for estimator E1, the luminosity function estimate would be:

$$ \mathrm{E1}:\quad \hat{\phi}(L_\mu) = \frac{N_g(L_\mu)}{V_s\,\Delta L_\mu} = \langle\phi(L_\mu)\rangle\left[1+\sigma^2\right]\,, \tag{6} $$

where

$$ \sigma^2 \equiv \sum_{i=1}^{N_g(L_\mu)} \xi_{gc}(r_i)/N_g(L_\mu)\,. \tag{7} $$

Turner (1979) saw that, under the assumption of

Eq.(3), if one constructed the following quantity, then the

environmental dependence of the counts would drop out:

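$$
\mathrm{E3}:\quad \frac{dN_g(L_\mu)}{N_g[>L_\mu,\ \chi\le\chi_{\max}(L_\mu)]}
= \frac{\phi(L_\mu)\,dL_\mu \times p(\mathbf{x})V_s}{\int_{L_\mu}^{\infty} dL'\,\phi(L') \times p(\mathbf{x})V_s}
= \frac{\phi(L_\mu)\,dL_\mu}{\int_{L_\mu}^{\infty} dL'\,\phi(L')}\,, \tag{8}
$$

where $N_g[>L_\mu,\ \chi\le\chi_{\max}(L_\mu)]$ denotes the total number of galaxies brighter than $L_\mu$ with distance less than $\chi_{\max}(L_\mu)$.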

Unfortunately, the estimator E3 is also biased – the real

world is more complicated (see Cole 2011, for additional dis-

cussion of this). The bias can be attributed to the fact that

p(L,x) is not separable: bright/faint galaxies tend to inhabit

high/low density environments (Norberg et al. 2002). To il-

lustrate how this bias operates, let us consider the following

toy example. Suppose our survey consists of two clusters

at the same distance from the observer, and let cluster one

contain galaxies of luminosity L1 and be of mass M1, and

let cluster two contain galaxies of L2 > L1 and be of mass

M2 > M1. Since higher mass dark matter haloes have

more extended profiles and are also more biased with respect

to the underlying dark matter than lower mass haloes,

we have: ξgc(r|L2) > ξgc(r|L1). On construction of Turner’s

estimator we find:

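$$
\frac{dN(L_1)}{N[>L_1,\ \chi\le\chi_{\max}(L)]}
= \frac{V_s\Delta L\,\langle\phi(L_1)\rangle\left[1+\sigma_1^2(L_1)\right]}{V_s\Delta L\,\left\{\langle\phi(L_1)\rangle\left[1+\sigma_1^2(L_1)\right] + \langle\phi(L_2)\rangle\left[1+\sigma_2^2(L_2)\right]\right\}}
= \left\{1 + \frac{\langle\phi(L_2)\rangle}{\langle\phi(L_1)\rangle}\,\frac{\left[1+\sigma_2^2(L_2)\right]}{\left[1+\sigma_1^2(L_1)\right]}\right\}^{-1}\,, \tag{9}
$$

where in the above we have defined

$$ \sigma_j^2(L) \equiv \frac{1}{N(L)}\sum_{i=1}^{N(L)} \xi_{gc}(\mathbf{x}_i-\mathbf{x}_{c,j}|L)\,. \tag{10} $$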

Thus we see that, in this toy-model case, the estimator is

biased low for the lower luminosity galaxies.
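To make the size of this effect concrete, the sketch below evaluates the right-hand side of Eq. (9) for the two-cluster toy model. The numerical values of the mean densities and of the clustering terms are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def turner_ratio(phi1, phi2, sigma1_sq, sigma2_sq):
    """Right-hand side of Eq. (9): dN(L1) / N[>L1] for the two-cluster toy model."""
    return 1.0 / (1.0 + (phi2 / phi1) * (1.0 + sigma2_sq) / (1.0 + sigma1_sq))

phi1, phi2 = 3.0, 1.0                             # hypothetical mean densities
unbiased = turner_ratio(phi1, phi2, 0.0, 0.0)     # no clustering: phi1 / (phi1 + phi2)
biased = turner_ratio(phi1, phi2, 1.0, 3.0)       # sigma_2^2 > sigma_1^2
print(unbiased, biased)
```

With these inputs the unclustered ratio is 0.75, whereas the clustered case gives 0.6: the fraction attributed to the fainter galaxies is pulled down because the brighter galaxies sit in the more strongly clustered halo.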

In fact, as we will show in the following sections, the bias

associated with estimators E1 and E2 approaches zero, provided

that the sample volume is sufficiently large. For estimator E3,

by contrast, the estimator remains biased because

ξgc(r|L2) ≠ ξgc(r|L1). We shall reserve

a more detailed study of Turner’s estimator and the

bias induced by neglecting density-luminosity correlations

for future study.

3 VOLUME LIMITED GALAXY SAMPLES

Let us consider the simplest estimator E1, which one may

apply to volume limited surveys. We are interested in com-

puting the expectation and covariance.

3.1 Expectation of estimator

Consider some large cubical patch of the Universe, of volume

$V_s$, and containing $N^c$ clusters that possess some distribution of masses. Let us subdivide this set of clusters into a set of $N_M$ mass bins, where the $\alpha$th mass bin contains $N^c_\alpha$ clusters. We shall denote the number of galaxies with luminosities between $L_\mu-\Delta L_\mu/2$ and $L_\mu+\Delta L_\mu/2$ that are hosted by the $i$th halo of the $\alpha$th mass bin by $N^g_{i,\alpha,\mu}$.

With the above definitions, the GLF estimator E1 for volume limited samples can be written:

$$ \mathrm{E1}:\quad \hat{\phi}(L_\mu) = \frac{1}{V_s\,\Delta L_\mu}\sum_{\alpha=1}^{N_M}\sum_{i=1}^{N^c_\alpha} N^g_{i,\alpha,\mu}\,. \tag{11} $$

We now wish to compute the expectation of this estimator.

We shall write this as,

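$$ \left\langle\hat{\phi}(L_\mu)\right\rangle = \frac{1}{V_s\,\Delta L_\mu}\sum_{\alpha=1}^{N_M}\left\langle\sum_{i=1}^{N^c_\alpha} N^g_{i,\alpha,\mu}\right\rangle_{g,P,s}\,, \tag{12} $$

where in the above $\langle\dots\rangle_{g,P,s}$ represents an averaging over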

the ensemble: the subscript g denotes an averaging over the

sampling distribution for placing galaxies into haloes; the

subscript P denotes an averaging over sampling clusters into

the given realization of the density field; and the subscript

s denotes an averaging over the density fluctuations within

the volume.

We shall assume that the number of galaxies occupying

a given dark matter halo is a Poisson process:

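$$ P(N^g_{i,\alpha,\mu}|\lambda_{\alpha,\mu}) = \frac{\lambda^{N^g_{i,\alpha,\mu}}\exp[-\lambda]}{N^g_{i,\alpha,\mu}!}\,, \tag{13} $$

where $\lambda \equiv N_g(M_\alpha,L_\mu)$ is the expected number of galaxies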

in the Lµ luminosity bin, and for a halo of mass Mα. Actu-

ally, the above sampling distribution is not of great concern,

but what will be of importance will be the independence

of the distributions, i.e. the number of galaxies occupying a

given cluster depends only on the physical properties of that

cluster.
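The covariance calculation in §3.2 relies on the second moment of the Poisson distribution of Eq. (13), $\langle N^2\rangle = \lambda + \lambda^2$. The following minimal sketch checks this by summing the probability mass function directly; the function name `poisson_moments` and the truncation `k_max` are choices made for this illustration.

```python
import math

def poisson_moments(lam, k_max=200):
    """First and second moments of the Poisson distribution of Eq. (13),
    obtained by summing the probability mass function term by term."""
    pmf = math.exp(-lam)          # P(N = 0)
    mean = 0.0
    second = 0.0
    for k in range(1, k_max + 1):
        pmf *= lam / k            # recurrence P(k) = P(k-1) * lam / k
        mean += k * pmf
        second += k * k * pmf
    return mean, second

mean, second = poisson_moments(2.5)
print(mean, second, 2.5 + 2.5**2)
```

For $\lambda = 2.5$ the summed second moment agrees with $\lambda(1+\lambda) = 8.75$ to rounding error.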

One immediate consequence of this is that we may com-

pute the average over the galaxy population separately, and

hence write E1 as:

$$
\left\langle\hat{\phi}(L_\mu)\right\rangle
= \frac{1}{V_s\,\Delta L_\mu}\sum_{\alpha=1}^{N_M}\left\langle\sum_{i=1}^{N^c_\alpha}\left\langle N^g_{i,\alpha,\mu}\right\rangle_g\right\rangle_{P,s}
= \frac{1}{V_s\,\Delta L_\mu}\sum_{\alpha=1}^{N_M} N_g(M_\alpha,L_\mu)\,\langle N^c_\alpha\rangle_{P,s}\,, \tag{14}
$$

where in the last line we identified $N_g(M_\alpha,L_\mu) \equiv \langle N^g_{i,\alpha,\mu}\rangle_g$, which tells us the expected number of galaxies with luminosity in the interval $[L_\mu-\Delta L_\mu/2,\ L_\mu+\Delta L_\mu/2]$ that occupy a cluster of mass $M_\alpha$.

In order to proceed further, we need to compute the expected number of clusters in the $\alpha$th mass bin, $\langle N^c_\alpha\rangle_{P,s}$. This may be done following the procedure described in Smith & Marian (2011) (summarised in Appendix A for convenience). Following this procedure gives:

$$ \langle N^c_\alpha\rangle_{P,s} = V_s\,n_\alpha\,, \tag{15} $$

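where the number density of clusters in the $\alpha$th mass bin is

$$ n_\alpha \equiv \int_{M_\alpha-\Delta M_\alpha/2}^{M_\alpha+\Delta M_\alpha/2} dM\,n(M)\,, \tag{16} $$

and where $n(M)\,dM$ is the abundance of dark matter haloes in the mass interval $[M-dM/2,\ M+dM/2]$. On inserting this expression into Eq.(14) we find:

$$ \left\langle\hat{\phi}(L_\mu)\right\rangle = \frac{1}{V_s\,\Delta L_\mu}\sum_{\alpha=1}^{N_M} N_g(M_\alpha,L_\mu)\,V_s\,n_\alpha\,. \tag{17} $$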

On taking the limit of small mass bins and assuming that

the mass function varies slowly across the bins, then from

the mean value theorem, we have

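$$ n_\alpha \approx n(M_\alpha)\,\Delta M_\alpha\,, \tag{18} $$

and we may convert Eq.(17) to an integral. Finally, on using the CLF model of Yang et al. (2003), for which $\Phi(L_\mu|M_\alpha) \equiv N_g(M_\alpha,L_\mu)/\Delta L_\mu$, we have:

$$ \left\langle\hat{\phi}(L_\mu)\right\rangle = \int dM\,n(M)\,\Phi(L_\mu|M)\,. \tag{19} $$

Thus for volume limited samples, estimator E1 is unbiased.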
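The passage from the binned sum of Eq. (17) to the integral of Eq. (19) can be checked numerically. The sketch below is a toy illustration only: the mass function `mass_function` and the conditional luminosity function `clf` are hypothetical analytic forms chosen for convenience, not the Yang et al. (2003) model or any fitted mass function, and the bin width is taken to first order in the logarithmic spacing.

```python
import math

def mass_function(m):
    """Toy halo abundance n(M); a hypothetical form in arbitrary units."""
    return m ** -1.9 * math.exp(-m / 100.0)

def clf(lum, m):
    """Toy conditional luminosity function Phi(L|M); hypothetical."""
    l_c = math.sqrt(m)                     # characteristic luminosity grows with mass
    return (m / 10.0) * math.exp(-lum / l_c) / l_c

def glf_from_mass_bins(lum, n_bins, m_lo=1.0, m_hi=1000.0):
    """Eq. (17) with n_alpha ~ n(M_alpha) * dM_alpha (Eq. 18), on logarithmic mass bins."""
    dlnm = math.log(m_hi / m_lo) / n_bins
    total = 0.0
    for alpha in range(n_bins):
        m = m_lo * math.exp((alpha + 0.5) * dlnm)   # logarithmic bin centre
        dm = m * dlnm                                # bin width, to first order in dlnm
        total += mass_function(m) * dm * clf(lum, m)
    return total

coarse = glf_from_mass_bins(5.0, 20)      # 20 mass bins
fine = glf_from_mass_bins(5.0, 2000)      # close to the integral of Eq. (19)
print(coarse, fine)
```

For these smooth toy inputs even a coarse set of mass bins gives a sum close to the finely sampled one, which is the sense in which the mean value theorem step of Eq. (18) is harmless.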

3.2 Estimator covariance

Let us compute the covariance matrix that we would expect

for estimator E1. The covariance matrix is defined to be,

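$$ C_{\mu\nu} \equiv \left\langle\hat{\phi}_\mu\hat{\phi}_\nu\right\rangle - \left\langle\hat{\phi}_\mu\right\rangle\left\langle\hat{\phi}_\nu\right\rangle\,, \tag{20} $$

where from now on we make use of the compact notation $\phi_\mu \equiv \phi(L_\mu)$. Focusing on the first term on the right-hand side, and on inserting Eq.(11), we find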

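$$
\left\langle\hat{\phi}_\mu\hat{\phi}_\nu\right\rangle = \frac{1}{\Delta L_\mu\,\Delta L_\nu\,V_s^2}\sum_{\alpha=1}^{N_M}\sum_{\beta=1}^{N_M}
\left\langle\sum_{i=1}^{N^c_\alpha}\sum_{j=1}^{N^c_\beta}\left\langle N^g_{i,\alpha,\mu}N^g_{j,\beta,\nu}\right\rangle_g\right\rangle_{P,s}\,, \tag{21}
$$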

where again we have used the fact that the average over the

galaxies can be separated from the cluster sample. Consid-

ering the contents of the inner bracket, we see that this may

be rewritten as

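$$
\begin{aligned}
\left\langle N^g_{i,\alpha,\mu}N^g_{j,\beta,\nu}\right\rangle_g
&= \tilde{\epsilon}_{ij}\,\tilde{\epsilon}_{\alpha\beta}\,\tilde{\epsilon}_{\mu\nu}\left\langle N^g_{i,\alpha,\mu}\right\rangle_g\left\langle N^g_{j,\beta,\nu}\right\rangle_g
+ \delta^K_{ij}\,\tilde{\epsilon}_{\alpha\beta}\,\tilde{\epsilon}_{\mu\nu}\left\langle N^g_{i,\alpha,\mu}\right\rangle_g\left\langle N^g_{j,\beta,\nu}\right\rangle_g \\
&\quad + \cdots + (5\ \mathrm{terms}) + \delta^K_{ij}\,\delta^K_{\alpha\beta}\,\delta^K_{\mu\nu}\left\langle (N^g_{i,\alpha,\mu})^2\right\rangle_g\,,
\end{aligned} \tag{22}
$$

where in the above we have made use of a modified Levi-Civita symbol $\tilde{\epsilon}_{ij} = 1$ if $i \ne j$ and 0 otherwise, and we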

have used the independence of the sampling distributions to

separate the expectations of the products. Consider the final

term in the above expression, on using Eq.(13), we see that

this piece can be rewritten as,

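$$
\left\langle (N^g_{i,\alpha,\mu})^2\right\rangle_g
= \left\langle N^g_{i,\alpha,\mu}\right\rangle_g^2 + \left\langle N^g_{i,\alpha,\mu}\right\rangle_g
= N_g(M_\alpha,L_\mu)\left[1 + N_g(M_\alpha,L_\mu)\right]\,. \tag{23}
$$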

On inserting this back into Eq.(22), we may resum all terms

and find that the expression simplifies to be,

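$$
\left\langle N^g_{i,\alpha,\mu}N^g_{j,\beta,\nu}\right\rangle_g
= N_g(M_\alpha,L_\mu)\,N_g(M_\beta,L_\nu) + N_g(M_\alpha,L_\mu)\,\delta^K_{i,j}\,\delta^K_{\alpha,\beta}\,\delta^K_{\mu,\nu}\,. \tag{24}
$$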

If we now return to Eq.(21), then on using the above rela-

tion, we find:

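$$
\left\langle\hat{\phi}_\mu\hat{\phi}_\nu\right\rangle = \frac{1}{\Delta L_\mu\,\Delta L_\nu\,V_s^2}\sum_{\alpha=1}^{N_M}\sum_{\beta=1}^{N_M}
\left[\left\langle N^c_\alpha N^c_\beta\right\rangle_{P,s} N_g(M_\alpha,L_\mu)\,N_g(M_\beta,L_\nu)
+ \left\langle N^c_\alpha\right\rangle_{P,s} N_g(M_\alpha,L_\mu)\,\delta^K_{\alpha,\beta}\,\delta^K_{\mu,\nu}\right]\,. \tag{25}
$$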

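In order to proceed further we require an expression for the product $\langle N^c_\alpha N^c_\beta\rangle_{P,s}$. Again, this may be obtained by following the arguments presented in Smith & Marian (2011) (summarised in Appendix A). Thus we have,

$$ \left\langle N^c_\alpha N^c_\beta\right\rangle_{P,s} \equiv S_{\alpha\beta} + V_s^2\,n_\alpha n_\beta + V_s\,n_\alpha\,\delta^K_{\alpha,\beta}\,. \tag{26} $$

The first term takes into account the excess variance above random in the number counts, which arises due to the spatial correlations of the clusters:

$$ S_{\alpha\beta} \equiv V_s^2\,n_\alpha n_\beta\,b_\alpha b_\beta\,\sigma_V^2\,, \tag{27} $$

where in the above we have defined the effective bias of the clusters in the $\alpha$th mass bin to be,

$$ b_\alpha = \frac{1}{n_\alpha}\int_{M_\alpha-\Delta M_\alpha/2}^{M_\alpha+\Delta M_\alpha/2} dM\,b(M)\,n(M)\,, \tag{28} $$

and also introduced the volume variance

$$ \sigma_V^2 \equiv \int\frac{d^3k}{(2\pi)^3}\,|W(\mathbf{k})|^2\,P(k)\,, \tag{29} $$

where $W(\mathbf{k})$ is the survey window function and $P(k)$ is the matter power spectrum.

Substituting Eq.(26) into Eq.(25) gives

$$
\left\langle\hat{\phi}_\mu\hat{\phi}_\nu\right\rangle = \frac{1}{\Delta L_\mu\,\Delta L_\nu\,V_s^2}\left\{\sum_{\alpha=1}^{N_M}\sum_{\beta=1}^{N_M} N_g(M_\alpha,L_\mu)\,N_g(M_\beta,L_\nu)\left[S_{\alpha\beta} + V_s^2\,n_\alpha n_\beta + V_s\,n_\alpha\,\delta^K_{\alpha,\beta}\right]
+ \sum_{\alpha=1}^{N_M} V_s\,n_\alpha\,N_g(M_\alpha,L_\mu)\,\delta^K_{\mu,\nu}\right\}\,. \tag{30}
$$

Using Eqs(27)–(29) in the above expression gives

$$
\begin{aligned}
\left\langle\hat{\phi}_\mu\hat{\phi}_\nu\right\rangle
&= \frac{1}{\Delta L_\mu\,\Delta L_\nu}\sum_{\alpha,\beta} n_\alpha n_\beta\,N_g(M_\alpha,L_\mu)\,N_g(M_\beta,L_\nu)\left[b_\alpha b_\beta\,\sigma_V^2 + 1\right] \\
&\quad + \frac{1}{\Delta L_\mu\,\Delta L_\nu\,V_s}\sum_{\alpha=1}^{N_M} n_\alpha\,N_g(M_\alpha,L_\mu)\,N_g(M_\alpha,L_\nu)
+ \frac{1}{\Delta L_\mu\,\Delta L_\nu\,V_s}\sum_{\alpha=1}^{N_M} n_\alpha\,N_g(M_\alpha,L_\mu)\,\delta^K_{\mu,\nu}\,.
\end{aligned} \tag{31}
$$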

Again, if the mass bins are sufficiently narrow, then we may

use the mean value theorem to make the following approximations: $n_\alpha \approx n(M_\alpha)\,\Delta M_\alpha$ and $b_\alpha \approx b(M_\alpha)$. This allows us

to transform the above expression into integrals over cluster

mass. Next, if we subtract off the second term on the right

hand side of Eq.(20), this gives us the covariance matrix of

the GLF. Note, that this simply removes the +1 from the

first term in square brackets in Eq.(31). Thus we find,

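$$
\begin{aligned}
C_{\mu\nu} = \frac{1}{\Delta L_\mu\,\Delta L_\nu}\Bigg[
&\int dM_1\int dM_2\,n(M_1)\,n(M_2)\,b(M_1)\,b(M_2)\,\sigma^2(V_s)\,N_g(M_1,L_\mu)\,N_g(M_2,L_\nu) \\
&+ \frac{1}{V_s}\int dM_1\,n(M_1)\,N_g(M_1,L_\mu)\,N_g(M_1,L_\nu)
+ \frac{1}{V_s}\int dM_1\,n(M_1)\,N_g(M_1,L_\mu)\,\delta^K_{\mu,\nu}\Bigg]\,.
\end{aligned} \tag{32}
$$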

The above expression may be written in a more compact

way by introducing the following expressions: the effective

bias of galaxies in luminosity bin Lµ,

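$$ b^g_\mu \equiv b^g(L_\mu) \equiv \frac{1}{\bar{n}^g_\mu}\int dM_1\,n(M_1)\,b(M_1)\,N_g(M_1,L_\mu)\,, \tag{33} $$

and the effective number density of galaxies in the luminosity bin $L_\mu$,

$$ \bar{n}^g_\mu \equiv \bar{n}^g(L_\mu) \equiv \int dM_1\,n(M_1)\,N_g(M_1,L_\mu)\,. \tag{34} $$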

On using these definitions in Eq.(32), we find:

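$$
C_{\mu\nu} = \phi(L_\mu)\,\phi(L_\nu)\,b^g(L_\mu)\,b^g(L_\nu)\,\sigma^2(V_s) + \frac{\phi(L_\mu)\,\delta^K_{\mu,\nu}}{V_s\,\Delta L_\mu}
+ \frac{1}{V_s}\int dM_1\,n(M_1)\,\frac{N_g(M_1,L_\mu)}{\Delta L_\mu}\,\frac{N_g(M_1,L_\nu)}{\Delta L_\nu}\,. \tag{35}
$$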

Finally, we may reexpress our result in terms of the CLF of

galaxies Φ(Lµ|M), as

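$$ C_{\mu\nu} = \phi_\mu\phi_\nu\,b^g_\mu b^g_\nu\,\sigma_V^2 + \frac{\phi_\mu\,\delta^K_{\mu,\nu}}{V_s\,\Delta L_\mu} + \Sigma_{\mu\nu}\,, \tag{36} $$

where we defined the ‘halo occupancy covariance’ to be

$$ \Sigma_{\mu\nu} \equiv \frac{1}{V_s}\int dM_1\,n(M_1)\,\Phi(L_\mu|M_1)\,\Phi(L_\nu|M_1)\,. \tag{37} $$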

Closer inspection of Eq.(36) reveals several interesting

points. The first term informs us that the presence/absence

of large-scale structures in the survey volume will en-

hance/suppress the number of galaxies in our estimates and

that this will lead to bin-to-bin correlations in the estimates

of the GLF. The second term is the standard Poisson error

term, which dominates in the limit of rare counts. The

third term is interesting, and tells us that, if our understand-

ing of galaxy formation is correct and galaxies only appear

inside haloes, then, even in the absence of structure, GLF

estimates are correlated. This owes to the fact that, if we

have a halo, then it most likely comes with a set of Φ(L|M)

galaxies and so the presence of one galaxy is correlated with

the presence of additional galaxies. Finally, we note that

Robertson (2010) wrote down terms similar to the first two

in our Eq.(36). However, owing to his over-simplistic model

for the number of galaxies hosted by a halo of a given mass,

he failed to obtain the halo occupancy covariance term.
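The structure of Eq. (36) can be made explicit with a short sketch that assembles the covariance matrix from its three pieces. The function name `glf_covariance` and all numerical inputs are hypothetical; in particular the occupancy term $\Sigma_{\mu\nu}$ of Eq. (37) is simply passed in as a matrix (set to zero here) rather than computed from a CLF.

```python
def glf_covariance(phi, bias, dlum, volume, sigma_v_sq, occupancy):
    """Assemble C_mu_nu of Eq. (36) from its three pieces.

    phi, bias, dlum : per-bin GLF values, galaxy biases and bin widths
    occupancy       : matrix Sigma_mu_nu of Eq. (37), supplied by the caller
    """
    n = len(phi)
    cov = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        for nu in range(n):
            # Sample variance term: couples every pair of bins.
            sample = phi[mu] * phi[nu] * bias[mu] * bias[nu] * sigma_v_sq
            # Poisson term: diagonal only.
            poisson = phi[mu] / (volume * dlum[mu]) if mu == nu else 0.0
            cov[mu][nu] = sample + poisson + occupancy[mu][nu]
    return cov

# Hypothetical two-bin inputs, chosen only to make the arithmetic easy to follow.
cov = glf_covariance(
    phi=[2.0, 1.0], bias=[1.0, 2.0], dlum=[1.0, 1.0],
    volume=100.0, sigma_v_sq=0.01,
    occupancy=[[0.0, 0.0], [0.0, 0.0]],
)
print(cov)
```

In this toy case the off-diagonal element (0.04) comes entirely from the sample variance term, while the diagonal elements (0.06 and 0.05) also receive the Poisson contribution.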

3.3 Luminosity function correlation matrix

A short corollary to this section is that we may now con-

struct the correlation matrix from the covariance matrix:

$$ r_{\mu\nu} \equiv \frac{C_{\mu\nu}}{\sqrt{C_{\mu\mu}\,C_{\nu\nu}}}\,. \tag{38} $$

This obeys the inequality $|r_{\mu\nu}| \le 1$.
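A minimal sketch of Eq. (38) follows, applied to a hypothetical 2×2 covariance matrix; the function name `correlation_matrix` and the input values are illustrative only.

```python
import math

def correlation_matrix(cov):
    """Eq. (38): r_mu_nu = C_mu_nu / sqrt(C_mu_mu * C_nu_nu)."""
    n = len(cov)
    return [
        [cov[mu][nu] / math.sqrt(cov[mu][mu] * cov[nu][nu]) for nu in range(n)]
        for mu in range(n)
    ]

cov = [[0.06, 0.04], [0.04, 0.05]]   # hypothetical covariance matrix
r = correlation_matrix(cov)
print(r)
```

The diagonal of `r` is unity by construction, and the off-diagonal element lies between 0 and 1 for this positive-definite example.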

Inserting our expression for the covariance matrix given

by Eq.(36) into the above definition, we find

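$$
r_{\mu\nu} = \frac{\phi_\mu\phi_\nu\,b^g_\mu b^g_\nu\,\sigma^2(V_s) + \dfrac{\phi_\mu\,\delta^K_{\mu,\nu}}{V_s\,\Delta L_\mu} + \Sigma_{\mu\nu}}
{\displaystyle\prod_{i=\{\mu,\nu\}}\left[\phi_i^2\,[b^g_i]^2\,\sigma^2(V_s) + \frac{\phi_i}{V_s\,\Delta L_i} + \Sigma_{ii}\right]^{1/2}}\,. \tag{39}
$$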

Let us now factor out the Poisson error terms from the nu-

merator and denominator of Eq.(39). Note that the term in

the numerator may be rewritten as

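$$ \frac{\phi_\mu\,\delta^K_{\mu,\nu}}{V_s\,\Delta L_\mu} = \frac{\sqrt{\phi_\mu\phi_\nu}}{\sqrt{V_s\Delta L_\mu\,V_s\Delta L_\nu}}\,\delta^K_{\mu,\nu}\,. \tag{40} $$

Whereupon,

$$
r_{\mu\nu} = \frac{\sqrt{N^g_\mu N^g_\nu}\,b^g_\mu b^g_\nu\,\sigma^2(V_s) + \tilde{\Sigma}_{\mu\nu} + \delta^K_{\mu,\nu}}
{\displaystyle\prod_{i=\{\mu,\nu\}}\left[N^g_i\,(b^g_i)^2\,\sigma^2(V_s) + \tilde{\Sigma}_{ii} + 1\right]^{1/2}}\,, \tag{41}
$$

and in the above we have defined the total number of surveyed galaxies in the luminosity bin $L_\mu$ to be $N^g_\mu \equiv \phi_\mu\,\Delta L_\mu\,V_s$, and where we have defined:

$$ \tilde{\Sigma}_{\mu\nu} \equiv \frac{\sqrt{V_s\Delta L_\mu\,V_s\Delta L_\nu}}{\sqrt{\phi_\mu\phi_\nu}}\,\Sigma_{\mu\nu}\,. \tag{42} $$

On manipulating the above expression, we find that it may also be written as,

$$ \tilde{\Sigma}_{\mu\nu} = \frac{\int dM\,n(M)\,N(L_\mu|M)\,N(L_\nu|M)}{\displaystyle\prod_{i=\{\mu,\nu\}}\left[\int dM\,n(M)\,N(L_i|M)\right]^{1/2}}\,. \tag{43} $$

Several cases of interest may be noted. If, for the moment, we neglect the halo occupancy covariance, i.e. $\tilde{\Sigma}_{\mu\nu} \to 0$, then we note the two cases:

$$ \sqrt{N^g_\mu N^g_\nu}\,b^g_\mu b^g_\nu\,\sigma^2(V_s) \ll 1\,; \tag{44} $$

$$ \sqrt{N^g_\mu N^g_\nu}\,b^g_\mu b^g_\nu\,\sigma^2(V_s) \gg 1\,. \tag{45} $$

In the first, the errors are dominated by the Poisson sampling of the galaxies and the luminosity bins are uncorrelated. In the second case, the matrix is dominated by the sample covariance, and the bins can become perfectly correlated:

$$
r_{\mu\nu} = \frac{\sqrt{N^g_\mu N^g_\nu}\,b^g_\mu b^g_\nu\,\sigma^2(V_s)}
{\displaystyle\prod_{i=\{\mu,\nu\}}\left[N^g_i\,(b^g_i)^2\,\sigma^2(V_s)\right]^{1/2}} \to 1\,. \tag{46}
$$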
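The two limits of Eqs (44) and (45) can be illustrated numerically. The sketch below evaluates the off-diagonal element of Eq. (41) with the occupancy term set to zero; the function name `r_no_occupancy`, the galaxy counts and the value of the volume variance are hypothetical.

```python
import math

def r_no_occupancy(n_mu, n_nu, b_mu, b_nu, sigma_sq):
    """Off-diagonal r_mu_nu from Eq. (41) with the occupancy term set to zero."""
    num = math.sqrt(n_mu * n_nu) * b_mu * b_nu * sigma_sq
    den = math.sqrt((n_mu * b_mu**2 * sigma_sq + 1.0) *
                    (n_nu * b_nu**2 * sigma_sq + 1.0))
    return num / den

sigma_sq = 1.0e-4                                          # hypothetical volume variance
r_sparse = r_no_occupancy(10, 10, 1.0, 1.0, sigma_sq)      # N * sigma^2 = 1e-3 (Eq. 44)
r_dense = r_no_occupancy(1e7, 1e7, 1.0, 1.0, sigma_sq)     # N * sigma^2 = 1e3  (Eq. 45)
print(r_sparse, r_dense)
```

With few galaxies per bin the correlation coefficient is close to zero, whereas with many galaxies per bin it approaches unity, as in Eq. (46).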

We may also make the important point that, taking Vs → ∞

and hence σ(Vs) → 0, does not guarantee that the correla-

tion between different luminosity bins is negligible. As the

above equations clearly show, it is the quantity $V_s\sigma^2(V_s)$ that is required to vanish for negligible correlation to occur. Indeed, for a power-law power spectrum, $P(k) \propto k^n$, and a survey of linear size $R \sim V_s^{1/3}$, we would have that $V_s\sigma^2(V_s) \propto R^3 R^{-(3+n)} \propto R^{-n}$, which can only be made to vanish for $n > 0$. For CDM we have a rolling spectral index, and $n > 0$ for $k \lesssim 0.01\,h\,{\rm Mpc}^{-1}$, which implies that $V_s \gtrsim 0.5\,h^{-3}\,{\rm Gpc}^3$ is required for the covariance to diminish with increasing volume.
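The scaling $V_s\sigma^2(V_s) \propto R^{-n}$ can be verified in a case where Eq. (29) is analytic. The sketch below assumes a Gaussian window, $W(k) = \exp(-k^2R^2/2)$, a pure power-law spectrum $P(k) = k^n$ with unit amplitude, and an effective volume $(2\pi)^{3/2}R^3$; these are assumptions made for this illustration and are not the survey window used in the paper. Under them, $\sigma^2 = \Gamma[(n+3)/2]/(4\pi^2R^{n+3})$ for $n > -3$.

```python
import math

def sigma_sq_gaussian(radius, n):
    """Volume variance for P(k) = k**n and a Gaussian window
    W(k) = exp(-k^2 R^2 / 2); the integral of Eq. (29) is then analytic."""
    return math.gamma((n + 3.0) / 2.0) / (4.0 * math.pi**2 * radius ** (n + 3.0))

def volume_times_variance(radius, n):
    """V_s * sigma^2(V_s), with V_s the effective volume of the Gaussian window (assumed)."""
    volume = (2.0 * math.pi) ** 1.5 * radius**3
    return volume * sigma_sq_gaussian(radius, n)

ratios = {}
for n in (-1.0, 0.0, 1.0):
    # Doubling R should change V_s * sigma^2 by a factor 2**(-n), up to rounding.
    ratios[n] = volume_times_variance(2.0, n) / volume_times_variance(1.0, n)
print(ratios)
```

Doubling the survey size increases $V_s\sigma^2$ for $n = -1$, leaves it unchanged for $n = 0$, and halves it for $n = 1$, in line with the statement that the product only decays with volume when $n > 0$.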
