MNRAS 522, 5701–5739 (2023) https://doi.org/10.1093/mnras/stad1062
Measurement of parity-odd modes in the large-scale 4-point correlation
function of Sloan Digital Sky Survey Baryon Oscillation Spectroscopic
Survey twelfth data release CMASS and LOWZ galaxies
Jiamin Hou,$^{1,2}$* Zachary Slepian$^{1,3}$ and Robert N. Cahn$^{3}$
$^{1}$ Department of Astronomy, University of Florida, Gainesville, FL 32611, USA
$^{2}$ Max-Planck-Institut für Extraterrestrische Physik, Postfach 1312, Giessenbachstrasse 1, D-85748 Garching, Germany
$^{3}$ Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Accepted 2023 April 1. Received 2023 April 1; in original form 2022 June 11
ABSTRACT
A tetrahedron is the simplest shape that cannot be rotated into its mirror image in three dimensions (3D). The 4-point correlation
function (4PCF), which quantifies excess clustering of quartets of galaxies over random, is the lowest order statistic sensitive to
parity violation. Each galaxy defines one vertex of the tetrahedron. Parity-odd modes of the 4PCF probe an imbalance between
tetrahedra and their mirror images. We measure these modes from the largest currently available spectroscopic samples, the
280 067 luminous red galaxies (LRGs) of the Baryon Oscillation Spectroscopic Survey (BOSS) twelfth data release (DR12)
LOWZ ($\bar{z} = 0.32$) and the 803 112 LRGs of BOSS DR12 CMASS ($\bar{z} = 0.57$). In LOWZ, we find 3.1σ evidence for a non-zero
parity-odd 4PCF, and in CMASS we detect a parity-odd 4PCF at 7.1σ. Gravitational evolution alone does not produce this effect;
parity-breaking in LSS, if cosmological in origin, must stem from the epoch of inflation. We have explored many sources of
systematic error and found none that can produce a spurious parity-odd signal sufficient to explain our result. Underestimation of
the noise could also lead to a spurious detection. Our reported significances presume that the mock catalogues used to calculate
the covariance sufficiently capture the covariance of the true data. We have performed numerous tests to explore this issue. The
odd-parity 4PCF opens a new avenue for probing new forces during the epoch of inflation with 3D large-scale structure; such
exploration is timely given large upcoming spectroscopic samples such as Dark Energy Spectroscopic Instrument and Euclid.
Key words: methods: data analysis – galaxies: statistics – (cosmology:) early Universe – (cosmology:) large-scale structure of
Universe – cosmology: observations.
1 INTRODUCTION
The laws of nature respect certain symmetries; the physical processes
governed by them are invariant under the corresponding transformations.
Parity transformation (P), which reverses the sign of each
coordinate axis, had been thought to be such a symmetry. Indeed,
the electromagnetic and strong interactions are invariant under P.
However, this symmetry is broken in the weak interaction (Lee &
Yang 1956 ; Wu et al. 1957 ). Sakharov ( 1967 ) showed that the matter–
anti-matter asymmetry of the Universe requires that the combination
CP of P and charge-conjugation (C) symmetry be broken. The
currently known CP violation is inadequate to explain the observed
matter–anti-matter asymmetry. Whatever additional CP violation is
responsible may involve pure P violation as well.
Most of the cosmological studies of parity invariance to date have
focused on cosmic microwave background (CMB) polarization (Lue,
Wang & Kamionkowski 1999 ; Kamionkowski & Souradeep 2011 ;
Shiraishi et al. 2011 ; Minami & Komatsu 2020 ) or on gravitational
waves (Saito, Ichiki & Taruya 2007 ; Yunes et al. 2010 ; Jeong &
Kamionkowski 2012 ; Wang et al. 2013 ; Zhu et al. 2013 ; Nishizawa
& Kobayashi 2018 ; Orlando, Pieroni & Ricciardone 2021 ). A recent
CMB study of parity violation reported 2.4σ evidence for cosmic
birefringence (where the two polarization states of a wave propagate
differently) (Minami & Komatsu 2020 ). Eskilt & Komatsu ( 2022 )
refined this analysis and found 3.6σ evidence.
* E-mail: jiamin.hou@ufl.edu
A number of mechanisms producing parity violation at cosmolog-
ical scales have been presented in the literature. For instance, one can
add a Chern–Simons coupling to the standard cosmological paradigm
at early or at late times. This term typically describes an interaction
between a pseudo-scalar field and a spin-1 field (Barnaby, Namba
& Peloso 2011 ; Sorbo 2011 ; Özsoy 2021 ) or a spin-2 field (Jackiw
& Pi 2003 ; Alexander & Yunes 2009 ; Soda, Kodama & Nozawa
2011 ; Dyda, Flanagan & Kamionkowski 2012 ). The pseudo-scalar
field can be axion-like and if present at late times, can play the
role of dark matter or dark energy. In this case, the Chern–Simons
coupling can rotate the polarizations of initially linearly polarized
CMB photons.$^{1}$ In contrast, if the axion-like field plays the role of
the inflaton, the coupling can give rise to non-vanishing parity-odd
polyspectra of the primordial curvature perturbations (Bartolo et al.
2015 ; Shiraishi 2016 ). Since the curvature perturbations seed the
subsequent formation of large-scale structure, the primordial parity-odd
polyspectra would produce the same in the late-time distribution
of galaxies.
$^{1}$ Helical primordial gravitational waves could in principle leave an imprint
on the cross-spectrum between the E and B modes of CMB polarization, or
between B modes and temperature fluctuations. However, these observables
are suppressed by the two-dimensional nature of the CMB (Masui, Pen &
Turok 2017 ).
© 2023 The Author(s).
Published by Oxford University Press on behalf of Royal Astronomical Society. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium,
provided the original work is properly cited.
Downloaded from https://academic.oup.com/mnras/article/522/4/5701/7169316 by guest on 15 June 2023
Figure 1. Parity transformation applied to a tetrahedron formed by a quartet of galaxies. Each vertex represents a galaxy. Choosing one galaxy (red dot) as
our primary, the quartet is defined by the three vectors to the remaining vertices, $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$. For a quartet, the subscripts are fixed by requiring that
$r_1 \leq r_2 \leq r_3$. When viewing the tetrahedron from the primary (red) looking down along each vector $\mathbf{r}_i$, the direction in which one reads going from smallest to largest
side ($r_1$ to $r_3$) defines a handedness, either clockwise or counterclockwise. Here, the tetrahedron on the left, as viewed from the primary, is clockwise. Parity
transformation in 3D is a reflection about a plane and then a 180° rotation about the vector perpendicular to that plane, and converts the clockwise tetrahedron
at left to the counterclockwise one at right. When one averages over rotations, as in this work, only the mirroring matters.
Recently, Cahn, Slepian & Hou ( 2021 ) made the novel proposal
of using the galaxy 4-point correlation function (4PCF) to probe
parity violation in three-dimensional (3D) large-scale structure. Four
galaxies can be taken as the vertices of a tetrahedron, the lowest order
3D shape that cannot be rotated into its mirror image, rendering
the 4PCF sensitive to parity violation. An illustration of a galaxy
quartet and how we define parity on it, is shown in Fig. 1 . Following
Cahn et al. ( 2021 ), we expand the 4PCF in the isotropic (i.e.
rotation-averaged) basis functions of Cahn & Slepian ( 2020 ). In
the standard inflationary paradigm (Albrecht & Steinhardt 1982 ;
Linde 1982 , 1983 ), we would not expect a parity-odd 4PCF. The
initial density fluctuation is a Gaussian random field (GRF, Bardeen
1980 ; Starobinsky 1982 ), which then evolves under gravity and
forms galaxies at late times. Gravity, and even the baryonic physics
of galaxy formation, is parity-conserving. Hence, the detection of
a parity-odd 4PCF of cosmological origin would be evidence that
parity violation was present before the known forces dominated the
evolution of the matter distribution.
We present here a measurement of the parity-odd modes of the
4PCF using the Baryon Oscillation Spectroscopic Survey
(BOSS; BOSS collaboration 2017 ) of Sloan Digital Sky Survey
(SDSS)-III (Eisenstein et al. 2011 ; Dawson et al. 2013 ). Philcox,
Hou & Slepian ( 2021 ) presented the parity-even 4PCF measurement
on the same data set and found an 8.1σ detection of a non-Gaussian
4PCF (expected in the standard picture of gravitationally evolved
structure formation, e.g. Bernardeau et al. 2002 ). A progenitor of
the algorithm and approach used here was employed to measure the
3-point correlation function (3PCF) of BOSS twelfth data release
(DR12) CMASS (Slepian et al. 2017a , b , c ; Sugiyama et al. 2019 ,
2021 ), and extended to use Fourier transforms in Slepian & Eisenstein
( 2015c ) and Portillo et al. ( 2018 ) and to the anisotropic 3PCF in
Friesen et al. ( 2017 ), Slepian & Eisenstein ( 2018 ), and Garcia &
Slepian ( 2022 ). The 4PCF has been measured before in just a few
works (in 2D, Fry & Peebles 1978 ; averaged over internal angles,
Sabiu et al. 2019 ; and in Fourier space with a degree
of compression, the integrated trispectrum, Gualdi & Verde 2022 ),
but never separated into parity-odd modes; the history of N-point
correlation functions (NPCFs) is reviewed in Peebles ( 2001 ).
Given that no detection of parity-odd physics in large-scale
structure has yet been made and that a number of proposed theoretical
models can produce it, in this work we pursue a model-independent
analysis. Hence we lack an a priori expectation for the shape
of the signal. While using such a model could strengthen any
detection significance by correlating data in different modes, it would
inevitably tie our detection significance to a particular model, which
we wish to avoid. In contrast to typical analyses, e.g. of the 2PCF for
baryon acoustic oscillations (BAO) or the 3PCF for BAO or galaxy
biasing, in this work we cannot identify systematic errors simply by
observing departures from an expected template for the cosmological
signal. We must thus pay especial attention to systematics. We
make extensive use of both mock catalogues and analytics to assess
whether any systematics can produce spurious parity-odd 4PCF
modes.
Our fiducial cosmology here matches that adopted by BOSS
(BOSS collaboration 2017 ). In particular, we take a geometrically
flat $\Lambda$CDM model with redshift-zero matter density (in units of the
critical density) $\Omega_{\rm m} = 0.31$, baryon density (in units of the critical
density) $\Omega_{\rm b} = 0.048$, Hubble constant $h \equiv H_0/(100\,{\rm km\,s^{-1}\,Mpc^{-1}}) = 0.676$,
root-mean-square density fluctuations (within $8\,h^{-1}\,{\rm Mpc}$
spheres) of $\sigma_8 = 0.8$, scalar spectral tilt $n_{\rm s} = 0.96$, and sum of the
neutrino masses $\sum_i m_{\nu,i} = 0.06$ eV.
This work is organized as follows. Section 2 reviews the basis used
to decompose the 4PCF and presents a toy model to illustrate the
relation between parity and tetrahedra. Section 3 outlines the multiple
analyses of varying complexity used to increase confidence in our
measurements. Section 4 describes the data, simulations, and
covariance matrix. Section 5 then presents two different paths to
obtain a detection significance and their outcomes. We also present
an analysis of cross-correlations between spatially separated regions.
Section 6 outlines our systematics tests on mocks as well as analytic
work. Section 7 concludes. A number of appendices present details
of the work.
2 METHOD FOR 4PCF MEASUREMENT
The 4PCF estimator, indicated by a hat, is
$$\hat{\zeta}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3) \equiv \left\langle \delta(\mathbf{s})\,\delta(\mathbf{s}+\mathbf{r}_1)\,\delta(\mathbf{s}+\mathbf{r}_2)\,\delta(\mathbf{s}+\mathbf{r}_3) \right\rangle = \int \frac{d\mathbf{s}}{V}\,\delta(\mathbf{s})\,\delta(\mathbf{s}+\mathbf{r}_1)\,\delta(\mathbf{s}+\mathbf{r}_2)\,\delta(\mathbf{s}+\mathbf{r}_3), \tag{1}$$
where the angle brackets denote an ensemble average of the density
fluctuation field $\delta(\mathbf{s}) \equiv \rho(\mathbf{s})/\bar{\rho} - 1$, with $\rho(\mathbf{s})$ the density field
and $\bar{\rho}$ the average density.$^{2}$ Invoking ergodicity, the ensemble
average may be replaced by an integral of spatial position $\mathbf{s}$ over
the volume $V$. This integration results in a function that depends
only on the relative separation vectors $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$. We also
average over joint rotations of these vectors. In practice, the density
fluctuation field is computed from discrete galaxy data, appropriately
weighted.
Since the 3D distribution of galaxies is assumed to be isotropic
on cosmological scales [ignoring redshift-space distortions (RSDs)],
the isotropic basis (Cahn & Slepian 2020 ) is an efficient means of
systematically extracting cosmological information. The isotropic
basis functions required to measure an NPCF are given by products
of ( N − 1) spherical harmonics $Y_{\ell m}(\hat{\mathbf{r}})$ combined according to angular
momentum addition. In particular, for the 4PCF ( N = 4) we require
the three-argument basis functions, which are
$$\mathcal{P}_{\ell_1\ell_2\ell_3}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3) = \sum_{m_1 m_2 m_3} \mathcal{C}^{\ell_1\ell_2\ell_3}_{m_1 m_2 m_3}\, Y_{\ell_1 m_1}(\hat{\mathbf{r}}_1)\, Y_{\ell_2 m_2}(\hat{\mathbf{r}}_2)\, Y_{\ell_3 m_3}(\hat{\mathbf{r}}_3). \tag{2}$$
The factorizability of these functions is important to the speed-up of
the 4PCF algorithm (Philcox et al. 2022 ); in practice, it enables us to
compute the 4PCF as a sum over the spherical harmonic coefficients
$a_{\ell_i m_i}$ of the density field about a given primary galaxy at $\mathbf{s}$.
Each unit vector $\hat{\mathbf{r}}_i$ is associated with one total angular momentum
$\ell_i$, with z-component $m_i$.$^{3}$ The key point is simply that spherical
harmonics are two-index tensors, and conventionally the total angular
momentum and its z-component are chosen to represent them. This
point is further discussed in Cahn & Slepian ( 2020 ) around their
equation 2. The weight is
$$\mathcal{C}^{\ell_1\ell_2\ell_3}_{m_1 m_2 m_3} \equiv (-1)^{\ell_1+\ell_2+\ell_3} \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m_1 & m_2 & m_3 \end{pmatrix}. \tag{3}$$
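The 3-j symbol enforces the triangular inequality $|\ell_1 - \ell_2| \leq \ell_3 \leq \ell_1 + \ell_2$,
because the total angular momentum must be zero for an
isotropic function.$^{4}$ Under the parity operator, denoted $\hat{P}$, a spherical
harmonic $Y_{\ell m}$ acquires a factor $(-1)^{\ell}$, so the three-argument isotropic
functions transform as
$$\hat{P}\left[\mathcal{P}_{\ell_1\ell_2\ell_3}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3)\right] \equiv \mathcal{P}_{\ell_1\ell_2\ell_3}(-\hat{\mathbf{r}}_1, -\hat{\mathbf{r}}_2, -\hat{\mathbf{r}}_3) = (-1)^{\ell_1+\ell_2+\ell_3}\,\mathcal{P}_{\ell_1\ell_2\ell_3}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3) = \mathcal{P}^{*}_{\ell_1\ell_2\ell_3}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3). \tag{4}$$
Thus, the basis functions are real if $\ell_1+\ell_2+\ell_3$ is even and imaginary
if the sum is odd.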
$^{2}$ The 4PCF after rotation-averaging has six degrees of freedom, so we will
only require certain combinations of the arguments on the left-hand side of
the estimator.
$^{3}$ Given the rotational symmetry of the system, the choice of z-axis is arbitrary.
$^{4}$ We recall that a 3-j symbol with zeros in the bottom row demands that
$\ell_1+\ell_2+\ell_3$ be even, but with non-zero $m_i$ there is no such requirement, allowing
odd sums of the $\ell_i$ and hence parity-odd basis functions.
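As a hedged numerical illustration of equations (2)-(4), the sketch below builds the lowest parity-odd basis function for $\ell_1=\ell_2=\ell_3=1$ directly from the definition and compares it with the closed form $-3i/[\sqrt{2}(4\pi)^{3/2}]\,\hat{\mathbf{r}}_1\cdot(\hat{\mathbf{r}}_2\times\hat{\mathbf{r}}_3)$ quoted in Section 2.1. The helper names (`wigner_3j`, `y1`, `p111`, `triple`, `unit`) and the test vectors are illustrative assumptions, not part of the analysis code of this work; the 3-j symbol is evaluated with the standard Racah formula and the $\ell=1$ harmonics use the Condon-Shortley phase.

```python
import math
from itertools import product


def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3-j symbol for integer arguments, via the Racah formula."""
    if m1 + m2 + m3 != 0 or not abs(j1 - j2) <= j3 <= j1 + j2:
        return 0.0
    if abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    f = math.factorial
    delta = (f(j1 + j2 - j3) * f(j1 - j2 + j3) * f(-j1 + j2 + j3)
             / f(j1 + j2 + j3 + 1))
    norm = math.sqrt(delta * f(j1 + m1) * f(j1 - m1) * f(j2 + m2)
                     * f(j2 - m2) * f(j3 + m3) * f(j3 - m3))
    k_min = max(0, j2 - j3 - m1, j1 - j3 + m2)
    k_max = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    total = 0.0
    for k in range(k_min, k_max + 1):
        total += (-1) ** k / (f(k) * f(j1 + j2 - j3 - k) * f(j1 - m1 - k)
                              * f(j2 + m2 - k) * f(j3 - j2 + m1 + k)
                              * f(j3 - j1 - m2 + k))
    return (-1) ** (j1 - j2 - m3) * norm * total


def y1(m, v):
    """l = 1 spherical harmonic (Condon-Shortley phase) of a unit vector v."""
    x, y, z = v
    if m == 0:
        return complex(math.sqrt(3 / (4 * math.pi)) * z)
    return -m * math.sqrt(3 / (8 * math.pi)) * complex(x, m * y)


def p111(r1, r2, r3):
    """Equation (2) for l1 = l2 = l3 = 1, with the weight of equation (3)."""
    total = 0j
    for m1, m2, m3 in product((-1, 0, 1), repeat=3):
        weight = (-1) ** 3 * wigner_3j(1, 1, 1, m1, m2, m3)
        total += weight * y1(m1, r1) * y1(m2, r2) * y1(m3, r3)
    return total


def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def unit(v):
    """Normalize a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


# Arbitrary (hypothetical) test directions.
r1, r2, r3 = unit((1.0, 0.2, -0.3)), unit((0.1, 1.0, 0.4)), unit((-0.5, 0.3, 1.0))
direct = p111(r1, r2, r3)
closed_form = -3j / (math.sqrt(2) * (4 * math.pi) ** 1.5) * triple(r1, r2, r3)
flipped = p111(*(tuple(-c for c in r) for r in (r1, r2, r3)))

print(direct, closed_form)  # the two should agree; both are purely imaginary
print(flipped)              # parity reverses the sign, as in equation (4)
```

The same `wigner_3j` helper also illustrates footnote 4: the symbol with all $m_i = 0$ vanishes for $\ell_1+\ell_2+\ell_3 = 3$, while the entry with $(m_1, m_2, m_3) = (1, -1, 0)$ does not.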
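The isotropic functions also satisfy an orthonormality relation,
which follows from that of the spherical harmonics. We have
$$\int d\hat{\mathbf{r}}_1\, d\hat{\mathbf{r}}_2\, d\hat{\mathbf{r}}_3\; \mathcal{P}_{\ell_1\ell_2\ell_3}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3)\, \mathcal{P}^{*}_{\ell'_1\ell'_2\ell'_3}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3) = \delta^{\rm K}_{\ell_1\ell'_1}\, \delta^{\rm K}_{\ell_2\ell'_2}\, \delta^{\rm K}_{\ell_3\ell'_3}. \tag{5}$$
The Kronecker delta $\delta^{\rm K}_{\ell_i\ell'_i}$ is unity when its subscripts are equal and
zero otherwise.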
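The 4PCF estimator defined in equation ( 1 ) can be expanded into
the basis of isotropic functions (see equation 2 ), where the expansion
coefficients depend only on the side lengths $r_i \equiv |\mathbf{r}_i|$ and are given by orthogonality as
$$\hat{\zeta}_{\ell_1\ell_2\ell_3}(r_1, r_2, r_3) = \int \frac{d\mathbf{s}}{V}\,\delta(\mathbf{s}) \int d\hat{\mathbf{r}}_1\, d\hat{\mathbf{r}}_2\, d\hat{\mathbf{r}}_3\; \delta(\mathbf{s}+\mathbf{r}_1)\,\delta(\mathbf{s}+\mathbf{r}_2)\,\delta(\mathbf{s}+\mathbf{r}_3)\, \mathcal{P}^{*}_{\ell_1\ell_2\ell_3}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3). \tag{6}$$
To avoid an overcomplete basis, the radial arguments $r_i$ are ordered
as $r_1 \leq r_2 \leq r_3$, as further discussed in Cahn & Slepian ( 2020 ).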
To construct a density fluctuation field from the discrete galaxy
counts, and also to account for the survey geometry, we use a
generalized Landy–Szalay estimator (Landy & Szalay 1993 ; Szapudi
& Szalay 1998 , see also Kerscher, Szapudi & Szalay 2000 ) as first
outlined for the angular momentum basis in Slepian & Eisenstein
( 2015b ) and further developed in Philcox et al. ( 2021 , 2022 ). It is
$$\hat{\zeta}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3) = \frac{\mathcal{N}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3)}{\mathcal{R}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3)}, \tag{7}$$
where $\mathcal{N} \equiv (D - R)^4$ and $\mathcal{R} \equiv R^4$, and these powers are shorthand
for expanding by the binomial theorem and letting each $D$ and $R$ be
evaluated at a different spatial position. $D$ means a particle drawn
from the ‘data’ and $R$ means a particle drawn from the ‘random’
catalogue (a spatially uniform catalogue cut by the survey geometry).
As outlined in Slepian & Eisenstein ( 2015b ), we may estimate the
numerator and denominator separately (i.e. compute each separately
averaging over the whole survey). Doing so gives optimally weighted
estimates of each in the shot-noise limit, as discussed in Slepian &
Eisenstein ( 2015b , section 4, equations (24–26) and surrounding
text). Multiplying equation ( 7 ) through by $\mathcal{R}$, expanding each side
of the resulting relation in the isotropic basis, reducing a product of
two isotropic basis functions to a sum over single ones, and finally
taking an inverse to solve the linear system so obtained (Slepian &
Eisenstein 2015b ; Philcox et al. 2022 ), we find the edge-corrected
4PCF estimator as
$$\hat{\zeta}_{\ell_1\ell_2\ell_3}(r_1, r_2, r_3) = \sum_{\ell'_1\ell'_2\ell'_3} \left[M^{-1}\right]_{\ell_1\ell_2\ell_3,\,\ell'_1\ell'_2\ell'_3}(r_1, r_2, r_3)\, \frac{\mathcal{N}_{\ell'_1\ell'_2\ell'_3}(r_1, r_2, r_3)}{\mathcal{R}_{000}(r_1, r_2, r_3)}. \tag{8}$$
We note that there is no mixing in the radial variables; survey
geometry does not change lengths. Our notation indicates a given
element of $M^{-1}$. This latter is the inverse of the coupling matrix $M$
describing how survey geometry breaks the orthogonality of our basis
functions, much as Fourier modes are orthogonal only on an infinite
domain. $\mathcal{N}$ denotes a measurement from the ‘data-minus-random’
catalogue, and $\mathcal{R}_{000}$ is the $\ell_1 = \ell_2 = \ell_3 = 0$ expansion coefficient of
$\mathcal{R} \equiv R^4$, i.e. the randoms’ 4PCF. $\mathcal{N}$ and $\mathcal{R}$ are evaluated by replacing
$\delta$ in equation ( 6 ) with their definitions, given below equation ( 7 ).
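For concreteness, the binomial shorthand introduced below equation (7) can be written out. The following is only a schematic expansion: each factor is understood to sit at a different vertex of the quartet, and the integer coefficients count the equivalent ways of assigning $D$ and $R$ to the four positions.

```latex
\mathcal{N} \equiv (D - R)^4 \;\rightarrow\; DDDD - 4\,DDDR + 6\,DDRR - 4\,DRRR + RRRR,
\qquad
\mathcal{R} \equiv R^4 \;\rightarrow\; RRRR .
```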
Figure 2. Left-hand panel: A box with side length $L_{\rm box} = 1000\,h^{-1}\,{\rm Mpc}$ filled with tetrahedra (upper left, small panel). Each of them has a unique primary
(black) from which the three sides with respective lengths $r_1 \sim 10\,h^{-1}\,{\rm Mpc}$ (red), $r_2 \sim 20\,h^{-1}\,{\rm Mpc}$ (yellow), and $r_3 \sim 30\,h^{-1}\,{\rm Mpc}$ (blue) extend. The larger panel
on the left is a zoom-in on the full box to display the tetrahedra more clearly. Right-hand panel: A sketch of a ‘clockwise’ tetrahedron and its ‘counterclockwise’
mirror image. The primary is in red. Our convention on clockwise and counterclockwise is detailed in Fig. 1 . On the left, the red point is closer to us than all the
others. Thus, the tetrahedron on the left is clockwise, as looking down from the primary, we go clockwise as we move from the smallest side to the largest. On
the right, the primary (in red) is behind the other galaxies, so looking down from it towards them will reverse the handedness. Thus, the rightmost tetrahedron is
counterclockwise as viewed from the primary.
The coupling matrix has elements
$$M_{\ell_1\ell_2\ell_3,\,\ell'_1\ell'_2\ell'_3}(r_1, r_2, r_3) = (4\pi)^{-3/2}\,(-1)^{\ell_1+\ell_2+\ell_3} \sum_{L_1 L_2 L_3} \frac{\mathcal{R}_{L_1L_2L_3}(r_1, r_2, r_3)}{\mathcal{R}_{000}(r_1, r_2, r_3)} \prod_{i=1}^{3}\left[\mathcal{D}^{\rm P}_{\ell_i L_i \ell'_i}\, \mathcal{C}^{\ell_i L_i \ell'_i}_{000}\right] \begin{Bmatrix} \ell_1 & L_1 & \ell'_1 \\ \ell_2 & L_2 & \ell'_2 \\ \ell_3 & L_3 & \ell'_3 \end{Bmatrix}, \tag{9}$$
with the coefficient
$$\mathcal{D}^{\rm P}_{\ell_1\ell_2\ell_3} = \sqrt{(2\ell_1+1)(2\ell_2+1)(2\ell_3+1)}, \tag{10}$$
which depends on the product of the primary (hence the superscript
‘P’) angular momenta.$^{5}$ The matrix in curly brackets in equation
( 9 ) is a Wigner 9-j symbol. The factor $\mathcal{C}^{\ell_i L_i \ell'_i}_{000}$ (defined in equation 3 )
preceding it guarantees that $\ell_i$, $\ell'_i$, and $L_i$ can be combined to make a
zero total angular momentum state. It also requires that $\ell_i + \ell'_i + L_i$
is even.
Regarding the edge correction, we note that formally $M$ is infinite,
but we have found in practice that truncating it at one angular momentum
beyond that used for the physical analysis is suitable (Slepian &
Eisenstein 2015b ; Philcox et al. 2022 ). In this work, we use $\ell_{\max} = 4$
for our analysis but work to $\ell = 5$ on all $\ell_i$ for the edge correction.$^{6}$
Further details regarding the suitability of truncating the edge
correction at an $\ell$ one above that used for the analysis
are in Slepian & Eisenstein ( 2015b ). Ultimately this suitability stems
from the rough tri-diagonality of the edge-correction matrix (see their
section 4.2 and our Fig. 28 ).
$^{5}$ Were one to measure an NPCF for N ≥ 5, one would require isotropic basis
functions of four arguments or more, and these basis functions require specification
of intermediate angular momenta fixing how the primary momenta are
coupled (further detailed in Cahn & Slepian 2020 ). The distinction between
primary and intermediate angular momenta is not needed in this work, but
for consistency, we retain the superscript ‘P’.
$^{6}$ This truncation does not induce spurious parity-odd modes; if it did, we
would see them when we edge-correct our mock catalogues.
2.1 Illustration with toy tetrahedra
To understand the parity-odd measurement more intuitively, we study
cubic boxes of side length $L_{\rm box} = 1000\,h^{-1}\,{\rm Mpc}$ with tetrahedra
tuned to produce particular parity signals. To fill the boxes as fully
as possible, yet at the same time have tetrahedra with a minimum
side length of order $10\,h^{-1}\,{\rm Mpc}$, which is similar to the situation in
our BOSS data set, we choose the three sides extending from the
primary to be roughly $r_1 \sim 10\,h^{-1}\,{\rm Mpc}$, $r_2 \sim 20\,h^{-1}\,{\rm Mpc}$, and
$r_3 \sim 30\,h^{-1}\,{\rm Mpc}$. We require that the minimum separation between
primaries be twice the longest side of the tetrahedron (i.e. $60\,h^{-1}\,{\rm Mpc}$)
in order to minimize any overlap between tetrahedra. Finally, we
have $N_{\rm tets} \sim 1500$ tetrahedra within each cubic box. Fig. 2 shows an
example box and also an example tetrahedron and its partner under
parity transformation.
In 3D, parity transformation is equivalent to a mirror reflection
across a 2D plane (which flips the sign of the coordinate axis
perpendicular to that plane) plus a 180° rotation around this latter
axis. For simplicity, in Fig. 2 , we depict the P simply as a mirror
reflection, since our basis, being isotropic, always averages over 3D
rotations and is thus insensitive to a 180° rotation. Put another way,
only the mirroring fundamentally alters the shape of a tetrahedron;
the 180° rotation only changes its orientation in absolute space.$^{7}$
In particular, one can imagine the mirroring as taking one side and
‘pulling it through’ the tetrahedron from being on one side of the
plane formed by the other two sides, to being on the other side of
this plane.
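The decomposition of parity into a mirror reflection and a 180° rotation can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming the mirror plane is the x-y plane (any plane would do); the helper names `matmul` and `det` are illustrative only.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det(m):
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


mirror_z = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]    # reflection across the x-y plane
rot180_z = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]   # 180-degree rotation about the z-axis

parity = matmul(rot180_z, mirror_z)
print(parity)                       # [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
print(det(mirror_z), det(rot180_z), det(parity))  # -1 1 -1
```

The product is minus the identity, i.e. the sign reversal of every coordinate axis. The rotation has determinant +1, so the handedness change is carried entirely by the mirror, which is why a rotation-averaged basis only sees the mirroring.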
In practice, we must allow all four vertices of each tetrahedron
a chance to be the primary (Philcox et al. 2022 ). However, for this
toy tetrahedron illustration we restricted to bins in side length (radial
bins) such that $9\,h^{-1}\,{\rm Mpc} < r_i < 30\,h^{-1}\,{\rm Mpc}$ for all $r_i$ so that nearly
always only one of the four vertices will satisfy the radial bins
required for each tetrahedron. This means that the contribution to
the signal will stem only from the isotropic function evaluated on
unit vectors extending from a single primary; this renders it easier to
understand the measured signal in this toy model.
$^{7}$ In general parity transformation and mirror reflection are distinct operations.
Figure 3. Here, we display 4PCF measurements from the illustrative toy boxes. The left- and right-hand panels each show a different channel, as indicated in
their titles. In both panels, the ‘clockwise-only’ box results are in blue, the ‘counterclockwise-only’ box results are in brown, and the ‘mixed’ box results are in
grey. Left-hand panel: Projection onto $\mathcal{P}_{111}$. As discussed in the main text, we expect a negative projection for the clockwise box and a positive projection for
the counterclockwise box. The mixed will have on average zero projection. These expectations are borne out and indeed analytic calculation yields agreement
with the measured signals. Right-hand panel: Projection onto $\mathcal{P}_{122}(\hat{\mathbf{r}}_1, \hat{\mathbf{r}}_2, \hat{\mathbf{r}}_3) \propto i\,\hat{\mathbf{r}}_1 \cdot (\hat{\mathbf{r}}_2 \times \hat{\mathbf{r}}_3)(\hat{\mathbf{r}}_2 \cdot \hat{\mathbf{r}}_3)$. By construction the scalar triple product $\hat{\mathbf{r}}_1 \cdot (\hat{\mathbf{r}}_2 \times \hat{\mathbf{r}}_3)$
is unity for all three of our illustrative boxes. The amplitudes on the right-hand side are not strictly zero because there are still a small number of tetrahedra
formed by connecting secondaries around one primary with secondaries around another. Additionally, the amplitude fluctuations divide into two envelopes at
bin index 36. This bin index corresponds to raising the index for $r_1$ (the smallest side and the slowest varying one).
We set $\hat{\mathbf{r}}_1$ to be along $\hat{\mathbf{x}}$, $\hat{\mathbf{r}}_2$ to be along $\hat{\mathbf{y}}$, and $\hat{\mathbf{r}}_3$ to be along $\hat{\mathbf{z}}$.
The tetrahedron is therefore clockwise at the only primary allowed
in this toy model; our convention is presented in Fig. 1 . A parity
transformation simply means that we interchange the sides, so that
$\hat{\mathbf{r}}_2$ aligns with the x-axis and $\hat{\mathbf{r}}_1$ aligns with the y-axis, while $\hat{\mathbf{r}}_3$ is
unchanged. Again, characterization as ‘clockwise’ or ‘counterclockwise’
depends on at which vertex one sits, as discussed in Fig. 1 ,
but here is unambiguous. By restricting the radial bins we use, we
force only one galaxy to be the primary; the other choices of primary
will not lead to sides extending from them that fall within our chosen
bins. Thus, we can ensure that the tetrahedra for this test are perfectly
counterclockwise as viewed from the single primary allowed.
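The handedness bookkeeping described above can be illustrated with the scalar triple product $\hat{\mathbf{r}}_1\cdot(\hat{\mathbf{r}}_2\times\hat{\mathbf{r}}_3)$, whose sign flips under mirroring but not under rotation. This is a minimal sketch with illustrative helper names (`triple`, `rotate_z`) and an arbitrary rotation angle; it deliberately does not assign ‘clockwise’ or ‘counterclockwise’ to either sign, since that labelling follows the convention of Fig. 1.

```python
import math


def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def rotate_z(v, angle):
    """Rotate a 3-vector about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


# Toy configuration from the text: r1 along x, r2 along y, r3 along z.
r1, r2, r3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

original = triple(r1, r2, r3)
# Interchange the directions of sides 1 and 2 (r1 along y, r2 along x).
swapped = triple(r2, r1, r3)
# Full parity: reverse every coordinate of every side.
inverted = triple(*(tuple(-c for c in r) for r in (r1, r2, r3)))
# A joint rotation of all three sides leaves the handedness unchanged.
rotated = triple(*(rotate_z(r, 0.7) for r in (r1, r2, r3)))

print(original, swapped, inverted, rotated)
```

Interchanging two sides and inverting all coordinates both reverse the sign, so after rotation-averaging they describe the same mirror-image tetrahedron.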
We produce toy boxes in three configurations. The first is filled
with only counterclockwise tetrahedra (enforced by the radial bin
restriction); the second has only clockwise tetrahedra, and the third
has an equal mix. To render each toy box somewhat more realistic,
we randomly rotate each vertex by an angle $\theta \in [-180°, 180°]$
around each of the three Cartesian coordinates. We also add random
numbers $\Delta r_1 \in [0, 1]$, $\Delta r_2 \in [-2, 2]$, and $\Delta r_3 \in [-1, 0]$ (in $h^{-1}\,{\rm Mpc}$)
to, respectively, $r_1$, $r_2$, and $r_3$. We choose these ranges such that we
always have $r_1 < r_2 < r_3$. Hence, the parity of a given tetrahedron
will not be flipped by these additions.
Fig. 3 shows the 4PCF of these illustrative boxes. We may
approximate each tetrahedron as a sphere about the primary of radius
roughly $20\,h^{-1}\,{\rm Mpc}$, with volume $V_{\rm tet}$, to estimate the expected
4PCF amplitude. We define $n$ as the local number density due
to a given tetrahedron, $n = 1/V_{\rm tet}$, and $\bar{n}$ as the average number
density in the box, $\bar{n} = N_{\rm tets}/L^3_{\rm box}$. We find $\delta^4 = [n/\bar{n} - 1]^4 \approx 2 \times 10^5$.
The lowest lying parity-odd isotropic basis function is
$\mathcal{P}_{111} = -3i/[\sqrt{2}\,(4\pi)^{3/2}]\;\hat{\mathbf{r}}_1 \cdot (\hat{\mathbf{r}}_2 \times \hat{\mathbf{r}}_3)$ (Cahn & Slepian 2020 ). As
expected, the counterclockwise-only box has a positive projection
onto this function, while the clockwise-only box has a negative
projection. The mixed box is consistent with zero projection onto
$\mathcal{P}_{111}$ on average. We can analytically predict the ratios among the
4PCF coefficients for different channels $\ell_1$, $\ell_2$, and $\ell_3$. We compare
the mean ratios of the measured 4PCF coefficients for several
combinations to these predictions and find good agreement.$^{8}$ This
also serves as an additional test of our code (the code is further
discussed in Section 4.1 ).
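The order-of-magnitude estimate of $\delta^4$ can be retraced with the rounded numbers given in the text. The sketch below is a back-of-the-envelope check, not the calculation behind the quoted value; the variable names are illustrative, and the inputs (radius 20, 1500 tetrahedra, box side 1000, all in $h^{-1}\,{\rm Mpc}$ units where applicable) are the approximate values stated above.

```python
import math

L_BOX = 1000.0    # box side length in h^-1 Mpc (from the text)
N_TETS = 1500     # approximate number of tetrahedra per box (from the text)
R_SPHERE = 20.0   # rough radius, in h^-1 Mpc, of the sphere standing in for one tetrahedron

v_tet = 4.0 / 3.0 * math.pi * R_SPHERE ** 3   # volume of the approximating sphere
n_local = 1.0 / v_tet                         # local number density of one tetrahedron
n_mean = N_TETS / L_BOX ** 3                  # mean number density in the box
delta4 = (n_local / n_mean - 1.0) ** 4

print(f"n / nbar = {n_local / n_mean:.1f}")
print(f"delta^4  = {delta4:.2e}")
```

With these rounded inputs the result is of order $10^5$, the same order as the value quoted above; the exact figure is sensitive to the choice of radius and of $N_{\rm tets}$, both of which are only approximate in the text.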
2.2 Internal cancellation
For a given tetrahedron, in practice (but not in our illustrative boxes
above), each of the four vertices gets a chance to serve as the primary
about which the isotropic basis function expansion is computed.
Some of these vertices will be ‘clockwise’, and some ‘counterclockwise’.
Hence, if co-added into the same channel and triple-bin there
will be ‘internal cancellation’ and consequent reduction of any parity-odd
signal. However, if the radial bins are made fine enough then
each vertex, by virtue of the presumably unique lengths of the sides
extending from it, will be accumulated to a different triple-bin.$^{9}$
Thus, finer binning can reduce the internal cancellation and increase
the signal. This is much the same as in a configuration-space BAO
search, where fine enough bins must be chosen that the BAO feature
in the 2PCF is not averaged out by all being added into a single bin.
To isolate the potentially parity-violating component of the 4PCF, we
expand the correlation function in two distinct sets of isotropic basis
functions, one that is parity-even and one that is parity-odd. These are
constructed from products of three spherical harmonics with angular-momentum
indices $\ell_1$, $\ell_2$, and $\ell_3$. Isotropy requires that the $\ell_i$ satisfy
the triangular inequality. If the sum of the $\ell_i$ is even, the product is
parity-even and if the sum is odd, the product is parity-odd. In this
analysis, only the parity-odd elements are used. Each basis element
is a function of three radial distances, the lengths of the sides from a
chosen vertex among the four defining a tetrahedron. In practice, the
radial distances are binned.
$^{8}$ As an example, the analytically predicted ratio for angular momenta { 1,
1, 1 } and { 1, 3, 3 } is $\zeta_{111}/\zeta_{133} = -1.07$, and we measure from the data
$\bar{\zeta}_{111}/\bar{\zeta}_{133} = -1.07$, with $\bar{\zeta}$ denoting the bin-averaging.
$^{9}$ Save for isosceles or equilateral tetrahedra, which we exclude from our
analysis in any case.
Ideally, one would like to capture as much information as possible,
using many narrow radial bins, and working to some high value of the
$\ell_i$. This would increase the difficulty of evaluating all the independent
amplitudes, but in fact, the technique of Slepian & Eisenstein ( 2015b )
and Philcox et al. ( 2022 ) makes this manageable, as we discuss in
Section 2 . The real challenge is in determining the covariance matrix.
This is especially critical when looking for parity violation, where
establishing a statistically significant non-zero signal is the crux and
hence understanding the inevitable statistical fluctuations is essential.
A fairly modest choice of separating the radial variable into 10
bins and including $\ell_i$ only up to $\ell_{\max} = 4$ results in 23 × 120 = 2760
independent amplitudes (23 parity-odd channels times 120 triples of
radial bins); we also consider 18 radial bins, which
produces 23 × 816 = 18 768 4PCF amplitudes. Determining the
covariance matrix is thus a formidable challenge. In order to invert a
sampling covariance matrix, as required for calculating $\chi^2$, we need
at least as many mocks as the dimensionality of the data vector, which
is in excess of the 2000 available to us for BOSS.
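The amplitude counts quoted above can be reproduced by brute-force enumeration. The sketch below assumes, consistently with the quoted numbers and with the exclusion of isosceles and equilateral configurations, that the three sides fall in three distinct radial bins, so that the number of bin triples is a binomial coefficient; the function names are illustrative.

```python
from itertools import product
from math import comb


def parity_odd_channels(ell_max):
    """All (l1, l2, l3) with l_i <= ell_max, triangle inequality, and odd sum."""
    return [(l1, l2, l3)
            for l1, l2, l3 in product(range(ell_max + 1), repeat=3)
            if abs(l1 - l2) <= l3 <= l1 + l2 and (l1 + l2 + l3) % 2 == 1]


def n_amplitudes(n_bins, ell_max):
    """Parity-odd channels times the number of triples of distinct radial bins."""
    return len(parity_odd_channels(ell_max)) * comb(n_bins, 3)


print(len(parity_odd_channels(4)))   # 23
print(comb(10, 3), comb(18, 3))      # 120 816
print(n_amplitudes(10, 4))           # 2760
print(n_amplitudes(18, 4))           # 18768
```

Both data-vector sizes exceed the roughly 2000 available mocks, which is why a sample covariance estimated directly from the mocks cannot be inverted without compression or a reduced channel set.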
We have chosen three ways to obtain the covariance matrix. First,
the NPCF covariance matrix can be calculated analytically under the
assumption of a GRF as shown in Hou et al. ( 2021a ). The GRF does
not on average have any parity-violation at the signal level, but it can
still have non-zero fluctuations in parity-odd modes. Therefore, the
GRF is still the leading-order contribution to the covariance of the
parity-odd modes. This is simply the statement that a signal may be
zero but its root-mean-square may not. We fit the analytic template
covariance matrix by varying the number density and volume with
respect to the covariance matrix derived from the mocks.
With this analytic template in hand, we may (i) directly compute
the $\chi^2$ of the data using the adjusted analytic covariance matrix.
An alternative (ii) is to compress the data vector to reduce its
dimensionality. The eigenvectors of the analytic covariance matrix
with the smallest eigenvalues represent the linear combinations of
basis functions that have the smallest statistical uncertainties. We
then expand the measured 4PCF using just the $N_{\rm eig}$ best expansion
functions. We may also determine the $N_{\rm eig} \times N_{\rm eig}$
covariance matrix directly from the mocks. Since $N_{\rm eig}$ is much less
than the number of mocks, this covariance matrix is invertible. Finally,
we may (iii) use the empirical covariance matrix from the mocks directly,
with no involvement of the analytic template at all, by considering many
fewer channels than in (i) and (ii) (we lower $\ell_{\max}$ to
$\ell_{\max} = 2$). A substantial reduction of the number of channels is
required to enable inverting the empirical covariance matrix employed in
this approach, and so we lose statistical power. We thus treat (iii) as a
test rather than as giving us the main result of our analysis. The
reliability of all three approaches above can be assessed using the mocks
themselves by verifying that their $\chi^2$ (or $T^2$) values match the
expected distribution.
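
As an illustration of approach (ii), the following is a minimal sketch on toy Gaussian data, not the pipeline used in this work. The array shapes, the toy covariance, and the helper names `compress` and `chi2_null` are assumptions made for the example.

```python
import numpy as np


def compress(analytic_cov, data, mocks, n_eig):
    """Project data and mocks onto the n_eig lowest-eigenvalue eigenvectors."""
    evals, evecs = np.linalg.eigh(analytic_cov)  # eigenvalues in ascending order
    basis = evecs[:, :n_eig]                     # lowest-noise directions
    data_c = basis.T @ data                      # shape (n_eig,)
    mocks_c = mocks @ basis                      # shape (n_mocks, n_eig)
    cov_c = np.cov(mocks_c, rowvar=False)        # empirical covariance in the subspace
    return data_c, mocks_c, cov_c


def chi2_null(vec, cov):
    """chi^2 against the null model (zero signal)."""
    return float(vec @ np.linalg.solve(cov, vec))


# Toy setup: fewer mocks than dimensions, as in the uncompressed analysis.
rng = np.random.default_rng(0)
n_dim, n_mocks, n_eig = 60, 40, 10
a = rng.normal(size=(n_dim, n_dim))
analytic_cov = a @ a.T / n_dim + np.eye(n_dim)
mocks = rng.multivariate_normal(np.zeros(n_dim), analytic_cov, size=n_mocks)
data = rng.multivariate_normal(np.zeros(n_dim), analytic_cov)

full_rank = np.linalg.matrix_rank(np.cov(mocks, rowvar=False))
data_c, mocks_c, cov_c = compress(analytic_cov, data, mocks, n_eig)
chi2_data = chi2_null(data_c, cov_c)
print(f"rank of full mock covariance: {full_rank} of {n_dim}")
print(f"compressed chi^2 for the toy data: {chi2_data:.2f}")
```

In this toy case the full mock covariance is rank deficient, while the compressed one is invertible because the subspace dimension is well below the number of mocks.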
4 DATA SET AND COVARIANCE
We use the final galaxy catalogue of the BOSS, from the DR12
of the SDSS-III. The catalogue is split into the NGC and the
SGC. The catalogue contains two samples, CMASS and LOWZ,
which were selected via the SDSS multicolour photometry and
cover a redshift range of 0.15 < z < 0.7. CMASS and LOWZ use
similar target selection algorithms (Eisenstein et al. 2001 ; Cannon
et al. 2006 ). The target selection algorithm provides samples
that are mainly composed of luminous red galaxies (LRGs). For
CMASS the selection algorithm is further tuned to select massive
objects uniformly in redshift (Reid et al. 2016), which results
in an approximately mass-limited sample down to a stellar mass
$M \sim 10^{11.3}\,h^{-1}\,M_\odot$ (Maraston et al. 2013; Leauthaud et al.
2016; Saito et al. 2016; Bundy et al. 2017). The majority of CMASS is
LRGs (∼74 per cent), while the rest is late-type spirals (Masters
et al. 2011). LOWZ consists primarily of LRGs (Parejko et al.
2013). Despite the difference in target selection, the LRGs in the
two samples have similar stellar mass distributions (Maraston et al.
2013). We apply a redshift cut of 0.43 < z < 0.7 to CMASS, which
results in a redshift tail from LOWZ at redshift 0.43 < z < 0.5. This
tail (∼15 per cent of the entire ‘CMASS’ sample) slightly raises the
purity of the CMASS sample by adding more LRGs. To ensure that
the LOWZ sample as used here is independent of the CMASS one, we
apply a redshift cut of 0.2 < z < 0.4 to the former. This cut produces
a separation of about 70 $h^{-1}$ Mpc between the lower edge of CMASS
and the upper edge of LOWZ. Fig. 4 shows the number density for
each sample as a function of redshift. Finally, we note that the early
LOWZ target selection was not uniform due to the use of different
iterations of the galaxy-star separation algorithm (Reid et al. 2016).
Therefore, we do not include those early chunks in this analysis.

Figure 4. The number density n as a function of redshift z for the two BOSS
samples used in this work. The LOWZ (0.2 < z < 0.4) North Galactic Cap
(NGC) is brown and South Galactic Cap (SGC) is orange. For CMASS (0.43
< z < 0.7), the NGC is in purple and the SGC is in lavender. We intentionally
do not allow redshift overlap between the two samples, and the redshift gap
$\Delta z = 0.03$ corresponds to a comoving radial separation of 73 $h^{-1}$
Mpc. This separation means that the samples are fairly independent; the 2PCF
$\xi$ between a point in LOWZ and CMASS would be of order 1 per cent at this
scale. The covariance between two ‘worst-case’ tetrahedra, one in each sample
(where each is very close to the respective edge), is of order $\xi^4$. Few
tetrahedra are near enough on either edge to be significantly correlated with
those in the other slice. The plot shows that LOWZ has both a more uniform
selection function and a somewhat higher average number density than CMASS. It
also lacks the strongly decaying tail with increasing redshift that CMASS
displays. These points are important when assessing the possible impact of
systematics on each sample and when addressing any differences between
the detection significances in the two samples.
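
The size of the gap between the two samples can be checked with a rough calculation. The sketch below assumes a flat ΛCDM cosmology with $\Omega_{\rm m} = 0.31$, which is an illustrative choice and not necessarily the fiducial cosmology of this work, and integrates $c/H(z)$ across the gap between z = 0.40 and z = 0.43.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def comoving_separation(z_lo, z_hi, omega_m=0.31, n_steps=1000):
    """Comoving radial distance between two redshifts in h^-1 Mpc (flat LCDM, assumed)."""
    dz = (z_hi - z_lo) / n_steps
    z_mid = z_lo + (np.arange(n_steps) + 0.5) * dz  # midpoint rule
    e_z = np.sqrt(omega_m * (1.0 + z_mid) ** 3 + 1.0 - omega_m)
    # H0 = 100 h km/s/Mpc, so c / H0 is in units of h^-1 Mpc
    return float(C_KM_S / 100.0 * np.sum(dz / e_z))


gap = comoving_separation(0.40, 0.43)
print(f"comoving gap between LOWZ and CMASS: {gap:.1f} h^-1 Mpc")
```

The result is of the same order as the separation of about 70 $h^{-1}$ Mpc quoted in the text; the exact value depends on the assumed cosmological parameters.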
4.1 4PCF from data and mocks
For our main analyses, we considered tetrahedra with side lengths
$r_1$, $r_2$, and $r_3$ ranging from 20 to 160 $h^{-1}$ Mpc inclusive,
split into 10 linearly spaced bins (see Figs 5 and 6 for the CMASS and the
LOWZ samples, respectively). In addition, we also studied a finer
binning scheme, with side lengths ranging from 20 to 164 $h^{-1}$ Mpc
inclusive, split into 18 linearly spaced bins. Furthermore, we explored
a coarser binning (six bins), presented as a test in Appendix E. As
a result, the sides of the tetrahedron that do not include the primary
can range from 0 to 320 $h^{-1}$ Mpc. We expand the parity-odd 4PCF
in the 23 angular channels with $\ell_i \le 4$ given in Cartesian form
in Appendix A. For the edge corrections, we include all functions
with $\ell_i \le 5$, as further discussed in Section 2. We also compute
the even-parity modes (for which we do not reproduce the basis
functions here), as they are needed within the edge correction,
even though our actual analysis focuses solely on the parity-odd
ones.

Figure 5. Upper panel: The parity-odd 4PCF for the BOSS CMASS data with 10
radial bins, with NGC in red and SGC in blue, showing only the six
lowest-lying channels. The error bars are the root-mean-square of the PATCHY
mocks (over our set of 2000). Here for legibility we focus on the 10-bin
results; the 18-bin results are in Appendix G, as are the remaining channels
for the 10-bin analysis. We have mapped the three radial bins to a single
index, with $r_3$ varying the fastest and $r_1$ the slowest. This is done in
all similar plots that follow. Lower panel: The mean of 2000 PATCHY mocks
(solid curves) for both NGC (red) and SGC (blue). The shaded region is the rms
expected for a single mock; this set of panels is intended to display the
region around zero that the 4PCF of a data set with no true parity-odd signal
might inhabit.
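
The radial binning and the mapping of three radial bins to a single index (with $r_3$ varying fastest and $r_1$ slowest, as used in the plots) can be sketched as follows. The sketch assumes that the three bins are distinct and ordered, which is our reading of the 120 combinations quoted for 10 bins; the helper names are hypothetical.

```python
from itertools import combinations

import numpy as np


def radial_bins(r_min, r_max, n_bins):
    """Edges and centres of linearly spaced radial bins, in h^-1 Mpc."""
    edges = np.linspace(r_min, r_max, n_bins + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return edges, centres


def triple_bin_index(n_bins):
    """Map (b1, b2, b3) with b1 < b2 < b3 to a 1D index; b3 varies fastest."""
    # itertools.combinations yields tuples in lexicographic order,
    # so the last bin index changes fastest and the first slowest.
    return {triple: i for i, triple in enumerate(combinations(range(n_bins), 3))}


edges_10, centres_10 = radial_bins(20.0, 160.0, 10)
edges_18, centres_18 = radial_bins(20.0, 164.0, 18)
index_10 = triple_bin_index(10)
index_18 = triple_bin_index(18)
print("10-bin width:", edges_10[1] - edges_10[0], "h^-1 Mpc;", len(index_10), "triples")
print("18-bin width:", edges_18[1] - edges_18[0], "h^-1 Mpc;", len(index_18), "triples")
```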
To each galaxy, we apply a total weight $w$ given by

$$ w = w_{\rm fkp}\, w_{\rm sys}\, (w_{\rm noz} + w_{\rm cp} - 1), \tag{11} $$

with the systematic weight $w_{\rm sys}$ (a combined weight for stellar
density and seeing), the redshift failure weight $w_{\rm noz}$, and the fiber
collision weight $w_{\rm cp}$ (subscript ‘cp’ for ‘close pairs’). The FKP
weight (Feldman, Kaiser & Peacock 1994) is
$w_{\rm fkp} = [1 + n(z) P_0]^{-1}$, with $P_0 = 10^4\,[h^{-1}\,{\rm Mpc}]^3$;
$n(z)$ is the weighted number density at the given redshift. We use the
public random catalogue provided by BOSS,$^{10}$ which is 50 times the size of
the data catalogue. We split it into 32 chunks such that each chunk is about
1.5 to 2 times the size of the data; the reason is discussed in Slepian &
Eisenstein (2015b) and Philcox et al. (2021), though the detailed splitting
does not notably affect the speed of the 4PCF algorithm. Each chunk is first
randomly shuffled and then normalized so that its weighted sum matches the sum
of the completeness weights $w_{\rm c}$ of the data.
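
A minimal sketch of the weighting in equation (11) and of the splitting of the randoms is given below. The function names, the toy random weights, and the choice to shuffle the whole catalogue once before splitting are assumptions made for illustration; they are not taken from the actual pipeline.

```python
import numpy as np

P0 = 1.0e4  # FKP normalisation in [h^-1 Mpc]^3, as quoted in the text


def fkp_weight(nbar):
    """FKP weight w_fkp = 1 / (1 + n(z) P0)."""
    return 1.0 / (1.0 + nbar * P0)


def total_weight(nbar, w_sys, w_noz, w_cp):
    """Total galaxy weight of equation (11)."""
    return fkp_weight(nbar) * w_sys * (w_noz + w_cp - 1.0)


def split_randoms(random_weights, target_sum, n_chunks=32, seed=0):
    """Shuffle the randoms, split into chunks, rescale each chunk to target_sum."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(random_weights.size)
    chunks = []
    for idx in np.array_split(order, n_chunks):
        w = random_weights[idx]
        chunks.append((idx, w * target_sum / w.sum()))
    return chunks


# Toy example: a galaxy that absorbed one fiber-collided neighbour (w_cp = 2).
w_example = total_weight(nbar=1.0e-4, w_sys=1.0, w_noz=1.0, w_cp=2.0)

rng = np.random.default_rng(1)
random_weights = rng.uniform(0.5, 1.5, size=3200)  # placeholder random weights
target = 40.0                                      # placeholder sum of data weights
chunks = split_randoms(random_weights, target)
print("example total weight:", w_example)
print("number of chunks:", len(chunks))
```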
We use 2000 public MultiDark PATCHY light-cone mocks (PATCHY
hereafter; Kitaura et al. 2016; Rodríguez-Torres et al. 2017). These
mocks use second-order Lagrangian perturbation theory plus a
spherical collapse prescription calibrated on full N-body simulations,
and the mocks are additionally calibrated to match the 2-point and
some 3-point statistics of the observed BOSS CMASS and LOWZ
samples. The PATCHY mocks include realistic survey geometry as
well as many observational effects, further detailed in
Rodríguez-Torres et al. (2017).
10 https://data.sdss.org/sas/dr12/boss/lss/
Figure 6. First and second rows: Measurement of the parity-odd 4PCF using 10 radial bins for the BOSS LOWZ data in the NGC (brown) and SGC (blue).
We show the same six channels as in Fig. 5, and the error bars are again the rms over the mocks. Again for legibility, we focus on the 10-bin results; the 18-bin
results are in Appendix G, as are the remaining channels for the 10-bin analysis. Third and fourth rows: The mean of 2000 PATCHY mocks (solid curves) for
both NGC (brown) and SGC (blue). The shaded region is the rms expected for a single mock.
The 4PCF measurement for the BOSS data catalogue and the 32
(D − R) catalogues alone takes 0.33 GPU-h for the measurement
where we use 10 radial bins, and 0.65 GPU-h for the measurement
where we use 18 radial bins; both timings are using 69 NVIDIA
A100 GPUs (further discussed below) and include edge correction,
which is done on the CPU linked to the GPU and takes negligible
extra time.
All calculations were done with the GPU-accelerated NPCF code
CADENZA (Slepian et al., in preparation), built on the CPU-based
NPCF code ENCORE (Philcox et al. 2022). CADENZA uses the
representation of the basis functions in spherical harmonics, for
reasons outlined in Section 2. Tests of ENCORE are described in
Philcox et al. (2022), and CADENZA was verified to obtain exactly
the same outputs on a fiducial data set used for testing. As noted
already, we used 69 NVIDIA A100 GPUs simultaneously on the
HiPerGator cluster at the University of Florida. Overall, all of the
computations performed for this work would have taken roughly
7.5 million CPU-h, or about 20 months of continuously running
on a cluster of 500 CPU cores. In practice, since CADENZA is
about 140 × faster than ENCORE for the 4PCF, we required about
54 000 GPU-h, or about 1 month if one ran continuously on 69
GPUs.
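
The arithmetic behind these estimates can be retraced with the quoted inputs; the only added assumption is an average month of 30.4 days.

```python
CPU_HOURS = 7.5e6             # total CPU hours quoted in the text
SPEEDUP = 140                 # quoted CADENZA speed-up over ENCORE for the 4PCF
N_CPU_CORES = 500
N_GPUS = 69
HOURS_PER_MONTH = 24 * 30.4   # assumed average month length

cpu_months = CPU_HOURS / N_CPU_CORES / HOURS_PER_MONTH
gpu_hours = CPU_HOURS / SPEEDUP
gpu_months = gpu_hours / N_GPUS / HOURS_PER_MONTH

print(f"CPU wall time on {N_CPU_CORES} cores: {cpu_months:.1f} months")
print(f"GPU hours: {gpu_hours:.0f}")
print(f"GPU wall time on {N_GPUS} GPUs: {gpu_months:.2f} months")
```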
4.2 Covariance matrix
4.2.1 Analytic covariance
The covariance matrix for the 4PCF may be computed analytically
if we assume that the density is a GRF (Hou et al. 2021a). It is
expressed in terms of f-functions:

$$ f_{\ell_1 \ell_2 \ell_3}(r_1, r_2, s) \equiv \int \frac{k^2\,{\rm d}k}{2\pi^2}\, P(k)\, j_{\ell_1}(k r_1)\, j_{\ell_2}(k r_2)\, j_{\ell_3}(k s), \tag{12} $$

with $j_\ell(kr)$ the spherical Bessel function of order $\ell$ and $P(k)$ the
galaxy power spectrum. The power spectrum is taken to be isotropic;
the effect of RSD (after averaging over the line of sight) is simply to
boost the amplitude, which is embedded in the input power spectrum
and also partially absorbed in our later fitting for the best amplitude.
Fast approaches do exist to evaluate these f-integrals (e.g. Slepian
et al. 2019) but here we use direct computation.
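
A direct evaluation of equation (12) can be sketched with a simple quadrature. The power spectrum below is a toy function with an artificial Gaussian damping chosen only so that the integral converges on a finite grid; it is not the spectrum used in this work, and the grid limits are likewise illustrative.

```python
import numpy as np
from scipy.special import spherical_jn


def toy_power_spectrum(k, amplitude=2.0e4, k_turn=0.02, damping=2.0):
    """Toy P(k) in [h^-1 Mpc]^3 (illustrative shape, not a fitted spectrum)."""
    x = k / k_turn
    return amplitude * x / (1.0 + x ** 2.5) * np.exp(-(k * damping) ** 2)


def f_integral(l1, l2, l3, r1, r2, s, k=None):
    """Trapezoidal estimate of f_{l1 l2 l3}(r1, r2, s) from equation (12)."""
    if k is None:
        k = np.linspace(1.0e-4, 2.0, 20000)  # h Mpc^-1; assumed integration grid
    integrand = (
        k ** 2 / (2.0 * np.pi ** 2)
        * toy_power_spectrum(k)
        * spherical_jn(l1, k * r1)
        * spherical_jn(l2, k * r2)
        * spherical_jn(l3, k * s)
    )
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(k)))


f_a = f_integral(1, 2, 1, 30.0, 50.0, 40.0)
f_b = f_integral(2, 1, 1, 50.0, 30.0, 40.0)  # first two arguments swapped
xi_40 = f_integral(0, 0, 0, 0.0, 0.0, 40.0)  # reduces to the toy 2PCF at s = 40
print(f_a, f_b, xi_40)
```

With $\ell_1 = \ell_2 = 0$ and $r_1 = r_2 = 0$ the integral reduces to the 2PCF of the input spectrum, which provides a quick sanity check of an implementation.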
The analytic covariance matrix couples each vertex of an (unprimed)
tetrahedron to a vertex in another (primed) tetrahedron.
Since in our approach the primary vertex is distinguished from the
other three, in the covariance expression there are two pieces: in
one the two primary vertices, unprimed and primed, are paired with
each other; in the other, they are not. The final result will be a sum
given by products involving Wigner-nj symbols and the f-functions
of equation (12). Below we simply reproduce the final result derived
in Hou et al. (2021a); for brevity here we define
$\Lambda \equiv \{\ell_1, \ell_2, \ell_3\}$ and
$\Lambda' \equiv \{\ell'_1, \ell'_2, \ell'_3\}$.
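$$ {\rm Cov}_{\Lambda,\Lambda'}(r_1, r_2, r_3;\, r'_1, r'_2, r'_3) = {\rm Cov}^{\rm I}_{\Lambda,\Lambda'}(r_1, r_2, r_3;\, r'_1, r'_2, r'_3) + {\rm Cov}^{\rm II}_{\Lambda,\Lambda'}(r_1, r_2, r_3;\, r'_1, r'_2, r'_3). \tag{13} $$

$$
\begin{aligned}
{\rm Cov}^{\rm I}_{\Lambda,\Lambda'}(r_1, r_2, r_3;\, r'_1, r'_2, r'_3) ={}& (4\pi)^4 \sum_{G} (-1)^{\Sigma(\Lambda)(1 - E_G)/2} \\
&\times \sum_{L_1 L_2 L_3} D^{\rm P}_{L_1 L_2 L_3}\, C^{L_1 L_2 L_3}_{000}
\begin{Bmatrix} \ell_{G1} & \ell_{G2} & \ell_{G3} \\ \ell'_1 & \ell'_2 & \ell'_3 \\ L_1 & L_2 & L_3 \end{Bmatrix} \\
&\times \int \frac{s^2\,{\rm d}s}{V}\, \xi(s) \prod_{i=1}^{3} \left[ (-1)^{(-\ell_{Gi} - \ell'_i + L_i)/2}\, D^{\rm P}_{\ell_{Gi} \ell'_i L_i}\, C^{\ell_{Gi} \ell'_i L_i}_{000}\, f_{\ell_{Gi} \ell'_i L_i}(r_{Gi}, r'_i, s) \right],
\end{aligned} \tag{14}
$$

$$
\begin{aligned}
{\rm Cov}^{\rm II}_{\Lambda,\Lambda'}(r_1, r_2, r_3;\, r'_1, r'_2, r'_3) ={}& (4\pi)^4 \sum_{G,H} (-1)^{\Sigma(\Lambda')(1 - E_H)/2} \\
&\times \sum_{L_1 L_2 L_3} D^{\rm P}_{L_1 L_2 L_3}\, C^{L_1 L_2 L_3}_{000}
\begin{Bmatrix} \ell_{G1} & \ell_{G2} & \ell_{G3} \\ \ell'_{H1} & \ell'_{H2} & \ell'_{H3} \\ L_1 & L_2 & L_3 \end{Bmatrix} \\
&\times \int \frac{s^2\,{\rm d}s}{V} \prod_{i=1}^{3} \left[ (-1)^{(-\ell_{Gi} - \ell'_{Hi} + L_i)/2}\, D^{\rm P}_{\ell_{Gi} \ell'_{Hi} L_i}\, C^{\ell_{Gi} \ell'_{Hi} L_i}_{000} \right] \\
&\times f_{\ell_{G1} 0 \ell_{G1}}(r_{G1}, 0, s)\, f_{0 \ell'_{H1} \ell'_{H1}}(0, r'_{H1}, s)\, f_{\ell_{G2} \ell'_{H2} L_2}(r_{G2}, r'_{H2}, s) \\
&\times f_{\ell_{G3} \ell'_{H3} L_3}(r_{G3}, r'_{H3}, s).
\end{aligned} \tag{15}
$$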
In Case I, the sum over G includes all six permutations, while in Case
II the sum over G includes only cyclic permutations. The sum over H
includes all six permutations. $E_G$ is the Levi-Civita symbol, giving
a negative sign if the permutation G is odd, and
$\Sigma(\Lambda) \equiv \ell_1 + \ell_2 + \ell_3$.
The coefficients $C^{\ell_1 \ell_2 \ell_3}_{m_1 m_2 m_3}$ and
$D^{\rm P}_{\ell_1 \ell_2 \ell_3}$ are given in equations (3) and
(10), respectively. For the parity-even modes of the 4PCF, the phases
in equations (14) and (15) are identically unity because the sum of
the three angular momenta is always even. However, these phases do
matter for the parity-odd 4PCF.
We also note that, apart from the leading Gaussian contribution, the
covariance should also contain non-Gaussian contributions as well as
the coupling of the redshift-space power spectrum multipole moments
to different angular channels of the covariance. Any combination of
multipoles that after multiplying out is rotation-invariant (i.e. has
total angular momentum zero) can contribute. An expression for this
RSD-added analytic covariance is in appendix E of Hou et al. (2021a),
where it is also shown that this redshift-space-induced coupling can
increase the off-diagonals of the covariance. Hou et al. (2021a)
showed that the rms of the off-diagonals after comparison of the
analytic covariance to the mock-based covariance is exactly the same
in real space and redshift space, both for lognormal mocks and for
QUIJOTE simulations (cf. their figs 6, 7 and 10, 11). Therefore,
off-diagonal enhancement by redshift-space-induced coupling is in
principle not significant enough to greatly affect our analysis.
Furthermore, some of the redshift-space-induced effects can be absorbed
in the overall normalization, as much of their effect after isotropic
averaging is to rescale the overall power spectrum. Nevertheless, we
caution that missing these other terms could in principle reduce the
accuracy of our analytic covariance relative to that from the mocks.
Fitting the volume and number density to match the mocks’
covariance can be done without inverting the latter. This is important to
avoid, as that inverse will be biased unless the number of degrees
of freedom is much smaller than the number of mocks. For the
fitting, we maximize a likelihood based on the Kullback–Leibler
divergence (Kullback & Leibler 1951). One can show that with
the likelihood of equation (55) of Hou et al. (2021a), the optimal
volume at a given number density is uniquely determined, so the
optimization is only a 1D problem, making it efficient. There is no
guarantee that the best-fitting number density and volume for the
covariance of the even-parity 4PCF (e.g. as found in Philcox et al.
2021) and for the covariance of the parity-odd 4PCF should be the
same. We summarize the values obtained for each sample and each
radial binning in Table 1. For comparison, we also give the effective
volume estimated from BOSS using equation (50) of Reid et al.
(2016); the survey area used in that equation is taken from table 2
of Reid et al. (2016). As discussed above, the template covariance
employs several approximations, including the assumption that the
density field is purely Gaussian random, and the omission of higher
power spectrum multipole moments (with respect to the line of sight).
Such simplifying assumptions may cause an underestimate of the true
covariance. To address this issue, an overall prefactor is introduced
and is interpreted as the effective volume $V_{\rm eff}$; this fitted
effective volume can be lower than the one reported in Reid et al.
(2016). However, we want to point out that this volume-like prefactor
is only a rough approximation. Fully accounting for the
effective volume requires including geometry effects in the analytic
covariance template.

Table 1. Summary of the data sets and binning schemes used. We give
the effective redshift $z_{\rm eff}$ and the fitted effective volume
$V_{\rm eff,fit}$ and number density $\bar{n}$ derived by fitting our analytic
covariance to that from the mocks. The fitted effective volume and number
density are different for each number of radial bins because we re-fit the
covariance for each radial binning scheme. In the last row, we also list the
effective volume using equation (50) from Reid et al. (2016) for comparison.

                                               CMASS                LOWZ
                                          (z_eff = 0.57)       (z_eff = 0.32)
Bins  Quantity                             NGC      SGC         NGC      SGC
18    V_eff,fit [(h^-1 Gpc)^3]             2.50     0.79        0.31     0.12
      n̄ × 10^4 [h^3 Mpc^-3]                1.4      1.4         2.0      2.0
10    V_eff,fit [(h^-1 Gpc)^3]             1.71     0.54        0.23     0.09
      n̄ × 10^4 [h^3 Mpc^-3]                1.8      1.8         2.6      2.6
6     V_eff,fit [(h^-1 Gpc)^3]             1.36     0.44        –        –
      n̄ × 10^4 [h^3 Mpc^-3]                2.2      2.2         –        –
–     V_eff [(h^-1 Gpc)^3]                 2.50     0.95        0.66     0.29
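
The statement that the optimal volume is fixed once the number density is chosen can be illustrated with a toy calculation. The sketch assumes that the analytic covariance scales as 1/V and that the likelihood takes the Gaussian Kullback–Leibler form $-\ln L = {\rm tr}(\Psi C_{\rm mock}) - \ln\det\Psi$ (up to constants), with $\Psi$ the analytic precision matrix; both are our assumptions about the form of equation (55) of Hou et al. (2021a), and the matrices are random placeholders.

```python
import numpy as np


def optimal_volume(unit_volume_cov, mock_cov):
    """Volume minimising -ln L when C_analytic = unit_volume_cov / V (assumed scaling)."""
    n = mock_cov.shape[0]
    return n / np.trace(np.linalg.solve(unit_volume_cov, mock_cov))


def neg_log_like(volume, unit_volume_cov, mock_cov):
    """-ln L = tr(Psi C_mock) - ln det(Psi), with Psi = V * inverse(unit_volume_cov)."""
    n = mock_cov.shape[0]
    trace_term = volume * np.trace(np.linalg.solve(unit_volume_cov, mock_cov))
    _, logdet_unit = np.linalg.slogdet(unit_volume_cov)
    return trace_term - (n * np.log(volume) - logdet_unit)


# Toy check: build a "mock" covariance with a known volume and recover it.
rng = np.random.default_rng(1)
n = 20
a = rng.normal(size=(n, n))
unit_cov = a @ a.T + n * np.eye(n)  # placeholder unit-volume template
v_true = 1.7
mock_cov = unit_cov / v_true
v_fit = optimal_volume(unit_cov, mock_cov)
print(f"recovered volume: {v_fit:.3f}")
```

Because the volume follows in closed form from a trace, only the number density needs a numerical search, and the mock covariance itself is never inverted.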
In Fig. 7, we compare for CMASS NGC the parity-odd 4PCF
analytic covariance matrix to that from PATCHY for 10 radial bins. We
have mapped the channel and triple-bin to a 1D index, so the covariance
may be plotted as a 2D matrix (one 1D index for the unprimed
channels and triple-bins, another for the primed). In the left-hand
panel, we show a direct comparison of the correlation matrix, which
is the covariance matrix normalized by its diagonal. The similarity
between the analytic correlation matrix (upper triangle) and the
mock correlation matrix (lower triangle) shows that the analytic
approach can capture the covariance’s features well. In the right-hand
panel, we compare the diagonal elements of the covariance matrices
and also show the ratio between the mock and analytic covariance
diagonals. Again this uses our mapping of the angular momenta
and radial bins into a 1D index. Despite the similarity between the
two diagonals in their behaviour with increasing index, there is
non-negligible variation. On average, the ratio between the
mock and analytic covariance diagonal elements is roughly 1.05 for
CMASS NGC. Fig. 8 shows the same comparisons for the LOWZ
NGC. Again, we see that the analytic covariance can describe
that of the mocks well. The mean of the ratio is 1.01 for the LOWZ
NGC.

Figure 7. Comparisons of the analytic parity-odd 4PCF covariance with that from the PATCHY mocks, all for CMASS NGC and 10 radial bins. For greater
legibility, we show only 11 of the 23 channels used in our analysis, and we have mapped each unprimed combination of angular momenta and radial bins to
a 1D index, and the same for each primed; the full set of variables and indices involved is given in equation (13), and the volume and number density
used here are given in Table 1. Left-hand panel: Analytic correlation matrix (upper triangle) and correlation matrix from the PATCHY mocks (lower triangle). The
sub-blocks represent the covariance between a pair of channels; each is 120 × 120 because, for 10 radial bins, there are 120 radial-bin combinations in each channel.
These sub-blocks show how the covariance changes as the side lengths of each tetrahedron vary at fixed channel pairs. The similarity between the upper and
lower triangles shows that the analytic covariance captures the cross-covariance between different radial bins as well as between different channels. Right-hand
panel: Diagonals of the PATCHY covariance matrix (solid black) and the analytic covariance (dashed red) for CMASS NGC. The average ratio is 1.05.

Figure 8. Here, we show the same comparisons as made in Fig. 7 but for the LOWZ NGC. Again, in the left-hand panel, we see that the analytic and mock
correlation matrices have very similar patterns. The average ratio between the mock and analytic covariance diagonals is 1.01 as indicated by the dotted black
horizontal line in the lower right panel.
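
The two diagnostics used in these comparisons (the correlation matrix and the mean ratio of the diagonals) are simple to compute. The sketch below uses placeholder matrices, and the helper names are illustrative.

```python
import numpy as np


def correlation_matrix(cov):
    """Covariance normalised by its diagonal: C_ij / sqrt(C_ii C_jj)."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)


def mean_diagonal_ratio(mock_cov, analytic_cov):
    """Mean ratio of mock to analytic covariance diagonals."""
    return float(np.mean(np.diag(mock_cov) / np.diag(analytic_cov)))


def combine_triangles(analytic_cov, mock_cov):
    """Analytic correlation in the upper triangle, mock correlation in the lower."""
    upper = np.triu(correlation_matrix(analytic_cov))       # includes the diagonal
    lower = np.tril(correlation_matrix(mock_cov), k=-1)     # strictly below diagonal
    return upper + lower


# Placeholder matrices: the "mock" covariance is the analytic one scaled by 1.05.
rng = np.random.default_rng(4)
n = 12
a = rng.normal(size=(n, n))
analytic_cov = a @ a.T + n * np.eye(n)
mock_cov = 1.05 * analytic_cov
combined = combine_triangles(analytic_cov, mock_cov)
ratio = mean_diagonal_ratio(mock_cov, analytic_cov)
print(f"mean diagonal ratio: {ratio:.2f}")
```

An overall rescaling of the covariance leaves the correlation matrix unchanged, which is why the amplitude is examined separately through the ratio of the diagonals.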
To further quantify the similarity between the analytic and mock
covariance, we define a half-inverse test matrix as

$$ \mathbf{S} \equiv \mathbf{C}^{-1/2}_{\rm analytic}\, \mathbf{C}_{\rm patchy}\, \mathbf{C}^{-1/2}_{\rm analytic} - \mathbf{1}, \tag{16} $$

where $\mathbf{C}^{-1/2}_{\rm analytic}$ is the square root of the inverse of
the analytic covariance matrix, and $\mathbf{C}_{\rm patchy}$ is the
covariance from the mocks. We use half-inverses so that the test is symmetric.
Were the two covariance matrices identical, $\mathbf{S}$ would be the zero
matrix (we note that the identity matrix is subtracted). One can show that the
optimal volume at a given number density (as discussed before) will imply
that $\mathbf{S}$ is traceless.
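
A minimal sketch of the half-inverse test of equation (16) on toy Gaussian mocks follows. The matrices and the rescaling step are illustrative: the volume-like factor is chosen as the number of dimensions divided by the trace of the inverse analytic covariance times the mock covariance, which is our assumption for how an optimal overall amplitude would be set.

```python
import numpy as np


def half_inverse_test(analytic_cov, mock_cov):
    """S = C_a^{-1/2} C_mock C_a^{-1/2} - 1 for a symmetric positive-definite C_a."""
    evals, evecs = np.linalg.eigh(analytic_cov)
    inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T  # scales column j by evals[j]**-0.5
    return inv_sqrt @ mock_cov @ inv_sqrt - np.eye(len(evals))


rng = np.random.default_rng(2)
n_dim, n_mocks = 30, 2000
a = rng.normal(size=(n_dim, n_dim))
analytic_cov = a @ a.T / n_dim + np.eye(n_dim)
mocks = rng.multivariate_normal(np.zeros(n_dim), analytic_cov, size=n_mocks)
mock_cov = np.cov(mocks, rowvar=False)

s_matrix = half_inverse_test(analytic_cov, mock_cov)
off_diag = s_matrix[~np.eye(n_dim, dtype=bool)]

# Rescale the analytic covariance by a volume-like factor chosen to make S traceless.
volume_like = n_dim / np.trace(np.linalg.solve(analytic_cov, mock_cov))
s_rescaled = half_inverse_test(analytic_cov / volume_like, mock_cov)

print(f"off-diagonal scatter: {off_diag.std():.3f}")
print(f"1/sqrt(N_mocks):      {1.0 / np.sqrt(n_mocks):.3f}")
print(f"trace after rescaling: {np.trace(s_rescaled):.2e}")
```

In this toy case, where the mocks are drawn from the analytic covariance itself, the off-diagonal scatter of S is set by the finite number of mocks alone.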
Fig. 9 shows the lower half of the matrix $\mathbf{S}$ for the half-inverse
test of CMASS NGC (left-hand panel) and LOWZ NGC (right-hand
panel), with the standard deviation shown on the upper triangle. The
standard deviation for all elements is $\sigma_{\rm all} = 0.06$ for both
CMASS NGC and LOWZ NGC, which is expected as it scales as
$1/\sqrt{N_{\rm mocks}}$ (we used $N_{\rm mocks} = 2000$ for this test). Each
element of the covariance matrix follows a Wishart (1928) distribution, and
the variance of the diagonal elements for this distribution is two times that
of the off-diagonal elements. We do find that the standard deviation
$\sigma_{\rm diag}$ is approximately two times that of the off-diagonal,
$\sigma_{\rm non\text{-}diag}$. However, we also find residuals in the
diagonal elements of $\mathbf{S}$ for both the left- and right-hand panels,
and larger residuals for LOWZ NGC than for CMASS NGC. These indicate that
fitting the volume and number density to best match the mock covariance is
imperfect. As mentioned in Section 3, one of our analysis approaches
(‘compressed’) will only require a smooth estimate of the covariance and turns
out to be insensitive to its amplitude. Furthermore, even when employing
the analytic covariance matrix directly, we may compare the $\chi^2$
from data only to the distribution of $\chi^2$ values for the mocks, both
computed using the same covariance matrix. This comparison should
minimize any bias in the detection significance computed using the
analytic covariance.
Figure 9. Half-inverse test (equation 16) of the analytic covariance matrix at $\{\Lambda, \Lambda'\} = \{111, 111\}$ for CMASS NGC (left-hand panel) and LOWZ NGC
(right-hand panel). We show only half of the matrix in the lower triangle and indicate the standard deviation of the matrix elements in the upper triangle; the test
is symmetric so there is no information lost by deleting the latter. Here, we compare the analytic covariance matrix to that from 2000 PATCHY mocks; the overall
standard deviation is roughly $1/\sqrt{N_{\rm mocks}} \approx 0.02$. It is expected that the standard deviation of the diagonal is double, as further discussed in the main text; we
see a slight deviation from this relation in both panels.
4.2.2 Mock covariance matrix
In this work, we assume that the PATCHY mocks match well the
covariance structure of the observed data. This is a reasonable
assumption as the PATCHY mocks are calibrated to match the 2-point
and some 3-point statistics measured from BOSS. Here, we outline
in detail aspects of PATCHY that could cause a mismatch between the
mock covariance and the data’s true covariance. In the end, we do
not find any strong reason to believe that such a mismatch is present,
but that is after performing a number of tests motivated by the points
below.
(i) PATCHY uses approximate methods to simulate non-linear
structure formation, and underestimates small-scale power
($s \lesssim 20\,h^{-1}$ Mpc), especially for galaxies with high stellar mass
(Kitaura et al. 2016; see their fig. 4). Lacking power at smaller scales could
have two impacts on the covariance. First, it reduces the number of ‘squeezed’
tetrahedra (where the side lengths from the primary are close to each
other in size, so that galaxies can become very close to each other
and the angles at the primary vertex become small). This could
produce an underestimate of the covariance on small scales. Second,
this lack of small-scale power also may reduce the derived covariance
at all scales, as the covariance comes from averaging the product
of unprimed and primed tetrahedra over all possible separations
between them, including small separations wherein points on the
primed one may be close to those on the unprimed one. Indeed, this
is the origin of s in the integral of equation (12).
To assess the impact of the above, we do an analysis where we force
each side to be different in length from the others by at least one full
bin width; this at least addresses the first point in (i). We still find a
significant detection of a parity-odd 4PCF in all samples. The second
point is more challenging to deal with. Although we did not have a
direct test for it, its impact can be inferred from the parity-even
connected 4PCF measurement (which tracks only the contribution
due to non-linear evolution). The level of agreement between PATCHY
and BOSS CMASS suggests that PATCHY actually does capture
the non-linearity well on the scales relevant to our analysis.
(ii) PATCHY may not reproduce all systematics that may be present
in the data. We computed the 2PCF and found that the LOWZ SGC has
an amplitude that is higher on average than the mean of the PATCHY
mocks by 1σ. However, it is quite consistent with the spread we see in
the measured 2PCF for a set of 1000 mocks. In addition, as discussed
in Ross et al. (2016) and Kitaura et al. (2016), there are deviations
between PATCHY and the BOSS 2PCF at scales greater than roughly
100 $h^{-1}$ Mpc. These 2PCF bins are highly correlated, though, and the
actual deviation may be less severe than the visual impression. Again
we found that the spread of the 2PCF of 1000 mocks covered the
BOSS data well. Nonetheless, to make our analysis robust, we also
performed our parity-odd analysis with a cut on the maximum side
lengths from the primary so that no side of the tetrahedron could
exceed 160 $h^{-1}$ Mpc. We still found evidence for a parity-odd 4PCF
consistent with what one would expect given the significant reduction
in the constraining power in this analysis due to the much smaller
number of triple-bins it permits.
We also briefly consider that the number density of the survey is not
equal to the true number density of the Universe; this would be a
failure of the total integral constraint. Such a failure would produce
a correction to the observed power spectrum, such that a term
$P_{\rm ic} \ge 0$ is subtracted (see equation 29 of Beutler et al. 2014). The
PATCHY mocks would not have this correction, and hence have a larger power
spectrum than the observed data. Thus, the covariance estimated
from them would be larger than it should be. Since this error, if
present, would cause an overestimation of the covariance (and thus
a spuriously decreased detection significance), and also since given
BOSS’s large volume it is expected to be a small effect, we do not
explore it further.
4.2.3 Consistency of parity-even modes
Impact of possible systematics
As will be discussed further in Section 6, distortions in the radial
or the angular directions that are not captured in the mocks can mean
that a mock-based covariance underestimates the true covariance.
Here, we study the sensitivity of the parity-even sector to such
contamination. For a distortion of the radial selection function,
we apply the contamination directly to the mocks and find that their
mean detection significance of a connected even-parity 4PCF (when
computed with an undistorted covariance, as would be the case if
the data had an unaccounted-for change) is shifted by ∼0.6σ for 10
radial bins and ∼2.4σ for 18 radial bins. We also explore angular
contamination as described in Section 6.1.5; this increases the
even-parity detection significance difference between the data and the mean
of the mocks to 0.9σ for 10 radial bins and 1.8σ for 18 radial bins.
Self-calibration of covariance
Our detection significance relies on correctly estimating the
covariance. Since the PATCHY mocks have no intrinsic parity-violating
mechanism, they cannot be used to check our pipeline regarding
signal, but only to estimate the covariance. Yet it is possible that the
mocks do not contain all the systematics present in the real data, and
this lack could result in mis-estimation of the covariance.
However, any unknown data-only systematic might well impact
the parity-even modes as well as the parity-odd modes. Hence, we can
see if the mocks and data are consistent in the parity-even sector at the
signal level, where both do have a signal due to non-linear evolution.
If we see that the even-parity signal from the data is consistent
with that from the mocks, this suggests that there are unlikely to
be unaccounted-for systematics. It also would argue that we have
reasonably estimated the even-sector covariance, as an underestimate
of it would show up as a severe tension between the signal in the
mocks and in the data. Since our procedures for obtaining the even-
and odd-sector covariances are exactly the same, such a finding would
in turn build confidence that our odd-sector covariance is correct.
Furthermore, as a highly conservative approach, we can ask what
factor the covariance in the even sector would need to be rescaled
by to force the data to agree within e.g. 3σ or 1σ with the mocks.
We can then apply this rescaling to our odd-parity covariance and
ask how it impacts our detection significance in the odd sector. We
emphasize that this is a highly conservative check, not a correction
procedure; it is perfectly possible that the even-parity data signal
is e.g. 3σ different from that in the mocks just by chance, without
implying that there is any error that should be corrected in either
sector. Thus, while we summarize the results of this idea in Table 3,
one should not take these rescalings too literally.
For CMASS with 10 radial bins, we found excellent agreement
between the data and mock distribution both using the analytic
covariance and the compressed method (also see Philcox et al. 2021,
where a slightly different sample selection was applied). Therefore,
we do not present any rescaling factor for the detection significance
for the parity-odd modes.
For CMASS with 18 radial bins, we found a discrepancy in the
even-parity sector between the data and the mock distribution of 4.9σ
when using the analytic covariance. Rescaling the even covariance
such that the data and mocks agree at the 3σ level, we then propagate
the same factor to the parity-odd measurement. We also repeat this
procedure to enforce the agreement at the 1σ level in the even sector.
We summarize these results in Table 3.
Regarding the discrepancy found in the 18-bin test, we ascribe it
to the possible breakdown of the GRF assumption of the analytic
covariance as one goes to smaller scales, especially for the even
sector, as discussed in Section 4.2. In particular, the parity-even
covariance requires including the leading higher order covariance and
connected estimator contribution. These contributions are naturally
absent when estimating the parity-odd covariance. Further, when
using the compressed method with $N_{\rm eig} = 800$, the discrepancy is
reduced to 1.3σ.
5 RESULTS ON CMASS AND LOWZ DATA
Here, we first compute the significance of our observed parity-odd
4PCF using the methods briefly outlined in Section 3 , and then
explore the cross-correlation of NGC and SGC in each sample.
5.1 Statistical significance
We quantify the statistical significance of the parity-violating
amplitudes using three approaches: a direct analysis harnessing the analytic
covariance calibrated against PATCHY (Section 5.1.1), a compressed
analysis using the analytic covariance just to select a reduced basis
with few enough degrees of freedom that the mocks can then provide
the covariance (Section 5.1.2), and finally, by lowering $\ell_{\max}$ enough
so that we may use the direct approach but with covariance from the
mocks (Section 5.1.5).
5.1.1 Direct analysis
With the analytic covariance, adjusted in density and volume to best
fit the covariance given by the mocks, we evaluate

$$\chi^2 = \left(\hat{\zeta} - \zeta_{\rm model}\right)^{T} \mathbf{C}^{-1} \left(\hat{\zeta} - \zeta_{\rm model}\right), \qquad (17)$$

where $\hat{\zeta}$ is the vector of the various measured parity-violating
amplitudes, $\hat{\zeta}_{\ell_1 \ell_2 \ell_3}(r_1, r_2, r_3)$, and $\zeta_{\rm model}$ is a model for them. To
investigate the null hypothesis that there is no parity violation, we set
all elements of $\zeta_{\rm model}$ to zero. $\mathbf{C}^{-1}$ is the inverse covariance matrix.
If the underlying data have Gaussian behaviour with $N_{\rm d}$ degrees of
freedom, the resulting distribution is the $\chi^2$ distribution:

$$f(\chi^2) = \frac{1}{2^{N_{\rm d}/2}\,\Gamma(N_{\rm d}/2)} \left(\chi^2\right)^{N_{\rm d}/2 - 1} e^{-\chi^2/2}. \qquad (18)$$

The adequacy of the covariance obtained by fitting the analytic
covariance to the data sets can be assessed by examining the
distribution of $\chi^2$ values from the mocks when run through our
analysis, and seeing if it matches the predicted $\chi^2$ distribution.$^{11}$
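The null-hypothesis statistic of equation (17) can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the three-element data vector and covariance below are made-up numbers, and the helper names `cholesky` and `chi_squared` are chosen here for illustration. The sketch uses a Cholesky factorization so that no explicit matrix inverse is formed.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def chi_squared(zeta_hat, zeta_model, cov):
    """chi^2 = r^T C^{-1} r with r = zeta_hat - zeta_model (equation 17)."""
    resid = [a - b for a, b in zip(zeta_hat, zeta_model)]
    lower = cholesky(cov)
    # Forward-substitute L y = r; then chi^2 = y^T y, with no explicit inverse.
    y = []
    for i, r_i in enumerate(resid):
        s = sum(lower[i][k] * y[k] for k in range(i))
        y.append((r_i - s) / lower[i][i])
    return sum(v * v for v in y)


# Toy three-element data vector and covariance (illustrative numbers only).
cov = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]]
zeta_hat = [1.0, -0.5, 0.8]
null_model = [0.0, 0.0, 0.0]  # no parity violation
print(chi_squared(zeta_hat, null_model, cov))
```

For a realistic data vector one would normally rely on a linear-algebra library; the pure-Python version is only meant to make the algebra explicit.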
5.1.2 Compressed analysis
The alternative means of obtaining an invertible covariance matrix,
by reducing the dimensionality of the problem (the ‘compressed’
analysis), is described in Section 3. This scheme was introduced by
Scoccimarro, Couchman & Frieman (1999) to quantify detection
significance, and the same method was applied to the even-parity
connected 4PCF analysis of Philcox et al. (2021). We diagonalize
the analytic covariance, writing $\mathbf{C} = \mathbf{O} \mathbf{D} \mathbf{O}^{-1}$, where $\mathbf{D}$ is diagonal
and $\mathbf{O}$ is an orthogonal matrix (we note that $\mathbf{O}^{-1} = \mathbf{O}^{T}$).
The columns of $\mathbf{O}$ then provide an eigenbasis for the analytic
covariance. We would like to select the subset of these eigenvectors
that will be most useful in detecting a signal. If one has a model for
the expected signal, one can rank the eigenvectors by their S/N, and
choose the $N_{\rm eig}$ of them with the best S/N. In the absence of an anticipated
signal, we choose the $N_{\rm eig}$ eigenvectors associated with the smallest
eigenvalues. These eigenvectors have the lowest noise. The analysis
is then restricted to the space spanned by these eigenvectors. With
$N_{\rm eig} \ll N_{\rm mocks}$, the covariance matrix for this restricted analysis can
be determined entirely from the mocks, while retaining the domain
with the greatest information.
$^{11}$ We expect that radially binning the 4PCF amplitudes results in a multivariate
Gaussian distribution on each triple-bin. Thus, a $\chi^2$ distribution is appropriate
here.
Table 3. Here, we summarize the rescaled detection significance in the parity-odd modes obtained by forcing
agreement in the parity-even sector at the levels indicated in the leftmost column. These results are for
CMASS and 18 radial bins; see Section 4.2.3. We explore this rescaling both for our analytic covariance
analysis and our compressed analysis with 800 eigenvalues; in the last line, no rescaling is required
to attain 3σ consistency in the even sector. Generally, it is notable that an odd detection more or less
remains in all cases even after rescaling.

Consistency in          CMASS,                 Rescaling   Rescaled odd
even-parity sector      18 bins                factor      detection significance
1 standard deviation    Analytic covariance    0.88        2.0σ
                        $N_{\rm eig} = 800$    0.98        4.0σ
3 standard deviations   Analytic covariance    0.94        4.6σ
                        $N_{\rm eig} = 800$    –           4.4σ
Figure 10. $T^2$ and $\chi^2$ distributions for CMASS data using 10 radial bins. The first five plots are obtained using the data compression scheme of Section 5.1.2
with different numbers of eigenvalues $N_{\rm eig}$, as indicated in the plot titles. If there is no parity-breaking, we expect a $T^2$ distribution (equation 20) with $N_{\rm eig}$ degrees
of freedom (solid black). The vertical lines in the first five plots are the $T^2$ values from CMASS for the NGC (red), SGC (blue), and joint NGC and SGC (black).
For comparison, we show histograms of the $T^2$ values from the PATCHY mocks for each case. We interpret the increasing deviations of the mocks’ histograms
from the $T^2$ distribution as $N_{\rm eig}$ rises as evidence that the covariance becomes less well-determined for larger $N_{\rm eig}$. This is expected as for $N_{\rm eig} = 800$, there are
only about three mocks per degree of freedom (we use 2000 mocks). In the lower rightmost plot, we use the direct approach of Section 5.1.1, which employs
the analytic covariance. Here, the expected distribution if there is no parity-breaking is a $\chi^2$ distribution (equation 18; solid black). We show a histogram of the
mocks’ $\chi^2$ values for comparison (orange) as well as a Gaussian fit to it (dot–dashed). This histogram is wider than the predicted $\chi^2$ distribution (which the
mocks should match as they have no parity-breaking). We interpret this as an indication that the analytic covariance may not be unbiased. To be conservative,
we compute a detection significance $\sigma_{\rm G}$ by comparing the data $\chi^2$ values to the width of the Gaussian fit to the mocks. In fact, we quote $\sigma_{\rm G}$ in all panels above,
but it generally agrees with that computed from the $T^2$ distribution, $\sigma_{T^2}$. We do not quote a detection significance computed with respect to the $\chi^2$ distribution
(as would naively be appropriate in the lower rightmost panel) because the distribution of mocks is clearly mismatched with a $\chi^2$ distribution.
While this procedure provides an invertible covariance matrix, the
inverse of the mock covariance matrix is unbiased only when
$N_{\rm eig} \ll N_{\rm mocks}$, where here $N_{\rm eig}$ is the number of degrees of freedom, i.e. it plays
the role of $N_{\rm d}$ in Section 5.1.1. Consequently, the $\chi^2$ distribution we
used in Section 5.1.1 is modified. To distinguish this distribution from
the $\chi^2$ distribution, we refer to the variable $T^2$, defined analogously
to $\chi^2$, as

$$T^2 = \left(\hat{\zeta} - \zeta_{\rm model}\right)^{T} \mathbf{C}^{-1} \left(\hat{\zeta} - \zeta_{\rm model}\right), \qquad (19)$$

whose distribution is (Sellentin & Heavens 2016)

$$f(T^2) = \frac{\Gamma\left(N_{\rm mocks}/2\right)}{\Gamma(N_{\rm eig}/2)\,\Gamma\left([N_{\rm mocks} - N_{\rm eig}]/2\right)} \times \frac{(N_{\rm mocks} - 1)^{-N_{\rm eig}/2}\,(T^2)^{N_{\rm eig}/2 - 1}}{\left(T^2/(N_{\rm mocks} - 1) + 1\right)^{N_{\rm mocks}/2}}. \qquad (20)$$

This reduces to the $\chi^2$ distribution for $N_{\rm mocks} \to \infty$.
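The limiting behaviour of equation (20) can be checked numerically. The sketch below is only an illustration: it evaluates the logarithm of the $T^2$ density with `math.lgamma`, compares it to the standard $\chi^2$ density at one arbitrary point, and uses arbitrary illustrative values of $N_{\rm eig}$ and $N_{\rm mocks}$; the function names are not from the analysis code.

```python
import math


def log_t2_pdf(t2, n_eig, n_mocks):
    """Log of the T^2 density of equation (20)."""
    return (math.lgamma(n_mocks / 2) - math.lgamma(n_eig / 2)
            - math.lgamma((n_mocks - n_eig) / 2)
            - (n_eig / 2) * math.log(n_mocks - 1)
            + (n_eig / 2 - 1) * math.log(t2)
            - (n_mocks / 2) * math.log1p(t2 / (n_mocks - 1)))


def log_chi2_pdf(x, n_d):
    """Log of the standard chi^2 density with n_d degrees of freedom."""
    return (-(n_d / 2) * math.log(2.0) - math.lgamma(n_d / 2)
            + (n_d / 2 - 1) * math.log(x) - x / 2)


n_eig = 10  # illustrative number of degrees of freedom
for n_mocks in (50, 2000, 10**6):
    ratio = math.exp(log_t2_pdf(12.0, n_eig, n_mocks) - log_chi2_pdf(12.0, n_eig))
    print(n_mocks, ratio)  # the ratio approaches 1 as n_mocks grows
```

Working with log-densities avoids overflow in the gamma functions when the number of mocks is large.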
Since the data from CMASS and LOWZ in the NGC and SGC
are statistically independent given the large physical separations of
both NGC and SGC and of CMASS and LOWZ, we can simply add
their values of $\chi^2$ or $T^2$, while increasing the number of degrees of
freedom correspondingly.

Figure 11. Similar to Fig. 10 but for LOWZ, again using 10 radial bins. As for CMASS, in the compressed approach, the detection significance first increases
with an increasing number of eigenvalues and then drops. In general, the detection significance is comparable to CMASS for both compressed and direct analyses
and for split and joint sky. Again in the direct method, we see a noticeable deviation of the mocks’ histogram from a $\chi^2$ distribution; this again suggests that the
analytic covariance may not be unbiased. We also see noticeable deviation at $N_{\rm eig} = 800$ (and even at $N_{\rm eig} = 500$), to a much greater extent than for the same
$N_{\rm eig}$ values in CMASS (cf. Fig. 10).
5.1.3 Detection significances from direct and compressed methods
Fig. 10 displays the $T^2$ or $\chi^2$ distributions for CMASS (both NGC
and SGC) for the two approaches described above (direct and
compressed). The detection significance changes as we vary the
number of eigenvalues, but this is expected: as $N_{\rm eig}$ rises, more
information is added (unless the S/N is pathological), but at the same
time the potential bias of the inverse covariance increases. The
latter can be assessed by looking for when the distribution of the
mocks’ $T^2$ values begins to deviate from the expected $T^2$ distribution
(equation 20). Fig. 11 shows the same information but for LOWZ,
which generally shows detection significances comparable to those of
CMASS, both at each number of eigenvalues in the compressed
analysis and in the direct analysis.
5.1.4 Using finer binning
Since two cancelling contributions can occur from a single
tetrahedron when they fall into the same radial bins (Section 2.2), increasing
the number of bins could increase the power of the analysis, at the
cost of further enlarging the covariance matrix. So motivated, we
now use 18 linearly spaced radial bins from $r_{\min} = 20\,h^{-1}\,{\rm Mpc}$ to
$r_{\max} = 164\,h^{-1}\,{\rm Mpc}$, leading to a bin width of $8\,h^{-1}\,{\rm Mpc}$, roughly
double the radial resolution of our previous analyses. We refit the
volumes and number densities for the covariance matrices of CMASS
NGC and SGC and LOWZ NGC and SGC as in Section 4.2, and
these numbers are in Table 1. Other than these new inputs to the
covariance matrix, all else is the same in our analysis, and we display
the results in Fig. 12 (CMASS) and Fig. 13 (LOWZ). Generally, for
each sample, the detection significance rises at each $N_{\rm eig}$ relative to
the 10-bin analysis, and the same is true for the ‘direct’ significances
(where the analytic covariance was used). This increase is expected
given our arguments about ‘internal cancellation’. We summarize the
significances in Table 2.
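The quoted bin width follows directly from the radial range and the number of bins. As a quick arithmetic check, the short Python sketch below builds linearly spaced bin edges; the function name `radial_bin_edges` is illustrative and is not taken from the analysis code.

```python
def radial_bin_edges(r_min, r_max, n_bins):
    """Return the n_bins + 1 edges of linearly spaced radial bins."""
    width = (r_max - r_min) / n_bins
    return [r_min + i * width for i in range(n_bins + 1)]


# Radial range quoted in the text, in units of h^-1 Mpc.
edges_18 = radial_bin_edges(20.0, 164.0, 18)
width_18 = edges_18[1] - edges_18[0]
print(width_18)  # 8.0
```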
5.1.5 Reduced-$\ell_{\max}$ analysis
A pure-mock covariance is ideal for deriving the detection
significance, yet the high-dimensional data vector leads to a
non-invertible sample covariance matrix due to the limited number of mocks.
Reducing the number of channels considered makes it possible to
use a covariance matrix taken directly from the mocks, with no need
for the analytic one.
For $\ell_{\max} = 1$, there is just one parity-odd channel, and for
$\ell_{\max} = 2$ there are four. Fig. 14 shows the detection significances
for CMASS and LOWZ with $\ell_{\max} = 1$ in the first and second
rows and $\ell_{\max} = 2$ in the third and fourth rows. For $\ell_{\max} = 1$,
the histograms of the mocks’ $T^2$ values agree well with the
expected $T^2$ distribution, both for the compressed and the pure
mock-based covariance approaches. However, the histogram of the
mocks’ $\chi^2$ values from the direct approach with analytic covariance
deviates from the expected $\chi^2$ distribution, which indicates that the
analytic covariance is imperfect. Nonetheless, all three methods show
consistent detection significance. For $\ell_{\max} = 2$, all three methods
are still consistent. However, we notice that the changes in detection
significance of CMASS and LOWZ differ when going from
$\ell_{\max} = 1$ to $\ell_{\max} = 2$. We suspect this is due to the differences between
the two samples in redshift and number density, which we
leave for future exploration. Overall, this reduced analysis
serves as a robustness check of the pure analytic covariance matrix
approach.
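The channel counts quoted above can be reproduced by direct enumeration. The sketch below assumes the usual selection rules, which are not restated in this section: each channel is labelled by $(\ell_1, \ell_2, \ell_3)$ with every $\ell \le \ell_{\max}$, the three values satisfy the triangle inequality, and a channel is parity-odd when $\ell_1 + \ell_2 + \ell_3$ is odd. The function name is illustrative.

```python
from itertools import product


def parity_odd_channels(ell_max):
    """List (l1, l2, l3) with every l <= ell_max, |l1 - l2| <= l3 <= l1 + l2,
    and l1 + l2 + l3 odd (assumed selection rules; see the lead-in)."""
    channels = []
    for l1, l2, l3 in product(range(ell_max + 1), repeat=3):
        triangle = abs(l1 - l2) <= l3 <= l1 + l2
        odd = (l1 + l2 + l3) % 2 == 1
        if triangle and odd:
            channels.append((l1, l2, l3))
    return channels


for ell_max in (1, 2):
    print(ell_max, parity_odd_channels(ell_max))
```

Under these assumptions the enumeration returns one channel for $\ell_{\max} = 1$ and four for $\ell_{\max} = 2$, matching the counts in the text.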
Figure 12. Similar to Fig. 10 but for CMASS with 18 bins. Using finer binning we find even higher detection significance, both in our compressed and direct
analyses (cf. Fig. 10). Unlike the 10-bin CMASS analysis, which peaked at $N_{\rm eig} = 200$, here the detection significance increases monotonically with $N_{\rm eig}$. This
behaviour is likely due to how the covariance matrix structure and signal-to-noise (S/N) in each channel and bin combination balance with the penalty paid for
adding more degrees of freedom if they do not contain much additional S/N. Given that this balance can change depending on the radial binning, we would not
have expected the behaviour of the 10-bin and 18-bin compressed analyses with $N_{\rm eig}$ to be uniform.
5.2 Cross-correlation between NGC and SGC in each sample
A cosmological parity-odd correlation is expected to appear
isotropically when decomposed using our choice of basis function. This
would manifest itself as a correlation between the signals in the
SGC and NGC. Noise in the observed 4PCF, however, would not
be correlated between hemispheres or between samples, and would
reduce any underlying cross-correlation. We compute the Pearson
correlation coefficient $r_{\rm p}$ between the signals in the SGC and NGC,
with some modifications which we now describe. The naive Pearson
coefficient does not incorporate covariance between different points
within each data stream being cross-correlated, nor does it account
for varying errors from point to point. For instance, two neighbouring
triple-bins within a given channel in NGC are likely to be much more
correlated with each other than two very different triple-bins. Thus,
if it happens that one such triple-bin is highly correlated with its
analogue in SGC, then the neighbouring combination in NGC is
also likely correlated with the neighbouring combination in SGC.
This is not unphysical, but it should be taken as less strong evidence
for correlation than if two highly independent triple-bins in NGC
each showed a strong correlation with their analogues in SGC. The
Pearson coefficient would not distinguish between these cases. This
motivates us to rotate our measured data streams from NGC and from
SGC to a basis where each element of the data vector (within the NGC
or SGC) is independent; we thus work in the eigenbasis of the analytic
covariance matrix. This procedure will be affected by any issues with
the covariance matrix estimation, which are present for both the
analytic and the mock covariance.
Another issue with the naive Pearson correlation is that some channels
and triple-bins have larger statistical errors than others. They may
therefore more easily be outliers and drive a spurious Pearson
correlation. To correct this, once we are in the ‘independent’ basis
as above, we take the square root of the associated covariance matrix
eigenvalue as the error bar of each value and divide each value by its
error bar.
For the above procedure, if we had the true covariance, rotating
into the true eigenbasis would preserve any correlations perfectly.
However, as discussed in Section 4.2, the analytic covariance is likely
imperfect. We believe it to be close enough to the true covariance
to preserve correlations within a given sample (CMASS or LOWZ)
after rotation, but that trying to rotate into the different eigenbases
implied by the covariances of the two different samples and then
cross-comparing would not be robust. In particular, the different
number densities in CMASS and LOWZ (see Table 1) will
non-trivially impact the eigenvectors for each, likely randomizing any
underlying correlations. Given this issue (explored more fully in
Appendix C), we do not report the results of a CMASS cross LOWZ
analysis in this work.
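The decorrelation and normalization steps described above can be sketched in a few lines of Python. This is an illustration only: the 2×2 covariance, its eigenpairs, and the function name `decorrelate_and_scale` are hypothetical, and the eigen-decomposition is assumed to have been computed beforehand (for example with a linear-algebra library).

```python
import math


def decorrelate_and_scale(data, eigvecs, eigvals):
    """Project `data` onto each orthonormal eigenvector of the covariance and
    divide by the square root of the matching eigenvalue (the error bar)."""
    scaled = []
    for vec, lam in zip(eigvecs, eigvals):
        projection = sum(v * d for v, d in zip(vec, data))
        scaled.append(projection / math.sqrt(lam))
    return scaled


# Hypothetical covariance C = [[2, 1], [1, 2]]: eigenvalues 3 and 1 with
# eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
s = 1.0 / math.sqrt(2.0)
eigvecs = [(s, s), (s, -s)]
eigvals = [3.0, 1.0]

y = decorrelate_and_scale([1.0, 1.0], eigvecs, eigvals)
# In this basis the sum of squares equals d^T C^{-1} d, here close to 2/3.
print(sum(v * v for v in y))
```

Because every element of the rotated vector has unit variance under the assumed covariance, no single noisy channel can dominate a subsequent correlation coefficient.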
Having outlined how we render our measured 4PCF coefficients
suitable for a cross-correlation analysis, we now define the statistic
we use, the Pearson correlation coefficient $r_{\rm p}$. It is$^{12}$

$$r_{\rm p} = \frac{\sum_{i=1}^{N_{\rm d}} \left(d_{1,i} - \langle d_1 \rangle_{\rm all}\right)\left(d_{2,i} - \langle d_2 \rangle_{\rm all}\right)}{\sqrt{\sum_{i=1}^{N_{\rm d}} \left(d_{1,i} - \langle d_1 \rangle_{\rm all}\right)^2}\,\sqrt{\sum_{i=1}^{N_{\rm d}} \left(d_{2,i} - \langle d_2 \rangle_{\rm all}\right)^2}}, \qquad (21)$$

where $d_{j,i}$ is the $i$th element of the 4PCF data vector $\mathbf{d}_j$, with $j = 1$
for the NGC and $j = 2$ for the SGC. Both are in the decorrelated,
normalized basis for their respective hemisphere. $\langle \ldots \rangle_{\rm all}$ denotes the
average over the elements of the vector, and $N_{\rm d}$ is the number of
degrees of freedom.

$^{12}$ The Pearson correlation coefficient is best for quantifying bivariate
normally distributed data. In the scaled, decorrelated basis, the data vectors
indeed satisfy this assumption. We also quote Spearman’s rank correlation
coefficient, which identifies correlations between any two monotonically related data sets
by searching for correlations in the space of their ranks. It is more general
than Pearson but also more difficult to manipulate analytically (as we do with
the Pearson correlation coefficient in Appendix B). The results of both are
consistent.

Figure 13. Similar to Fig. 11 but for LOWZ with 18 bins. As is the case for CMASS, the detection significance increases when using the finer binning (cf.
Fig. 11). However, the increase is more moderate than the increase for CMASS. The detection significance rises monotonically with $N_{\rm eig}$, in agreement with the
trend for the 10-bin LOWZ analysis of Fig. 11.

Table 2. The detection significances using the analytic covariance for the
split sky (into NGC and SGC) and joint sky (NGC + SGC). The latter treats a
given channel and radial bin combination in N and S as two separate degrees
of freedom, i.e. we assume vanishing covariance between N and S. All of
the quoted detection significances are from comparing the $\chi^2$ of the data to
a Gaussian fit to the histogram of the mocks’ $\chi^2$; see lower rightmost panels
of Figs 10–13 (10 and 18 bin results) and of Fig. E1 (six bin results, CMASS
only).

        $\sigma_{\rm eff}$, detection      CMASS           LOWZ
Bins    significance          NGC     SGC     NGC     SGC
18      Split                 4.7     5.4     2.5     2.0
        Joint                     7.1             3.1
10      Split                 3.3     2.6     1.8     1.9
        Joint                     4.0             2.7
6       Split                 −0.7    1.2     –       –
        Joint                     0.4             –
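Equation (21) translates directly into code. The following is a minimal Python sketch; the two short input lists are placeholder numbers standing in for the decorrelated, normalized NGC and SGC data vectors, not measured values.

```python
import math


def pearson_r(d1, d2):
    """Pearson correlation coefficient of two equal-length sequences (equation 21)."""
    n = len(d1)
    mean1 = sum(d1) / n
    mean2 = sum(d2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(d1, d2))
    var1 = sum((a - mean1) ** 2 for a in d1)
    var2 = sum((b - mean2) ** 2 for b in d2)
    return cov / math.sqrt(var1 * var2)


# Placeholder inputs standing in for the scaled NGC and SGC data vectors.
ngc = [0.3, -1.2, 0.8, 1.5, -0.4]
sgc = [0.1, -0.9, 1.1, 1.0, -0.7]
print(pearson_r(ngc, sgc))
```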
The PDF for the Pearson correlation coefficient if there is no
correlation is (Kenney 1947; Hotelling 1953)

$$f(r_{\rm p}) = \frac{\left(1 - r_{\rm p}^2\right)^{(N_{\rm d} - 4)/2}}{B\left(1/2,\,(N_{\rm d} - 2)/2\right)}, \qquad (22)$$

where B is the beta function.
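To turn an observed coefficient into a p-value, one can integrate the null PDF of equation (22) over the tails. The sketch below does this with a simple midpoint rule; the value `n_d = 1000` and the observed coefficient are arbitrary illustrative inputs rather than numbers from this analysis, and the function names are chosen here.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def pearson_null_pdf(r, n_d):
    """Null PDF of the Pearson coefficient for n_d points (equation 22)."""
    log_pdf = 0.5 * (n_d - 4) * math.log(1.0 - r * r) - log_beta(0.5, 0.5 * (n_d - 2))
    return math.exp(log_pdf)


def two_sided_p_value(r_obs, n_d, n_steps=20_000):
    """P(|r| >= |r_obs|) under the null, via midpoint integration of the PDF."""
    r_abs = abs(r_obs)
    if r_abs >= 1.0:
        return 0.0
    h = (1.0 - r_abs) / n_steps
    tail = sum(pearson_null_pdf(r_abs + (k + 0.5) * h, n_d) for k in range(n_steps))
    return min(1.0, 2.0 * tail * h)


# n_d = 1000 is an arbitrary illustrative value, not one used in the paper.
print(two_sided_p_value(0.05, 1000))
```

Because the PDF is symmetric about zero, the two-sided p-value is twice the upper-tail integral.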
Fig. 15 shows the correlation analysis results for CMASS using
10 bins. The left-hand panel is a scatter plot of the (decorrelated
and scaled) data for CMASS NGC and SGC. The histograms
in the extended panels can be well fit by a Gaussian, showing
that this assumption of the Pearson coefficient is satisfied. We find no
statistically significant correlation at the 95 per cent confidence level.
We also report Spearman’s rank coefficient, $r_{\rm s}$, in the figure (see
footnote 12). The right-hand panel shows the PDF for $r_{\rm p}$ (black curve;
see equation 22) if the two data sets are both normally distributed.
As noted above, a correlation at some level might be expected if
the signal were of cosmological origin. Meanwhile, there are some
systematics for which such a correlation would not be expected, and
the same goes for some instrumental artefacts. On the other hand,
artefacts at the telescope could correlate NGC and SGC.
Fig. 16 is similar to Fig. 15 but for the LOWZ sample. Fig. 17
shows the correlation analysis results using 18 bins, where the data
here also follow a Gaussian distribution (although we do not show
them explicitly). In the right-hand panel, for LOWZ, the correlation
coefficient is a negative number at more than 95 per cent confidence.
However, we also find that the correlation coefficients are sensitive
to the covariance matrix. Changing the number density used in
the analytic covariance matrix by 10 per cent can flip the LOWZ
correlation coefficient to a positive number (but one consistent with
zero). Overall, we thus conclude that there is no evidence for a
positive correlation between the N and S for LOWZ or for CMASS
when using 18 bins. Our correlation study results for 10 and 18 bins
and LOWZ and CMASS are summarized in Table 4.
5.2.1 Linear toy model in the eigenbases to explore $\chi^2$ and $r_{\rm p}$
To gain an intuition for the high detection significance and the low
correlation between the NGC and SGC in LOWZ, we construct
a toy model. Henceforth, we use the Pearson correlation coefficients
and $\chi^2$ obtained from the 10-bin results. The purpose is not to constrain
real physics, but simply to gain an intuition for whether the values
of $\chi^2$ and $r_{\rm p}$ found in our data are plausibly consistent with each
other.
We assume the signal is given by a toy model that is linear in
the eigenbases of the analytic covariance matrices, parametrized by a
slope $m$ and an intercept $b$ as $s = mx + b$, where $x$ is the index
of each eigenvalue. We produce two sets of fake data. Each is
drawn randomly from a Gaussian distribution, the mean of which
is the linear model, and the width of which is the square root of the
eigenvalue of the analytic covariance matrix for either CMASS or
LOWZ as appropriate.

Figure 14. Reduced $\ell_{\max}$ analysis for 10 radial bins. We use three different methods of obtaining the covariance matrix but always with $\ell_{\max}$ restricted to 1 or
2. The left column uses the compression scheme. The central column uses the covariance matrix obtained directly from the mocks, which is possible for this
reduced $\ell_{\max}$ analysis. The third column uses the covariance obtained from adjusting the analytical covariance to fit, as well as possible, the purely mock-based
covariance. The solid curves show the expected distributions for $T^2$ or $\chi^2$ while the histograms show the distributions obtained from the mocks. The solid lines
indicate results from the data. There is overall good agreement in detection significance between the purely mock-based covariance (central column) and the
analytic covariance (right column).
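One realization of this toy model can be generated as in the Python sketch below. It is a schematic illustration under stated assumptions: the eigenvalue spectrum is invented (the real one comes from the analytic covariance), the seed and vector length are arbitrary, and only the zero-signal case is shown.

```python
import math
import random


def fake_data(m, b, eigvals, rng):
    """Draw one data vector: Gaussian about s = m*x + b with width sqrt(eigenvalue)."""
    return [rng.gauss(m * x + b, math.sqrt(lam)) for x, lam in enumerate(eigvals)]


def chi2(data, eigvals):
    """Null-hypothesis chi^2 in the eigenbasis, where the covariance is diagonal."""
    return sum(d * d / lam for d, lam in zip(data, eigvals))


def pearson_r(d1, d2):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(d1)
    m1, m2 = sum(d1) / n, sum(d2) / n
    num = sum((u - m1) * (v - m2) for u, v in zip(d1, d2))
    den = math.sqrt(sum((u - m1) ** 2 for u in d1) * sum((v - m2) ** 2 for v in d2))
    return num / den


rng = random.Random(0)
# Hypothetical eigenvalue spectrum; the real one comes from the analytic covariance.
eigvals = [1e-8 * (1 + x) for x in range(2000)]
m, b = 0.0, 0.0  # zero-signal case

ngc = fake_data(m, b, eigvals, rng)
sgc = fake_data(m, b, eigvals, rng)
# Scale each element by its error bar before correlating the two hemispheres.
ngc_scaled = [d / math.sqrt(lam) for d, lam in zip(ngc, eigvals)]
sgc_scaled = [d / math.sqrt(lam) for d, lam in zip(sgc, eigvals)]
print(chi2(ngc, eigvals), pearson_r(ngc_scaled, sgc_scaled))
```

Repeating the draw many times for a given slope and intercept yields the joint distribution of $\chi^2$ and $r_{\rm p}$ against which the observed values can be compared.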
We separately maximize the log-likelihood for the data vectors
$\mathbf{d}_{\rm cmass} = \{r_{\rm p, cmass}, \chi^2_{\rm cmass}\}$ and $\mathbf{d}_{\rm lowz} = \{r_{\rm p, lowz}, \chi^2_{\rm lowz}\}$, respectively.
If there is a true primordial signal, we might expect that the
parity-odd 4PCF in LOWZ has an overall amplitude 2× that in CMASS,
folding in the difference in growth rates and linear biases. Therefore
we search over different ranges in the slope and the intercept
for each: $m_{\rm cmass} \in [1 \times 10^{-8}, 2 \times 10^{-7}]$ and
$b_{\rm cmass} \in [1 \times 10^{-5}, 5 \times 10^{-4}]$ for CMASS, and
$m_{\rm lowz} \in [2 \times 10^{-8}, 4 \times 10^{-7}]$ and
$b_{\rm lowz} \in [2 \times 10^{-5}, 1 \times 10^{-3}]$ for LOWZ. Fig. 19 shows the
log-likelihood as a function of slope and intercept. We find
the best-fitting slope and intercept (the bluest region in the
log-likelihood): $m_{\rm cmass} = 5.3 \times 10^{-8}$ and $b_{\rm cmass} = 1.1 \times 10^{-4}$ for
CMASS; $m_{\rm lowz} = 1.5 \times 10^{-7}$ and $b_{\rm lowz} = 2.7 \times 10^{-4}$ for LOWZ.
In Fig. 20, we show a comparison of the correlation coefficients
and $\chi^2$ values obtained from the real data to the distribution of
those from the fake data. The fake data distributions in different
colours are generated with (i) a zero signal, (ii) a signal with
the best-fitting parameters, and (iii) a signal with both a steeper
slope and a higher intercept. This plot shows that it is possible to
explain the low correlation and noticeable detection significance
with a simple linear model.$^{13}$ This model’s purpose is not to
constrain any non-standard inflationary physics. Rather, we wish to
demonstrate that it is generically possible to have high $\chi^2$ and low
correlation coefficients, even with a simple functional form for the
model.

$^{13}$ The linear model can be regarded as a first-order Taylor expansion of some
true physical model. We imagine some true model in the space of the physical
4PCF amplitudes; then rotate this model into the eigenbasis of the covariance
and take a Taylor series for it there.

Figure 15. Left-hand panel: Scatter plot for the 4PCF with 10 bins measured from CMASS in the decorrelated basis (described in Section 5.2) and scaled by the
covariance matrix eigenvalues, with both Pearson ($r_{\rm p}$) and Spearman ($r_{\rm s}$) correlation coefficients given in the legend. The extended panes in the left-hand panel
display the histogram overplotted with a Gaussian fit. Right-hand panel: density function $f(r)$ (black curve; equation 22) for the Pearson correlation coefficient
and the Pearson and Spearman correlation coefficients obtained from PATCHY (orange and blue histograms). The 95 per cent confidence level (dashed vertical
lines) corresponds to an $r_{\rm p} = 0.037$. Our Pearson $r_{\rm p}$ of −0.014 corresponds to a p-value of 0.460, so there is no statistically significant correlation detected.

Figure 16. Same as Fig. 15 but for LOWZ, again with 10 bins. Again we detect no statistically significant correlation.
6 POTENTIAL SYSTEMATICS
Parity-breaking in large-scale structure is not expected in the standard
cosmological paradigm. Hence, it is important to assess whether
systematic errors could produce an apparent parity-breaking signal.
We must consider systematics both at the signal level and at the
covariance level. The latter must be considered because they
could cause an underestimation of the covariance, thus leading to
an overestimation of the detection significance.
We divide these systematics into three categories: survey-related
effects, observer-related effects, and procedure-related effects. We
discuss these in, respectively, Sections 6.1, 6.2, and 6.3. We have
already considered the possible impact of systematics on the
covariance matrix in Section 4.2.2.
Figure 17. PDF of $f(r_{\rm p})$ for 18 bins; the left-hand panel displays results using CMASS, the right-hand using LOWZ. The correlation coefficient for CMASS is
consistent with zero, but we see a negative correlation for LOWZ; this is further discussed in the main text.
Figure 18. Exploration of a linear model for the underlying signal (versus the eigenvalue index) for both CMASS and LOWZ (with 10 radial bins). Left-hand
panel: Toy models for CMASS with $m = 5.3 \times 10^{-8}$, $b = 1.1 \times 10^{-4}$ (dashed red) and LOWZ (solid purple) with $m = 1.5 \times 10^{-7}$, $b = 2.7 \times 10^{-4}$ (these
are our best-fitting values; see Fig. 19 for discussion of the fitting and likelihood). Middle panel: Toy model (black curve; ‘fake signal’) transformed back to
the observed basis plotted with the measured CMASS 4PCF for $\ell_1 = \ell_2 = \ell_3$; upper row is NGC and lower is SGC. Right-hand panel: Analogue of the middle
panel but for LOWZ.
Table 4. Correlation coefficients between NGC and SGC for CMASS and
LOWZ samples; see Section 5.2 and Figs 15–17. We do not see significant
cross-correlation between NGC and SGC in either binning scheme or sample;
the possible reason is further discussed in the main text, as is the reason we
cannot perform a CMASS × LOWZ analysis.

                        CMASS            LOWZ
Bins    Correlation     (NGC vs. SGC)    (NGC vs. SGC)
18      $r_{\rm p}$           0.002            −0.017
        p-value         0.752            0.017
10      $r_{\rm p}$           −0.014           0.000
        p-value         0.460            0.985
6.1 Survey-related effects
Spectroscopic surveys provide galaxies’ 3D positions, but estimating
the correlation functions requires the excess probability over random
of finding an N-tuplet of galaxies in a given configuration. This in
turn demands precise knowledge of the selection function.
6.1.1 A toy model for errors in the selection function
Although great care has been taken in BOSS to model the impact
of astrophysical foregrounds (e.g. stellar density) or observational
conditions on the survey selection function, we still seek an intuition
for the impact of an imperfectly modelled selection function on
the parity-odd modes. In particular, our goal is to see if an
even-parity 4PCF is converted into an odd-parity one by an improperly
corrected selection function. For this goal, it is sufficient to begin
with a spatially unclustered density field (which would produce only
an even-parity 4PCF).
The selection function encapsulates how our selection criteria (e.g.
colour and magnitude cuts) sample the underlying distribution of
galaxies in the Universe. The underlying density field can be inferred
from the observed number density $n(\mathbf{s})$ and the selection function
$f_{\rm obs}(\mathbf{s})$. Since we have no access to the underlying distribution, we
cannot disentangle how our selection criteria shape the sample we end
up observing from any underlying variation in the Universe (usually
specifically along the line of sight). In this discussion, we allow the
deviation of the selection function to be general and depend on 3D
position $\mathbf{s}$.

Figure 19. Log-likelihood for our two-parameter linear toy model $s = mx + b$ (‘fake signal’). We search for $m$ and $b$ so as to make the observed $\chi^2$ and $r_{\rm p}$
consistent with each other in each sample. The fitting is independent for CMASS and LOWZ. In the eigenbasis of the 4PCF analytic covariance for either
CMASS or LOWZ, we produce our linear model and then add Gaussian noise drawn using these eigenvalues to produce 200 realizations of our ‘fake signal’,
each with an associated $\chi^2$ and $r_{\rm p}$. This then lets us obtain the covariance matrix of $\chi^2$ and $r_{\rm p}$ in this toy model. We may then compute, for a given $m$ and $b$,
their likelihood given the actual observed $\chi^2$ and $r_{\rm p}$ from the data. The blue region is the highest likelihood. We display these results for both CMASS NGC
cross SGC and LOWZ NGC cross SGC. It is notable that the best-fitting parameters for LOWZ are roughly two times those for CMASS; this did not have to be
the case, as the fitting is independent for each sample, but is in fact the scaling we would naively expect were the parity-odd signal truly of primordial origin.
Our interpretation of this linear toy model is further discussed in Section 5.2.
Our estimate of the true number density of objects is

$$\hat{n}_{\rm true}(\mathbf{s}) = \frac{n(\mathbf{s})}{\hat{f}_{\rm obs}(\mathbf{s})}, \qquad (23)$$

where $n$ is the observed number density and $\hat{f}_{\rm obs}$ our estimate of the
selection function. The actual true number density is

$$n_{\rm true}(\mathbf{s}) = \frac{n(\mathbf{s})}{f_{\rm obs}(\mathbf{s})}, \qquad (24)$$

where $f_{\rm obs}$ is the true selection function.
Hence, we may relate our estimated true number density to the
actual true number density as

$$\hat{n}_{\rm true}(\mathbf{s}) = \frac{f_{\rm obs}(\mathbf{s})}{\hat{f}_{\rm obs}(\mathbf{s})}\, n_{\rm true}(\mathbf{s}) \equiv g(\mathbf{s})\, n_{\rm true}(\mathbf{s}). \qquad (25)$$

Our density fluctuation field is then

$$\delta(\mathbf{s}) = \frac{\hat{n}_{\rm true}(\mathbf{s}) - \bar{n}}{\bar{n}} = g(\mathbf{s}) - 1. \qquad (26)$$

We have assumed that we correctly estimated $\bar{n}$ and also that
$n_{\rm true}(\mathbf{s}) \equiv \bar{n}$, i.e. in this simple toy model, there is no underlying
clustering. We require that the integral of $g$ over the survey be
unity; we use this condition to normalize $g$ shortly.
We now make a toy model for $g$ to enable assessment of its possible
impact. We take it that it is factorizable as

$$g(\mathbf{s}) = g_x(x)\, g_y(y)\, g_z(z) \qquad (27)$$

and that along each direction it is a power law as

$$g_x = I^{-1/3} (x_0 + x)^{p_x}, \quad g_y = I^{-1/3} (y_0 + y)^{p_y}, \quad g_z = I^{-1/3} (z_0 + z)^{p_z}, \qquad (28)$$

where we have written $x$, $y$, and $z$ as displacements around a
central point $(x_0, y_0, z_0)$ and have defined the normalization $I$ as

$$I \equiv \frac{L_x^{p_x + 1}\, L_y^{p_y + 1}\, L_z^{p_z + 1}}{(p_x + 1)(p_y + 1)(p_z + 1)}. \qquad (29)$$

This normalization ensures that the integral of $g$ over the survey is
unity, and the $L_i$ are the lengths of the box sides in each direction.
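The normalization of equation (29) can be verified numerically. The sketch below assumes, for illustration, that each shifted coordinate (e.g. $x_0 + x$) runs from 0 to the box length along its axis; that range, the step count, and the function names are assumptions made here. The box size and exponents are those quoted for the numerical mocks in the text.

```python
def power_law_integral(length, power, n_steps=20_000):
    """Midpoint-rule estimate of the integral of u**power over 0 <= u <= length."""
    h = length / n_steps
    return sum(((k + 0.5) * h) ** power for k in range(n_steps)) * h


def normalization(lengths, powers):
    """Normalization I of equation (29)."""
    result = 1.0
    for length, power in zip(lengths, powers):
        result *= length ** (power + 1) / (power + 1)
    return result


lengths = (500.0, 500.0, 500.0)  # box side lengths in h^-1 Mpc
powers = (1, 2, 3)               # one of the two cases explored in the text

# g factorizes, so its volume integral is a product of three 1D integrals
# divided by I (each axis carries a factor I**(-1/3)).
integral_of_g = 1.0 / normalization(lengths, powers)
for length, power in zip(lengths, powers):
    integral_of_g *= power_law_integral(length, power)
print(integral_of_g)  # close to 1
```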
The primary galaxy for our computation of the spherical harmonic
coefficients $a_{\ell m}$ (see Section 2, as well as Philcox et al. 2022 for a
detailed discussion of how these coefficients enter our algorithm) is
at $\mathbf{s}_0$ and each secondary galaxy is displaced by $\mathbf{r}_i$ from the primary.
If we now assume that the true density field is uniform, its change
due to our modulation will be (for the primary)

$$\delta(\mathbf{s}_0) = g(\mathbf{s}_0) - 1. \qquad (30)$$

For the secondaries (in this case, the three field positions around
the primary, which, with it, build up that primary’s contribution to
a given 4PCF channel) we have, by taking a Taylor series for $g$ to
leading order:

$$\delta(\mathbf{s}_i) = g(\mathbf{s}_0) + \nabla g(\mathbf{s})|_{\mathbf{s}_0} \cdot \mathbf{r}_i - 1, \qquad (31)$$

with $\mathbf{r}_i \equiv \mathbf{s}_i - \mathbf{s}_0$. We now assess whether this systematic can
contribute to the parity-odd 4PCF. When we multiply out the three
secondaries, we can produce terms involving one, two, or three
factors of $\nabla g$. All of the parity-odd basis functions are proportional
to $\hat{\mathbf{r}}_1 \cdot (\hat{\mathbf{r}}_2 \times \hat{\mathbf{r}}_3)$, so only the term involving three factors of $\nabla g$ may
possibly contribute.
Figure 20. The Pearson correlation coefficients $r_{\rm p}$ (left-hand column) and $\chi^2$ values (right-hand column) observed in our data (black lines) shown over the
distribution of the same from the ‘fake signal’ generated by our ‘linear-in-eigenbasis’ toy model. We show results from $m = b = 0$ in blue, from the best-fitting
$m$ and $b$ in red (Fig. 18 displays these models and gives the best-fitting values in the caption, and the likelihood used is shown in Fig. 19), and from a larger
‘fake’ signal in purple to indicate the possible range of outcomes. The fake data at each index (see Fig. 18) are drawn from a Gaussian distribution with mean
given by the linear model and width given by the square root of the eigenvalue of the analytic covariance matrices. CMASS results are shown in the upper row
and LOWZ in the lower. We see that there exists a ‘fake signal’ that does explain the observed $r_{\rm p}$ and $\chi^2$ from the real data. While our ‘fake signal’ should not
be taken too literally, this does indicate that, generically, our observed $\chi^2$ and $r_{\rm p}$ are not necessarily in tension with each other.
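We now compute the projection of this term onto our basis. Denoting it $\zeta_g$, we have
$$\begin{aligned}
\zeta_g &= \int \mathrm{d}\hat{\mathbf{r}}_1 \, \mathrm{d}\hat{\mathbf{r}}_2 \, \mathrm{d}\hat{\mathbf{r}}_3 \, \left( \nabla g(\mathbf{s})|_{\mathbf{s}_0} \cdot \mathbf{r}_1 \right) \left( \nabla g(\mathbf{s})|_{\mathbf{s}_0} \cdot \mathbf{r}_2 \right) \left( \nabla g(\mathbf{s})|_{\mathbf{s}_0} \cdot \mathbf{r}_3 \right) \, \hat{\mathbf{r}}_1 \cdot (\hat{\mathbf{r}}_2 \times \hat{\mathbf{r}}_3) \\
&= r_1 r_2 r_3 \, \nabla g(\mathbf{s})|_{\mathbf{s}_0} \cdot \left( \nabla g(\mathbf{s})|_{\mathbf{s}_0} \times \nabla g(\mathbf{s})|_{\mathbf{s}_0} \right) = 0,
\end{aligned} \tag{32}$$
where we used a vector integral identity to obtain the second line.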
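The identity invoked for equation (32) is not spelled out in the text. A natural candidate is $\int \mathrm{d}\hat{\mathbf{r}} \, \hat{r}_i \hat{r}_j = (4\pi/3)\,\delta_{ij}$, under which each angular integral returns a vector parallel to $\nabla g$, and the scalar triple product of three parallel vectors vanishes. The following is a minimal numerical sketch of that reading. The quadrature rule, the function names, and the test gradient are illustrative assumptions, not part of the paper's pipeline, and the radial factors $r_1 r_2 r_3$ as well as any constant prefactor are left out since they do not affect the vanishing.

```python
import numpy as np

def sphere_quadrature(n_mu=8, n_phi=16):
    """Gauss-Legendre nodes in cos(theta) times a uniform grid in phi."""
    mu, w_mu = np.polynomial.legendre.leggauss(n_mu)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    w_phi = np.full(n_phi, 2.0 * np.pi / n_phi)
    mu_g, phi_g = np.meshgrid(mu, phi, indexing="ij")
    sin_t = np.sqrt(1.0 - mu_g**2)
    rhat = np.stack(
        [sin_t * np.cos(phi_g), sin_t * np.sin(phi_g), mu_g], axis=-1
    ).reshape(-1, 3)
    weights = np.outer(w_mu, w_phi).ravel()
    return rhat, weights

def angular_moment(grad_g):
    """V_i = integral over the unit sphere of (grad_g . rhat) rhat_i."""
    rhat, w = sphere_quadrature()
    return np.einsum("n,n,ni->i", w, rhat @ grad_g, rhat)

# Hypothetical gradient vector, chosen only for illustration.
grad_g = np.array([0.3, -1.2, 0.7])

# Each of the three angular integrals in equation (32) gives the same vector V,
# so the projection reduces to the scalar triple product V . (V x V).
V = angular_moment(grad_g)
zeta_g = np.dot(V, np.cross(V, V))
```

In this sketch `V` comes out parallel to `grad_g` (with proportionality constant $4\pi/3$), so `zeta_g` vanishes for any choice of gradient.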
Though the above shows that on average (in an infinite volume) a power-law modulation would produce no spurious parity-odd signal, such modulation could increase the fluctuations observed. If such a systematic were present in the data but not the mocks, then the covariance used (from mocks or calibrated by mocks) would be an underestimate, and this could produce an apparent detection. Hence, we explore this power-law toy model numerically to assess if such an effect might occur. We construct 100 mocks whose points have a spatially random distribution, each in a cubic box with side length $L_{\rm box} = 500\,h^{-1}$ Mpc and number density $\bar{n} = 3 \times 10^{-4}\,(h^{-1}\,\mathrm{Mpc})^{-3}$. The distribution of the mock number density along the $x$-, $y$-, and $z$-axes is shown in Fig. 21. We investigate two cases, $p_x = 1$, $p_y = 1$, $p_z = 1$ and $p_x = 1$, $p_y = 2$, $p_z = 3$. We also create a set of 100 mocks with the same box size and number density but with a uniform distribution along the three axes. For all cases, we use the same random catalogue with a uniform distribution at all spatial positions to convert the density field into a density fluctuation field (the only relevant part of edge correction for a periodic box).
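One way such modulated mocks could be generated is by inverse-CDF sampling along each axis. The sketch below is an assumption about the procedure, not the authors' code: it takes the shifted coordinate of equation (28) to run over $[0, L_{\rm box}]$, and the function name, seed, and defaults are hypothetical.

```python
import numpy as np

def power_law_mock(powers, box_size=500.0, nbar=3e-4, seed=0):
    """Draw points in a cubic box with density proportional to
    x**p_x * y**p_y * z**p_z, where each coordinate lies in [0, box_size]."""
    rng = np.random.default_rng(seed)
    n_points = int(round(nbar * box_size**3))
    u = rng.random((n_points, 3))
    p = np.asarray(powers, dtype=float)
    # Along one axis the CDF of a density ~ t**p on [0, L] is (t / L)**(p + 1),
    # so inverting it maps a uniform deviate u to t = L * u**(1 / (p + 1)).
    return box_size * u ** (1.0 / (p + 1.0))

mock_xyz = power_law_mock((1, 1, 1))      # p_x = p_y = p_z = 1
mock_x1y2z3 = power_law_mock((1, 2, 3))   # p_x = 1, p_y = 2, p_z = 3
mock_uniform = power_law_mock((0, 0, 0))  # unmodulated comparison sample
```

Setting all powers to zero recovers a uniform random sample, which gives the unmodulated comparison set with the same box size and number density.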
Fig. 22 shows the 4PCF coefficients measured from the no-clustering mocks. From the left- to the right-hand panel, we plot the channels $\{\ell_1 = 1, \ell_2 = 1, \ell_3 = 1\}$, $\{\ell_1 = 2, \ell_2 = 3, \ell_3 = 4\}$, and $\{\ell_1 = 4, \ell_2 = 4, \ell_3 = 3\}$. In each panel, we compare a parity-odd 4PCF coefficient for an unmodulated sample and two differently modulated samples. As expected, the 4PCFs measured from the uniform sample have very small amplitudes. The $\chi^2$ for the uniform sample is comparable to those of the modulated samples. Overall, this means that the power-law modulation would not produce a spurious detection of parity-odd modes at the signal level.
6.1.2 Density distortions, selection function, and fiber collisions
In theory, one can expect parity-odd signals to be induced solely by
3D modulations of the density field because such modulation could
Figure 21. Normalized number count distribution along the $x$-, $y$-, and $z$-axes of mocks with a spatially random underlying distribution but modulated by a power law in each direction (Section 6.1.1, equation 28), as indicated in the legend; the function $f$ shows the modulation along each axis, and the variable after the colon indicates which axis is plotted. We show two cases along each of the three axes: powers $p_x = p_y = p_z = 1$, and then $p_x = 1$, $p_y = 2$, $p_z = 3$.
convert a parity-even mode into a parity-odd mode. A 2D screen uncorrelated with the underlying density field can be separately averaged over rotations, and the rotation will force any parity-odd piece of the screen to vanish. An example of an uncorrelated screen is that given by the veto mask, which was designed to remove angular regions contaminated by foreground bright stars, plate holes, etc., and is not correlated with the underlying 3D galaxy density.
Another 2D effect is fibre collisions. BOSS uses fibres to guide the light of the observed objects from the focal plane to the spectrograph. Each fibre has a limited angular diameter (62 arcsec), and multiple galaxies that fall within this radius cannot be resolved within one exposure. In the data, fibre collisions are corrected by upweighting the nearest angular neighbour (Ross et al. 2012; Reid et al. 2016). This
approach can potentially introduce long wavelength mode-coupling
because it identifies neighbours only on an angular basis. Although
one can model the fibre-collision effect as a top-hat function (Hahn
et al. 2017 ), the exact physical scale could still be redshift-dependent
(see Hou et al. 2021b for quasars, which span a wide redshift range).
To demonstrate the impact of these angular effects on the parity-odd modes, we turn off the veto mask weights implemented in the PATCHY mocks as well as the fibre-collision weights. By switching the fibre-collision weight on and off, we can see the impact of the nearest neighbour upweighting on the parity-odd detection significance. The left-hand panel of Fig. 24 shows the distribution of the resulting $T^2$ values compared to that expected. When the covariance is inferred from the contaminated mocks themselves, there is good agreement between the contaminated mocks’ $T^2$ distribution and that for the standard case. We therefore conclude that these effects do not induce parity-odd modes at the signal level.
We also explored the effect of the fiber collision weights on the
signal as found in the real CMASS data. We set these weights to be
identically unity on the data, thus undoing the upweighting of the
nearest angular neighbour of a galaxy lost to fiber collision. We also
make this same change to the fiber collision weights on the mocks,
and then remeasure their 4PCF and refit our analytic covariance
matrix template using it.
We found that with this set-up, the 10-bin parity-odd detection significance increased by 6σ. We also computed the parity-even connected 4PCF, to assess if an unaccounted-for error of this type in fibre-collision weights on the data would create tension between data and mocks. Indeed, we found that with all fibre-collision weights on the data set to unity (as if one had made a gross error in the data weights), but with standard weights on the mocks (and also in fitting the covariance template), there would be a 5σ disagreement in the parity-even sector between mocks and data. This argues that any serious error in the fibre-collision weights that impacts the parity-odd sector can be controlled by enforcing consistency in the parity-even sector.
We now briefly discuss what we regard as the most likely reason the parity-odd significance in the data increased in the test above. We know from the tests on the mocks alone that an error in the fibre-collision weights cannot induce a true parity-odd signal where there was initially none (left-hand panel of Fig. 24). Thus, the increased detection significance in the data is presumed to result from the change in the fibre-collision weights increasing the variance in the data more than it does in the mocks used to estimate the covariance.
We suspect this disproportionate increase is due to the following. In the real survey, observational tiles are allowed to overlap with one another to maximize the fraction of targets that can be assigned fibres. Thus, uncorrected fibre collisions likely have a lower impact on the data than on the mocks. In particular, even with the fibre-collision weights set to unity, there is likely better sampling of the densest regions on the sky in the data than in the mocks. This would tend to increase the data’s variance relative to that of the mocks in the situation where neither is corrected for fibre collisions.
In addition to the angular effects, we also distort the radial selection function by introducing a 10 per cent scatter in the $n(z)$ of the randoms. At each redshift in Fig. 23, we draw a value from a Gaussian with mean 10 per cent and width 5 per cent, and then decide randomly and with equal probability whether to give it a plus or a minus sign.
Figure 22. Several parity-odd 4PCF coefficients as measured from our spatially random power-law-modulated mocks (Section 6.1.1). We show results from 100 mocks with number density $\bar{n} = 3 \times 10^{-4}\,(h^{-1}\,\mathrm{Mpc})^{-3}$ for no modulation (tan), $x^1 y^2 z^3$ (red), and $xyz$ (grey). We find that the $\chi^2$ for the power-law-modulated mocks is smaller than that for the unmodulated ones, implying that this modulation does not produce a significant parity-odd 4PCF at the signal level.
Figure 23. PDF of the radial selection function for PATCHY NGC. The selection function applied to the PATCHY mocks is designed to match that from the BOSS data for both the mock data catalogue (black histogram) and the random catalogue (grey histogram). To test whether an imperfect inference of the radial selection function can lead to a spurious parity-odd signal, we deliberately distorted the $n(z)$ for the random catalogue (red histogram) by shifting it by 10 per cent on average, where the percentage shift is drawn from a Gaussian with that mean and a width of 5 per cent. We show the distribution of $T^2$ values for PATCHY mocks analysed using these ‘shuffled’ randoms in Fig. 24; it is consistent with no detection of a parity-odd 4PCF.
We then use this value to set a new probability of a random particle at that redshift being selected. This procedure results in the red curve in Fig. 23 for the shuffled randoms’ $n(z)$. The FKP weights for the randoms are also recomputed after this.
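The distortion procedure described above can be sketched as follows. This is a toy illustration under stated assumptions: the mapping from the drawn percentage shift to a selection probability (here, dividing by the largest factor so that probabilities stay at or below one), the redshift binning, and the function name `distort_selection` are ours, and the recomputation of FKP weights is not shown.

```python
import numpy as np

def distort_selection(z_random, z_edges, mean_shift=0.10, width=0.05, seed=0):
    """Subsample randoms so that n(z) is shifted per redshift bin by a
    Gaussian-distributed fraction with a random sign."""
    rng = np.random.default_rng(seed)
    n_bins = len(z_edges) - 1
    # Fractional shift per bin: Gaussian magnitude, then a fair coin for the sign.
    amplitude = rng.normal(mean_shift, width, size=n_bins)
    sign = rng.choice([-1.0, 1.0], size=n_bins)
    factor = 1.0 + sign * amplitude
    # Assumed mapping to a selection probability (keeps probabilities <= 1).
    keep_prob = factor / factor.max()
    bin_index = np.clip(np.digitize(z_random, z_edges) - 1, 0, n_bins - 1)
    keep = rng.random(z_random.size) < keep_prob[bin_index]
    return keep, factor

# Toy random catalogue, uniform in redshift over an illustrative range.
toy_rng = np.random.default_rng(1)
z_random = toy_rng.uniform(0.43, 0.70, size=200_000)
z_edges = np.linspace(0.43, 0.70, 28)
keep, factor = distort_selection(z_random, z_edges)
z_shuffled = z_random[keep]
```

A histogram of `z_shuffled` over `z_edges` would then play the role of the distorted (red) random-catalogue curve in this toy setting.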
The distribution of $T^2$ values for the PATCHY mocks with shuffled $n(z)$, when we use the compression scheme with the covariance inferred from the contaminated mocks, agrees well with that obtained using the standard weights (black histogram) and also with the predicted $T^2$ distribution (black curve), as shown in the left-hand panel of Fig. 24. However, the radially distorted randoms lead to an increase in the covariance, which is interestingly in contrast to mocks with angular contamination. As a result, the mock distribution shifts towards higher $\chi^2$ when we use an analytic covariance matrix calibrated with respect to the standard mocks (right-hand panel of Fig. 24). In other words, had this systematic been present in our BOSS data, the covariance matrix inferred from the standard PATCHY mocks would have been underestimated. In Table 5, we estimate how much this effect could have affected our detection significance for the 10- and 18-bin tests, respectively.
6.1.3 Magnification bias
We consider magnification bias qualitatively. If the underlying
density field produces non-zero amplitudes only in even-parity 4PCF
channels, it is sufficient to ask whether magnification bias will
preferentially affect tetrahedrons of one handedness. Much like
the method of image charges, we may consider a notional pair: a
tetrahedron and its mirror image. As long as we can prove that for any such pair, there is no effect, no parity-breaking will be introduced. Consider for simplicity a tetrahedron for which the three galaxies defining a triangular base are at the same redshift, and the fourth point is at a higher redshift. If the three ‘base’ points are closely enough concentrated, they may magnify the fourth galaxy behind them and on average make it more likely to be selected than otherwise. However, this will be the case also for the mirror image of this tetrahedron. Indeed, magnification will be invariant to the sign-flip of the two plane-of-sky coordinates; at most, it can be parity-odd in $z$, but any such contribution from the $z$ behaviour alone will vanish after rotation-averaging.
6.1.4 Splitting the sky into angular patches
To assess if our detection is coming from one part of the sky preferentially, which might indicate a systematic, we split the sky into eight angular patches, each of roughly $40 \times 35$ deg$^2$ (right ascension, RA, times declination, DEC), and run our analysis on each independently. Each angular patch implies a subvolume (hence SV) as we extend it out in the redshift direction. The right-hand panel of Fig. 25 shows some examples of the 4PCF measurement and compares them to the standard deviation of each subvolume. We estimate the covariance of each subvolume by splitting the PATCHY mocks in the same way. Since the subvolumes receive modulations from Fourier modes larger than them, ‘beat-coupling’ leads to additional corrections to the covariance matrix (Putter et al. 2012; Li, Hu & Takada 2014). Thus, a naive rescaling of the covariance by its volume relative to the total survey volume would likely give an underestimate and thence produce artificially high $T^2$ values.
Fig. 26 shows the $T^2$ computed from the entire BOSS CMASS sky (thick dashed vertical black line) and the eight subvolumes (thin coloured vertical lines). The expected $T^2$ distribution is the black curve. We combine the eight subvolumes by taking an inverse-variance weighted sum of the $T^2$ values. Using the diagonals of the covariance matrix (see the right-hand panel of Fig. 26) we find the combined $T^2 = 128$, which differs from the value for the entire BOSS CMASS sky, $T^2 = 138$. We repeat this using the full covariance of the subvolumes and find $T^2 = 129$, quite similar to the result using just the diagonal. We see that subvolumes 2 and 3 have the highest detection significance compared to the average; given the error bar of the $\chi^2$ shown in the right-hand panel of Fig. 26, this corresponds to a 1.8σ difference. We also note that combining the signal at the estimator level is not a simple linear operation due to the edge correction (see also equation 8). Moreover, since we use the same analytic covariance for different subvolumes, the amount of information picked out by the ranked eigenvalues could differ for each subvolume. We, therefore,
Figure 24. Histograms of statistics for PATCHY mocks’ 4PCFs analysed with different weighting schemes. Left-hand panel: $T^2$ values calculated with 50 eigenvalues using our data compression scheme (Section 5.1.2) with the covariance matrix inferred from the systematics-integrated mocks themselves. The black curve is the expected $T^2$ distribution. The histogram for the PATCHY mocks with the standard weighting scheme is in black, with the close pair weight turned off to assess the impact of fibre collisions in blue, with the veto mask turned off to assess the impact of the angular mask in orange, and finally, in red, the results of shuffling the randoms’ redshift distribution by 10 per cent as described in Fig. 23. None of these alterations to the 4PCF analysis produces a distribution of $T^2$ values that is inconsistent with that expected under the null hypothesis of no parity-odd 4PCF (black curve). Right-hand panel: $\chi^2$ values calculated using the analytic covariance calibrated to mocks analysed with the standard weights. Here we can see that the $\chi^2$ distribution has a higher mean for the shuffled mocks than for the standard mocks, suggesting that although such a systematic does not induce a parity-odd signal, it might lead to underestimation of the covariance.
Figure 25. Left-hand panel: Footprint of CMASS NGC showing how we split the data into eight patches, each roughly 40° in right ascension and 35° in declination. ‘SV’ stands for subvolume, as each patch extends into the redshift direction. Right-hand panel: For six parity-odd channels, we display the difference between the 4PCF coefficients of the eight subvolumes created by extending these angular patches in the redshift direction, and those from the full NGC sky. We divide these differences by the standard deviation of the PATCHY results. The colours for the eight subvolumes match in the left- and right-hand panels. Overall, we see that no one angular region stands out as producing a great deal more parity-odd signal than another. We offer a more rigorous demonstration of this claim in Fig. 26, which shows the $\chi^2$ for each subvolume.
Table 5. Here, we summarize the shifts in detection significance, in units of σ, due to various systematics for both the parity-odd (top two rows) and parity-even (bottom two rows) modes, including redshift failure (Section 6.1.5), the impact of the fiducial cosmology (Section 6.3.1), and a distorted selection function (Section 6.1.2). We used the N + S of CMASS and the analytic covariance for these calculations.

Parity  Number of bins  Redshift failure  Fiducial cosmology  Distortion in selection function
Odd     10              1.5               ∼1.5                ∼1.0
Odd     18              2.5               ∼2.5                ∼1.7
Even    10              0.9               ∼0.7                ∼0.6
Even    18              1.8               ∼1.9                ∼2.4
Figure 26. Left-hand panel: Here, we show the $T^2$ values from our angular patch test as described in Fig. 25. The $T^2$ value for the entire CMASS NGC is the thick black line, the eight subvolumes’ $T^2$ values are in thin, coloured lines, and the combined $T^2$ from the eight subvolumes is in dashed black. The predicted $T^2$ distribution (under the null hypothesis of no parity-odd signal) with 100 degrees of freedom is the black curve; we have used the compressed analysis described in Section 5.1.2. Right-hand panel: Covariance for the $T^2$ values of the eight subvolumes, estimated using the PATCHY mocks. This covariance is needed to properly co-add the $T^2$ values for the eight subvolumes to report the combined $T^2$ given in the left-hand panel (dashed line there). The slight non-vanishing off-diagonal indicates correlations between the eight subvolumes, but this turns out not to affect substantially the combined $T^2$ value. We may observe that the combined $T^2$ value differs from that for the whole NGC by about 10 per cent; we believe this is due to missing the contribution produced by quartets that are split across more than one subvolume, which go uncounted in our subvolume analysis. We would expect that adding them in (as the whole NGC analysis does) would indeed raise the $\chi^2$.
expect that there is a difference between analysing the data on the
whole sky or splitting it and then combining it.
We also repeat this same test on the SGC and on the LOWZ sample.
Overall, we do not find any subvolume that has a particularly high
or low detection significance compared to the average.
6.1.5 Redshift failure
The BOSS fibres are not completely randomly positioned in an
observing tile. The resolving power of the fibres is impacted by
their positions on the CCD cameras. Those fibres near the edge of
the spectrograph slit are less likely to obtain a high-quality spectrum
than are more central fibres (see section 4.2.2 of Smee et al. 2013 ).
Consequently, redshift failure occurs in a way that depends on 2D position on the telescope plate. This dependence is not wholly symmetric under a parity transformation of the $x$- and $y$-axes on the plate. Now, when a redshift failure occurs, the nearest angular neighbour of the failed galaxy is upweighted. The neighbour’s redshift should be a fair draw from the redshift distribution of the survey. However, redshift failure is statistically slightly more likely for a galaxy at higher redshift, as it will on average be fainter, and hence more affected by the reduction in spectrum quality from being at the edge of a CCD. Thus, replacing it by upweighting a galaxy fairly drawn from the survey’s redshift distribution can actually produce a bias. Effectively, we will systematically be pulling galaxies towards lower redshift as a function of 2D position on the plate. This 2D function, as already noted, is not fully symmetric under $(x, y) \to (-x, -y)$ and hence, in combination with the coupling to $z$ produced by the above, can in principle imprint a parity-odd signal on the density.
We now seek to explore this issue. In CMASS, 1.8 per cent of objects undergo redshift failure, while in LOWZ only 0.3 per cent do (Ross et al. 2012). Thus, we mainly focus on CMASS. Since the redshift failure galaxies are already removed from the catalogue, we explore this effect by creating more redshift failures and seeing whether doing so appreciably alters our parity-odd signal.
We first infer the redshift efficiency from the inverse of the redshift failure weight; we then fit it and assign each fibre a redshift efficiency rate accordingly. We apply a magnitude cut with mag$_{\rm cut}$ = 19.20, and from the galaxies passing it we select a subsample, weighted by the fibre efficiency rate, that amounts to 1.8 per cent of the total galaxies. We then identify the neighbours of these faint, bad-fibre galaxies and select those that are closest in angular distance. Finally, we increase those neighbours’ redshift failure weights by one and renormalize the random catalogue’s weights such that their total matches the total weight of the new data catalogue. We repeat the above steps for both NGC and SGC.
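The steps above can be sketched on a toy catalogue as follows. This is an illustration under several assumptions that the text does not specify: that the magnitude cut selects galaxies fainter than 19.20, that the selection probability scales with one minus the fibre efficiency, and that a brute-force dot-product search is an acceptable stand-in for the nearest-angular-neighbour search. All function and variable names are hypothetical.

```python
import numpy as np

def inject_redshift_failures(ra_deg, dec_deg, mag, fibre_eff, w_zf,
                             frac=0.018, mag_cut=19.20, seed=0):
    """Flag extra redshift failures and upweight their nearest angular neighbours."""
    rng = np.random.default_rng(seed)
    n_gal = ra_deg.size
    n_fail = int(round(frac * n_gal))
    # Candidates: galaxies fainter than the cut (assumed meaning of the cut).
    faint = np.flatnonzero(mag > mag_cut)
    # Assumed weighting: less efficient fibres are more likely to fail.
    prob = 1.0 - fibre_eff[faint]
    prob = prob / prob.sum()
    failed = rng.choice(faint, size=n_fail, replace=False, p=prob)
    keep = np.ones(n_gal, dtype=bool)
    keep[failed] = False
    # Unit vectors on the sky; the largest dot product marks the smallest angle.
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    xyz = np.column_stack(
        [np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)]
    )
    survivors = np.flatnonzero(keep)
    nearest = survivors[np.argmax(xyz[failed] @ xyz[survivors].T, axis=1)]
    w_new = w_zf.astype(float).copy()
    np.add.at(w_new, nearest, 1.0)  # raise each neighbour's weight by one
    return keep, w_new

def renormalize_randoms(w_random, w_data_total):
    """Rescale random weights so that their sum matches the data total."""
    return w_random * (w_data_total / w_random.sum())

# Toy catalogue with made-up positions, magnitudes, and fibre efficiencies.
toy_rng = np.random.default_rng(42)
n_toy = 2000
ra = toy_rng.uniform(150.0, 160.0, n_toy)
dec = toy_rng.uniform(0.0, 10.0, n_toy)
mag = toy_rng.uniform(18.5, 19.9, n_toy)
eff = toy_rng.uniform(0.90, 0.99, n_toy)
keep, w_new = inject_redshift_failures(ra, dec, mag, eff, np.ones(n_toy))
w_rand = renormalize_randoms(np.ones(50_000), w_new[keep].sum())
```

Because each removed galaxy transfers one unit of weight to a surviving neighbour, the total weight of the toy data catalogue is unchanged when the input weights are all unity.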
Here, we do not transfer the total systematic weights, as they cannot be simply propagated to the nearest neighbours. Assuming the total systematic weights to be unity for those galaxies has an impact on the total weights of only 0.1 per cent, so we do not expect an issue to arise due to this. We also retain the current estimation of the sector completeness, as this cannot be easily redone at the catalogue level.
We found that doubling the redshift failures increased the total parity-odd detection on the joint sky by 2.5σ when using 18 radial bins and 1.5σ when using 10 radial bins, both with the analytic covariance; these results are summarized in Table 5. This increase could be due to the 3D modulation of the density field producing a parity-odd mode as discussed above, but it could also simply be due to increased variance in the data (i.e. that any parity-odd signal so produced would average to zero in the infinite limit). This increased variance could stem from our inability to correct the sector completeness to account for the removal of more objects.
Given that the angular separation between galaxies and their nearest neighbours is typically much smaller than the sky coverage of a sector, and that most of the upweighted galaxies and the removed galaxies are still within the same sector, we might expect that the impact of uncorrected sector completeness in our test set-up is small. However, we need to quantify its impact. In order to do so, we applied a uniform selection in redshift and selected
Figure 27. Parity-odd 4PCF channels $\{\ell_1 = 1, \ell_2 = 1, \ell_3 = 1\}$, $\{\ell_1 = 2, \ell_2 = 3, \ell_3 = 4\}$, and $\{\ell_1 = 4, \ell_2 = 4, \ell_3 = 3\}$. We compare the measured 4PCF coefficients from 99 PATCHY mocks in real space (grey) and in redshift space (red) with uniform geometry. The error bar is inferred from the standard deviation of the 99 mocks.
1.8 per cent of those galaxies that are close to bad fibres. We proceed by upweighting their neighbours. There we observed 2.2σ and 0.5σ increases in detection significance for the joint sky using 18 bins and 10 bins, respectively. Hence, the increase in the detection significance is actually dominated by the imperfect compensation between the random catalogue and the real data produced by the uncorrected sector completeness in our test set-up, which leads to a spurious variation in the density field. This is qualitatively similar to what we find with the power-law toy model in Section 6.1.1. Furthermore, redshift failures also impact the parity-even modes; this impact (for our ‘doubling’ test) is given in Table 5. The good agreement between this mode in the true BOSS data and in the distribution of the mocks (Philcox et al. 2021) demonstrates that redshift failure at the level it is actually occurring in BOSS should not have a strong impact on the detection significance.
6.2 Observer-related effects
6.2.1 Redshift-space distortions
Galaxies’ radial distances are inferred from their redshifts; however, their peculiar velocities caused by the local gravitational environment add a component in addition to their cosmological redshifts. This distortion in galaxies’ positions is known as RSDs.
In order to understand the impact of RSD on parity-odd modes, we first consider the ‘global’ line of sight approximation and take $\hat{z}$ as the line of sight. RSD modulates the density by the usual Kaiser (1987) factor, which is even-parity with respect to the line of sight. The primary galaxy itself is at the origin and so does not carry any angular momentum; multiplying it with three density fields, each of them parity-even, will give a parity-even result. Given that rotation averaging does not alter the parity, RSD considered in the ‘global’ line of sight approximation cannot produce parity-odd modes. We can easily extend this logic to a ‘local’ line of sight by taking the line of sight to a quartet to be that to the primary. The modulation in the overdensity field due to RSD will be proportional to $(\hat{n} \cdot \hat{k})^2$, but again results only in parity-even combinations. Given that rotation averaging does not alter the parity, RSD considered in the ‘local’ line of sight approximation likewise cannot produce parity-odd modes.$^{14}$ We also note that Jeong & Schmidt (2019) have considered a parity-odd bispectrum induced by galaxies’ radial velocities by introducing explicit coupling to the line of sight degree of freedom. In doing so, the total rotationally invariant system includes the additional line of sight vector, which by itself has no constraint on parity as long as the combined system has total zero angular momentum. In our work, instead, the line of sight dependencies are already averaged over and we are not sensitive to the parity-odd mode due to RSD as in Jeong & Schmidt (2019).$^{15}$
$^{14}$ Beutler, Castorina & Zhang (2019) find parity-odd modes in the power spectrum multipoles (i.e. odd $\ell$; with respect to the line of sight) but that is an artificial consequence of the way that the Yamamoto estimator, which selects one of the two galaxies to define the line of sight, is implemented in Fourier space. Even in configuration space, the Yamamoto estimator does not produce parity-odd modes with respect to the line of sight, as shown in Slepian & Eisenstein (2015a).
$^{15}$ We have calculated that each term in their galaxy bias model that sources an odd-parity bispectrum will not lead to an odd-parity 4PCF after averaging with respect to the line of sight.
To investigate the impact of RSD, we also use the PATCHY mocks in real and redshift space. For this test, the PATCHY mocks used are in a cubic box of side length $L_{\rm box} = 2500\,h^{-1}$ Mpc and with number density $\bar{n} = 6.6\,(h^{-1}\,\mathrm{Mpc})^{-3}$ at redshift $z = 0.57$. The galaxies are shifted using the exact 3D velocities of each galaxy in the mock. A comparison of the 4PCF in real space and in redshift space is shown in Fig. 27. We do not observe an overall enhancement in the amplitude for the 4PCF.
6.2.2 Tidal alignment
The large-scale tidal field impacts galaxies’ orientations, which might lead to an anisotropic galaxy selection. For instance, in the case of spirals, an imaging survey might be more likely to detect a face-on rather than edge-on disc. Even LRGs are stretched ellipsoids, with aspect ratios of order 2:1, so orientation could impact the selection, though the effect would likely be smaller. For LOWZ, the sample is quite pure (LRGs), but for CMASS, the sample is roughly 26 per cent discy blue galaxies. We might thus expect any anisotropic selection would manifest more strongly in CMASS.
However, given that the observer’s orientation is random with respect to the tidal field (and thus the orientation of the galaxies being selected), we suspect that such an effect could not produce parity-odd modes. If it did, we might expect a stronger detection of them in CMASS for the reason noted above.
However, for completeness, we here work out what a tidal model might do if the selection depended upon it. Following Hirata (2009), the observed galaxy density fluctuation can be expressed as
$$\delta_g^{\rm obs} + 1 = \left( \delta_g^{\rm true} + 1 \right) \left( 1 + \epsilon(\hat{n} | \mathbf{x}) \right), \tag{33}$$
Figure 28. Left-hand panel: Edge-correction matrix from PATCHY mocks with CMASS NGC geometry at a given radial bin $(\bar{r}_1, \bar{r}_2, \bar{r}_3) = (27, 55, 97)$ [$h^{-1}$ Mpc]. Each $\Lambda = \{\ell_1, \ell_2, \ell_3\}$ denotes a specific angular channel for both even and odd parity. The non-vanishing off-diagonal features indicate that the survey geometry couples even and odd modes. Right-hand panel: The odd coefficients of the randoms normalized by $R_{000}$. We have mapped the odd angular momentum triplets given by $\Lambda$ to a 1D index (vertical axis), and show the radial bin also so mapped on the horizontal axis. Here, we see the dimensionless parity-even-to-parity-odd mixing is of order $10^{-6}$, showing that the dominant change to the odd-parity modes from edge-correction is due to $R_{000}$, i.e. overall renormalization from dividing by the mean number density.
where $\epsilon$ is the selection function at position $\mathbf{x}$, and it depends on the viewing direction $\hat{n}$. The average of the selection function over the unit sphere is zero. By treating the large-scale tidal field as a perturbation, the lowest order Fourier-space contribution (after Taylor-expanding) is (Hirata 2009, equation 30)
$$\epsilon(\hat{n} | \mathbf{k}) = A \left[ (\hat{n} \cdot \hat{k})^2 - \frac{1}{3} \right] \delta_{\rm m}(\mathbf{k}), \tag{34}$$
where $A$ is an amplitude and $\delta_{\rm m}$ is the matter overdensity. We see that the selection function is parity-even, and so multiplying it onto a single density does not alter the parity of any spherical harmonic expansion coefficient associated with that density field. Hence, if there is on average signal in only parity-even channels in the underlying galaxy 4PCF, modulating each density point by $\epsilon(\hat{n} | \mathbf{k})$ above will not convert them into a signal in parity-odd channels.
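Both properties used above can be read off the angular factor in equation (34). The short check below is a sketch that takes that factor at face value and writes $\mu \equiv \hat{n} \cdot \hat{k}$ (a shorthand introduced here); $P_2$ denotes the second Legendre polynomial.

```latex
\frac{1}{4\pi} \int \mathrm{d}\hat{n} \left[ (\hat{n} \cdot \hat{k})^2 - \frac{1}{3} \right]
  = \frac{1}{2} \int_{-1}^{1} \mathrm{d}\mu \left( \mu^2 - \frac{1}{3} \right)
  = \frac{1}{3} - \frac{1}{3} = 0 ,
\qquad
\mu^2 - \frac{1}{3} = \frac{2}{3} P_2(\mu) .
```

The first relation is the vanishing average over the unit sphere; the second shows that the angular dependence is a pure quadrupole, which is unchanged under $\hat{n} \to -\hat{n}$ or $\hat{k} \to -\hat{k}$.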
6.3 Procedure-related effects
6.3.1 Fiducial cosmology
In order to convert the sky coordinates to Cartesian coordinates, we
need to assume a fiducial cosmology. Using a fiducial cosmology
that differs from the true cosmology may lead to distortions in the
clustering.
At the signal level, if any galaxy quadruplets have their parity
flipped by such an error, this is equally likely to happen for a given
tetrahedron and for its mirror image. Therefore, even if a
‘parity-flip’ occurs, it will on average be cancelled out after
averaging over all galaxies in the survey. The radial bin separation
$\Delta r$ we choose is $14\,h^{-1}\,{\rm Mpc}$ when we use 10 radial bins and
$8\,h^{-1}\,{\rm Mpc}$ when we use 18 radial bins. Beginning with the cosmology
of Planck Collaboration VI (2020) and moving $\pm 3\sigma$, the change in the
angular diameter distance is less than 1 per cent, which will not be
resolved by our radial bin choice. At the covariance level, a different
cosmology could lead to a different estimate of the best-fitting number
density and volume (for the analytic covariance) and would also change
the power spectrum shape. Even though we use the same cosmology
to convert the galaxies in the real data and the PATCHY mocks,
the statistical fluctuations do not necessarily scale the same way
when the assumed fiducial cosmology differs from the true one.
We see a 6 per cent change in the volume when shifting
our fiducial cosmology by $3\times$ the constraints on $\Omega_{\rm m}$ obtained in
Planck Collaboration VI (2020). Such a difference corresponds to an
overestimation of the detection significance by $\sim 2.5\sigma$ (see Table 5).
However, if there were an obvious difference between our choice of
fiducial cosmology and the underlying one, it would also be visible
in the comparison of the parity-even 4PCF measurement. The fact
that we saw good agreement between the observed parity-even 4PCF
and the mock distribution suggests that a wrong fiducial cosmology is
not relevant here.
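To give a feel for the size of the distance shift discussed above, the sketch below integrates the comoving distance in a flat ΛCDM model and shifts the matter density by three times its uncertainty. It is an illustration under stated assumptions, not the procedure used in the paper: only $\Omega_{\rm m}$ is varied, the Hubble constant is held fixed (so the fractional change in the angular diameter distance at fixed redshift equals that in the comoving distance), and the values $\Omega_{\rm m} = 0.3153 \pm 0.0073$ are assumed here as representative of Planck Collaboration VI (2020) and should be checked against that paper. The function names are hypothetical.

```python
import math


def comoving_distance(z, omega_m, n_steps=2000):
    """Flat LCDM comoving distance in units of c/H0, by Simpson's rule."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + 1.0 - omega_m)

    h = z / n_steps  # n_steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return total * h / 3.0


def fractional_distance_shift(z, omega_m=0.3153, sigma=0.0073, n_sigma=3.0):
    """Fractional change in distance when omega_m moves by n_sigma * sigma.

    The default omega_m and sigma are assumed values (see the lead-in).
    """
    base = comoving_distance(z, omega_m)
    shifted = comoving_distance(z, omega_m + n_sigma * sigma)
    return abs(shifted / base - 1.0)


if __name__ == "__main__":
    for name, z in (("LOWZ", 0.32), ("CMASS", 0.57)):
        print(name, "fractional shift:", fractional_distance_shift(z))
```

Under these assumptions the shift stays below 1 per cent at both mean redshifts, in line with the statement in the text.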
6.3.2 Survey geometry and edge correction
As discussed in Section 2, the 4PCF measurement requires correction
for the survey geometry as given in equation (8). In principle, there
can be a mixing between parity-even and parity-odd channels due
to the survey geometry. The left-hand panel of Fig. 28 shows the
coupling matrix $M_{\ell_1\ell_2\ell_3,\,\ell_1'\ell_2'\ell_3'}$ as in equation (9) from PATCHY mocks
with CMASS NGC geometry at a radial bin combination
$(r_1, r_2, r_3) = (27, 55, 97)\,h^{-1}\,{\rm Mpc}$. This coupling matrix has off-diagonal
elements, which means that the parity-even mode can couple to
the parity-odd mode. To assess the correction of the parity-odd
modes due to the survey geometry, we plot the ratio of the random
coefficients to the lowest order one, $R_{\ell_1\ell_2\ell_3}/R_{000}$ (right-hand panel
of Fig. 28). These ratios are of order $10^{-6}$; the dominant edge
correction is thus simply $R_{000}$, which does not mix parity. In addition,
the fact that the PATCHY mocks (which also get edge-corrected)
are consistent with the predicted distribution indicates that the survey
geometry correction is not inducing spurious parity-odd modes.
We may also develop a toy model for the edge-correction factors
based on Slepian & Eisenstein (2015b, section 4.3), where the impact
of a planar edge on a uniform-density sphere was considered. We
consider a sphere of radius $R$ some distance away from a planar edge
of the survey, and orient the $z$-axis perpendicular to that plane. We
may define a critical angle cosine $\mu_c$; for $\mu < \mu_c$, a point will fall
outside the survey. We may compute the $a_{\ell m}$ as a function of $\mu_c$,
then form the 4PCF estimate, and finally average over all possible
$\mu_c$, which corresponds to averaging over how far the sphere’s
centre is from the edge. The choice of $z$-axis means that only the $m = 0$
$a_{\ell m}$ are non-zero. The set-up is described more extensively in Slepian
& Eisenstein (2015b, section 4.3).
Proceeding in this way, we may find a general form for $a_{\ell 0}$, which
is given in Slepian & Eisenstein (2015b). Here, we will only compute
the first few leading edge-correction factors, so we find it more useful
to simply quote the first few $a_{\ell 0}$ explicitly. We have
$a_{00} = \sqrt{\pi}\,[\mu_c + 1]$, $a_{10} = \sqrt{3\pi}\,[\mu_c^{2} - 1]/2$, and
$a_{20} = \sqrt{5\pi}\,[\mu_c^{3} - \mu_c]/2$. Cubing $a_{00}$, we find that
$R_{000} = \pi^{3/2}\,[15/4]$; the 3-$j$ symbol involved in $R_{000}$ is unity.
Similarly, we obtain $R_{011} = -\pi^{3/2}\sqrt{3}\,[7/40]$,
$R_{112} = -\pi^{3/2}\sqrt{3/2}/32$, and $R_{022} = -\pi^{3/2}\sqrt{5}\,(33/1120)$.
These lead to, respectively, $f_{011} = 8$ per cent, $f_{112} = -1$ per cent, and
$f_{022} = 2$ per cent, where these are the ratios of the $R$ with the indicated
angular momenta to $R_{000}$. Furthermore, we note that only of order
20 per cent of spheres in the survey impinge on a boundary at all for
the BOSS footprint; hence these factors would be expected to scale
down by roughly a factor of five. These estimates show that the edge
correction is not a substantial alteration to the measured 4PCF, and
indeed that the dominant effect is that of $R_{000}$, with the higher order
edge-correction factors contributing of order a few per cent at most
(after division by the factor of five).
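The toy-model numbers above can be reproduced with a short numerical sketch. This is an illustration only, resting on two assumptions that are not spelled out in the text: the average over $\mu_c$ is taken uniformly over $[0, 1]$ (chosen here because it reproduces the quoted $R_{000} = \pi^{3/2}\,[15/4]$), and each factor is the product of the three $a_{\ell 0}$ weighted by the standard Wigner 3-$j$ symbol with all $m = 0$. The sketch compares magnitudes only, since the signs depend on conventions that are not checked here; the names `edge_factor` and `ratio_magnitude` are hypothetical.

```python
import math


def a_l0(ell, mu_c):
    """The coefficients a_{l0} quoted in the text for ell = 0, 1, 2."""
    if ell == 0:
        return math.sqrt(math.pi) * (mu_c + 1.0)
    if ell == 1:
        return math.sqrt(3.0 * math.pi) * (mu_c ** 2 - 1.0) / 2.0
    if ell == 2:
        return math.sqrt(5.0 * math.pi) * (mu_c ** 3 - mu_c) / 2.0
    raise ValueError("only ell = 0, 1, 2 are tabulated here")


# Wigner 3-j symbols (l1 l2 l3; 0 0 0), standard closed-form values.
THREE_J = {
    (0, 0, 0): 1.0,
    (0, 1, 1): -1.0 / math.sqrt(3.0),
    (1, 1, 2): math.sqrt(2.0 / 15.0),
    (0, 2, 2): 1.0 / math.sqrt(5.0),
}


def edge_factor(ells, n_steps=2000):
    """3-j symbol times the average of a_l1 a_l2 a_l3 over mu_c in [0, 1].

    The averaging range is an assumption (see the lead-in); Simpson's rule
    is used for the integral.
    """
    def integrand(mu_c):
        prod = 1.0
        for ell in ells:
            prod *= a_l0(ell, mu_c)
        return prod

    h = 1.0 / n_steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return THREE_J[ells] * total * h / 3.0


def ratio_magnitude(ells):
    """Magnitude of the edge factor relative to the leading one, R_000."""
    return abs(edge_factor(ells) / edge_factor((0, 0, 0)))


if __name__ == "__main__":
    for ells in ((0, 1, 1), (1, 1, 2), (0, 2, 2)):
        print(ells, "|f| =", round(100.0 * ratio_magnitude(ells), 2), "per cent")
```

Dividing these magnitudes by the factor of five mentioned in the text gives the per cent level contributions quoted there.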
7 CONCLUSION
In this work, we have used a novel, recently suggested method
(Cahn et al. 2021) to search for parity violation in 3D large-scale
structure. This method exploits the fact that the lowest order shape
that cannot be rotated into its mirror image in 3D is a tetrahedron,
and thus counting tetrahedra and looking for excesses of one type
over another can reveal parity violation. We have detected strong
evidence for it in both the CMASS and LOWZ samples of BOSS,
and in both the NGC and SGC of each, summarized in Table 2. We
have explored numerous systematics to assess whether any could
produce either a spurious parity-odd signal, or a substantial change
to the covariance matrix such that we would underestimate the
statistical error bars and so make a spurious detection. We have
not found any systematic that seems to do either. We have also
performed a number of tests of our covariance matrix and pursued a
number of analysis variations as regards the covariance matrix,
radial binning, scales, and angular momenta used.
Overall, the compact takeaway from this work is that CMASS with
18 radial bins has a $7\sigma$ detection of parity-odd modes when using the
analytic covariance with all degrees of freedom, and a roughly $4\sigma$
detection when using of order 500 eigenvalues plus a covariance
estimated directly from the mocks. Given the large number of degrees
of freedom in our analytic-covariance analysis, underestimating the
noise by 20 per cent in every degree of freedom would be enough to
produce this detection spuriously. In the ‘compressed’ analysis with
500 eigenvalues, we would have had to underestimate the covariance
by 30 per cent in every degree of freedom to spuriously produce
our $4\sigma$ result. Such a substantial underestimate seems unlikely but,
given that the result, if cosmological in origin, would be significant,
caution remains warranted.$^{16}$
We now briefly revisit the key technical aspects of our analysis
and of our systematics search.
$^{16}$ We chose 500 eigenvalues for discussion here because this strikes a
balance between detection significance and having several mocks per element
in the covariance. This has 1000 degrees of freedom because we treat eigenvalues
from north and south as separate degrees of freedom.
7.1 Radial binning
In this work, we tried three different radial binning schemes
and found different detection significances. We hypothesize that
the increasing detection significance is due to the reduction of
internal cancellation. The finest binning we use in this work has
$\Delta s = 8\,h^{-1}\,{\rm Mpc}$, which is roughly half of the average inter-particle
separation for BOSS. In this work, our main goal is to explore a
possible parity-odd signal; we leave the exploration of the optimal
radial binning to future work.
7.2 CMASS versus LOWZ detections
CMASS and LOWZ are both mainly composed of LRGs, but the
two samples behave differently with regard to (i) sensitivity of the
detection significance to varying radial binning and (ii) sensitivity of
the detection significance to varying $\ell_{\rm max}$. These different behaviours
may be due to differences in the number density, mean redshift,
galaxy population, and redshift failure rates of the two samples.
7.3 Correlation across the sky
We do not observe a statistically significant positive cross-correlation
between the NGC and SGC in each data set, and we find that this test is
sensitive to radial binning. We explored a simple phenomenological
model to understand if high detection significance but low cross-
correlation is generically possible, and it seems so. These two
measures are algebraically independent so this finding was not
outside the realm of expectation.
7.4 Systematics
While we have shown that nearly all the systematics considered
do not produce parity-odd modes at the signal level (i.e. they
would contribute vanishingly in the infinite-volume limit), some can
enhance the variance of the observed data relative to the covariance
from the mocks if they are present in the data but not the mocks.
If substantial enough, this in principle could produce a spurious
detection. In practice, we have not found any systematic that, even if
grossly larger than what we believe to be present in the data, could
increase the variance enough to explain our results. We have also
looked for consistency in the even-parity sector as a way to guard
against any systematics, unknown to us, that increase the variance.
7.5 Implications
Given the unexpected nature of the detected signal, it is premature
to speculate on the implications. Following the advice of Newton:
hypotheses non fingo.$^{17}$ Nonetheless, it is worth placing our results
in the context of possible theoretical models that could explain
them, should the signal here be of genuinely cosmological origin.
Besides Chern–Simons-like interactions (Deser, Jackiw & Templeton
1982a,b; Witten 1989) with couplings of different forms and
various extensions (Freidel, Minic & Takeuchi 2005; Alexander,
Peskin & Sheikh-Jabbari 2006; Liu et al. 2020; Li & Zhao 2022),
string-sourced perturbations (Pogosian & Wyman 2008), primordial
vortices (Vilenkin 1978; Vilenkin & Leahy 1982; Brizard, Murayama
& Wurtele 2000), broken symmetry during phase transitions (’t Hooft
1974; Quashnock, Loeb & Spergel 1989; Baym, Bödeker & McLerran
1996), and anomalies in gravity (Alvarez-Gaume & Witten 1984;
Contaldi, Magueijo & Smolin 2008; Kuntz 2019) can all produce
parity violation. Among these mechanisms, the simplest relevant for
us is one with a non-vanishing parity-odd trispectrum (the Fourier-space
analogue of the 4PCF) in the scalar sector (Shiraishi, Komatsu
& Peloso 2014; Shiraishi 2016), as curvature perturbations directly
seed the galaxy density perturbations. Extending such models to
include the full wave-vector dependence as well as ‘late-time’ effects
such as galaxy biasing and RSDs will be an important step on the
road to comparing models to data and determining whether the signal
observed here is consistent with any such models. Especially given that
the detection significance we report here is a cumulative result from
hundreds to thousands of degrees of freedom, a model comparison
will provide much more constraining power as it will enable the use
of shape information.
$^{17}$ ‘I contrive no hypotheses’, from Newton’s ‘General Scholium’ added to
the Principia of 1713.
7.6 Outlook
There are two main directions we suggest on the data side to
further pursue the parity-odd sector. First, it is worth validating the
measurement on different surveys with different designs. Upcoming
surveys such as the Dark Energy Spectroscopic Instrument (DESI;
DESI Collaboration 2016), Euclid (Laureijs et al. 2011), and Rubin
(LSST Science Collaboration 2009), which will provide much larger
samples with different systematics, will offer expanded insight
into the signal’s true origin. In particular, if our signal is of truly
cosmological origin, it would likely be detectable in DESI at much
higher statistical significance due to the larger volume. Thus, while
in this paper an error in our covariance matrix at the 20–30 per cent
level would produce a spurious detection, in DESI we
expect one would have to make a much larger error in the covariance.
Since this is unlikely, finding evidence for this signal in DESI would
be a strong indicator that either the signal is real, or that it is due to
some as yet undiscovered parity-breaking systematic shared between
BOSS and DESI. Yet, given that DESI has a different imaging
survey, a different instrument, and a different method for fibring
objects in taking spectra, and is on a different telescope than BOSS, this
latter possibility also seems unlikely. Second, it is worth pursuing
further approaches to search out systematics. For example, assuming
a linear relationship between the systematics and the observed galaxy
density field, one can study cross-correlations between them to
reveal residual systematics (Ho et al. 2012). Alternatively, likelihood-based
forward modelling can also be applied to search for unknown
systematics (Lavaux, Jasche & Leclercq 2019). The first approach
would require a non-trivial extension of our current NPCF algorithm
in order to compute cross-correlations between fields, and the second
requires injecting inferred systematics maps into mock catalogues.
We leave this for future work.
Whether the detection of the current work ultimately turns out to be
cosmological in origin or not, we hope that the careful and extensive
treatment of systematics here offers a well-delineated avenue forward
for future studies of parity-breaking early-Universe physics using 3D
large-scale structure.
ACKNOWLEDGEMENTS
We thank K. Dawson, H. Ding, D.P. Finkbeiner, A. Ginsburg, R.
Guzmán, D. Jeong, A. Krolewski, E. Lada, W. Percival, O. Philcox,
S. Portillo, D. Richardson, A.J. Ross, U. Seljak, D.J. Schlegel, M.
Slepian, D. Slepian, D. Spergel, C. Steinhardt, C. Telesco, and W.
Xue for useful conversations. We especially thank D.J. Eisenstein,
A. de Mattia, and A.G. Sánchez for extensive comments on the paper.
We also thank A. Chuang, F.S. Kitaura, and C. Zhao for providing
access to the PATCHY mocks. We thank E. Deumens of UF Research
Computing for assistance using HiPerGator. Finally, we thank the
members of the Slepian group for thoughtful discussions throughout
the course of this work: N. Brown, J. Chellino, M. Hansen, F.
Kamalinejad, K. Meigs, W. Ortolà Leonard, J. Sunseri, and C. Zhang.
JH is indebted to G. Hou for all their unconditional support in
science. JH has received funding from the European Union’s Horizon
2020 research and innovation program under the Marie Skłodowska-Curie
grant agreement No 101025187.
DATA AVAILABILITY STATEMENT
The data sets underlying this article that are used to measure the 4-point
correlation functions and the covariance matrices are available
via https://data.sdss.org/sas/dr12/boss/lss/.
REFERENCES
Albrecht A. , Steinhardt P. J., 1982, Phys. Rev. Lett. , 48, 1220
Alexander S. , Yunes N., 2009, Phys. Rep. , 480, 1
Alexander S. H. S. , Peskin M. E., Sheikh-Jabbari M. M., 2006, Phys. Rev.
Lett. , 96, 081301
Alvarez-Gaume L. , Witten E., 1984, Nucl. Phys. B , 234, 269
Bardeen J. M. , 1980, Phys. Rev. D , 22, 1882
Barnaby N. , Namba R., Peloso M., 2011, J. Cosmol. Astropart. Phys. , 2011,
009
Bartolo N. , Matarrese S., Peloso M., Shiraishi M., 2015, J. Cosmol. Astropart.
Phys. , 2015, 039
Baym G., Bödeker D., McLerran L., 1996, Phys. Rev. D, 53, 662
Bernardeau F., Colombi S., Gaztañaga E., Scoccimarro R., 2002, Phys. Rep.,
367, 1
Beutler F. et al., 2014, MNRAS , 443, 1065
Beutler F. , Castorina E., Zhang P. , 2019, J. Cosmol. Astropart. Phys. , 2019,
040
BOSS collaboration , 2017, MNRAS , 470, 2617
Brizard A. J. , Murayama H., Wurtele J. S., 2000, Phys. Rev. E , 61, 4410
Bundy K., Leauthaud A., Saito S., Maraston C., Wake D. A., Thomas D.,
2017, ApJ, 851, 34
Cahn R. N. , Slepian Z., 2020, preprint ( arXiv:2010.14418 )
Cahn R. N. , Slepian Z., Hou J., 2021, preprint ( arXiv:2110.12004 )
Cannon R. et al., 2006, MNRAS , 372, 425
Contaldi C. R. , Magueijo J., Smolin L., 2008, Phys. Rev. Lett. , 101, 141101
Dawson K. S. et al., 2013, AJ , 145, 10
Deser S. , Jackiw R., Templeton S., 1982a, Phys. Rev. Lett. , 48, 975
Deser S. , Jackiw R., Templeton S., 1982b, Ann. Phys. , 140, 372
DESI Collaboration , 2016, preprint ( arXiv:1611.00036 )
Dyda S., Flanagan É. É., Kamionkowski M., 2012, Phys. Rev. D, 86, 124031
Eisenstein D. J. et al., 2001, AJ , 122, 2267
Eisenstein D. J. et al., 2011, AJ , 142, 72
Eskilt J. R. , Komatsu E., 2022, Phys. Rev. D , 106, 063503
Feldman H. A. , Kaiser N., Peacock J. A., 1994, ApJ , 426, 23
Freidel L. , Minic D., Takeuchi T. , 2005, Phys. Rev. D , 72, 104002
Friesen B. et al., 2017, Proc. International Conference for High Performance
Computing, Networking, Storage and Analysis (SC ’17) . Association for
Computing Machinery, p. 20
Fry J. N. , Peebles P. J. E., 1978, ApJ , 221, 19
Garcia K. , Slepian Z., 2022, MNRAS , 515, 1199
Gualdi D., Verde L., 2022, J. Cosmol. Astropart. Phys., 2022, 050
Hahn C., Scoccimarro R., Blanton M. R., Tinker J. L., Rodríguez-Torres S.
A., 2017, MNRAS, 467, 1940
Hirata C. M. , 2009, MNRAS , 399, 1074
Ho S. et al., 2012, ApJ , 761, 14
Hotelling H. , 1953, J. R. Stat. Soc. B, 15, 193
Hou J. , Cahn R. N., Philcox O. H. E., Slepian Z., 2021a, Phys. Rev. D , 106,
043515
Hou J. et al., 2021b, MNRAS , 500, 1201
Jackiw R. , Pi S. Y. , 2003, Phys. Rev. D , 68, 104012
Jeong D. , Kamionkowski M., 2012, Phys. Rev. Lett. , 108, 251301
Jeong D. , Schmidt F. , 2019, Phys. Rev. D , 102, 023530
Kaiser N. , 1987, MNRAS , 227, 1
Kamionkowski M. , Souradeep T., 2011, Phys. Rev. D , 83, 027301
Kenney J. F. , 1947, Mathematics of statistics. Macmillan, London
Kerscher M. , Szapudi I., Szalay A. S., 2000, ApJ , 535, L13
Kitaura F.-S. et al., 2016, MNRAS, 456, 4156
Kullback S. , Leibler R. A., 1951, Ann. Math. Stat. , 22, 79
Kuntz I. , 2019, Found. Phys. , 49, 191
Landy S. D. , Szalay A. S., 1993, ApJ , 412, 64
Laureijs R. et al., 2011, preprint ( arXiv:1110.3193 )
Lavaux G. , Jasche J., Leclercq F., 2019, preprint ( arXiv:1909.06396 )
Leauthaud A. et al., 2016, MNRAS , 457, 4021
Lee T. D., Yang C. N., 1956, Phys. Rev., 104, 254
Li M. , Zhao D., 2022, Phys. Lett. B , 827, 136968
Li Y. , Hu W. , Takada M., 2014, Phys. Rev. D , 89, 083519
Linde A. D. , 1982, Phys. Lett. B , 108, 389
Linde A. D. , 1983, Phys. Lett. B , 129, 177
Liu T. , Tong X., Wang Y., Xianyu Z.-Z., 2020, J. High Energy Phys. , 2020,
189
LSST Science Collaboration, 2009, LSST Science Book, Version 2.0. preprint
(arXiv:0912.0201)
Lue A., Wang L., Kamionkowski M., 1999, Phys. Rev. Lett., 83, 1506
Maraston C. et al., 2013, MNRAS , 435, 2764
Masters K. L. et al., 2011, MNRAS , 418, 1055
Masui K. W., Pen U.-L., Turok N., 2017, Phys. Rev. Lett., 118, 221301
Minami Y. , Komatsu E., 2020, Phys. Rev. Lett. , 125, 221301
Nishizawa A. , Kobayashi T., 2018, Phys. Rev. D , 98, 124018
Orlando G. , Pieroni M., Ricciardone A., 2021, J. Cosmol. Astropart. Phys. ,
2021, 069
Özsoy O., 2021, Phys. Rev. D, 104, 123523
Parejko J. K. et al., 2013, MNRAS , 429, 98
Peebles P. J. E., 2001, in Martínez V. J., Trimble V., Pons-Bordería M. J., eds,
ASP Conf. Ser. Vol. 252, Historical Development of Modern Cosmology.
Astron. Soc. Pac., San Francisco, p. 201
Philcox O. H. E., Hou J., Slepian Z., 2021, preprint (arXiv:2108.01670)
Philcox O. H. E., Slepian Z., Hou J., Warner C., Cahn R. N., Eisenstein D. J.,
2022, MNRAS, 509, 2457
Planck Collaboration VI , 2020, A&A , 641, A6
Pogosian L. , Wyman M., 2008, Phys. Rev. D , 77, 083509
Portillo S. K. N. , Slepian Z., Burkhart B., Kahraman S., Finkbeiner D. P.,
2018, ApJ , 862, 119
Putter R. d., Wagner C., Mena O., Verde L., Percival W. J., 2012, J. Cosmol.
Astropart. Phys., 2012, 019
Quashnock J. M. , Loeb A., Spergel D. N., 1989, ApJ , 344, L49
Reid B. et al., 2016, MNRAS , 455, 1553
Rodríguez-Torres S. A. et al., 2017, MNRAS, 468, 728
Ross A. J. et al., 2012, MNRAS , 424, 564
Ross A. J. et al., 2016, MNRAS , 464, 1168
Sabiu C. G. , Hoyle B., Kim J., Li X.-D., 2019, ApJS , 242, 29
Saito S. , Ichiki K., Taruya A., 2007, J. Cosmol. Astropart. Phys. , 2007, 002
Saito S. et al., 2016, MNRAS , 460, 1457
Sakharov A. D. , 1967, JETP Lett., 5, 24
Scoccimarro R. , Couchman H. M. P. , Frieman J. A., 1999, ApJ , 517, 531
Sellentin E. , Heavens A. F., 2016, MNRAS , 456, L132
Shiraishi M. , 2016, Phys. Rev. D , 94, 083503
Shiraishi M., Nitta D., Yokoyama S., Ichiki K., Takahashi K., 2011, Prog.
Theor. Phys., 125, 795
Shiraishi M. , Komatsu E., Peloso M., 2014, J. Cosmol. Astropart. Phys. ,
2014, 027
Slepian Z. , Eisenstein D. J., 2015a, preprint ( arXiv:1510.04809 )
Slepian Z. , Eisenstein D. J., 2015b, MNRAS , 454, 4142
Slepian Z. , Eisenstein D. J., 2015c, MNRAS , 455, L31
Slepian Z. , Eisenstein D. J., 2018, MNRAS , 478, 1468
Slepian Z. et al., 2017a, MNRAS , 468, 1070
Slepian Z. et al., 2017b, MNRAS , 469, 1738
Slepian Z. et al., 2017c, MNRAS , 474, 2109
Slepian Z. , Li Y. , Schmittfull M., Vlah Z., 2019, preprint ( arXiv:1912.00065 )
Smee S. A. et al., 2013, AJ , 146, 32
Soda J. , Kodama H., Nozawa M., 2011, J. High Energy Phys. , 2011, 67
Sorbo L. , 2011, J. Cosmol. Astropart. Phys. , 2011, 003
Starobinsky A. A. , 1982, Phys. Lett. B , 117, 175
Sugiyama N. S. , Saito S., Beutler F. , Seo H.-J., 2019, MNRAS , 484,
364
Sugiyama N. S. , Saito S., Beutler F. , Seo H.-J., 2021, MNRAS , 501,
2862
Szapudi I. , Szalay A. S., 1998, ApJ , 494, L41
’t Hooft G. , 1974, Nucl. Phys. B , 79, 276
Vilenkin A. , 1978, Phys. Rev. Lett. , 41, 1575
Vilenkin A. , Leahy D. A., 1982, ApJ , 254, 77
Wang A., Wu Q., Zhao W., Zhu T., 2013, Phys. Rev. D, 87, 103512
Wishart J., 1928, Biometrika, 20A, 32
Witten E. , 1989, Commun. Math. Phys. , 121, 351
Wu C. S. , Ambler E., Hayward R. W. , Hoppes D. D., Hudson R. P., 1957,
Phys. Rev. , 105, 1413
Yunes N. , O’Shaughnessy R., Owen B. J., Alexander S., 2010, Phys. Rev. D ,
82, 064017
Zhu T., Zhao W., Huang Y., Wang A., Wu Q., 2013, Phys. Rev. D, 88, 063508
APPENDIX A: PARITY-ODD BASIS FUNCTIONS
IN CARTESIAN REPRESENTATION
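Below we list parity-odd basis functions in Cartesian representation
for angular momentum up to $\ell_{\rm max} = 4$.
$$
\begin{aligned}
\mathcal{P}_{111}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= -i\,\frac{3}{\sqrt{2}}\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3),\\
\mathcal{P}_{122}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= i\,\sqrt{\frac{45}{2}}\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3)\,(\hat{r}_2\cdot\hat{r}_3),\\
\mathcal{P}_{133}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= -i\,\frac{15}{4}\sqrt{7}\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3)\left[(\hat{r}_2\cdot\hat{r}_3)^{2}-\frac{1}{5}\right],\\
\mathcal{P}_{144}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= +i\,\frac{21\sqrt{15}}{4}\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3)\left[(\hat{r}_2\cdot\hat{r}_3)^{3}-\frac{3}{7}(\hat{r}_2\cdot\hat{r}_3)\right],\\
\mathcal{P}_{223}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= -i\,15\sqrt{\frac{5}{8}}\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3)\left[(\hat{r}_1\cdot\hat{r}_3)(\hat{r}_2\cdot\hat{r}_3)-\frac{1}{5}\,\hat{r}_1\cdot\hat{r}_2\right],\\
\mathcal{P}_{234}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= +\frac{i}{4}\,7\cdot 15\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3)\\
&\quad\times\left[(\hat{r}_1\cdot\hat{r}_3)(\hat{r}_2\cdot\hat{r}_3)^{2}-\frac{1}{7}\,\hat{r}_1\cdot\hat{r}_3-\frac{2}{7}(\hat{r}_1\cdot\hat{r}_2)(\hat{r}_2\cdot\hat{r}_3)\right],\\
\mathcal{P}_{333}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= i\,\frac{5\cdot 105}{6\sqrt{6}}\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3)\\
&\quad\times\left\{(\hat{r}_1\cdot\hat{r}_2)(\hat{r}_1\cdot\hat{r}_3)(\hat{r}_2\cdot\hat{r}_3)-\frac{1}{5}\left[(\hat{r}_1\cdot\hat{r}_2)^{2}+(\hat{r}_1\cdot\hat{r}_3)^{2}+(\hat{r}_2\cdot\hat{r}_3)^{2}\right]+\frac{2}{25}\right\},\\
\mathcal{P}_{344}(\hat{r}_1,\hat{r}_2,\hat{r}_3) &= -i\,\frac{15\cdot 49}{4}\sqrt{\frac{5}{11}}\,(4\pi)^{-3/2}\;\hat{r}_1\cdot(\hat{r}_2\times\hat{r}_3)\\
&\quad\times\left[(\hat{r}_1\cdot\hat{r}_2)(\hat{r}_1\cdot\hat{r}_3)(\hat{r}_2\cdot\hat{r}_3)^{2}-\frac{1}{5}(\hat{r}_2\cdot\hat{r}_3)^{3}+\frac{31}{245}(\hat{r}_2\cdot\hat{r}_3)-\frac{3}{49}(\hat{r}_1\cdot\hat{r}_2)(\hat{r}_1\cdot\hat{r}_3)\right.\\
&\qquad\left.-\frac{2}{7}(\hat{r}_1\cdot\hat{r}_3)^{2}(\hat{r}_2\cdot\hat{r}_3)-\frac{2}{7}(\hat{r}_1\cdot\hat{r}_2)^{2}(\hat{r}_2\cdot\hat{r}_3)\right].
\end{aligned}
\tag{A1}
$$
The eight functions above are the fundamental set and the remaining
fifteen are generated by interchanging the arguments.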
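As an illustration of how these functions respond to a parity transformation, the sketch below evaluates the first two entries of equation (A1) for one tetrahedron and for its mirror image, obtained by reversing all three unit vectors. It is a minimal example written for this purpose, with hypothetical helper names (`p_111`, `p_122`, `mirror`), and is not the code used for the measurement.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def normalise(v):
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


PREFACTOR = (4.0 * math.pi) ** -1.5


def p_111(r1, r2, r3):
    """First entry of equation (A1): proportional to the scalar triple product."""
    triple = dot(r1, cross(r2, r3))
    return -1j * (3.0 / math.sqrt(2.0)) * PREFACTOR * triple


def p_122(r1, r2, r3):
    """Second entry of equation (A1): triple product times (r2 . r3)."""
    triple = dot(r1, cross(r2, r3))
    return 1j * math.sqrt(45.0 / 2.0) * PREFACTOR * triple * dot(r2, r3)


def mirror(vectors):
    """Parity transformation: reverse every unit vector."""
    return [[-c for c in v] for v in vectors]


if __name__ == "__main__":
    tetra = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], normalise([0.0, 1.0, 1.0])]
    for func in (p_111, p_122):
        print(func.__name__, func(*tetra), func(*mirror(tetra)))
```

Both functions are purely imaginary and change sign under the mirror operation, because the scalar triple product is odd under parity while the dot products are even.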
APPENDIX B: DISCUSSION OF PEARSON
CORRELATION COEFFICIENT BETWEEN TWO
DATA SETS
Consider two data sets $d_1 = s_1 + n_1 - \langle s_1\rangle_{\rm all} \equiv s_1 + n_1$ and
$d_2 = s_2 + n_2 - \langle s_2\rangle_{\rm all} \equiv s_2 + n_2$, with $s$ the S/N ratio and $n$ the noise.
These are data vectors, in our case over all channels and radial bins;
$\langle s_j\rangle_{\rm all}$ denotes an average over all channels and radial bins.
Given two data sets, each with a certain intrinsic statistical noise
drawn from a Gaussian distribution, what is the maximum allowed
correlation between them? In the decorrelated basis, we can treat
each data point independently. We recall from equation (21) the
definition of the Pearson correlation coefficient and insert in it the
data vectors as defined above, finding
$$
r_p = \frac{\sum_{i=1}^{N_{\rm bin}}\left(d_1^{(i)} - \langle d_1\rangle_{\rm all}\right)\left(d_2^{(i)} - \langle d_2\rangle_{\rm all}\right)}{\sqrt{\sum_{i=1}^{N_{\rm bin}}\left(d_1^{(i)} - \langle d_1\rangle_{\rm all}\right)^{2}\,\sum_{i=1}^{N_{\rm bin}}\left(d_2^{(i)} - \langle d_2\rangle_{\rm all}\right)^{2}}}. \tag{B1}
$$
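Assuming a linear relation between the signals from the two samples,
i.e. $s_2 = b\,s_1$, that the average observed signal is given by
$\langle d_j\rangle_{\rm all} = N_{\rm bin}^{-1}\sum_{i=1}^{N_{\rm bin}} s_j^{(i)}$ for $j = 1, 2$, and that the noise vanishes after averaging,
we have for the numerator
$$
\sum_{i=1}^{N_{\rm bin}}\left(d_1^{(i)} - \langle d_1\rangle_{\rm all}\right)\left(d_2^{(i)} - \langle d_2\rangle_{\rm all}\right) = \langle s_1 s_2\rangle_{\rm all} + \langle s_1 n_2\rangle_{\rm all} + \langle s_2 n_1\rangle_{\rm all} + \langle n_1 n_2\rangle_{\rm all} - \langle s_1\rangle_{\rm all}\langle s_2\rangle_{\rm all}. \tag{B2}
$$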
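For the denominator, we use that
$$
\sum_{i=1}^{N_{\rm bin}}\left(d_j^{(i)} - \langle d_j\rangle_{\rm all}\right)^{2} = \langle s_j^{2}\rangle_{\rm all} - \langle s_j\rangle_{\rm all}^{2} + \langle n_j^{2}\rangle_{\rm all} + 2\langle s_j n_j\rangle_{\rm all} \equiv \sigma_{s_j}^{2} + \sigma_{n_j}^{2} + 2\langle s_j n_j\rangle_{\rm all}. \tag{B3}
$$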
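The correlation coefficient is then
$$
r_p = \frac{\langle s_1 s_2\rangle_{\rm all} - \langle s_1\rangle_{\rm all}\langle s_2\rangle_{\rm all} + \langle s_1 n_2\rangle_{\rm all} + \langle n_1 s_2\rangle_{\rm all} + \langle n_1 n_2\rangle_{\rm all}}{\sqrt{\left(\sigma_{s_1}^{2} + 2\langle s_1 n_1\rangle_{\rm all} + \sigma_{n_1}^{2}\right)\left(\sigma_{s_2}^{2} + 2\langle s_2 n_2\rangle_{\rm all} + \sigma_{n_2}^{2}\right)}}. \tag{B4}
$$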
First we discuss the low S/N limit, $s_1 \ll n_1$ and $s_2 \ll n_2$. We
assume the correlation between the signal and noise is zero, and
that the signal covariance is much smaller than that of the noise, i.e.
$\langle s_1 s_2\rangle_{\rm all} \ll \langle n_1 n_2\rangle_{\rm all}$. We find in this limit that
$$
r_p = \frac{\langle n_1 n_2\rangle_{\rm all}}{\sigma_{n_1}\sigma_{n_2}} = 0, \tag{B5}
$$
where to obtain the last equality we assumed that the noise is
independent in each bin (as is assumed for a Pearson analysis, and as
is the case once we have rotated to the decorrelated basis as discussed
in the main text, Section 5.2).
Second, in the high S/N limit, $n_1 \ll s_1$ and $n_2 \ll s_2$, and now
with $\langle n_1 n_2\rangle_{\rm all} \ll \langle s_1 s_2\rangle_{\rm all}$, we find
$$
r_p = \frac{\langle s_1 s_2\rangle_{\rm all} - \langle s_1\rangle_{\rm all}\langle s_2\rangle_{\rm all}}{\sigma_{s_1}\sigma_{s_2}}. \tag{B6}
$$
We note that $\sigma$ here is the rms of the relevant vector over radial
bins and channels, not any kind of statistical error. It is simply a
measure of how much that vector varies as the angular momenta and
the radial bins are changed. This equation thus shows that, even if
one had a signal of very high amplitude (and hence high S/N and
high detection significance), if one had little variation over channels
and radial bins, one would obtain a low $r_p$. Put simply, the detection
significance and $r_p$ are independent.
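The point made by equation (B6) can be illustrated with a small simulation. The sketch below is a toy example under stated assumptions, not a reproduction of the analysis: both data vectors share the same signal, the noise is independent with unit variance in every bin (as in a decorrelated basis), and the mean of $d^2$ per bin is used as a rough stand-in for detection significance. The signal shapes, the number of bins, and the function names are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(signal, seed=1):
    """Add independent unit-variance noise to a shared signal, twice.

    Returns the mean of d^2 per bin for the first data set (a proxy for
    detection significance) and the Pearson coefficient between the two.
    """
    rng = random.Random(seed)
    d1 = [s + rng.gauss(0.0, 1.0) for s in signal]
    d2 = [s + rng.gauss(0.0, 1.0) for s in signal]
    chi2_per_bin = sum(d * d for d in d1) / len(d1)
    return chi2_per_bin, pearson(d1, d2)


N_BIN = 2000
FLAT_SIGNAL = [5.0] * N_BIN                                 # strong, no variation over bins
VARYING_SIGNAL = [5.0 * (-1) ** i for i in range(N_BIN)]    # same amplitude, sign alternates

if __name__ == "__main__":
    print("flat signal   :", simulate(FLAT_SIGNAL))
    print("varying signal:", simulate(VARYING_SIGNAL))
```

In both cases the signal is far above the noise, yet the flat signal gives a Pearson coefficient close to zero while the varying one gives a coefficient close to unity, as expected when $\sigma_s$ rather than the signal amplitude controls $r_p$.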
APPENDIX C: COMPARISON OF CMASS AND
LOWZ ANALYTIC COVARIANCES
Here, we compare the analytic covariances for the two samples
(Fig. C1). This test shows that they differ non-trivially, thus precluding
a cross-correlation analysis between the two samples if we
wished to diagonalize as described in Section 5.2.
Figure C1. Comparison of the CMASS and LOWZ analytic covariance matrices, but with their volumes set to be equal. We do this because an overall
scaling will not adversely impact a cross-correlation analysis, but non-trivial differences in the matrices do and it is these we want to display. We have
mapped each set of channels and radial bins into 1D indices to make a 2D plot. The left-hand panel shows a symmetrized test; notably, the diagonal is
non-vanishing after subtracting the identity matrix. The non-vanishing off-diagonal elements present in the left-hand panel indicate that the eigenbases of
the two matrices also differ, as we might expect given that the shot noise impacts the covariance non-trivially and the two matrices are computed with
different shot noises (see Table 1). The right-hand panel shows just the diagonal of our test matrix; the residual is likely due to the slightly different number
densities.
APPENDIX D: REDSHIFT-DISTRIBUTION
DEPENDENCE ON IMAGING DEPTH
We here assess how the imaging depth affects the number density as
a function of redshift, focusing on CMASS. Figs D1 and D2 illustrate
the redshift distribution’s dependence on the r- and i-band imaging
depths for, respectively, the NGC and the SGC. In each cap, in both the
r and i bands, the redshift distribution is very similar across the three
bins in imaging depth that we construct. This similarity implies that
the imaging depth does not strongly impact the redshift distribution.
Figure D1. Redshift-distribution dependence on imaging depth for CMASS NGC. Left-hand panel: Normalized galaxy number counts as a function of redshift
for i band. We have split the sample into three bins in imaging depth. Right-hand panel: Same as the left but for r band. We see that the three bins in i- and
r-band depths have very similar n(z), implying that the imaging depth is unlikely to impact our analysis.
Figure D2. Same as Fig. D1 but for CMASS SGC. Again, the i and r bands have very similar n(z) for the three bins in imaging depth, implying that imaging
depth does not strongly impact our analysis for SGC.
APPENDIX E: COARSER RADIAL BINNING
Here, we show the results of using a coarser radial binning by a
factor of roughly two relative to our 10-bin case. We generally see
lower detection significance, consistent with the theory that ‘internal
cancellation’ occurs to a greater extent in this coarser binning. Our
results are displayed in Fig. E1 .
Figure E1. Here, we show a compressed analysis (Section 5.1.2), and direct approaches with both the mock covariance (lower middle panel) and the analytic
covariance (lower right panel) for a 4PCF with just six radial bins. This substantially reduces the number of degrees of freedom, permitting the use of the mock
covariance. However, the detection significance is also degraded; we attribute this to much larger ‘internal cancellation’ (Section 2.2) than in our 10- and 18-bin
analyses.
APPENDIX F: MAXIMUM SCALE CUT AND
MINIMUM BIN SEPARATION
To explore whether the use of small scales makes a difference in
our analysis (as these are the scales on which the mocks may
be most likely to imperfectly mirror the data, due to approximate
treatment of non-linear structure formation), we test the detection
significances by applying further restrictions to the radial
bins. First, we force the minimum radial bin separation to be
$\Delta r \geq 15\,h^{-1}\,{\rm Mpc}$ such that the results are less sensitive to small
scales. Second, we vary the maximum radial bin, taking
$r_{\rm max} = 90\,h^{-1}\,{\rm Mpc}$ and $r_{\rm max} = 130\,h^{-1}\,{\rm Mpc}$, such that we are less
affected by any mismatch between mocks and data around the BAO position and
towards larger scales (as was seen in some of the 2PCF measurements).
The results are shown in Figs F1 and F2. In all these cases
the detection significance is reduced compared to the fiducial case
(without minimum bin separation or maximum radial bin), given
that the number of degrees of freedom is reduced. However, we
still observe non-negligible detection significances for all cases. It
is worth pointing out that there are in total only $35 \times 23 = 805$
degrees of freedom when using $r_{\rm max} = 90\,h^{-1}\,{\rm Mpc}$, which allows
us to use the mock covariance directly. Despite the mismatch
between the analytic covariance matrix and the expected
$\chi^{2}$ distribution, there is no substantial difference in the detection
significances.
Figure F1. Upper panel: CMASS sample with minimum radial bin separation $\Delta r = 15\,h^{-1}\,{\rm Mpc}$ and maximum radial bin $r_{\rm max} = 90\,h^{-1}\,{\rm Mpc}$. The left-most
column uses the data compression method with $N_{\rm eig} = 200$. The middle column directly uses the mock covariance. The right-most column uses the analytic
covariance. Lower panel: Same as the upper row but for LOWZ. In all these cases the detection significance is reduced relative to that in our fiducial analysis
due to having fewer degrees of freedom.
Figure F2. Upper panel: CMASS sample with minimum radial bin separation $\Delta r = 15\,h^{-1}\,{\rm Mpc}$ and maximum radial bin $r_{\rm max} = 130\,h^{-1}\,{\rm Mpc}$. The left-hand
column uses the data compression method with $N_{\rm eig} = 200$. The right-hand column uses the analytic covariance. Lower panel: Same as the upper row but for
LOWZ. As in Fig. F1 the detection significance is reduced compared to that in our fiducial analysis.
APPENDIX G: ALL ANGULAR CHANNELS
Figure G1. The parity-odd 4PCF for the BOSS CMASS data, with NGC in red and SGC in blue. The plot includes all the angular channels for 10 radial bins.
The error bars are the rms of the PATCHY mocks.
Figure G2. The parity-odd 4PCF for the BOSS LOWZ data including all the angular channels for 10 radial bins. NGC is in brown and SGC in blue; the error
bars are the rms of the PATCHY mocks.
Figure G3. The parity-odd 4PCF for the BOSS CMASS data including all the angular channels for 18 radial bins. NGC is in red and SGC is in blue; the error
bars are the rms of the PATCHY mocks.
Figure G4. The parity-odd 4PCF for the BOSS LOWZ data including all the angular channels for 18 radial bins. NGC is in brown and SGC in blue; the error
bars are the rms of the PATCHY mocks.
This paper has been typeset from a TeX/LaTeX file prepared by the author.
© 2023 The Author(s).
Published by Oxford University Press on behalf of Royal Astronomical Society. This is an Open Access article distributed under the terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.