Prepared for submission to JCAP
Effort.jl: a fast and differentiable
emulator for the Effective Field
Theory of the Large Scale Structure
of the Universe
Marco Bonici,$^{a,b,c,1}$ Guido D’Amico,$^{d,e}$ Julien Bel$^{f}$ and Carmelita Carbone$^{c}$
$^a$ Waterloo Centre for Astrophysics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
$^b$ Department of Physics and Astronomy, University of Waterloo, 200 University Ave W, Waterloo, ON N2L 3G1, Canada
$^c$ INAF Istituto di Astrofisica Spaziale e Fisica cosmica di Milano, Via Alfonso Corti 12, I-20133 Milano, Italy
$^d$ Dipartimento di Scienze Matematiche, Fisiche e Informatiche, Università di Parma, Viale delle Scienze 7/A, 43124 Parma, Italy
$^e$ INFN Gruppo Collegato di Parma, Viale delle Scienze 7/A, 43124 Parma, Italy
$^f$ Aix Marseille Univ, Université de Toulon, CNRS, CPT, Marseille, France
E-mail: mbonici@uwaterloo.ca
Abstract. We present the official release of the EFfective Field theORy surrogaTe (Effort.jl),
a novel and efficient emulator designed for the Effective Field Theory of Large-Scale Structure
(EFTofLSS). This tool combines state-of-the-art numerical methods and clever preprocessing
strategies to achieve exceptional computational performance without sacrificing accuracy. To
validate the emulator reliability, we compare Bayesian posteriors sampled using Effort.jl
via Hamiltonian Monte Carlo methods to those sampled using the widely-used pybird code
via the Metropolis-Hastings sampler. On a large-volume set of simulations, and on the BOSS
dataset, the comparison confirms excellent agreement, with deviations compatible with Monte
Carlo noise. Looking ahead, Effort.jl is poised to analyze next-generation cosmological
datasets and to support joint analyses with complementary tools.
Keywords: cosmological parameters from LSS, power spectrum, redshift surveys, gradient-
based methods
1Corresponding author
Contents
1 Introduction
2 The Effort.jl Emulator
2.1 Neural Network Architecture and Core Functionalities
2.2 Bias Treatment and Emulator Structure
2.3 Preprocessing Strategy
2.4 Observational Effects
2.4.1 Computation of Additional Quantities
2.4.2 Alcock-Paczynski (AP) Effect
2.4.3 Window Mask Convolution
3 Cosmological Inference Framework
3.1 Probabilistic Programming with Turing.jl
3.2 The Hamiltonian Monte Carlo Sampler
3.3 MicroCanonical Hamiltonian Monte Carlo Sampler
4 Results
4.1 Accuracy checks and preprocessing impact
4.2 Application to the PT-challenge simulations and BOSS
5 Conclusions
A The galaxy power spectrum in the EFTofLSS
1 Introduction
In the era of precision cosmology, the analysis of Large Scale Structure (LSS) surveys plays a
pivotal role in understanding the universe’s contents and dynamics, from its birth to late-time
accelerated expansion. Forthcoming datasets from collaborations such as the Dark Energy
Spectroscopic Instrument (DESI) [1–3] and Euclid [4] promise to probe the universe’s LSS
with unprecedented accuracy. These datasets present opportunities to explore both the stan-
dard cosmological model and its potential extensions with unparalleled precision. However,
they also impose new computational demands because of the increased dimensionality of the
posterior distribution to explore, requiring tools that can efficiently handle the scale and
complexity of these analyses.
One promising technique that has emerged to address these computational challenges is
emulation. An emulator is a surrogate model that can approximate the outputs of compu-
tationally expensive models, but at a significantly lower computational cost. This approach
offers a twofold benefit: it accelerates the computation and allows for the application of more
advanced optimization and sampling techniques, in particular gradient-based methods. In
recent years, emulators have proven to be highly successful in domains with computationally
demanding theoretical models, such as in cosmology.
After the first seminal papers on this topic [5–7], several emulators have been produced
in the literature, emulating the output of Boltzmann solvers such as CAMB [8] or CLASS [9],
with applications ranging from the Cosmic Microwave Background (CMB) [10–14], the linear
matter power spectrum [11,15–19], galaxy power spectrum multipoles [17,19–22], and the
galaxy survey angular power spectrum [23–29].
While the acceleration of likelihood evaluations through emulation marks a significant
step forward, it is only one facet of the broader efficiency gains that emulators can bring
to cosmological analyses. Emulators enable the use of gradient-based samplers, which are
particularly well-suited to high-dimensional and complex parameter spaces. These samplers,
such as Hamiltonian Monte Carlo (HMC) [30,31] and its variants, exploit the differentia-
bility of the model to explore parameter spaces more efficiently than gradient-free methods.
As a result, they significantly reduce the number of evaluations required to reach chain con-
vergence, especially for complex target distributions, offering substantial improvements in
computational performance.
This capability has led to a growing body of work employing gradient-based samplers in
cosmology, with applications to CMB and LSS analyses [12,14,25,32–42]. The increasing
adoption of these approaches reflects the potential of gradient-based methods to accelerate
inference in challenging, high-dimensional models.
The primary method for computing log-likelihood gradients is known as algorithmic
differentiation (AD), or automatic differentiation [43–45]. At its core, AD is built on a
straightforward concept: any computer program can be interpreted as a sequence of elemen-
tary functions, each with a known analytical derivative¹. The task of an AD engine is to apply
the chain rule to combine these derivatives systematically. While AD is capable of differen-
tiating highly complex programs, it can still benefit from higher-level mathematical insights.
When custom, hand-written derivatives are provided, significant speed improvements can be
achieved. A classic example is matrix multiplication; although this operation can be broken
down into individual sums and products, each of which is differentiable, it is far more efficient
to express its derivative as another matrix product, which achieves the same result with fewer
computational resources [48].
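As an illustration, the sketch below shows how such a hand-written reverse rule looks with Julia's ChainRulesCore.jl. ChainRulesCore already ships an equivalent rule for `*`, so the snippet is purely didactic: the pullback is expressed as two matrix products rather than a composition of scalar derivatives.

```julia
using ChainRulesCore

# Illustrative hand-written reverse rule for C = A * B: the pullback is
# expressed as two matrix products instead of differentiating every
# scalar sum and product individually.
function ChainRulesCore.rrule(::typeof(*), A::Matrix{Float64}, B::Matrix{Float64})
    C = A * B
    function times_pullback(ΔC)
        ∂A = unthunk(ΔC) * B'   # dL/dA = dL/dC ⋅ Bᵀ
        ∂B = A' * unthunk(ΔC)   # dL/dB = Aᵀ ⋅ dL/dC
        return NoTangent(), ∂A, ∂B
    end
    return C, times_pullback
end
```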
Several codes have already been developed to efficiently compute quantities relevant
for galaxy clustering analyses. Notable examples include Comet [20], Matryoshka [21], and
EmulateLSS [49]. These frameworks have been successfully applied to a variety of cosmological
datasets, demonstrating their effectiveness in this field. However, while all these frameworks
are differentiable, none of them has been explicitly used in synergy with gradient-based sam-
plers.
In this context, we introduce Effort.jl, a Julia [50] package that builds on these
previous works with a focus on computational performance and integration with gradient-
based algorithms. One of the key strengths of Effort.jl is its adaptability: observational
effects—such as the Alcock-Paczynski (AP) effect and the survey mask—can be either in-
corporated into the data generated to train the emulator or included a posteriori. While
the former approach is computationally more efficient, as no post-processing of the emula-
tor output is required, the latter is more flexible as the output can be adapted without the
need to retrain the emulator. Furthermore, great care has been devoted to ensuring Effort.jl’s
compatibility with AD engines, enabling efficient differentiation throughout the entire work-
flow. While previous codes have laid the groundwork, Effort.jl offers a distinct advantage
for analyses centered on gradient-based techniques, providing a robust and flexible toolkit
tailored to the evolving needs of modern cosmological research.
¹ A possible exception is represented by programs with discrete stochastic behavior; to deal with this
scenario, tailored approaches have been developed, as in [46,47].
Through its interface with the probabilistic programming language (PPL) Turing.jl,²
Effort.jl can be used in conjunction with gradient-based sampling and optimization tech-
niques, making it suitable for computing posteriors in Bayesian analyses as well as profile
likelihoods (and best-fit points) for frequentist constraints.
The design of Effort.jl prioritizes computational efficiency without sacrificing accu-
racy, enabling the rapid exploration of parameter spaces for the analysis of large cosmological
datasets. To validate the performance of Effort.jl, we applied it to the PT-challenge sim-
ulations [51] and the BOSS data [52], finding no significant deviations from more traditional
pipelines that combine CLASS or CAMB with an EFTofLSS code.
This paper presents the technical details of Effort.jl, the methodologies behind its
implementation, and its application to both current and future cosmological surveys, with
particular emphasis on the upcoming datasets from DESI and Euclid.
This paper is structured as follows: in Sec. 2, we introduce the core features of Effort.jl,
detailing its architecture, preprocessing strategies, and the implementation of observational
effects. Sec. 3 describes the probabilistic programming framework and the gradient-based
sampling methods employed for cosmological inference. In Sec. 4.1, we validate the perfor-
mance of Effort.jl through applications to the PT-challenge simulations and the BOSS
dataset, comparing its results against standard pipelines. Finally, in Sec. 5, we present our
conclusions and outline potential future developments, including applications to upcoming
surveys and extensions of the emulator framework.
2 The Effort.jl Emulator
In this section, we describe the core features of the Effort.jl emulator, highlighting its archi-
tecture, preprocessing strategies, training approach, and the implementation of observational
effects. Effort.jl enables fast computation of the galaxy power spectrum at 1-loop in the
EFTofLSS (we refer the reader to Appendix A for a brief review) and seamless integration
with AD systems. We discuss its efficient handling of bias parameters, its neural network
(NN) architecture, and the methods used to incorporate observational effects, ensuring high
precision and computational performance.
2.1 Neural Network Architecture and Core Functionalities
Effort.jl³ is a software package developed in Julia [53], a high-level, high-performance pro-
gramming language tailored for technical computing. The choice of Julia allows Effort.jl
to achieve exceptional performance, with the ability to compute the power spectrum multi-
poles in approximately 15 µs⁴. Additionally, Julia’s support for various AD systems plays a
crucial role in enabling efficient sampling, a key feature of the use of Effort.jl for cosmo-
logical parameter inference.
The key functionalities useful for building emulators, such as commands for the instan-
tiation and execution of trained emulators and postprocessing of the emulators’ output, are
abstracted in an external package, AbstractCosmologicalEmulators.jl.⁵ This allows us to
build other emulators such as Capse.jl, a surrogate model for CMB observables [14], with
minimal effort.
² https://turinglang.org/
³ https://github.com/CosmologicalEmulators/Effort.jl
⁴ All the benchmarks of this paper are performed locally using a single core of an i7-13700H CPU.
⁵ https://github.com/CosmologicalEmulators/AbstractCosmologicalEmulators.jl
The AbstractCosmologicalEmulators.jl package integrates two different NN libraries:
SimpleChains.jl⁶ and Lux.jl.⁷ SimpleChains.jl offers superior performance on central
processing units (CPUs), while Lux.jl extends compatibility to graphics processing units
(GPUs).
Moreover, AbstractCosmologicalEmulators.jl’s methods for emulator instantiation
via JSON configuration files eliminate the need for pickling or serialization of trained NNs.
This approach ensures robustness, portability, and compatibility across different computa-
tional environments.
In addition to the NN architecture, Effort.jl handles observational effects such as
survey window masking and the Alcock-Paczynski (AP) effect. These effects are implemented
using tools from the Julia ecosystem, ensuring that all steps are compatible with the AD
framework.
Thanks to these design choices, Effort.jl is easy to train even on standard hardware
such as CPUs, making it particularly efficient and accessible and avoiding the need for the
large-scale hardware setups typically required by more resource-demanding models.
2.2 Bias Treatment and Emulator Structure
Effort.jl is designed to emulate the galaxy power spectrum multipoles at 1-loop in the
EFTofLSS. The computation is carried out as follows:
$$P_\ell(k;\theta) = \sum_{i,j} b_i b_j P_{ij,\ell}(k;\theta) + S_\ell(k). \qquad (2.1)$$
Here, $P_\ell(k;\theta)$ represents the power spectrum multipoles as a function of the wavenumber $k$
and of $\theta$, the cosmological parameters and redshift, which control the cosmological model. The
terms $P_{ij,\ell}(k;\theta)$ are the individual components of the power spectrum multipoles, the $b_i$ are
biases and counterterms, and $S_\ell(k)$ represents the stochastic terms. By treating biases and
counterterms analytically, Effort.jl (as pybird does) decouples the elements depending on
cosmological parameters from the other ones, significantly reducing the dimensionality of the
emulated space. The stochastic part is constructed analytically by the code and does not
need to be emulated.
Other packages, such as EmulateLSS [49], incorporate bias parameters directly into the
emulation process, which can lead to the need for more complex NNs. In contrast, Effort.jl
follows a strategy similar to Comet [20] and Matryoshka [21], where bias parameters are
handled analytically. This reduces the complexity of the NNs, allowing for faster training and
inference, while the small computational overhead of bias inclusion remains negligible.
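As an illustration, a minimal sketch of this contraction is shown below; the array names are ours and do not reflect Effort.jl's actual API.

```julia
# Minimal sketch of the bias contraction in Eq. (2.1); names are
# illustrative. Pij[i, j, :] holds the emulated components P_{ij,ℓ}(k)
# on a k-grid, b the biases/counterterms, S the analytic stochastic term.
function combine_multipole(Pij::Array{Float64,3}, b::Vector{Float64}, S::Vector{Float64})
    nb = length(b)
    Pℓ = copy(S)                           # start from the stochastic term
    for i in 1:nb, j in 1:nb
        @views Pℓ .+= b[i] * b[j] .* Pij[i, j, :]
    end
    return Pℓ
end
```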
2.3 Preprocessing Strategy
Effort.jl employs a carefully optimized preprocessing pipeline to ensure both accuracy and
efficiency in the emulator. The linear matter power spectrum is given by:
$$P_\mathrm{lin}(k, z) = A_s\, k^{-3} \left(\frac{k}{k_0}\right)^{n_s - 1} T^2(k, z)\, D^2(z), \qquad (2.2)$$
where $A_s$ and $n_s$ set the initial power spectrum conditions, $k_0$ is the pivot wavenumber, $T(k, z)$
is the matter transfer function, and $D(z)$ is the scale-independent growth factor.
⁶ https://github.com/PumasAI/SimpleChains.jl
⁷ https://github.com/LuxDL/Lux.jl
The $P_{11}$ and counterterm ($P_\mathrm{ct}$) components are linear in $P_\mathrm{lin}(k, z)$, and therefore scale
proportionally to $A(z) = A_s D^2(z)$, while the loop component is quadratic in $P_\mathrm{lin}(k, z)$,
making it proportional to $A^2(z)$, up to the infrared (IR) resummation [54,55] and the effects
of massive neutrinos. To account for this structure, we rescale the $P_{11}$ and $P_\mathrm{ct}$ terms by $A(z)$,
and the loop term by $A^2(z)$.
While this rescaling significantly reduces the complexity, it is not perfect, due to the IR
resummation. To address these residuals, Effort.jl combines the analytical rescaling with
machine learning, allowing the NNs to learn the remaining non-factorizable dependencies on
$A_s D(z)$. This hybrid approach ensures high precision with negligible computational overhead.
As shown in [14], this method enhances the emulator’s accuracy.
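A minimal sketch of the rescaling, with illustrative names rather than Effort.jl's internal ones, is:

```julia
# Physics-based output rescaling sketch: divide the training targets by
# their leading amplitude dependence so the NN only learns the residual,
# weakly parameter-dependent shape.
A(As, D) = As * D^2                               # A(z) = A_s D²(z)

rescale_P11(P11, As, D)    = P11   ./ A(As, D)    # P11 and Pct scale ∝ A(z)
rescale_loop(Ploop, As, D) = Ploop ./ A(As, D)^2  # loop part scales ∝ A²(z)

# At prediction time the inverse is applied to the NN output:
unscale_P11(P11_nn, As, D) = P11_nn .* A(As, D)
```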
This rescaling is particularly effective for extended models with additional parameters
that primarily influence the amplitude of the linear matter power spectrum (e.g. modifica-
tions to the dark energy equation of state), since the analytical rescaling efficiently captures
their impact on the background evolution.
In Comet a similar method is used, known as evolution mapping, which exploits parameter
degeneracies that primarily affect the amplitude of the linear power spectrum [56]. While
Comet’s approach is used to reduce the number of input features, Effort.jl leverages it
primarily to reduce the dynamic range of the output features. This is especially advantageous
in cases with scale-dependent growth, such as when accounting for massive neutrinos, where
Effort.jl’s NN learns the residual effects not captured by the rescaling, offering flexibility
in modeling non-linear phenomena. The impact of this approach is assessed and shown in
Sec. 4.1.
The growth factor D(z)is thus used to postprocess the output of the NNs, and its
computation will be described in the next subsection.
Finally, to enhance numerical stability, Effort.jl applies min-max normalization to
both input and output features. This ensures consistent scaling across all features, preventing
any feature from dominating due to differences in magnitude, which leads to a more stable
and efficient NN training process and improved overall performance [57].
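For concreteness, a minimal sketch of this normalization (and its inverse, needed at prediction time) is:

```julia
# Min-max normalization sketch: map each feature to [0, 1] with the
# training-set extrema; the same (xmin, xmax) pair must be reused at
# prediction time to undo the scaling.
minmax_scale(x, xmin, xmax)   = (x .- xmin) ./ (xmax .- xmin)
minmax_unscale(y, xmin, xmax) = y .* (xmax .- xmin) .+ xmin
```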
2.4 Observational Effects
In this subsection, we describe the treatment of key observational effects in Effort.jl, in-
cluding the computation of required quantities such as the growth factor, the inclusion of the
Alcock-Paczynski (AP) effect, and the window mask convolution.
2.4.1 Computation of Additional Quantities
The accurate calculation of background quantities is fundamental for interpreting large-scale
structure data. In Effort.jl, we compute several key cosmological background quantities:
the Hubble factor H(z), the radial comoving distance r(z), the growth factor D(z), and the
growth rate f(z). These quantities are essential inputs for the inclusion of the AP effect (and,
in the case of f(z), redshift-space distortions), as we will discuss in the next subsection.
There are two main approaches to handling background quantities in cosmological anal-
yses. One approach, adopted by packages such as Matryoshka and CosmoPower [11], is to
emulate the background quantities. While this can speed up calculations, it requires an
additional emulator and a well-trained model to ensure precision across the parameter space.
In contrast, Effort.jl computes these background quantities dynamically by integrat-
ing the relevant equations at runtime. This approach avoids the need for additional emulators,
simplifying the pipeline and reducing dependency on pre-computed models. With careful im-
plementation, these calculations can be performed efficiently, providing precise results in a
short amount of time.
When computing the Hubble factor, Effort.jl accounts for contributions from non-
relativistic matter, evolving DE, radiation, and massive neutrinos. In particular, for massive
neutrinos, we incorporate the transition from relativistic to non-relativistic regimes, follow-
ing the treatment described in [58,59]; here we review the most important equations for
completeness. The dimensionless Hubble factor $E(z) = H(z)/H_0$ is given by
$$E^2(z) = \Omega_{\gamma,0}(1+z)^4 + \Omega_{c,0}(1+z)^3 + \Omega_\nu(z) + \Omega_{\mathrm{DE},0}(1+z)^{3(1+w_0+w_a)} \exp\left(-\frac{3 w_a z}{1+z}\right). \qquad (2.3)$$
$\Omega_{\gamma,0}$, $\Omega_{c,0}$, and $\Omega_{\mathrm{DE},0}$ are the abundances of radiation, non-relativistic matter, and DE, re-
spectively, and the DE equation of state is parametrized as $w(a) = w_0 + w_a(1-a)$ [60,61].
The contribution from massive neutrinos is given by
$$\Omega_\nu(a) = \frac{15}{\pi^4}\,\Gamma_\nu^4\,\frac{\Omega_{\gamma,0}}{a^4} \sum_{j=1}^{N_\nu} F\!\left(\frac{m_j a}{k_B T_{\nu,0}}\right), \qquad (2.4)$$
where $m_j$ is the mass of each neutrino eigenstate, $T_{\nu,0}$ is the current temperature of the
neutrino background, $\Gamma_\nu \equiv T_{\nu,0}/T_{\gamma,0}$ is the neutrino-to-photon temperature ratio today, and
the function $F(y)$ is defined as
$$F(y) \equiv \int_0^\infty dx\, \frac{x^2 \sqrt{x^2 + y^2}}{1 + e^x}. \qquad (2.5)$$
The parameter $\Gamma_\nu$ takes into account corrections to the instantaneous decoupling and is
given by
$$\Gamma_\nu^4 = \frac{1}{3} N_\mathrm{eff} \left(\frac{4}{11}\right)^{4/3}, \qquad (2.6)$$
where we take $N_\mathrm{eff} = 3.044$ [62,63].
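A minimal sketch of the momentum integral of Eq. (2.5), here evaluated with the QuadGK.jl adaptive quadrature package (one concrete choice; not necessarily Effort.jl's internal implementation), is:

```julia
using QuadGK

# Sketch of the neutrino momentum integral of Eq. (2.5), evaluated with
# adaptive Gauss-Kronrod quadrature on the semi-infinite domain.
F(y) = quadgk(x -> x^2 * sqrt(x^2 + y^2) / (1 + exp(x)), 0.0, Inf)[1]

F(0.0)  # relativistic limit: 7π⁴/120 ≈ 5.6822
```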
The comoving distance $r(z)$ is computed as
$$r(z) = c \int_0^z \frac{dz'}{H(z')}. \qquad (2.7)$$
We implement two numerical approaches to solve the integral for $r(z)$. First, we employ
an adaptive Gaussian quadrature scheme, which ensures a high-precision calculation but is
more computationally expensive; this method is ideal for cases requiring extremely accurate
results. Second, we use a Gauss-Lobatto rule with a fixed number of points (9), which provides
sufficient precision for most practical purposes, yielding a relative accuracy better than
$10^{-5}$ compared to the adaptive quadrature scheme.⁸ We further validated our results against
CAMB and pyccl [64], finding agreement up to the fifth digit.
⁸ The adaptive Gaussian quadrature scheme is used within our suite of unit tests to verify that the faster
Gauss-Lobatto method is accurate enough for typical cosmological applications.
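A minimal sketch of the fixed-order rule, here using FastGaussQuadrature.jl as one concrete choice (an assumption on the library; the paper does not name it), is:

```julia
using FastGaussQuadrature

# Sketch of Eq. (2.7) with a 9-point Gauss-Lobatto rule, mapping the
# nodes from [-1, 1] to [0, z]; H is any callable H(z) in km/s/Mpc.
const c_km_s = 299792.458

function comoving_distance(z, H; n = 9)
    x, w = gausslobatto(n)          # nodes and weights on [-1, 1]
    zs = @. (x + 1) * z / 2         # map to [0, z]
    return c_km_s * z / 2 * sum(w ./ H.(zs))
end
```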
The growth factor $D(a)$ is computed by solving the following second-order differential
equation:
$$D''(a) + \left(2 + \frac{E'(a)}{E(a)}\right) D'(a) = \frac{3}{2}\,\Omega_m(a)\, D(a), \qquad (2.8)$$
where $D' \equiv dD/d\ln a$.
We start solving this equation at high redshift, nominally $z = 138$, deep in the matter-
domination regime. The appropriate initial conditions in this regime are given by:
$$D(z_i) = a_i, \qquad D'(z_i) = a_i, \qquad (2.9)$$
where $a_i$ corresponds to the initial scale factor.
To integrate this equation, we use the Tsitouras algorithm [65], a Runge-Kutta method
known for its efficiency and accuracy.
The solver is implemented in the DifferentialEquations.jl package
(https://github.com/SciML/DifferentialEquations.jl), part of the Julia Scientific Machine
Learning (SciML) suite, which provides a high-performance environment for solving
differential equations [66].
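A minimal sketch of this step is given below; we read $E'(a)/E(a)$ as $d\ln E/d\ln a$, and the callables `dlnE_dlna` and `Ωm` are assumptions standing in for Effort.jl's internal background functions.

```julia
using OrdinaryDiffEq  # solver suite behind DifferentialEquations.jl

# Sketch of Eq. (2.8), integrated in ln(a) with u = [D, D'] and primes
# denoting d/dln a; Tsit5 is the Tsitouras 5/4 Runge-Kutta method.
function growth_factor(dlnE_dlna, Ωm; zi = 138.0)
    lnai, lnaf = log(1 / (1 + zi)), 0.0
    function rhs!(du, u, _, lna)
        a = exp(lna)
        du[1] = u[2]
        du[2] = 1.5 * Ωm(a) * u[1] - (2 + dlnE_dlna(a)) * u[2]
    end
    ai = exp(lnai)
    prob = ODEProblem(rhs!, [ai, ai], (lnai, lnaf))  # matter-era ICs: D = D' = a_i
    return solve(prob, Tsit5(); reltol = 1e-8, abstol = 1e-8)
end
```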
The solver is made fully differentiable, supporting both forward and backward automatic
differentiation (AD) systems. This ensures that gradients can be computed accurately and
efficiently during optimization processes that require differentiation of the solver output, such
as when integrating Effort.jl with gradient-based pipelines.
The growth rate $f(z)$ is related to the evolution of the growth factor through
$$f(z) \equiv \frac{d\ln D(z)}{d\ln a}, \qquad (2.10)$$
where $a$ is the scale factor. The solver also retrieves the first derivative of the growth factor,
$D'(a)$, which can be used directly to compute $f(z)$. These background quantities are dynam-
ically computed for each cosmological model considered in Effort.jl, providing a flexible
and precise foundation for modeling galaxy clustering. A comparison between the growth
rate computed by CLASS and Effort.jl is shown in Fig. 1; for this test, we considered a
cosmological model with a single massive neutrino with $M_\nu = 0.5$ eV and an extreme scenario
for the DE equation of state, with $w_0 = -2$ and $w_a = -3$.
2.4.2 Alcock-Paczynski (AP) Effect
The Alcock-Paczynski (AP) effect is a geometric distortion that arises when converting red-
shift space into physical space, since one must assume a reference cosmological model [67].
This effect causes anisotropies in the observed galaxy clustering, affecting the inferred dis-
tances along and across the line-of-sight. In this work, we follow the notation used in [68,69].
To account for the AP effect in Effort.jl, we calculate the distortion factors as fol-
lows (these are the appropriate definitions when using wavenumber units of $h/\mathrm{Mpc}$, which
we assume throughout):
$$q_\perp = \frac{D_A(z)\, H(z=0)}{D_A^\mathrm{ref}(z)\, H^\mathrm{ref}(z=0)}, \qquad q_\parallel = \frac{H^\mathrm{ref}(z)/H^\mathrm{ref}(z=0)}{H(z)/H(z=0)}. \qquad (2.11)$$
Here, $q_\parallel$ and $q_\perp$ represent the distortions along and transverse to the line-of-sight, re-
spectively. These scaling factors adjust for the differences between the reference and true
cosmology, where $D_A(z)$ is the angular diameter distance and $H(z)$ is the Hubble factor.
Figure 1: Comparison of the growth rate $f(z)$ as computed by CLASS and Effort.jl. The
two codes show remarkably good agreement, with differences smaller than 1% over the entire
redshift range considered.

The transformation of the power spectrum multipoles due to the AP effect is given by:
$$P_\ell(k^\mathrm{ref}) = \frac{2\ell+1}{2\, q_\parallel q_\perp^2} \int_{-1}^{1} d\mu^\mathrm{ref}\; P\!\left(k(k^\mathrm{ref}, \mu^\mathrm{ref}),\, \mu(\mu^\mathrm{ref})\right) \mathcal{L}_\ell(\mu^\mathrm{ref}). \qquad (2.12)$$
This equation expresses the power spectrum multipoles $P_\ell(k)$, where $\mu^\mathrm{ref}$ is the cosine
of the angle between the wavevector $k$ and the line-of-sight in the reference cosmology. The
function $\mathcal{L}_\ell(\mu^\mathrm{ref})$ is the Legendre polynomial of order $\ell$, and the transformation adjusts the
power spectrum to account for differences between the true and reference cosmology.
The relations between the wavevector $k$ and angle $\mu$ in the true and reference cosmologies
are given by:
$$k = \frac{k^\mathrm{ref}}{q_\perp}\left[1 + (\mu^\mathrm{ref})^2\left(\frac{1}{F^2} - 1\right)\right]^{1/2}, \qquad \mu = \frac{\mu^\mathrm{ref}}{F}\left[1 + (\mu^\mathrm{ref})^2\left(\frac{1}{F^2} - 1\right)\right]^{-1/2}. \qquad (2.13)$$
Here, $F = q_\parallel/q_\perp$ is the anisotropy factor, capturing the relative stretching between the
line-of-sight and transverse directions due to cosmological distortions. These transformations
ensure that the clustering signal is correctly interpreted in terms of the true cosmological
model.
– 8 –
We implemented two methods to integrate the AP effect. We use an adaptive Gaus-
sian quadrature scheme for high precision, although this approach is more computationally
expensive. Alternatively, we use a Gauss-Lobatto quadrature with 5 points, which achieves
near floating-point precision with reduced computational cost.
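A minimal sketch of the fixed-order projection of Eq. (2.12), again using FastGaussQuadrature.jl as one concrete choice, with illustrative names that do not reflect Effort.jl's actual API:

```julia
using FastGaussQuadrature

# Sketch of the AP projection integral of Eq. (2.12) with a 5-point
# Gauss-Lobatto rule over μref ∈ [-1, 1]; Pkμ(k, μ) is the 2D power spectrum.
legendre(ℓ, μ) = ℓ == 0 ? 1.0 : ℓ == 2 ? (3μ^2 - 1) / 2 : (35μ^4 - 30μ^2 + 3) / 8

function ap_multipole(ℓ, kref, Pkμ, qpar, qperp; n = 5)
    x, w = gausslobatto(n)
    F = qpar / qperp
    vals = map(x) do μref
        common = sqrt(1 + μref^2 * (1 / F^2 - 1))
        k = kref / qperp * common          # Eq. (2.13), true wavenumber
        μ = μref / F / common              # Eq. (2.13), true angle cosine
        Pkμ(k, μ) * legendre(ℓ, μref)
    end
    return (2ℓ + 1) / (2 * qpar * qperp^2) * sum(w .* vals)
end
```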
The main challenge arises when incorporating AD. To compute the AP effect, we need
to interpolate the emulator’s output onto a new k-grid, whose specific values depend on
cosmological parameters. For this, we employ a quadratic spline interpolation, as in pybird.
However, calculating the Jacobians of the splines with respect to their input leads to a sparse
Jacobian matrix, which the AD system does not naturally optimize for.
Although the AP calculation itself takes approximately 30 µs, backward propagation of
gradients through the AP calculation initially took around 100 ms, significantly degrading
Effort.jl’s performance.
To resolve this, we reimplemented the spline interpolation and wrote custom backward
differentiation rules that take advantage of the sparsity pattern. This optimization reduced
the time for calculating the Jacobians to approximately 200 µs, a speed-up of nearly three
orders of magnitude.
We validated the correctness of these custom differentiation rules using both forward-
mode AD and finite difference methods.
2.4.3 Window Mask Convolution
The finite size and irregular geometry of surveys introduce an observational window function
that modulates the observed clustering signal. Accounting for this requires a multiplication
of the theory correlation function with a window mask in real space, or a convolution of the
window mask with the theory power spectrum in Fourier space.
In Effort.jl, we adopt the approach of [70], implementing the window convolution as
an efficient array contraction. The window function, derived from the survey mask, is applied
to the theoretical predictions at each step of the MCMC sampling, ensuring that observational
effects are consistently incorporated into the analysis.
The convolution is executed using Tullio.jl [71], a high-performance framework for
array operations. The entire process is completed in a few µs. To further enhance perfor-
mance, we have written custom differentiation rules, enabling the automatic differentiation
(AD) system to efficiently manage the convolution operation.
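The contraction itself is a single line; a minimal sketch (with illustrative names) is:

```julia
using Tullio

# Sketch of the window convolution as an array contraction,
# P_obs = W · P_th, with W built from the survey mask; Tullio compiles
# this into an efficient loop over the flattened multipole vector.
window_convolve(W, Pth) = @tullio Pobs[i] := W[i, j] * Pth[j]
```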
By accurately incorporating these observational effects, Effort.jl delivers precise pre-
dictions for the power spectrum, supporting robust cosmological parameter inference.
3 Cosmological Inference Framework
3.1 Probabilistic Programming with Turing.jl
For cosmological parameter inference, we employ a probabilistic programming language (PPL)
which is built in Julia, namely Turing.jl [72]. PPLs allow users to specify probabilistic
models that include the necessary priors and likelihoods, streamlining the process of drawing
samples from the posterior distribution. This flexibility is particularly useful for complex
models, such as those involving cosmological parameters, as it enables integration with ad-
vanced samplers and AD frameworks.
Turing.jl provides a straightforward way to define probabilistic models by allowing
users to declare their priors and construct the likelihood function. Once the model is defined,
Turing.jl takes care of deriving the posterior and can utilize various sampling algorithms for
inference. A key strength of Turing.jl is its compatibility with gradient-based samplers, such
as Hamiltonian Monte Carlo (HMC) and its variants, including the No-U-Turn Sampler
(NUTS) and MicroCanonical Langevin Monte Carlo [73].
In this work, we use NUTS and MicroCanonical Langevin Monte Carlo [73] as samplers,
both accessed through the Turing.jl interface. This approach has been successfully
employed in various cosmological studies, as demonstrated in [14,25,38]. These samplers
efficiently explore the posterior distributions of cosmological and nuisance parameters, taking
full advantage of the differentiability of Effort.jl.
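A minimal sketch of such a probabilistic model is given below; `predict_multipoles` is a hypothetical helper standing in for the actual Effort.jl prediction call, and the priors are illustrative.

```julia
using Turing, LinearAlgebra

# Minimal sketch of a Turing.jl model for a Gaussian power-spectrum
# likelihood; names and priors are assumptions, not Effort.jl's API.
@model function pk_model(data, cov, emulator)
    ln10As ~ Uniform(2.5, 3.5)
    ns     ~ Uniform(0.8, 1.1)
    h      ~ Uniform(0.5, 0.8)
    μ = predict_multipoles(emulator, [ln10As, ns, h])  # hypothetical helper
    data ~ MvNormal(μ, Symmetric(cov))
end

# Gradient-based sampling, e.g.:
# chain = sample(pk_model(data, cov, emu), NUTS(500, 0.65), 2_000)
```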
3.2 The Hamiltonian Monte Carlo Sampler
Hamiltonian Monte Carlo (HMC), also known as Hybrid Monte Carlo, is a state-of-the-art
method for drawing samples from a probability distribution [31]. It is particularly useful
when the distribution is defined over a high-dimensional space and/or has a complex geometry.
The key idea behind HMC is to introduce a momentum variable, $p$, for each parameter
in the model, which is in turn represented as a position variable $q$. A potential energy $V(q)$
is introduced, given as minus the logarithm of the joint likelihood $\log \mathcal{L}$:
$$-\log P(q) = V(q), \qquad H(q, p) = V(q) + U(q, p), \qquad (3.1)$$
where the kinetic term is given by:
$$U(q, p) = p^T M^{-1} p. \qquad (3.2)$$
These momentum variables, together with the original parameters, form a Hamiltonian
system. The system evolves according to Hamilton’s equations, which in the vanilla case read:
$$\frac{dp}{dt} = -\frac{\partial V}{\partial q} = \frac{\partial \log P}{\partial q}, \qquad \frac{dq}{dt} = +\frac{\partial U}{\partial p} = M^{-1} p. \qquad (3.3)$$
In the vanilla HMC, the system evolves deterministically for a fixed amount of time,
following a trajectory that is likely to stay in regions of high probability under the target
distribution. The trajectory is usually numerically integrated with a symplectic integrator.
The deterministic evolution in HMC allows it to explore the target distribution more
efficiently. However, it requires the ability to compute gradients of the log-probability of the
target distribution, which can be computationally expensive for complex models.
Vanilla HMC is a powerful sampling method that combines ideas from physics and
statistics and can efficiently draw samples from high-dimensional distributions, but requires
tuning and may be computationally expensive for complex models. One of the main challenges
with the vanilla HMC is the need to carefully choose and tune the step size and the number
of steps. If these parameters are not set correctly, the HMC can either miss important regions
of the parameter space or waste computational resources by taking too many steps.
The No-U-Turn Sampler (NUTS) is a self-tuning variant of the Hamiltonian Monte Carlo
(HMC) method, designed to address some of the shortcomings of the vanilla HMC [30]. The
name “No-U-Turn” comes from the sampler’s unique feature of stopping its trajectory once it
starts to turn back on itself, hence avoiding unnecessary computations.
NUTS tunes HMC hyper-parameters by adaptively choosing the trajectory length. It
starts with a trajectory of length 1, and then doubles the length as long as the trajectory
does not make a U-turn. A U-turn is detected when the trajectory starts moving back toward
its starting point, i.e. when the momentum becomes anti-aligned with the displacement
accumulated along the trajectory.
This adaptive algorithm ensures that NUTS spends more time exploring new regions of
the parameter space and less time retracing its steps. It also removes the need for manual
tuning of the trajectory length, making NUTS more user-friendly than vanilla HMC.
However, like HMC, NUTS requires the ability to compute gradients of the log-probability,
which can be computationally expensive for complex models. In our case, the computation
of the log-likelihood gradient is performed using one of the Julia AD systems.
3.3 MicroCanonical Hamiltonian Monte Carlo Sampler
The MicroCanonical Hamiltonian Monte Carlo (MCHMC) sampler [73,74] is a variation of
the traditional HMC method. Unlike standard HMC, which relies on a Metropolis-Hastings
correction step to ensure detailed balance, MCHMC operates within the microcanonical en-
semble, where the Hamiltonian energy $H(q, p)$ is conserved throughout the entire trajectory.
Removing the need for Metropolis corrections reduces computational overhead and increases
sampling efficiency.
MCHMC also employs a variable mass Hamiltonian in Eq. (3.2), which adapts to the
geometry of the posterior distribution. In regions of higher probability density, the parti-
cle moves slower, improving the sampler’s capacity to explore complex and high-curvature
posterior distributions.
This method is especially well-suited for cosmological inference tasks, where the pa-
rameter space is often large and complex. MCHMC has been demonstrated to outperform
traditional HMC methods in terms of sampling efficiency for such high-dimensional prob-
lems, providing faster convergence while maintaining accuracy in the estimation of posterior
distributions [14,73,75].
4 Results
4.1 Accuracy checks and preprocessing impact
Having laid out the structure of Effort.jl, we can now discuss a concrete implementation.
Given its flexible structure, it is possible to use Effort.jl with two different approaches,
i.e. including redshift as a free parameter or working at a specific redshift. While the former
approach is more flexible, as it enables us to train one generic emulator that can be used
in more situations, the latter option naturally leads to a more precise surrogate, as we are
removing one of the input parameters. Considering that surveys like DESI and Euclid divide
galaxies into only a handful of bins with a specific effective redshift, the second approach is
useful in case one needs very high precision. Here we focus on an emulator of the former
kind, since we want to provide a tool that does not require additional training. The training
dataset contained 60,000 pairs of input parameters and $P(k)$’s. We feed the NN the
redshift and cosmological parameters, distributed according to a Latin hypercube in the
parameter ranges described in Table 1. We employ NNs with 5 hidden layers, each of them
with 64 neurons. We used a batch size of 128, and trained for 100,000 epochs using an ADAM
optimizer with an initial learning rate of $10^{-4}$; when using an 8-core CPU, this process took
around one hour of wall-time. We employed a vanilla mean-square-error loss function for the
emulator presented in this section and used in the BOSS analysis, while for the PT-challenge
we used a modified loss function which included a $1/k^2$ weight. The dataset has been split into
80% and 20% for, respectively, the training and test sets. To prevent overfitting, we save the
weights only when the loss on the test set improves.
Parameter            Min     Max
$z$                  0.25    2.0
$\ln 10^{10} A_s$    2.5     3.5
$n_s$                0.8     1.10
$H_0$ [km/s/Mpc]     50.0    80.0
$\omega_b$           0.02    0.025
$\omega_c$           0.08    0.2
$M_\nu$ [eV]         0.0     0.5

Table 1: Parameter bounds for redshift and cosmological parameters.
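For concreteness, a Lux.jl sketch of an architecture matching this description is given below; the activation function, output size, and names are assumptions, and the training loop, batching, and early stopping are omitted.

```julia
using Lux, Optimisers, Random

# Architecture sketch matching the text: 5 hidden layers of 64 neurons,
# 7 inputs (redshift + 6 cosmological parameters of Table 1).
nk = 74  # illustrative length of the output k-grid
model = Chain(Dense(7 => 64, tanh),
              [Dense(64 => 64, tanh) for _ in 1:4]...,
              Dense(64 => nk))
ps, st = Lux.setup(Random.default_rng(), model)
opt_state = Optimisers.setup(Adam(1e-4), ps)  # ADAM, initial learning rate 1e-4
```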
Before showing a few applications of trained emulators, we want to gauge the impact of
the preprocessing procedure outlined in Sec. 2.3.
The question we aim to answer is: how can we improve the performance of our em-
ulators? The straightforward approach is to use a bigger dataset and/or a bigger neural
network, according to the universal approximation theorem [76]. The second approach, which
we investigate here, is a rescaling of the output features that depends on the value of the
input features; this method has already been applied in [14], where the CMB spectra were
rescaled by $A_s$, leading to a reduction of the residuals by a factor of 3, clearly showing the
benefit of this strategy. Here we can go a step further, dividing the output features by the
growth factor as outlined in Sec. 2.3.
In order to assess the impact of rescaling on the observables considered here in a quanti-
tative way, we study the distribution of the percentage residuals in the following four scenarios:
• No rescaling is employed
• We rescale by $A_s$
• We rescale by $D^2(z)$
• We rescale by $A_s D^2(z)$
Furthermore, we consider two training datasets, with 20,000 and 60,000 samples. The results
of this comparison are shown in Fig. 2. As can be seen from the comparison of the dashed and
solid lines, which represent, respectively, the 20,000- and the 60,000-sample training datasets,
a bigger dataset helps in improving the accuracy of the emulator. However, we want
to emphasize that the rescalings have a much bigger impact. This shows how it is possible
to get a high-precision emulator even when using a small neural network: in other studies,
like [11,21,22], the NNs employed had a significantly higher number of neurons.
Finally, it is worth mentioning that our most effective physics-based preprocessing pro-
cedure, while lowering the emulation error, relies on the solution of an ODE that, although
relatively fast (on the order of 150 µs), has now become the main bottleneck of our pipeline
since Effort.jl computes each multipole in around 15 µs. To circumvent this issue, we
employed a symbolic regression approach (using SymbolicRegression.jl [77]) to emulate
the ODE solution by directly seeking a closed-form expression for the growth factor. This
strategy offers two main advantages: first, once the symbolic expression is found, it can be
Figure 2: 95% distribution of the percentage errors. We show the impact of the different
rescaling schemes employed. The dot-dashed and solid lines refer, respectively, to training
datasets with 20,000 and 60,000 elements. As can be seen, a bigger dataset helps in enhancing
the performance, but the rescaling is even more impactful. For this test, we fix the EFT
parameters, using a combination obtained from the measurements employed in [42].
ported to any programming language without loss of precision; second, it greatly speeds up
the preprocessing step. Indeed, evaluating the symbolic expression takes only 200 ns, offering
a dramatic speedup compared to the ODE solution (see also [18,78,79] for other applications
of symbolic regression related to LSS observables).
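A minimal sketch of such a search with SymbolicRegression.jl is shown below; the operator set, iteration count, and placeholder data are assumptions, and the API details may vary across package versions.

```julia
using SymbolicRegression

# Sketch of the search for a closed-form growth factor: X holds the
# input features (parameters × samples) and y the ODE-solver growth
# factors; both are placeholders here.
X = rand(7, 1_000)
y = rand(1_000)
options = Options(binary_operators = (+, -, *, /),
                  unary_operators  = (exp, log))
hall_of_fame = equation_search(X, y; options = options, niterations = 40)
```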
Moreover, while joint analyses can partially amortize the ODE cost by solving it once for
multiple redshifts, it remains desirable to reduce this expense further. In our tests, the sym-
bolic expression we found is accurate at the 0.1% level in the same parameter range explored
by our trained Effort.jl emulator for 99.87% of the validation dataset. Substituting the
ODE solver with the symbolic growth factor does not affect the final emulator performance,
except in the specific case of the monopole emulator trained on 60,000 samples, where we
observe a modest increase in the residual error, which is still comfortably below 0.4%, as can
be seen from Fig. 3. Nonetheless, the symbolic approach proves advantageous in terms of
portability, consistency, and computational efficiency. However, users who do not want to
sacrifice accuracy can still choose to use the ODE solver in Effort.jl.
Figure 3: 95% distribution of the percentage errors, comparing the ODE and the symbolic
regression rescaling. As can be seen, the two approaches give almost indistinguishable results,
with the sole exception of the monopole emulator trained with 60,000 samples, as in that case
the growth factor symbolic emulator error is not negligible compared to the ODE rescaled
emulator error.
Supported by our findings, we advocate for physics-based preprocessing, as it improves
the accuracy of the emulators by leveraging the domain knowledge of scientists.
4.2 Application to the PT-challenge simulations and BOSS
In this section we show the application of Effort.jl to two different sets of measurements,
the PT-challenge and BOSS datasets.
The PT-challenge¹¹ was designed to demonstrate the reliability and robustness of the
EFTofLSS as a powerful and trustworthy tool for analyzing cosmological datasets. To achieve
this, a set of high-precision simulations was generated. This approach provided an ideal testing
ground for validating the accuracy and effectiveness of theoretical models like the EFTofLSS.
The simulations consisted of ten boxes with independent random realizations, each com-
prising $3{,}072^3$ mass elements within a periodic cube of side length $3{,}840\,h^{-1}$ Mpc, resulting in a
total simulated volume of 566 $(h^{-1}\,\mathrm{Gpc})^3$. Initial conditions were generated with second-order
Lagrangian Perturbation Theory (2LPT) and evolved using Gadget2. Particle snapshots
were recorded at six redshifts: $z = 3$, 2, 1, 0.61, 0.51, and 0.38.
Halos were identified with the Rockstar halo finder, and galaxies were probabilistically
assigned based on the virial mass of halos, ensuring that mock galaxies were consistent with
observational data.
The resulting mock galaxy catalogs correspond to redshifts $z = 0.38$ (LOWZ), $z = 0.51$
(CMASS1), and $z = 0.61$ (CMASS2), mirroring the observational data sets; specifically, the
measurements employed in this work are those at $z = 0.61$. These simulations provide a
robust framework for testing cosmological models and statistical methods, owing to their
large volume and the precision of the mock galaxy distributions.¹² Here we analyzed these
simulations up to $k_\mathrm{max} = 0.12\,h/\mathrm{Mpc}$.
¹¹ https://www2.yukawa.kyoto-u.ac.jp/~takahiro.nishimichi/data/PTchallenge/
¹² The Julia version of the PT-challenge likelihood is available here.
The Baryon Oscillation Spectroscopic Survey (BOSS) provides publicly available redshift
catalogs that enable galaxy clustering measurements [80,81]. The power spectrum multipoles
and window function matrices are the ones used in [82].
The BOSS dataset is divided into subsamples based on the Northern and Southern
Galactic Caps (NGC and SGC). We divide them into two redshift bins (CMASS and LOWZ),
corresponding to effective redshifts $z_\mathrm{eff} = [0.57, 0.32]$, respectively. This results in a total of
four sets of multipoles.¹³ We fit the power spectra with $k_\mathrm{max} = 0.23\,h/\mathrm{Mpc}$ for CMASS and
$0.20\,h/\mathrm{Mpc}$ for LOWZ.
¹³ The Julia version of the BOSS likelihood is available here.
We validate the accuracy of our emulators on these datasets by comparing the Bayesian
posteriors produced by Effort.jl and pybird. Specifically, we present the posterior distribu-
tions for the cosmological and EFT parameters that pybird cannot marginalize analytically,
although Effort.jl does not perform the analytical marginalization. To preserve the secrecy
of the true cosmology of the PT-challenge, axis ticks and labels are omitted from the plots
related to that analysis.
Regarding the prior, we used the same prior and analysis settings used in [51] by
the West Coast Team, and we developed a different emulator, tailored to have only 3 input
cosmological parameters (we will not release the trained emulators for this part of the work,
to ensure the input cosmology remains undisclosed to the community). The results demonstrate
remarkable accuracy, with an excellent agreement between the two methods, consistent with
Monte Carlo noise. In terms of performance, the pybird chains were generated using the
MontePython sampler [83] and required several hours to obtain a very high degree of
convergence on multiple CPUs of a computing cluster. By contrast, Effort.jl achieved
convergence with significantly greater efficiency. We ran four chains for both the NUTS and
MCHMC samplers, using 500 and 10,000 burn-in steps and 2,000 and 200,000 accepted steps,
respectively. These chains were initialized using
Pathfinder.jl, a variational inference method that quickly generates samples from the typical
set [46,84]. The entire analysis with Effort.jl required approximately 10 minutes on
a laptop and converged to the same posterior distribution as pybird. The Effective Sample
Size per second (ESS/s) was 1.2 for NUTS and 5.1 for MCHMC. We stress that, given the
large volume of the PT-challenge, this is a very stringent test of the accuracy of our emulator.
A direct comparison of sampling efficiency with pybird was not made, as pybird sam-
ples from a lower-dimensional parameter space. However, even when analyzing a higher-
dimensional space, Effort.jl achieves a sampling efficiency several orders of magnitude
greater than standard analysis pipelines.
For the BOSS analysis, the agreement between Effort.jl and pybird is similarly out-
standing, as shown in Fig. 5. We employed the same priors as in [82], also including the
correlated prior among EFT parameters. The pybird chains, using the MontePython sam-
pler, required a few days on a computing cluster to obtain chains with a very high degree of
convergence. Using Effort.jl, we ran eight chains for both NUTS and MCHMC, with the
same burn-in and accepted steps as in the PT challenge. Chains were again initialized from
aPathfinder.jl run. The total wall-clock time was slightly over one hour, with ESS/s of
0.4 for NUTS and 2.4 for MCHMC.
5 Conclusions
In this work, we introduced Effort.jl, a novel emulator for the EFTofLSS with a strong em-
phasis on computational performance. The design of Effort.jl prioritizes efficiency without
compromising accuracy, leveraging state-of-the-art numerical methods and machine learning
techniques to provide robust predictions for cosmological analyses.
Great care has been devoted to implementing observational effects, such as the Alcock-
Paczynski effect and window mask convolution, ensuring high computational efficiency. This
rigorous implementation enables Effort.jl to handle the complexity of LSS data while main-
taining precise modelling of observational effects. We checked the accuracy of these calcula-
tions both against standard Boltzmann solvers and pybird, finding an excellent agreement.
We explored various preprocessing strategies, showcasing their significant impact on
the emulator accuracy. Notably, we combined numerical methods, such as ODE solving and
symbolic regression, with NN emulators of cosmological observables. This hybrid approach
demonstrates the potential for improving emulator precision and efficiency by leveraging do-
main knowledge readily available to cosmologists.
The coupling of Effort.jl with the Turing.jl framework highlights its versatility, al-
lowing it to efficiently integrate with gradient-based samplers such as NUTS and MCHMC.
Our results demonstrate the ability of Effort.jl to sample Bayesian posteriors very effi-
ciently, reducing computational costs significantly while achieving excellent agreement with
standard pipelines. In particular, this will become of paramount importance when considering
scenarios, like those found in [85–88], where analytical marginalization is not a viable option
and we need to explore the full parameter space.
Looking forward, we plan to apply Effort.jl to the forthcoming datasets from the DESI
survey [3], harnessing its computational efficiency to analyze next-generation cosmological
data with precision. In addition, we foresee significant potential in combining Effort.jl
with tools such as Capse.jl [14], which emulates CMB observables, and Blast.jl [89], which
computes the photometric 3×2pt statistics. This integration will facilitate efficient joint
analyses of diverse cosmological datasets and probes.
Figure 4: Triangle plot showing the standard pybird chains alongside those obtained using
Effort.jl in combination with Turing.jl, for the PT-challenge analysis. The lower trian-
gular section compares pybird and Effort.jl to validate the precision of our emulator and
likelihood implementation. The upper triangular part focuses solely on Effort.jl contours,
comparing the results obtained with the NUTS and MCHMC samplers. To align with the
PT-challenge guidelines, axis ticks and labels have been removed to avoid disclosing the true
cosmology.
The combination of Effort.jl and Blast.jl is particularly compelling, as it enables the
joint analysis of the Euclid main probes without relying on the Limber approximation, which
can introduce inaccuracies in cosmological inference [90,91]. Furthermore, as demonstrated
in [35], the use of HMC samplers is expected to play a pivotal role in the analysis of 3×2 pt
datasets from Stage IV surveys like Euclid. This approach will become even more critical
when combining Euclid’s photometric and spectroscopic datasets for joint analyses.
The capabilities of Effort.jl provide a solid foundation for further enhancements and
[Triangle plot over the parameters $100\omega_b$, $\omega_c$, $h$, $\ln 10^{10}A_s$, and $n_s$, with contours for pybird+MontePython, Effort+NUTS, and Effort+MCHMC.]
Figure 5: Triangle plot showing the standard pybird chains alongside those obtained using
Effort.jl, for the BOSS analysis, focusing on cosmological parameters. As in Fig. 4, in the
lower triangular section we compare pybird and Effort.jl to validate the precision of our
emulator and likelihood implementation, while in the upper part of the plot we compare the
two gradient-based samplers employed in this study.
expansions. Additionally, we plan to extend the use of symbolic regression to $w_0 w_a$ cosmolo-
gies, providing a computationally inexpensive replacement for the ODE computations, further
optimizing the preprocessing pipeline.
The modular structure of Effort.jl is sufficiently general to support compatibility
with other EFT-based codes, such as velocileptors [92], CLASS-PT [93], FOLPS [94], and
CLASS-OneLoop [95]. This flexibility opens the door for training Effort.jl to emulate these
codes as well, broadening its application and usability.
Finally, we are actively working on a jax-based version of Effort.jl. Since the cos-
mological community is more proficient with python, we see it as advantageous to provide a
version of our software compatible with that language. While a fully equivalent jax imple-
mentation of Capse.jl is already complete (https://github.com/CosmologicalEmulators/jaxcapse),
we are now focusing on translating Effort.jl into jax.
Acknowledgements
MB is supported in part by funding from the Government of Canada’s New Frontiers in Re-
search Fund (NFRF). MB also acknowledges the support of the Canadian Space Agency and
the Natural Sciences and Engineering Research Council of Canada (NSERC), [funding refer-
ence number RGPIN-2019-03908]. This research was enabled in part by support provided by
Compute Ontario (computeontario.ca) and the Digital Research Alliance of Canada (alliance-
can.ca). MB acknowledges financial support from INAF MiniGrant 2022. MB is especially
grateful to Anton Baleato-Lizancos, Guadalupe Canas-Herrera, Pedro Carrilho, Emanuele
Castorina, Arnaud de Mattia, Minas Karamanis, Chiara Moretti, Andrea Pezzotta, Uros Sel-
jak, and Martin White for useful discussions. GDA is especially grateful to Pierre Zhang
for useful discussions. We thank Takahiro Nishimichi for the PT-challenge simulations. Part
of this work has been carried out on the High Performance Computing facility of the University
of Parma, whose support team we thank. MB thanks the University of California Berkeley for
hospitality during his visit, during which this project was started.
A The galaxy power spectrum in the EFTofLSS
The Effective Field Theory of Large-Scale Structure (EFTofLSS) provides a systematic ap-
proach to modeling the redshift-space galaxy power spectrum by thoroughly accounting for
the effects of small-scale physics on large-scale clustering. This framework extends standard
perturbation theory by introducing counterterms that describe how small-scale physics, in-
cluding galaxy formation, influences the observed large-scale distribution. We give a concise
overview below and refer readers to [96,97] for more details.
The one-loop EFTofLSS expression for the redshift-space galaxy power spectrum is:
$$\begin{aligned}
P_g(k, \mu) ={}& Z_1(\mu)^2 P_{11}(k) + 2\int \frac{d^3q}{(2\pi)^3}\, Z_2(\mathbf{q}, \mathbf{k}-\mathbf{q}, \mu)^2\, P_{11}(|\mathbf{k}-\mathbf{q}|)\, P_{11}(q) \\
&+ 6\, Z_1(\mu)\, P_{11}(k) \int \frac{d^3q}{(2\pi)^3}\, Z_3(\mathbf{q}, -\mathbf{q}, \mathbf{k}, \mu)\, P_{11}(q) \\
&+ 2\, Z_1(\mu)\, P_{11}(k) \left( c_\mathrm{ct}\,\frac{k^2}{k_\mathrm{m}^2} + c_{r,1}\,\mu^2\,\frac{k^2}{k_\mathrm{r}^2} + c_{r,2}\,\mu^4\,\frac{k^2}{k_\mathrm{r}^2} \right) \\
&+ \frac{1}{\bar{n}_g}\left( c_{\epsilon,0} + c_{\epsilon,1}\,\frac{k^2}{k_\mathrm{m}^2} + c_{\epsilon,2}\, f \mu^2\, \frac{k^2}{k_\mathrm{m}^2} \right),
\end{aligned} \qquad \text{(A.1)}$$
where we follow the notation of [98]. This includes linear contributions, one-loop Standard
Perturbation Theory (SPT) terms, counterterms, and stochastic terms. Here, $\mu$ is the cosine
of the angle between the line of sight (LoS) and the wavenumber $k$; $P_{11}(k)$ is the linear matter
power spectrum; and $f$ is the growth rate. The scale $k_\mathrm{m}^{-1}$ characterizes the size of collapsed
objects, while $k_\mathrm{r}^{-1}$ governs counterterms needed to handle products of the velocity field at
the same point [98,99], and $\bar{n}_g$ denotes the mean galaxy number density.
The functions $Z_n$ are the redshift-space galaxy density kernels of order $n$:
$$\begin{aligned}
Z_1(\mathbf{q}_1) &= K_1(\mathbf{q}_1) + f\mu_1^2\, G_1(\mathbf{q}_1) = b_1 + f\mu_1^2, \\
Z_2(\mathbf{q}_1, \mathbf{q}_2, \mu) &= K_2(\mathbf{q}_1, \mathbf{q}_2) + f\mu_{12}^2\, G_2(\mathbf{q}_1, \mathbf{q}_2) + \frac{1}{2} f\mu q\, \frac{\mu_2}{q_2}\, G_1(\mathbf{q}_2)\, Z_1(\mathbf{q}_1) + \text{perm.}, \\
Z_3(\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3, \mu) &= K_3(\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3) + f\mu_{123}^2\, G_3(\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3) \\
&\quad + \frac{1}{3} f\mu q \left[ \frac{\mu_3}{q_3}\, G_1(\mathbf{q}_3)\, Z_2(\mathbf{q}_1, \mathbf{q}_2, \mu_{123}) + \frac{\mu_{23}}{q_{23}}\, G_2(\mathbf{q}_2, \mathbf{q}_3)\, Z_1(\mathbf{q}_1) \right] + \text{cyc.},
\end{aligned} \qquad \text{(A.2)}$$
and $K_n$ are the corresponding $n$th-order galaxy density kernels in real space:
$$\begin{aligned}
K_1 &= b_1, \\
K_2(\mathbf{q}_1, \mathbf{q}_2) &= b_1\, \frac{\mathbf{q}_1\cdot\mathbf{q}_2}{q_1^2} + b_2\left[ F_2(\mathbf{q}_1, \mathbf{q}_2) - \frac{\mathbf{q}_1\cdot\mathbf{q}_2}{q_1^2} \right] + b_4 + \text{perm.}, \\
K_3(\mathbf{k}, \mathbf{q}) &= \frac{b_1}{504\, k^3 q^3}\left[ -38 k^5 q + 48 k^3 q^3 - 18 k q^5 + 9 (k^2 - q^2)^3 \log\frac{k-q}{k+q} \right] \\
&\quad + \frac{b_3}{756\, k^3 q^5}\left[ 2 k q (k^2 + q^2)(3 k^4 - 14 k^2 q^2 + 3 q^4) + 3 (k^2 - q^2)^4 \log\frac{k-q}{k+q} \right].
\end{aligned} \qquad \text{(A.3)}$$
We omit the full expressions for the second-order kernel $F_2$ and the velocity kernels $G_n$ (from
SPT) for brevity. Altogether, the model uses four bias parameters $b_1, b_2, b_3, b_4$, three coun-
terterm parameters $c_\mathrm{ct}, c_{r,1}, c_{r,2}$, and three stochastic parameters $c_{\epsilon,0}, c_{\epsilon,1}, c_{\epsilon,2}$, for a total of
ten parameters:
$$\{b_1, b_2, b_3, b_4, c_\mathrm{ct}, c_{r,1}, c_{r,2}, c_{\epsilon,0}, c_{\epsilon,1}, c_{\epsilon,2}\}. \qquad \text{(A.4)}$$
Finally, we employ IR-resummation [54,100,101] to account for the large effects of long-
wavelength displacements.
References
[1] DESI collaboration, The DESI Experiment Part I: Science,Targeting, and Survey Design,
1611.00036.
[2] DESI collaboration, Overview of the Instrumentation for the Dark Energy Spectroscopic
Instrument,Astron. J. 164 (2022) 207 [2205.10939].
[3] DESI collaboration, DESI 2024 VII: Cosmological Constraints from the Full-Shape Modeling
of Clustering Measurements,2411.12022.
[4] Euclid collaboration, Euclid. I. Overview of the Euclid mission,2405.13491.
[5] R. Jimenez, L. Verde, H. Peiris and A. Kosowsky, Fast cosmological parameter estimation
from microwave background temperature and polarization power spectra,Phys. Rev. D 70
(2004) 023005 [astro-ph/0404237].
[6] W.A. Fendt and B.D. Wandelt, Pico: Parameters for the Impatient Cosmologist,Astrophys.
J. 654 (2006) 2 [astro-ph/0606709].
[7] T. Auld, M. Bridges, M.P. Hobson and S.F. Gull, Fast cosmological parameter estimation
using neural networks,Mon. Not. Roy. Astron. Soc. 376 (2007) L11 [astro-ph/0608174].
– 20 –
[8] A. Lewis, A. Challinor and A. Lasenby, Efficient computation of CMB anisotropies in closed FRW models, Astrophys. J. 538 (2000) 473 [astro-ph/9911177].
[9] D. Blas, J. Lesgourgues and T. Tram, The Cosmic Linear Anisotropy Solving System (CLASS) II: Approximation schemes, JCAP 07 (2011) 034 [1104.2933].
[10] J. Albers, C. Fidler, J. Lesgourgues, N. Schöneberg and J. Torrado, CosmicNet. Part I. Physics-driven implementation of neural networks within Einstein-Boltzmann Solvers, JCAP 09 (2019) 028 [1907.05764].
[11] A. Spurio Mancini, D. Piras, J. Alsing, B. Joachimi and M.P. Hobson, CosmoPower: emulating cosmological power spectra for accelerated Bayesian inference from next-generation surveys, Mon. Not. Roy. Astron. Soc. 511 (2022) 1771 [2106.03846].
[12] A. Nygaard, E.B. Holm, S. Hannestad and T. Tram, CONNECT: a neural network based framework for emulating cosmological observables and cosmological parameter inference, JCAP 05 (2023) 025 [2205.15726].
[13] S. Günther, J. Lesgourgues, G. Samaras, N. Schöneberg, F. Stadtmann, C. Fidler et al., CosmicNet II: emulating extended cosmologies with efficient and accurate neural networks, JCAP 11 (2022) 035 [2207.05707].
[14] M. Bonici, F. Bianchini and J. Ruiz-Zapatero, Capse.jl: efficient and auto-differentiable CMB power spectra emulation, The Open Journal of Astrophysics 7 (2024) 10 [2307.14339].
[15] A. Mootoovaloo, A.H. Jaffe, A.F. Heavens and F. Leclercq, Kernel-based emulator for the 3D matter power spectrum from CLASS, Astron. Comput. 38 (2022) 100508 [2105.02256].
[16] J. Donald-McCann, F. Beutler, K. Koyama and M. Karamanis, matryoshka: halo model emulator for the galaxy power spectrum, Mon. Not. Roy. Astron. Soc. 511 (2022) 3768 [2109.15236].
[17] G. Aricò, R.E. Angulo and M. Zennaro, Accelerating Large-Scale-Structure data analyses by emulating Boltzmann solvers and Lagrangian Perturbation Theory, 2104.14568.
[18] D.J. Bartlett, L. Kammerer, G. Kronberger, H. Desmond, P.G. Ferreira, B.D. Wandelt et al., A precise symbolic emulator of the linear matter power spectrum, Astron. Astrophys. 686 (2024) A209 [2311.15865].
[19] T. Bakx, N.E. Chisari and Z. Vlah, COBRA: Optimal Factorization of Cosmological Observables, 2407.04660.
[20] A. Eggemeier, B. Camacho-Quevedo, A. Pezzotta, M. Crocce, R. Scoccimarro and A.G. Sánchez, COMET: Clustering observables modelled by emulated perturbation theory, Mon. Not. Roy. Astron. Soc. 519 (2022) 2962 [2208.01070].
[21] J. Donald-McCann, K. Koyama and F. Beutler, matryoshka II: accelerating effective field theory analyses of the galaxy power spectrum, Mon. Not. Roy. Astron. Soc. 518 (2022) 3106 [2202.07557].
[22] S. Trusov, P. Zarrouk and S. Cole, Neural Network-based model of galaxy power spectrum: Fast full-shape galaxy power spectrum analysis, 2403.20093.
[23] A. Manrique-Yus and E. Sellentin, Euclid-era cosmology for everyone: neural net assisted MCMC sampling for the joint 3×2 likelihood, Mon. Not. Roy. Astron. Soc. 491 (2020) 2655 [1907.05881].
[24] A. Mootoovaloo, A.F. Heavens, A.H. Jaffe and F. Leclercq, Parameter Inference for Weak Lensing using Gaussian Processes and MOPED, Mon. Not. Roy. Astron. Soc. 497 (2020) 2213 [2005.06551].
[25] M. Bonici, L. Biggio, C. Carbone and L. Guzzo, Fast emulation of two-point angular statistics for photometric galaxy surveys, Mon. Not. Roy. Astron. Soc. 531 (2024) 4203 [2206.14208].
[26] C.-H. To, E. Rozo, E. Krause, H.-Y. Wu, R.H. Wechsler and A.N. Salcedo, LINNA: Likelihood Inference Neural Network Accelerator, JCAP 01 (2023) 016 [2203.05583].
[27] K. Zhong, E. Saraivanov, J. Caputi, V. Miranda, S.S. Boruah, T. Eifler et al., Attention-based Neural Network Emulators for Multi-Probe Data Vectors Part I: Forecasting the Growth-Geometry split, 2402.17716.
[28] E. Saraivanov, K. Zhong, V. Miranda, S.S. Boruah, T. Eifler and E. Krause, Attention-Based Neural Network Emulators for Multi-Probe Data Vectors Part II: Assessing Tension Metrics, 2403.12337.
[29] LSST Dark Energy Science collaboration, Machine Learning LSST 3x2pt analyses – forecasting the impact of systematics on cosmological constraints using neural networks, 2403.11797.
[30] M.D. Hoffman and A. Gelman, The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo, 2011 [1111.4246].
[31] M. Betancourt, A Conceptual Introduction to Hamiltonian Monte Carlo, 2018 [1701.02434].
[32] M. Millea, E. Anderes and B.D. Wandelt, Sampling-based inference of the primordial CMB and gravitational lensing, Phys. Rev. D 102 (2020) 123542 [2002.00965].
[33] Y. Li, L. Lu, C. Modi, D. Jamieson, Y. Zhang, Y. Feng et al., pmwd: A Differentiable Cosmological Particle-Mesh N-body Library, 2211.09958.
[34] J.-E. Campagne, F. Lanusse, J. Zuntz, A. Boucaud, S. Casas, M. Karamanis et al., JAX-COSMO: An End-to-End Differentiable and GPU Accelerated Cosmology Library, The Open Journal of Astrophysics 6 (2023) 1 [2302.05163].
[35] D. Piras and A. Spurio Mancini, CosmoPower-JAX: high-dimensional Bayesian inference with differentiable cosmological emulators, The Open Journal of Astrophysics 6 (2023) 20 [2305.06347].
[36] O. Hahn, F. List and N. Porqueres, DISCO-DJ I: a differentiable Einstein-Boltzmann solver for cosmology, JCAP 06 (2024) 063 [2311.03291].
[37] M.S. Cagliari, E. Castorina, M. Bonici and D. Bianchi, Optimal constraints on Primordial non-Gaussianity with the eBOSS DR16 quasars in Fourier space, JCAP 08 (2024) 036 [2309.15814].
[38] J. Ruiz-Zapatero, D. Alonso, C. García-García, A. Nicola, A. Mootoovaloo, J.M. Sullivan et al., LimberJack.jl: auto-differentiable methods for angular power spectra analyses, 2310.08306.
[39] A. Mootoovaloo, J. Ruiz-Zapatero, C. García-García and D. Alonso, Assessment of Gradient-Based Samplers in Standard Cosmological Likelihoods, 2406.04725.
[40] C. Giovanetti, M. Lisanti, H. Liu, S. Mishra-Sharma and J.T. Ruderman, LINX: A Fast, Differentiable, and Extensible Big Bang Nucleosynthesis Package, 2408.14538.
[41] SPT-3G collaboration, Cosmology From CMB Lensing and Delensed EE Power Spectra Using 2019-2020 SPT-3G Polarization Data, 2411.06000.
[42] H. Zhang, M. Bonici, G. D'Amico, S. Paradiso and W.J. Percival, HOD-informed prior for EFT-based full-shape analyses of LSS, 2409.12937.
[43] A. Griewank and A. Walther, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Second Edition, Other Titles in Applied Mathematics, Society for Industrial and Applied Mathematics (2008).
[44] M. Blondel and V. Roulet, The Elements of Differentiable Programming, arXiv e-prints (2024) arXiv:2403.14606 [2403.14606].
[45] S. Scardapane, Alice's Adventures in a Differentiable Wonderland – Volume I, A Tour of the Land, arXiv e-prints (2024) arXiv:2404.17625 [2404.17625].
[46] G. Arya, M. Schauer, F. Schäfer and C. Rackauckas, Automatic Differentiation of Programs with Discrete Randomness, arXiv e-prints (2022) arXiv:2210.08572 [2210.08572].
[47] B. Horowitz, C. Hahn, F. Lanusse, C. Modi and S. Ferraro, Differentiable stochastic halo occupation distribution, Mon. Not. Roy. Astron. Soc. 529 (2024) 2473 [2211.03852].
[48] A.S. Jurling and J.R. Fienup, Applications of algorithmic differentiation to phase retrieval algorithms, Journal of the Optical Society of America A 31 (2014) 1348.
[49] J. DeRose, S.-F. Chen, M. White and N. Kokron, Neural network acceleration of large-scale structure theory calculations, JCAP 04 (2022) 056 [2112.05889].
[50] J. Bezanson, A. Edelman, S. Karpinski and V.B. Shah, Julia: A fresh approach to numerical computing, 2015 [1411.1607].
[51] T. Nishimichi, G. D'Amico, M.M. Ivanov, L. Senatore, M. Simonović, M. Takada et al., Blinded challenge for precision cosmology with large-scale structure: results from effective field theory for the redshift-space galaxy power spectrum, Phys. Rev. D 102 (2020) 123541 [2003.08277].
[52] BOSS collaboration, The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: RSD measurement from the LOS-dependent power spectrum of DR12 BOSS galaxies, Mon. Not. Roy. Astron. Soc. 460 (2016) 4188 [1509.06386].
[53] J. Bezanson, A. Edelman, S. Karpinski and V.B. Shah, Julia: A fresh approach to numerical computing, SIAM Review 59 (2017) 65 [https://doi.org/10.1137/141000671].
[54] L. Senatore and M. Zaldarriaga, The IR-resummed Effective Field Theory of Large Scale Structures, JCAP 02 (2015) 013 [1404.5954].
[55] T. Baldauf, M. Mirbabayi, M. Simonović and M. Zaldarriaga, Equivalence Principle and the Baryon Acoustic Peak, Phys. Rev. D 92 (2015) 043514 [1504.04366].
[56] A.G. Sánchez, A.N. Ruiz, J.G. Jara and N.D. Padilla, Evolution mapping: a new approach to describe matter clustering in the non-linear regime, Mon. Not. Roy. Astron. Soc. 514 (2022) 5673 [2108.12710].
[57] S. Ioffe and C. Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, arXiv e-prints (2015) arXiv:1502.03167 [1502.03167].
[58] M. Zennaro, J. Bel, F. Villaescusa-Navarro, C. Carbone, E. Sefusatti and L. Guzzo, Initial Conditions for Accurate N-Body Simulations of Massive Neutrino Cosmologies, Mon. Not. Roy. Astron. Soc. 466 (2017) 3244 [1605.05283].
[59] A.E. Bayer, A. Banerjee and Y. Feng, A fast particle-mesh simulation of non-linear cosmological structure formation with massive neutrinos, JCAP 01 (2021) 016 [2007.13394].
[60] M. Chevallier and D. Polarski, Accelerating universes with scaling dark matter, Int. J. Mod. Phys. D 10 (2001) 213 [gr-qc/0009008].
[61] E.V. Linder, Exploring the expansion history of the universe, Phys. Rev. Lett. 90 (2003) 091301 [astro-ph/0208512].
[62] J. Froustey, C. Pitrou and M.C. Volpe, Neutrino decoupling including flavour oscillations and primordial nucleosynthesis, JCAP 12 (2020) 015 [2008.01074].
[63] J.J. Bennett, G. Buldgen, P.F. De Salas, M. Drewes, S. Gariazzo, S. Pastor et al., Towards a precision calculation of Neff in the Standard Model II: Neutrino decoupling in the presence of flavour oscillations and finite-temperature QED, JCAP 04 (2021) 073 [2012.02726].
[64] LSST Dark Energy Science collaboration, Core Cosmology Library: Precision Cosmological Predictions for LSST, Astrophys. J. Suppl. 242 (2019) 2 [1812.05995].
[65] C. Tsitouras, Runge–Kutta pairs of order 5(4) satisfying only the first column simplifying assumption, Computers & Mathematics with Applications 62 (2011) 770.
[66] C. Rackauckas and Q. Nie, DifferentialEquations.jl – a performant and feature-rich ecosystem for solving differential equations in Julia, Journal of Open Research Software 5 (2017).
[67] C. Alcock and B. Paczynski, An evolution free test for non-zero cosmological constant, Nature 281 (1979) 358.
[68] G. D'Amico, L. Senatore and P. Zhang, Limits on wCDM from the EFTofLSS with the PyBird code, JCAP 01 (2021) 006 [2003.07956].
[69] M.M. Ivanov, M. Simonović and M. Zaldarriaga, Cosmological Parameters from the BOSS Galaxy Power Spectrum, JCAP 05 (2020) 042 [1909.05277].
[70] F. Beutler, E. Castorina and P. Zhang, Interpreting measurements of the anisotropic galaxy power spectrum, JCAP 03 (2019) 040 [1810.05051].
[71] M. Abbott, D. Aluthge, N3N5, V. Puri, C. Elrod, S. Schaub et al., mcabbott/Tullio.jl: v0.3.7, Oct. 2023. 10.5281/zenodo.10035615.
[72] H. Ge, K. Xu and Z. Ghahramani, Turing: a language for flexible probabilistic inference, in International Conference on Artificial Intelligence and Statistics, AISTATS 2018, 9-11 April 2018, Playa Blanca, Lanzarote, Canary Islands, Spain, pp. 1682–1690, 2018, http://proceedings.mlr.press/v84/ge18b.html.
[73] J. Robnik and U. Seljak, Fluctuation without dissipation: Microcanonical Langevin Monte Carlo, 2303.18221.
[74] J. Robnik, G.B. De Luca, E. Silverstein and U. Seljak, Microcanonical Hamiltonian Monte Carlo, 2212.08549.
[75] A.E. Bayer, U. Seljak and C. Modi, Field-Level Inference with Microcanonical Langevin Monte Carlo, in 40th International Conference on Machine Learning, 7, 2023 [2307.09504].
[76] G.V. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems 2 (1989) 303.
[77] M. Cranmer, Interpretable Machine Learning for Science with PySR and SymbolicRegression.jl, arXiv e-prints (2023) arXiv:2305.01582 [2305.01582].
[78] D.J. Bartlett, B.D. Wandelt, M. Zennaro, P.G. Ferreira and H. Desmond, SYREN-HALOFIT: A fast, interpretable, high-precision formula for the ΛCDM nonlinear matter power spectrum, Astron. Astrophys. 686 (2024) A150 [2402.17492].
[79] C. Sui, D.J. Bartlett, S. Pandey, H. Desmond, P.G. Ferreira and B.D. Wandelt, syren-new: Precise formulae for the linear and nonlinear matter power spectra with massive neutrinos and dynamical dark energy, 2410.14623.
[80] K.S. Dawson, D.J. Schlegel, C.P. Ahn, S.F. Anderson, É. Aubourg, S. Bailey et al., The Baryon Oscillation Spectroscopic Survey of SDSS-III, AJ 145 (2013) 10 [1208.0022].
[81] BOSS collaboration, The clustering of galaxies in the completed SDSS-III Baryon Oscillation Spectroscopic Survey: cosmological analysis of the DR12 galaxy sample, Mon. Not. Roy. Astron. Soc. 470 (2017) 2617 [1607.03155].
[82] G. D'Amico, Y. Donath, M. Lewandowski, L. Senatore and P. Zhang, The BOSS bispectrum analysis at one loop from the Effective Field Theory of Large-Scale Structure, JCAP 05 (2024) 059 [2206.08327].
[83] T. Brinckmann and J. Lesgourgues, MontePython 3: boosted MCMC sampler and other features, Phys. Dark Univ. 24 (2019) 100260 [1804.07261].
[84] S. Axen, D. Karrasch, J. Burton, M. Hauru and P. Yong, mlcolab/Pathfinder.jl: v0.9.8, Dec. 2024. 10.5281/zenodo.14576747.
[85] M.M. Ivanov, C. Cuesta-Lazaro, S. Mishra-Sharma, A. Obuljen and M.W. Toomey, Full-shape analysis with simulation-based priors: Constraints on single field inflation from BOSS, Phys. Rev. D 110 (2024) 063538 [2402.13310].
[86] S. Paradiso, M. Bonici, M. Chen, W.J. Percival, G. D'Amico, H. Zhang et al., Reducing nuisance prior sensitivity via non-linear reparameterization, with application to EFT analyses of large-scale structure, 2412.03503.
[87] K. Akitsu, Mapping the galaxy-halo connection to the galaxy bias: implication to the HOD-informed prior, 2410.08998.
[88] M. Shiferaw, N. Kokron and R.H. Wechsler, How do uncertainties in galaxy formation physics impact field-level galaxy bias?, 2412.06886.
[89] S. Chiarenza, M. Bonici, W. Percival and M. White, BLAST: Beyond Limber Angular power Spectra Toolkit. A fast and efficient algorithm for 3x2 pt analysis, The Open Journal of Astrophysics 7 (2024) 112 [2410.03632].
[90] D.N. Limber, The Analysis of Counts of the Extragalactic Nebulae in Terms of a Fluctuating Density Field., ApJ 117 (1953) 134.
[91] X. Fang, E. Krause, T. Eifler and N. MacCrann, Beyond Limber: Efficient computation of angular power spectra for galaxy clustering and weak lensing, JCAP 05 (2020) 010 [1911.11947].
[92] S.-F. Chen, Z. Vlah, E. Castorina and M. White, Redshift-Space Distortions in Lagrangian Perturbation Theory, JCAP 03 (2021) 100 [2012.04636].
[93] A. Chudaykin, M.M. Ivanov, O.H.E. Philcox and M. Simonović, Nonlinear perturbation theory extension of the Boltzmann code CLASS, Phys. Rev. D 102 (2020) 063533 [2004.10607].
[94] H.E. Noriega, A. Aviles, S. Fromenteau and M. Vargas-Magaña, Fast computation of non-linear power spectrum in cosmologies with massive neutrinos, JCAP 11 (2022) 038 [2208.02791].
[95] D. Linde, A. Moradinezhad Dizgah, C. Radermacher, S. Casas and J. Lesgourgues, CLASS-OneLoop: accurate and unbiased inference from spectroscopic galaxy surveys, JCAP 07 (2024) 068 [2402.09778].
[96] A. Perko, L. Senatore, E. Jennings and R.H. Wechsler, Biased Tracers in Redshift Space in the EFT of Large-Scale Structure, arXiv e-prints (2016) arXiv:1610.09321.
[97] G. D'Amico, J. Gleyzes, N. Kokron, K. Markovic, L. Senatore, P. Zhang et al., The cosmological analysis of the SDSS/BOSS data from the Effective Field Theory of Large-Scale Structure, JCAP 2020 (2020) 005.
[98] G. D'Amico, L. Senatore, P. Zhang and T. Nishimichi, Taming redshift-space distortion effects in the EFTofLSS and its application to data, JCAP 2024 (2024) 037.
[99] M.M. Ivanov, O.H.E. Philcox, M. Simonović, M. Zaldarriaga, T. Nishimichi and M. Takada, Cosmological constraints without nonlinear redshift-space distortions, Phys. Rev. D 105 (2022) 043531.
[100] L. Senatore and M. Zaldarriaga, Redshift Space Distortions in the Effective Field Theory of Large Scale Structures, arXiv e-prints (2014) arXiv:1409.1225.
[101] M. Lewandowski and L. Senatore, An analytic implementation of the IR-resummation for the BAO peak, JCAP 2020 (2020) 018.