
Enhancing Gravitational-Wave Science with Machine Learning

IOP Publishing
Machine Learning: Science and Technology
Mach. Learn.: Sci. Technol. 2 (2021) 011002 https://doi.org/10.1088/2632-2153/abb93a
OPEN ACCESS
RECEIVED
7 May 2020
REVISED
31 July 2020
ACCEPTED FOR PUBLICATION
16 September 2020
PUBLISHED
1 December 2020
Original Content from
this work may be used
under the terms of the
Creative Commons
Attribution 4.0 licence.
Any further distribution
of this work must
maintain attribution to
the author(s) and the title
of the work, journal
citation and DOI.
TOPICAL REVIEW
Enhancing gravitational-wave science with machine learning
Elena Cuoco1,2,3, Jade Powell4, Marco Cavaglià5, Kendall Ackley6,7, Michał Bejger8,
Chayan Chatterjee7,9, Michael Coughlin10,11, Scott Coughlin12, Paul Easter6,7, Reed Essick13,
Hunter Gabbard14, Timothy Gebhard15,16, Shaon Ghosh17, Leïla Haegel18, Alberto Iess19,20,
David Keitel21, Zsuzsa Márka22, Szabolcs Márka23, Filip Morawski8, Tri Nguyen24, Rich Ormiston25,
Michael Pürrer26, Massimiliano Razzano3,27, Kai Staats12, Gabriele Vajente10 and Daniel Williams14
1European Gravitational Observatory (EGO), I-56021 Cascina, Pisa, Italy.
2Scuola Normale Superiore (SNS), Piazza dei Cavalieri, 7 - 56126 Pisa, Italy.
3Istituto Nazionale di Fisica Nucleare, Sezione di Pisa, Pisa, I-56127, Italy.
4OzGrav, Swinburne University of Technology, Hawthorn, Melbourne, VIC 3122, Australia.
5Institute of Multi-messenger Astrophysics and Cosmology, Missouri University of Science and Technology, 1315 N. Pine St., Rolla,
MO 65409, United States of America.
6School of Physics and Astronomy, Monash University, Monash, Vic 3800, Australia.
7OzGrav: The ARC Centre of Excellence for Gravitational Wave Discovery, Clayton VIC 3800, Australia.
8Nicolaus Copernicus Astronomical Center, Polish Academy of Sciences, Bartycka 18, 00-716, Warsaw, Poland.
9Department of Physics, The University of Western Australia, 35 Stirling Highway, Perth, Western Australia 6009, Australia
10 LIGO, California Institute of Technology, Pasadena, CA 91125, United States of America.
11 School of Physics and Astronomy, University of Minnesota, Minneapolis, Minnesota 55455, United States of America
12 Center for Interdisciplinary Exploration & Research in Astrophysics (CIERA), Northwestern University, Evanston, IL 60208, United
States of America.
13 Kavli Institute for Cosmological Physics, University of Chicago, Chicago, IL, United States of America.
14 SUPA, University of Glasgow, Glasgow G12 8QQ, United Kingdom.
15 Max Planck Institute for Intelligent Systems, Max-Planck-Ring 4, 72076 Tübingen, Germany
16 Max Planck ETH Center for Learning Systems, Max-Planck-Ring 4, 72076 Tübingen, Germany
17 Montclair State University, Montclair, NJ, United States of America.
18 Université de Paris, CNRS, Astroparticule et Cosmologie, F-75013 Paris, France.
19 Università di Roma Tor Vergata, I-00133 Roma, Italy.
20 Istituto Nazionale di Fisica Nucleare, Sezione di Roma Tor Vergata, I-00133 Roma, Italy.
21 Universitat de les Illes Balears, IAC3–IEEC, E-07122 Palma de Mallorca, Spain.
22 Columbia Astrophysics Laboratory, Columbia University in the City of New York, 550 W 120th St., New York, NY 10027, United
States of America.
23 Department of Physics, Columbia University in the City of New York, 550 W 120th St., New York, NY 10027, United States of
America
24 LIGO, Massachusetts Institute of Technology, Cambridge, MA, 02139, United States of America.
25 University of Minnesota, Minneapolis, MN 55455, United States of America.
26 Max Planck Institute for Gravitational Physics (Albert Einstein Institute), Am Mühlenberg 1, Potsdam 14476, Germany.
27 Department of Physics, University of Pisa, Pisa, I-56127, Italy.
E-mail: elena.cuoco@ego-gw.it
Keywords: gravitational waves, machine learning, deep learning
Abstract
Machine learning has emerged as a popular and powerful approach for solving problems in
astrophysics. We review applications of machine learning techniques for the analysis of
ground-based gravitational-wave (GW) detector data. Examples include techniques for improving
the sensitivity of Advanced Laser Interferometer GW Observatory and Advanced Virgo GW
searches, methods for fast measurements of the astrophysical parameters of GW sources, and
algorithms for reduction and characterization of non-astrophysical detector noise. These
applications demonstrate how machine learning techniques may be harnessed to enhance the
science that is possible with current and future GW detectors.
© 2020 The Author(s). Published by IOP Publishing Ltd
1. Introduction
In February 2016, the Laser Interferometer Gravitational-wave Observatory (LIGO) [1] Scientific
Collaboration and the Virgo [2] Collaboration announced the first observation of a Gravitational-Wave
(GW) signal from a stellar-mass Compact Binary Coalescence (CBC) system [3,4]. Nine additional Binary
Black Hole (BBH) mergers and one Binary Neutron Star (BNS) merger were observed by LIGO and Virgo
during the first two advanced detector observing runs (O1/O2) [5]. The six-month-long O3a run (April
1st–October 1st, 2019) and the recently completed O3b run have yielded tens of new BBH, BNS and Neutron
Star-Black Hole (NSBH) detections [6–8] and detection candidates [9]. The current rate of detections is
expected to increase in future observing runs, as LIGO and Virgo approach their design sensitivity and
additional detectors such as KAGRA [10,11] and LIGO-India [12] join the network of ground-based GW
observatories. The improved sensitivity of the GW network will allow scientists to gain insights into the
origins and astrophysical distributions of CBC GW sources [13], test general relativity [14], and measure
cosmological parameters such as the Hubble constant [15]. It may also lead to the discovery of GW signals
from new source types, such as core-collapse supernovae (CCSNe) [16] and magnetars [17], or a stochastic
GW background of cosmological or astrophysical origin [18].
Despite all the initial successes, the future of GW astronomy is facing many challenges. Processing and
analyzing the increased rate of detections in future observing runs will require researchers to streamline
current search pipelines. Refined astrophysical investigations of GW sources and tests of the fundamental
nature of gravity will require precise reconstructions of GW signals and accurate estimates of their statistical
and systematic errors. Identification and mitigation of instrumental and environmental data artifacts will
require the development of fast and efficient methods for detector and signal characterization.
Machine Learning (ML) algorithms are novel methods for tackling these issues. The LIGO and Virgo
Scientific Collaborations run searches for modeled and unmodeled astrophysical transient signals, as well as
searches for continuous GWs from isolated compact objects and searches for a stochastic background of
GWs of cosmological or astrophysical origin. These searches rely on different techniques, such as matched
filtering [19,20], time-coincident detection of coherent excess power between multiple detectors [21,22],
and cross-correlation methods [23]. Because of the effectiveness of ML algorithms in identifying patterns in
data, ML techniques may be harnessed to make all these searches more sensitive and robust. Applications of
ML algorithms to GW searches range from building automated data analysis methods for low-latency
pipelines to distinguishing terrestrial noise from astrophysical signals and improving the reach of searches.
In this paper, we review the ML-based techniques that have been developed by LIGO and Virgo scientists
to improve the analysis of GW data. In recent years, ML has gained popularity among LIGO and Virgo
researchers thanks to advances in detection and classification of noise transients [24–33], searches for CBC
systems [34–37], parameter estimation of transient signals [37–41], noise removal [42] and citizen science
projects [43]. We also examine the potential of ML to improve GW science as current detectors approach
their design sensitivity and other detectors join the GW network of ground-based observatories.
The structure of the paper is as follows. In section 2 we review the ML algorithms which have been
developed to improve the quality of LIGO-Virgo data. In section 3 we review ML techniques which aim at
improving the modeling of GW signals. In section 4 we describe how ML can be used to improve the
sensitivity of GW searches. In section 5 we review methods for parameter estimation of GW signals and
source population inference. Conclusions are given in section 6.
2. Algorithms for gravitational-wave data quality improvement
The output of a GW detector is a temporal series of the detector strain, h(t). The sensitivity of an ideal
detector is determined by the physics inherent to the instrumental design and is limited by fundamental,
irreducible noise sources, such as quantum noise of the laser light, thermal noise of the mirror coatings and
optic suspensions, or gravity gradient noise [1,2]. The sensitivity of a real-world detector is also limited by
the presence of technical noise sources of different origins, related for example to the feedback control
systems that are needed to maintain the systems in operation, or by instrumental and environmental
disturbances. Often these noise sources are non-stationary, i.e. their statistical properties vary over short or
long time scales. In some instances, noise sources might be stationary at their origins, but couple to the
detector strain in a non-linear way.
The non-stationary, non-Gaussian nature of the data and the presence of noise artifacts may impact data
quality or detector performance, and increase the false alarm rate of searches. Short-lived environmental and
instrumental noise disturbances, known as detector ‘glitches’, may also affect low-latency detection and
parameter estimation of astrophysical transient signals, as evidenced by the GW170817 BNS detection [5].
These non-stationary and non-linear transients, as well as continuous noise signals at given frequencies in
the form of spectral lines, are two examples of major factors affecting the performance of GW searches [44].
Astrophysical signals have typical amplitudes comparable to the detector background noise. Therefore,
characterization and reduction of detector noise is essential to GW searches. Some of the techniques for the
identification and mitigation of LIGO-Virgo data quality issues are described in reference [6].
Figure 1. Time-frequency representations of different types of glitches occurring in GW data. (Examples from the GravitySpy
project [45]). ML algorithms can help identify the origin of these glitches and increase the sensitivity of GW transient searches.
2.1. Machine learning for h(t) glitch characterization and classification
The first step in the characterization of detector transient noise is to distinguish the glitches from potential
astrophysical signals and then classify them into different families. This task can be tackled by extracting
features from each glitch time series and mapping these features to the target glitch types. However, glitches
often exhibit a complex temporal and frequency evolution that can make their characterization with a fixed
number of features very difficult. Moreover, the increasing sensitivity of the detectors may lead to a larger
number of glitch morphologies. ML can help solve the problem of glitch classification. Earlier works on a
variety of unsupervised ML methods [46,47] and neural networks (Mukund et al 2017) have shown that ML
algorithms can be very effective.
Deep Convolutional Neural Networks [49,50] (CNNs) are extremely promising for glitch classification
based on time-frequency representations. CNNs are designed to extract features from 2D matrices, such as
images, and use these features for classification purposes. Feeding time-frequency transforms, Omega Scans
[51], and Q-transforms [52] to a CNN-based deep network is an effective approach to glitch classification.
Reference [31] implements an image-based detection and classification pipeline built on 2D CNN layers.
Tests with GW simulated signals show that this method provides a ~99% accuracy in classification and
differentiation of glitches from chirp-like signals. CNNs typically distinguish glitches with similar
morphology more accurately than other ML approaches. In particular, one effective way of
looking at glitch appearance is to build time-frequency maps such as spectrograms and feed them to a CNN.
CNNs can automatically extract the most significant features from an image,
which can then be used to distinguish between different images.
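As a rough illustration of the time-frequency maps discussed above, the following sketch builds a normalized spectrogram image of the kind that could be fed to a CNN. It is not the Omega Scan or Q-transform used in the cited works: it uses a plain short-time Fourier transform, and the sine-Gaussian "glitch", the white background noise, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def sine_gaussian(t, f0, tau, t0=0.0):
    """Toy glitch: a Gaussian-windowed sinusoid (illustrative stand-in only)."""
    return np.exp(-((t - t0) ** 2) / (2.0 * tau ** 2)) * np.sin(2.0 * np.pi * f0 * (t - t0))

def spectrogram(x, fs, nperseg=128, noverlap=96):
    """Short-time Fourier power map with a Hann window.

    Returns (freqs, times, power); power has shape (n_freqs, n_segments).
    """
    step = nperseg - noverlap
    window = np.hanning(nperseg)
    starts = np.arange(0, len(x) - nperseg + 1, step)
    segments = np.stack([x[s:s + nperseg] * window for s in starts])
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    times = (starts + nperseg / 2.0) / fs
    return freqs, times, power.T

def to_image(power, eps=1e-12):
    """Log-scale and normalize to [0, 1], giving a 2D array usable as CNN input."""
    img = np.log10(power + eps)
    return (img - img.min()) / (img.max() - img.min())

# Assumed toy data: unit-variance white noise plus one loud glitch at t = 1 s.
rng = np.random.default_rng(0)
fs = 1024.0
t = np.arange(0, 2.0, 1.0 / fs)
strain = rng.normal(0.0, 1.0, t.size) + 8.0 * sine_gaussian(t, f0=200.0, tau=0.02, t0=1.0)

freqs, times, power = spectrogram(strain, fs)
image = to_image(power)

# Locate the loudest time-frequency pixel.
peak_f, peak_t = np.unravel_index(np.argmax(power), power.shape)
print(image.shape, freqs[peak_f], times[peak_t])
```

A classifier would then be trained on many such images with glitch-class labels; the network itself is omitted here.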
Along the same lines, reference [53] uses a Wavelet Detection Filter [54] to extract the features from the
input data set. The algorithm performs glitch classification with a boosted gradient method [55] which could
be suitable for real-time analysis.
One critical step in supervised learning approaches is the availability of labeled glitch samples. The
citizen science project GravitySpy [43] addresses this issue. Based on the Zooniverse platform, GravitySpy
leverages the advantages of citizen science and ML to design a socio-computational system to analyze and
characterize transients in GW data. The citizen scientists classify images of glitches such as those shown in
figure 1. The glitch categories resulting from this manual classification process are then used as labels for
supervised ML approaches [45]. In addition, as the citizen scientists identify new categories of transients, the
ML algorithms are re-trained to take into account the new categories.
Figure 2. A qualitative illustration of how auxiliary channels can help determine the non-astrophysical nature of detector triggers.
The top time series is h(t). The other time series are from detector auxiliary channels monitoring different sources which are not sensitive
to GWs. The spike in the strain time series at t=0 occurs also in auxiliary channels 1 and 3. This may indicate that the trigger is
not of astrophysical origin.
2.2. Glitch characterization and classification with auxiliary channels
The LIGO and Virgo detectors record data streams from a large number of subsystems controlling different
aspects of the instruments and monitor their state. These auxiliary channels include data from a variety of
instrumental and environmental sensors, such as photodetectors and seismometers. These sensors can
witness noise sources which couple to the interferometers. Their data can be used to diagnose and mitigate
non-astrophysical couplings. An example is shown in figure 2.
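The coincidence idea illustrated in figure 2 can be sketched in a few lines. This is only a toy under stated assumptions: the channel names, trigger times, and the 0.1 s window are hypothetical, and real pipelines use far richer features than a simple time coincidence.

```python
from bisect import bisect_left

def coincident_channels(trigger_time, aux_triggers, window=0.1):
    """Return the auxiliary channels with a trigger within +/- window seconds."""
    hits = []
    for name, times in aux_triggers.items():
        times = sorted(times)
        # First auxiliary trigger at or after the start of the coincidence window.
        i = bisect_left(times, trigger_time - window)
        if i < len(times) and times[i] <= trigger_time + window:
            hits.append(name)
    return hits

# Hypothetical trigger times (seconds) in three auxiliary channels.
aux_triggers = {
    "AUX-1": [-3.20, 0.02, 5.70],
    "AUX-2": [-1.10, 2.40],
    "AUX-3": [-0.03, 4.10],
}

witnesses = coincident_channels(0.0, aux_triggers, window=0.1)
print(witnesses)  # ['AUX-1', 'AUX-3']
```

A coincidence of this kind is only suggestive of a non-astrophysical origin, and only for channels known to be insensitive to GWs.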
A full, manual analysis of LIGO and Virgo auxiliary channel data is generally impracticable because of
the huge number of instrumental and environmental monitoring sensors, amounting to several tens of
thousands per interferometer. The power of ML to handle huge data sets proves invaluable in analyzing
auxiliary channel data.
Methods for glitch identification based on auxiliary channel data have been extensively investigated
within the LIGO and Virgo collaborations [28,56–59]. In these approaches, the GW channel is generally
used to determine labels for the training samples while the glitch identification process relies only on
information from auxiliary channels. Once the model is trained, it is fully independent of the GW channel
data and considers only parameters computed from auxiliary channels known not to be related to
astrophysical signals.
The first study of canonical ML algorithms, such as Random Forests (RF), neural networks, and support
vector machines, within a glitch detection framework was published in reference [28]. Since then, multiple
authors have investigated ML algorithms to infer the presence of non-Gaussian noise in GW data through
features in auxiliary channels.
iDQ is a glitch detection pipeline that can produce real-time data products in low-latency [60,61]. iDQ
decomposes the problem of glitch identification into a 2-class classification scheme within a supervised
learning framework with several asynchronous tasks: training, cross-validation, calibration, and low-latency
prediction. The pipeline utilizes glitch features in auxiliary channels generated in real-time [52,62] to
construct supervised-learning training sets from recent data labeled by witnessing noise in the target
channel, typically taken to be h(t). It then automatically trains a variety of ML algorithms to identify glitches
in the target channel.
iDQ operated in real-time throughout the first three LIGO-Virgo observing runs, providing probabilistic
statements about the presence of glitches in LIGO data and their auxiliary witnesses. iDQ’s Ordered Veto List
[56] algorithm contributed to the rapid release of the GW170817 BNS event by autonomously identifying
the glitch coincident with the GW trigger in the LIGO Livingston detector [63] and in multiple auxiliary
witnesses within 8 seconds of the event first being reported (see figure 7 of reference [61]). Additionally, iDQ
demonstrated that the surrounding data was uncorrupted, which helped confirm the presence of an
astrophysical signal in multiple detectors and led to the candidate’s announcement [64].
The LIGO Scientific Collaboration developed several additional promising approaches for auxiliary
channel-based glitch identification in high-latency settings. These approaches typically use different auxiliary
features, ML algorithms, and channel-sets.
Two fast algorithms which aim at identifying the origin of glitches and can be used with minimal tuning
to track their causes are based on RF and Genetic Programming (GP) [65]. RF and GP methods are
interpretable, easy to use and tune, and can work with relatively small data sets without the inherent risk of
overfitting. The algorithms require as input a list of times when a specific class of noise transients occurs and
rely on features which are directly drawn from the numerical metadata generated by real-time LIGO-Virgo
data quality pipelines, such as Omicron [66]. This approach minimizes the feature generation step of the
process, which is the typical bottleneck for ML-based glitch investigations. The data sets are assembled with
features from noise transients and randomly-selected background triggers. The algorithms can be quickly
trained and run on LIGO-Virgo computing clusters. A typical analysis with a few thousand noise
transients can be completed in minutes. The methods were validated in reference [65] on
two sets of h(t) glitches with known origin from the first two LIGO-Virgo observing runs.
Another ML tool that utilises auxiliary channel information is Elastic-net based ML for Understanding
(EMU) [59]. EMU uses data from the full list of LIGO’s auxiliary channels per detector site. The algorithm
uses logistic regression with elastic net regularization [67,68] to predict the probability of a glitch. Instances
where a glitch is predicted to be present are classified as 1 (‘glitch’) while instances where a glitch is predicted
to be absent are classified as 0 (‘glitch-free’) with a continuum of certainty between these limiting values. Like
the other algorithms described above, EMU provides a measure of the auxiliary channel significance in
predicting h(t) glitches. This may enable researchers to uncover instrumental and environmental noise
couplings to the GW data stream that can be used by commissioners to eliminate the instrumental root of
the glitches. EMU’s initial performance is illustrated in reference [59]. It was also characterized using
automatically clustered subsets of glitches, which can be derived according to frequency, duration, and other
trigger generator parameters, or by using existing methods to identify classes of glitches [69].
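To make the idea of elastic-net-regularized logistic regression concrete, here is a minimal sketch. It is not the EMU implementation: the solver (proximal gradient descent), the synthetic "channel" features, the choice of channels 3 and 7 as the only true witnesses, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(w, thr):
    """Proximal operator of the L1 norm."""
    return np.sign(w) * np.maximum(np.abs(w) - thr, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_elastic_net_logistic(X, y, alpha=0.1, l1_ratio=0.5, lr=0.5, n_iter=500):
    """Minimize  mean log-loss + alpha * (l1_ratio * |w|_1 + 0.5 * (1 - l1_ratio) * |w|_2^2)
    by proximal gradient descent. The intercept b is left unpenalized."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / n + alpha * (1.0 - l1_ratio) * w
        grad_b = np.mean(p - y)
        # Gradient step on the smooth part, then soft-threshold for the L1 part.
        w = soft_threshold(w - lr * grad_w, lr * alpha * l1_ratio)
        b -= lr * grad_b
    return w, b

# Assumed toy data: 20 standardized auxiliary-channel features, of which only
# channels 3 and 7 actually influence the glitch probability.
rng = np.random.default_rng(1)
n_samples, n_channels = 2000, 20
X = rng.normal(size=(n_samples, n_channels))
true_logits = 2.5 * X[:, 3] - 2.0 * X[:, 7] - 0.5
y = (rng.uniform(size=n_samples) < sigmoid(true_logits)).astype(float)

w, b = fit_elastic_net_logistic(X, y)
glitch_probability = sigmoid(X @ w + b)   # values between 0 and 1
ranking = np.argsort(-np.abs(w))          # channels ordered by |coefficient|
print(ranking[:2], np.count_nonzero(w))
```

The magnitude of each fitted coefficient plays the role of a channel-significance measure, and the L1 part of the penalty drives the coefficients of uninformative channels toward zero.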
Reference [70] uses ML regression and clustering methods to infer peak ground velocities from
Earthquake Early Warning alerts. The algorithm is trained on archival seismic data to determine the ground
motion and the state of a GW interferometer during an earthquake. The estimated ground velocity is then
used to forecast the potential effect of earthquakes on the detector in near real-time making it possible to
switch the detector control configuration during periods of excessive ground motion.
Hey LIGO [71] is an ML-based information retrieval tool which aims at supporting the commissioning
and characterization efforts of GW observatories. The algorithm responds to a user query by searching the
detector open source logbook data and returning information on detector operation, maintenance, and
characterization tasks. The Hey LIGO web application incorporates a natural language processing-based
information retrieval system that can also perform visualization of the user-queried data. It can aid
the investigations performed by the other ML studies in this section by, for example, finding all the logbook
entries about earthquakes. It may also be a useful source of information about data quality to people outside
of LIGO and Virgo who would not have access to the auxiliary channel data used by some of the ML
algorithms in this section.
2.3. Methods for non-stationary noise subtraction and denoising
The output signal measured by LIGO and Virgo interferometers is the sum of noise and GW signals. ML
techniques can be used to identify, model and subtract technical noises that couple in non-stationary or
non-linear ways. In particular, ML algorithms can be designed to construct filters for non-linear noise
subtraction. Unfortunately, some of the noise sources which can potentially be filtered out do not necessarily
couple linearly into the interferometer. Therefore, even after Wiener filtering [19], which can only filter out
linear noise as in refs [72,73], a substantial amount of non-linear coupling still pollutes the output.
Leveraging the ability of ML to infer non-linear functions, environmental and control data streams can be
used as input of neural networks to find the transfer functions of the systems producing non-linear noise in
the detector output. The trained network can then be used to subtract those non-linear couplings from the
output data and lower the total noise floor. The ML algorithms are also much faster than Wiener filtering and
can be applied to the data in real time. The above method was implemented, for example, in DeepClean [74]
and NonSENS [42].
An especially interesting case is when the noise source is monitored by one or more available fast signals
and its coupling can be described as linear on short time scales, but with coupling transfer functions that
vary on longer time scales. The method can be illustrated by considering the effect of the longitudinal control
noise due to the signal recycling cavity feedback, which couples to the detector strain in a frequency region
that spans between about 10 and 300 Hz. The coupling is modulated on time scales longer than about 1
second by the residual angular motion of the mirrors. In this case and similar circumstances, it is possible to
efficiently track the time-varying noise coupling using the interferometer angular control signals and develop
a stable, parametric model of the noise polluting the detector strain [42]. This model can be used to perform
a time-domain subtraction of the noise that outperforms any linear and stationary
scheme. This non-stationary noise subtraction scheme was successfully applied to LIGO data during the first
half of the third observing run. The method allowed removal of the non-stationary power supply line
coupling that was limiting the detector sensitivity at the mains frequency of 60 Hz and in a 4 Hz-wide band
around 60 Hz due to sidebands created by the coupling modulation [42].
Figure 3. A typical BBH GW signal in whitened design sensitivity detector noise. The signal was generated with two 20 solar mass
black holes and an optimal SNR = 17. Left: Whitened time series. Right: Time-frequency spectrogram of the data showing the
chirp-like shape of the signal.
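The benefit of tracking a slowly modulated coupling, as in the non-stationary subtraction scheme described in this section, can be seen in a toy model. The sketch below is not the method of reference [42]: it assumes a memoryless coupling of the form (a0 + a1·θ(t))·w(t), where w is a fast witness signal and θ a slow angular signal, and fits it by ordinary least squares. All signals and coefficient values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 256.0
t = np.arange(0, 64.0, 1.0 / fs)

# Assumed toy signals.
witness = rng.normal(size=t.size)              # fast witness of the noise source
theta = np.sin(2.0 * np.pi * 0.05 * t)         # slow angular motion (0.05 Hz)
background = 0.1 * rng.normal(size=t.size)     # everything else in the strain

# The coupling coefficient is modulated by the slow angular signal.
strain = background + (0.3 + 1.0 * theta) * witness

def lstsq_subtract(target, regressors):
    """Fit target as a linear combination of regressors and return the residual."""
    A = np.column_stack(regressors)
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return target - A @ coeffs, coeffs

def rms(x):
    return np.sqrt(np.mean(x ** 2))

# Stationary linear scheme: a single constant coupling coefficient.
res_stationary, c_stationary = lstsq_subtract(strain, [witness])

# Parametric time-varying scheme: the coupling depends linearly on theta.
res_modulated, c_modulated = lstsq_subtract(strain, [witness, theta * witness])

print(rms(strain), rms(res_stationary), rms(res_modulated), c_modulated)
```

In this toy setting the stationary fit can only remove the average coupling, while the modulated model brings the residual down to roughly the background level. A realistic implementation would use frequency-dependent transfer functions rather than single coefficients.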
Deep learning may also be used to uncover underlying signals by applying various denoising algorithms.
Algorithms to denoise GW data include the total-variation method [75,76], which applies the split Bregman
method to solve the total-variation regularization problem [77], dictionary learning [78,79], deep learning
[80] with a WaveNet implementation [81], and deep recurrent neural networks [82,83] in a denoising
auto-encoder architecture [84,85].
3. Gravitational waveform modeling
LIGO and Virgo searches for GWs from CBC systems, which rely on matched filtering analysis [86–90], and
the estimation of source parameters, performed with Bayesian inference [91,92], require GW templates.
Waveform templates are needed to compute the Signal-to-Noise Ratio (SNR) and the significance of GW
triggers, and the posterior probability distribution of the signal parameters. Figure 3 shows an example of a
BBH signal in whitened detector data.
Accurate solutions of the Einstein equations for the two body problem can be obtained with Numerical
Relativity (NR) simulations. However, the high computational cost of producing NR waveforms prevents
the production of waveform catalogs spanning the full CBC parameter space [93–101]. The main parameters
that describe a quasi-circular binary system of rotating BHs are the mass ratio and the spin vectors of the two
objects. Neutron stars have additional internal degrees of freedom that are described by their tidal
deformability, and generic compact binaries can also inspiral on eccentric orbits which further increases the
dimensionality of the binary parameter space. Since the binary parameters of GW events are a priori unknown,
their matched filter analysis with GW detector networks requires models of the emitted GWs that smoothly
vary as a function of binary parameters, rather than solutions at isolated points as provided by NR
simulations.
It is important to emphasize that waveform models need to be accurate and computationally efficient to
use them in detection and parameter estimation pipelines. Good accuracy in terms of the overlap, the
normalized cross-correlation maximized over time and phase shifts, between a waveform model and the
most accurate waveforms available (usually NR simulations, or NR simulations stitched together with
inspiral waveforms) is crucial to avoid missing events in searches or mis-measuring binary parameters. On
the other hand, waveform models need to be fast to evaluate since searches and parameter estimation require
tens to hundreds of millions of waveform evaluations and the cost rises for lower mass binaries with
hundreds of GW cycles in band.
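The overlap defined above, a normalized cross-correlation maximized over time and phase shifts, can be sketched with FFTs. The example below makes simplifying assumptions that are not part of the original text: the noise is taken to be white (a flat power spectral density, so no frequency weighting is applied), time shifts are circular, and the "waveforms" are toy sine-Gaussians rather than CBC signals.

```python
import numpy as np

def overlap(h1, h2):
    """Normalized cross-correlation maximized over time shift and constant phase.

    Assumes white noise (flat PSD) and circular time shifts; with a real
    detector PSD, each frequency bin would be weighted by 1 / S_n(f).
    """
    n = len(h1)
    H1 = np.fft.rfft(h1)
    H2 = np.fft.rfft(h2)
    # Keep positive frequencies only, so the inverse FFT gives a complex
    # correlation whose modulus is already maximized over a phase shift.
    cross = np.zeros(n, dtype=complex)
    cross[:H1.size] = H1 * np.conj(H2)
    z = n * np.fft.ifft(cross)          # complex correlation versus time shift
    norm = np.sqrt(np.sum(np.abs(H1) ** 2) * np.sum(np.abs(H2) ** 2))
    return np.max(np.abs(z)) / norm     # maximize over time shift

# Toy waveforms (assumed): Gaussian-windowed sinusoids.
fs = 1024.0
t = np.arange(-0.5, 0.5, 1.0 / fs)
envelope = np.exp(-t ** 2 / (2.0 * 0.05 ** 2))
h_cos = envelope * np.cos(2.0 * np.pi * 50.0 * t)
h_sin = envelope * np.sin(2.0 * np.pi * 50.0 * t)     # same signal, phase shifted
h_other = envelope * np.cos(2.0 * np.pi * 80.0 * t)   # different frequency

print(overlap(h_cos, np.roll(h_cos, 100)))  # time-shifted copy
print(overlap(h_cos, h_sin))                # phase-shifted copy
print(overlap(h_cos, h_other))              # mismatched waveform
```

The first two values are close to 1 because a time or phase shift is absorbed by the maximization, while the third is close to 0 because the two toy signals occupy different frequency bands.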
The coalescence of a binary system starts with an inspiral phase corresponding to a large separation of the
two objects where the GWs emitted can be computed with the post-Newtonian (PN) formalism [102]. As the
objects come closer, the orbital velocity becomes comparable with the speed of light and the PN expansion
breaks down. In this strong gravity regime NR simulations are needed to compute the emitted gravitational
radiation during the merging of the objects. If the remnant is a BH, the final object relaxes to a Kerr BH through
the emission of a sum of damped quasi-normal modes [103] in the so-called ‘ringdown’ phase.
LIGO and Virgo rely on approximate solutions that are traditionally obtained through the
effective-one-body (EOB) or phenomenological modeling approaches. In the EOB formalism [104,105] the
PN inspiral information is re-summed and calibrated to NR data, and the merger-ringdown part is obtained
from a phenomenological fit to NR data. EOB models [104–119] first compute the orbital dynamics by solving a
complex system of ordinary differential equations and then obtain the waveform as a second step.
Phenomenological models [120–130] provide full gravitational waveforms by tuning an extended PN
inspiral expansion and separate analytical functions for the merger and ringdown waveforms to NR
simulations that have been combined with PN or EOB inspiral waveforms, and then smoothly combining
these three regions. Surrogate or reduced order models of NR or EOB waveforms [115,131–141] can
significantly accelerate the evaluation of these waveforms while keeping high accuracy. They have become indispensable tools
for GW data analysis over the past several years. Surrogate models fit or interpolate decomposed waveform
data pieces over the binary parameter space and also compress the waveform in time or frequency.
The above modeling approaches rely mostly on traditional polynomial interpolation or fitting techniques
for model construction. In the past few years, more advanced ML techniques have started to be explored for
model building. Gaussian process regression (GPR) can be considered as a generalization of a multivariate
normal distribution and allows one not only to build a model of input data, but also to obtain a
measure of the uncertainty in the model prediction. GPR enables us to infer the waveform at points of the parameter
space not covered by numerical relativity by assuming a joint Gaussian distribution between the known and
predicted values, then computing the posterior probability of the predicted parameters conditioned on the
numerical relativity training data. Along with the mean, the GPR can also compute the variance of the
waveform at a given point in parameter space. This variance will be low in regions supported by training
data, but high when no numerical relativity training data is available in the neighborhood of the evaluation
point and especially so when the GPR is used for extrapolation. See figure 2.2 right panel in [142] for a
1-dimensional example of how a GPR depends on training data. GPR fits have been used to build surrogate
models of non-precessing BBH systems with iterative refinement [138], as well as surrogates of precessing
binary NR waveforms [140,143]. Including the waveform uncertainty in models makes it
possible to marginalize over this uncertainty in Bayesian estimates of the source
parameters [144,145].
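The behavior of the GPR mean and variance described above can be reproduced with a minimal one-dimensional sketch. The squared-exponential kernel, its hyperparameters, and the training data (a sine function standing in for some waveform quantity sampled at a few mass ratios) are assumptions for illustration, not choices taken from the cited works.

```python
import numpy as np

def rbf_kernel(xa, xb, length=0.3, amp=1.0):
    """Squared-exponential covariance between two sets of 1D points."""
    d = xa[:, None] - xb[None, :]
    return amp ** 2 * np.exp(-0.5 * (d / length) ** 2)

def gpr_predict(x_train, y_train, x_test, noise=1e-3, length=0.3, amp=1.0):
    """Posterior mean and variance of a zero-mean GP conditioned on training data."""
    K = rbf_kernel(x_train, x_train, length, amp) + noise ** 2 * np.eye(x_train.size)
    Ks = rbf_kernel(x_test, x_train, length, amp)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y via two triangular solves.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = amp ** 2 - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

# Hypothetical training set: a smooth quantity known at five mass ratios q.
q_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y_train = np.sin(q_train)

# Evaluate at a training point, between training points, and far outside.
q_test = np.array([2.0, 2.25, 6.0])
mean, var = gpr_predict(q_train, y_train, q_test)
print(mean, var)
```

The predicted variance is smallest at the training point, larger between training points, and approaches the prior variance at q = 6, where the model is extrapolating.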
A comparative study of regression methods used for fitting or interpolating waveform data in waveform
models, covering traditional polynomial methods, GPRs, and, for the first time, artificial neural
networks (ANNs), has been carried out recently [146]. The increasing use of ML in GW source modeling motivated a
study comparing the performance of ML approaches against traditional regression methods. In [147], the
authors carefully investigated the accuracy, training, and execution time of ML methods (GPR and ANN)
against linear interpolation, radial basis functions, tensor product interpolation, polynomial fits, and greedy
multivariate polynomial fit. This study addressed the question of whether a more sophisticated and
complicated method is necessary to build a gravitational waveform model for aligned-spin and precessing
binaries. They concluded that sophisticated regression methods, especially ML, are not necessarily needed in
standard GW modeling applications, although ML techniques might be more suitable for problems with
higher complexity, like fully spinning black holes.
A key input in the design of waveform models is the mass and spin of the remnant BHs, entirely
predicted from the initial BHs parameters by general relativity and computed in numerical simulations. The
parameters of the remnants from non-precessing binary systems are traditionally determined with explicit
fits to numerical relativity results [148,149], a procedure that has been extended to determine the final spin
magnitude for precessing systems [150]. The description of fully spinning BHs however requires a larger
dimensional space, where ML methods are particularly suited to capture the complex relationship between
the initial and final parameters. For this reason, the determination of remnant BHs mass, spin and recoil
velocity for precessing systems has been performed independently with GPR and deep neural networks
(DNN), both methods showing increased accuracy compared to existing fits [151,152].
GWs from the remnants of BNS mergers can be used to place constraints on the neutron star equation of
state. However, the NR simulations that are currently used to model the merger and post-merger stage of
such GW signals are computationally expensive to generate and of limited accuracy, thereby restricting the
ability to use them to perform parameter estimation of candidate GW detections. A hierarchical ML
algorithm trained on NR simulations was developed in reference [153] that can quickly generate
gravitational waveform amplitude spectra with mean overlaps of 0.95, and can also be used to place
constraints on the quadrupolar tidal deformability given sufficient SNR in the post-merger signal.
4. Gravitational-wave signal searches
Searches for GW signals in ground-based detector data are typically split into four different types, depending
on their search strategy. The first type includes CBCs. BBH, BNS and NSBH are the most common type of
sources and the only type of source detected to date. The second type, which we refer to as GW ‘bursts’,
includes transient sources with an unknown or partially modeled signal morphology, as for example, the
ones produced by CCSNe. The third source type includes long-duration, continuous GWs that may be
produced by an individual rotating neutron star. The fourth type is a stochastic background of GWs, which
could consist of remnant GWs from the Big Bang or distant unresolved CBCs. In this section, we review ML
approaches that have been developed to enhance LIGO and Virgo searches for these different types of GW
signals.
4.1. CBC searches
Matched filtering [19] is a common approach to search for sources from CBCs in LIGO and Virgo data [6].
Matched filtering works by first taking as input the raw calibrated strain data. Then a template bank of
waveforms is generated spanning a large astrophysical parameter space. The matched filter searches then
produce a list of GW triggers by cross-correlating the GW data with the waveforms in the template bank
divided by the interferometer’s spectral noise density. Signal consistency tests are done on triggers above a
certain SNR threshold generated from the matched filtering algorithm in order to determine the
time-frequency distribution of power in a trigger. This distribution is compared to what would nominally be
expected from the power in the matched filtering waveform. The comparison is done by splitting the
template up into a number of frequency bins, such that each bin contributes an equal amount of power to
the total matched filter SNR. An additional statistic is then constructed to compare the expected to the
measured power in each bin [154].
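The signal consistency test described above can be sketched as follows. This is an illustrative toy,
not the production search code: it assumes white noise (so the noise weighting is dropped), evaluates
the filter at zero time lag only, uses a toy linear chirp as the template, and leaves out the overall
noise normalization. The function names are hypothetical. Because discrete frequency bins cannot be
split into exactly equal-power bands, the sketch compares each band with its actual expected power
fraction q_j.

```python
import numpy as np

def equal_power_bands(h_f, p):
    """Split frequency bins into p contiguous bands of nearly equal template power."""
    cum = np.cumsum(np.abs(h_f) ** 2)
    cum = cum / cum[-1]
    edges = np.searchsorted(cum, np.arange(1, p) / p)
    return np.split(np.arange(len(h_f)), edges + 1)

def chi_squared(data_f, template_f, p=8):
    """Return (chi2, |z|) for a zero-lag matched filter, assuming white noise."""
    h_f = template_f / np.sqrt(np.sum(np.abs(template_f) ** 2))  # unit-norm template
    bands = equal_power_bands(h_f, p)
    # Matched filter output accumulated separately in each band
    z_bins = np.array([np.sum(data_f[b] * np.conj(h_f[b])) for b in bands])
    # Fraction of the template power expected in each band
    q = np.array([np.sum(np.abs(h_f[b]) ** 2) for b in bands])
    z = z_bins.sum()
    chi2 = np.sum(np.abs(z_bins - q * z) ** 2 / q)
    return chi2, np.abs(z)

n = 4096
t = np.arange(n) / n
template = np.sin(2 * np.pi * (50.0 * t + 0.5 * 450.0 * t ** 2))  # toy chirp, 50 to 500 Hz
glitch = np.zeros(n)
glitch[n // 3] = 1.0  # toy delta-function glitch

template_f = np.fft.rfft(template)
chi2_signal, snr_signal = chi_squared(10.0 * template_f, template_f)
chi2_glitch, snr_glitch = chi_squared(np.fft.rfft(glitch), template_f)
print(chi2_signal, chi2_glitch)
```

Data that match the template distribute their power across the bands as expected and give a
chi-squared close to zero, whereas the glitch concentrates its matched filter output in a single band
and gives a much larger value.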
Recently, there have been developments in exploring how random forest ML algorithms may be used as
an alternative to standard CBC signal detection techniques. Using a ranking statistic derived from a random
forest of bagged decision trees, instead of the usual chi-squared statistic, it was reported that sensitivity
improvements were achieved on the order of 70 ± 13% to 109 ± 12% compared to matched filtering [155]. This
study was carried out as an intermediate mass BH and a stellar-mass BBH search (25 M⊙). In
reference [156], the random forest classifier was instead trained/tested on simulated single-detector NSBH
events and the authors use hand-crafted features derived from a bank of inspiral templates as input. The
classifier was able to detect 1.5–2 times as many signals at low false positive rates as standard matched filter
detection techniques using the ‘reweighted SNR’ statistic, and it
does not require the chi-squared test to be computed. The results from both of the studies discussed above
show that random forest ML approaches have the potential to produce higher detection efficiencies for CBC
GW sources. This is because the random forest algorithm is better at rejecting transient noise artefacts that
produce chi-squared values typical of signals.
Other promising approaches to CBC signal identification have been proposed by several groups by
applying deep learning-based methods [34–37]. CNN algorithms have been applied in multiple studies
towards the search for GWs from CBC signals. CNNs were applied to simulated BBH signals and real LIGO
events in non-Gaussian noise, as well as non-Gaussian noise alone in an attempt to classify such signals
[157]. This analysis reported that CNNs perform as well as the standard matched filter method in extracting
BBH signals under non-ideal conditions. CNNs and matched filtering were also compared in Gaussian noise,
where matched filtering is the optimal filter [36], and the CNNs were able to match the efficiency of a matched filter
analysis. These studies address the fundamental question of the feasibility of deep learning application to
CBC GW searches [36,37].
The authors of [158] point out that CNN methods for CBC searches cannot provide an accurate measure of the
detection statistical significance. The CNNs are efficient at making detections, but they do not provide us
with a measure of how confident we are that a detection is real, whereas matched filter searches tell us how
many sigmas above the noise level the detection is. The latter can be achieved in matched filter searches by
sliding the data of one detector in time by an amount which is larger than the typical GW travel time
between detectors. A measure of coincident events after the time slides is used to estimate the search
background and the significance of detection candidates. To take current ML-based CBC searches from
proof-of-principle studies to production search codes, CNN algorithms will need to implement an accurate
statistical measure of the background; this is an active area for future development.
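The time-slide background estimate described above can be sketched in a few lines. This is a
simplified illustration, not a production pipeline: the function names, the coincidence window, the
5 s shift, and the uniformly distributed toy noise triggers are all assumptions, and the circular
shift is one common convention for keeping the analysed live time fixed.

```python
import numpy as np

def count_coincidences(t1, t2, window):
    """Number of triggers in t1 with at least one trigger in t2 within +/- window."""
    t2 = np.sort(t2)
    idx = np.searchsorted(t2, t1)
    left = np.abs(t1 - t2[np.clip(idx - 1, 0, len(t2) - 1)])
    right = np.abs(t1 - t2[np.clip(idx, 0, len(t2) - 1)])
    return int(np.sum(np.minimum(left, right) <= window))

def time_slide_background(t1, t2, duration, window, shift, n_slides):
    """Coincidence counts after sliding detector 2 by k * shift, for k = 1..n_slides."""
    counts = []
    for k in range(1, n_slides + 1):
        # Shifts much larger than the GW travel time remove any real coincidences
        shifted = (t2 + k * shift) % duration
        counts.append(count_coincidences(t1, shifted, window))
    return np.array(counts)

rng = np.random.default_rng(0)
duration = 10_000.0  # seconds of toy data
det1 = np.sort(rng.uniform(0.0, duration, 500))  # toy noise trigger times, detector 1
det2 = np.sort(rng.uniform(0.0, duration, 500))  # toy noise trigger times, detector 2

background = time_slide_background(det1, det2, duration, window=0.015, shift=5.0, n_slides=100)
zero_lag = count_coincidences(det1, det2, window=0.015)
print("zero-lag coincidences:", zero_lag)
print("mean background per slide:", background.mean())
```

Comparing the zero-lag count (or the ranking statistic of each zero-lag candidate) with the
distribution obtained from the slides gives the false alarm rate on which the significance is based.
A CNN-based search would need an equivalent procedure applied to its own output statistic.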
4.2. Burst searches
A GW burst is a short duration signal with an unknown or partially modeled waveform morphology due to
complicated or unknown astrophysics. Potential sources of GW bursts are CCSNe [159], pulsar glitches
[160], NSs collapsing into BHs [161], cosmic string cusps [162], and many others. The uncertainties in signal
morphology make it difficult to produce training sets for GW burst signals.
One of the standard LIGO-Virgo GW burst search methods is the coherent Wave Burst (cWB) pipeline
[21,22]. The cWB algorithm uses a Wilson–Daubechies–Meyer wavelet transform to measure the excess
power in the time-frequency domain that occurs coherently between multiple detectors. A maximum
likelihood approach is then applied to determine the probability of a signal being present in the data and
produce a list of GW detection candidate triggers. This model-independent method is well-suited for
searching for signals with unknown morphology. The sensitivities of burst searches are generally more
affected by short-duration glitches that may occur coincident in time between multiple detectors. No
detections other than CBC sources have been reported by cWB unmodeled searches to date [163].
Currently there are no published ML searches for generic GW bursts. However, there have been ML
searches for specific burst source types. In reference [164], the authors employ a neural network algorithm to
reduce the impact of glitches on the cWB burst search and increase the significance of the CBC signals which
are detected by the pipeline.
In recent years, multi-dimensional simulations of CCSNe have produced a selection of GW signal
predictions from CCSN explosions [165–168]. However, some of these simulations include approximations
of the required input physics that may result in artificial changes in the GW emission. Moreover, many
simulations are ended before the peak GW emission time due to lack of computational resources. Despite
these issues, common features in the time-frequency GW emission have recently been identified by various
CCSN simulation codes. This has allowed researchers to produce approximate models for a wider range of
the CCSN parameter space than can be explored with full CCSNe multi-dimensional simulations [169].
These approximate models allow us to explore supervised ML techniques for CCSN searches.
In reference [170], the authors apply a CNN to searches for GW bursts from CCSNe in a three-detector
network. Training and testing are performed with a parametrized phenomenological waveform model which
is designed to match the most common features predicted by CCSN simulations. The method uses 100
different parameterizations of the phenomenological model. cWB pre-processing data is used to prepare
images which are fed into the CNN. Red-green-blue (RGB) images are produced to determine the number of
detectors where a signal is present: Red (R) for LIGO Hanford, green (G) for LIGO Livingston and blue (B)
for Virgo. The algorithm is shown to improve the efficiency of cWB in its standard configuration.
A technique to reduce the background of searches for galactic CCSNe in single-interferometer
configurations was developed in reference [33]. The method is based on a supervised evolutionary
algorithm, Genetic Programming (GP). The procedure assumes that the event time and the distance of the
CCSN are known from neutrino and optical observations. The GP algorithm is first trained on off-source
data to produce a multivariate expression of the trigger features which is used as a cut to lower the search
background. The multivariate expression can then be applied to on-source windows around a GW event
candidate to increase the detection confidence. The effectiveness of the method was tested by injecting the set
of waveforms used in the latest LIGO-Virgo CCSN search into 1.47 days of O1 data. The features are
extracted from the standard cWB pipeline. The GP algorithm is then used to classify the triggers and remove
the noise triggers. The outcome of the procedure is an increased statistical significance of GW candidate
triggers that leads to a reduction of the SNR needed for a detection at 3σ confidence level by a factor of ~3.
In [171], the authors train a CNN on waveforms obtained by 3D simulations of neutrino-driven CCSNe,
phenomenological sine-Gaussian waveforms, and scattered light glitches. A wavelet detection filter (WDF)
[53] with time-domain whitening [172] is used to extract GPS triggers from the simulated Virgo and
Einstein Telescope noise backgrounds. Whitened time series and spectrograms are used as input of 1D and
2D CNN algorithms to classify signal and noise classes. The method is then tested on CCSN models removed
from the training sets. In a multi-label classification scheme, the CNN is capable of distinguishing among the
individual CCSN and glitch classes at different SNRs. In reference [173], the authors train a CNN algorithm
on the time series of CCSN waveforms and classify signals with two different explosion mechanisms.
4.3. Continuous wave searches
Narrow-band continuous GWs (CWs) from spinning deformed neutron stars [174] have not yet been
observed in data recorded by LIGO and Virgo. Although the expected waveform signatures for this type of
signal are well known, their small amplitude makes them extremely hard to detect, and any search needs to
process long stretches of data. As the number of parameter space points to search grows with the observation
time, the sensitivity of current searches is limited by the available computing power. Several established
search methods exist with different trade-offs between sensitivity, robustness and computational efficiency.
For searches covering wide parameter spaces, analysis pipelines usually follow a hierarchical approach [175]
with multiple stages. The first stage evaluates a detection statistic over a dense grid of candidate waveforms.
Additional stages often include clustering of candidates to reduce computational waste, vetoing of known
types of instrumental artifacts, and follow-up with increased resolution and/or observation time. See
reference [176] for a recent review.
The huge computational cost of conventional CW searches makes ML approaches a promising alternative
as they are fast at searching once training has been completed. In addition, many current follow-up methods
also involve manual tuning steps, for which extensive simulation campaigns must be repeated on each new
data set and search setup [e.g. 177,179]. Robust and flexible ML solutions would reduce this additional
human and computational effort.
At this point, several teams have started to explore two avenues of using ML methods in CW searches:
(i) using ML algorithms as drop-in replacements for parts of a CW analysis pipeline that is still based on
traditional grid search methods for the initial search stage; (ii) full ML analysis of the initial strain data.
Among the first category, reference [179] presents an application of deep learning in the classification of
CW signal candidates. A conventional initial search, based on the F-statistic matched filter
pipeline—TD-Fstat search [180] (see the documentation in [181]), generates a large number of
multi-dimensional candidate signal distributions that have to be further analyzed. Then 1D and 2D versions
of a CNN classifier are implemented, trained, and tested on a broad range of signal frequencies,
corresponding to the reference frequency of the narrow-banded data. The training set contains Gaussian
noise, simulated CWs from spinning triaxial-ellipsoid neutron stars, and stationary lines mimicking detector
artifacts [182]. The authors of [179] demonstrate that these CNNs correctly classify the instances of data at
various SNRs and frequencies, while also showing concept generalization, i.e. satisfactory performance at
frequencies the networks were not trained on.
For another F-statistic search pipeline, the hierarchical semi-coherent analysis [183–185] running on the
distributed computing Einstein@Home project [186], reference [187] employs a deep learning replacement
for traditional clustering methods [188–190]. Clustering has the purpose of reducing the number of
candidates from the initial search stage, enabling deep follow-up [178,189] at acceptable computational cost,
thus improving the sensitivity of the overall pipeline. The authors of [187] train a region-based CNN [R-CNN,
178,192] on real output of an Einstein@Home search [193] on Advanced LIGO O1 data. They demonstrate
a high detection efficiency at low false alarm rate for sufficiently strong signals, and investigate the scaling of
this performance with signal strength. They identify the R-CNN’s brittle response to instrumental
disturbances in the data as both a problem and an opportunity; since in the current implementation it
already distinguishes these from normal noise, they expect that a more complete classification into three
classes of normal noise, disturbances and signals can be pursued.
In a more radical approach than the works above, the authors of reference [194] apply a DNN to search
for CWs from unknown spinning neutron stars on the raw time-series data. A CNN is trained on Gaussian
noise with signal injections and compared to a matched filter search [195]. The analysis shows that the
method is competitive with matched filtering, at least under these idealized noise conditions and when using
data spans of limited duration. Thus, they provide the first demonstration of a full-ML search for CWs
without prior input from a traditional first-stage search. However, the authors themselves consider this as
only a ‘first proof-of-principle’ and point out several steps required for developing it into a mature analysis
pipeline, including the use of data from multiple GW detectors and dealing with non-Gaussian real detector
data, including the pervasive narrow spectral artifacts [182] that have long been a challenge to traditional
CW search methods [196,197]. More recently, this approach has been expanded on in reference [198].
Besides CNNs and other neural networks, several other non-traditional approaches to CW searches have
been recently explored. One successful application is the hidden Markov model tracking CW search
method [199–202], where the true intrinsic emission frequency of a GW source is treated as a hidden
variable tracked by signal frequencies in the detector data, thus allowing for deviations from a simple signal
model or even for intrinsic frequency fluctuations (e.g. spin wandering in binary sources and pulsar timing
noise). The most likely time-frequency tracks are found with the Viterbi algorithm [203]. This method is
different from most algorithms usually considered as ML, for example in that it does not require a separate
training stage, but is nevertheless inspired by work in the computer and information science fields and
provides another example for fruitful imports into GW science. An independent Viterbi-based search has
also been developed in reference [204].
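The idea of tracking a wandering frequency with the Viterbi algorithm can be sketched as below. This
is a generic dynamic-programming illustration rather than the implementation of the cited searches:
the function name, the restriction to jumps of at most one frequency bin per time step with equal
transition probability, and the use of a toy array as the per-bin log-likelihood are all assumptions.

```python
import numpy as np

def viterbi_track(log_like, max_jump=1):
    """Most likely frequency-bin path through log_like[t, f] with |jump| <= max_jump per step."""
    n_t, n_f = log_like.shape
    score = log_like[0].copy()
    back = np.zeros((n_t, n_f), dtype=int)
    for t in range(1, n_t):
        new_score = np.full(n_f, -np.inf)
        for jump in range(-max_jump, max_jump + 1):
            # Candidate predecessor of each bin f is bin f - jump
            prev = np.arange(n_f) - jump
            valid = (prev >= 0) & (prev < n_f)
            cand = np.full(n_f, -np.inf)
            cand[valid] = score[prev[valid]]
            better = cand > new_score
            new_score[better] = cand[better]
            back[t, better] = prev[better]
        score = new_score + log_like[t]
    # Backtrack from the best final bin
    path = np.empty(n_t, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(n_t - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy spectrogram: Gaussian noise plus a slowly wandering track
rng = np.random.default_rng(0)
true_track = np.array([5, 5, 6, 7, 7, 6, 5, 5, 4, 3])
spectrogram = rng.normal(0.0, 1.0, size=(len(true_track), 12))
spectrogram[np.arange(len(true_track)), true_track] += 10.0

recovered = viterbi_track(spectrogram)
print(recovered)
```

No training stage is involved: the hidden frequency path is recovered directly from the data given
the assumed transition rule, which is what distinguishes this approach from the neural network
methods discussed above.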
Methods originally developed for CW searches have also proved to be useful for long-duration transient
GWs [205,206]. After the detection of GW170817, several CW search methods have been adapted to search
for transient GW signals from long-lived BNS merger remnants [202,205,207,208]. In this scenario, as for
the shorter-duration post-merger signals briefly discussed above in section 3, waveforms are not well known,
and hence these traditional template-based search methods are limited in robustness to deviations from their
assumed models. The authors of reference [209] have developed a CNN analysis for BNS post-merger signals
lasting O(hours-days). They characterized the CNN approach, answering questions like: (i) how to train
them, (ii) how many samples to train on, (iii) robustness to signals on which the networks were not trained,
and (iv) the effect of different architectures on detection efficiency and false alarm probability.
They also applied their CNN in a real search on the week of LIGO data after GW170817, producing
upper limits on possible GW emission similar to those in [205].
Other research areas where conventional CW methods reach their limits in terms of (i) robustness of the
underlying signal model, (ii) computational efficiency, (iii) adaptability to new situations with reasonable
human time investment, include very young sources with rapid frequency evolution (similar to the BNS
post-merger case), glitching pulsars [206], and GW emission from neutron stars in binaries [e.g. 210–212].
Deep learning and other ML methods, or simple yet innovative methods imported from the computational
science community, like the Viterbi algorithm [199–204], can lead to big steps towards first detections of
these elusive GW signal types.
4.4. Stochastic background
A stochastic GW background may consist of many CBC sources that are too distant to be individually
resolved or remnant GWs from the Big Bang [213]. Typical searches for a Gaussian stochastic background
apply the cross-correlation method described in reference [23]. For non-Gaussian stochastic backgrounds
from stellar mass BBHs, a Bayesian nested sampling method can be implemented for optimal search
sensitivity [214]. In section 5 we describe how ML can be used to improve the speed and accuracy of this
method. A Gaussian mixture model, which is a superposition of independent Gaussian distributions, is used
to predict the discovery time of the GW stochastic background in reference [214].
5. Astrophysical interpretation of gravitational-wave sources
5.1. Parameter estimation
To understand the astrophysics behind sources of GWs, it is essential to accurately measure the parameters of
the source. For CBC signals, this is currently achieved using the LALInference [91], Bilby [92], PyCBC
Inference [215], or RIFT [216] tools designed for Bayesian parameter estimation and model selection. A
Bayesian framework allows us to calculate posterior probability density functions for the parameters of GW
signals. It also allows us to calculate the evidence for different models which can be used for model selection.
Computing the Bayesian evidence is costly. In the case of CBC signals, this is due to the high number of
signal parameters (~15), the process of generating waveforms, and the SNR and length of the GW data being
analyzed. In LALInference, the computational issues are addressed using either nested sampling [217] or
Markov Chain Monte Carlo (MCMC) techniques [218].
To estimate a posterior distribution, MCMC techniques work by stochastically wandering through a
parameter space, drawing samples with a density proportional to that of the posterior. The
LALInference implementation of MCMC uses the Metropolis-Hastings algorithm that requires a proposal
density function to generate a new sample that can only depend on the current sample. The efficiency
depends on the choice of the proposal density function.
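A minimal sketch of the Metropolis-Hastings step is given below. It is not the LALInference
implementation: the one-dimensional Gaussian toy posterior, the symmetric Gaussian proposal, and the
function names are assumptions made for illustration.

```python
import math
import random

def metropolis_hastings(log_post, x0, step, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n_samples):
        # The proposal depends only on the current sample
        x_new = x + rng.gauss(0.0, step)
        lp_new = log_post(x_new)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1]
        if math.log(1.0 - rng.random()) < lp_new - lp:
            x, lp = x_new, lp_new
        chain.append(x)
    return chain

def log_posterior(x):
    """Toy unnormalized log-posterior: Gaussian with mean 1.0 and standard deviation 0.5."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2

chain = metropolis_hastings(log_posterior, x0=0.0, step=0.5, n_samples=20000)
burned = chain[2000:]  # discard the initial burn-in samples
mean = sum(burned) / len(burned)
std = math.sqrt(sum((x - mean) ** 2 for x in burned) / len(burned))
print(mean, std)
```

The `step` parameter plays the role of the proposal density width: if it is too small the chain
explores slowly, and if it is too large most proposals are rejected, which is the efficiency
dependence noted above.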
Nested sampling is used to calculate the Bayesian evidence and can also produce posterior distributions
for the signal parameters. Nested sampling transforms the multi-dimensional evidence integral into a one
dimensional integral over the prior volume. First a set of live points is distributed over the entire prior. The
point with the lowest likelihood is then removed and replaced with a new point of higher likelihood, and this
step is repeated until some stopping condition is reached. Posterior samples can then be produced by re-sampling
the chain of removed points and current live points according to their posterior probabilities. This method
can take days to weeks to measure the parameters of a GW signal. Speeding up this process becomes more
important as GW detectors become more sensitive and more signals need to be analysed.
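The nested sampling loop can be illustrated with a toy one-dimensional problem. This sketch makes
several simplifying assumptions that are not in the text: a uniform prior on [-5, 5], a unit Gaussian
likelihood, replacement points drawn by simple rejection sampling from the prior, a fixed number of
iterations as the stopping condition, and the standard deterministic approximation X_i = exp(-i/N)
for the remaining prior volume. Real samplers replace the rejection step with far more efficient
schemes.

```python
import math
import random

def nested_sampling(log_like, prior_draw, n_live=200, n_iter=1400, seed=1):
    """Toy nested sampling; returns (log evidence, list of (dead point, weight))."""
    rng = random.Random(seed)
    live = [prior_draw(rng) for _ in range(n_live)]
    live_logl = [log_like(x) for x in live]
    evidence = 0.0
    x_prev = 1.0  # remaining prior volume
    dead = []
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        x_i = math.exp(-i / n_live)  # expected prior volume after i removals
        weight = math.exp(logl_min) * (x_prev - x_i)
        evidence += weight
        dead.append((live[worst], weight))
        x_prev = x_i
        # Replace the worst point with a prior draw of higher likelihood
        while True:
            cand = prior_draw(rng)
            cand_logl = log_like(cand)
            if cand_logl > logl_min:
                break
        live[worst], live_logl[worst] = cand, cand_logl
    # Add the contribution of the remaining live points
    evidence += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return math.log(evidence), dead

def log_likelihood(x):
    """Unit Gaussian likelihood (toy assumption)."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def draw_from_prior(rng):
    """Uniform prior on [-5, 5] (toy assumption)."""
    return rng.uniform(-5.0, 5.0)

log_z, dead_points = nested_sampling(log_likelihood, draw_from_prior)
print("estimated log evidence:", log_z)
print("analytic value (approx.):", math.log(0.1))
```

Posterior samples follow by resampling the removed points in proportion to their weights. The
rejection step becomes more expensive as the prior volume shrinks, which gives a flavour of why
realistic analyses with about 15 parameters take so long.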
Some efforts at speeding up GW parameter estimation with ML have already been implemented in LIGO
and Virgo data analysis codes. RIFT is a Gaussian process/random forest regression algorithm that
decomposes the high-dimensional likelihood into low-cost and costly degrees of freedom, then performs full
Bayesian inference hierarchically, first by efficiently marginalizing over the low-cost degrees of freedom on a
sparse sample of fixed high-cost parameters, then interpolating this marginal likelihood over the expensive
degrees of freedom. By efficiently parallelizing, RIFT can produce inferences of compact binary parameters at
a much faster speed [219]. MULTINEST [220] is a nested sampling technique that uses live points that are
enclosed within a set of ellipsoids. BAMBI [38] incorporates the nested sampling in MULTINEST and combines it
with a neural network to learn the likelihood function on the fly. MULTINEST obtains a number of new
samples from the likelihood that BAMBI then uses to train the network to learn the likelihood function. The
input to the neural network is the parameter values for the model and the likelihood value computed at that
point in the parameter space. It has been shown that BAMBI can produce results comparable to the full nested
sampling techniques at a much faster speed, typically the calculation of each likelihood point is reduced from
seconds to milliseconds resulting in up to a factor of ~100 speed up for the analysis of a BNS signal.
In [40], a conditional variational autoencoder (CVAE) is used in order to approximate the Bayesian
posterior given a GW time series. A standard CVAE is typically made up of two neural networks (an encoder
and a decoder) and is conditioned on some property of the training data (typically the class). A standard
CVAE is trained to reconstruct the input that it has been given while marginalizing over a latent space. If the
CVAE has been conditioned on class, once trained the user may on-command generate samples from a
specified class. The modified CVAE used in [40] is derived by starting from the assumption that we would
like to minimize the cross-entropy between the true Bayesian posterior and an approximate Bayesian
posterior parameterized by a neural network. After some further derivations, one arrives at a network
configuration consisting of two encoder networks and one decoder network which, when given a GW time
series after training, will reproduce the complete n-dimensional Bayesian posterior in less than a second. Nine
compact BBH parameters are inferred (m1, m2, dL, t0, ϕ, ψ, RA, DEC, inclination angle), where the phase (ϕ)
and ψ have been internally marginalized out and all remaining parameters are fixed. Other work has also
been carried out in parallel by [41] using a multivariate Gaussian posterior model to produce 1- and
2-dimensional posteriors. This approach also utilizes reduced order modeling [221] in order to simplify the
input GW time series to the network. Once trained on a generated set of waveforms with many different
noise realizations, it is then easy to sample from the model with comparable latency to [40] (~1–2 ms).
In addition, reference [222] uses a different method, known as normalizing flows, to produce GW
posteriors comparable to those in references [39,41] in less than 2 s. A normalizing flow is a series of
invertible transformations that can be used to transform a simple initial distribution (in this case, a
multivariate Gaussian) into a more complex target distribution. Normalizing flows are usually realized as
neural networks, meaning that the parameters of the transformations are learned through training. A
masked autoregressive flow (MAF) network is a normalizing flow where the transformation component is described
by an autoregressive network. Autoregressive refers to the property that any given input to a layer within the
network is conditioned on both the current input and previous inputs. ‘Masked’ refers to the structure of the
MAF layers, whereby certain connections are dropped (set to zero) in order to preserve the autoregressive
properties of the network (which makes the network a normalizing flow). The authors then compare three
different approaches: a conditional variational autoencoder (similar to the network used in [39]), a masked
autoregressive flow (MAF) [223], and a CVAE with autoregressive flows. The results from both the single
MAF and the CVAE networks are comparable, while the combined CVAE and MAF model produces the
best result.
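The change-of-variables idea behind a normalizing flow can be shown with a single affine
autoregressive transformation in two dimensions. This is only a sketch: in the cited work the shift
and scale functions are masked neural networks whose parameters are learned, whereas here they are
fixed by hand (hypothetical `shift_fn` and `log_scale_fn`) so that the result can be checked against
an analytic bivariate Gaussian.

```python
import numpy as np

def flow_forward(z, shift_fn, log_scale_fn):
    """Map base samples z ~ N(0, I) to x; x2 is conditioned on x1 (autoregressive)."""
    x1 = z[:, 0]
    x2 = shift_fn(x1) + np.exp(log_scale_fn(x1)) * z[:, 1]
    return np.stack([x1, x2], axis=1)

def flow_log_prob(x, shift_fn, log_scale_fn):
    """Log-density of x from the inverse transformation and its Jacobian."""
    x1, x2 = x[:, 0], x[:, 1]
    log_s = log_scale_fn(x1)
    z1 = x1
    z2 = (x2 - shift_fn(x1)) * np.exp(-log_s)
    log_base = -0.5 * (z1 ** 2 + z2 ** 2) - np.log(2.0 * np.pi)
    # The Jacobian of z -> x is triangular, so its log-determinant is the sum of log-scales
    return log_base - log_s

# Hand-picked transformation parameters (assumptions for this example)
shift_fn = lambda x1: 0.8 * x1
log_scale_fn = lambda x1: np.full_like(x1, np.log(0.5))

rng = np.random.default_rng(0)
z = rng.standard_normal((1000, 2))
x = flow_forward(z, shift_fn, log_scale_fn)
log_p = flow_log_prob(x, shift_fn, log_scale_fn)
print(log_p[:3])
```

Sampling costs one forward pass and the density costs one inverse pass, which is why a trained flow
can return posterior samples so quickly. Making `shift_fn` and `log_scale_fn` nonlinear, and stacking
several such layers, is what lets a flow represent non-Gaussian posteriors.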
All of the current GW ML parameter estimation studies are still at the proof-of-principle stage, but they
show that ML will be a promising tool for future GW parameter estimation. As detector sensitivity
improves, the number of detections will significantly increase, and one of the advantages in using these ML
techniques will be the speed in measuring the sources' astrophysical parameters with respect to traditional
methods, making it easier to process a large number of GW alerts.
5.2. Low-latency source properties inference
The source property inference of CBC events is one of the three astrophysical data products that the LIGO
and Virgo collaborations provide to the external community, the other two being 3D-sky localization [224]
and source classification [225]. The source-property inference, also known as EM-Bright, consists of two
statistical metrics: (1) the probability that the CBC system contains a neutron star of mass less than 3.0 M⊙,
P(HasNS), and (2) the probability that the final coalesced object is surrounded by tidally-disrupted matter
after the merger, P(HasRemnant). Numerical solutions of the Einstein equations in the presence of matter
provide us with estimates of the amount of tidally-disrupted matter. Several fitting formulae have been
derived to compute this quantity in low-latency [226,227].
In an ideal situation, Bayesian parameter estimation of GW data can provide the posterior probability
distribution of the various source parameters. Cuts based on the maximum neutron star mass can then be
applied to compute P(HasNS). Similarly, the fitting formula for the tidally-disrupted matter can be applied
to the parameter posterior distributions to infer P(HasRemnant). However, the fastest LIGO-Virgo
parameter estimation infrastructures currently available do not meet the low-latency requirements for
electromagnetic (EM) follow-up in x-ray and optical wavelengths, even when including the speed up in
analysis time gained by the techniques described in section 5.1. Getting the most reliable EM-Bright values in
absence of Bayesian posterior samples poses the most important challenge in source properties inference.
Matched filter searches give point estimates of masses and spins that can be used for low-latency estimates
of EM-bright properties. Individual mass and spin components from matched filter searches often have large
errors with respect to true parameter values. During the O2 run, LIGO and Virgo researchers estimated these
errors by implementing a Fisher approximation method to construct an ellipsoidal region of the parameter
space around the matched filter point estimates. However, this method is only effective for high SNR signals
and it ignores the matched filter search biases in parameter measurements. A supervised ML algorithm was
developed to address this issue in reference [228]. The algorithm uses the scikit-learn [229]
KNeighborsClassifier method to classify a source as HasNS or HasRemnant. An input data set is created by
injecting simulated signals into GW data and performing a search. A map between true values and matched
filter search point estimates is then used to train the classifier. Source property inference of GW detection
candidates obtained with this method were provided in low-latency to the astronomy community during O3.
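The classification step can be sketched with the scikit-learn KNeighborsClassifier named above. The
training set here is a toy stand-in, not the injection campaign of reference [228]: the mass ranges,
the 10% multiplicative scatter that mimics the search error, the choice of 11 neighbors, and the use
of only the two component masses as features are all assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n = 2000

# Toy "injected" component masses in solar masses, with m2 <= m1
m1_true = rng.uniform(1.0, 50.0, n)
m2_true = rng.uniform(1.0, m1_true)

# Label from the true parameters: does the lighter object have a mass below 3.0 solar masses?
has_ns = (m2_true < 3.0).astype(int)

# Toy "recovered" point estimates: true values with a 10% scatter (assumed search error)
recovered = np.column_stack([m1_true, m2_true]) * rng.normal(1.0, 0.1, (n, 2))

# Train on the map from recovered point estimates to true labels
clf = KNeighborsClassifier(n_neighbors=11)
clf.fit(recovered, has_ns)

# Probability of HasNS for two hypothetical candidates given only their point estimates
candidates = np.array([[1.4, 1.3], [35.0, 30.0]])
p_has_ns = clf.predict_proba(candidates)[:, 1]
print(p_has_ns)
```

Because the classifier is trained on recovered rather than true values, it absorbs the search biases
that the Fisher approximation ignores, and a prediction needs only a nearest-neighbor lookup, which
suits low-latency use.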
5.3. Rates and populations of gravitational-wave sources
LIGO and Virgo have already detected a small population of CBC sources [5]. The number of detections is
expanding rapidly as the detectors become more sensitive, and within the next few years the population size
will reach into the hundreds, allowing us to perform detailed CBC population studies. In the case of compact
binaries, measuring properties such as their mass and spin distributions could allow us to determine their
formation mechanism [230–236].
Several studies have demonstrated how ML can be applied for population analysis once a large enough
population of CBC sources has been detected. A few of these studies apply unmodeled clustering techniques,
such as Gaussian Mixture Models, to determine if GW detections come from multiple CBC populations
[237–239]. They perform the clustering on the mass and spin measurements of the individual CBC
detections. This method is well-suited as it does not require any prior knowledge of the expected CBC
populations. This allows us to determine how many different populations of sources are present in detections
made by LIGO and Virgo, and what fraction of detected events belong to each population. The masses and
spin distributions of each population can then inform us of the differences in formation of the different
source populations.
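The clustering idea can be sketched with the scikit-learn GaussianMixture class, using the Bayesian information criterion (BIC) to choose the number of components. The two toy sub-populations, the choice of chirp mass and effective spin as features, and the use of BIC for model selection are assumptions for illustration; the studies in [237–239] also account for measurement uncertainties, which this sketch ignores.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Two hypothetical sub-populations in (chirp mass [solar masses], effective spin).
pop_a = np.column_stack([rng.normal(8.0, 1.0, 150), rng.normal(0.0, 0.05, 150)])
pop_b = np.column_stack([rng.normal(30.0, 3.0, 100), rng.normal(0.3, 0.05, 100)])
X = np.vstack([pop_a, pop_b])

# Fit mixtures with 1 to 4 components and keep the one with the lowest BIC.
models = [GaussianMixture(n_components=k, random_state=0).fit(X) for k in range(1, 5)]
bics = [m.bic(X) for m in models]
best = models[int(np.argmin(bics))]

n_populations = best.n_components   # inferred number of populations
fractions = best.weights_           # fraction of events in each population
labels = best.predict(X)            # most likely population for each event
```

The fitted means and covariances of each component then play the role of the per-population mass and spin distributions discussed above.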
One of the current standard methods for population analysis in GWs is Bayesian hierarchical modeling.
This approach is effective when there are trusted known population models, and allows for mixing ratios of
different populations. However, this method can be computationally expensive. In [240], the authors
combine hierarchical Bayesian modeling with a flow-based deep generative network, the same
masked autoregressive flow described in section 5.1. Combining ML with Bayesian hierarchical modeling
enables population analyses that would be too computationally expensive for hierarchical modeling alone.
These modeled and unmodeled ML techniques will be applied to the GW populations detected in
future GW detector observing runs.
5.4. Identification of electromagnetic counterparts to gravitational-wave sources
Detecting EM emission from a GW source can considerably increase the knowledge of the astrophysical
source properties and contribute to the validation of cosmological models. The authors in [241] applied an
Artificial Neural Network (ANN) to localize GW signals from BBH mergers. The input data to the ANN
consists of the arrival time delays, amplitude ratios, and phase differences from multiple GW detectors. For
the purpose of GW source localization, they divided the sky into grids or sectors. The samples were labelled
with the sector number to which they belonged and the ANN was trained to classify them into their correct
sector. It was found, when testing their method with simulated BBH signals in Gaussian Advanced LIGO
noise, that for coarse angular resolution (18–128 sectors), the model is able to classify unseen GW samples
into their correct sector in more than 90% of cases, when a multi-labelling scheme is applied. For finer
angular resolution (1024–8192 sectors), the exact classification accuracy drops. For these cases, the
probability distribution of the sectors, obtained for each GW sample, was used as a ranking statistic to
calculate the areas of 90% probability contours. This method is potentially orders of magnitude faster than
traditional sky location methods, taking around 18 ms to localize a single GW sample. Work is
under way to extend this method to real detector noise and to BNS and NSBH sources.
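The use of the per-sector probabilities as a ranking statistic can be sketched as follows: sectors are sorted by probability and accumulated until the requested credible level is reached. The function name, the assumption of equal-area sectors, and the toy probabilities are illustrative choices and are not taken from [241].

```python
import numpy as np

FULL_SKY_DEG2 = 41252.96  # area of the whole sky in square degrees

def credible_area(sector_probs, level=0.9, sky_area_deg2=FULL_SKY_DEG2):
    """Return the indices and total area of the most probable sectors that
    together contain `level` of the probability (equal-area sectors assumed)."""
    p = np.asarray(sector_probs, dtype=float)
    p = p / p.sum()                      # normalise the classifier output
    order = np.argsort(p)[::-1]          # most probable sector first
    cumulative = np.cumsum(p[order])
    # First position at which the cumulative probability reaches the level.
    n_needed = int(np.searchsorted(cumulative, level)) + 1
    n_needed = min(n_needed, p.size)     # guard against rounding at level ~ 1
    return order[:n_needed], n_needed * sky_area_deg2 / p.size

# Toy network output for a sky divided into five sectors.
probs = np.array([0.02, 0.50, 0.30, 0.03, 0.15])
sectors, area = credible_area(probs)
```

With a real classifier the input would be the softmax output over all sectors for one GW sample, and the returned area would shrink as the network becomes more confident.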
Once the sky map for a GW event is produced, it is sent out to the EM community. However, the optical
and near-infrared transient contamination rate can be prohibitively large. ML techniques are a promising
way to optimise and automate candidate vetting and to significantly reduce the number of optical and near-infrared
transients that must be manually vetted. The expected EM emission of a BNS merger is a short-duration
Gamma-Ray Burst (GRB) which is quickly followed by an optical afterglow and an optical and near-infrared
kilonova [242]. While the prompt GRB shows highly collimated emission along the line of sight of a jet, the
kilonova emission is isotropic. Even if the prompt emission of the GRB is off-axis with respect to the observer’s
line of sight, the afterglow and the kilonova can be observed at later times, albeit at much fainter magnitudes.
The contamination rate of optical transients in kilonova searches is estimated to be 1.79 deg⁻² down to a
limiting magnitude of m_i = 23.5 and m_z = 22.4 in the i- and z-band, respectively [243]. Extrapolating out to
LIGO’s design detector horizon of D_L = 200 Mpc, and effectively increasing the sensitivity and limiting
magnitude required to observe kilonovae at these distances, this rate is expected to increase by a
factor of 4 [243]. Types of astrophysical contaminants that may occur in the same time windows are Type Ia
and Type II supernovae, flaring objects such as M-dwarf flares, and nuclear activity in active galactic nuclei
(AGN). While these objects typically evolve on time scales which are different than the time scales of GRB
afterglows and kilonovae, they may appear in single-subtracted images as possible candidates. Follow-up and
vetting of these contaminants is a time-consuming process.
Besides the optical astrophysical contamination, a large number of optical image artifacts occurs as a
direct result of image processing. Difference imaging, or image subtraction, is the main technique used for the
identification of novel transients. Image subtraction is the process of subtracting a new image against an
older reference image, removing consistent steady-state brightness sources and ideally leaving behind only
novel transients.
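A minimal sketch of difference imaging on synthetic data is given below. It assumes the two epochs are already aligned, PSF-matched and flux-calibrated, which real pipelines must arrange before subtracting; the image size, noise level, source positions and 6 sigma threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
shape = (64, 64)
sigma = 1.0  # assumed per-pixel Gaussian noise in each epoch

# Steady-state sources that appear in both epochs.
static_sky = np.zeros(shape)
static_sky[20, 20] = 50.0
static_sky[40, 10] = 80.0

# Older reference image and new image, each with independent noise.
reference = static_sky + rng.normal(0.0, sigma, shape)
new = static_sky + rng.normal(0.0, sigma, shape)
new[32, 45] += 30.0  # hypothetical transient present only in the new epoch

# Subtracting the reference removes the steady sources.
difference = new - reference

# Noise in the difference image has standard deviation sigma * sqrt(2);
# flag pixels more than 6 of those standard deviations above zero.
threshold = 6.0 * sigma * np.sqrt(2.0)
candidates = np.argwhere(difference > threshold)
```

In practice imperfect alignment and PSF mismatch leave residuals around bright steady sources, which is one origin of the image artifacts discussed next.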
Because of the high computational cost of image subtraction, many ML techniques have been employed
to reduce the rate of image processing contamination. While there are extensive studies in the literature on
transient identification with ML, one method directly addresses the reduction of image artifacts in
searches for the EM counterparts of GW signals, whose sky map error regions span hundreds to thousands
of square degrees [244]. As true transients should appear point-like in the subtracted image, the Point-Spread
Function (PSF) of all objects in the subtracted image can be compared directly against point-like PSFs from
the convolved reference image. The PSF of the reference image is derived by decomposing each source in the
image onto a set of Zernike polynomials and summing the median of the distribution of coefficients for each
order. Every source in the subtracted image is decomposed in the same way and the coefficient for each order
is compared against the reference PSF, generating a value for how distant the reference PSF and the
subtracted source PSF are in statistical space, called the Zernike distance. By using shape information alone
and employing unsupervised methods, the method achieves efficiencies of 91.5% and 92.8% on sets of
artificial source injection tests with archival images from the Dark Energy Camera [245] and the Palomar
Transient Facility [246,247], respectively. The reduction in the number of optical image artifacts is over
99.97% with over 91% of the simulated true signals being preserved. This is a reduction from hundreds to
thousands of artifacts per image to single or double digit objects, making manual vetting of optical
candidates feasible over the hundreds of sq. degree error regions.
6. Conclusions
In this paper, we have provided a review of ML applications to GW science. ML is an exciting area of
development in the field of multi-messenger astrophysics. Over the years, LIGO and Virgo researchers have
applied a variety of ML algorithms to many challenging problems in GW data analysis and detector
characterization.
The data from LIGO and Virgo is non-stationary and non-Gaussian. ML methods can be successfully
used to improve the quality of these data. The probability of a GW signal candidate being astrophysical or
transient detector noise can be determined by applying ML to the data from thousands of environmental and
instrumental monitors. The origin of detector noise can be inferred by applying classification techniques to
find different types of transient noise. Citizen science projects provide training data for some of these studies.
Matched filter searches and Bayesian parameter estimation require a priori knowledge of the theoretical
templates of the expected GW signals. Providing these signal predictions with full numerical relativity is
computationally expensive. ML can be used to infer the GW waveforms in areas of the signal parameter space
not covered by full numerical relativity. Searches based on ML techniques have comparable search sensitivity
to matched filter searches. They are promising tools for future GW searches.
ML methods can also be successfully applied in GW searches where the exact signal morphology is
unknown. Various ML algorithms have been developed to increase the search sensitivity of the standard
LIGO-Virgo searches for GW bursts, as well as searches for GWs from CCSN and longer duration
continuous GW signals.
After a GW signal has been detected, LIGO and Virgo run computationally expensive parameter
estimation codes to determine the characteristics of the source. We have reviewed several studies showing
that ML algorithms can significantly speed up parameter estimation of GW signals. After multiple detections
are made, ML can be applied to determine the populations of GW sources and their properties, which will
inform us of their formation mechanisms. Finally, ML can aid in finding EM counterparts to GW signals.
Many of the ML techniques discussed here are being adapted to deal with the non-stationary
and non-Gaussian nature of GW detector data, so that they can move from proof-of-principle studies to
online search and parameter estimation pipelines. Techniques for determining the significance of a detection
made by an ML algorithm will need to be developed. We must also deal with
imbalanced classes, where data augmentation or generative adversarial networks can help. In the future, ML
studies could be extended into further GW research areas such as improvements to instrumentation. At present, ML
pipelines are not yet widely deployed in production GW analyses. We are
evaluating the possibility of exploiting the computational performance of ML algorithms for
real-time analysis in low-latency systems, and for BNS signals, where ML techniques could be employed to
send out fast alerts to astronomers before the merger time.
Considering the future improvements in the sensitivity of GW detectors, and their ability to detect many
events per week, ML techniques are poised to become essential tools in GW science and multi-messenger
astrophysics.
Acknowledgments
We thank Jess McIver and Damir Buskulic for their feedback on this work.
This publication is supported by work from COST Action CA17137, supported by COST (European
Cooperation in Science and Technology). JP, KA and PE are supported by the Australian Research Council
Centre of Excellence for Gravitational Wave Discovery (OzGrav), through project number CE170100004.
MC is supported by the National Science Foundation through award PHY-1921006 and PHY-2011334. LH is
supported by the Swiss National Science Foundation with the Early Postdoc Mobility grant number 181461.
DW and HG are supported by Science and Technology Facilities Council (STFC) grant ST/L000946/1. RE is
supported at the University of Chicago by the Kavli Institute for Cosmological Physics through an
endowment from the Kavli Foundation and its founder Fred Kavli. SM and ZM thank Columbia University
in the City of New York for their generous support and are supported by the National Science Foundation
under grant CCF-1740391. SM and ZM acknowledge computing resources from Columbia University’s
Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement
Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development,
Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010.
TDG acknowledges partial funding from the Max Planck ETH Center for Learning Systems. Gravity Spy and
SC are partly supported by the National Science Foundation award INSPIRE 15-47880. VG is supported by the
LIGO Laboratory, NSF grant PHY-1764464. DK is supported by the Spanish Ministry of Science, Innovation
and Universities grant FPA2016-76821 and the Vicepresidència i Conselleria d’Innovació, Recerca i Turisme
and Conselleria d’Educació i Universitats of the Govern de les Illes Balears. MB and FM are partially
supported by the Polish National Science Centre Grants No. 2016/22/E/ST9/00037 and
2017/26/M/ST9/00978. LIGO was constructed by the California Institute of Technology and Massachusetts
Institute of Technology with funding from the United States National Science Foundation under grant
PHY-0757058. The authors are grateful for computational resources provided by the LIGO Laboratory and
supported by the National Science Foundation Grants PHY-0757058 and PHY-0823459.
ORCID iDs
Elena Cuoco https://orcid.org/0000-0002-6528-3449
Jade Powell https://orcid.org/0000-0002-1357-4164
Marco Cavaglià https://orcid.org/0000-0002-3835-6729
Reed Essick https://orcid.org/0000-0001-8196-9267
Alberto Iess https://orcid.org/0000-0001-9658-6752
David Keitel https://orcid.org/0000-0002-2824-626X
Filip Morawski https://orcid.org/0000-0002-6194-8239
Massimiliano Razzano https://orcid.org/0000-0003-4825-1629
References
[1] The LIGO Scientific Collaboration, Aasi J, Abbott B P and Abbott R, et al 2015 Class. Quantum Grav. 32 074001
[2] Acernese F et al 2015 Class. Quantum Grav. 32 024001
[3] Abbott B P et al (LIGO Scientific, Virgo) 2016 Phys. Rev. Lett. 116 061102
[4] Abbott B P et al (LIGO Scientific, Virgo) 2016 Phys. Rev. Lett. 116 241103
[5] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) 2019 Phys. Rev. X 9 031040
[6] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) 2020 Class. Quantum Grav. 37 055002
[7] Abbott R et al (LIGO Scientific, Virgo) 2020 Phys. Rev. D 102 043015
[8] Abbott B et al (LIGO Scientific, Virgo) 2020 Astrophys. J. Lett. 892 L3
[9] GraceDB—Gravitational-Wave Candidate Event Database (https://gracedb.ligo.org/superevents/public/O3/)
[10] Somiya K (KAGRA) 2012 Class. Quant. Grav. 29 124007
[11] Akutsu T et al (KAGRA) 2019 Nat. Astron. 3 35–40
[12] Iyer B et al 2011 LIGO-India Tech. rep. LIGO Document Control Center (https://dcc.ligo.org/ligo-M1100296/public)
[13] Abbott B P et al (LIGO Scientific, Virgo) 2019 Astrophys. J. 882 L24
[14] Abbott B P et al (LIGO Scientific, Virgo) 2019 Phys. Rev. D100 104036
[15] Abbott B P et al (LIGO Scientific, Virgo) 2019 (Preprint 1908.06060)
[16] Abbott B P et al (LIGO Scientific, Virgo) 2020 Phys. Rev. D 101 084002
[17] Abbott B P et al (LIGO Scientific, Virgo) 2019 Astrophys. J. 874 163
[18] Abbott B P et al (LIGO Scientific, Virgo) 2019 Phys. Rev. D100 061101
[19] Wiener N 1949 Extrapolation, Interpolation and Smoothing of Stationary Time Series (New York: Wiley)
[20] Sathyaprakash B S and Dhurandhar S V 1991 Phys. Rev. D44 3819–34
[21] Klimenko S et al 2016 Phys. Rev. D93 042004
[22] Klimenko S et al 2008 Class. Quantum Grav. 25 114029
[23] Allen B and Romano J D 1999 Phys. Rev. D59 102001
[24] Powell J, Trifirò D, Cuoco E, Heng I S and Cavaglià M 2015 Class. Quantum Grav. 32 215012
[25] Powell J, Torres-Forné A, Lynch R, Trifirò D, Cuoco E, Cavaglià M, Heng I S and Font J A 2017 Class. Quantum Grav. 34 034002
[26] Mukund N, Abraham S, Kandhasamy S, Mitra S and Philip N S 2017 Phys. Rev. D95 104059
[27] George D, Shen H and Huerta E A 2018 Phys. Rev. D97 101501
[28] Biswas R et al 2013 Phys. Rev. D88 062003
[29] Rampone S, Pierro V, Troiano L and Pinto I M 2013 Int. J. Mod. Phys. 24 0129–1831
[30] Lightman M, Thurakal J, Dwyer J, Grossman R, Kalmus P, Matone L, Rollins J, Zairis S and Márka S 2006 J. Phys.: Conf. Series
32 58–65
[31] Razzano M and Cuoco E 2018 Class. Quantum Grav. 35 095016
[32] Huerta E A et al 2019 Nature Rev. Phys. 1 600–8
[33] Cavaglià M, Gaudio S, Hansen T, Staats K, Szczepanczyk M and Zanolin M 2020 Mach. Learn.: Sci. Technol. 1 015005
[34] Kim K, Harry I W, Hodge K A, Kim Y M, Lee C H, Lee H K, Oh J J, Oh S H and Son E J 2015 Class. Quantum Grav. 32 245002
[35] Kim K, Li T G, Lo R K, Sachdev S and Yuen R S 2020 Ranking candidate signals with machine learning in low-latency searches for
gravitational waves from compact binary mergers Phys. Rev. D101 083006
[36] Gabbard H, Williams M, Hayes F and Messenger C 2018 Phys. Rev. Lett. 120 141103
[37] George D and Huerta E A 2018 Phys. Lett. B778 64–70
[38] Graff P, Feroz F, Hobson M P and Lasenby A 2012 Mon. Not. Roy. Astron. Soc. 421 169–80
[39] Shen H, Huerta E A and Zhao Z 2019 (arXiv:1903.01998)
[40] Gabbard H, Messenger C, Heng I S, Tonolini F and Murray-Smith R 2019 Bayesian parameter estimation using conditional
variational autoencoders for gravitational-wave astronomy (arXiv:1909.06296)
[41] Chua A J K and Vallisneri M 2020 Phys. Rev. Lett. 124 041102
[42] Vajente G, Huang Y, Isi M, Driggers J C, Kissel J S, Szczepańczyk M J and Vitale S 2020 Phys. Rev. D101 042003
[43] Zevin M et al 2017 Class. Quantum Grav. 34 064003
[44] LIGO Scientific Collaboration and Virgo Collaboration 2016 Class. Quantum Grav. 33 134001
[45] Bahaadini S, Noroozi V, Rohani N, Coughlin S, Zevin M, Smith J, Kalogera V and Katsaggelos A 2018 Inf. Sci. 444 172–86
[46] Powell J, Trifirò D, Cuoco E, Heng I S and Cavaglià M 2015 Class. Quant. Grav. 32 215012
[47] Powell J, Torres-Forné A, Lynch R, Trifirò D, Cuoco E, Cavaglià M, Heng I S and Font J A 2017 Class. Quant. Grav. 34 034002
[48] Mukund N, Abraham S, Kandhasamy S, Mitra S and Philip N S 2017 Phys. Rev. D95 104059
[49] Yamashita R, Nishio M, Do R K G and Togashi K 2018 Insights into Imaging 9 611–29
[50] Russakovsky O et al 2015 Int. J. Comput. Vis. 115 211–52
[51] Rollins J 2011 Multimessenger Astronomy with Low-Latency Searches for Transient Gravitational Waves PhD thesis Columbia
University
[52] Chatterji S, Blackburn L, Martin G and Katsavounidis E 2004 Class. Quantum Grav. 21 S1809–S1818
[53] Cuoco E, Razzano M and Utina A 2018 Wavelet-based classification of transient signals for gravitational wave detectors 2018 26th
European Signal Conf. (EUSIPCO) pp 2648–52
[54] Acernese F et al 2007 Class. Quantum Grav. 24 S671
[55] Chen T and Guestrin C 2016 XGBoost: a scalable tree boosting system KDD’16 Proc. of the 22nd ACM SIGKDD Int. Conf. on Knowledge
Discovery and Data Mining pp 785–94
[56] Essick R, Blackburn L and Katsavounidis E 2013 Class. Quantum Grav. 30 155010
[57] Smith J R, Abbott T, Hirose E, Leroy N, MacLeod D, McIver J, Saulson P and Shawhan P 2011 Class. Quantum Grav. 28 235005
[58] Isogai T (LIGO Scientific Collaboration and Virgo Collaboration) 2010 Used percentage veto for LIGO and Virgo binary inspiral
searches J. Phys.: Conf. Series vol 243 p 012005
[59] Colgan R E, Corley K R, Lau Y, Bartos I, Wright J N, Marka Z and Marka S 2020 Phys. Rev. D101 102003 (Preprint 1911.11831)
[60] Essick R 2017 Detectability of dynamical tidal effects and the detection of gravitational-wave transients with LIGO PhD thesis
Massachusetts Institute of Technology
[61] Essick R, Godwin P, Hanna C, Blackburn L and Katsavounidis E 2020 (arXiv:2005.12761)
[62] Godwin P et al 2020 Stream-based noise acquisition and extraction analysis (snax) in preparation
[63] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) 2017 Phys. Rev. Lett. 119 161101
[64] Reed E (on behalf of the LIGO-Virgo Scientific Collaborations) 2017 (https://gcn.gsfc.nasa.gov/gcn3/21509.gcn3)
[65] Cavaglià M, Staats K and Gill T 2019 Commun. Comput. Phys. 25 963–87
[66] Robinet F 2015 (https://tds.ego-gw.it/ql/?c=10651)
[67] Bishop C M 2006 Pattern Recognition and Machine Learning (Berlin, Heidelberg: Springer)
[68] Hastie T, Tibshirani R and Friedman J 2009 The Elements of Statistical Learning: Data Mining, Inference and Prediction 2nd ed
(Berlin: Springer) (http://www-stat.stanford.edu/~tibs/ElemStatLearn/)
[69] Zevin M et al 2017 Class. Quantum Grav. 34 064003
[70] Mukund N et al 2019 Class. Quant. Grav. 36 085005
[71] Mukund N, Thakur S, Abraham S, Aniyan A, Mitra S, Philip N S, Vaghmare K and Acharjya D 2018 Astrophys. J. Suppl. 235 22
[72] Davis D, Massinger T, Lundgren A, Driggers J C, Urban A L and Nuttall L 2019 Class. Quantum Grav. 36 055011
[73] Driggers J C, Vitale S, Lundgren A P, Evans M, Kawabe K, Dwyer S E and Izumi K et al (The LIGO Scientific Collaboration
Instrument Science Authors) 2019 Phys. Rev. D99 042001
[74] Ormiston R et al 2020 Extending the Reach of Gravitational-Wave Detectors With Machine Learning, in preparation
[75] Torres A, Marquina A, Font J A and Ibáñez J M 2014 Phys. Rev. D90 084029
[76] Torres-Forné A, Cuoco E, Marquina A, Font J A and Ibáñez J M 2018 Phys. Rev. D98 084013
[77] Torres A, Marquina A, Font J A and Ibáñez J M 2015 Split Bregman Method for Gravitational Wave Denoising Gravitational Wave
Astrophys. 40 289
[78] Torres-Forné A, Marquina A, Font J A and Ibáñez J M 2016 Phys. Rev. D94 124040
[79] Torres-Forné A, Cuoco E, Font J A and Marquina A 2020 Application of dictionary learning to denoise LIGO’s blip noise
transients 102 023011
[80] Wei W and Huerta E A 2020 Physics Letters B800 135081
[81] van den Oord A, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A, Kalchbrenner N, Senior A and Kavukcuoglu K 2016
(arXiv:1609.03499)
[82] Jain L C and Medsker L R 1999 Recurrent Neural Networks: Design and Applications 1st ed (USA: CRC Press, Inc.)
[83] Pascanu R, Gulcehre C, Cho K and Bengio Y 2013 (arXiv:1312.6026)
[84] Shen H, Zhao Z, George D and Huerta E 2018 Denoising Gravitational Waves using Deep Learning with Recurrent Denoising
Autoencoders APS April Meeting Abstracts (APS Meeting Abstracts vol 2018) p S14.008
[85] Shen H, George D, Huerta E A and Zhao Z 2019 (arXiv:1903.03105)
[86] Nitz A H, Dal Canton T, Davis D and Reyes S 2018 Phys. Rev. D98 024050
[87] Sachdev S et al 2019 (Preprint 1901.08580)
[88] Messick C et al 2017 Phys. Rev. D95 042001
[89] Adams T, Buskulic D, Germain V, Guidi G M, Marion F, Montani M, Mours B, Piergiovanni F and Wang G 2016 Class. Quantum
Grav. 33 175012
[90] Chu Q 2017 PhD thesis The University of Western Australia
[91] Veitch J et al 2015 Phys. Rev. D91 042003
[92] Ashton G et al 2019 Astrophys. J. Supp. Ser. 241 27
[93] Pretorius F 2005 Phys. Rev. Lett. 95 121101
[94] Campanelli M, Lousto C O, Marronetti P and Zlochower Y 2006 Phys. Rev. Lett. 96 111101
[95] Baker J G, Centrella J, Choi D I, Koppitz M and van Meter J 2006 Phys. Rev. Lett. 96 111102
[96] Bruegmann B, Gonzalez J A, Hannam M, Husa S, Sperhake U and Tichy W 2008 Phys. Rev. D77 024027
[97] Centrella J, Baker J G, Kelly B J and van Meter J R 2010 Rev. Mod. Phys. 82 3069
[98] Mroue A H et al 2013 Phys. Rev. Lett. 111 241104
[99] Jani K, Healy J, Clark J A, London L, Laguna P and Shoemaker D 2016 Class. Quant. Grav. 33 204001
[100] Healy J, Lousto C O, Zlochower Y and Campanelli M 2017 Class. Quant. Grav. 34 224001
[101] Boyle M et al 2019 Class. Quant. Grav. 36 195006
[102] Blanchet L 2014 Living Rev. Rel. 17 2
[103] Berti E, Cardoso V and Starinets A O 2009 Class. Quant. Grav. 26 163001
[104] Buonanno A and Damour T 1999 Phys. Rev. D59 084006
[105] Buonanno A and Damour T 2000 Phys. Rev. D62 064015
[106] Damour T, Jaranowski P and Schaefer G 2000 Phys. Rev. D62 084011
[107] Damour T 2001 Phys. Rev. D64 124013
[108] Damour T, Iyer B R and Nagar A 2009 Phys. Rev. D79 064004
[109] Pan Y, Buonanno A, Boyle M, Buchman L T, Kidder L E, Pfeiffer H P and Scheel M A 2011 Phys. Rev. D84 124052
[110] Taracchini A, Pan Y, Buonanno A, Barausse E and Boyle M et al 2012 Phys. Rev. D86 024011
[111] Taracchini A, Buonanno A, Pan Y, Hinderer T and Boyle M et al 2014 Phys. Rev. D89 061502
[112] Pan Y, Buonanno A, Taracchini A, Kidder L E, Mroué A H, Pfeiffer H P, Scheel M A and Szilágyi B 2014 Phys. Rev. D89 084006
[113] Damour T and Nagar A 2014 Phys. Rev. D90 044018
[114] Nagar A, Damour T, Reisswig C and Pollney D 2016 Phys. Rev. D93 044046
[115] Bohé A et al 2017 Phys. Rev. D95 044028
[116] Babak S, Taracchini A and Buonanno A 2017 Phys. Rev. D95 024010
[117] Knowles T D, Devine C, Buch D A, Bilgili S A, Adams T R, Etienne Z B and Mcwilliams S T 2018 Class. Quant. Grav. 35 155003
[118] Cotesta R, Buonanno A, Bohé A, Taracchini A, Hinder I and Ossokine S 2018 Phys. Rev. D98 084028
[119] Nagar A et al 2018 Phys. Rev. D98 104052
[120] Ajith P et al 2011 Phys. Rev. Lett. 106 241101
[121] Santamaría L et al 2010 Phys. Rev. D82 064016
[122] Hannam M, Schmidt P, Bohé A, Haegel L, Husa S, Ohme F, Pratten G and Pürrer M 2014 Phys. Rev. Lett. 113 151101
[123] Bohé A, Hannam M, Husa S, Ohme F, Pürrer M and Schmidt P 2016 PhenomPv2 – Technical notes for the LAL implementation
Technical Report LIGO Document Control Center (https://dcc.ligo.org/LIGO-T1500602/public)
[124] Husa S, Khan S, Hannam M, Pürrer M, Ohme F, Jiménez Forteza X and Bohé A 2016 Phys. Rev. D93 044006
[125] Khan S, Husa S, Hannam M, Ohme F, Pürrer M, Jiménez Forteza X and Bohé A 2016 Phys. Rev. D93 044007
[126] Mehta A K, Mishra C K, Varma V and Ajith P 2017 Phys. Rev. D96 124010
[127] London L et al 2018 Phys. Rev. Lett. 120 161102
[128] Khan S, Chatziioannou K, Hannam M and Ohme F 2019 Phys. Rev. D100 024059
[129] Khan S, Ohme F, Chatziioannou K and Hannam M 2020 Phys. Rev. D101 024056
[130] Pratten G, Husa S, Garcia-Quiros C, Colleoni M, Ramos-Buades A, Estelles H and Jaume R 2020 (Preprint 2001.11412)
[131] Field S E, Galley C R, Hesthaven J S, Kaye J and Tiglio M 2014 Phys. Rev. X4 031006
[132] Pürrer M 2014 Class. Quant. Grav. 31 195010
[133] Pürrer M 2016 Phys. Rev. D93 064041
[134] Blackman J, Field S E, Galley C R, Szilágyi B, Scheel M A, Tiglio M and Hemberger D A 2015 Phys. Rev. Lett. 115 121102
[135] Blackman J, Field S E, Scheel M A, Galley C R, Ott C D, Boyle M, Kidder L E, Pfeiffer H P and Szilágyi B 2017 Phys. Rev. D 96 024058
[136] Blackman J, Field S E, Scheel M A, Galley C R, Hemberger D A, Schmidt P and Smith R 2017 Phys. Rev. D95 104023
[137] Lackey B D, Pürrer M, Taracchini A and Marsat S 2019 Phys. Rev. D100 024002
[138] Doctor Z, Farr B, Holz D E and Pürrer M 2017 Phys. Rev. D96 123011
[139] Varma V, Field S E, Scheel M A, Blackman J, Kidder L E and Pfeiffer H P 2019 Phys. Rev. D99 064045
[140] Varma V, Field S E, Scheel M A, Blackman J, Gerosa D, Stein L C, Kidder L E and Pfeiffer H P 2019 Phys. Rev. Research 1 033015
[141] Williams D, Heng I S, Gair J, Clark J A and Khamesra B 2019 (Preprint 1903.09204)
[142] Rasmussen C and Williams C 2006 Gaussian Processes for Machine Learning (Cambridge, MA: The MIT Press)
[143] Williams D, Heng I S, Gair J, Clark J A and Khamesra B 2019 A precessing numerical relativity waveform surrogate model for
binary black holes: a Gaussian process regression approach (Preprint 1903.09204)
[144] Moore C J and Gair J R 2014 Phys. Rev. Lett. 113 251101
[145] Moore C J, Berry C P L, Chua A J K and Gair J R 2016 Phys. Rev. D93 064001
[146] Setyawati Y, Pürrer M and Ohme F 2020 Class. Quant. Grav. 37 075012
[147] Setyawati Y, Pürrer M and Ohme F 2020 Class. Quantum Grav. 37 075012
[148] Jiménez-Forteza X, Keitel D, Husa S, Hannam M, Khan S and Pürrer M 2017 Phys. Rev. D95 064024
[149] Healy J and Lousto C O 2017 Phys. Rev. D95 024037
[150] Hofmann F, Barausse E and Rezzolla L 2016 Astrophys. J. 825 L19
[151] Varma V, Gerosa D, Stein L C, Hébert F and Zhang H 2019 Phys. Rev. Lett. 122 011101
[152] Haegel L and Husa S 2020 Classical and Quantum Gravity (https://doi.org/10.1088/1361-6382/ab905c)
[153] Easter P J, Lasky P D, Casey A R, Rezzolla L and Takami K 2019 Phys. Rev. D100 043005
[154] Allen B 2005 Phys. Rev. D71 062001
[155] Baker P T, Caudill S, Hodge K A, Talukder D, Capano C and Cornish N J 2015 Phys. Rev. D91 062004
[156] Kapadia S J, Dent T and Dal Canton T 2017 Phys. Rev. D96 104015
[157] George D and Huerta E 2018 Phys. Lett. B778 64–70
[158] Gebhard T D, Kilbertus N, Harry I and Schölkopf B 2019 Phys. Rev. D100 063015
[159] Fryer C L and New K C B 2003 Living Rev. Relativ. 6 2
[160] Andersson N and Comer G L 2001 Phys. Rev. Lett. 87 241101
[161] Baiotti L, Hawke I and Rezzolla L 2007 Class. Quantum Grav. 24 S187–S206
[162] Damour T and Vilenkin A 2005 Phys. Rev. D71 063510
[163] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) 2017 Phys. Rev. D95 042003
[164] Vinciguerra S et al 2017 Class. Quantum Grav. 34 094003
[165] Powell J and Müller B 2019 Mon. Not. Roy. Astron. Soc. 487 1178–90
[166] Radice D, Morozova V, Burrows A, Vartanyan D and Nagakura H 2019 Astrophys. J. 876 L9
[167] Andresen H, Müller E, Janka H T, Summa A, Gill K and Zanolin M 2019 Mon. Not. Roy. Astron. Soc. 486 2238–53
[168] Takiwaki T and Kotake K 2018 Mon. Not. Roy. Astron. Soc. 475 L91–L95
[169] Torres-Forné A, Cerdá-Durán P, Passamonti A, Obergaulinger M and Font J A 2019 Mon. Not. Roy. Astron. Soc. 482 3967–88
[170] Astone P, Cerdá-Durán P, Di Palma I, Drago M, Muciaccia F, Palomba C and Ricci F 2018 Phys. Rev. D98 122002
[171] Iess A, Cuoco E, Morawski F and Powell J 2020 Mach. Learn.: Sci. Technol. 1 025014
[172] Cuoco E, Calamai G, Fabbroni L, Losurdo G, Mazzoni M, Stanga R and Vetrano F 2001 Class. Quantum Grav. 18 1727–51
[173] Chan M L, Heng I S and Messenger C 2019 (arXiv:1912.13517)
[174] Prix R (for the LIGO Scientific Collaboration) 2009 Gravitational Waves From Spinning Neutron Stars ch 24 vol 357 (Berlin
Heidelberg: Springer) pp 651–85 (https://dcc.ligo.org/LIGO-P060039/public)
[175] Brady P R and Creighton T 2000 Phys. Rev. D61 082001
[176] Riles K 2017 Mod. Phys. Lett. A32 1730035
[177] Ming J, Papa M A, Krishnan B, Prix R, Beer C, Zhu S J, Eggenstein H B, Bock O and Machenschalk B 2018 Phys. Rev. D97 024051
[178] Walsh S, Wette K, Papa M A and Prix R 2019 Phys. Rev. D99 082004
[179] Morawski F, Bejger M and Ciecieląg P 2020 Mach. Learn.: Sci. Technol. 1 025016
[180] Time-domain F-statistic pipeline repository (https://github.com/mbejger/polgraw-allsky) (Accessed: 04-09-2018)
[181] Time-domain F-statistic pipeline documentation (http://mbejger.github.io/polgraw-allsky) (Accessed: 04-09-2018)
[182] Covas P B et al (LSC) 2018 Phys. Rev. D97 082002
[183] Pletsch H J 2008 Phys. Rev. D78 102005
[184] Pletsch H J and Allen B 2009 Phys. Rev. Lett. 103 181102
[185] Pletsch H J 2010 Phys. Rev. D82 042002
[186] Allen B et al Einstein@Home distributed computing project (https://einsteinathome.org)
[187] Beheshtipour B and Papa M A 2020 (Preprint 2001.03116)
[188] Behnke B, Papa M A and Prix R 2015 Phys. Rev. D91 064007
[189] Papa M A et al 2016 Phys. Rev. D94 122006
[190] Singh A, Papa M A, Eggenstein H B and Walsh S 2017 Phys. Rev. D96 082003
[191] Girshick R, Donahue J, Darrell T and Malik J 2013 (arXiv:1311.2524)
[192] He K, Gkioxari G, Dollár P and Girshick R 2017 (arXiv:1703.06870)
[193] Abbott B P et al (LIGO Scientific, Virgo) 2017 Phys. Rev. D96 122004
[194] Dreissigacker C, Sharma R, Messenger C, Zhao R and Prix R 2019 Phys. Rev. D100 044009
[195] Wette K, Walsh S, Prix R and Papa M A 2018 Phys. Rev. D97 123016
[196] Keitel D, Prix R, Papa M A, Leaci P and Siddiqi M 2014 Phys. Rev. D89 064023
[197] Leaci P 2015 Phys. Scripta 90 125001
[198] Dreissigacker C and Prix R 2020 Phys. Rev. D102 022005
[199] Suvorova S, Sun L, Melatos A, Moran W and Evans R J 2016 Phys. Rev. D93 123009
[200] Suvorova S, Clearwater P, Melatos A, Sun L, Moran W and Evans R J 2017 Phys. Rev. D96 102006
[201] Sun L, Melatos A, Suvorova S, Moran W and Evans R 2018 Phys. Rev. D97 043013
[202] Sun L and Melatos A 2019 Phys. Rev. D99 123003
[203] Viterbi A 1967 IEEE Trans. Inf. Theory 13 260–9
[204] Bayley J, Woan G and Messenger C 2019 Phys. Rev. D100 023006
[205] Abbott B P et al (Virgo, LIGO Scientific) 2019 ApJ 875 160
[206] Keitel D et al 2019 Phys. Rev. D 100 064058
[207] Miller A et al 2018 Phys. Rev. D98 102004
[208] Oliver M, Keitel D and Sintes A M 2019 Phys. Rev. D99 104067
[209] Miller A L et al 2019 Phys. Rev. D100 062005
[210] Meadors G D, Goetz E and Riles K 2016 Class. Quant. Grav. 33 105017
[211] Abbott B P et al (LIGO Scientific, Virgo) 2019 Phys. Rev. D100 122002
[212] Covas P B and Sintes A M 2020 (Preprint 2001.08411)
[213] Christensen N 2019 Rep. Prog. Phys. 82 016903
[214] Smith R and Thrane E 2018 Phys. Rev. X8 021019
[215] Biwer C M, Capano C D, De S, Cabero M, Brown D A, Nitz A H and Raymond V 2019 PASP 131 024503
[216] Lange J, O’Shaughnessy R and Rizzo M 2018 Submitted to PRD available at (arXiv:1805.10457)
[217] Veitch J and Vecchio A 2010 Phys. Rev. D81 062003
[218] Blasco A 2017 MCMC (Cham: Springer Int. Publishing) pp 85–102
[219] Wysocki D, O’Shaughnessy R, Lange J and Fang Y L L 2019 Phys. Rev. D99 084026
[220] Feroz F, Hobson M P and Bridges M 2009 Mon. Not. Roy. Astron. Soc. 398 1601–14
[221] Chua A J K, Galley C R and Vallisneri M 2019 Phys. Rev. Lett. 122 211101
[222] Green S R, Simpson C and Gair J 2020 (arXiv:2002.07656)
[223] Papamakarios G, Pavlakou T and Murray I 2017 Masked autoregressive flow for density estimation (arXiv:1705.07057)
[224] Singer L P et al 2016 Astrophys. J. 829 L15
[225] Kapadia S J et al 2020 Class. Quantum Grav. 37 045007
[226] Foucart F 2012 Phys. Rev. D86 124007
[227] Foucart F, Hinderer T and Nissanke S 2018 Phys. Rev. D98 081501
[228] Chatterjee D, Ghosh S, Brady P R, Kapadia S J, Miller A L, Nissanke S and Pannarale F 2019 (Preprint 1911.00116)
[229] Pedregosa F et al 2011 J. Mach. Learn. Res. 12 2825–30
[230] Mandel I and O’Shaughnessy R 2010 Class. Quantum Grav. 27 114007
[231] Farr W M, Stevenson S, Miller M C, Mandel I, Farr B and Vecchio A 2017 Nature 548 426–9
[232] Stevenson S, Berry C P L and Mandel I 2017 Mon. Not. Roy. Astron. Soc. 471 2801–11
[233] Stevenson S, Ohme F and Fairhurst S 2015 Astrophys. J. 810 58
[234] Vitale S, Lynch R, Sturani R and Graff P 2017 Class. Quantum Grav. 34 03LT01
[235] Gerosa D and Berti E 2017 Phys. Rev. D95 124046
[236] Talbot C and Thrane E 2018 Astrophys. J. 856 173
[237] Mandel I, Farr W M, Colonna A, Stevenson S, Tiňo P and Veitch J 2017 Mon. Not. Roy. Astron. Soc. 465 3254–60
[238] Wysocki D 2017 (arXiv:1712.02643)
[239] Powell J, Stevenson S, Mandel I and Tino P 2019 (arXiv:1905.04825)
[240] Wong K W K, Contardo G and Ho S 2020 (arXiv:2002.09491)
[241] Chatterjee C, Wen L, Vinsen K, Kovalam M and Datta A 2019 Phys. Rev. D100 103025
[242] Metzger B D and Berger E 2012 Astrophys. J. 746 48
[243] Cowperthwaite P S et al 2018 Astrophys. J. 858 18
[244] Ackley K, Eikenberry S S, Yildirim C, Klimenko S and Garner A 2019 Automated Transient Detection with Shapelet Analysis in Image-subtracted Data Astron. J. 158 172
[245] Flaugher B et al (DES) 2015 Astron. J. 150 150
[246] Rau A et al 2009 Publ. Astron. Soc. Pac. 121 1334–51
[247] Law N et al 2009 Publ. Astron. Soc. Pac. 121 1395
... Recent years have seen significant advances in machine learning-based analysis methods, resulting in substantial speedup in PE with precision comparable to traditional methods (Cuoco et al. 2021, Chua and Vallisneri 2020, Gabbard et al. 2022, Zhao et al. 2023, Cuoco et al. 2024, Hu et al. 2024, Polanska et al. 2024, Stergioulas 2024). ...
The Einstein Telescope (ET), along with other third-generation gravitational wave (GW) detectors, will be a key instrument for detecting GWs in the coming decades. However, analyzing the data and estimating source parameters will be challenging, especially given the large number of expected detections - of order $10^5$ per year - which makes current methods based on stochastic sampling impractical. In this work, we use Dingo-IS to perform Neural Posterior Estimation (NPE) of high-redshift events detectable with ET in its triangular configuration. NPE is a likelihood-free inference technique that leverages normalizing flows to approximate posterior distributions. After training, inference is fast, requiring only a few minutes per source, and accurate, as corrected through importance sampling and validated against standard Bayesian inference methods. To confirm previous findings on the ability to estimate parameters for high-redshift sources with ET, we compare NPE results with predictions from the Fisher information matrix (FIM) approximation. We find that FIM underestimates sky localization errors substantially for most sources, as it does not capture the multimodalities in sky localization introduced by the geometry of the triangular detector. FIM also overestimates the uncertainty in luminosity distance by a factor of $\sim 3$ on average when the injected luminosity distance is $d^{\mathrm{inj}}_{\mathrm{L}} > 10^5$ Mpc, further confirming that ET will be particularly well suited for studying the early Universe.
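The importance-sampling correction mentioned in this abstract can be illustrated with a toy one-dimensional problem. The sketch below is not Dingo-IS: it assumes a Gaussian target posterior and a deliberately mismatched Gaussian proposal standing in for the output of a trained normalizing flow, and all function names are hypothetical. Each proposal sample receives a weight proportional to target density over proposal density, and the effective sample size, (Σw)²/Σw², indicates how closely the proposal matches the target.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a one-dimensional Gaussian."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def importance_reweight(samples, log_target, log_proposal):
    """Return normalized importance weights and the effective sample size."""
    log_w = [log_target(s) - log_proposal(s) for s in samples]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    weights = [wi / total for wi in w]
    ess = 1.0 / sum(wi * wi for wi in weights)
    return weights, ess


# Toy setup (assumed for illustration): the target posterior is N(1.0, 0.5);
# the proposal, standing in for a trained flow, is offset and broader: N(0.8, 0.7).
rng = random.Random(0)
proposal_samples = [rng.gauss(0.8, 0.7) for _ in range(20000)]

weights, ess = importance_reweight(
    proposal_samples,
    log_target=lambda x: log_normal_pdf(x, 1.0, 0.5),
    log_proposal=lambda x: log_normal_pdf(x, 0.8, 0.7),
)

# The weighted mean estimates the target mean rather than the proposal mean.
weighted_mean = sum(w * s for w, s in zip(weights, proposal_samples))
sample_efficiency = ess / len(proposal_samples)
```

A sample efficiency close to 1 means the proposal already resembles the target; a value near 0 signals that the reweighted estimate rests on very few samples and should not be trusted.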
We have conducted an extensive study using a diverse set of equations of state (EoSs) to uncover strong relationships between neutron star (NS) observables and the underlying EoS parameters using a symbolic regression method. These EoS models, derived from a mix of agnostic and physics-based approaches, considered neutron stars composed of nucleons, hyperons, and other exotic degrees of freedom in beta equilibrium. The maximum mass of a NS is found to be strongly correlated with the pressure and baryon density at an energy density of approximately 800 MeV fm$^{-3}$. We have also demonstrated that the EoS can be expressed as a function of radius and tidal deformability within the NS mass range 1-2 $M_\odot$. These insights offer a promising and efficient framework to decode the dense matter EoS directly from the accurate knowledge of NS observables.
Detecting gravitational waves (GWs) in measured data is a challenging task that demands advanced techniques for effective analysis due to the presence of intensive noise and the non-stationary nature of these signals. Quadratic time-frequency distributions (TFDs) from Cohen’s class provide valuable tools for analyzing various non-stationary signals simultaneously in the time and frequency domain. This chapter reviews a method that integrates deep convolutional neural networks (CNNs) with these quadratic TFDs to enhance the detection of GWs from binary black hole (BBH) mergers. The approach was validated on a comprehensive dataset of 100,000 signals combining actual Laser Interferometer Gravitational-Wave Observatory (LIGO) data with synthetically simulated GW injections. Twelve different two-dimensional (2D) TFD representations were calculated (resulting in 1.2 million TFDs) and used as inputs to three high-performance CNN models: ResNet-101, Xception, and EfficientNet. The proposed approach demonstrated superior detection performance, achieving high values across various classification metrics. Furthermore, it outperformed a CNN model using original time-series data, proving to be a viable solution for detecting GWs in low signal-to-noise ratio (SNR) environments.
We explore the application of deep learning techniques to accelerate gravitational wave surrogate modeling. We focus on two recent approaches, using artificial neural networks (ANNs) with residual error modeling and autoencoder-driven spiral representation learning. For the ANN method, we demonstrate that adding a second network to learn residual errors significantly improves surrogate model accuracy. The autoencoder approach reveals an inherent spiral structure in the latent space representation of empirical interpolation coefficients. We take advantage of this insight to develop a neural spiral module that can be integrated into network architectures to accelerate training and improve performance. Comprehensive evaluations show that these methods achieve state-of-the-art accuracy while enabling faster waveform generation. The techniques presented have the potential to substantially accelerate gravitational wave data analysis as detector sensitivity improves and event rates increase.
The detection of gravitational-wave events in the data taken by the LIGO, Virgo and KAGRA experiments requires an extensive computing process to extract the signals from the noise. The traditional technique to perform this task is matched filtering. Although powerful, it is computationally very expensive, and new techniques are being studied. In this chapter, the use of convolutional neural networks to detect signals of compact binary coalescences using two-dimensional spectrograms as input is explained. Different neural networks are used for different mass ranges. The results of the searches in O3 data are shown.
Glitches, non-Gaussian transient waves which mimic gravitational-wave signals, are abundant in detectors and impact data quality. Therefore, identifying glitch type and eliminating them is crucial to unveil true astrophysical events. In this study, we evaluate the performance of logistic regression, extreme gradient boost, and support vector machines for glitch classification using features extracted via transfer learning on Inception-v3 and ResNet-50 models. We used two transfer learning strategies: fine-tuning pre-trained models with our dataset and using pre-trained models as feature extractors. Our results show that transfer learning significantly reduces training time compared to fine-tuning. The transfer learning method achieved a classification accuracy of 93.98% with the lowest training time of 37.6 s. Transfer learning resulted in 24 and 31 times faster training for ResNet-50 and Inception-v3, respectively, proving highly beneficial for glitch classification in the LIGO experiment.
Gravitational wave bursts, transient events lasting from milliseconds to minutes within the frequency band of current generation detectors, originate from a range of astrophysical phenomena, including core-collapse supernovae, neutron star glitches, and highly eccentric black hole mergers. Due to the complexity and diversity of these sources, their signal morphologies are often poorly modeled or completely unknown, making traditional matched-filter techniques ineffective for many target sources. More critically, detection methods must be sensitive to entirely unexpected phenomena, adopting an “eyes wide open” approach to enhance detection capabilities beyond known or predictable events. This chapter explores the integration of several machine learning techniques in the analysis of gravitational wave bursts, addressing the challenges posed by unmodeled and unknown signal morphologies and outlining the strategies developed to approach these signals with minimal assumptions.
Data streams of gravitational-wave detectors are polluted by transient noise features, or “glitches,” of instrumental and environmental origin. In this work we investigate the use of total variation methods and learned dictionaries to mitigate the effect of those transients in the data. We focus on a specific type of transient, “blip” glitches, as this is the most common type of glitch present in the LIGO detectors and their waveforms are easy to identify. We randomly select 100 blip glitches scattered in the data from Advanced LIGO’s O1 run, as provided by the citizen-science project Gravity Spy. Our results show that dictionary-learning methods are a valid approach to model and subtract most of the glitch contribution in all cases analyzed, particularly at frequencies below ∼1 kHz. The high-frequency component of the glitch is best removed when a combination of dictionaries with different atom length is employed. As a further example we apply our approach to the glitch visible in the LIGO-Livingston data around the time of merger of binary neutron star signal GW170817, finding satisfactory results. This paper is the first step in our ongoing program to automatically classify and subtract all families of gravitational-wave glitches employing variational methods.
The sensitivity of wide-parameter-space searches for continuous gravitational waves is limited by computational cost. Recently it was shown that deep neural networks (DNNs) can perform all-sky searches directly on (single-detector) strain data [C. Dreissigacker et al., Phys. Rev. D 100, 044009 (2019)], potentially providing a low-computing-cost search method that could lead to a better overall sensitivity. Here we expand on this study in two respects: (i) using (simulated) strain data from two detectors simultaneously, and (ii) training for directed (i.e., single sky-position) searches in addition to all-sky searches. For a data time span of T=10⁵ s, the all-sky two-detector DNN is about 7% less sensitive (in amplitude h₀) at low frequency (f=20 Hz), and about 51% less sensitive at high frequency (f=1000 Hz) compared to fully-coherent matched-filtering (using weave). In the directed case the sensitivity gap compared to matched-filtering ranges from about 7%–14% at f=20 Hz to about 37%–49% at f=1500 Hz. Furthermore we assess the DNN’s ability to generalize in signal frequency, spin down and sky-position, and we test its robustness to realistic data conditions, namely gaps in the data and using real LIGO detector noise. We find that the DNN performance is not adversely affected by gaps in the test data or by using a relatively undisturbed band of LIGO detector data instead of Gaussian noise. However, when using a more disturbed LIGO band for the tests, the DNN’s detection performance is substantially degraded due to the increase in false alarms, as expected.
We report the observation of gravitational waves from a binary-black-hole coalescence during the first two weeks of LIGO’s and Virgo’s third observing run. The signal was recorded on April 12, 2019 at 05:30:44 UTC with a network signal-to-noise ratio of 19. The binary is different from observations during the first two observing runs most notably due to its asymmetric masses: a ∼30 M⊙ black hole merged with a ∼8 M⊙ black hole companion. The more massive black hole rotated with a dimensionless spin magnitude between 0.22 and 0.60 (90% probability). Asymmetric systems are predicted to emit gravitational waves with stronger contributions from higher multipoles, and indeed we find strong evidence for gravitational radiation beyond the leading quadrupolar order in the observed signal. A suite of tests performed on GW190412 indicates consistency with Einstein’s general theory of relativity. While the mass ratio of this system differs from all previous detections, we show that it is consistent with the population model of stellar binary black holes inferred from the first two observing runs.
We present the first estimation of the mass and spin magnitude of Kerr black holes resulting from the coalescence of binary black holes using a deep neural network. The network is trained on a dataset containing 80% of the full publicly available catalog of numerical simulations of gravitational-wave emission by binary black hole systems, including full precession effects for spinning binaries. The network predicts the remnant black hole's mass and spin with errors below 0.04% and 0.3%, respectively, for 90% of the values in the non-precessing test dataset; the corresponding errors are 0.1% and 0.3% in the precessing test dataset. When compared to existing fits in the LIGO algorithm software library, the network halves the remnant mass root mean square error in the non-precessing case. In the precessing case, both remnant mass and spin mean square errors are halved, and the network corrects the bias observed in available fits.
Among the astrophysical sources in the Advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) and Advanced Virgo detectors’ frequency band are rotating non-axisymmetric neutron stars emitting long-lasting, almost-monochromatic gravitational waves. Searches for these continuous gravitational-wave signals are usually performed in long stretches of data in a matched-filter framework, e.g. the F-statistic method. In an all-sky search for a priori unknown sources, a large number of templates are matched against the data using a pre-defined grid of variables (the gravitational-wave frequency and its derivatives, sky coordinates), subsequently producing a collection of candidate signals, corresponding to the grid points at which the signal reaches a pre-defined signal-to-noise threshold. An astrophysical signature of the signal is encoded in the multi-dimensional vector distribution of the candidate signals. In the first work of this kind, we apply a deep learning approach to classify the distributions. We consider three basic classes: Gaussian noise, astrophysical gravitational-wave signal, and a constant-frequency detector artifact (‘stationary line’), the two latter injected into the Gaussian noise. 1D and 2D versions of a convolutional neural network classifier are implemented, trained and tested on a broad range of signal frequencies. We demonstrate that these implementations correctly classify the instances of data at various signal-to-noise ratios and signal frequencies, while also showing concept generalization, i.e. satisfactory performance at previously unseen frequencies. In addition we discuss the deficiencies, computational requirements and possible applications of these implementations.
Improvements in ground-based advanced gravitational wave (GW) detectors may soon allow us to observe the GW signal of a nearby core-collapse supernova. For most progenitors, likely with slowly rotating cores, the dominant GW emission mechanisms are the post-bounce oscillations of the proto-neutron star (PNS) before the explosion. We present a new procedure to compute the eigenmodes of the system formed by the PNS and the stalled accretion shock in general relativity including space–time perturbations. We apply our analysis to two core-collapse simulations and show that our improved method is able to obtain eigenfrequencies that accurately match the features observed in the GW signal and to predict the qualitative behaviour of quasi-radial oscillations. Our analysis is possible thanks to a newly developed algorithm to classify the eigenmodes in different classes (f, p, and g modes), improving our previous results. We find that most of the GW energy is stored in the lowest-order eigenmodes, in particular in the ²g1 mode and in the ²f mode. Our results also suggest that a low-frequency component of the GW signal attributed in previous works to the characteristic frequency of the standing accretion shock instability should be identified as the fundamental quadrupolar f mode. We also develop a formalism to estimate the contribution of quasi-radial (l = 0) modes to the GW quadrupolar component in a deformed background, with application to rapidly rotating cores. This work provides further support for asteroseismology of core-collapse supernovae and the inference of PNS properties based on GW observations.
We present predictions for the gravitational wave (GW) emission of 3D supernova simulations performed for a 15 solar-mass progenitor with the prometheus–vertex code using energy-dependent, three-flavour neutrino transport. The progenitor adopted from stellar evolution calculations including magnetic fields had a fairly low specific angular momentum (jFe ≲ 10¹⁵ cm² s⁻¹) in the iron core (central angular velocity ΩFe,c ∼ 0.2 rad s⁻¹), which we compared to simulations without rotation and with artificially enhanced rotation (jFe ≲ 2 × 10¹⁶ cm² s⁻¹; ΩFe,c ∼ 0.5 rad s⁻¹). Our results confirm that the time-domain GW signals of SNe are stochastic, but possess deterministic components with characteristic patterns at low frequencies (≲200 Hz), caused by mass motions due to the standing accretion shock instability (SASI), and at high frequencies, associated with gravity-mode oscillations in the surface layer of the proto-neutron star (PNS). Non-radial mass motions in the post-shock layer as well as PNS convection are important triggers of GW emission, whose amplitude scales with the power of the hydrodynamic flows. There is no monotonic increase of the GW amplitude with rotation, but a clear correlation with the strength of SASI activity. Our slowly rotating model is a fainter GW emitter than the non-rotating model because of weaker SASI activity and damped convection in the post-shock layer and PNS. In contrast, the faster rotating model exhibits a powerful SASI spiral mode during its transition to explosion, producing the highest GW amplitudes with a distinctive drift of the low-frequency emission peak from ∼80–100 to ∼40–50 Hz. This migration signifies shock expansion, whereas non-exploding models are discriminated by the opposite trend.
Understanding gravitational wave emission from core-collapse supernovae will be essential for their detection with current and future gravitational wave detectors. This requires a sample of waveforms from modern 3D supernova simulations reaching well into the explosion phase, where gravitational wave emission is expected to peak. However, recent waveforms from 3D simulations with multigroup neutrino transport do not reach far into the explosion phase, and some are still obtained from non-exploding models. We therefore calculate waveforms up to 0.9 s after bounce using the neutrino hydrodynamics code coconut-fmt. We consider two models with low and normal explosion energy, namely explosions of an ultra-stripped progenitor with an initial helium star mass of $3.5\,\mathrm{M}_{\odot}$, and of an $18\,\mathrm{M}_{\odot}$ single star. Both models show gravitational wave emission from the excitation of surface g modes in the proto-neutron star with frequencies between ${\sim}800$ and 1000 Hz at peak emission. The peak amplitudes are about $6$ and $10\,\mathrm{cm}$, respectively, which is somewhat higher than in most recent 3D models of the pre-explosion or early explosion phase. Using a Bayesian analysis, we determine the maximum detection distances for our models in simulated Advanced LIGO, Advanced Virgo, and Einstein Telescope (ET) design sensitivity noise. The more energetic $18\,\mathrm{M}_{\odot}$ explosion will be detectable to about $17.5\,\mathrm{kpc}$ by the LIGO/Virgo network and to about $180\,\mathrm{kpc}$ with the ET.
The LIGO observatories detect gravitational waves through monitoring changes in the detectors’ length down to below 10⁻¹⁹ m/√Hz variations, a small fraction of the size of the atoms that make up the detector. To achieve this sensitivity, the detector and its environment need to be closely monitored. Beyond the gravitational-wave data stream, LIGO continuously records hundreds of thousands of channels of environmental and instrumental data in order to monitor for possibly minuscule variations that contribute to the detector noise. A particularly challenging issue is the appearance in the gravitational wave signal of brief, loud noise artifacts called “glitches,” which are environmental or instrumental in origin but can mimic true gravitational waves and therefore hinder sensitivity. Currently, they are primarily identified by analysis of the gravitational-wave data stream, and auxiliary data channels often provide corroborating evidence. Here we present a machine learning approach that can identify glitches by considering all environmental and detector data channels, a task that has not previously been pursued due to its scale and the number of degrees of freedom within gravitational-wave detectors. The presented method is capable of reducing the gravitational-wave detector network’s false alarm rate and improving the LIGO instruments, consequently enhancing detection confidence.
In the multimessenger astronomy era, accurate sky localization and low latency of gravitational-wave (GW) searches are key to triggering successful follow-up observations of the electromagnetic counterparts of GW signals. In this work, we study the feasibility of adopting a supervised machine learning (ML) method for ranking candidate GW events. We consider two popular ML methods, random forest and neural networks. We observe that the evaluation time of both methods is tens of milliseconds for ∼45,000 evaluation samples. We compare the classification efficiency between the two ML methods and a conventional low-latency search method with respect to the true positive rate at a given false positive rate. The comparison shows that about 10% improved efficiency can be achieved at a lower false positive rate of ∼2×10⁻⁵ with both ML methods. We also show that the search sensitivity can be enhanced by about 18% at a false alarm rate of ∼10⁻¹¹ Hz. We conclude that adopting ML methods for ranking candidate GW events is a promising approach to yield low latency and high efficiency in searches for GW signals from compact binary mergers.
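The comparison metric named in this abstract, the true positive rate at a given false positive rate, can be sketched as follows. This is an illustrative calculation only, not the ranking pipeline of the cited work; the function name and the toy scores are assumptions. The threshold is placed so that the fraction of noise events scoring above it does not exceed the target false positive rate, and the true positive rate is the fraction of signal events above that threshold.

```python
def tpr_at_fpr(scores_signal, scores_noise, target_fpr):
    """True positive rate at the threshold whose false positive rate
    does not exceed `target_fpr`."""
    noise_sorted = sorted(scores_noise, reverse=True)
    # Number of noise events that may score above the threshold.
    n_allowed = int(target_fpr * len(noise_sorted))
    if n_allowed >= len(noise_sorted):
        threshold = float("-inf")  # every event passes
    else:
        # Score of the first noise event that must be rejected;
        # an event counts as detected only if it scores strictly higher.
        threshold = noise_sorted[n_allowed]
    n_detected = sum(1 for s in scores_signal if s > threshold)
    return n_detected / len(scores_signal)


# Toy ranking scores (hypothetical values, for illustration only).
noise_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
signal_scores = [0.85, 0.95, 1.5, 0.3]

tpr_strict = tpr_at_fpr(signal_scores, noise_scores, target_fpr=0.0)
tpr_loose = tpr_at_fpr(signal_scores, noise_scores, target_fpr=0.5)
```

Evaluating the same function for two ranking statistics at one fixed false positive rate gives the kind of efficiency comparison the abstract describes.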