A LITERATURE REVIEW OF THE EFFORTS MADE FOR EMPLOYING
MACHINE LEARNING IN SYNCHROTRONS*
A. Khaleghi†1, Z. Aghaei, K. Mahmoudi, H. Haedar, I. Imani, Computer Group, Imam Khomeini
International University, Qazvin, Iran
M. Akbari, M. Jafarzadeh, F.A. Mehrabi, P. Navidpour, Iranian Light Source Facility Institute
(ILSF) for Research in Fundamental Sciences (IPM), Tehran, Iran
1 also at Iranian Light Source Facility Institute (ILSF) for Research in Fundamental Sciences (IPM),
Tehran, Iran
Abstract
The use of machine learning (ML) is increasing in many
contexts owing to advantages such as task automation,
identification of trends and patterns, reduced
susceptibility to human error, and continuous improvement.
Even non-computer experts are learning simple programming
languages like Python to implement ML models on their data.
Despite this growing trend, to our knowledge no study has
reviewed the efforts made to use ML in synchrotrons.
We therefore examine these efforts, which aim at benefits
such as stabilizing the photon beam without manual
calibration measurements by reducing unwanted fluctuations
in the widths of the electron beams, so that experimental
noise no longer obscures measurements. We also present the
challenges of using ML in synchrotrons and a short
synthesis of the reviewed articles. The paper can help
experts in related fields gain a general familiarity with
ML applications in synchrotrons and encourage the use of
ML in various synchrotron practices. Future research will
aim to provide a more comprehensive synthesis with more
detail on how to use ML in synchrotrons.
INTRODUCTION
Synchrotron light sources are very large-scale
experimental facilities. A synchrotron is a large machine
about the size of a football field (Fig. 1). In these
facilities, electrons are accelerated to almost the speed of
light. When magnetic fields deflect the electrons, the
electrons emit incredibly bright light. The electrons are
deflected in the storage ring by different magnetic
components such as bending magnets, undulators, wigglers,
and focusing magnets. This deflection results in a
tangential emission of X-rays by the electrons. The
resulting X-rays are emitted as dozens of thin beams, each
channeled down a "beamline" surrounding the storage ring
to an experimental workstation where the light is used for
research. Each beamline is designed for use with a specific
technique or type of analysis [1]–[3]. The produced light
advances research and development in fields as diverse as
biosciences, medical research, environmental sciences,
agriculture, minerals exploration, advanced materials,
engineering, and forensics [1]. The intense and highly
focused light is used to study the dynamics and structure
of materials down to the atomic level using techniques
offered by different beamlines, such as diffraction,
spectroscopy, tomography, and imaging [4]. References
[1]–[3], [5] describe how a synchrotron works in more
detail, and a list of the world's light sources can be
found in [6].
Synchrotron light sources worldwide are rapidly moving
from traditional 3rd-generation to multi-bend achromat
(MBA)-based 4th-generation storage ring light sources to
achieve high-brightness, low-emittance upgrades [7], [8].
The Advanced Photon Source (APS) and the Advanced Light
Source (ALS) are both being upgraded to MBA-based new
rings. Diamond Light Source (DLS) designed a machine
lattice based on double triple bend achromats [8]. The
upgrades will substantially increase the light beam
brightness beyond what the existing rings offer (the light
is far brighter than sunlight) [7].
Figure 1: A 3D illustration of a synchrotron [2].
The rapid development of synchrotrons is accompanied by
two significant challenges. First, the new rings aim for
significantly lower emittances. As a result, the beam
dynamics in the rings become extremely nonlinear, leading
to a smaller dynamic aperture and potentially a smaller
momentum aperture [7]. The extremely small emittance of a
new ring also demands much higher beam stability, which
requires a good understanding of the impact of
environmental factors on the accelerator and
___________________________________________
* Work supported by Iranian Light Source Facility (ILSF)
† khaleghiali@ipm.ir
18th Int. Conf. on Acc. and Large Exp. Physics Control Systems ICALEPCS2021, Shanghai, China JACoW Publishing
ISBN: 978-3-95450-221-9 ISSN: 2226-0358 doi:10.18429/JACoW-ICALEPCS2021-FRBL03
Data Analytics
FRBL03
1039
Content from this work may be used under the terms of the CC BY 3.0 licence (© 2022). Any distribution of this work must maintain attribution to the author(s), title of the work, publisher, and DOI
beams [7]. Second, the speed of experiments and the
amount of raw data collected during each experiment are
increasing [9], [10]. In theory, a single light source can
produce one petabyte of data per day [11], and an
accelerator can generate three petabytes of data in a
single experiment [12]. Manual analysis of such massive
data volumes is no longer possible [9], [10].
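To put the quoted volume in perspective, the following minimal sketch converts one petabyte per day into a sustained data rate. It assumes the decimal (SI) definition of a petabyte and data spread uniformly over 24 hours; both are assumptions made here for illustration and are not stated in [11].

```python
# Back-of-the-envelope conversion of a daily data volume into a sustained rate.
PETABYTE = 10**15                  # bytes; decimal (SI) definition assumed
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_gb_per_s(volume_bytes, duration_s):
    """Average throughput in gigabytes per second (1 GB = 10**9 bytes)."""
    return volume_bytes / duration_s / 10**9


rate = sustained_rate_gb_per_s(1 * PETABYTE, SECONDS_PER_DAY)
print(f"1 PB/day corresponds to about {rate:.1f} GB/s sustained")
```

Under these assumptions the rate is on the order of 10 GB/s, every second of the day, which illustrates why manual inspection cannot keep up.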
Moreover, the lack of automatic data analysis prevents
new science from being delivered from much of the collected
raw data. These effects are collectively called the “data
deluge”, a prevalent problem in synchrotrons [9]–[11].
Machine learning (ML), a subset of artificial intelligence
that studies algorithms able to learn autonomously and
directly from existing datasets, can be very effective in
overcoming issues such as the data deluge [13], [14]. The
use of ML is increasing in many contexts owing to
advantages such as task automation, identification of
trends and patterns, reduced susceptibility to human error,
and continuous improvement [15]. Even non-computer experts
are learning simple programming languages like Python to
implement ML models on their data. Despite this growing
trend, to the authors’ knowledge no study has reviewed the
efforts made to use ML in synchrotrons.
ML can be used to achieve benefits like:
• Reducing unwanted fluctuations in the widths of the
electron beams produced at synchrotrons, which in turn
prevents experimental noise from obscuring
measurements. This can stabilize the photon beam
without the need for manual calibration
measurements [16].
• Preventing data deluge [9], [10]
• Reducing user-in-the-loop decision-making [9], [10],
[17]
• Harnessing the brightness of light sources [17]
• Supporting efficient and clever monitoring and fault
detection [17]
• Optimal setup using automatic alignment [17]
• Supporting stable conditions by providing real-time
feedback to the experiments and end-users [17]
• Allowing users to focus on the experiment at hand and
make the best use of the allocated beam time rather
than on manual setups and adjustments [17]
• Providing immediate and direct feedback from fast
physics-based simulations [17]
In this paper, we focus on efforts that investigated ML
applications in synchrotrons. We also present the
challenges of using ML in synchrotrons and a short
synthesis of the reviewed articles. Our literature review
can help experts in related fields gain a general
familiarity with ML applications in synchrotrons and
encourage the use of ML in various synchrotron practices.
MACHINE LEARNING AND ITS
RELEVANT CONCEPTS
Machine learning, a subfield of artificial intelligence,
is widely used to make predictions or decisions. It is the
process of building a mathematical model from sample data,
known as “training data”, without the model being
explicitly programmed. In other words, ML algorithms can
learn to complete tasks using raw data [14], [18]. ML is
usually applied for classification, regression, clustering,
anomaly detection, dimensionality reduction, and reward
maximization [13]. Generally speaking, ML techniques can
be classified into three main categories, namely
supervised learning, unsupervised learning, and
reinforcement learning (RL) [9]. Supervised learning is
valuable when pairs of input and desired output are
available: an algorithm can generalize from the given
structured data and make predictions for unseen input.
Unsupervised learning algorithms solve tasks where only
input data is available [19]. Recently, RL has also
attracted particular attention. RL is based on dynamic
environment-agent interaction, similar to a Markov
decision process [4], [20]. The agent performs an action on
the environment, and the environment reacts by producing a
reward, which the agent uses to learn how to improve its
subsequent actions. The RL approach does not require a
prepared dataset of input-output pairs, since the agent
learns through continuous interaction with the
environment, whose response varies depending on the action
and its dynamics [19]. Finally, semi-supervised learning is
halfway between supervised and unsupervised learning. In
this case, the algorithm is provided with both unlabeled
and labeled data. This category is useful when the
available data are incomplete and for learning
representations [21].
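The agent-environment loop described above can be illustrated with a minimal sketch. It is a deliberately stateless simplification (a multi-armed bandit with an epsilon-greedy agent) rather than a full Markov decision process, and the environment, its hidden optimum, and all parameter values are hypothetical choices made for illustration only.

```python
import random


class ToyEnvironment:
    """Hypothetical environment: the reward peaks at a hidden optimal action."""

    def __init__(self, optimum=3, n_actions=5, noise=0.1, seed=0):
        self.optimum = optimum
        self.n_actions = n_actions
        self.noise = noise
        self.rng = random.Random(seed)

    def step(self, action):
        # Reward falls off with distance from the optimum, plus measurement noise.
        return -abs(action - self.optimum) + self.rng.gauss(0.0, self.noise)


def train_agent(env, episodes=500, epsilon=0.1, seed=1):
    """Epsilon-greedy agent that learns a reward estimate for every action."""
    rng = random.Random(seed)
    value = [0.0] * env.n_actions   # running mean reward per action
    count = [0] * env.n_actions
    for episode in range(episodes):
        if episode < env.n_actions:
            action = episode                                  # try each action once
        elif rng.random() < epsilon:
            action = rng.randrange(env.n_actions)             # explore
        else:
            action = max(range(env.n_actions), key=value.__getitem__)  # exploit
        reward = env.step(action)                             # environment reacts
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value


values = train_agent(ToyEnvironment())
best_action = max(range(len(values)), key=values.__getitem__)
print("learned best action:", best_action)
```

No input-output pairs are prepared in advance: the only training signal is the reward returned by `step`, which matches the description of RL given above.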
Among ML algorithms, clustering and deep learning
(DL) are very popular. Clustering is categorized as
unsupervised learning (it needs no labeled data). It groups
data into clusters such that the similarity between the
data within each cluster is maximal and the similarity
between data assigned to different clusters is minimal.
Clustering algorithms are either centroid-based, such as
k-means, or density-based, like DBSCAN. Density-based
algorithms see clusters as areas of high density separated
by areas of low density instead of determining centroids.
K-means is the simplest and most common clustering
algorithm [19], while DBSCAN is robust against outliers
and can be applied to eliminate them at different stages of
measurement and correction processes [22]. In recent years,
deep learning has emerged as the leading class of ML
algorithms, now almost synonymous with ML to the public.
Deep learning uses neural networks (NNs) (Fig. 2) composed
of hidden layers that carry out different operations to
find and explore representations of complex data. It
improves the performance of classifiers beyond what common
ML algorithms offer, especially in circumstances involving
large, high-dimensional datasets [23], [24].
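As an illustration of how a density-based algorithm isolates outliers, the sketch below is a minimal pure-Python version of DBSCAN applied to a small synthetic point set. The data, the `eps` radius, and the `min_samples` threshold are hypothetical values chosen for this example; a production analysis would more likely rely on an established library implementation.

```python
from math import dist


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: one label per point, with -1 marking noise (outliers)."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i (the point itself included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1               # not dense enough: provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (2.5, 9.0)]                    # the last point is an isolated outlier
labels = dbscan(points, eps=0.3, min_samples=3)
cleaned = [p for p, label in zip(points, labels) if label != -1]
print(labels)
```

The two dense groups receive cluster labels 0 and 1, while the isolated point is labelled -1 and can be dropped before further processing. No centroid is ever computed, which is the contrast with k-means drawn above.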
Figure 2: A fully connected NN with eight input features,
two output labels, and two hidden layers including ten
neurons each (Figure courtesy Alex LeNail).
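The network of Fig. 2 can be written down in a few lines. The following minimal sketch builds a fully connected network with the layer sizes from the caption (8, 10, 10, 2), runs one forward pass, and counts the trainable parameters. The tanh activation, the random initialization, and the input values are assumptions made for illustration; the figure does not specify them.

```python
import random
from math import tanh

LAYER_SIZES = [8, 10, 10, 2]   # input features, two hidden layers, output labels


def init_network(sizes, seed=0):
    """Create (weights, biases) for each pair of consecutive layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Propagate one input vector through the network."""
    for k, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_output_layer = k == len(layers) - 1
        x = z if is_output_layer else [tanh(v) for v in z]   # tanh on hidden layers only
    return x


def count_parameters(sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


network = init_network(LAYER_SIZES)
output = forward(network, [0.1] * 8)
print(len(output), count_parameters(LAYER_SIZES))
```

Even this small network has (8·10 + 10) + (10·10 + 10) + (10·2 + 2) = 222 trainable parameters, which hints at why deep models need large training datasets.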
CHALLENGES OF USING MACHINE
LEARNING IN SYNCHROTRONS
Many AI/ML platforms for beamline tuning have been
planned. However, most of them cannot ultimately be used
regularly as part of an accelerator’s central control
system, mainly due to limitations in the required hardware,
algorithms, and software packages, and the limited
availability of large and suitable datasets [14]. The most
critical challenges for scientists and end-users at
synchrotrons in utilizing machine learning are [10]:
• The necessity for long-term data preservation and
transfer, as well as the demand for data analysis
pipelines.
• The requirement for instant feedback to support on-site
scientific and technical decisions during beamtime:
such feedback depends heavily on the accuracy and
automation achievable with sufficient hardware and
software infrastructure for real-time data evaluation
and processing at synchrotrons.
• The need for user-friendly software packages to
effectively handle the extensive data generated in
synchrotron experiments: most beamline users are not
experts in data computing and management.
Alizadeh and Khaleghi also listed the most critical data
management issues, which include multi-source data,
data analysis, data storage, data accessibility, data
processing, data format, data transfer, expensive data
analysis tools, online processing, clustering, storage
reliability, data mining, replication, and real-time data
collection and visualization [12].
Given these problems, a cross-domain and cross-facility
solution is needed that can accelerate the creation of a
real-time, user-friendly, advanced data processing platform
at synchrotrons [10]. The solutions must be
versatile and flexible enough to be integrated at various
experimental stations and cope with the heterogeneous
requirements of different beamlines and experiments [25].
For example, supervised learning algorithms need
advanced data management platforms because they require
large amounts of reliable training data to construct reliable
models [19], [26]. Unfortunately, while experimentalists
have access to large datasets, these data are typically not
tagged appropriately and thus are not suitable for
supervised ML methods. Continual advances in hardware
and software have enabled tremendous increases in the
data collection rate. However, this boost in data throughput
is not accompanied by a corresponding rise in identifying
useful data and the value of each datum [27].
Some studies reported that a big data center, termed a
super facility, is required for data management and
processing [9], [10], [18]. This is because the success of
ML is driven by the explosion in big data, advances in
computational power, particularly the use of graphics
processing units, and the development of more
sophisticated ML techniques such as DL [18], [26].
In other words, these super facilities allow users to focus
only on meaningful scientific data leading to discoveries
and insights instead of dealing with unstructured and
massive raw data. This can be achieved by giving users
simultaneous access to the synchrotron experiments through
purpose-built, real-time, user-friendly platforms [9], [10].
From interviews conducted by Alizadeh and Khaleghi with
ten light source facility members, it was concluded that
86% of participants were not familiar with big data [12].
This study also first classified synchrotron experiments
by technique into three classes: imaging, scattering, and
spectroscopy (Fig. 3). Based on these main techniques, the
researchers then proposed a conceptual model of the data
management aspects required for each technique (Fig. 4).
Some synchrotrons, like the Shanghai Synchrotron Radiation
Facility (SSRF) [9], the National Synchrotron Light Source
II (NSLSII) [4], [17], and the Stanford Synchrotron
Radiation Light Source (SSRL) [7], have made efforts to
provide the platforms needed for robust data analysis and
management. The main task of these extensive platforms is
to support a completely automated process for real-time and
offline access to data management and computation across
the different operations of synchrotrons by providing
sufficient hardware and software infrastructure [10].
Figure 3: Classification of synchrotron experiments based on
their techniques [12].
Figure 4: Different data management aspects required for
each general synchrotron experiment technique [12].
As a last point, Hill et al. [22] noted that remote data
analysis workflows and the use of distributed resources
should be provided for ML purposes. As experiments become
shorter, the amount of beamtime required for a particular
experiment reaches a point at which it is no longer
worthwhile for a user to visit the synchrotron physically.
Remote data analysis and pipelines are therefore essential
for steering the experiment at a remote synchrotron and
carrying out an optimal measurement.
THE USED/ DEVELOPED ML
ALGORITHMS, SOFTWARE, AND
PLATFORMS FOR DATA MANAGEMENT
High-performance, low-latency ML models are necessary to
use ML techniques in the everyday operations of
synchrotrons [28]. Among these models, neural networks have
attracted particular attention in synchrotrons due to their
ability to facilitate image-centric big data science and
to address many scientific imaging problems, such as
denoising, feature segmentation, image restoration, and
super-resolution [10], [29], [30]. Furthermore, NNs can
explore nonlinear and dynamic behaviors [24].
NNs process the features of image patterns rather than the
value of each pixel, as classical methods do [31].
Moreover, image-based diagnostics can be used directly in
accelerators as both outputs and inputs [14]. The accuracy
and efficiency of NNs for image recognition and
classification have been demonstrated in various
applications [31]. Several tools have implemented NNs for
synchrotron operations. The Xlearn toolbox, an open-source
Python package, implements NNs for multiple synchrotron
X-ray imaging problems. The features of Xlearn are [31]:
1- Correction of instrument and beam instability artifacts
2- Improving low-dose images
3- Feature extraction and segmentation
4- Super-resolution X-ray microscopy
Xlearn can be easily integrated into existing
computational pipelines available at various synchrotron
facilities [32]. Moreover, Xlearn is based on the Keras [33]
and Theano [34] packages. Keras is a well-known platform
for neural networks, and Theano is a library for tensor
computation. Keras and Theano also support GPU
acceleration, a critical feature for applying NNs to
extensive datasets. Please see [35] for source code,
documentation, and information on contributing to this
library. Although NNs are powerful at removing noise from
reconstructed images, training them requires collecting a
dataset of paired noisy and high-quality measurements,
which is a significant obstacle to their use in practice.
To address this, Noise2Inverse was designed: a deep
NN-based denoising method for linear image reconstruction
algorithms that does not require any additional clean or
noisy data. Recently, Hendriksen et al. [36] used
Noise2Inverse for deep denoising of multi-dimensional
synchrotron X-ray tomography without high-quality
reference data. This study applied the Noise2Inverse
method to datasets acquired at two synchrotron beamlines.
First, they used the technique on a static and a dynamic
micro-tomography dataset from the TOMCAT beamline at
the Swiss Light Source (SLS). Second, to investigate the
possibility of accelerating the acquisition process in
X-ray diffraction tomography (XRD-CT), a dataset from
the ID15A beamline at the European Synchrotron
Radiation Facility (ESRF) was used. The results showed that
Noise2Inverse is capable of accurate denoising and enables
a substantial reduction in acquisition time while
maintaining image quality. Liu et al. [30] introduced a deep
NN model for real-time computed tomography at
synchrotron light sources that improves the quality of
tomographic reconstructions as data is collected. In turn,
this method produces high-quality output more quickly and
reduces the amount of data that must be collected. It can
be integrated into the real-time streaming tomography
pipeline to enable better-quality images in the early
stages of data acquisition. On real-world tomography
datasets (tomography being a common imaging modality at
synchrotrons) collected at APS, the results showed
significant improvement in tomography image quality and
system throughput.
Usually, a light source facility hosts different
beamlines covering different scientific areas and using
different multi-dimensional detector technologies.
Moreover, each beamline at a synchrotron typically uses its
own data acquisition software developed specifically for
that beamline. These software packages are often
incompatible with each other, making it difficult for
scientists to compare data from different beamlines and
other light sources. Therefore, AI/ML tools must also be
compatible with these different beamlines [17], [37]. To
this end, the NSLSII [17] developed the Bluesky Suite, a
collection of Python libraries for data acquisition and
management, mainly to tackle the data “variety” challenge
and to support streaming and real-time data analysis at
user facilities. Among its capabilities, all data and
metadata generated during an experiment can be emitted in
real time to other processes in the form of ‘documents’,
Python dictionaries with a comprehensible schema. The
generated documents can be distributed locally or over a
network. All beamline hardware is accessed via a library
called ophyd, and historical data is accessed through an
API called DataBroker.
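The document-streaming idea described above can be sketched without any beamline software. The example below does not use the Bluesky libraries: `emit_run` is a hypothetical stand-in for an acquisition run, and the dictionary keys are simplified placeholders rather than the real Bluesky schema, which contains more fields and document types. It only illustrates how a consumer can analyse data while it is still being emitted.

```python
import time
import uuid


def emit_run(readings, subscribers):
    """Hypothetical stand-in for a run that emits (name, document) pairs."""
    run_uid = str(uuid.uuid4())

    def publish(name, doc):
        for callback in subscribers:
            callback(name, doc)

    # Simplified, illustrative keys; not the actual Bluesky document schema.
    publish("start", {"uid": run_uid, "time": time.time()})
    for seq_num, value in enumerate(readings, start=1):
        publish("event", {"run": run_uid, "seq_num": seq_num,
                          "time": time.time(), "data": {"detector": value}})
    publish("stop", {"run": run_uid, "num_events": len(readings)})


class RunningMean:
    """Live consumer: updates a mean as each event document arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.finished = False

    def __call__(self, name, doc):
        if name == "event":
            self.count += 1
            self.mean += (doc["data"]["detector"] - self.mean) / self.count
        elif name == "stop":
            self.finished = True


monitor = RunningMean()
emit_run([1.0, 2.0, 3.0], [monitor])
print(monitor.count, monitor.mean, monitor.finished)
```

Because each document is a plain dictionary, the same consumer could in principle receive it locally or over a network, and an ML model could take the place of the running mean.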
Another study recently published by NSLSII investigated
reinforcement learning [4]. This paper demonstrated the use
of RL methods for optimizing beamline operations. The study
also explained how the Python-based Bluesky suite of data
acquisition software enables RL applications at a beamline.
This functionality was demonstrated by solving a classical
RL problem, cartpole, in the Bluesky environment.
Furthermore, RL methods were used to address a scenario
prevalent on high-throughput beamlines: maximizing data
quality across multiple samples of different scattering
strength within a limited time window. Finally, the
challenges and overall strategy for realizing extensive
development of RL methods at large-scale user facilities
were discussed.
The European Organization for Nuclear Research
(CERN) is the site of the Large Hadron Collider (LHC), the
largest and highest-energy particle collider in the world
[13]. CERN has mainly focused on applying supervised and
unsupervised ML techniques in various domains
associated with beam dynamics studies. These areas include
beam commissioning of the collimation system, optimization
of beam lifetime and losses, detection of collective beam
instabilities, heating detection from pressure readings,
and numerical simulations of dynamic aperture. For example,
fully automated software for beam commissioning of the
collimation system using ML algorithms was developed. This
new fully automatic alignment software was used
successfully throughout 2018 in LHC operations.
Furthermore, this software will be used as the default
software at the LHC in 2021. The time to align the
collimators at injection was decreased by 71.4% compared
to the semi-automatic alignment, namely from 2.8 h to
50 minutes. This tool was also incorporated into the
angular alignment implementation and successfully
decreased the alignment time by 70%, requiring no human
intervention. For a complete review of how ML techniques
have been incorporated at CERN, please see [13].
Topaz3 is a data manipulation and machine learning
package implemented with Python libraries for
Macromolecular Crystallography (MX) at DLS.
Specifically, it transforms electron density map data
obtained from diffraction experiments and uses machine
learning to estimate whether the original or inverse hand
has clearer information. Tensorflow-gpu, which speeds up
the training and use of neural networks, is required to use
the machine learning side of Topaz3 [38].
One of the decades-old problems at synchrotron light
source facilities is that they simultaneously deliver light
to dozens of beamlines. One side effect of this is that the
movements of specific insertion devices (IDs), i.e.,
undulators and wigglers with variable magnetic fields,
cause the electron beam’s size to fluctuate. These
fluctuations affect the performance of other beamlines.
Usually, the fluctuations are reduced using corrections
based on a combination of static, predetermined physics
models and lengthy calibration measurements, which are
repeated periodically to counteract drift in the
accelerator and instrumentation. Researchers at the ALS at
Lawrence Berkeley National Laboratory showed that NN
algorithms can predict noisy fluctuations in the size of
beams generated by synchrotron light sources and can
therefore correct changes before they occur (feed-forward
vs. feedback correction). Consequently, this approach
helps attain an order-of-magnitude improvement in
stability that fulfils the requirements of different light
sources. To train the NN, researchers fed it electron-beam
data from the ALS, including the positions of the IDs and
the blips in electron-beam performance caused by ID
adjustments. One key advantage of this approach is that the
data required for retraining NNs can be obtained
constantly, even while the feed-forward system is active
during a regular user run [16]. Furthermore, continuous
retraining allows the neural networks to continually adapt
to a drifting machine and to changes in ID configurations
during run periods, independent of static physics models.
The developed algorithm learned the complex nonlinear
relationships between the ID settings and the vertical beam
size and made corrections to negate the blips. NNs
stabilized the vertical beam size to within 0.2 μm, or
0.4% of the beam size, compared to 2–3% without
correction [16].
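The feed-forward principle with continuous retraining can be sketched on synthetic data. This is not the method of [16]: a small polynomial model trained by least-mean-squares updates stands in for the neural network, and the response function, noise level, and learning rate are hypothetical. The sketch only shows that the residual left after each pre-emptive correction is itself the training signal, so retraining can continue while the correction is active.

```python
import random
from math import sqrt


def features(gap):
    """Orthogonal polynomial features of a normalized ID gap in [-1, 1]."""
    return [1.0, gap, 0.5 * (3.0 * gap * gap - 1.0)]


def true_perturbation(gap):
    """Hypothetical nonlinear beam-size response to the ID gap (arbitrary units)."""
    return 1.0 + 2.0 * gap - 1.5 * gap * gap


def rms(values):
    return sqrt(sum(v * v for v in values) / len(values))


def run(steps=3000, learning_rate=0.1, noise=0.05, seed=0):
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0]
    uncorrected, corrected = [], []
    for step in range(steps):
        gap = rng.uniform(-1.0, 1.0)           # the ID setting is known in advance
        x = features(gap)
        predicted = sum(w * xi for w, xi in zip(weights, x))
        disturbance = true_perturbation(gap) + rng.gauss(0.0, noise)
        residual = disturbance - predicted     # what remains after feed-forward correction
        # Continuous retraining: the observed residual drives the model update.
        weights = [w + learning_rate * residual * xi for w, xi in zip(weights, x)]
        if step >= steps - 500:                # compare over the final 500 steps
            uncorrected.append(disturbance)
            corrected.append(residual)
    return rms(uncorrected), rms(corrected)


uncorrected_rms, corrected_rms = run()
print(f"uncorrected {uncorrected_rms:.3f}, corrected {corrected_rms:.3f}")
```

In this toy setting the corrected fluctuation ends up far smaller than the uncorrected one, limited mainly by the measurement noise, which the model cannot predict.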
Another light source facility that has heavily
investigated machine learning is the SSRL, a division of
SLAC National Accelerator Laboratory, operated by Stanford
University for the Department of Energy. A recent report
published by the SSRL describes the research activities and
achievements of the two-year R&D project ‘beam-based
optimization and machine learning for synchrotrons’ at
SSRL [7]. The R&D project developed machine learning
techniques for synchrotron applications in three main
areas: accelerator design optimization, beam-based
optimization, and analysis of accelerator operation data.
First, to implement online optimizations, they developed
the Teeport platform. The control systems and programming
environments on different machines may differ, so online
optimization algorithms developed for one machine cannot
easily be applied to other systems. For example, the
optimizer may be a Matlab script while the evaluator is a
Python script. Teeport decouples the algorithm
implementation from the experimental systems by providing a
universal middle layer that communicates between the
optimizer and the evaluator, so that they can communicate
freely. In Teeport, a middleware is inserted between the
evaluator and the optimizer, acting as a data normalizer
and signal forwarder. Because the data flows through the
middleware, control and monitor layers can be added to the
middleware to make the online optimization process more
controllable and visible (Fig. 5). The features of the
platform include: online optimization experiment, fast
switch between different optimization settings (the only
actions needed in the code to switch between different
optimization settings are switching the
evaluator/optimizer id and/or updating the configurations of
the evaluator/optimizer accordingly (Fig. 6)), optimization
performance comparison, and optimization algorithm
benchmark.
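The decoupling described above can be sketched in a few lines. The code below is not Teeport's API: the `Middleware` class, its method names, the two evaluators, and the simple coordinate search are all hypothetical illustrations of a middle layer that normalizes data, forwards calls, and records every exchange, so that switching between a simulation and an experiment only requires changing an evaluator id.

```python
class Middleware:
    """Hypothetical middle layer between optimizers and evaluators."""

    def __init__(self):
        self._evaluators = {}
        self.history = []            # monitor layer: every exchange is recorded

    def register_evaluator(self, evaluator_id, func):
        self._evaluators[evaluator_id] = func

    def evaluate(self, evaluator_id, x):
        x = [float(v) for v in x]                      # normalize the input data
        y = float(self._evaluators[evaluator_id](x))   # forward to the evaluator
        self.history.append((evaluator_id, x, y))
        return y


def simulation_evaluator(x):
    """Stand-in for a simulated objective with its optimum at 1.0."""
    return sum((v - 1.0) ** 2 for v in x)


def experiment_evaluator(x):
    """Stand-in for the real machine, whose optimum is slightly shifted."""
    return sum((v - 1.2) ** 2 for v in x)


def coordinate_search(evaluate, x0, step=0.5, iterations=40):
    """Very simple optimizer that only ever sees an `evaluate` callable."""
    x = list(x0)
    best = evaluate(x)
    for _ in range(iterations):
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                y = evaluate(trial)
                if y < best:
                    x, best, improved = trial, y, True
        if not improved:
            step /= 2                # refine the search once no move helps
    return x, best


hub = Middleware()
hub.register_evaluator("simulation", simulation_evaluator)
hub.register_evaluator("experiment", experiment_evaluator)

# Switching the target only changes the evaluator id passed to the middleware.
x_sim, _ = coordinate_search(lambda x: hub.evaluate("simulation", x), [0.0, 0.0])
x_exp, _ = coordinate_search(lambda x: hub.evaluate("experiment", x), [0.0, 0.0])
print(x_sim, x_exp)
```

The optimizer code is identical in both runs, and the recorded history gives a single place to compare optimization performance across settings.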
Figure 5: The architecture of Teeport. Teeport facilitates
the application of optimization algorithms created in one
programming environment to accelerators equipped with
many different control systems and programming
environments [7].
The Teeport platform can potentially become a
centralized service for advanced optimization applications.
Many optimization algorithms and test problems from
various platforms, such as PyGMO, pymoo, PlatEMO, and
Ocelot, can be integrated into Teeport (Fig. 7). The R&D
project also developed two ML-based global optimization
algorithms for storage ring nonlinear beam dynamics
optimization. The first, the multi-generation Gaussian
process optimizer (MG-GPO), was used to solve
multi-objective optimization problems. The second is a
neural network-based method to analyze accelerator
operation data and study the impact of underlying
environmental factors on machine performance.
Figure 6: Fast switching between the simulation evaluator
and the experimental evaluator [7].
Figure 7: Teeport as a unified interface for the optimization
algorithms [7].
BESSY II Light Source is another synchrotron that has
started to use machine learning. The primary efforts made
at this synchrotron, described in [39], are: (1) beam
lifetime can be successfully predicted in a time-series
fashion using supervised learning models trained only on
readbacks of 185 accelerator variables, i.e., excluding
previous lifetime measurements, and (2) prototypes for
self-tuning of machine parameters in different optimization
cases, such as injection efficiency and orbit correction,
have been implemented using deep reinforcement learning
agents.
The DELTA synchrotron used machine learning techniques
for orbit correction [40]. Conventional feed-forward
neural networks were trained on measured orbits to apply
local and global beam position corrections at the 1.5-GeV
storage ring DELTA. This study demonstrated that ML
techniques are an alternative approach for automated orbit
correction of the DELTA storage ring.
Fol et al. [19] focused on applying ML to beam
diagnostics and incorporating ML concepts into
accelerator problems. They identified four main areas in
which ML algorithms can be helpful: virtual diagnostics,
optimization and operation, beam optics correction, and
instrumentation fault detection. This study shows how
different ML approaches can be incorporated into various
functions of accelerators. For example, it concluded that
reinforcement learning is suitable for solving complex
control tasks, and that unsupervised learning is helpful
for anomaly detection tasks such as detecting
instrumentation defects, e.g., using clustering to identify
faulty beam position monitor signals, since these methods
can be applied directly in accelerator systems without
training.
A SHORT SYNTHESIS OF THE
REVIEWED ARTICLES
• To advance the field of incorporating ML in
synchrotrons, a game-like project defining a reward
scheme to train models to optimize a beamline
efficiently is very effective.
• Good work has been done to facilitate data
management and computing at synchrotrons, but the
efforts are ad hoc and specific to individual beamlines
and synchrotron facilities. Therefore, future work can
focus on converging these efforts into a seamlessly
integrated platform for diverse beamlines with
different requirements, giving impetus to employing
ML in different synchrotron operations such as
accelerator design optimization, beam-based
optimization, and analysis of accelerator operation
data.
• Implementing remote data analysis workflows and
using distributed resources are very effective for
synchrotron practices, because advances in
synchrotron technologies have tremendously reduced
the beamtime needed for experiments, and an
experiment can be done quickly. Physically visiting a
synchrotron may therefore no longer be worthwhile.
• As evident from the above discussions, many
researchers have used Python to develop ML models.
There are three main reasons for this. First, for
Python, as for HyperText Markup Language
(HTML), much existing code is widely available for
different use cases. By combining such code, standalone
platforms can be implemented more easily and rapidly.
Considering such features, [14] suggested that some
staff at accelerator laboratories be designated to focus
on applying these tools to specific problems.
Second, many ML libraries, such as scikit-learn,
TensorFlow, Keras, and PyTorch, are available in
Python. Third, Python is very popular among
non-computer experts due to its simplicity.
• Among existing ML algorithms and models, deep
learning has received significant attention. Most of
the reviewed articles used NNs, mainly because of
their ability to capture nonlinear and dynamic
behaviors and to facilitate image-centric big data
science.
• Unfortunately, few good, large, publicly available
datasets exist on the Web for synchrotron practices.
This can slow the adoption of ML models in particle
accelerators. By providing suitable datasets on
cloud-based platforms like Kaggle, a variety of
solutions to a given problem can be obtained in a
short time. For example, a successful competition
was organized in 2014 by the high energy physics
community, and it attracted over 1700 participants
[14].
• Advanced online optimization using ML algorithms
can provide an efficient way of finding the ideal
machine configuration.
• Combining big data with ML is already crucial.
Storing, managing, and analyzing high-volume data
are challenging problems that can be solved using this
combination.
• In many contexts, it is usually sufficient to use
previously stored data for ML purposes. In
synchrotrons, however, it is essential to develop online
data streaming and management platforms because
real-time use of ML is vital in particle accelerators.
Moreover, processing and validating data only after
experiments are completed can leave problems
undetected and prevents online steering. Online ML
algorithms and data processing platforms can also
reduce the amount of data that needs to be stored.
• The existing literature does not provide many direct
comparisons between ML techniques using the same
publicly available datasets. Therefore, to choose the
method that best suits a given problem, an empirical
approach that evaluates the different proposed ML
methods on the same dataset is recommended.
• When storing data for ML techniques, mainly
supervised algorithms, it is essential not only to
provide large datasets but also to provide techniques
for tagging data and for separating valuable data from
data of little or no use. This reduces the amount of
data that needs to be stored and improves the
efficiency of ML models.
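The points above on online data streaming and on tagging data can be illustrated with a small sketch. It keeps running statistics of a stream with Welford's online algorithm, tags each new sample as normal or anomalous, and stores only the anomalies together with the running summary instead of the full stream. The class name, the 3-sigma threshold, the warm-up length, and the sample values are assumptions chosen for illustration; they do not come from the reviewed articles.

```python
import math


class OnlineTagger:
    """Running mean and variance (Welford) that tags samples far from the mean."""

    def __init__(self, threshold=3.0, warmup=5):
        self.n = 0          # number of normal samples absorbed so far
        self.mean = 0.0     # running mean of normal samples
        self.m2 = 0.0       # running sum of squared deviations
        self.threshold = threshold
        self.warmup = warmup

    def update(self, x):
        """Tag x against the statistics seen so far, then absorb it if normal."""
        tag = "normal"
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > self.threshold * std:
                tag = "anomaly"
        if tag == "normal":
            # Welford update: one pass, no need to keep earlier samples.
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return tag


# Hypothetical stream of a monitored quantity (arbitrary units).
stream = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 5.00, 1.01, 0.97, 1.03]
tagger = OnlineTagger()
stored = [(i, x) for i, x in enumerate(stream) if tagger.update(x) == "anomaly"]
print("stored samples:", stored)
print("summary: n =", tagger.n, "mean =", round(tagger.mean, 3))
```

Anomalous samples are deliberately not absorbed into the running statistics, so a single outlier does not widen the acceptance band for the samples that follow it.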
CONCLUSION
The performed review clearly shows the suitability of ML
methods for beam energy, brightness, stability, etc. Still,
much is expected from the further application and
extension of these approaches to the most diverse
beamlines at synchrotrons globally, offering advanced
capabilities and exploring the most varied, time-varying,
and nonlinear relationships. There are many efforts to
provide data management platforms for machine learning
models and algorithms. However, these efforts are
scattered and heterogeneous, tied to different beamlines
and synchrotrons. Integrating them would yield a more
efficient environment for incorporating machine learning
in everyday synchrotron practices. Generally, large
experimental facilities such as synchrotron, neutron, and
X-ray free-electron laser (XFEL) sources need unified
ML-ready solutions that are general and transferable to
different beamlines and particle accelerators. Moreover,
ML algorithms should be usable online, which requires
powerful online data analysis platforms. Otherwise, some
errors and anomalies may go undetected, and the amount
of data that needs to be stored will increase. According
to our review, feed-forward correction, evaluation, and
optimization (e.g., with feed-forward neural networks)
are more beneficial than feedback-based ones for many
synchrotron practices. In future research, the aim will be
to present a more comprehensive synthesis with more
details on how to use ML in synchrotrons.
REFERENCES
[1] Australian Synchrotron, “What is a synchrotron?” [Online].
Available:
http://archive.synchrotron.org.au/synchrotron-science/what-is-a-synchrotron
[Accessed: 10-Oct-2021].
[2] ESRF, “What is a synchrotron?” [Online]. Available:
https://www.esrf.fr/about/synchrotron-science/synchrotron
[Accessed: 10-Oct-2021].
[3] ALBA, “What is a synchrotron?” [Online]. Available:
https://intranet.cells.es/AboutUs/WhatIs
[Accessed: 10-Oct-2021].
[4] P. M. Maffettone, J. K. Lynch, T. A. Caswell, C. E. Cook,
S. I. Campbell, and D. Olds, “Gaming the beamlines—
employing reinforcement learning to maximize scientific
outcomes at large-scale user facilities,” Mach. Learn. Sci.
Technol., vol. 2, no. 2, p. 025025, Jun. 2021.
[5] Lightsources.org, “Lightsources.org.” [Online]. Available:
https://lightsources.org/ [Accessed: 10-Oct-2021].
[6] Lightsources.org, “Synchrotron facilities.” [Online].
Available:
https://lightsources.org/lightsources-of-the-world/
[Accessed: 10-Oct-2021].
[7] X. Huang, M. Song, and Z. Zhang, “Report for Beam Based
Optimization and Machine Learning for Synchrotrons at
SSRL,” 2021.
[8] T. Connolley, C. M. Beavers, and P. Chater, “High-Energy
Adventures at Diamond Light Source,” Synchrotron Radiat.
News, vol. 33, no. 6, pp. 31–36, Dec. 2020.
[9] C. Wang et al., “Deploying the Big Data Science Center at
the Shanghai Synchrotron Radiation Facility: the first
superfacility platform in China,” Mach. Learn. Sci. Technol.,
vol. 2, no. 3, p. 035003, Sep. 2021.
[10] C. Wang, U. Steiner, and A. Sepe, “Synchrotron Big Data
Science,” Small, vol. 14, no. 46, p. 1802291, Nov. 2018.
[11] PHYSICS TODAY, “Synchrotrons face a data deluge,”
2020. [Online]. Available:
https://physicstoday.scitation.org/do/10.1063/PT.6.2.20200925a/full/
[Accessed: 06-Sep-2021].
[12] S. Alizadeh and A. Khaleghi, “The Study of Big Data Tools
Usages in Synchrotrons”, in Proc. 16th Int. Conf. on
Accelerator and Large Experimental Physics Control
Systems (ICALEPCS'17), Barcelona, Spain, Oct. 2017,
pp. 1428-1431.
doi:10.18429/JACoW-ICALEPCS2017-THPHA034
[13] P. Arpaia et al., “Machine learning for beam dynamics
studies at the CERN Large Hadron Collider,” Nucl.
Instruments Methods Phys. Res. Sect. A Accel.
Spectrometers, Detect. Assoc. Equip., vol. 985, no. June
2020, p. 164652, Jan. 2021.
[14] S. National, M. Park, and D. Bowring, “Opportunities in
Machine Learning for Particle Accelerators,” 2018.
[15] R. Lund, “4 Benefits Of Machine Learning,” 2021. [Online].
Available: https://techvera.com/4-benefits-of-machine-learning/
[Accessed: 09-Sep-2021].
[16] S. C. Leemann et al., “Demonstration of Machine Learning-
Based Model-Independent Stabilization of Source
Properties in Synchrotron Light Sources,” Phys. Rev. Lett.,
vol. 123, no. 19, p. 194801, 2019.
[17] S. I. Campbell et al., “Outlook for artificial intelligence and
machine learning at the NSLS-II,” Mach. Learn. Sci.
Technol., vol. 2, no. 1, p. 013001, Mar. 2021.
[18] M. Giovannozzi, E. Maclean, C. E. Montanari, G. Valentino,
and F. F. Van der Veken, “Machine Learning Applied to the
Analysis of Nonlinear Beam Dynamics Simulations for the
CERN Large Hadron Collider and Its Luminosity Upgrade,”
Information, vol. 12, no. 2, p. 53, Jan. 2021.
[19] E. Fol, R. Tomás, G. Franchetti, and J. Coello de Portugal,
“Application of machine learning to beam diagnostics,”
in Proc. 39th Int. Free Electron Laser Conf. (FEL'19),
Hamburg, Germany, Aug. 2019, pp. 311-317.
doi:10.18429/JACoW-FEL2019-WEB03
[20] R. A. Howard, Dynamic Programming and Markov
Processes, 1st ed. New York: Technology Press and
Wiley, 1960.
[21] J. Schmidt, M. R. G. Marques, S. Botti, and M. A. L.
Marques, “Recent advances and applications of machine
learning in solid-state materials science,” npj Comput.
Mater., vol. 5, no. 1, 2019.
[22] E. Fol, F. Carlier, A. G. Valdivieso, and R. Tomás,
“Machine Learning Methods for Optics Measurements and
Corrections At LHC,” in Proc. 9th Int. Particle Accelerator
Conf. (IPAC’18), Vancouver, Canada, Apr.-May 2018,
pp. 1967-1970.
doi:10.18429/JACoW-IPAC2018-WEPAF062
[23] A. L. Edelen, S. G. Biedron, B. E. Chase, D. Edstrom, S. V
Milton, and P. Stabile, “Neural Networks for Modeling and
Control of Particle Accelerators,” IEEE Trans. Nucl. Sci.,
vol. 63, no. 2, pp. 878–897, Apr. 2016.
[24] P. S. Reel, S. Reel, E. Pearson, E. Trucco, and E. Jefferson,
“Using machine learning approaches for multi-omics data
analysis: A review,” Biotechnol. Adv., vol. 49, no. May, p.
107739, Jul. 2021.
[25] R. Gehrke, A. Kopmann, E. Wintersberger, and F.
Beckmann, “The High Data Rate Processing and Analysis
Initiative of the Helmholtz Association in Germany,”
Synchrotron Radiat. News, vol. 28, no. 2, pp. 36–42, Mar.
2015.
[26] D. H. Barrett and A. Haruna, “Artificial intelligence and
machine learning for targeted energy storage solutions,”
Curr. Opin. Electrochem., vol. 21, pp. 160–166, Jun. 2020.
[27] J. Hill et al., “Future trends in synchrotron science at NSLS-
II,” J. Phys. Condens. Matter, vol. 32, no. 37, p. 374008,
Sep. 2020.
[28] B. Blaiszik, K. Chard, R. Chard, I. Foster, and L. Ward,
“Data automation at light sources,” in AIP Conference
Proceedings, 2019, vol. 2054, no. January, p. 020003.
[29] B. Wang et al., “Deep learning for analysing synchrotron
data streams,” 2016 New York Sci. Data Summit, NYSDS
2016 - Proc., 2016.
[30] Z. Liu, T. Bicer, R. Kettimuthu, and I. Foster, “Deep
Learning Accelerated Light Source Experiments,” in 2019
IEEE/ACM Third Workshop on Deep Learning on
Supercomputers (DLS), 2019, pp. 20–28.
[31] Argonne National Laboratory, “Xlearn,” 2016. [Online].
Available:
https://xlearn.readthedocs.io/en/latest/source/introduction.html
[32] X. Yang, F. De Carlo, C. Phatak, and D. Gürsoy, “A
convolutional neural network approach to calibrating the
rotation axis for X-ray computed tomography,” J.
Synchrotron Radiat., vol. 24, no. 2, pp. 469–475, Mar. 2017.
[33] Keras-team, “keras.” [Online]. Available:
https://github.com/fchollet/keras.git
[Accessed: 11-Sep-2021].
[34] Theano, “Theano.” [Online]. Available:
https://github.com/Theano/TheaNo [Accessed: 11-Sep-2021].
[35] “xlearn.” [Online]. Available:
https://github.com/tomography/xlearn [Accessed: 08-Sep-2021].
[36] A. A. Hendriksen et al., “Deep denoising for multi-
dimensional synchrotron X-ray tomography without high-
quality reference data,” Sci. Rep., vol. 11, no. 1, p. 11895,
Dec. 2021.
[37] S. Kossman, “Software Developed at Brookhaven Lab
Could Advance Synchrotron Science Worldwide,”
Brookhaven National Laboratory, 2017. [Online].
Available:
https://www.bnl.gov/newsroom/news.php?a=212470
[Accessed: 08-Sep-2021].
[38] Diamond Light Source, “Topaz3.” [Online]. Available:
https://github.com/DiamondLightSource/python-topaz3
[Accessed: 08-Sep-2021].
[39] L. V. Ramirez, T. Mertens, R. Mueller, J. Viefhaus, and G.
Hartmann, “Adding Machine Learning to the Analysis and
Optimization Toolsets at the Light Source BESSY II,” in
Proc. ICALEPCS2019, New York, NY, USA, Oct. 2019,
paper TUCPL01, pp. 754-760.
doi:10.18429/JACoW-ICALEPCS2019-TUCPL01
[40] D. Schirmer, “Orbit Correction With Machine Learning
Techniques at the Synchrotron Light Source DELTA,” in
Proc. ICALEPCS2019, New York, NY, USA, Oct. 2019,
paper WEPHA138, pp. 1426–1430.
doi:10.18429/JACoW-ICALEPCS2019-WEPHA138