
# Machine Learning and Dynamical Systems

Goal: Since its inception in the 19th century through the efforts of Poincaré and Lyapunov, the theory of dynamical systems has addressed the qualitative behaviour of dynamical systems as understood from models. From this perspective, the modeling of dynamical processes in applications requires a detailed understanding of the processes to be analyzed. This deep understanding leads to a model, which is an approximation of the observed reality and is often expressed by a system of ordinary/partial, underdetermined (control), deterministic/stochastic differential or difference equations. While models are very precise for many processes, for some of the most challenging applications of dynamical systems (such as climate dynamics, brain dynamics, biological systems or financial markets), the development of such models is notably difficult.
On the other hand, the field of machine learning is concerned with algorithms designed to accomplish a certain task, whose performance improves with the input of more data. Applications for machine learning methods include computer vision, stock market analysis, speech recognition, recommender systems and sentiment analysis in social media. The machine learning approach is invaluable in settings where no explicit model is formulated, but measurement data is available. This is frequently the case in many systems of interest, and the development of data-driven technologies is becoming increasingly important in many applications.

The intersection of the fields of dynamical systems and machine learning is largely unexplored, and the goal of this project is to bring together researchers from these fields to bridge the gap between the theories of dynamical systems and machine learning in the following directions:

Machine Learning for Dynamical Systems: how to analyze dynamical systems on the basis of observed data rather than attempt to study them analytically.
Dynamical Systems for Machine Learning: how to analyze algorithms of Machine Learning using tools from the theory of dynamical systems.


## Project log

The empirical laws governing human-curvilinear movements have been studied using various relationships, including minimum jerk, the 2/3 power law, and the piecewise power law. These laws quantify the speed-curvature relationships of human movements during curve tracing using critical speed and curvature as regressors. In this work, we provide a reservoir computing-based framework that can learn and reproduce human-like movements. Specifically, the geometric invariance of the observations, i.e., lateral distance from the closest point on the curve, instantaneous velocity, and curvature, when viewed from the moving frame of reference, are exploited to train the reservoir system. The artificially produced movements are evaluated using the power law to assess whether they are indistinguishable from their human counterparts. The generalisation capabilities of the trained reservoir to curves that have not been used during training are also shown.
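The abstract does not spell out the reservoir system itself; as a rough illustration of the general recipe it relies on (a random recurrent reservoir driven by the input, with only a linear readout trained), here is a minimal echo state network for one-step-ahead prediction of a toy signal. The reservoir size, spectral-radius scaling, washout length and ridge readout are all standard textbook choices, not the authors' actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 100, 500                       # reservoir size, series length

# Toy input signal; the task is one-step-ahead prediction.
u = np.sin(0.1 * np.arange(T + 1))

# Random reservoir, rescaled so its spectral radius is below 1
# (a common sufficient heuristic for the echo state property).
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, size=N)

# Drive the reservoir with the input and collect its states.
x = np.zeros(N)
states = np.zeros((T, N))
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])
    states[t] = x

# Linear readout trained by ridge regression on the next input value,
# after discarding an initial washout period.
washout = 100
S, y = states[washout:], u[washout + 1: T + 1]
w_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(N), S.T @ y)

mse = float(np.mean((S @ w_out - y) ** 2))
print(mse)   # training error; small for this simple signal
```

Only `w_out` is trained; the recurrent weights stay fixed, which is what makes reservoir training fast.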
We present a convolutional framework which significantly reduces the complexity and thus, the computational effort for distributed reinforcement learning control of dynamical systems governed by partial differential equations (PDEs). Exploiting translational invariances, the high-dimensional distributed control problem can be transformed into a multi-agent control problem with many identical, uncoupled agents. Furthermore, using the fact that information is transported with finite velocity in many cases, the dimension of the agents' environment can be drastically reduced using a convolution operation over the state space of the PDE. In this setting, the complexity can be flexibly adjusted via the kernel width or by using a stride greater than one. Moreover, scaling from smaller to larger systems -- or the transfer between different domains -- becomes a straightforward task requiring little effort. We demonstrate the performance of the proposed framework using several PDE examples with increasing complexity, where stabilization is achieved by training a low-dimensional deep deterministic policy gradient agent using minimal computing resources.
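The paper's code is not reproduced here, but the core dimensionality-reduction idea (each of many identical, uncoupled agents observes only a local window of the PDE state, selected by a kernel width and a stride) can be sketched as follows. The grid size, window width, stride and periodic padding are illustrative assumptions.

```python
import numpy as np

# Discretised 1-d PDE state on a periodic grid. Each identical agent
# observes only a local window of the state -- the 'convolution' over
# the state space; a stride > 1 thins the set of agents.
n, width, stride = 64, 7, 4
state = np.sin(2 * np.pi * np.arange(n) / n)

def local_observations(u, width, stride):
    half = width // 2
    padded = np.concatenate([u[-half:], u, u[:half]])   # periodic padding
    return np.stack([padded[s:s + width] for s in range(0, len(u), stride)])

obs = local_observations(state, width, stride)
print(obs.shape)   # (n // stride, width): one row per agent
```

Each row of `obs` would be fed to the same low-dimensional policy, so scaling the grid only multiplies the number of identical agents, not the size of any one agent's environment.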
Regressing the vector field of a dynamical system from a finite number of observed states is a natural way to learn surrogate models for such systems. As shown in [27, 15, 36, 16, 39, 29, 48], a simple and interpretable way to learn a dynamical system from data is to interpolate its vector field with a data-adapted kernel, which can be learned using Kernel Flows [42]. Kernel Flows is a trainable machine learning method that learns the optimal parameters of a kernel based on the premise that a kernel is good if there is no significant loss in accuracy when half of the data is used. The objective function could be a short-term prediction error or some other objective (cf. [27] and [37] for other variants of Kernel Flows). However, this method is limited by the choice of the base kernel. In this paper, we introduce the method of Sparse Kernel Flows in order to learn the "best" kernel by starting from a large dictionary of kernels. It is based on sparsifying a kernel that is a linear combination of elemental kernels. We apply this approach to a library of 132 chaotic systems.
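As a hedged sketch of the Kernel Flows premise (not the Sparse Kernel Flows algorithm itself), the snippet below computes the standard KF loss ρ = 1 − ‖u_half‖²/‖u_full‖², comparing the RKHS norm of the interpolant built from a random half of the data with that of the full interpolant, for two candidate bandwidths of a Gaussian base kernel. The toy data and kernel family are assumptions.

```python
import numpy as np

def rbf(X, Y, gamma):
    """Gaussian kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kf_rho(X, y, gamma, rng):
    """Kernel Flows loss rho = 1 - ||u_half||^2 / ||u_full||^2, where
    ||u||^2 = y^T K^{-1} y is the RKHS norm of the kernel interpolant.
    rho near 0 means dropping half the data loses little accuracy,
    i.e. the kernel is 'good' in the KF sense."""
    n = len(y)
    idx = rng.choice(n, n // 2, replace=False)
    K = rbf(X, X, gamma) + 1e-8 * np.eye(n)
    Kc = K[np.ix_(idx, idx)]
    full = y @ np.linalg.solve(K, y)
    half = y[idx] @ np.linalg.solve(Kc, y[idx])
    return 1.0 - half / full

# Toy regression data standing in for vector-field samples.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(80, 1))
y = np.sin(3 * X[:, 0])

# Average rho over random splits for two candidate bandwidths.
rhos = {g: float(np.mean([kf_rho(X, y, g, rng) for _ in range(20)]))
        for g in (0.01, 5.0)}
print(rhos)   # the bandwidth with the smaller average rho is preferred
```

Sparse Kernel Flows would replace the single bandwidth parameter with the coefficients of a large linear combination of elemental kernels and sparsify them while minimizing ρ.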
The Earth’s climate system displays substantial variability on spatial and temporal scales over many orders of magnitude. The complexity of the physics underlying the dynamics of the system poses a significant challenge for any attempt to model the climate quantitatively, since any computationally tractable climate model contains a broad range of scales that cannot be explicitly resolved but whose aggregate effects on the resolved scales must be accounted for. Much of the uncertainty in predictions of weather and climate variability is linked to parameterization issues, combined with our limited understanding of the interactions between the mesoscale, synoptic and planetary scales, and with simulation uncertainty due to finite computing power. Water vapor, as a greenhouse gas, plays an important role in the climate system, but its dynamics are extremely complicated because it is controlled both by cloud microphysical processes and by dynamical processes, neither of which is well understood or well represented in climate models. Climate is therefore sensitive to poorly known microphysical and dynamical processes, and in general circulation models (GCMs) the modeling and representation of water vapor is one of the major sources of uncertainty; a better parameterization scheme will improve the prediction of weather and climate. In this thesis, we present an observational data-driven, stochastic analysis-based method to represent these small-scale process forcings, providing an alternative to traditional deterministic moist convective parameterizations, which have known limitations and drawbacks. The thesis has two parts. The first part estimates moist convection by means of observational data and a theoretical model and then examines its role in the distribution of water vapor.
The second part presents a stochastic convective parameterization scheme, develops an idealized stochastic moist model and tests its validity. We have devised an observational data-driven, stochastic analysis-based method for parameterizing convective moistening. Convective forcing fluctuates strongly in time and shows relatively large temporal correlation; we therefore represent it by a correlated (colored) noise process in terms of a fractional Brownian motion. Diagnostic analysis shows that convective forcing is closely related to specific humidity, so multiplicative noise is used to develop an empirical formula relating the convective forcing to specific humidity. Based on the convergence theorem of power variation and stochastic calculus, optimal parameters are obtained, and an idealized theoretical stochastic model for water vapor evolution is developed. To validate the stochastic parameterization scheme, we compare the numerical predictions of the stochastic advection-diffusion-condensation model with the ERA-40 reanalysis. The results demonstrate that, despite its simplified treatment of water vapor, the stochastic model captures not only the first moment of water vapor in El Niño and La Niña years but also the second moment and the probability distribution. These results are quite promising. Both mathematical theory and numerical experiments verify that this data-based stochastic parameterization scheme is reasonable.
We consider the data-driven approximation of the Koopman operator for stochastic differential equations on reproducing kernel Hilbert spaces (RKHS). Our focus is on the estimation error if the data are collected from long-term ergodic simulations. We derive both an exact expression for the variance of the kernel cross-covariance operator, measured in the Hilbert-Schmidt norm, and probabilistic bounds for the finite-data estimation error. Moreover, we derive a bound on the prediction error of observables in the RKHS using a finite Mercer series expansion. Further, assuming Koopman-invariance of the RKHS, we provide bounds on the full approximation error. Numerical experiments using the Ornstein-Uhlenbeck process illustrate our results.
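As an illustration of the objects involved (not of the paper's error bounds), the empirical kernel cross-covariance operator of a time-lagged pair can be formed from a long ergodic trajectory, and its squared Hilbert-Schmidt norm reduces to a Gram-matrix expression. The Euler-Maruyama discretisation of the OU process and the Gaussian kernel below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Long ergodic trajectory of the OU process dX = -X dt + dW (Euler-Maruyama).
dt, m = 0.01, 2000
x = np.zeros(m + 1)
for t in range(m):
    x[t + 1] = x[t] - x[t] * dt + np.sqrt(dt) * rng.normal()

X, Y = x[:-1], x[1:]                     # state and time-lagged state

def gram(u, v, gamma=0.5):
    return np.exp(-gamma * (u[:, None] - v[None, :]) ** 2)

KX, KY = gram(X, X), gram(Y, Y)

# Empirical kernel cross-covariance operator C = (1/m) sum_i phi(X_i) (x) phi(Y_i);
# its squared Hilbert-Schmidt norm is tr(KX KY) / m^2, computable from Grams alone,
# since <phi(X_i), phi(X_j)> = KX[i, j] and <phi(Y_i), phi(Y_j)> = KY[i, j].
hs_norm_sq = float(np.sum(KX * KY)) / m**2
print(hs_norm_sq)
```

With a bounded kernel (here k ≤ 1), this quantity is automatically in (0, 1]; the paper's results control how far this finite-data estimator sits from its infinite-data limit.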
The advances in data science and machine learning have resulted in significant improvements regarding the modeling and simulation of nonlinear dynamical systems. It is nowadays possible to make accurate predictions of complex systems such as the weather, disease models or the stock market. Predictive methods are often advertised to be useful for control, but the specifics are frequently left unanswered due to the higher system complexity, the requirement of larger data sets and an increased modeling effort. In other words, surrogate modeling for autonomous systems is much easier than for control systems. In this paper we present the framework QuaSiModO (Quantization-Simulation-Modeling-Optimization) to transform arbitrary predictive models into control systems and thus render the tremendous advances in data-driven surrogate modeling accessible for control. Our main contribution is that we trade control efficiency by autonomizing the dynamics – which yields mixed-integer control problems – to gain access to arbitrary, ready-to-use autonomous surrogate modeling techniques. We then recover the complexity of the original problem by leveraging recent results from mixed-integer optimization. The advantages of QuaSiModO are a linear increase in data requirements with respect to the control dimension, performance guarantees that rely exclusively on the accuracy of the predictive model in use, and little prior knowledge requirements in control theory to solve complex control problems.
A universal kernel is constructed whose sections approximate any causal and time-invariant filter in the fading memory category with inputs and outputs in a finite-dimensional Euclidean space. This kernel is built using the reservoir functional associated with a state-space representation of the Volterra series expansion available for any analytic fading memory filter. It is hence called the Volterra reservoir kernel. Even though the state-space representation and the corresponding reservoir feature map are defined on an infinite-dimensional tensor algebra space, the kernel map is characterized by explicit recursions that are readily computable for specific data sets when employed in estimation problems using the representer theorem. We showcase the performance of the Volterra reservoir kernel in a popular data science application in relation to bitcoin price prediction.
Most existing results in the analysis of quantum reservoir computing (QRC) systems with classical inputs have been obtained using the density matrix formalism. This paper shows that alternative representations can provide better insights when dealing with design and assessment questions. More explicitly, system isomorphisms have been established that unify the density matrix approach to QRC with the representation in the space of observables using Bloch vectors associated with Gell-Mann bases. It has been shown that these vector representations yield state-affine systems (SAS) previously introduced in the classical reservoir computing literature and for which numerous theoretical results have been established. This connection has been used to show that various statements in relation to the fading memory (FMP) and the echo state (ESP) properties are independent of the representation, and also to shed some light on fundamental questions in QRC theory in finite dimensions. In particular, a necessary and sufficient condition for the ESP and FMP to hold has been formulated, and contractive quantum channels that have exclusively trivial semi-infinite solutions have been characterized in terms of the existence of input-independent fixed points.
To what extent can we forecast a time series without fitting to historical data? Can universal patterns of probability help in this task? Deep relations between pattern Kolmogorov complexity and pattern probability have recently been used to make a priori probability predictions in a variety of systems in physics, biology and engineering. Here we study simplicity bias (SB) — an exponential upper bound decay in pattern probability with increasing complexity — in discretised time series extracted from the World Bank Open Data collection. We predict upper bounds on the probability of discretised series patterns, without fitting to trends in the data. Thus we perform a kind of ‘forecasting without training data’, predicting time series shape patterns a priori, but not the actual numerical value of the series. Additionally we make predictions about which of two discretised series is more likely with accuracy of ∼80%, much higher than a 50% baseline rate, just by using the complexity of each series. These results point to a promising perspective on practical time series forecasting and integration with machine learning methods.
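A minimal sketch of the ingredients, under assumptions: the paper's exact discretisation and complexity measure are not given here, so the snippet uses a crude up/down binarisation and a greedy Lempel-Ziv-style phrase count as a stand-in complexity K, with the simplicity-bias bound noted only in a comment.

```python
import numpy as np

def lz_complexity(s):
    """Crude Lempel-Ziv-style complexity: number of phrases in a greedy
    left-to-right parse, where each phrase is the longest reoccurring
    prefix plus one new symbol. A stand-in for the paper's measure."""
    i, c, n = 0, 0, len(s)
    while i < n:
        l = 1
        while i + l <= n and s[i:i + l] in s[:i]:
            l += 1
        c += 1
        i += l
    return c

def discretise(series):
    """Binary up/down discretisation of a numeric time series."""
    return "".join("1" if b > a else "0" for a, b in zip(series, series[1:]))

rng = np.random.default_rng(0)
trend = discretise(np.arange(65.0))         # monotone series -> '1' * 64
noisy = discretise(rng.normal(size=65))     # i.i.d. noise -> irregular bits

# Simplicity bias posits P(pattern) <= 2^(-a*K - b) for constants a, b:
# low-complexity patterns may be common; high-complexity ones must be rare.
print(lz_complexity(trend), lz_complexity(noisy))
```

The monotone series parses into far fewer phrases than the noisy one, which is the asymmetry the a priori upper bounds exploit.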
The Koopman operator has become an essential tool for data-driven approximation of dynamical (control) systems, e.g., via extended dynamic mode decomposition. Despite its popularity, convergence results and, in particular, error bounds are still scarce. In this paper, we derive probabilistic bounds for the approximation error and the prediction error depending on the number of training data points, for both ordinary and stochastic differential equations while using either ergodic trajectories or i.i.d. samples. We illustrate these bounds by means of an example with the Ornstein–Uhlenbeck process. Moreover, we extend our analysis to (stochastic) nonlinear control-affine systems. We prove error estimates for a previously proposed approach that exploits the linearity of the Koopman generator to obtain a bilinear surrogate control system and, thus, circumvents the curse of dimensionality since the system is not autonomized by augmenting the state by the control inputs. To the best of our knowledge, this is the first finite-data error analysis in the stochastic and/or control setting. Finally, we demonstrate the effectiveness of the bilinear approach by comparing it with state-of-the-art techniques showing its superiority whenever state and control are coupled.
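For concreteness, here is a minimal extended dynamic mode decomposition (EDMD) on i.i.d. samples of the Ornstein-Uhlenbeck process, the illustrative example used in the paper; the monomial dictionary and sample size are assumptions of this sketch, not the paper's setup, and no error bounds are computed.

```python
import numpy as np

rng = np.random.default_rng(3)

# i.i.d. samples from the invariant law of the OU process dX = -X dt + dW,
# propagated exactly over a lag dt: X' = e^{-dt} X + Gaussian noise.
dt, m = 0.1, 5000
X = rng.normal(0.0, np.sqrt(0.5), size=m)
Y = np.exp(-dt) * X + np.sqrt(0.5 * (1.0 - np.exp(-2 * dt))) * rng.normal(size=m)

# Dictionary of observables: monomials 1, x, x^2, x^3 (their span is a
# Koopman-invariant subspace for OU, spanned by Hermite polynomials).
def psi(x):
    return np.vander(x, 4, increasing=True)

PX, PY = psi(X), psi(Y)

# EDMD: least-squares Koopman matrix K with psi(Y) ~ psi(X) @ K.
K, *_ = np.linalg.lstsq(PX, PY, rcond=None)

# x is a Koopman eigenfunction with eigenvalue e^{-dt}, so K[1, 1] should be
# close to exp(-0.1) ~ 0.905 up to sampling error.
print(K[1, 1])
```

The probabilistic bounds in the paper quantify exactly this sampling error as a function of the number of data points m.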
Modelling geophysical processes as low-dimensional dynamical systems and regressing their vector field from data is a promising approach for learning emulators of such systems. We show that when the kernel of these emulators is also learned from data (using kernel flows, a variant of cross-validation), then the resulting data-driven models are not only faster than equation-based models but are easier to train than neural networks such as the long short-term memory neural network. In addition, they are also more accurate and predictive than the latter. When trained on geophysical observational data, for example the weekly averaged global sea-surface temperature, considerable gains are also observed by the proposed technique in comparison with classical partial differential equation-based models in terms of forecast computational cost and accuracy. When trained on publicly available re-analysis data for the daily temperature of the North American continent, we see significant improvements over classical baselines such as climatology and persistence-based forecast techniques. Although our experiments concern specific examples, the proposed approach is general, and our results support the viability of kernel methods (with learned kernels) for interpretable and computationally efficient geophysical forecasting for a large diversity of processes.
For dynamical systems with a non-hyperbolic equilibrium, the center manifold theory makes it possible to significantly simplify the study of stability. This theory allows one to isolate the complicated asymptotic behavior of the system close to the equilibrium point and to obtain meaningful predictions of its behavior by analyzing a reduced-order system on the so-called center manifold. Since the center manifold is usually not known, good approximation methods are important, as the center manifold theorem states that the stability properties of the origin of the reduced-order system are the same as those of the origin of the full-order system. In this work, we establish a data-based version of the center manifold theorem that works with an approximation in place of the exact manifold, and we quantify the error between the approximated and the original reduced dynamics. We then use a suitable data-based kernel method to construct an approximation of the manifold close to the equilibrium that is compatible with our general error theory. The data are collected by repeated numerical simulation of the full system with a high-accuracy solver, which generates sets of discrete trajectories that are then used as a training set. The method is tested on different examples, which show promising performance and good accuracy.
Dear all, I am pleased to announce that the Machine Learning and Dynamical Systems (MLDS) seminars, hosted by The Alan Turing Institute, will resume on October 27th at 5.00 pm UK time. Our first speaker will be Frédéric Barbaresco, who will be speaking on "Souriau's Symplectic Foliation Structure of Dynamical Systems and Machine Learning on Lie Groups". More details about this talk are at https://lnkd.in/gPGUFwa3 and the talks will be in the usual Zoom meeting room at https://lnkd.in/gQUKXvq8
In case you missed it, the recordings and slides of the talks given at the 3rd symposium on MLDS are now at https://lnkd.in/gC2beP8V and https://lnkd.in/etZNf6kE
As announced during the symposium, Marian Mrozek will give monthly lectures on "Combinatorial Topological Dynamics" starting in November (the day and time are still undecided). To get updates about future activities on MLDS, please fill in the form at https://lnkd.in/g4AEUCiK

A simple and interpretable way to learn a dynamical system from data is to interpolate its vector field with a kernel. In particular, this strategy is highly efficient (both in terms of accuracy and complexity) when the kernel is data-adapted using Kernel Flows (KF) [OY19] (which uses gradient-based optimization to learn a kernel based on the premise that a kernel is good if there is no significant loss in accuracy when half of the data is used for interpolation). In this work, we extend previous work on learning dynamical systems using Kernel Flows [HO21, DHS+21, LBHO21, DTL+21, Owh21a] to the case of learning vector-valued dynamical systems from time-series observations that are partial/incomplete in the state space. The method combines Kernel Flows with Computational Graph Completion.
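The first step of this strategy, interpolating the dynamics with a kernel (before any kernel learning or graph completion enters), can be sketched on a fully observed toy system; the logistic map, the Gaussian kernel and the regularization level below are illustrative assumptions.

```python
import numpy as np

# Trajectory of the logistic map x_{n+1} = 4 x_n (1 - x_n), our toy dynamics.
T = 200
x = np.empty(T + 1)
x[0] = 0.3
for n in range(T):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])

X, y = x[:-1], x[1:]                    # snapshot pairs (x_n, x_{n+1})

def k(u, v, gamma=20.0):
    return np.exp(-gamma * (u[:, None] - v[None, :]) ** 2)

# Kernel (ridge) interpolation of the update map: f_hat(s) = k(s, X) @ alpha.
alpha = np.linalg.solve(k(X, X) + 1e-6 * np.eye(T), y)

s = np.linspace(0.05, 0.95, 50)
err = float(np.max(np.abs(k(s, X) @ alpha - 4.0 * s * (1.0 - s))))
print(err)   # small uniform error on the sampled region
```

Kernel Flows would then tune `gamma` (or a richer parametrization) by the half-data criterion, and Computational Graph Completion addresses the case where only some state components are observed.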
Data-driven models for nonlinear dynamical systems based on approximating the underlying Koopman operator or generator have proven to be successful tools for forecasting, feature learning, state estimation, and control. It has become well known that the Koopman generators for control-affine systems also have affine dependence on the input, leading to convenient finite-dimensional bilinear approximations of the dynamics. Yet there are still two main obstacles that limit the scope of current approaches for approximating the Koopman generators of systems with actuation. First, the performance of existing methods depends heavily on the choice of basis functions over which the Koopman generator is to be approximated; and there is currently no universal way to choose them for systems that are not measure preserving. Secondly, if we do not observe the full state, we may not gain access to a sufficiently rich collection of such functions to describe the dynamics. This is because the commonly used method of forming time-delayed observables fails when there is actuation. To remedy these issues, we write the dynamics of observables governed by the Koopman generator as a bilinear hidden Markov model, and determine the model parameters using the expectation-maximization (EM) algorithm. The E-step involves a standard Kalman filter and smoother, while the M-step resembles control-affine dynamic mode decomposition for the generator. We demonstrate the performance of this method on three examples, including recovery of a finite-dimensional Koopman-invariant subspace for an actuated system with a slow manifold; estimation of Koopman eigenfunctions for the unforced Duffing equation; and model-predictive control of a fluidic pinball system based only on noisy observations of lift and drag.
Reservoir computing systems are constructed using a driven dynamical system in which external inputs can alter the evolving states of the system. These paradigms are used in information processing, machine learning, and computation. A fundamental question that needs to be addressed in this framework is the statistical relationship between the input and the system states. This paper provides conditions that guarantee the existence and uniqueness of asymptotically invariant measures for driven systems and shows that their dependence on the input process is continuous when the sets of input and output processes are endowed with the Wasserstein distance. The main tool in these developments is the characterization of those invariant measures as fixed points of naturally defined Foias operators that appear in this context and which are studied in detail in the paper. Those fixed points are obtained by imposing a newly introduced stochastic state contractivity on the driven system that is readily verifiable in examples. Stochastic state contractivity can be satisfied by systems that are not state-contractive, state contractivity being the requirement typically invoked to guarantee the echo state property in reservoir computing. As a result, it may hold even when the echo state property is absent.
Dear all,
I am pleased to invite you to submit original research for the special issue of Physica D on Machine Learning and Dynamical Systems. Authors should select the article type ‘VSI: MLDS’ during submission.

Hi all,
I am looking for a graduate student, preferably at the PhD level, to work on a funded research project involving recurrent neural networks and dynamical systems. Some details about the project are described in the attached ad. Please contact me if you are interested and want more info.
Best regards,
Lorenzo Livi

In recent years, the artificial intelligence community has seen continuous interest in research investigating the dynamical aspects of both training procedures and machine learning models. Of particular interest among recurrent neural networks is the Reservoir Computing (RC) paradigm, characterized by conceptual simplicity and a fast training scheme. Yet the guiding principles under which RC operates are only partially understood. In this work, we analyze the role played by Generalized Synchronization (GS) when training an RC to solve a generic task. In particular, we show how GS allows the reservoir to correctly encode the system generating the input signal into its dynamics. We also discuss necessary and sufficient conditions for the learning to be feasible in this approach. Moreover, we explore the role that ergodicity plays in this process, showing how its presence allows the learning outcome to apply to multiple input trajectories. Finally, we show that satisfaction of GS can be measured by means of the mutual false nearest neighbors index, which makes the theoretical derivations readily usable by practitioners.
Training a residual neural network with L2 regularization on weights and biases is equivalent to minimizing a discrete least action principle and to controlling a discrete Hamiltonian system representing the propagation of input data across layers. The kernel/feature map analysis of this Hamiltonian system suggests a mean-field limit for trained weights and biases as the number of data points goes to infinity. The purpose of this paper is to investigate this mean-field limit and illustrate its existence through numerical experiments and analysis (for simple kernels).
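A sketch of the underlying identification, under assumptions (the layer count, widths, scales and step size are arbitrary): a residual network is read as an explicit Euler discretisation of a controlled ODE, and the L2 weight penalty as a discrete action summed over layers. This only illustrates the correspondence, not the mean-field limit studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# A residual network step x_{k+1} = x_k + h * tanh(W_k x_k + b_k) is the
# explicit Euler discretisation of the ODE dx/dt = tanh(W(t) x + b(t))
# with step h = 1/L, where L is the number of layers.
L, d = 10, 3
h = 1.0 / L
Ws = rng.normal(scale=0.5, size=(L, d, d))
bs = rng.normal(scale=0.1, size=(L, d))

def resnet(x0):
    """Propagate an input through the residual network / discrete flow."""
    x = x0.copy()
    for W, b in zip(Ws, bs):
        x = x + h * np.tanh(W @ x + b)
    return x

# L2 regularisation of weights and biases reads as a discrete action
# h * sum_k (|W_k|^2 + |b_k|^2), penalising large 'velocities' of the flow.
action = h * (np.sum(Ws ** 2) + np.sum(bs ** 2))

out = resnet(np.ones(d))
print(out, action)
```

Minimizing loss plus `action` over the weights is then a discrete least action principle, which is what connects training to a controlled Hamiltonian system.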
Koopman operator theory has been successfully applied to problems from various research areas such as fluid dynamics, molecular dynamics, climate science, engineering, and biology. Applications include detecting metastable or coherent sets, coarse-graining, system identification, and control. There is an intricate connection between dynamical systems driven by stochastic differential equations and quantum mechanics. In this paper, we compare the ground-state transformation and Nelson's stochastic mechanics and demonstrate how data-driven methods developed for the approximation of the Koopman operator can be used to analyze quantum physics problems. Moreover, we exploit the relationship between Schrödinger operators and stochastic control problems to show that modern data-driven methods for stochastic control can be used to solve the stationary or imaginary-time Schrödinger equation. Our findings open up a new avenue toward solving Schrödinger's equation using recently developed tools from data science.
Starting September 1st, 2022, there are three full-time PhD student positions available in the Data Science for Engineering group (www.cs.upb.de/dse) at Paderborn University, Germany, within a newly founded Junior Research Group on "Multicriteria Machine Learning - Efficiency, Robustness, Interactivity and System Knowledge".
The research topic is centered around the development of multicriteria and physics-informed deep learning algorithms. More specifically, you will work on
• The development of efficient optimization algorithms for training neural networks regarding multiple conflicting objective functions
• The consideration of system knowledge, e.g., in the form of conservation laws or differential equations
• Interactive learning / adaptation of deep neural networks
• The development and publication of open-source code
• The publication of the results in scientific journals or international conferences
Applicants should have a Master's degree (or comparable) in Mathematics, Computer Science, Physics or the Engineering Sciences. Applications are welcome until July 15th, 2022. For more details, please refer to the official job posting (www.upb.de/fileadmin/zv/4-4/stellenangebote/Kennziffer5341_EN.pdf).

We propose a machine learning (ML) non-Markovian closure modelling framework for accurate predictions of statistical responses of turbulent dynamical systems subjected to external forcings. One of the difficulties in this statistical closure problem is the lack of training data, which is a configuration that is not desirable in supervised learning with neural network models. In this study with the 40-dimensional Lorenz-96 model, the shortage of data is due to the stationarity of the statistics beyond the decorrelation time. Thus, the only informative content in the training data is from the short-time transient statistics. We adopt a unified closure framework on various truncation regimes, including and excluding the detailed dynamical equations for the variances. The closure framework employs a long short-term memory (LSTM) architecture to represent the higher-order unresolved statistical feedbacks with a choice of ansatz that accounts for the intrinsic instability yet produces stable long-time predictions. We found that this unified agnostic ML approach performs well under various truncation scenarios. Numerically, it is shown that the ML closure model can accurately predict the long-time statistical responses subjected to various time-dependent external forces that have larger maximum forcing amplitudes and are not in the training dataset. This article is part of the theme issue ‘Data-driven prediction in dynamical systems’.
Two research opportunities at the Division of Mathematical Sciences of the Nanyang Technological University (NTU, Singapore)
I am looking for candidates to fill two positions in my research group at the graduate student and postdoctoral levels. In both cases, I am looking for mathematicians, statisticians, physicists, or candidates with diplomas in closely related fields, interested in research at the interface between geometry, dynamics, and learning.
Even though it is not a necessary condition, candidates with expertise in reservoir computing, geometric mechanics, variational and structure-preserving integration, dynamical systems, or statistical learning will be highly considered.
-Graduate scholarship: this is a four-year grant towards a PhD in mathematics. Candidates must apply for admission to NTU by the beginning of September. Candidates holding diplomas from certain countries may need to provide GRE scores and/or proof of proficiency in English. The successful candidate will start his/her graduate studies in January 2023.
-Postdoctoral position: this is a two-year full-time research position with no teaching responsibilities. Applicants should already hold a PhD at the time of application. For administrative reasons, the starting date for this position must be before the end of December 2022.
Interested candidates should contact Prof. Juan-Pablo Ortega at Juan-Pablo.Ortega@ntu.edu.sg.

Dear all, I am pleased to announce that registration is now open for the Third Symposium on Machine Learning and Dynamical Systems at the Fields Institute (Toronto), September 26-30, 2022. There are two options: online and in-person.  If you would like to register as an online participant, simply do not select any of the tickets (attendance ticket and dinner banquet). This way, the system will not ask you for any fee payment. You will receive the event Zoom link upon successful registration. Registration link is at http://www.fields.utoronto.ca/cgi-bin/register?form_selection=3rd-machine-learning Best wishes, Boumediene

This technical note presents an application of kernel mode decomposition (KMD) for detecting critical transitions in some fast-slow random dynamical systems. The approach rests on using KMD to reconstruct an observable with a novel data-based time-frequency-phase kernel that makes it possible to approximate signals with critical transitions. In particular, we apply the developed method to approximate the solution of, and detect critical transitions in, some prototypical slow-fast SDEs with critical transitions. We also apply it to detecting seizures in a multi-scale mesoscale model of brain activity. Keywords: Kernel Mode Decomposition (KMD), data-based kernels, micro-local kernel design, critical transitions, slow-fast stochastic differential equations, learning signal from data, learning noise from data.
To what extent can we forecast a time series without fitting to historical data? Can universal patterns of probability help in this task? Deep relations between pattern Kolmogorov complexity and pattern probability have recently been used to make a priori probability predictions in a variety of systems in physics, biology and engineering. Here we study simplicity bias (SB), an exponentially decaying upper bound on pattern probability with increasing complexity, in discretised time series extracted from the World Bank Open Data collection. We predict upper bounds on the probability of discretised series patterns without fitting to trends in the data; in this sense we perform a kind of 'forecasting without training data'. Additionally, we predict which of two discretised series is more likely with ∼80% accuracy, much higher than the 50% baseline rate, just by using the complexity of each series. These results point to a promising perspective on practical time series forecasting and integration with machine learning methods.
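The pairwise ranking experiment can be illustrated with a toy sketch (not the authors' pipeline): discretise two series into up/down patterns, use compressed length as a crude stand-in for Kolmogorov complexity, and predict that the simpler pattern is the more likely one. The helpers and example series below are illustrative assumptions:

```python
import zlib

def discretise(series):
    # Binarise a series into its up/down pattern.
    return "".join("1" if b > a else "0" for a, b in zip(series, series[1:]))

def complexity(pattern):
    # zlib-compressed length in bytes as a rough complexity proxy.
    return len(zlib.compress(pattern.encode()))

trend = discretise(range(40))                           # monotone: "111...1"
wiggly = discretise([(7 * k) % 11 for k in range(40)])  # irregular pattern
# SB-style prediction: the lower-complexity pattern is the more likely one.
more_likely = trend if complexity(trend) <= complexity(wiggly) else wiggly
```

Any real use would calibrate the bound's slope and offset against data, as the paper does with its complexity measure.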
We present an approach for guaranteed constraint satisfaction by means of data-based optimal control, where the model is unknown and has to be obtained from measurement data. To this end, we utilize the Koopman framework and an eDMD-based bilinear surrogate modeling approach for control systems to show an error bound on predicted observables, i.e., functions of the state. This result is then applied to the constraints of the optimal control problem to show that satisfaction of tightened constraints in the purely data-based surrogate model implies constraint satisfaction for the original system.
Koopman operator theory has been successfully applied to problems from various research areas such as fluid dynamics, molecular dynamics, climate science, engineering, and biology. Applications include detecting metastable or coherent sets, coarse-graining, system identification, and control. There is an intricate connection between dynamical systems driven by stochastic differential equations and quantum mechanics. In this paper, we compare the ground-state transformation and Nelson's stochastic mechanics and demonstrate how data-driven methods developed for the approximation of the Koopman operator can be used to analyze quantum physics problems. Moreover, we exploit the relationship between Schr\"odinger operators and stochastic control problems to show that modern data-driven methods for stochastic control can be used to solve the stationary or imaginary-time Schr\"odinger equation. Our findings open up a new avenue towards solving Schr\"odinger's equation using recently developed tools from data science.
Dear all, I am pleased to announce that the Third Symposium on Machine Learning and Dynamical Systems will be held at the Fields Institute in Toronto from Sept. 26 to Sept. 30, 2022. This event is scheduled to be in-person. Please contact me if you're interested in giving a talk or presenting a poster. The event's webpage will be at https://sites.google.com/site/boumedienehamzi/home/third-symposium-on-machine-learning-and-dynamical-systems Looking forward to your contributions, Best wishes, Boumediene

Dear all, There will be some MLDS seminars this month: i. On Thursday 20 Jan. at 17:00 UK, 9:00am PST, Omri Azencot will talk about "A Koopman Approach to Understanding Sequence Neural Models". ii. On Thursday 27 Jan. at 17:00 UK, 9:00am PST, Hedy Attouch will give another lecture on "Dynamical Systems and Optimisation". More details are at
Talks will be held in the usual Zoom link
Previous talks are at

In previous work, we showed that learning dynamical systems [21] with kernel methods can achieve state-of-the-art results, in terms of both accuracy and complexity, for predicting climate/weather time series [20] when the kernel is also learned from data. While the kernels considered in previous work were parametric, in this follow-up paper we test a non-parametric approach and tune warping kernels (with kernel flows, a variant of cross-validation) for learning prototypical dynamical systems.
A simple and interpretable way to learn a dynamical system from data is to interpolate its vector field with a kernel. This strategy is particularly efficient (in terms of both accuracy and complexity) when the kernel is data-adapted using Kernel Flows (KF) [34], which uses gradient-based optimization to learn a kernel on the premise that a kernel is good if there is no significant loss in accuracy when half of the data is used for interpolation. Despite its previous successes, this strategy (based on interpolating the vector field driving the dynamical system) breaks down when the observed time series is not regularly sampled in time. In this work, we address this problem by directly approximating the vector field of the dynamical system while incorporating the time differences between observations into the (KF) data-adapted kernels. We compare our approach with the classical one on different benchmark dynamical systems and show that it significantly improves forecasting accuracy while remaining simple, fast, and robust.
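The underlying strategy (before the irregular-sampling modification proposed here) can be sketched as follows; the toy system dx/dt = −x, the RBF kernel, and all parameters are illustrative choices rather than the paper's setup:

```python
import numpy as np

# Sketch: regress the vector field of dx/dt = -x from an irregularly
# sampled trajectory via finite differences and kernel ridge regression.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 5, 80))      # irregular sample times
x = np.exp(-t)                          # exact trajectory x(t) = e^{-t}

dx = np.diff(x) / np.diff(t)            # finite-difference velocities
xc = 0.5 * (x[:-1] + x[1:])             # midpoint states as inputs

def rbf(a, b, s=0.5):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * s**2))

lam = 1e-4                              # ridge regularization
alpha = np.linalg.solve(rbf(xc, xc) + lam * np.eye(len(xc)), dx)

def f_hat(z):
    # Learned vector field; f_hat(0.5) should be close to -0.5.
    return rbf(np.atleast_1d(z), xc) @ alpha
```

Kernel Flows would additionally tune the kernel parameters (here the length scale s) from the data.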
In this paper, we consider the density estimation problem associated with the stationary measure of ergodic Itô diffusions from a discrete time series that approximates the solutions of the stochastic differential equations. To take advantage of the characterization of the density function as the stationary solution of a parabolic-type Fokker-Planck PDE, we proceed as follows. First, we employ deep neural networks to approximate the drift and diffusion terms of the SDE by solving appropriate supervised learning tasks. Subsequently, we solve a steady-state Fokker-Planck equation associated with the estimated drift and diffusion coefficients with a neural-network-based least-squares method. We establish the convergence of the proposed scheme under appropriate mathematical assumptions, accounting for the generalization errors induced by regressing the drift and diffusion coefficients and by the PDE solvers. This theoretical study relies on a recent perturbation result for Markov chains that shows a linear dependence of the density estimation error on the error in estimating the drift term, and on generalization error results for nonparametric regression and for PDE solutions obtained with neural-network models. The effectiveness of this method is demonstrated by numerical simulations of a two-dimensional Student's t-distribution and a 20-dimensional Langevin dynamics.
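The first step, regressing the drift and diffusion from a discrete time series, can be illustrated with a simple binned conditional-moment estimator in place of the paper's deep networks; the OU process and all parameters below are illustrative:

```python
import numpy as np

# Illustrative sketch: estimate drift b(x) and squared diffusion of the
# OU process dX = -X dt + 0.5 dW from a discrete (Euler-Maruyama) series.
rng = np.random.default_rng(2)
dt, n, sigma = 0.01, 200_000, 0.5
step = sigma * np.sqrt(dt)
noise = rng.standard_normal(n - 1)
X = np.empty(n)
X[0] = 0.0
for k in range(n - 1):
    X[k + 1] = X[k] - X[k] * dt + step * noise[k]

dX = np.diff(X)
mask = np.abs(X[:-1] - 0.3) < 0.1   # bin around x0 = 0.3
b_hat = dX[mask].mean() / dt        # drift estimate; true value is -0.3
sigma2_hat = np.mean(dX**2) / dt    # squared diffusion; true value is 0.25
```

A neural network regression, as in the paper, replaces the binning with a smooth estimator that scales to higher dimensions.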
We propose a Machine Learning (ML) non-Markovian closure modeling framework for accurate predictions of statistical responses of turbulent dynamical systems subjected to external forcings. One of the difficulties in this statistical closure problem is the lack of training data, an undesirable configuration for supervised learning with neural network models. In this study of the 40-dimensional Lorenz-96 model, the shortage of data (in time) is due to the stationarity of the statistics beyond the decorrelation time; thus, the only informative content in the training data lies in the short-time transient statistics. We adopted a unified closure framework on various truncation regimes, including and excluding the detailed dynamical equations for the variances. The closure frameworks employ a Long Short-Term Memory architecture to represent the higher-order unresolved statistical feedbacks, taking care to account for the intrinsic instability while producing stable long-time predictions. We found that this unified agnostic ML approach performs well under various truncation scenarios. Numerically, the ML closure model can accurately predict the long-time statistical responses subjected to various time-dependent external forces that are not in the training dataset, including maximum forcing amplitudes relatively larger than those seen in training.
The Koopman operator has become an essential tool for data-driven approximation of dynamical (control) systems in recent years, e.g., via extended dynamic mode decomposition. Despite its popularity, convergence results and, in particular, error bounds are still quite scarce. In this paper, we derive probabilistic bounds for the approximation error and the prediction error depending on the number of training data points, for both ordinary and stochastic differential equations. Moreover, we extend our analysis to nonlinear control-affine systems using either ergodic trajectories or i.i.d. samples. Here, we exploit the linearity of the Koopman generator to obtain a bilinear system and thus circumvent the curse of dimensionality, since we do not autonomize the system by augmenting the state with the control inputs. To the best of our knowledge, this is the first finite-data error analysis in the stochastic and/or control setting. Finally, we demonstrate the effectiveness of the proposed approach by comparing it with state-of-the-art techniques, showing its superiority whenever state and control are coupled.
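For readers unfamiliar with EDMD, the basic construction can be shown on a toy example (not the paper's setting): for the linear map x_{k+1} = 0.9·x_k with dictionary {1, x, x²}, the EDMD matrix is the least-squares solution of Ψ(X)K ≈ Ψ(Y), and its eigenvalues recover the Koopman eigenvalues 1, 0.9, and 0.81:

```python
import numpy as np

# Minimal EDMD sketch on the toy linear map x_{k+1} = 0.9 x_k.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 200)   # sampled states
Y = 0.9 * X                   # their images under the dynamics

def psi(x):
    # Dictionary of observables {1, x, x^2}, evaluated row-wise.
    return np.column_stack([np.ones_like(x), x, x**2])

# EDMD matrix: least-squares solution of psi(X) @ K = psi(Y).
K, *_ = np.linalg.lstsq(psi(X), psi(Y), rcond=None)
eigs = np.sort(np.linalg.eigvals(K).real)
# eigs approximates the Koopman eigenvalues [0.81, 0.9, 1.0]
```

The finite-data bounds of the paper quantify how far such an empirical K is from the true Koopman operator as the number of samples grows.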
This paper shows that the celebrated Embedding Theorem of Takens is a particular case of a much more general statement according to which, randomly generated linear state-space representations of generic observations of an invertible dynamical system carry in their wake an embedding of the phase space dynamics into the chosen Euclidean state space. This embedding coincides with a natural generalized synchronization that arises in this setup and that yields a topological conjugacy between the state-space dynamics driven by the generic observations of the dynamical system and the dynamical system itself. This result provides additional tools for the representation, learning, and analysis of chaotic attractors and sheds additional light on the reservoir computing phenomenon that appears in the context of recurrent neural networks.
This event is the first of a new seminar series organized by the IFAC TC on Optimal Control.
Details:
• Date: July 8th 2021, 14h00–17h30 (CET)
• Zoom: https://tu-dortmund.zoom.us/s/99932731634; no registration required
• Passcode: 540526
• Organizers: Timm Faulwasser (TU Dortmund, Germany) & Karl Worthmann (TU Ilmenau, Germany)
Schedule:
• 14h00–14h30: Gradient-enriched machine learning control – Taming turbulence made efficient, easy and fast! (Bernd Noack, Harbin Institute of Technology, China)
• 14h30–15h00: Convolutional autoencoders for low-dimensional parameterizations of Navier-Stokes flow (Jan Heiland, MPI Magdeburg, Germany)
• 15h00–15h30: Three perspectives on data-based optimal control (Matthias Müller, LU Hannover, Germany)
• 15h30–16h00: Coffee break
• 16h00–16h30: Data-Driven Skill Learning (Jan Peters, TU Darmstadt, Germany)
• 16h30–17h00: A deep neural network approach for computing Lyapunov functions (Lars Grüne, U Bayreuth, Germany)
• 17h00–17h30: On the universal transformation of data-driven models to control systems (Sebastian Peitz, U Paderborn, Germany)
The COVID-19 pandemic continues to jeopardize many conference activities. At the same time, all of us have also experienced successful editions of online events. Hence, the IFAC TC on Optimal Control is happy to announce its virtual seminar series comprising 2-3 events per year.

I am pleased to invite you to submit papers to the special issue of the Journal of Scientific Computing on "Beyond traditional AI: the impact of Machine Learning on Scientific Computing" that I am co-editing with Francesco Piccialli (University of Naples Federico II), Salvatore Cuomo (University of Naples Federico II) and Jan Hesthaven (EPFL).

This paper studies the theoretical underpinnings of machine learning of ergodic Itô diffusions. The objective is to understand the convergence properties of the invariant statistics when the underlying system of stochastic differential equations (SDEs) is empirically estimated with a supervised regression framework. Using the perturbation theory of ergodic Markov chains and the linear response theory, we deduce a linear dependence of the errors of one-point and two-point invariant statistics on the error in the learning of the drift and diffusion coefficients. More importantly, our study shows that the usual $L^2$-norm characterization of the learning generalization error is insufficient for achieving this linear dependence result. We find that sufficient conditions for such a linear dependence result are through learning algorithms that produce a uniformly Lipschitz and consistent estimator in the hypothesis space that retains certain characteristics of the drift coefficients, such as the usual linear growth condition that guarantees the existence of solutions of the underlying SDEs. We examine these conditions on two well-understood learning algorithms: the kernel-based spectral regression method and the shallow random neural networks with the ReLU activation function.
Koopman operator theory, a powerful framework for discovering the underlying dynamics of nonlinear dynamical systems, was recently shown to be intimately connected with neural network training. In this work, we take the first steps in making use of this connection. As Koopman operator theory is a linear theory, a successful implementation of it in evolving network weights and biases offers the promise of accelerated training, especially in the context of deep networks, where optimization is inherently a non-convex problem. We show that Koopman operator theoretic methods allow for accurate predictions of weights and biases of feedforward, fully connected deep networks over a non-trivial range of training time. During this window, we find that our approach is >10x faster than various gradient descent based methods (e.g. Adam, Adadelta, Adagrad), in line with our complexity analysis. We end by highlighting open questions in this exciting intersection between dynamical systems and neural network theory, and by pointing to additional methods by which our results could be expanded to broader classes of networks and larger training intervals, which will be the focus of future work.
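The underlying idea, treating the sequence of weight snapshots as a dynamical system and fitting a linear (DMD-style) model to extrapolate it, can be sketched on synthetic data; the geometric trajectory below is an illustrative stand-in for real optimizer iterates:

```python
import numpy as np

# DMD-style sketch: fit a linear model to successive weight increments
# and extrapolate training forward without running the optimizer.
w_star = np.array([1.0, -2.0, 0.5])      # hypothetical limit weights
v = np.array([1.0, 2.0, -1.0])
# Synthetic "training" snapshots converging geometrically to w_star.
traj = np.array([w_star + 0.8**k * v for k in range(20)])

d = np.diff(traj, axis=0)                 # increments d_k = w_{k+1} - w_k
A = d[1:].T @ np.linalg.pinv(d[:-1].T)    # linear model d_{k+1} = A d_k

# Extrapolate 10 steps past the end of the observed trajectory.
w, inc = traj[-1].copy(), d[-1]
for _ in range(10):
    inc = A @ inc
    w = w + inc
# w is now close to the limit w_star
```

Working with increments avoids needing the (unknown) fixed point; the paper's method operates on far higher-dimensional weight vectors with the same linear-predictor philosophy.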
Date: 04 May 2021 at 12:00 pm UK time.
Zoom meeting room: https://zoom.us/j/5261634867
Title: Extracting the ENSO cycle from observations and the birth and death of coherent sets. Abstract: In the first part of the talk I will describe how the spectrum and eigenfunctions of transfer operators can extract cycles from possibly high-dimensional spatio-temporal information. This will be illustrated by extracting a canonical cycle for the El Niño-Southern Oscillation (ENSO) from sea-surface temperature data. We are able to produce a "rectified" cycle that provides more detail in rapid transitions. In the second part of the talk I will introduce new theory to handle the birth and death of coherent sets. Coherent sets are regions of phase space that minimally deform and mix under general aperiodic advection. They are by definition material, evolving with the underlying flow or dynamics. Methods for identifying coherent sets, and Lagrangian coherent structures more generally, rely on coherence being present throughout a specified time interval. In reality, coherent structures are ephemeral, continually appearing and disappearing. I will present a new construction, based on the dynamic Laplacian, that relaxes this materiality requirement in a natural way and provides the means to resolve the births, lifetimes, and deaths of coherent structures.
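A standard data-driven route to such transfer-operator spectra is Ulam's method; the sketch below applies it to the logistic map as an illustrative stand-in (this is not the ENSO computation from the talk):

```python
import numpy as np

# Ulam-type approximation of the transfer operator for x -> 4x(1-x):
# count transitions between n equal bins to build a stochastic matrix P;
# the leading eigenvector of P^T approximates the invariant density.
rng = np.random.default_rng(6)
n, N = 50, 200_000
x = rng.uniform(0, 1, N)
y = 4 * x * (1 - x)
bi = np.minimum((x * n).astype(int), n - 1)   # source bins
bj = np.minimum((y * n).astype(int), n - 1)   # image bins
P = np.zeros((n, n))
np.add.at(P, (bi, bj), 1.0)
P /= P.sum(axis=1, keepdims=True)

vals, vecs = np.linalg.eig(P.T)
k = np.argmax(vals.real)                      # leading eigenvalue is 1
dens = np.abs(vecs[:, k].real)                # ~ density 1/(pi*sqrt(x(1-x)))
```

Subdominant eigenvalues and eigenvectors of such matrices are what reveal cycles and almost-invariant structure in the spatio-temporal setting of the talk.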

Many problems in science and engineering require the efficient numerical approximation of integrals, a particularly important application being the numerical solution of initial value problems for differential equations. For complex systems, an equidistant discretization is often inadvisable, as it results in either prohibitively large errors or prohibitive computational effort. To this end, adaptive schemes have been developed that rely on error estimators based on Taylor series expansions. While these estimators (a) rely on strong smoothness assumptions and (b) may still result in erroneous steps for complex systems (and thus require step-rejection mechanisms), here we propose a data-driven time stepping scheme based on machine learning, and more specifically on reinforcement learning (RL) and meta-learning. First, one or several (in the case of non-smooth or hybrid systems) base learners are trained using RL. Then, a meta-learner is trained which, depending on the system state, selects the base learner that appears to be optimal for the current situation. Several examples including both smooth and non-smooth problems demonstrate the superior performance of our approach over state-of-the-art numerical schemes. The code is available under https://github.com/lueckem/quadrature-ML.
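For context, the classical Taylor-based error estimation that the adaptive schemes above rely on can be sketched via step doubling with the explicit Euler method; the test equation dx/dt = −x and the step sizes are illustrative choices, not taken from the paper:

```python
# Step-doubling error estimator sketch: compare one full Euler step with
# two half steps; their difference estimates the local error, O(h^2).
def f(x):
    return -x  # test equation dx/dt = -x

def euler(x, h, steps):
    for _ in range(steps):
        x = x + h * f(x)
    return x

def step_with_estimate(x, h):
    big = euler(x, h, 1)            # one full step
    small = euler(x, h / 2, 2)      # two half steps (more accurate)
    return small, abs(big - small)  # accepted value, local error estimate

_, e1 = step_with_estimate(1.0, 0.1)
_, e2 = step_with_estimate(1.0, 0.05)
# halving the step size roughly quarters the estimate
```

An adaptive scheme shrinks or grows h until such an estimate meets a tolerance; the RL approach in the paper instead learns the step-size policy from data.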
Date: Wed Apr 14, 16:00 - 17:30 UK time.
Speaker: Romit Maulik
Title: Incorporating inductive biases for the surrogate modeling of dynamical systems
Abstract: In recent times, there has been great excitement about the use of surrogate modeling across various applications to bypass the traditional bottlenecks of numerical discretization. However, naive deployments of surrogate models are limited due to the fact that they do not preserve certain desirable properties, for instance, conservation of energy. This has implications for downstream tasks such as ensemble evaluations for UQ or sensitivity analyses, particularly in the case of deep learning surrogates which suffer from exceptionally poor performance during extrapolation. In this talk, we shall discuss results from two projects that aim to construct neural network surrogates for dynamical systems. The first part of the talk shall introduce the development of custom neural architectures that exactly reproduce Hamiltonian mechanics. This neural architecture is then used to generate Poincaré maps for toroidal magnetic fields in an accelerated manner compared to a field-line integration method. Secondly, we shall investigate the learning of stochastic processes using multivariate temporal normalizing flows that are augmented to prevent scaling and translation in the temporal dimension. These are used to estimate solutions of non-local Fokker-Planck equations for stochastic processes with anomalous diffusion. The emulated stochastic processes shall then be utilized to identify the drift and diffusion in the stochastic process under mild assumptions.
The talk will be held in the Zoom meeting room: https://zoom.us/j/5261634867

Dear all, You are cordially invited to attend three (online) seminars on "Topics at the Intersections of Machine Learning and Ergodic Theory" on Wednesday April 7th from 1:00pm UK time to 5:30pm UK time. Details are at https://agora.stream/SIG%20on%20Machine%20Learning%20and%20Dynamical%20Systems Talks will be at the Zoom meeting room: https://zoom.us/j/5261634867

Speaker: Johan Suykens
Date and Time: Wednesday 31 March at 4:00pm UK time, 11:00am EST.
Title: Kernel machines and dynamical systems modelling
Abstract: In the first part of this talk we discuss function estimation methods from the perspective of primal and dual model representations. An advantage of this kernel-based framework is that one can work with either explicit or implicit feature maps related to kernel functions. This makes it possible to combine insights from parametric and non-parametric statistics and to make several connections between neural networks, deep learning, and kernel machines. In the second part we discuss its application to dynamical systems modelling with the use of different model structures, for input-output as well as nonlinear state-space descriptions.
Zoom meeting room: https://zoom.us/j/5261634867
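The primal/dual correspondence in the first part of the abstract can be made concrete with a small sketch (illustrative data and parameters): ridge regression with the explicit feature map φ(x) = (1, √2·x, x²) of the polynomial kernel k(x, z) = (1 + xz)² produces exactly the same predictions as kernel ridge regression with k itself:

```python
import numpy as np

# Primal (explicit feature map) vs dual (kernel) ridge regression.
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, 30)
y = x**2 + 0.1 * rng.standard_normal(30)
lam = 1e-3

def phi(v):
    # Explicit feature map of k(x, z) = (1 + x z)^2.
    return np.column_stack([np.ones_like(v), np.sqrt(2) * v, v**2])

K = (1 + np.outer(x, x)) ** 2                       # kernel Gram matrix

# Primal weights and dual coefficients.
w = np.linalg.solve(phi(x).T @ phi(x) + lam * np.eye(3), phi(x).T @ y)
alpha = np.linalg.solve(K + lam * np.eye(30), y)

z = np.linspace(-1, 1, 5)
primal = phi(z) @ w
dual = (1 + np.outer(z, x)) ** 2 @ alpha            # identical predictions
```

The dual form never needs the feature map explicitly, which is what lets kernel machines use implicit (even infinite-dimensional) features.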

Speaker: Prof. Sebastian van Strien,  Imperial College London
Date and Time: Wednesday 17 March 2021 at 10:00am UK time.
Title:  Dynamics of learning algorithms in strategic environments.
Abstract: There are quite a few algorithms which aim to learn how players should "learn" to play a game. In this talk we will focus on several of these, namely best response dynamics, fictitious play, replicator dynamics, reinforcement learning, and no-regret learning. Although these dynamics originate in different fields (economics, biology, computer science, game theory), there are surprising similarities between them. We will also discuss whether it is advantageous (in terms of payoff) for a player to use a learning algorithm rather than play the Nash Equilibrium (NE).
This talk will be the first of a sequence of talks on dynamics and games, organised by Aamal Hussain.
Zoom meeting room: https://zoom.us/j/5261634867
Meeting ID: 526 163 4867
One tap mobile: +13462487799,,5261634867# US (Houston); +16699006833,,5261634867# US (San Jose)
Dial by your location: +1 346 248 7799 US (Houston); +1 669 900 6833 US (San Jose); +1 929 205 6099 US (New York); +1 253 215 8782 US (Tacoma); +1 301 715 8592 US (Germantown); +1 312 626 6799 US (Chicago)
Find your local number: https://zoom.us/u/adJzYyqVx6

Date and Time: March 9th at 1:00pm UK time
Abstract: In the spirit of optimal approximation and reduced-order modelling, the goal of DMD methods and variants is to describe the dynamical evolution as a linear evolution in an appropriately transformed lower-rank space, as well as possible. The fact that Koopman eigenfunctions follow a linear PDE that is solvable by the method of characteristics yields several interesting relationships between geometric and algebraic properties. We focus on contrasting cardinality, algebraic multiplicity, and other geometric aspects with the introduction of an equivalence class, "primary eigenfunctions," for those eigenfunctions with identical sets of level sets. We present a construction that leads to functions on the data surface that yield optimal Koopman eigenfunction DMD (oKEEDMD). We will also describe how disparate systems can be "matched" by a diffeomorphism constructed via eigenfunctions from each system, a reinterpretation of integrability, computationally realised by our "matching extended dynamic mode decomposition" (EDMD-M).
Zoom meeting room: https://zoom.us/j/5261634867

Day and time: Thursday March 4th at 4:00pm UK time [new date]
Title: Explainable and Reliable Learning of Dynamic Processes with Reservoir Computing
Abstract: Many dynamical problems in engineering (including finance), control theory, signal processing, time series analysis, and forecasting can be described using input/output (IO) systems. Whenever a true functional IO relation cannot be derived from first principles, specific families of state-space systems can be used as universal approximants in different setups. In this talk we will focus on the so-called Reservoir Computing (RC) systems, whose defining feature is their ease of implementation: some of their components (usually the state map) are randomly generated. From a machine learning (ML) perspective, RC systems can be seen as recurrent neural networks with randomly generated, non-trainable weights and a simple-to-train readout layer (often a linear map). RC systems serve as efficient, randomized, online computational tools for dynamic processes and constitute an explainable and reliable ML paradigm. We will discuss some theoretical developments, connections with contributions in other fields, and details of applications of RC systems for data processing.
Zoom meeting room: https://zoom.us/j/5261634867
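As a minimal illustration of the RC paradigm described above, here is an echo-state-network sketch with a randomly generated, non-trainable state map and a ridge-trained linear readout; all sizes and scalings are illustrative choices:

```python
import numpy as np

# Minimal echo state network: fixed random recurrent weights, trained
# linear readout (one-step prediction of a sine input).
rng = np.random.default_rng(4)
n_res, T = 100, 600
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9
w_in = rng.standard_normal(n_res)

u = np.sin(0.1 * np.arange(T + 1))               # input signal
states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W @ x + w_in * u[t])             # non-trainable state map
    states[t] = x

# Ridge-train the readout to predict u[t+1] from the state at time t,
# discarding an initial washout period.
S, y = states[100:], u[101:T + 1]
w_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(n_res), S.T @ y)
```

Only w_out is learned; everything else stays at its random initialization, which is exactly what makes RC systems cheap to train and amenable to theoretical analysis.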

Modeling geophysical systems as dynamical systems and regressing their vector field from data is a simple way to learn emulators for such systems. We show that when the kernel of these emulators is also learned from data (using kernel flows, a variant of cross-validation), the resulting data-driven models are not only faster than equation-based models but also easier to train than neural networks such as the long short-term memory network, while being more accurate and predictive. When trained on observational data for the global sea-surface temperature, the proposed technique shows considerable gains over classical partial-differential-equation-based models in terms of forecast computational cost and accuracy. When trained on publicly available re-analysis data for temperatures over the North American continent, we see significant improvements over climatology- and persistence-based forecast techniques.
Title: Kolmogorov complexity and explaining why neural networks generalise so well (using a function based picture)
Abstract: One of the most surprising properties of deep neural networks (DNNs) is that they perform best in the overparameterized regime. We are all taught in a basic statistics class that having more parameters than data points is a terrible idea. This intuition can be formalised in standard learning theory approaches, based for example on model capacity, which also predict that DNNs should heavily over-fit in this regime and therefore not generalise at all. So why do DNNs work so well in a regime where theory says they should fail? A popular strategy in the literature has been to look for some dynamical property of stochastic gradient descent (SGD) acting on a non-convex loss landscape in order to explain the bias towards functions with good generalisation. Here I will present a different argument, namely that DNNs are implicitly biased towards simple (low Kolmogorov complexity) solutions at initialisation [1]. This Occam's-razor-like effect fundamentally arises from a version of the coding theorem of algorithmic information theory, applied to input-output maps [2]. We also show that for DNNs in the chaotic regime, the bias can be tuned away, and the good generalisation disappears. For highly biased loss landscapes, SGD converges to functions with a probability that can, to first order, be approximated by the probability at initialisation [3]. Thus, even though, to second order, tweaking optimisation hyperparameters can improve performance, SGD itself does not explain why DNNs generalise well in the overparameterized regime. Instead it is the intrinsic bias towards simple (low Kolmogorov complexity) functions that explains why they do not overfit. Finally, this function-based picture allows us to derive rigorous PAC-Bayes bounds that closely track DNN learning curves and can be used to rationalise differences in performance across architectures.
[1] Deep learning generalizes because the parameter-function map is biased towards simple functions, G. Valle-Pérez, C. Q. Camargo, A. A. Louis, arXiv:1805.08522.
[2] Input-output maps are strongly biased towards simple outputs, K. Dingle, C. Q. Camargo, A. A. Louis, Nature Communications 9, 761 (2018).
[3] Is SGD a Bayesian sampler? Well, almost, C. Mingard, G. Valle-Pérez, J. Skalse, A. A. Louis, arXiv:2006.15191.

As in almost every other branch of science, the major advances in data science and machine learning have also resulted in significant improvements regarding the modeling and simulation of nonlinear dynamical systems. It is nowadays possible to make accurate medium- to long-term predictions of highly complex systems such as the weather, the dynamics within a nuclear fusion reactor, disease models, or the stock market in a very efficient manner. In many cases, predictive methods are advertised as ultimately being useful for control, as the control of high-dimensional nonlinear systems is an engineering grand challenge with huge potential in areas such as clean and efficient energy production or the development of advanced medical devices. However, the question of how to use a predictive model for control is often left unanswered due to the associated challenges, namely a significantly higher system complexity, the requirement of much larger data sets, and an increased and often problem-specific modeling effort. To address these issues, we present a universal framework (which we call QuaSiModO: Quantization-Simulation-Modeling-Optimization) to transform arbitrary predictive models into control systems and use them for feedback control. The advantages of our approach are a linear increase in data requirements with respect to the control dimension, performance guarantees that rely exclusively on the accuracy of the predictive model, and little required prior knowledge of control theory to solve complex control problems. The latter point in particular is key to enabling a large number of researchers and practitioners to exploit the ever-increasing capabilities of predictive models for control in a straightforward and systematic fashion.
Regressing the vector field of a dynamical system from a finite number of observed states is a natural way to learn surrogate models for such systems. We present variants of cross-validation (Kernel Flows [31] and its variants based on Maximum Mean Discrepancy and Lyapunov exponents) as simple approaches for learning the kernel used in these emulators.
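The Kernel Flows criterion used here can be sketched directly: compute ρ = 1 − ‖u_half‖²/‖u_full‖², where ‖u‖² = yᵀK⁻¹y is the RKHS norm of the kernel interpolant, and prefer kernels with small ρ. The sine data, length scales, and nugget below are illustrative assumptions:

```python
import numpy as np

# Kernel Flows sketch: a kernel is "good" if interpolating with half the
# data loses little, i.e. rho = 1 - ||u_half||^2 / ||u_full||^2 is small.
def kf_rho(x, y, kern, rng, nugget=1e-8):
    K = kern(x[:, None], x[None, :]) + nugget * np.eye(len(x))
    half = rng.choice(len(x), len(x) // 2, replace=False)
    Kh = K[np.ix_(half, half)]
    num = y[half] @ np.linalg.solve(Kh, y[half])   # ||u_half||^2
    den = y @ np.linalg.solve(K, y)                # ||u_full||^2
    return 1 - num / den

rng = np.random.default_rng(5)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x)
rbf = lambda s: (lambda a, b: np.exp(-(a - b) ** 2 / (2 * s**2)))
rho_adapted = kf_rho(x, y, rbf(0.2), rng)    # well-chosen length scale
rho_poor = kf_rho(x, y, rbf(0.002), rng)     # far too short: K near-identity
```

The full algorithm repeats this with random halves and adjusts the kernel parameters (by gradient descent, or by the MMD/Lyapunov variants mentioned above) to drive ρ down.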
Date and Time: Tuesday Feb. 16th at 1:00pm UK time
Title: Spectral approximation of transfer operators using dynamic mode decomposition
Abstract: A class of algorithms known as extended dynamic mode decomposition (EDMD) has been shown to be empirically effective at identifying intrinsic modes of a dynamical system from time-series data. The algorithm amounts to constructing an N×N matrix by observing a dynamical system through N observables at a sequence of M phase space points. While empirically successful, there are few rigorous results on the convergence of this algorithm; moreover, the relationship between M and N remains obscure. In this talk I will focus on analytic expanding circle maps and show that spectral data of the EDMD matrices can be linked to spectral data (for example eigenvalues, also known as Ruelle resonances in this context) of the Perron-Frobenius operator or its dual, the Koopman operator, associated to the underlying map, provided both operators are considered on suitable function spaces. In particular, I will show that for equidistantly chosen phase space points, the spectra of the EDMD matrices converge to the Ruelle resonances at exponential speed in N, provided that the number of data points M is chosen to be a constant multiple of N, where the constant depends on complex expansion properties of the underlying analytic circle map. This is joint work with Wolfram Just and Julia Slipantschuk.
Zoom meeting room: https://zoom.us/j/5261634867

Thursday, Feb. 11, 2021 at 3:00pm (PST) via Zoom. Abstract: Neural networks for control system applications have been studied for decades. During the last few years, research activities focusing on the interplay of deep learning and control theory have attracted increasing attention. In the first part of the talk, we review some selected topics at the intersection of deep learning and control theory: How does deep learning help to overcome unmet challenges in control theory? How does control theory help to improve the design and training of neural networks? Why is deep learning able to handle high-dimensional models in dynamics and control? In the second part of the talk, we present some theorems and examples from our recent research in which deep neural networks are applied to solve problems of optimal control. Join ZoomGov Meeting https://nps-edu.zoomgov.com/j/1601490177 Meeting ID: 160 149 0177 Passcode: 9r*Ti2Nj%p

Day and time: Tuesday 23 February at 1:00pm UK time
Abstract: We show that general stochastic differential equations (SDEs) driven by Brownian motions or jump processes can be approximated by certain SDEs with random characteristics, in the spirit of reservoir computing. Notions of semi-martingale signatures and their randomized versions are applied here.
Zoom meeting room: https://zoom.us/j/5261634867
Meeting ID: 526 163 4867

Date: 9 Feb. 2021, 1:00pm UK time.
Abstract: Transport and mixing processes in fluid flows are crucially influenced by coherent structures, and the characterisation of these Lagrangian objects is a topic of intense current research. While established mathematical approaches such as variational or transfer operator based schemes require full knowledge of the flow field, or at least high-resolution trajectory data, this information may not be available in applications. In this talk, we review different spatio-temporal clustering approaches and show how these can be used to identify coherent behaviour in flows directly from Lagrangian trajectory data. We demonstrate the applicability of these methods in a number of example systems, including geophysical flows and turbulent convection.
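As a minimal sketch of clustering Lagrangian trajectory data (a stand-in for the more sophisticated spatio-temporal methods reviewed in the talk), the following groups synthetic tracer trajectories from two coherent vortices using plain k-means on flattened trajectories; the synthetic flow and all parameters are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, init, iters=50):
    # plain k-means on the rows of X with explicit initial centroids
    C = X[init].astype(float)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

# synthetic Lagrangian data: 20 tracers in each of two coherent vortices
t = np.linspace(0, 2 * np.pi, 25)

def vortex(cx, cy):
    # each tracer trajectory (x(t), y(t)) is flattened into one feature vector
    phases = rng.uniform(0, 2 * np.pi, 20)
    return np.stack([np.concatenate([cx + 0.3 * np.cos(t + p),
                                     cy + 0.3 * np.sin(t + p)]) for p in phases])

data = np.vstack([vortex(-1.0, 0.0), vortex(1.0, 0.0)])
labels = kmeans(data, 2, init=[0, len(data) - 1])
# tracers are grouped by the vortex they belong to
```

Note that the clustering operates purely on trajectory data; no knowledge of the underlying velocity field is used.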
Zoom meeting room: https://zoom.us/j/5261634867 Meeting ID: 526 163 4867

Day and time: Wednesday, 10th February 2021, 11 am (Paderborn time)
Speaker: Sebastian Peitz (Paderborn University) Title: On the Universal Transformation of Data-Driven Models to Control Systems
Abstract: As in almost every other branch of science, the advances in data science and machine learning have also resulted in improved modeling and simulation of nonlinear dynamical systems. In many cases, predictive methods are advertised to ultimately be useful for control. However, the question of how to use a predictive model for control is left unanswered in most cases due to the associated challenges, namely a significantly higher system complexity, the requirement of much larger data sets, and an increased and often problem-specific modeling effort. To solve these issues, we present a universal framework to transform arbitrary predictive models into control systems and use them for feedback control. The advantages are a linear increase in data requirements with respect to the control dimension, performance guarantees that rely exclusively on the accuracy of the predictive model, and little required prior knowledge of control theory for solving complex control problems. This talk will be given at the Paderborn Applied Math Seminar on Wednesday, 10th February 2021, 11 am (Amsterdam, Berlin, Rome, Stockholm, Vienna).
Meeting-ID: 921 6631 8554
Code: 306435

Abstract: Given a control system on a smooth manifold M, any admissible control function generates a flow, i.e. a one-parametric family of diffeomorphisms of M. We give a sufficient condition on the system that guarantees the existence of an arbitrarily good uniform approximation of any diffeomorphism isotopic to the identity by an admissible diffeomorphism, and provide simple examples of control systems on \mathbb R^n, \mathbb T^n and \mathbb S^2 that satisfy this condition. This is joint work with A. Sarychev (Florence), motivated by deep learning of artificial neural networks treated as a kind of interpolation technique.

Title: Topological Methods in Combinatorial Dynamics
Abstract: The ease of collecting enormous amounts of data in the present world, together with the difficulty of gaining useful knowledge from it, stimulates the development of mathematical tools to deal with the situation. In particular, for data collected from dynamic processes, the methods developed for computer-assisted proofs in dynamics have been adapted to the new challenges. Closely related to the techniques of multivalued maps used in computer-assisted proofs in dynamics are the methods centered around the concept of a combinatorial vector field, a concept introduced twenty years ago by R. Forman. In the talk I will review some recent results concerning topological invariants for combinatorial vector fields and their extensions.

A nonparametric method to predict non-Markovian time series of partially observed dynamics is developed. The prediction problem we consider is a supervised learning task of finding a regression function that takes a delay-embedded observable to the observable at a future time. When delay-embedding theory is applicable, the proposed regression function is a consistent estimator of the flow map induced by the delay-embedding. Furthermore, the corresponding Mori-Zwanzig equation governing the evolution of the observable simplifies to only a Markovian term, represented by the regression function. We realize this supervised learning task with a class of kernel-based linear estimators, the kernel analog forecast (KAF), which is consistent in the limit of large data. In a scenario with a high-dimensional covariate space, we employ a Markovian kernel smoothing method which is computationally cheaper than the Nyström projection method for realizing KAF. In addition to the guaranteed theoretical convergence, we numerically demonstrate the effectiveness of this approach on higher-dimensional problems where the relevant kernel features are difficult to capture with the Nyström method. Given noisy training data, we propose a nonparametric smoother as a de-noising method. Numerically, we show that the proposed smoother is more accurate than the ensemble Kalman filter (EnKF) and 4D-Var in de-noising signals corrupted by independent (but not necessarily identically distributed) noise, even if the smoother is constructed using a data set corrupted by white noise. We show skillful prediction using the KAF constructed from the denoised data.
Live Talks
1. September 21 - Dynamical Systems for Machine Learning, https://youtu.be/HYG6bVDdeN0
2. September 22 - Reservoir Computing & Dynamical Systems for Machine Learning, https://youtu.be/lak3OjvE_44
3. September 23 - Deep Learning, Kernel Methods and Gaussian Processes, https://youtu.be/5UzN57sTFAo
4. September 24 - Data-driven modelling, https://youtu.be/JqFoeWgWN0M
5. September 25 - Learning Theory & Signature-based methods, https://youtu.be/Cn4xm-1uL5Y
6. September 28 - Data-driven modelling, https://youtu.be/kM7ZL6m9y5M
7. September 29 - Geometric and Topological Data Analysis, https://youtu.be/_UqNkcPSCZk

Many dimensionality reduction and model reduction techniques rely on estimating the dominant eigenfunctions of associated dynamical operators from data. Important examples include the Koopman operator and its generator, but also the Schrödinger operator. We propose a kernel-based method for the approximation of differential operators in reproducing kernel Hilbert spaces and show how eigenfunctions can be estimated by solving auxiliary matrix eigenvalue problems. The resulting algorithms are applied to molecular dynamics and quantum chemistry examples. Furthermore, we exploit that, under certain conditions, the Schrödinger operator can be transformed into a Kolmogorov backward operator corresponding to a drift-diffusion process and vice versa. This allows us to apply methods developed for the analysis of high-dimensional stochastic differential equations to quantum mechanical systems.
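The following sketch illustrates the general recipe of estimating dominant eigenfunctions of a dynamical operator by solving an auxiliary matrix eigenvalue problem, here in the simplest diffusion-maps-style form (a kernel-induced Markov matrix on circle samples) rather than the RKHS differential-operator method of the abstract; sample size and bandwidth are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 400)
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # samples on the circle

d2 = ((pts[:, None] - pts[None]) ** 2).sum(-1)
K = np.exp(-d2 / 0.1)                  # kernel matrix, bandwidth 0.1
P = K / K.sum(axis=1, keepdims=True)   # row-normalized Markov matrix
evals, evecs = np.linalg.eig(P)
order = np.argsort(-evals.real)
top = evals.real[order[:3]]
# top[0] = 1 (constant eigenfunction); top[1], top[2] form a near-degenerate
# pair whose eigenvectors approximate cos and sin on the circle
```

The eigenvectors of the matrix problem serve as data-driven estimates of the operator's eigenfunctions, evaluated at the sample points.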
We study the problem of estimating linear response statistics under external perturbations using time series of unperturbed dynamics. Based on fluctuation-dissipation theory, this problem is reformulated as an unsupervised learning task of estimating a density function. We consider a nonparametric density estimator formulated by the kernel embedding of distributions with "Mercer-type" kernels, constructed from the classical orthogonal polynomials defined on non-compact domains. While the resulting representation is analogous to Polynomial Chaos Expansion (PCE), the connection to reproducing kernel Hilbert space (RKHS) theory allows one to establish the uniform convergence of the estimator and to systematically address the practical question of identifying a PCE basis for consistent estimation. We also provide practical conditions for the well-posedness not only of the estimator but also of the underlying response statistics. Finally, we provide a statistical error bound for the density estimation that accounts for the Monte Carlo averaging over non-i.i.d. time series and the bias due to a finite basis truncation. This error bound provides a means to understand both the feasibility and the limitations of the kernel embedding with Mercer-type kernels. Numerically, we verify the effectiveness of the estimator on two stochastic dynamics with known yet non-trivial equilibrium densities.
Speaker: Juan-Pablo Ortega
Title: Reservoir Computing and the Learning of Dynamic Processes
Abstract: Dynamic processes regulate the behavior of virtually any artificial and biological agent, from stock markets to epidemics, from driverless cars to healthcare robots. The problem of modeling, forecasting, and, generally speaking, learning dynamic processes is one of the most classical, sophisticated, and strategically significant problems in the natural and the social sciences. In this talk we shall discuss both classical and recent results on the modeling and learning of dynamical systems and input/output systems using an approach generically known as reservoir computing. This information processing framework is characterized by the use of cheap-to-train, randomly generated state-space systems, for which promising high-performance physical realizations with dedicated hardware have been proposed in recent years. In our presentation we shall put a special emphasis on the approximation properties of these constructions.

We present a novel algorithm that allows us to gain detailed insight into the effects of sparsity in linear and nonlinear optimization, which is of great importance in many scientific areas such as image and signal processing, medical imaging, compressed sensing, and machine learning (e.g., for the training of neural networks). Sparsity is an important feature for ensuring robustness against noisy data, but also for finding models that are interpretable and easy to analyze due to the small number of relevant terms. It is common practice to enforce sparsity by adding the $\ell_1$-norm as a weighted penalty term. In order to gain a better understanding and to allow for an informed model selection, we directly solve the corresponding multiobjective optimization problem (MOP) that arises when we minimize the main objective and the $\ell_1$-norm simultaneously. As this MOP is in general non-convex for nonlinear objectives, the weighting method will fail to provide all optimal compromises. To avoid this issue, we present a continuation method which is specifically tailored to MOPs with two objective functions, one of which is the $\ell_1$-norm. Our method can be seen as a generalization of well-known homotopy methods for linear regression problems to the nonlinear case. Several numerical examples - including neural network training - demonstrate our theoretical findings and the additional insight that can be gained by this multiobjective approach.
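For the convex special case of linear regression (where, unlike the non-convex setting above, the weighting method does suffice), the trade-off curve between data fit and the $\ell_1$-norm can be traced by sweeping the penalty weight; the sketch below does this with a basic coordinate-descent lasso solver, with all problem sizes and weights chosen for illustration:

```python
import numpy as np

def lasso_cd(A, b, lam, iters=300):
    # coordinate descent for  0.5 * ||A w - b||^2 + lam * ||w||_1
    w = np.zeros(A.shape[1])
    col_sq = (A ** 2).sum(axis=0)
    for _ in range(iters):
        for j in range(A.shape[1]):
            r = b - A @ w + A[:, j] * w[j]          # residual excluding feature j
            z = A[:, j] @ r
            w[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]  # soft threshold
    return w

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
w_true = np.zeros(10)
w_true[[1, 4]] = [2.0, -3.0]
b = A @ w_true

# sweep the weight to trace the fit-vs-sparsity trade-off curve
front = []
for lam in [0.0, 1.0, 10.0, 100.0]:
    w = lasso_cd(A, b, lam)
    front.append((0.5 * ((A @ w - b) ** 2).sum(), np.abs(w).sum()))
# residual is nondecreasing and the l1 norm nonincreasing along the sweep
```

The continuation method of the abstract instead follows this front directly, which is what makes the approach viable when non-convexity breaks the simple weighted sweep.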
We study the problem of predicting rare critical transition events for a class of slow-fast nonlinear dynamical systems. The state of the system of interest is described by a slow process, whereas a faster process drives its evolution and induces critical transitions. By taking advantage of recent advances in reservoir computing, we present a data-driven method to predict the future evolution of the state. We show that our method is capable of predicting a critical transition event at least several numerical time steps in advance. We demonstrate the success as well as the limitations of our method using numerical experiments on three examples of systems, ranging from low dimensional to high dimensional. We discuss the mathematical and broader implications of our results.
For dynamical systems with a non-hyperbolic equilibrium, it is possible to significantly simplify the study of stability by means of center manifold theory. This theory allows one to isolate the complicated asymptotic behavior of the system close to the equilibrium point and to obtain meaningful predictions of its behavior by analyzing a reduced-order system on the so-called center manifold. Since the center manifold is usually not known, good approximation methods are important: the center manifold theorem states that the stability properties of the origin of the reduced-order system are the same as those of the origin of the full-order system. In this work, we establish a data-based version of the center manifold theorem that works with an approximation in place of the exact manifold. The error between the approximate and the original reduced dynamics is also quantified. We then use a suitable data-based kernel method to construct an approximation of the manifold close to the equilibrium, which is compatible with our general error theory. The data are collected by repeated numerical simulation of the full system by means of a high-accuracy solver, which generates sets of discrete trajectories that are then used as a training set. The method is tested on different examples which show promising performance and good accuracy.
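A minimal numerical illustration of the idea (regressing a surrogate for the center manifold from simulated trajectories) is sketched below for a classical toy system, using a simple Euler integrator and polynomial regression in place of the kernel method and high-accuracy solver of the work itself:

```python
import numpy as np

# toy system with a non-hyperbolic equilibrium at the origin:
#   x' = x*y,  y' = -y + x^2
# its center manifold is the graph y = h(x) = x^2 - 2*x^4 + O(x^6)
def step(z, dt=1e-3):
    x, y = z
    return np.array([x + dt * x * y, y + dt * (-y + x ** 2)])

# data: integrate several initial conditions long enough for the fast
# variable y to collapse onto the manifold
samples = []
for x0 in np.linspace(-0.15, 0.15, 9):
    z = np.array([x0, 0.3])
    for _ in range(10000):   # t = 10 with dt = 1e-3
        z = step(z)
    samples.append(z)
samples = np.array(samples)

# surrogate for h: polynomial regression of y on x near the equilibrium
coef = np.polyfit(samples[:, 0], samples[:, 1], 4)
# the quadratic coefficient should be close to the analytic value 1
```

The reduced dynamics x' = x*h(x) on the fitted surrogate then determines the stability of the origin, exactly as in the center manifold theorem.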
This article presents a general framework for recovering missing dynamical systems using available data and machine learning techniques. The proposed framework reformulates the prediction problem as a supervised learning problem to approximate a map that takes the memories of the resolved and identifiable unresolved variables to the missing components in the resolved dynamics. We demonstrate the effectiveness of the proposed framework with a strong convergence error bound of the resolved variables up to finite time and numerical tests on prototypical models in various scientific domains. These include the 57-mode barotropic stress models with multiscale interactions that mimic the blocked and unblocked patterns observed in the atmosphere; the nonlinear Schrödinger equation, which has found many applications in physics such as optics and Bose-Einstein condensates; and the Kuramoto-Sivashinsky equation, whose spatiotemporally chaotic pattern formation models trapped-ion modes in plasma and phase dynamics in reaction-diffusion systems. While many machine learning techniques can be used to validate the proposed framework, we found that recurrent neural networks outperform kernel regression methods in terms of recovering the trajectory of the resolved components and the equilibrium one-point and two-point statistics. This performance suggests that a recurrent neural network is an effective tool for recovering missing dynamics that involve the approximation of high-dimensional functions.
This paper investigates the formulation and implementation of Bayesian inverse problems to learn input parameters of partial differential equations (PDEs) defined on manifolds. Specifically, we study the inverse problem of determining the diffusion coefficient of a second-order elliptic PDE on a closed manifold from noisy measurements of the solution. Inspired by manifold learning techniques, we approximate the elliptic differential operator with a kernel-based integral operator that can be discretized via Monte Carlo without reference to the Riemannian metric. The resulting computational method is mesh-free and easy to implement, and can be applied without full knowledge of the underlying manifold, provided that a point cloud of manifold samples is available. We adopt a Bayesian perspective to the inverse problem, and establish an upper bound on the total variation distance between the true posterior and an approximate posterior defined with the kernel forward map. Supporting numerical results show the effectiveness of the proposed methodology.
The details of the next MLDS colloquium are as follows.
Date and time: Tuesday 15 December at 3:00 pm UK time Speaker: Juan-Pablo Ortega, Universität Sankt Gallen Title: Reservoir Computing and the Learning of Dynamic Processes
Zoom meeting room: https://zoom.us/j/5261634867
Meeting ID: 526 163 4867

Speaker: Weinan E
Title: Machine Learning and Dynamical Systems
Inaugural lecture of the Special Interest Group at the Alan Turing Institute on Machine Learning and Dynamical Systems. For more details about the SIG, cf. https://www.turing.ac.uk/research/interest-groups/machine-learning-and-dynamical-systems
Abstract: I will discuss three central issues on the topic of machine learning and dynamical systems: machine learning of dynamical systems, by dynamical systems, and for dynamical systems. For the first topic, we are interested in learning a dynamical system with short-term quantitative accuracy and long-term qualitative consistency. For the second topic, we ask how ideas from dynamical systems and control theory can be used to help develop machine learning models and algorithms. For the third topic, we are interested in some particular aspects (say, some observational data) of a complex dynamical system, and we want to use machine learning to develop simple models for that purpose.

Dear all,
I am pleased to invite you to the online inaugural lecture of the new Special Interest Group at the Alan Turing Institute on Machine Learning and Dynamical Systems, cf. https://www.turing.ac.uk/research/interest-groups/machine-learning-and-dynamical-systems for more details. To get updates about the activities of the SIG, please fill in the following form: https://docs.google.com/forms/d/e/1FAIpQLScsLNmNyciIMBgCXL1AuXD3Du33Lu2tgbghGGUEAyXnnqvhFQ/viewform The inaugural lecture will be given online by Weinan E (Princeton University) on Wednesday 18th November at 10:00am UK time. Details are as follows; please disseminate this information among your colleagues.
Date and time: Wednesday 18th November at 10:00am UK time
Speaker: Weinan E
Title: Machine Learning and Dynamical Systems
Zoom meeting room: https://zoom.us/j/5261634867 Meeting ID: 526 163 4867

Echo state networks (ESNs) have recently been proved to be universal approximants for input/output systems with respect to various $L^p$-type criteria. When $1\leq p< \infty$, only $p$-integrability hypotheses need to be imposed, while in the case $p=\infty$ a uniform boundedness hypothesis on the inputs is required. This note shows that, in the latter case, a universal family of ESNs can be constructed that contains exclusively elements with the echo state and fading memory properties. This conclusion could not be drawn with the results and methods available so far in the literature.
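For readers unfamiliar with the model class, a minimal echo state network (random recurrent reservoir, trained linear readout) is sketched below on a toy memory task; the architecture sizes, spectral radius and task are illustrative choices, not those of the note:

```python
import numpy as np

rng = np.random.default_rng(2)

class ESN:
    # minimal echo state network: random reservoir, trained linear readout
    def __init__(self, n_in, n_res, rho=0.9):
        W = rng.standard_normal((n_res, n_res))
        self.W = rho * W / np.max(np.abs(np.linalg.eigvals(W)))  # set spectral radius
        self.Win = rng.uniform(-0.5, 0.5, (n_res, n_in))
        self.Wout = None

    def states(self, u):
        # drive the reservoir with the input sequence, collect states
        x, out = np.zeros(self.W.shape[0]), []
        for ut in u:
            x = np.tanh(self.W @ x + self.Win @ ut)
            out.append(x.copy())
        return np.array(out)

    def fit(self, u, y, reg=1e-6, washout=50):
        # ridge regression for the readout, discarding the initial transient
        X, Y = self.states(u)[washout:], y[washout:]
        self.Wout = np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ Y)

    def predict(self, u):
        return self.states(u) @ self.Wout

# task: reproduce a delayed copy of the input signal
u = np.sin(np.arange(600) * 0.1)[:, None]
y = np.roll(u, 3, axis=0)          # target: input delayed by 3 steps
esn = ESN(1, 100)
esn.fit(u[:500], y[:500])
pred = esn.predict(u)
err = np.abs(pred[520:590] - y[520:590]).max()
```

Only the readout `Wout` is trained; the recurrent weights stay fixed at their random values, which is what makes the scheme cheap to train.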
In recent years, the machine learning community has seen continuously growing interest in research investigating dynamical aspects of both training procedures and trained models. Of particular interest among recurrent neural networks is the Reservoir Computing (RC) paradigm, for its conceptual simplicity and fast training scheme. Yet, the guiding principles under which RC operates are only partially understood. In this work, we study the properties behind learning dynamical systems with RC and propose a new guiding principle based on Generalized Synchronization (GS) guaranteeing its feasibility. We show that the well-known Echo State Property (ESP) implies and is implied by GS, so that theoretical results derived from the ESP still hold when GS does. However, by using GS one can profitably study the RC learning procedure by linking the reservoir dynamics with the readout training. Notably, this allows us to shed light on the interplay between the input encoding performed by the reservoir and the output produced by the readout optimized for the task at hand. In addition, we show that - as opposed to the ESP - satisfaction of GS can be measured by means of the Mutual False Nearest Neighbors index, which makes the theoretical derivations effective for practitioners.
This paper shows that a large class of fading memory state-space systems driven by discrete-time observations of dynamical systems defined on compact manifolds always yields continuously differentiable synchronizations. This general result provides a powerful tool for the representation, reconstruction, and forecasting of chaotic attractors. It also improves previous statements in the literature for differentiable generalized synchronizations, whose existence was so far guaranteed only for a restricted family of systems and was detected using Hölder exponent-based criteria.
In recent years, the success of the Koopman operator in dynamical systems analysis has also fueled the development of Koopman operator-based control frameworks. In order to preserve the relatively low data requirements for an approximation via dynamic mode decomposition, a quantization approach was recently proposed in [S. Peitz and S. Klus, Automatica J. IFAC, 106 (2019), pp. 184--191]. In this way, control of nonlinear dynamical systems can be realized by means of switched systems techniques, using only a finite set of autonomous Koopman operator-based reduced models. These individual systems can be approximated very efficiently from data. The main idea is to transform a control system into a set of autonomous systems for which the optimal switching sequence has to be computed. In this article, we extend these results to continuous control inputs using relaxation. We thereby combine the data efficiency of approximating a finite set of autonomous systems with the advantages of continuous controls, as the data requirements increase only linearly with the input dimension. We show that when using the Koopman generator, this relaxation---realized by linear interpolation between two operators---does not introduce any error for control-affine systems. This allows us to control high-dimensional nonlinear systems using bilinear, low-dimensional surrogate models. The efficiency of the proposed approach is demonstrated using several examples with increasing complexity, from the Duffing oscillator to the chaotic fluidic pinball.
This short review describes mathematical techniques for statistical analysis and prediction in dynamical systems. Two problems are discussed, namely (i) the supervised learning problem of forecasting the time evolution of an observable under potentially incomplete observations at forecast initialization; and (ii) the unsupervised learning problem of identifying observables of the system with a coherent dynamical evolution. We discuss how ideas from operator-theoretic ergodic theory combined with statistical learning theory provide an effective route to address these problems, leading to methods well-adapted to handle nonlinear dynamics, with convergence guarantees as the amount of training data increases.
A new explanation of geometric nature of the reservoir computing phenomenon is presented. Reservoir computing is understood in the literature as the possibility of approximating input/output systems with randomly chosen recurrent neural systems and a trained linear readout layer. Light is shed on this phenomenon by constructing what is called strongly universal reservoir systems as random projections of a family of state-space systems that generate Volterra series expansions. This procedure yields a state-affine reservoir system with randomly generated coefficients in a dimension that is logarithmically reduced with respect to the original system. This reservoir system is able to approximate any element in the fading memory filters class just by training a different linear readout for each different filter. Explicit expressions for the probability distributions needed in the generation of the projected reservoir system are stated and bounds for the committed approximation error are provided.
The schedule of the live talks at the Second Symposium on Machine Learning and Dynamical Systems starting next week and hosted online by the Fields Institute can be found at
Pre-recorded talks are being regularly uploaded to the YouTube channel of the Fields Institute and can be found at http://www.fields.utoronto.ca/activities/20-21/dynamical
Finally, please don’t forget to register for the event at http://www.fields.utoronto.ca/activities/20-21/dynamical (top right). This is how you’ll get the Zoom links for the live lectures.