Michelle Girvan’s research while affiliated with Santa Fe Institute and other places


Publications (135)


(a) Reservoir computer (RC) schematic. Time series observations u ( t ) are fed into a high-dimensional reservoir with state r ( t ) via an input matrix B, then an output matrix W is trained to predict the next data point in the series (i.e., at time t + τ). Predictions at times t > t train are made by switching to autonomous mode in which outputs of the reservoir are repeatedly fed back in as input (dashed line). (b) Next-generation reservoir computers (NGRCs) replace the reservoir with a nonlinear feature vector O ( t ) that is constructed using time-delayed observations. (c) Our hybrid RC-NGRC prediction approach uses a hybrid feature vector H ( t ) that is the concatenation of a reservoir state with an NGRC feature vector in order to produce a prediction.
(a) Representative examples of RC, NGRC, and hybrid RC-NGRC autonomous predictions of the Lorenz system ( x component shown), where a small reservoir ( N = 50) and large time steps ( τ = 0.06) are used, limiting RC and NGRC performance. Valid prediction time (VPT) is indicated by the vertical dashed line. (b) Distributions of VPTs for RC, NGRC, and RC-NGRC predictions, where each trial is done on new initial conditions using a new reservoir realization. The hybrid RC-NGRC shows substantially stronger short-term predictive power than either the RC or NGRC alone. Horizontal lines: quartiles (100 trials).
(a) Representative examples of long-term phase space trajectories of the RC, NGRC, and hybrid RC-NGRC autonomous predictions [predictions extended from Fig. 2(a)]. Though RC and NGRC reconstruct the Lorenz attractor in some trials, only the hybrid RC-NGRC prediction reliably reconstructs the attractor of the true system across trials. (b) Power spectra of z component of autonomous predictions for the different methods. Only the hybrid RC-NGRC prediction reliably reproduces the spectrum of the Lorenz system.
(a) Mean valid prediction times for the Lorenz system vs number of nodes in the reservoir. Although RC performance is poor at small N and NGRC performance is modest due to using a large time step ( τ = 0.06), the hybrid RC-NGRC performs well throughout, providing a substantial advantage over both RC and NGRC at small N. Note that the hybrid RC-NGRC approach with reservoir size N = 100 approximately matches the performance of a pure RC with N = 500. (b) Mean valid prediction times for the Lorenz system vs time step size τ in the training data. As the time step is adjusted, the number of training data points n train is kept constant. The hybrid RC-NGRC shows the greatest advantage in predictive power over the RC or NGRC alone when using a large time step. Reservoir size N = 50. Error bars and band: standard error of the mean (64 trials).
The hybrid RC-NGRC approach exhibits reduced sensitivity to reservoir hyperparameters compared to RCs. Shown here: mean VPT vs input matrix scaling σ (other examples shown in the supplementary material). Error bars and band: standard error of the mean (64 trials).


Hybridizing traditional and next-generation reservoir computing to accurately and efficiently forecast dynamical systems
  • Article
  • Publisher preview available

June 2024 · 17 Reads · 4 Citations

R. Chepuri · D. Amzalag · T. M. Antonsen · M. Girvan

Reservoir computers (RCs) are powerful machine learning architectures for time series prediction. Recently, next generation reservoir computers (NGRCs) have been introduced, offering distinct advantages over RCs, such as reduced computational expense and lower training data requirements. However, NGRCs have their own practical difficulties, including sensitivity to sampling time and type of nonlinearities in the data. Here, we introduce a hybrid RC-NGRC approach for time series forecasting of dynamical systems. We show that our hybrid approach can produce accurate short-term predictions and capture the long-term statistics of chaotic dynamical systems in situations where the RC and NGRC components alone are insufficient, e.g., due to constraints from limited computational resources, sub-optimal hyperparameters, sparsely sampled training data, etc. Under these conditions, we show for multiple model chaotic systems that the hybrid RC-NGRC method with a small reservoir can achieve prediction performance approaching that of a traditional RC with a much larger reservoir, illustrating that the hybrid approach can offer significant gains in computational efficiency over traditional RCs while simultaneously addressing some of the limitations of NGRCs. Our results suggest that the hybrid RC-NGRC approach may be particularly beneficial in cases when computational efficiency is a high priority and an NGRC alone is not adequate.
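
The abstract and figure captions above describe the hybrid feature vector H(t) as the concatenation of a reservoir state r(t) with an NGRC feature vector O(t), followed by a trained linear readout that predicts the next data point. The snippet below is a minimal sketch of that construction in Python/NumPy; the sizes, the quadratic NGRC features, and the ridge-regression readout are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions for this sketch, not the paper's settings).
dim = 3            # dimension of the observed time series u(t)
n_reservoir = 50   # small reservoir, matching the small-N regime discussed above
n_train = 2000
ridge_alpha = 1e-6

# Stand-in training data of shape (n_train, dim); in practice u(t) would come
# from the dynamical system sampled every tau time units.
U = rng.standard_normal((n_train, dim))

# Standard RC ingredients: random recurrent matrix A and input matrix B.
A = rng.standard_normal((n_reservoir, n_reservoir)) / np.sqrt(n_reservoir)
B = rng.uniform(-0.5, 0.5, (n_reservoir, dim))

def reservoir_states(U):
    """Drive the reservoir with the observations and collect the states r(t)."""
    R = np.zeros((len(U), n_reservoir))
    r = np.zeros(n_reservoir)
    for t, u in enumerate(U):
        r = np.tanh(A @ r + B @ u)
        R[t] = r
    return R

def ngrc_features(U, delays=2):
    """NGRC-style feature vector O(t): a constant, the current and delayed
    inputs, and their quadratic products (wrap-around at t = 0 is ignored
    in this toy sketch)."""
    lin = np.hstack([np.roll(U, d, axis=0) for d in range(delays)])
    quad = np.einsum('ti,tj->tij', lin, lin).reshape(len(U), -1)
    return np.hstack([np.ones((len(U), 1)), lin, quad])

R = reservoir_states(U)
O = ngrc_features(U)
H = np.hstack([R, O])        # hybrid feature vector H(t) = [r(t); O(t)]

# Train the linear readout W by ridge regression to map H(t) -> u(t + tau).
X, Y = H[:-1], U[1:]
W = np.linalg.solve(X.T @ X + ridge_alpha * np.eye(X.shape[1]), X.T @ Y)

# One-step prediction from the last hybrid feature vector.
u_next = H[-1] @ W
```

In autonomous mode, as indicated by the dashed feedback line in the schematic, the prediction would be fed back in as the next input instead of a new observation.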



t-ConvESN: Temporal Convolution-Readout for Random Recurrent Neural Networks

September 2023 · 53 Reads

Lecture Notes in Computer Science

While deep neural networks have excelled at static data such as images, temporal data - the data that these networks will need to process in the real world - remains an open challenge. Handling temporal data with neural networks typically requires one of three options: backpropagation through time using recurrent neural networks (RNNs), treating the time series as static data for convolutional neural networks (CNNs), or attention-based transformer architectures. RNNs are an elegant autoregressive network type that naturally keeps a memory of the past while performing computations. Although recurrent networks such as LSTMs have shown strong success across a multitude of fields and tasks such as natural language processing, they can be difficult to train. Transformers and 1-D CNNs, two feed-forward alternatives for temporal data, have gained popularity but in their base forms lack memory to keep track of past activations. Random recurrent networks, also known as Reservoir Computing, have shown that one need not necessarily backpropagate the error through the recurrent component. Here, we propose a novel hybrid approach that brings together the temporal memory capabilities of a random recurrent network with the powerful learning capacity of a deep temporal convolutional readout, which we call t-ConvESN. We experimentally verify that although the recurrent component remains random and unlearned, its combination with a deep readout achieves superior accuracy on a number of datasets from the UCR time series classification collection compared to other state-of-the-art deep learning architectures. Our experiments also show that our proposed method excels on datasets in the low-data regime.
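
As a rough illustration of pairing a fixed random reservoir with a trainable temporal-convolution readout, the sketch below (NumPy for the reservoir, PyTorch for the readout) is a simplified stand-in; the layer sizes, kernel width, and pooling choice are assumptions, not the t-ConvESN architecture from the paper.

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(1)

# Assumed toy shapes: a batch of univariate sequences to classify.
batch, seq_len, n_classes, n_res = 8, 128, 3, 100
x = rng.standard_normal((batch, seq_len, 1)).astype(np.float32)

# Fixed (unlearned) ESN: random recurrent and input weights.
A = rng.standard_normal((n_res, n_res)).astype(np.float32) / np.sqrt(n_res)
B = rng.uniform(-0.5, 0.5, (n_res, 1)).astype(np.float32)

def esn_states(x):
    """Run the random reservoir over each sequence; return (batch, n_res, seq_len)."""
    states = np.zeros((x.shape[0], n_res, x.shape[1]), dtype=np.float32)
    for b in range(x.shape[0]):
        r = np.zeros(n_res, dtype=np.float32)
        for t in range(x.shape[1]):
            r = np.tanh(A @ r + B @ x[b, t])
            states[b, :, t] = r
    return torch.from_numpy(states)

# Trainable readout: a temporal convolution over the reservoir state sequence,
# followed by pooling and a linear classifier. Only these weights are learned.
readout = nn.Sequential(
    nn.Conv1d(n_res, 32, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(32, n_classes),
)

logits = readout(esn_states(x))   # gradients flow only through the readout
print(logits.shape)               # (batch, n_classes)
```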


Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization

November 2022 · 177 Reads · 2 Citations

Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of unknown chaotic dynamical systems. Such ML models can be used to produce both short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics ("climate"). Both of these tasks can be accomplished by employing a feedback loop, whereby the model is trained to predict forward one time step, then the trained model is iterated for multiple time steps with its output used as the input. In the absence of mitigating techniques, however, this approach can result in artificially rapid error growth, leading to inaccurate predictions and/or climate instability. In this article, we systematically examine the technique of adding noise to the ML model input during training as a means to promote stability and improve prediction accuracy. Furthermore, we introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training. Our case study uses reservoir computing, a machine learning method based on recurrent neural networks, to predict the spatiotemporal chaotic Kuramoto-Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to the true system, while reservoir computers trained without regularization are unstable. Compared with other types of regularization that yield stability in some cases, we find that both short-term and climate predictions from reservoir computers trained with noise or with LMNT are substantially more accurate. Finally, we show that the deterministic aspect of our LMNT regularization facilitates fast hyperparameter tuning when compared to training with noise.
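
The abstract describes adding small noise to the model input during training to stabilize closed-loop prediction, with LMNT as a deterministic approximation of many such noise realizations. Below is a minimal sketch of the plain noise-training idea for a reservoir computer; the noise amplitude, reservoir size, and ridge parameter are placeholder assumptions, and LMNT itself (which modifies the regression deterministically) is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder settings (assumptions, not the paper's values).
dim, n_res, n_train = 3, 200, 4000
noise_std = 1e-3          # amplitude of the stabilizing input noise
ridge_alpha = 1e-8

U = rng.standard_normal((n_train, dim))       # stand-in for the training series
A = rng.standard_normal((n_res, n_res)) / np.sqrt(n_res)
B = rng.uniform(-0.5, 0.5, (n_res, dim))

def drive(U):
    """Collect reservoir states while the reservoir is driven by the inputs."""
    R, r = np.zeros((len(U), n_res)), np.zeros(n_res)
    for t, u in enumerate(U):
        r = np.tanh(A @ r + B @ u)
        R[t] = r
    return R

# Noise regularization: perturb the inputs seen by the reservoir during
# training so the readout becomes robust to the small errors it will feed
# back on itself during autonomous (closed-loop) prediction.
R = drive(U + noise_std * rng.standard_normal(U.shape))

# The readout is still fit to the clean next-step targets.
X, Y = R[:-1], U[1:]
W = np.linalg.solve(X.T @ X + ridge_alpha * np.eye(n_res), X.T @ Y)
```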


Parallel Machine Learning for Forecasting the Dynamics of Complex Networks

April 2022 · 29 Reads · 30 Citations

Physical Review Letters

Forecasting the dynamics of large, complex, sparse networks from previous time series data is important in a wide range of contexts. Here we present a machine learning scheme for this task using a parallel architecture that mimics the topology of the network of interest. We demonstrate the utility and scalability of our method implemented using reservoir computing on a chaotic network of oscillators. Two levels of prior knowledge are considered: (i) the network links are known, and (ii) the network links are unknown and inferred via a data-driven approach to approximately optimize prediction.
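
A minimal sketch of the parallel idea described here: each node of the network gets its own small reservoir, driven only by that node and its neighbors, and a local readout predicts only that node's next state. The ring network, sizes, and scalar node states below are illustrative assumptions, not the setup used in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed toy network: a ring of n_nodes units, each with a scalar state.
n_nodes, n_steps, n_res = 20, 2000, 30
neighbors = {i: [(i - 1) % n_nodes, (i + 1) % n_nodes] for i in range(n_nodes)}

X = rng.standard_normal((n_steps, n_nodes))   # stand-in node time series

readouts = {}
for i in range(n_nodes):
    # One small reservoir per node, driven by the node and its neighbors only;
    # this local structure is what makes the scheme parallel and scalable.
    inputs = [i] + neighbors[i]
    A = rng.standard_normal((n_res, n_res)) / np.sqrt(n_res)
    B = rng.uniform(-0.5, 0.5, (n_res, len(inputs)))
    R, r = np.zeros((n_steps, n_res)), np.zeros(n_res)
    for t in range(n_steps):
        r = np.tanh(A @ r + B @ X[t, inputs])
        R[t] = r
    # Local linear readout that predicts only node i's next value.
    readouts[i] = np.linalg.lstsq(R[:-1], X[1:, i], rcond=None)[0]
```

When the links are unknown, the second scenario in the abstract infers the relevant input sets from data; in this sketch the neighbor lists are simply given.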


Deep-Readout Random Recurrent Neural Networks for Real-World Temporal Data

April 2022 · 186 Reads · 2 Citations

SN Computer Science

Echo State Networks (ESNs) are a class of recurrent neural networks that can learn to regress on or classify sequential data while keeping the recurrent component random and training only a set of readout weights, which is of interest to the current edge computing and neuromorphic communities. However, they have struggled to perform well on regression and classification tasks and therefore could not compete in performance with traditional RNNs, such as LSTM and GRU networks. To address this limitation, we have developed a novel hybrid network, called the Parallelized Deep Readout Echo State Network (PDR-ESN), that combines a deep-learning readout with a fast random recurrent component consisting of multiple ESNs computing in parallel. We show that the PDR-ESN architecture allows for different configurations of the sub-reservoirs, leading to different variants, which we explore. Our findings suggest that different variants of the PDR-ESN offer advantages in different task domains, with some performing better in regression and others in classification. In all cases, our PDR-ESN architecture outperforms the corresponding gradient-based LSTM and GRU architectures in terms of training time as well as accuracy. To further evaluate the approach, we also compared against a Transformer encoder classifier, which the PDR-ESN outperformed on all tasks. We conclude that our proposed network demonstrates a good trade-off between the fast training times of traditional ESNs and the accuracy of deep backpropagation for real-world tasks. We hope that this architecture offers an alternative approach to sequential processing for edge computing as well as more biologically realistic network development.
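
The PDR-ESN described above combines several independent sub-reservoirs running in parallel with a trained deep readout. The sketch below is an assumed, drastically reduced illustration (final reservoir states only, a two-layer MLP readout), not the published architecture.

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(4)

# Assumed toy setup: classify univariate sequences with several small,
# independent (parallel) sub-reservoirs and one trainable deep readout.
batch, seq_len, n_classes = 16, 64, 4
n_sub, sub_size = 4, 50                  # number and size of sub-reservoirs
x = rng.standard_normal((batch, seq_len, 1)).astype(np.float32)

def run_reservoir(x, size, seed):
    r_rng = np.random.default_rng(seed)
    A = r_rng.standard_normal((size, size)).astype(np.float32) / np.sqrt(size)
    B = r_rng.uniform(-0.5, 0.5, (size, 1)).astype(np.float32)
    out = np.zeros((x.shape[0], size), dtype=np.float32)
    for b in range(x.shape[0]):
        r = np.zeros(size, dtype=np.float32)
        for t in range(x.shape[1]):
            r = np.tanh(A @ r + B @ x[b, t])
        out[b] = r                        # keep only the final state per sequence
    return out

# Concatenate the final states of all parallel sub-reservoirs.
feats = torch.from_numpy(np.hstack([run_reservoir(x, sub_size, s) for s in range(n_sub)]))

# Deep readout: the only trained component.
readout = nn.Sequential(
    nn.Linear(n_sub * sub_size, 64), nn.ReLU(), nn.Linear(64, n_classes)
)
print(readout(feats).shape)               # (batch, n_classes)
```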


MYC amplifies gene expression through global changes in transcription factor dynamics

January 2022 · 186 Reads · 43 Citations

Cell Reports

The MYC oncogene has been studied for decades, yet there is still intense debate over how this transcription factor controls gene expression. Here, we seek to answer these questions with an in vivo readout of discrete events of gene expression in single cells. We engineered an optogenetic variant of MYC (Pi-MYC) and combined this tool with single-molecule RNA and protein imaging techniques to investigate the role of MYC in modulating transcriptional bursting and transcription factor binding dynamics in human cells. We find that the immediate consequence of MYC overexpression is an increase in the duration rather than in the frequency of bursts, a functional role that is different from the majority of human transcription factors. We further propose that the mechanism by which MYC exerts global effects on the active period of genes is by altering the binding dynamics of transcription factors involved in RNA polymerase II complex assembly and productive elongation.


Figure 1: A schematic illustration of the MARC algorithm. At left, a library of related time series (possibly from a combination of measurements and approximate simulated processes) is used to train an associated library of RC models. These RC features are used to train an autoencoder with a low-dimensional hidden layer. At right, the decoder portion of the autoencoder is used to map latent variables onto RC features, which then generates a time series. The latent variables are optimized such that this generated time series best agrees with a signal consisting of a limited amount of time series data to be predicted.
Figure 2: Examination of the MARC algorithm applied to a toy regression problem. Upper left: 10,000 randomly sampled phase/amplitude points, plotted in the true 2-d parameter space, colored by location. Upper right: The first two principal components of latent variable space, where the color of each point corresponds to its original phase and amplitude. As we can see, the RC features map onto a smooth, compact manifold within latent variable space. Bottom: Two examples of the MARC algorithm applied to the sinusoidal regression problem. The blue curve represents the ground truth, while the dashed red line is the optimal reservoir time series ŝ. The 10 points, randomly sampled from the ground truth to form s, are indicated by black triangles.
Figure 3: An example of the MARC algorithm applied to the Lorenz system. Left: An example Lorenz attractor (blue) generated by the result of the MARC approach using the 10 observed data points (red). As can be seen, the decoded RC yields a textbook-like Lorenz attractor, just as the decoded RCs in Sec. 4.1 all yield sinusoids. Right: The three components of the Lorenz system plotted vs. time, illustrating the valid time calculation for both the true system (blue) and the MARC-based prediction (dashed red). The 10 points, sequentially sampled from the ground truth to form s, are shown in green. The prediction remains valid for approximately 1.5 Lyapunov times (until the dashed vertical line), after which the systems become decorrelated due to the presence of chaos.
Figure 5: Illustrative examples of the MARC algorithm applied to the multimodal example. The first, second, third, and fourth rows correspond to the sine, line, quadratic, and cubic test signals, respectively. The first (second) column depicts the test with the median (worst) MSE out of 800 total experiments. Black triangles indicate the observed points in s. Blue lines indicate the ground truth, while dashed red lines indicate the MARC prediction. The third column contains the distribution of errors in logarithmic space.
Figure 6: Optimizer loss versus final MSE for the multi-modal experiments. Black, blue, green, and red dots correspond to sines, lines, quadratic functions, and cubic functions, respectively.
A Meta-learning Approach to Reservoir Computing: Time Series Prediction with Limited Data

October 2021 · 157 Reads

Recent research has established the effectiveness of machine learning for data-driven prediction of the future evolution of unknown dynamical systems, including chaotic systems. However, these approaches require large amounts of measured time series data from the process to be predicted. When only limited data is available, forecasters are forced to impose significant model structure that may or may not accurately represent the process of interest. In this work, we present a Meta-learning Approach to Reservoir Computing (MARC), a data-driven approach to automatically extract an appropriate model structure from experimentally observed "related" processes that can be used to vastly reduce the amount of data required to successfully train a predictive model. We demonstrate our approach on a simple benchmark problem, where it beats state-of-the-art meta-learning techniques, as well as on a challenging chaotic problem.
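
The MARC pipeline sketched in the figure captions above (a library of trained RCs, an autoencoder over their features, and latent-variable optimization against a small amount of data) is fairly involved; the snippet below is a heavily simplified, assumed illustration that uses per-task linear readout weights as the "RC features" and a one-step prediction loss. It is not the authors' implementation.

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(5)

# Shared random reservoir, fixed across all tasks.
n_res, seq_len = 80, 200
A = rng.standard_normal((n_res, n_res)) / np.sqrt(n_res)
b_in = rng.uniform(-0.5, 0.5, n_res)

def states(u):
    """Reservoir states for a scalar time series u."""
    R, r = np.zeros((len(u), n_res)), np.zeros(n_res)
    for t, x in enumerate(u):
        r = np.tanh(A @ r + b_in * x)
        R[t] = r
    return R

# 1. Library of related tasks: sinusoids with random phase and amplitude.
t_grid = np.linspace(0, 4 * np.pi, seq_len)
W_lib = []
for _ in range(200):
    u = rng.uniform(0.5, 2.0) * np.sin(t_grid + rng.uniform(0, 2 * np.pi))
    R = states(u)
    w, *_ = np.linalg.lstsq(R[:-1], u[1:], rcond=None)   # per-task readout
    W_lib.append(w)
W_lib = torch.tensor(np.array(W_lib), dtype=torch.float32)

# 2. Autoencoder over the library of readout weights (2-d latent space,
#    loosely mirroring the phase/amplitude structure of Fig. 2).
enc = nn.Sequential(nn.Linear(n_res, 32), nn.Tanh(), nn.Linear(32, 2))
dec = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, n_res))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = ((dec(enc(W_lib)) - W_lib) ** 2).mean()
    loss.backward()
    opt.step()

# 3. New task with only 10 observations: optimize the latent code alone.
u_new = 1.3 * np.sin(t_grid + 0.7)
idx = torch.from_numpy(rng.choice(seq_len - 1, size=10, replace=False))
R_new = torch.tensor(states(u_new), dtype=torch.float32)
target = torch.tensor(u_new, dtype=torch.float32)

for p in dec.parameters():          # decoder stays frozen from here on
    p.requires_grad_(False)
z = torch.zeros(2, requires_grad=True)
opt_z = torch.optim.Adam([z], lr=1e-2)
for _ in range(500):
    opt_z.zero_grad()
    w = dec(z)                                   # decoded readout weights
    pred = R_new[idx] @ w                        # one-step predictions
    loss = ((pred - target[idx + 1]) ** 2).mean()
    loss.backward()
    opt_z.step()
```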


An integrated model for interdisciplinary graduate education: Computation and mathematics for biological networks

September 2021 · 97 Reads · 8 Citations

The current challenges at the forefront of data-enabled science and engineering require interdisciplinary solutions. Yet most traditional doctoral programs are not structured to support successful interdisciplinary research. Here we describe the design of and students’ experiences in the COMBINE (Computation and Mathematics for Biological Networks) interdisciplinary graduate program at the University of Maryland. COMBINE focuses on the development and application of network science methods to biological systems for students from three primary domains: life sciences, computational/engineering sciences, and mathematical/physical sciences. The program integrates three established models (T-shaped, pi-shaped and shield-shaped) for interdisciplinary training. The program components largely fall into three categories: (1) core coursework that provides content expertise, communication, and technical skills, (2) discipline-bridging elective courses in the two COMBINE domains that complement the student’s home domain, (3) broadening activities such as workshops, symposiums, and formal peer-mentoring groups. Beyond these components, the program builds community through both formal and informal networking and social events. In addition to the interactions with other program participants, students engage with faculty in several ways beyond the conventional adviser framework, such as the requirement to select a second out-of-field advisor, listening to guest speakers, and networking with faculty through workshops. We collected data through post-program surveys, interviews and focus groups with students, alumni and faculty advisors. Overall, COMBINE students and alumni reported feeling that the program components supported their growth in the three program objectives of Network Science & Interdisciplinarity, Communication, and Career Preparation, but also recommended ways to improve the program. The value of the program can be seen not only through the student reports, but also through the students’ research products in network science which include multiple publications and presentations. We believe that COMBINE offers an effective model for integrated interdisciplinary training that can be readily applied in other fields.


Figure 1. Largest Lyapunov Exponent as a function of the coupling constant, K. The dashed line represents the chosen value of K for our studies.
Figure 4. Performance of the different Reservoir Computing methods as a function of the Kuramoto oscillator network size.
Parallel Machine Learning for Forecasting the Dynamics of Complex Networks

August 2021 · 131 Reads

Forecasting the dynamics of large complex networks from previous time-series data is important in a wide range of contexts. Here we present a machine learning scheme for this task using a parallel architecture that mimics the topology of the network of interest. We demonstrate the utility and scalability of our method implemented using reservoir computing on a chaotic network of oscillators. Two levels of prior knowledge are considered: (i) the network links are known; and (ii) the network links are unknown and inferred via a data-driven approach to approximately optimize prediction.


Citations (61)


... Chaos theory, as a relatively nascent research field, is an in-depth study of a class of systems characterized by highly intricate and unpredictable properties [1,2]. Chaotic systems exhibit a range of distinctive characteristics, including sensitivity to initial conditions, aperiodicity, and statistical regularities [3,4], which makes them valuable in different practical applications such as time series forecasting [5], video encryption [6], secure communication [7], geolocation-based hardware encryptor [8], compressive ghost imaging [9]. Within chaotic systems, nonlinearity plays a crucial role in generating complex orbital structures and chaotic behaviors, such as scroll attractors [10], extreme multistability [11], and various patterns of coexisting attractors [12]. ...

Reference:

Initials-dependent dynamics and synchronization in a memristor coupled memristive map
Hybridizing traditional and next-generation reservoir computing to accurately and efficiently forecast dynamical systems

... Alternatively, the RC is carefully designed to reconstruct the trajectories of the Lorenz and Rössler systems with high-intensity Lévy noise [36]. It has also been proposed that the deliberate introduction of noise to the precise data may stabilize the training and prediction of the RC [37]. For the partially observable system, various techniques are employed to fill in the missing values. ...

Stabilizing machine learning prediction of dynamics: Novel noise-inspired regularization tested with reservoir computing
  • Citing Article
  • November 2023

Neural Networks

... In general, the underlying numerical method provides a strong inductive bias; as such, these methods require less training data than purely data-driven ones, and they have empirically proven to capture the correct dynamics of the systems [67,41]. Unfortunately, they struggle to remain reliably stable over very long time horizons [68,143,136,141]. They are also challenging to implement, as they require the integration of the ML components, usually in the form of closures, into the code base of existing climate models, which are generally non-differentiable and written in different languages [87]. ...

Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization

... As the idea of the maelstrom was born out of reservoir computing (Evanusa et al., 2022), it is natural that it would be connected. In Evanusa et al. (2022), what can be thought of as a "partial maelstrom network" had been introduced; this included ideas of the maelstrom - the recurrent component - and a readout (or output network) for a task. ...

Deep-Readout Random Recurrent Neural Networks for Real-World Temporal Data

SN Computer Science

... MYC is an oncogenic transcription factor, overexpressed in many malignancies and linked to aggressive tumor progression and poor survival outcomes (62). Recent studies, however, propose that MYC functions as a global gene expression amplifier (63). Research has shown that MYC promotes efferocytosis. ...

MYC amplifies gene expression through global changes in transcription factor dynamics

Cell Reports

... At the graduate level, the National Science Foundation (NSF) has provided support to develop pedagogical approaches for interdisciplinary training via the Integrative Graduate Education and Research Traineeship (IGERT) and NSF Research Traineeship (NRT) programs, which seek to effectively train graduate students in convergent research areas to better align with the needs of a dynamically changing workforce (22). Innovative models for interdisciplinary graduate training have been developed through these programs, with many sharing an emphasis on exploration, communication, and collaboration across disciplines (23)(24)(25)(26)(27). ...

An integrated model for interdisciplinary graduate education: Computation and mathematics for biological networks

... When the governing equations of a dynamical system are unknown or too costly to simulate, machine learning can often assist DA in two ways. First, machine learning can be used to build data-driven forecast models for DA [56,57,58,59,60,61,62,63]. Second, machine learning can streamline the entire DA process by developing end-to-end learning frameworks [64,65,66,67,68,69]. ...

Using data assimilation to train a hybrid forecast system that combines machine-learning and knowledge-based components
  • Citing Article
  • May 2021

... Moreover, the inferred GRNs, and particularly those generated by the Rank-full and the Z-score-full integration schemes, exhibit a disassortative topology, i.e., a negative degree-degree assortativity Pearson correlation. This is a characteristic that has been observed in several technological and biological networks [37,38] and could be an important characteristic related to the network robustness to perturbations [37]. Considering both the network topology and the evaluation scores, Z-score-full or Rank-full seem good integration schemes to be applied in further inference projects. ...

Phase transitions and assortativity in models of gene regulatory networks evolved under different selection processes

... It is worth emphasizing that the phenomenon of tipping in its original context [1][2][3][4][5][6][7][8] is characteristically distinct from the more commonly studied critical transitions from an oscillatory state to some final state. Examples of such transitions include a crisis through which a chaotic attractor is destroyed and replaced by a chaotic transient [37], the onset of synchronization from a desynchronized state [38], amplitude death [39], and encountering a periodic window [40]. While machine learning, in particular reservoir computing, has been applied to predicting such critical transitions [41][42][43][44], a shared characteristic among the existing works is the system's oscillatory behavior before the transition. ...

Using machine learning to predict statistical properties of non-stationary dynamical processes: System climate, regime transitions, and the effect of stochasticity
  • Citing Article
  • March 2021