
A Bayesian Machine Learning Algorithm for Predicting ENSO Using Short Observational Time Series


Abstract and Figures

Panels (a–c): Comparison of different setups of the machine learning (ML) algorithms. Panel (a): forecasting using the standard ML algorithm. Here, no validation criterion (that is, step 2 of the Bayesian machine learning (BML) algorithm) is used to guide the training in Equation 3. The red and yellow curves indicate the forecasts in which only the prior data and only the short observations are used for training, respectively. The green curve shows the forecast skill when the training data is the concatenation of the long prior data and the short observations. Panel (b): forecasting using the BML algorithm. The red curve is the same as that in Figure 1; it exploits the prior data for the parameter updating (Equation 3) in the training process, while the short observations are adopted for the data-driven validation (Equation 4) as the stopping criterion. The yellow curve uses the short observations for both the parameter updating and the data-driven validation. Panel (c): forecasting using the BML algorithm but with different validation sets. The red curve is the same as that in Panel (b). The blue curve uses the prior data for validation (Equation 4), while the green curve uses the mixed prior data and observations for validation. The shaded area shows the uncertainty (95% confidence interval) based on 10 repeated experiments, which contain different realizations of the prior data. Panels (d and e): Comparison of the standard and the BML algorithms in the situation with randomly perturbed parameters in the prior parametric model. The red and yellow curves indicate the forecasts in which only the prior data and only the short observations are used for training, respectively. The blue curve shows the forecast where the training data is the concatenation of the prior and posterior time series. The shaded area shows the uncertainty based on 10 repeated experiments, which contain different realizations of the prior data. In all the panels, the input variables are (T_E, H_W, τ).
1. Introduction
As the most prominent interannual climate variability, the El Niño–Southern Oscillation (ENSO) manifests as a basin-scale air–sea interaction phenomenon characterized by sea surface temperature (SST) anomalies in the equatorial central to eastern Pacific (Clarke, 2008; Rasmusson & Carpenter, 1982; Zebiak & Cane, 1987). It has a strong impact on climate, ecosystems, and economies around the world through the global circulation (Ashok & Yamagata, 2009; Ropelewski & Halpert, 1987). Classically, ENSO is regarded as a cyclic phenomenon (Jin, 1997; Wyrtki, 1975), in which the positive and negative phases are known as El Niño and La Niña, respectively.
The traditional ensemble forecast using physics-based models has been widely used for predicting ENSO (Moore & Kleeman, 1998; Tang et al., 2018; B. P. Kirtman & Min, 2009). A hierarchy of models, ranging from general circulation models (GCMs) to many intermediate and low-order models, is employed for forecasting the refined and large-scale ENSO features, respectively. However, model error, which leads to large predictive uncertainty, is ubiquitous in these parametric models and often results in ineffective forecasts. The model error often comes from an incomplete understanding of nature and/or inadequate spatiotemporal resolutions in these models (Kalnay, 2003; Majda & Chen, 2018; Palmer, 2001). More recently, machine learning (ML) techniques have become prevalent in forecasting ENSO and many other climate
Abstract A simple and efficient Bayesian machine learning (BML) training algorithm, which exploits only a 20-year short observational time series and an approximate prior model, is developed to predict the Niño 3 sea surface temperature (SST) index. The BML forecast significantly outperforms model-based ensemble predictions and standard machine learning forecasts. Even with a simple feedforward neural network (NN), the BML forecast is skillful for 9.5 months. Remarkably, the BML forecast overcomes the spring predictability barrier to a large extent: the forecast starting from spring remains skillful for nearly 10 months. The BML algorithm can also effectively utilize multiscale features: the BML forecast of SST using SST, thermocline, and wind bursts improves on the BML forecast using just SST by at least 2 months. Finally, the BML algorithm also reduces the forecast uncertainty of NNs and is robust to input perturbations.
Plain Language Summary One major challenge in applying machine learning algorithms for predicting the El Niño–Southern Oscillation is the shortage of observational training data. In this article, a simple and efficient Bayesian machine learning (BML) training algorithm is developed, which exploits only a 20-year observational time series for training a neural network. In this new BML algorithm, a long simulation from an approximate parametric model is used as the prior information, while the short observational data plays the role of the likelihood, which corrects the intrinsic model error in the prior data during the training process. The BML algorithm is applied to predict the Niño 3 sea surface temperature (SST) index. The forecast from the BML algorithm outperforms standard machine learning forecasts and model-based ensemble predictions. The BML algorithm also allows a multiscale input consisting of both the SST and the wind bursts, which greatly facilitates the forecast of the Niño 3 index. Remarkably, the BML forecast overcomes the spring predictability barrier to a large extent. Moreover, the BML algorithm reduces the forecast uncertainty and is robust to input perturbations.
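The division of labor described above can be sketched in a few lines. The code below is a minimal, hypothetical stand-in: it replaces the paper's feedforward network with a single linear layer, and the Niño 3 record and parametric model output with synthetic time series. The point is only the structure of the training loop, in which the long prior simulation drives the parameter updates while the short observational record serves purely as the validation set that decides when to stop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (hypothetical): a long "prior" series from an
# imperfect parametric model, and a short "observed" series from nature
# with a slightly different amplitude (the model error).
t_prior = np.arange(2000)
prior = np.sin(2 * np.pi * t_prior / 48) + 0.1 * rng.standard_normal(2000)
t_obs = np.arange(240)  # roughly 20 years of monthly data
obs = 0.9 * np.sin(2 * np.pi * t_obs / 48) + 0.1 * rng.standard_normal(240)

def make_pairs(x, lead=6, lags=12):
    """Input = the last `lags` values; target = the value `lead` steps ahead."""
    X = np.stack([x[i:i + lags] for i in range(len(x) - lags - lead)])
    y = x[lags + lead:]
    return X, y

Xp, yp = make_pairs(prior)   # abundant training pairs from the prior run
Xo, yo = make_pairs(obs)     # scarce validation pairs from observations

# A single linear layer trained on the prior data; the short observations
# act only as a validation set for early stopping (the BML "step 2").
w = np.zeros(Xp.shape[1])
best_w, best_val = w.copy(), np.inf
patience, bad = 20, 0
for epoch in range(500):
    grad = Xp.T @ (Xp @ w - yp) / len(yp)   # MSE gradient on the prior data
    w -= 0.01 * grad
    val = np.mean((Xo @ w - yo) ** 2)       # validation loss on observations
    if val < best_val - 1e-6:
        best_val, best_w, bad = val, w.copy(), 0
    else:
        bad += 1
        if bad >= patience:                 # stop before overfitting the prior
            break
w = best_w
print(f"validation MSE on observations: {best_val:.3f}")
```

Stopping on the observational validation loss is what limits the inheritance of the prior model's bias: training is halted once further fitting of the prior data stops improving the fit to nature.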
CHEN ET AL.
© 2021. American Geophysical Union.
All Rights Reserved.
A Bayesian Machine Learning Algorithm for Predicting
ENSO Using Short Observational Time Series
Nan Chen1, Faheem Gilani2, and John Harlim3
1Department of Mathematics, University of Wisconsin-Madison, Madison, WI, USA, 2Department of Mathematics, The
Pennsylvania State University, University Park, PA, USA, 3Department of Mathematics, Department of Meteorology
and Atmospheric Science, Institute for Computational and Data Sciences, The Pennsylvania State University, University
Park, PA, USA
Key Points:
A new Bayesian machine learning
(BML) framework is developed
to accommodate the shortage of
observations when training neural
networks
The new BML forecast significantly
outperforms model-based ensemble
predictions and standard machine
learning forecasts
The new BML algorithm reduces
forecast uncertainty and overcomes
the spring predictability barrier to a
large extent
Supporting Information:
Supporting Information may be found
in the online version of this article.
Correspondence to:
N. Chen,
chennan@math.wisc.edu
Citation:
Chen, N., Gilani, F., & Harlim, J. (2021).
A Bayesian machine learning algorithm
for predicting ENSO using short
observational time series. Geophysical
Research Letters, 48, e2021GL093704.
https://doi.org/10.1029/2021GL093704
Received 6 APR 2021
Accepted 5 AUG 2021
RESEARCH LETTER
... To estimate the skilful prediction range of this model, we compute the maximum lag k for which the model meets a suitable forecast skill score evaluation. In this case, the maximum acceptable lag k is determined as the lag at which the generated model outputs have a 50 percent chance of correctly predicting a MHW day, indicating that lags longer than k do not provide better prediction skill than a true-or-false guess (Chen et al., 2021; Silini et al., 2021). The true positive rate (TPR), also known as recall or sensitivity, is utilized to determine the prediction range in this instance. ...
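The cutoff rule described in this excerpt is easy to state in code: evaluate the true positive rate at each lead time and take the last lead before it drops to the 0.5 chance level. The data and function names below are hypothetical illustrations, not from the cited studies.

```python
import numpy as np

def true_positive_rate(y_true, y_pred):
    """TPR (recall): the fraction of actual event days the forecast flags."""
    positives = np.sum(y_true == 1)
    hits = np.sum((y_true == 1) & (y_pred == 1))
    return hits / positives if positives else np.nan

def max_skilful_lag(y_true_by_lag, y_pred_by_lag, threshold=0.5):
    """Last lead time (1-based) before the TPR first drops to chance level."""
    best = 0
    for k, (yt, yp) in enumerate(zip(y_true_by_lag, y_pred_by_lag), start=1):
        if true_positive_rate(yt, yp) <= threshold:
            break
        best = k
    return best

# Hypothetical daily MHW labels and forecasts whose skill decays with lead:
# at lead k the forecast misses exactly k out of every 10 events, so the
# TPR is exactly 1 - k/10 and skill crosses chance between leads 4 and 5.
truth = (np.arange(1000) % 5 == 0).astype(int)         # an event every 5 days
y_true_by_lag, y_pred_by_lag = [], []
for k in range(1, 11):
    missed = (np.arange(1000) // 5) % 10 < k           # misses grow with lead
    y_true_by_lag.append(truth)
    y_pred_by_lag.append(np.where(missed, 0, truth))
k_max = max_skilful_lag(y_true_by_lag, y_pred_by_lag)  # -> 4
```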
Article
Full-text available
A systematic analysis of historical and modeled marine heatwaves (MHWs) off eastern Tasmania has been performed based on satellite observations and a high-resolution regional ocean model simulation over the period 1994–2016. Our analysis suggests that the distribution of large and intense mesoscale warm core eddies off northeast Tasmania contributes to the development of MHWs further south, associated with changes in the circulation and transports. Importantly, we find that eddy distributions in the Tasman Sea can act as predictors of MHWs off eastern Tasmania. We used self-organizing maps to classify sea surface height anomalies (SSHA) and MHWs into different, but connected, patterns. We found the statistical model performs best (precision ~ 0.75) in the southern domain off eastern Tasmania. Oceanic mean states and heat budget analysis for true positive and false negative marine heatwave events revealed that the model generally captures ocean advection dominated MHWs. Using SSHA as the predictor variable, we find that our statistical model can forecast MHWs off southeast Tasmania up to 7 days in advance above random chance. This study provides improved understanding of the role of circulation anomalies associated with oceanic mesoscale eddies on MHWs off eastern Tasmania and highlights that individual MHWs in this region are potentially predictable up to 7 days in advance using mesoscale eddy-tracking methods.
Preprint
Extreme weather events are simultaneously the least likely and the most impactful features of the climate system, increasingly so as climate change proceeds. Extreme events are multi-faceted, highly variable processes which can be characterized in many ways: return time, worst-case severity, and predictability are all sought-after quantities for various kinds of rare events. A unifying framework is needed to define and calculate the most important quantities of interest for the purposes of near-term forecasting, long-term risk assessment, and benchmarking of reduced-order models. Here we use Transition Path Theory (TPT) for a comprehensive analysis of sudden stratospheric warming (SSW) events in a highly idealized wave-mean flow interaction system with stochastic forcing. TPT links together probabilities, dynamical behavior, and other risk metrics associated with rare events, representing their full statistical variability. At face value, fulfilling this promise demands extensive direct simulation to generate the rare event many times. Instead, we implement a highly parallel computational method that launches a large ensemble of short simulations, estimating long-timescale rare event statistics from short-term tendencies. We specifically investigate properties of SSW events including passage time distributions and large anomalies in vortex strength and heat flux. We visualize high-dimensional probability densities and currents, obtaining a nuanced picture of critical altitude-dependent interactions between waves and the mean flow that fuel SSW events. We find that TPT more faithfully captures the statistical variability between events as compared to the more conventional minimum action method.
Article
Full-text available
Acute and chronic wounds have varying etiologies and are an economic burden to healthcare systems around the world. The advanced wound care market is expected to exceed $22 billion by 2024. Wound care professionals rely heavily on images and image documentation for proper diagnosis and treatment. Unfortunately, a lack of expertise can lead to improper diagnosis of wound etiology and inaccurate wound management and documentation. Fully automatic segmentation of wound areas in natural images is an important part of the diagnosis and care protocol since it is crucial to measure the area of the wound and provide quantitative parameters in the treatment. Various deep learning models have gained success in image analysis including semantic segmentation. This manuscript proposes a novel convolutional framework based on MobileNetV2 and connected component labelling to segment wound regions from natural images. The advantage of this model is its lightweight and less compute-intensive architecture. The performance is not compromised and is comparable to deeper neural networks. We build an annotated wound image dataset consisting of 1109 foot ulcer images from 889 patients to train and test the deep learning models. We demonstrate the effectiveness and mobility of our method by conducting comprehensive experiments and analyses on various segmentation neural networks. The full implementation is available at https://github.com/uwm-bigdata/wound-segmentation.
Article
Full-text available
El Niño-Southern Oscillation (ENSO) is the dominant interseasonal–interannual variability in the tropical Pacific, and substantial efforts have been dedicated to predicting its occurrence and variability because of its extensive global impacts. However, ENSO predictability has been reduced in the 21st century, and the impact of the extratropical atmosphere on the tropics has intensified during the past 2 decades, making ENSO more complicated and harder to predict. Here, by combining tropical preconditions/ocean–atmosphere interaction with extratropical precursors, we provide a novel approach to noticeably increase the ENSO prediction skill beyond the spring predictability barrier. The success of increasing the prediction skill results mainly from the longer lead-time of the extratropical–tropical ocean-to-atmosphere interaction process, especially for the first 2 decades of the 21st century.
Article
Full-text available
Variations in the El Niño/Southern Oscillation (ENSO) are associated with a wide array of regional climate extremes and ecosystem impacts¹. Robust, long-lead forecasts would therefore be valuable for managing policy responses. But despite decades of effort, forecasting ENSO events at lead times of more than one year remains problematic². Here we show that a statistical forecast model employing a deep-learning approach produces skilful ENSO forecasts for lead times of up to one and a half years. To circumvent the limited amount of observation data, we use transfer learning to train a convolutional neural network (CNN) first on historical simulations³ and subsequently on reanalysis from 1871 to 1973. During the validation period from 1984 to 2017, the all-season correlation skill of the Nino3.4 index of the CNN model is much higher than those of current state-of-the-art dynamical forecast systems. The CNN model is also better at predicting the detailed zonal distribution of sea surface temperatures, overcoming a weakness of dynamical forecast models. A heat map analysis indicates that the CNN model predicts ENSO events using physically reasonable precursors. The CNN model is thus a powerful tool for both the prediction of ENSO events and for the analysis of their associated complex mechanisms.
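The transfer-learning strategy described in this abstract (pretrain on abundant model simulations, then fine-tune on scarce reanalysis) can be illustrated at toy scale. The sketch below swaps the CNN for a three-parameter linear predictor and uses synthetic data in which the "simulated world" is slightly biased relative to the "observed world"; all names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

def fit(X, y, w0, lr=0.01, epochs=200):
    """Plain gradient descent on mean-squared error, starting from w0."""
    w = w0.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

# "Nature" has weights true_w; the simulated world is biased by +0.1.
true_w = np.array([0.6, -0.3, 0.1])
X_sim = rng.standard_normal((5000, 3))                 # plentiful simulations
y_sim = X_sim @ (true_w + 0.1) + 0.1 * rng.standard_normal(5000)
X_obs = rng.standard_normal((100, 3))                  # scarce "reanalysis"
y_obs = X_obs @ true_w + 0.1 * rng.standard_normal(100)

w_pre = fit(X_sim, y_sim, np.zeros(3))                 # pretraining phase
w_ft = fit(X_obs, y_obs, w_pre, epochs=50)             # brief fine-tuning
w_cold = fit(X_obs, y_obs, np.zeros(3), epochs=50)     # no pretraining
```

The warm start matters because fifty gradient steps on a hundred samples are nowhere near enough to train from scratch, yet they suffice to correct the pretrained weights' bias; the same economics are what let a short reanalysis record go a long way after pretraining on historical simulations.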
Article
Full-text available
Forecasting the El Niño-Southern Oscillation (ENSO) has been a subject of vigorous research due to the important role of the phenomenon in climate dynamics and its worldwide socioeconomic impacts. Over the past decades, numerous models for ENSO prediction have been developed, among which statistical models approximating ENSO evolution by linear dynamics have received significant attention owing to their simplicity and comparable forecast skill to first-principles models at short lead times. Yet, due to highly nonlinear and chaotic dynamics, such models have limited skill for longer-term forecasts beyond half a year. To resolve this limitation, here we employ a new nonparametric statistical approach based on analog forecasting, called kernel analog forecasting (KAF), which avoids assumptions on the underlying dynamics through the use of nonlinear kernel methods for machine learning. Through a rigorous connection with Koopman operator theory for dynamical systems, KAF yields statistically optimal predictions of future ENSO states as conditional expectations, given noisy and potentially incomplete data at forecast initialization. Here, using industrial-era Indo-Pacific sea surface temperature (SST) as training data, the method is shown to successfully predict the Niño 3.4 index in a 1988–2017 verification period out to a 13-month lead, which corresponds to an increase of 6 months over a benchmark linear inverse model (LIM), while significantly improving upon the ENSO predictability "spring barrier". Additionally, analysis of a 1300-yr control integration of a comprehensive climate model demonstrates that the enhanced skill of KAF holds over potentially much longer leads, extending to 24 months versus 11 months in the benchmark LIM. Probabilistic forecasts for the occurrence of El Niño/La Niña events are also performed, and assessed via information-theoretic metrics.
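Stripped of the Koopman-operator machinery, the core analog step admits a compact sketch: delay-embed the training series, weight historical states by a Gaussian kernel of their distance to the current window, and forecast with the kernel-weighted average of what followed each analog. The code below is a hypothetical toy on synthetic data, not the full KAF construction.

```python
import numpy as np

def embed(x, lags):
    """Delay-embed a series: row i holds x[i], ..., x[i + lags - 1]."""
    return np.stack([x[i:i + lags] for i in range(len(x) - lags + 1)])

def analog_forecast(train, query_window, lead, lags=12, bandwidth=0.5):
    """Kernel-weighted average of the values that followed similar
    delay-embedded states in the training series."""
    E = embed(train, lags)[: len(train) - lags + 1 - lead]
    future = train[lags - 1 + lead:]           # value `lead` steps after each state
    d2 = np.sum((E - query_window) ** 2, axis=1)
    w = np.exp(-d2 / (2 * bandwidth ** 2))     # Gaussian kernel weights
    return np.sum(w * future) / np.sum(w)

# A noisy oscillation standing in for an SST index (hypothetical data).
rng = np.random.default_rng(3)
series = np.sin(2 * np.pi * np.arange(2400) / 48) + 0.05 * rng.standard_normal(2400)
train, recent = series[:2000], series[2000 - 12:2000]
pred = analog_forecast(train, recent, lead=6)
truth = series[2000 - 1 + 6]                   # verification value, 6 steps ahead
```

The delay embedding is what disambiguates states that share the same instantaneous value but sit in different phases of the cycle; the kernel bandwidth controls how strictly "analog" is interpreted.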
Article
Full-text available
ENSO is the strongest interannual signal in the global climate system, with worldwide climatic, ecological, and societal impacts. Over the past decades, research on ENSO prediction and predictability has attracted broad attention. With the development of coupled models, the improvement in initialization schemes, and the advance in theoretical studies, ENSO has become the most predictable climate mode at time scales from months to seasons. This paper reviews in detail the progress in ENSO predictions and predictability studies achieved in recent years. An emphasis is placed on two fundamental issues: the improvement in practical prediction skill and the advance in the theoretical study of the intrinsic predictability limit. The former includes the progress in the coupled models, data assimilation, ensemble predictions and so on, and the latter focuses on the efforts in the study of the optimal error growth and in the estimate of the intrinsic predictability limit.
Article
Full-text available
Complex multiscale systems are ubiquitous in many areas. This research expository article discusses the development and applications of a recent information-theoretic framework as well as novel reduced-order nonlinear modeling strategies for understanding and predicting complex multiscale systems. The topics include the basic mathematical properties and qualitative features of complex multiscale systems, statistical prediction and uncertainty quantification, state estimation or data assimilation, and coping with the inevitable model errors in approximating such complex systems. Here, the information-theoretic framework is applied to rigorously quantify the model fidelity, model sensitivity and information barriers arising from different approximation strategies. It also succeeds in assessing the skill of filtering and predicting complex dynamical systems and overcomes the shortcomings in traditional path-wise measurements such as the failure in measuring extreme events. In addition, information theory is incorporated into a systematic data-driven nonlinear stochastic modeling framework that allows effective predictions of nonlinear intermittent time series. Finally, new efficient reduced-order nonlinear modeling strategies combined with information theory for model calibration provide skillful predictions of intermittent extreme events in spatially-extended complex dynamical systems. The contents here include the general mathematical theories, effective numerical procedures, instructive qualitative models, and concrete models from climate, atmosphere and ocean science.
Article
Full-text available
Seasonal forecasts made by coupled atmosphere–ocean general circulation models (CGCMs) undergo strong climate drift and initialization shock, driving the model state away from its long-term attractor. Here we explore initializing directly on a model’s own attractor, using an analog approach in which model states close to the observed initial state are drawn from a “library” obtained from prior uninitialized CGCM simulations. The subsequent evolution of those “model-analogs” yields a forecast ensemble, without additional model integration. This technique is applied to four of the eight CGCMs comprising the North American Multimodel Ensemble (NMME) by selecting from prior long control runs those model states whose monthly tropical Indo-Pacific SST and SSH anomalies best resemble the observations at initialization time. Hindcasts are then made for leads of 1–12 months during 1982–2015. Deterministic and probabilistic skill measures of these model-analog hindcast ensembles are comparable to those of the initialized NMME hindcast ensembles, for both the individual models and the multimodel ensemble. In the eastern equatorial Pacific, model-analog hindcast skill exceeds that of the NMME. Despite initializing with a relatively large ensemble spread, model-analogs also reproduce each CGCM’s perfect-model skill, consistent with a coarse-grained view of tropical Indo-Pacific predictability. This study suggests that with little additional effort, sufficiently realistic and long CGCM simulations provide the basis for skillful seasonal forecasts of tropical Indo-Pacific SST anomalies, even without sophisticated data assimilation or additional ensemble forecast integrations. The model-analog method could provide a baseline for forecast skill when developing future models and forecast systems.
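The model-analog construction described here can be sketched with a scalar state: find the library states nearest the observed initial state and take their subsequent library evolution as the forecast ensemble, with no additional model integration. The code below is a hypothetical toy; in the study, matching is done on monthly tropical Indo-Pacific SST and SSH anomaly fields from CGCM control runs, not on scalars.

```python
import numpy as np

def model_analog_ensemble(library, obs_state, n_analogs=10, leads=12):
    """Pick the library states closest to the observed initial state and
    return their subsequent library evolution as a forecast ensemble."""
    candidates = library[: len(library) - leads]     # states with a full future
    order = np.argsort(np.abs(candidates - obs_state))
    idx = order[:n_analogs]                          # best-matching analogs
    # Ensemble member m at lead L is simply the library value idx[m] + L.
    return np.stack([library[i + 1 : i + 1 + leads] for i in idx])

# Hypothetical "long control run" standing in for a CGCM library.
rng = np.random.default_rng(4)
library = np.sin(2 * np.pi * np.arange(3000) / 48) + 0.05 * rng.standard_normal(3000)
ens = model_analog_ensemble(library, obs_state=0.8)
ens_mean, ens_spread = ens.mean(axis=0), ens.std(axis=0)
```

With a scalar state, analogs from both the rising and falling phase of the cycle are selected, which shows up as ensemble spread; matching richer states (full anomaly maps, as in the study) pins down the phase and tightens the ensemble.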
Article
Predicting complex nonlinear turbulent dynamical systems is an important and practical topic. However, due to the lack of a complete understanding of nature, the ubiquitous model error may greatly affect the prediction performance. Machine learning algorithms can overcome the model error, but they are often impeded by inadequate and partial observations in predicting nature. In this article, an efficient and dynamically consistent conditional sampling algorithm is developed, which incorporates the conditional path-wise temporal dependence into a two-step forward-backward data assimilation procedure to sample multiple distinct nonlinear time series conditioned on short and partial observations using an imperfect model. The resulting sampled trajectories succeed in reducing the model error and greatly enrich the training data set for machine learning forecasts. For a rich class of nonlinear and non-Gaussian systems, the conditional sampling is carried out by solving a simple stochastic differential equation, which is computationally efficient and accurate. The sampling algorithm is applied to create massive training data of multiscale compressible shallow water flows from highly nonlinear and indirect observations. The resulting machine learning prediction significantly outperforms the imperfect model forecast. The sampling algorithm also facilitates the machine learning forecast of a highly non-Gaussian climate phenomenon using extremely short observations.
Article
With the era of big data, the utilization of machine learning algorithms in radiation oncology is rapidly growing, with applications including treatment response modeling, treatment planning, contouring, organ segmentation, image-guidance, motion tracking, quality assurance, and more. Despite this interest, practical clinical implementation of machine learning as part of the day-to-day clinical operations is still lagging. The aim of this white paper is to further promote progress in this new field of machine learning in radiation oncology by highlighting its untapped advantages and potentials for clinical advancement, while also presenting current challenges and open questions for future research. The targeted audience of this paper includes newcomers as well as practitioners in the field of medical physics/radiation oncology. The paper also provides general recommendations to avoid common pitfalls when applying these powerful data analytic tools to medical physics and radiation oncology problems and suggests some guidelines for transparent and informative reporting of machine learning results.
Book
This book presents Statistical Learning Theory in a detailed and easy to understand way, using practical examples, algorithms, and source codes. It can be used as a textbook in graduate or undergraduate courses, for self-learners, or as a reference for the main theoretical concepts of Machine Learning. Fundamental concepts of Linear Algebra and Optimization applied to Machine Learning are provided, as well as source codes in R, making the book as self-contained as possible. It starts with an introduction to Machine Learning concepts and algorithms such as the Perceptron, Multilayer Perceptron, and the Distance-Weighted Nearest Neighbors, with examples, in order to provide the necessary foundation so the reader is able to understand the Bias-Variance Dilemma, which is the central point of Statistical Learning Theory. Afterwards, we introduce all assumptions and formalize Statistical Learning Theory, allowing the practical study of different classification algorithms. Then, we proceed with concentration inequalities until arriving at the Generalization and the Large-Margin bounds, providing the main motivations for the Support Vector Machines. From that, we introduce all necessary optimization concepts related to the implementation of Support Vector Machines. To provide a next stage of development, the book finishes with a discussion on SVM kernels as a way and motivation to study data spaces and improve classification results.