The Cryosphere, 14, 565–584, 2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.
Deep learning applied to glacier evolution modelling
Jordi Bolibar1,2, Antoine Rabatel1, Isabelle Gouttevin3, Clovis Galiez4, Thomas Condom1, and Eric Sauquet2
1Univ. Grenoble Alpes, CNRS, IRD, G-INP, Institut des Géosciences de l’Environnement
(IGE, UMR 5001), Grenoble, France
2INRAE, UR RiverLy, Villeurbanne, Lyon, France
3Univ. Grenoble Alpes, Université de Toulouse, Météo-France, CNRS, CNRM,
Centre d’Études de la Neige, Grenoble, France
4Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, Grenoble, France
Correspondence: Jordi Bolibar (
Received: 8 July 2019 – Discussion started: 26 July 2019
Revised: 16 January 2020 – Accepted: 18 January 2020 – Published: 13 February 2020
Abstract. We present a novel approach to simulate and re-
construct annual glacier-wide surface mass balance (SMB)
series based on a deep artificial neural network (ANN;
i.e. deep learning). This method has been included as the
SMB component of an open-source regional glacier evolu-
tion model. While most glacier models tend to incorporate
more and more physical processes, here we take an alterna-
tive approach by creating a parameterized model based on
data science. Annual glacier-wide SMBs can be simulated
from topo-climatic predictors using either deep learning or
Lasso (least absolute shrinkage and selection operator; reg-
ularized multilinear regression), whereas the glacier geome-
try is updated using a glacier-specific parameterization. We
compare and cross-validate our nonlinear deep learning SMB
model against other standard linear statistical methods on a
dataset of 32 French Alpine glaciers. Deep learning is found
to outperform linear methods, with improved explained variance
(up to +64 % in space and +108 % in time) and accuracy
(up to +47 % in space and +58 % in time), resulting
in an estimated r² of 0.77 and a root-mean-square error
(RMSE) of 0.51 m w.e. Substantial nonlinear structures are
captured by deep learning, with around 35 % of nonlinear be-
haviour in the temporal dimension. For the glacier geometry
evolution, the main uncertainties come from the ice thick-
ness data used to initialize the model. These results should
encourage the use of deep learning in glacier modelling as a
powerful nonlinear tool, capable of capturing the nonlinear-
ities of the climate and glacier systems, that can serve to re-
construct or simulate SMB time series for individual glaciers
in a whole region for past and future climates.
1 Introduction
Glaciers are arguably one of the most important icons of cli-
mate change, being climate proxies which can depict the evo-
lution of climate for the global audience (IPCC, 2018). In
the coming decades, mountain glaciers will be some of the
most important contributors to sea level rise and will most
likely drive important changes in the hydrological regime of
glacierized catchments (Beniston et al., 2018; Vuille et al.,
2018; Hock et al., 2019). The reduction in ice volume may
produce an array of hydrological, ecological and economic
consequences in mountain regions, which need to be properly
predicted. These consequences will strongly depend on
the future climatic scenarios, which will determine the tim-
ing and magnitude for the transition of hydrological regimes
(Huss and Hock, 2018). Understanding these future transi-
tions is key for societies to adapt to future hydrological and
climate configurations.
Glacier and hydro-glaciological models can help answer
these questions, giving several possible outcomes depend-
ing on multiple climate scenarios. (a) Surface mass balance
(SMB) and (b) glacier dynamics both need to be modelled
to understand glacier evolution on regional and sub-regional
scales. Models of varying complexity exist for both pro-
cesses. In order to model these processes at a large scale (i.e.
on several glaciers at a catchment scale), some compromises
need to be made, which can be approached in different ways.
Published by Copernicus Publications on behalf of the European Geosciences Union.
566 J. Bolibar et al.: Deep learning applied to glacier evolution modelling
a. Regarding SMB, the following apply:
1. Empirical models, like the temperature-index
model (e.g. Hock, 2003), simulate glacier SMB
through empirical relationships between air temper-
ature and melt and snow accumulation.
2. Statistical or machine learning models describe
and predict glacier SMB based on statistical rela-
tionships found in data from a selection of topo-
graphical and climate predictors (e.g. Martin, 1974;
Steiner et al., 2005).
3. Physical and surface energy balance (SEB) mod-
els take into account all energy exchanges between
the glacier and the atmosphere and can simulate the
spatial and temporal variability in snowmelt and the
changes in albedo (e.g. Gerbaux et al., 2005).
b. Regarding glacier dynamics, the following apply:
1. Parameterized models do not explicitly resolve any
physical processes but implicitly take them into ac-
count using parameterizations, based on statistical
or empirical relationships, in order to modify the
glacier geometry. This type of model ranges from
very simple statistical models (e.g. Carlson et al.,
2014) to more complex ones based on different
approaches, such as a calibrated equilibrium-line
altitude (ELA) model (e.g. Zemp et al., 2006), a
glacier retreat parameterization specific to glacier
size groups (Huss and Hock, 2015), or volume
and length–area scaling (e.g. Marzeion et al., 2012;
Radić et al., 2014).
2. Process-based models, like GloGEMflow (e.g.
Zekollari et al., 2019) and the Open Global Glacier
Model (OGGM; e.g. Maussion et al., 2019), ap-
proximate a number of glacier physical processes
involved in ice flow dynamics using the shallow ice approximation.
3. Physics-based models, like the finite-element
Elmer/Ice model (e.g. Gagliardini et al., 2013), ap-
proach glacier dynamics by explicitly simulating
physical processes and solving the full Stokes equa-
tions (e.g. Jouvet et al., 2009; Réveillet et al., 2015).
At the same time, the use of these different approaches
strongly depends on available data, whose spatial and tem-
poral resolutions have an important impact on the results’
quality and uncertainties (e.g. Réveillet et al., 2018). Param-
eterized glacier dynamics models and empirical and statis-
tical SMB models require a reference or training dataset to
calibrate the relationships, which can then be used for projections
under the hypothesis that relationships remain stationary
in time. In contrast, process-based and especially
physics-based glacier dynamics and SMB models have the
advantage of representing physical processes, but they re-
quire larger datasets at higher spatial and temporal resolu-
tions with a consequently higher computational cost (Réveil-
let et al., 2018). For SMB modelling, meteorological reanal-
yses provide an attractive alternative to sparse point observa-
tions, although their spatial resolution and suitability to com-
plex high-mountain topography are often not good enough
for high-resolution physics-based glacio-hydrological appli-
cations. However, parameterized models are much more flex-
ible, equally dealing with fewer and coarser meteorological
data as well as the state-of-the-art reanalyses, which allows
them to work at resolutions much closer to the glaciers’ scale
and to reduce uncertainties. The current resolution of climate
projections is still too low to adequately drive most glacier
physical processes, but the ever-growing datasets of histori-
cal data are paving the way for the training of parameterized
machine learning models.
In glaciology, statistical models have been applied for
more than half a century, starting with simple multiple lin-
ear regressions on few meteorological variables (Hoinkes,
1968; Martin, 1974). Statistical modelling has made enor-
mous progress in the last decades, especially thanks to the
advent of machine learning. Compared to other fields in geo-
sciences, such as oceanography (e.g. Ducournau and Fablet,
2016; Lguensat et al., 2018), climatology (e.g. Rasp et al.,
2018; Jiang et al., 2018) and hydrology (e.g. Marçais and
de Dreuzy, 2017; Shen, 2018), we believe that the glacio-
logical community has not yet exploited the full capabilities
of these approaches. Despite this fact, a number of studies
have taken steps towards statistical approaches. Steiner et al.
(2005) pioneered the very first study to use artificial neural
networks (ANNs) in glaciology to simulate mass balances
of the Great Aletsch Glacier in Switzerland. They showed
that a nonlinear model is capable of better simulating glacier
mass balances compared to a conventional stepwise multi-
ple linear regression. Furthermore, they found a significant
nonlinear part within the climate–glacier mass balance rela-
tionship. This work was continued in Steiner et al. (2008)
and Nussbaumer et al. (2012) for the simulation of glacier
length instead of mass balances. Later on, Maussion et al.
(2015) developed an empirical statistical downscaling tool
based on machine learning in order to retrieve glacier sur-
face energy and mass balance (SEB–SMB) fluxes from large-
scale atmospheric data. They used different machine learning
algorithms, but all of them were linear, which is not neces-
sarily the most suitable for modelling the nonlinear climate
system (Houghton et al., 2001). Nonetheless, more recent de-
velopments in the field of machine learning and optimization
enabled the use of deeper network structures than the three-
layer ANN of Steiner et al. (2005). These deeper ANNs,
which remain unexploited in glaciology, allow us to capture
more nonlinear structures in the data even for relatively small
datasets (Ingrassia and Morlini, 2005; Olson et al., 2018).
Here, we present a parameterized regional open-source
glacier model: the ALpine Parameterized Glacier Model
(ALPGM; Bolibar, 2019). While most glacier evolution
models tend to incorporate more and more physical processes
in SMB or ice dynamics (e.g. Maussion et al., 2019; Zekol-
lari et al., 2019), ALPGM takes an alternative approach based
on data science for SMB modelling and parameterizations
for glacier dynamics simulation. ALPGM simulates annual
glacier-wide SMB and the evolution of glacier volume and
surface area over timescales from a few years to a century
at a regional scale. Glacier-wide SMBs are computed us-
ing a deep ANN, fed by several topographical and climatic
variables, an approach which is compared to different lin-
ear methods in the present paper. In order to distribute these
annual glacier-wide SMBs and to update the glacier geometry,
a refined version of the Δh methodology (e.g. Huss
et al., 2008) is used, for which we dynamically compute
glacier-specific Δh functions. In order to validate this approach,
we use a case study with 32 French Alpine glaciers
for which glacier-wide annual SMBs are available over the
period 1984–2014 and 1959–2015 for certain glaciers. High-
resolution meteorological reanalyses for the same time pe-
riod are used (SAFRAN; Durand et al., 2009), while the
initial ice thickness distribution of glaciers is taken from
Farinotti et al. (2019), for which we performed a sensitivity
analysis based on field observations.
In the next section, we present an overview of the pro-
posed glacier evolution model framework with a detailed de-
scription of the two components used to simulate the annual
glacier-wide SMB and the glacier geometry update. Then, a
case study using French Alpine glaciers is presented, which
enables us to illustrate an example of application of the pro-
posed framework, including a rich dataset, the parameterized
functions, and the results and their performance. Finally,
several aspects regarding machine and deep learning mod-
elling in glaciology are discussed, from which we make some
recommendations and draw the final conclusions.
2 Model overview and methods
In this section we present an overview of ALPGM. More-
over, the two components of this model are presented in de-
tail: the Glacier-wide SMB Simulation component and the
Glacier Geometry Update component.
2.1 Model overview and workflow
ALPGM is an open-source glacier model coded in Python.
The source code of the model is accessible in the project
repository (see Code and data availability). It is structured
in multiple files which execute specific separate tasks. The
model can be divided into two main components: (1) the
Glacier-wide SMB Simulation and (2) the Glacier Geome-
try Update. The Glacier-wide SMB Simulation component
is based on machine learning, taking both meteorological
and topographical variables as inputs. The Glacier Geome-
try Update component generates the glacier-specific param-
eterized functions and modifies annually the geometry of the
glacier (e.g. ice thickness distribution, glacier outline) based
on the glacier-wide SMB models generated by the Glacier-
wide SMB simulation component.
Figure 1 presents ALPGM’s basic workflow. The work-
flow execution can be configured via the model interface, al-
lowing the user to run or skip any of the following steps:
1. The meteorological forcings are preprocessed in order
to extract the necessary data closest to each glacier’s
centroid. The meteorological features are stored in in-
termediate files in order to reduce computation times
for future runs, automatically skipping this preprocess-
ing step when the files have already been generated.
2. The SMB machine learning component retrieves the
preprocessed climate predictors from the stored files,
retrieves the topographical predictors from the multi-
temporal glacier inventories and assembles the train-
ing dataset by combining all the necessary topo-climatic
predictors. A machine learning algorithm is chosen for
the SMB model, which can be loaded from a previous
run, or it can be trained again with a new dataset. Then,
the SMB model or models are trained with the full topo-climatic
dataset. These models are stored in intermediate
files, allowing the user to skip this step for future runs.
3. Performances of the SMB models can be evaluated with
a leave-one-glacier-out (LOGO) or a leave-one-year-out
(LOYO) cross validation. This step can be skipped when
using already-established models. Basic statistical per-
formance metrics are given for each glacier and model
as well as plots with the simulated cumulative glacier-
wide SMBs compared to their reference values with
uncertainties for each of the glaciers from the training dataset.
4. The Glacier Geometry Update component starts with
the generation of the glacier-specific parameterized
functions, using a raster containing the difference of the
two preselected digital elevation models (DEMs) cov-
ering the study area for two separate dates as well as
the glacier contours. These parameterized functions are
then stored in individual files to be used in the final simulations.
5. Once all previous steps have been run, the glacier evo-
lution simulations are launched. For each glacier, the
initial ice thickness and DEM rasters and the glacier ge-
ometry update function are retrieved. Then, in a loop,
for every glacier and year, the topographical data are
computed from these raster files. The climate predic-
tors at the glacier’s current centroid are retrieved from
the climate data (e.g. reanalysis or projections), and
with all this data the input topo-climatic data for the
Figure 1. ALPGM structure and workflow.
glacier-wide SMB model are assembled. Afterwards,
the glacier-wide SMB for this glacier and year is simu-
lated, which, combined with the glacier-specific geom-
etry update function, allows us to update the glacier’s
ice thickness and DEM rasters. This process is repeated
in a loop, therefore updating the glacier’s geometry with
an annual time step and taking into account the glacier’s
morphological and topographical changes in the glacier-
wide SMB simulations. For the simulation of the fol-
lowing year’s SMB, the previously updated ice thick-
ness and DEM rasters are used to recompute the topo-
graphical parameters, which in turn are used as input
topographical predictors for the glacier-wide SMB ma-
chine learning model. If all the ice thickness raster pix-
els of a glacier become zero, the glacier is considered
to have disappeared and is removed from the simula-
tion pipeline. For each year, multiple results are stored
in data files as well as the raster DEM and ice thickness
values for each glacier.
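The leave-one-glacier-out (LOGO) cross validation mentioned in step 3 can be illustrated with a minimal sketch. This is not ALPGM's actual code: the function name `leave_one_group_out` and the toy glacier labels are assumptions made for illustration only. Passing years instead of glacier identifiers as groups would give the leave-one-year-out (LOYO) variant.

```python
import numpy as np

def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) for each unique group label.

    `groups` holds one label per sample: glacier IDs for LOGO,
    years for LOYO.
    """
    groups = np.asarray(groups)
    for held_out in np.unique(groups):
        test_mask = groups == held_out
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Hypothetical toy dataset: 3 glaciers with 2 annual SMB values each.
glacier_ids = np.array(["A", "A", "B", "B", "C", "C"])
folds = list(leave_one_group_out(glacier_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Each fold trains on all samples except those of one glacier (or one year), so the score measures how well the model transfers in space (or in time).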
2.2 Glacier-wide surface mass balance simulation
Annual glacier-wide SMBs are simulated using machine
learning. Due to the regional characteristics and specifici-
ties of topographical and climate data, this glacier-wide SMB
modelling method is, for now, a regional approach.
2.2.1 Selection of explanatory topographical and
climatic variables
In order to narrow down which topographical and climatic
variables best explain glacier-wide SMB in a given study
area, a literature review as well as a statistical sensitivity
analysis are performed. Typically used topographical predic-
tors are longitude, latitude, glacier slope and mean altitude.
As for meteorological predictors, cumulative positive degree
days (CPDDs) but also mean monthly temperature, snowfall
and possibly other variables that influence the surface energy
budget are often used in the literature. Examples of both to-
pographic and meteorological predictors can be found in the
case study in Sect. 3. A way to prevent biases when mak-
ing predictions with different climate data is to work with
anomalies, calculated as differences of the variable with re-
spect to its average value over a chosen reference period.
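As a minimal sketch of this anomaly computation, the snippet below subtracts the mean over a reference period from a predictor series. The function name `anomalies`, the CPDD values and the 2000–2004 reference period are hypothetical and only serve as an illustration.

```python
import numpy as np

def anomalies(values, years, ref_start, ref_end):
    """Return values minus their mean over the reference period [ref_start, ref_end]."""
    values = np.asarray(values, dtype=float)
    years = np.asarray(years)
    in_ref = (years >= ref_start) & (years <= ref_end)
    if not in_ref.any():
        raise ValueError("reference period contains no data")
    return values - values[in_ref].mean()

# Hypothetical annual CPDD series for one glacier.
years = np.arange(2000, 2006)
cpdd = np.array([900.0, 950.0, 1000.0, 1050.0, 1100.0, 1300.0])
cpdd_anomaly = anomalies(cpdd, years, 2000, 2004)
```

Because each climate dataset is referenced to its own mean over the same period, a constant offset between two datasets cancels out in the anomalies.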
For the machine learning training, the relevant predictors
must be selected, so we perform a sensitivity study of the
annual glacier-wide SMB to topographical and climatic vari-
ables over the study training period. This can be performed
with individual linear regressions between each variable and
glacier-wide SMB data. After identification of the topograph-
ical and climatic variables that can potentially explain an-
nual glacier-wide SMB variability for the region of interest,
a training dataset is built. An effective way of expanding the
training dataset in order to dig deeper into the available data
Figure 2. Glacier-wide SMB simulation component workflow. Machine
learning models are dynamically created based on training data.
is to combine the climatic and topographical input variables
(Weisberg, 2014). Such combinations can be expressed following Eq. (1):
$$\mathrm{SMB}_{g,y} = f(\hat{\Omega}, \hat{C}) + \varepsilon_{g,y}, \qquad (1)$$
where $\hat{\Omega}$ is a vector of the selected topographical predictors,
$\hat{C}$ is a vector with the selected climatic features and $\varepsilon_{g,y}$ is
the residual error for each annual glacier-wide SMB value, $\mathrm{SMB}_{g,y}$.
Once the training dataset is created, different algorithms f
(two linear and one nonlinear, for the case of this study) can
be chosen to create the SMB model: (1) OLS (ordinary least
squares) all-possible multiple linear regressions, (2) Lasso
(least absolute shrinkage and selection operator; Tibshirani,
1996) and (3) a deep ANN. ALPGM uses some of the
most popular machine learning Python libraries: statsmodels
(Seabold and Perktold, 2010), scikit-learn (Pedregosa et al.,
2011) and Keras (Chollet, 2015) with a TensorFlow backend.
The overall workflow of the machine learning glacier-wide
SMB model production in ALPGM is summarized in Fig. 2.
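The combination of topographical and climatic predictors used to expand the training dataset for the linear models can be sketched as follows. The text only states that the variables are combined, so the pairwise products used here are an assumption, as are the function name `build_training_matrix` and the toy values.

```python
import numpy as np

def build_training_matrix(topo, clim):
    """Stack topographical predictors, climatic predictors and their pairwise products.

    topo: array (n_samples, n_topo); clim: array (n_samples, n_clim).
    Returns an array of shape (n_samples, n_topo + n_clim + n_topo * n_clim).
    """
    topo = np.asarray(topo, dtype=float)
    clim = np.asarray(clim, dtype=float)
    # Per-sample outer product, flattened: every topo_i * clim_j term (assumed form).
    products = (topo[:, :, None] * clim[:, None, :]).reshape(len(topo), -1)
    return np.hstack([topo, clim, products])

# Hypothetical samples: (mean altitude, slope) and three climate anomalies.
topo = np.array([[2900.0, 20.0], [3100.0, 15.0]])
clim = np.array([[50.0, -0.2, 0.1], [-30.0, 0.4, -0.3]])
X = build_training_matrix(topo, clim)
```

Such product terms let a linear model represent a climate sensitivity that varies with glacier topography, at the cost of a wider and more collinear predictor matrix.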
2.2.2 All-possible multiple linear regressions
With the OLS all-possible multiple linear regressions, we attempt
to find the best subset of predictors in Eq. (1) based
on the resulting adjusted r² while at the same time avoiding
overfitting (Hawkins, 2004) and collinearity and limiting the
complexity of the model. As its name indicates, the goal is
to minimize the residual sum of squares for each subset of
predictors (Hastie et al., 2009). A total of n models are produced
by selecting all possible subsets of k predictors. It is advisable to
narrow down the number of predictors for each subset in the
search to reduce the computational cost. Models with low
performance are filtered out, keeping only models with the
highest adjusted r² possible, a variance inflation factor (VIF)
< 1.2 and a p value < 0.01/n (i.e. applying the Bonferroni
correction). Retained models are combined by averaging
their predictions, thereby avoiding the pitfalls related to
stepwise single-model selection (Whittingham et al., 2006).
These criteria ensure that the models explain as much variability
as possible, avoid collinearity and are statistically significant.
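A minimal NumPy sketch of such a subset search is given below. It is not the statsmodels-based implementation of ALPGM: the function names, the synthetic data and the `max_k` limit are assumptions, and the p value filter and the averaging of retained models are omitted for brevity. Only the VIF threshold of 1.2 is taken from the text.

```python
from itertools import combinations
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with intercept; returns coefficients and fitted values."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A @ coef

def adjusted_r2(y, y_hat, k):
    """Adjusted r2 for a model with k predictors."""
    n = len(y)
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def max_vif(X):
    """Largest variance inflation factor among the columns of X."""
    if X.shape[1] < 2:
        return 1.0
    vifs = []
    for j in range(X.shape[1]):
        _, fitted = fit_ols(np.delete(X, j, axis=1), X[:, j])
        ss_res = np.sum((X[:, j] - fitted) ** 2)
        ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        vifs.append(ss_tot / ss_res)  # equals 1 / (1 - r2_j)
    return max(vifs)

def all_possible_regressions(X, y, max_k=2, vif_limit=1.2):
    """Rank all predictor subsets of size <= max_k by adjusted r2."""
    results = []
    for k in range(1, max_k + 1):
        for subset in combinations(range(X.shape[1]), k):
            cols = list(subset)
            if max_vif(X[:, cols]) >= vif_limit:
                continue  # discard collinear subsets
            _, y_hat = fit_ols(X[:, cols], y)
            results.append((adjusted_r2(y, y_hat, k), subset))
    return sorted(results, reverse=True)

# Synthetic data: the target depends on predictors 0 and 2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - X[:, 2] + 0.1 * rng.normal(size=200)
ranked = all_possible_regressions(X, y)
best_r2, best_subset = ranked[0]
```

Limiting `max_k` keeps the number of candidate models manageable, since the number of subsets grows combinatorially with the number of predictors.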
2.2.3 Lasso
The Lasso (Tibshirani, 1996) is a shrinkage method which
attempts to overcome the shortcomings of the simpler step-
wise and all-possible regressions. In these two classical ap-
proaches, predictors are discarded in a discrete way, giving
subsets of variables which have the lowest prediction error.
However, due to this discrete selection, these different subsets
can exhibit high variance, which does not reduce the predic-
tion error of the full model. The Lasso performs a more con-
tinuous regularization by shrinking some coefficients and set-
ting others to zero, thus producing more interpretable models
(Hastie et al., 2009). Because of its properties, it strikes a bal-
ance between subset selection (like all-possible regressions)
and ridge regression (Hoerl and Kennard, 1970). All input
data are normalized by removing the mean and scaling to unit
variance. In order to determine the degree of regularization
applied to the coefficients used in the linear OLS regression,
an alpha parameter needs to be chosen. ALPGM offers
different selection methods to choose from:
the Akaike information criterion (AIC), the
Bayesian information criterion (BIC) and a classical cross
validation with iterative fitting along a regularization path
(used in the case study). Alternatively, a Lasso model with
least-angle regression, also known as Lasso Lars (Tibshirani
et al., 2004), can also be chosen with a classical cross validation.
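The shrinkage behaviour of the Lasso can be illustrated with a small cyclic coordinate descent written in NumPy. This is a didactic sketch, not the scikit-learn routine used by ALPGM: the function names, the synthetic data and the fixed value alpha = 0.5 are assumptions, and the selection of alpha by cross validation or information criteria is not shown.

```python
import numpy as np

def standardize(X):
    """Remove the mean and scale each column to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning 0 if it is smaller."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / (2 n)) * ||y - X w||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    y_centered = y - y.mean()
    for _ in range(n_iter):
        for j in range(p):
            # Residual with the contribution of feature j removed.
            residual = y_centered - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, alpha) / (X[:, j] @ X[:, j] / n)
    return w

# Synthetic data: only the first two of five predictors carry signal.
rng = np.random.default_rng(1)
X = standardize(rng.normal(size=(300, 5)))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=300)
w = lasso_coordinate_descent(X, y, alpha=0.5)
```

In this toy case the coefficients of the informative predictors are shrunk towards zero, while those of the uninformative predictors are set exactly to zero, which is the continuous selection behaviour described above.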
2.2.4 Deep artificial neural network
ANNs are nonlinear statistical models inspired by biological
neural networks (Fausett, 1994; Hastie et al., 2009). A neu-
ral network is characterized by (1) the architecture or pattern
of connections between units and the number of layers (in-
put, output and hidden layers), (2) the optimizer, which is
the method for determining the weights of the connections
between units, and (3) its (usually nonlinear) activation func-
tions (Fausett, 1994). When ANNs have more than one hid-
den layer (e.g. Fig. 3), they are referred to as deep ANNs or
deep learning. The description of neural networks is beyond
the scope of this study, so for more details and a full explana-
tion, please refer to Fausett (1994), Hastie et al. (2009) and
Steiner et al. (2005, 2008), where the reader can find a thor-
ough introduction to the use of ANNs in glaciology. ANNs
gained recent interest thanks to improvements in optimization
algorithms allowing the training of deep neural networks,
which leads to a better representation of complex data patterns.
As their learnt parameters are difficult to interpret, ANNs are
adequate tools when the quality of predictions prevails over
the interpretability of the model (the latter likely involving
causal inference, sensitivity testing or modelling of ancillary
variables). This is precisely the case in our study context
here, where abundant knowledge about glacier physics fur-
ther helps in choosing adequate variables as input to deep
learning. Their ability to model complex functions of the
input parameters makes them particularly suitable for mod-
elling complex nonlinear systems such as the climate system
(Houghton et al., 2001) and glacier systems (Steiner et al., 2005).
ALPGM uses a feed-forward fully connected ANN
(Fig. 3). In such an architecture, the processing units – or
neurons – are grouped into layers, where all the units of a
given layer are fully connected to all units of the next layer.
The flow of information is directional, from the input layer
(i.e. in which each neuron corresponds to one of the N explanatory
variables) to the output neuron (i.e. corresponding
to the target variable of the model, the SMB). For each
connection of the ANN, weights are initialized in a random
fashion following a specific distribution (generally centred
around 0). In each unit of each hidden layer, the weighted
values are summed before going through a nonlinear activa-
tion function, responsible for introducing the nonlinearities
in the model. Using a series of iterations known as epochs,
the ANN will try to minimize a specific loss function (the
mean-squared error – MSE – in our case), comparing the pro-
cessed values of the output layer with the ground truth (y).
In order to avoid falling into local minima of the loss func-
tion, some regularization is needed to prevent the ANN from
overfitting (Hastie et al., 2009). To prevent overfitting during
the training process (i.e. to increase the ability of the model
to generalize to new data), we used a classical regulariza-
tion method called dropout, consisting of training iteratively
smaller subparts of the ANN by randomly disconnecting a
certain number of connections between units. The introduc-
tion of Gaussian noise at the input of the ANN also helped
to generalize, as it has an effect similar to data augmentation.
The main consequence of regularization is improved generalization:
the resulting model adapts better to different
configurations of the input data.
The hyperparameters used to configure the ANN are de-
termined using cross validation in order to find the best-
performing combination of number of units, hidden layers,
activation function, learning rate and regularization method.
Due to the relatively small size of our dataset, we encoun-
tered the best performances with a quite small deep ANN,
with a total of six layers (four hidden layers) with a (N, 40,
20, 10, 5, 1) architecture (Fig. 3), where N is the number
of selected features. Since the ANN already performs all the
possible combinations between features (predictors), we use
a reduced version of the training matrix from Eq. (1), with no
combination of climatic and topographical features. Due to
the relatively small size of the architecture, the best dropout
rates are small (Srivastava et al., 2014) and range between 0.3
and 0.01, depending on the number of units of each hidden
layer. Leaky ReLUs have been chosen as the activation func-
tion because of their widespread reliability and the fact that
they help prevent the “dead ReLU” problem, where certain
neurons can stop “learning” (Xu et al., 2015). The He uni-
form initialization (He et al., 2015) was used, as it is shown
to work well with Leaky ReLUs, and all unit biases were initialized
to zero. In order to optimize the weights by gradient
descent, we used the RMSprop optimizer, for which
we fine-tuned the learning rate, obtaining the best results at
0.0005 in space and 0.02 in time. Each batch was normalized
before applying the activation function in order to accelerate
the training (Ioffe and Szegedy, 2015).
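The (N, 40, 20, 10, 5, 1) architecture with He uniform initialization, zero biases and Leaky ReLU activations can be sketched as a plain NumPy forward pass. This is an illustration, not the Keras model of ALPGM: the value N = 8, the Leaky ReLU slope of 0.01 and the linear output neuron are assumptions, and dropout, Gaussian input noise, batch normalization and the RMSprop training loop are left out.

```python
import numpy as np

def he_uniform(fan_in, fan_out, rng):
    """He uniform initialization: U(-limit, limit) with limit = sqrt(6 / fan_in)."""
    limit = np.sqrt(6.0 / fan_in)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def leaky_relu(x, slope=0.01):
    """Leaky ReLU; the slope for negative inputs is an assumed value."""
    return np.where(x > 0, x, slope * x)

def init_network(n_features, hidden=(40, 20, 10, 5), seed=0):
    """Create (weights, biases) pairs for an (N, 40, 20, 10, 5, 1) network."""
    rng = np.random.default_rng(seed)
    sizes = (n_features, *hidden, 1)
    return [(he_uniform(a, b, rng), np.zeros(b)) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Feed-forward pass; the single output neuron is kept linear (assumption)."""
    out = np.asarray(X, dtype=float)
    for i, (W, b) in enumerate(params):
        out = out @ W + b
        if i < len(params) - 1:
            out = leaky_relu(out)
    return out

params = init_network(n_features=8)  # N = 8 is a hypothetical number of predictors
n_weights = sum(W.size + b.size for W, b in params)
smb_pred = forward(params, np.zeros((3, 8)))
```

With eight input features this network has fewer than 1500 trainable parameters, which illustrates how small the architecture is compared with typical deep learning models.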
Like for many other geophysical processes found in na-
ture, extreme annual glacier-wide SMB values occur much
less often than average values, approximately following an
unbounded Gumbel-type distribution (Thibert et al., 2018).
From a statistical point of view, this means that the ANN will
“see” few extreme values and will give less importance to
them. For future projections in a warmer climate, extreme
positive glacier-wide SMB values should not be the main
concern of glacier models. However, extreme negative an-
nual glacier-wide SMB values should likely increase in fre-
quency, so it is in the modeller’s interest to reproduce them as
well as possible. Setting the sample weights as the inverse of
the probability density function during the ANN training can
partly compensate for the imbalance of a dataset. This boosts
the performance of the model for the extreme values at the
cost of sacrificing some performance on more average values,
which can be seen as an r²–RMSE trade-off (see Figs. 6
and 9 from the case study). The correct setting of the sample
weights allows the modeller to adapt the ANN to each dataset
and application.
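A minimal sketch of such sample weights is given below. The probability density is estimated here with a simple histogram, which is an assumption: the text does not specify how the density is obtained. The function name `inverse_density_weights`, the number of bins and the synthetic SMB values are likewise hypothetical.

```python
import numpy as np

def inverse_density_weights(y, n_bins=10):
    """Weight each sample by the inverse of a histogram estimate of its density."""
    y = np.asarray(y, dtype=float)
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    # Bin index of each sample; clip so that the maximum falls in the last bin.
    bin_idx = np.clip(np.digitize(y, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(bin_idx, minlength=n_bins)
    density = counts / (len(y) * np.diff(edges))
    weights = 1.0 / density[bin_idx]
    return weights / weights.mean()  # normalize to a mean weight of 1

# Synthetic annual glacier-wide SMB values (m w.e.), for illustration only.
rng = np.random.default_rng(0)
smb = rng.normal(loc=-0.8, scale=0.6, size=500)
weights = inverse_density_weights(smb)
```

The resulting array could then be passed as per-sample weights to the training routine, so that errors on rare extreme values count more in the loss.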
2.3 Glacier geometry update
Since the first component of ALPGM simulates annual
glacier-wide SMBs, these changes in mass need to be redis-
tributed over the glacier surface area in order to reproduce
glacier dynamics. This redistribution is applied using the Δh
parameterization. The idea was first developed by Jóhannes-
son et al. (1989) and then adapted and implemented by Huss
et al. (2008). The main idea behind it is to use two or more
DEMs covering the study area. These DEMs should be separated
by a sufficiently long period (discussed in detail below).
By subtracting them, the changes in glacier
surface elevation over time can be computed, which corre-
sponds to a change in thickness (considering no basal ero-
sion). Then, these thickness changes are normalized and con-
sidered to be a function of the normalized glacier altitude.
This Δh function is specific to each glacier and represents
the normalized glacier thickness evolution over its altitudi-
nal range. One advantage of such a parameterized approach
is that it implicitly considers the ice flow which redistributes
the mass from the accumulation to the ablation area. In order
to make the glacier volume evolve in a mass-conserving fash-
ion, we apply this function to the annual glacier-wide SMB
values in order to scale and distribute its change in volume.
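The mass-conserving scaling step can be sketched as follows, under stated assumptions: the function name `update_thickness`, the bulk ice density of 900 kg m-3, the pixel size and the toy values are all hypothetical, and the normalized thickness change is assumed to be already evaluated at each glacier pixel.

```python
import numpy as np

RHO_WATER = 1000.0  # kg m-3
RHO_ICE = 900.0     # kg m-3 (assumed bulk ice density)

def update_thickness(thickness, dh_norm, smb_we, pixel_area):
    """Distribute a glacier-wide SMB (m w.e.) over the glacier pixels.

    thickness: ice thickness per glacier pixel (m)
    dh_norm:   normalized thickness change per pixel (dimensionless)
    """
    thickness = np.asarray(thickness, dtype=float)
    dh_norm = np.asarray(dh_norm, dtype=float)
    glacier_area = pixel_area * np.count_nonzero(thickness > 0)
    # Ice volume change implied by the glacier-wide SMB.
    volume_change = smb_we * (RHO_WATER / RHO_ICE) * glacier_area
    # Scaling factor so that the distributed change sums to volume_change.
    scale = volume_change / (pixel_area * dh_norm.sum())
    # Clipping at zero is a simplification: it removes pixels that melt away.
    return np.maximum(thickness + scale * dh_norm, 0.0)

# Toy glacier with four 25 m x 25 m pixels, from accumulation area to tongue.
thickness = np.array([50.0, 40.0, 20.0, 10.0])
dh_norm = np.array([0.1, 0.3, 0.7, 1.0])
new_thickness = update_thickness(thickness, dh_norm, smb_we=-0.9, pixel_area=625.0)
```

As long as no pixel is clipped to zero, the sum of the thickness changes multiplied by the pixel area equals the volume change given by the glacier-wide SMB; once pixels disappear, the remaining mass would need to be redistributed to stay strictly mass conserving.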
Figure 3. Deep artificial neural network architecture used in ALPGM. The numbers indicate the number of neurons in each layer.
As discussed in Vincent et al. (2014), the time period be-
tween the two DEMs used to calibrate the method needs to
be long enough to show important ice thickness differences.
The criteria will of course depend on each glacier and each
period, but it will always be related to the achievable signal-
to-noise ratio. Vincent et al. (2014) concluded that for their
study on the Mer de Glace glacier (28.8 km2; mean altitude
of 2868 m a.s.l.) in the French Alps, the 2003–2008 period
was too short due to the delayed response of glacier geom-
etry to a change in surface mass balance. Indeed, the results
for that 5-year period diverged from the results from longer
periods. Moreover, the period should be long enough to be
representative of the glacier evolution, which will often en-
compass periods with strong ablation and others with no re-
treat or even with positive SMBs.
Therefore, by subtracting the two DEMs, the ice thickness
difference is computed for each specific glacier. These val-
ues can then be classified by altitude, thus obtaining an av-
erage glacier thickness difference for each pixel altitude. As
a change to previous studies (Vincent et al., 2014; Huss and
Hock, 2015; Hanzer et al., 2018; Vincent et al., 2019), we
no longer work with altitudinal transects but with individ-
ual pixels. In order to filter noise and artefacts coming from
the DEM raster files, different filters are applied to remove
outliers and pixels with unrealistic values, namely at the bor-
der of glaciers or where the surface slopes are high (refer to
Supplement for detailed information). Our methodology thus
allows us to better exploit the available spatial information
based on its quality and not on an arbitrary location within
3 Case study: French Alpine glaciers
3.1 Data
All data used in this case study are based on the French Alps
(Fig. 4), located in the westernmost part of the European
Alps, between 44° and 46°13′ N, 5.08° and 7.67° E. This region
is particularly suited for the validation of a glacier evolu-
tion model because of the wealth of available data. Moreover,
ALPGM has been developed as part of a hydro-glaciological
study to understand the impact of the retreat of French Alpine
glaciers in the Rhône river catchment (97 800 km2).
3.1.1 Glacier-wide surface mass balance
An annual glacier-wide SMB dataset, reconstructed using
remote-sensing based on changes in glacier volume and the
snow line altitude, is used (Rabatel et al., 2016). This dataset
is constituted by annual glacier-wide SMB values for 30
glaciers in the French Alps (Fig. 4) for 31 years, between
1984 and 2014. The great variety in topographical character-
istics of the glaciers included in the dataset, with good cov-
erage of the three main clusters or groups of glaciers in the
French Alps (Fig. 4), makes them an ideal training dataset
for the model. Each of the clusters represents a different
setup of glaciers with different contrasting latitudes (Écrins
and Mont-Blanc), longitudes (Écrins and Vanoise), glacier
size (smaller glaciers in Écrins and Vanoise vs. larger ones
in Mont-Blanc) and climatic characteristics, with a Mediter-
ranean influence towards the south of the study region. For
more details regarding this dataset refer to Rabatel et al.
(2016). Data from the Mer de Glace, Saint-Sorlin, Sarennes
and Argentière glaciers are also used, coming from field ob-
servations from the GLACIOCLIM observatory. For some of
these glaciers, glacier-wide SMB values have been available
since 1949, although only values from 1959 onwards were
used to match the meteorological reanalysis. This makes
a total of 32 glaciers (Argentière and Saint-Sorlin glaciers
belonging to the two datasets), representing 1048 annual
glacier-wide SMB values (taking into account some gaps in
the dataset).
3.1.2 Topographical glacier data and altimetry
The topographical data used for the training of the glacier-
wide SMB machine learning models are taken from the
multitemporal inventory of the French Alps glaciers (e.g.
Gardent et al., 2014), partly available through the GLIMS
Glacier Database (NSIDC, 2005). We worked with the 1967,
Figure 4. French Alpine glaciers used for model training and
validation and their classification into three clusters or regions
(Écrins, Vanoise, Mont-Blanc). Coordinates of the bottom left map corner:
44°32′ N, 5°40′ E. Coordinates of the top right map corner:
46°08′ N, 7°17′ E.
1985, 2003 and 2015 inventories (Gardent et al., 2014, with
2015 update). Between these dates, the topographical predic-
tors are linearly interpolated. On the other hand, in the glacier
evolution component of ALPGM (Fig. 1; step 5), the topo-
graphical data are recomputed every year for each glacier
from the evolving and annually updated glacier-specific ice
thickness and DEM rasters (Sect. 3.1.3). Since these raster
files are estimates for the year 2003 (Farinotti et al., 2019 for
the ice thickness), the full glacier evolution simulations can
start on this date at the earliest. For the computation of the
glacier-specific geometry update functions, two DEMs cov-
ering the whole French Alps have been used: (1) one from
2011 generated from SPOT5 stereo-pair images, acquired
on 15 October 2011, and (2) a 1979 aerial photogrammetric
DEM from the French National Geographic Institute (Institut
Géographique National – IGN), processed from aerial pho-
tographs taken around 1979. Both DEMs have an accuracy
between 1 and 4 m (Rabatel et al., 2016), and their uncertain-
ties are negligible compared to many other parameters in this
3.1.3 Glacier ice thickness
Glacier ice thickness data come from Farinotti et al. (2019),
hereafter F19, based on the Randolph Glacier Inventory v6.0
(RGI Consortium, 2017). The ice thickness values represent
the latest consensus estimate, averaging an ensemble of dif-
ferent methods based on the principles of ice flow dynamics
to invert the ice thickness from surface characteristics.
We also have ice thickness data acquired by diverse
field methods (seismic, ground-penetrating radar or hot-
water drilling; Rabatel et al., 2018) for four glaciers of
the GLACIOCLIM observatory. We compared these in situ
thickness data with the simulated ice thicknesses from F19
(refer to Supplement for detailed information). Although dif-
ferences can be found (locally up to 100 % in the worst
cases), no systematic biases were found with respect to
glacier local slope or glacier altitude; therefore, no sys-
tematic correction was applied to the dataset. The simu-
lated ice thicknesses for Saint-Sorlin (2 km2; mean altitude
of 2920 m a.s.l.; Écrins cluster) and Mer de Glace (28 km2;
mean altitude of 2890 m a.s.l.; Mont-Blanc cluster) glaciers
are satisfactorily modelled by F19. Mer de Glace’s tongue
presents local errors of about 50 m, peaking at 100 m (30 %
error) at around 2000–2100 m a.s.l., but the overall distri-
bution of the ice is well represented. Saint-Sorlin Glacier
follows a similar pattern, with maximum errors of around
20 m (20 % error) at 2900 m a.s.l. and a good representation of the ice distribution. The ice thicknesses for Argentière
Glacier (12.8 km2; mean altitude of 2808 m a.s.l.; Mont-Blanc
cluster) and Glacier Blanc (4.7 km2; mean altitude of
3196 m a.s.l.; Écrins cluster) are underestimated by F19, with
an almost constant bias with respect to altitude, as seen in
Rabatel et al. (2018). Therefore, a manual correction was ap-
plied to the F19 datasets for these two glaciers based on the
field observations from the GLACIOCLIM observatory. A
detailed plot (Fig. S2) presenting these results can be found
in the Supplement.
3.1.4 Climate data
In our French Alps case study, ALPGM is forced with
daily mean near-surface (2 m) temperatures, daily cumulative
snowfall and rain. The SAFRAN dataset is used to provide
these data close to the glaciers’ centroids. SAFRAN
meteorological data (Durand et al., 2009) are a reanalysis of
weather data including observations from different networks
and specific to the French mountain regions (Alps, Pyrenees
and Corsica). Instead of being structured as a grid, data are
provided at the scale of massifs, which are in turn divided
into altitude bands of 300 m and into five different aspects
(north, south, east, west and flat).
3.2 Glacier-wide surface mass balance simulations:
validation and results
In this section, we go through the selection of SMB pre-
dictors, we introduce the procedure for building machine
learning SMB models, we assess their performance in space
and time and we show some results of simulations using the
French Alpine glacier dataset.
3.2.1 Selection of predictors
Statistical relationships between meteorological and topo-
graphical variables with respect to glacier-wide SMB are fre-
quent in the literature for the European Alps (Hoinkes, 1968).
Martin (1974) performed a sensitivity study on the SMB of
the Saint-Sorlin and Sarennes glaciers (French Alps) with
respect to multi-annual meteorological observations for the
1957–1972 period. Martin (1974) obtained a multiple linear
regression function based on annual precipitation and sum-
mer temperatures, and he concluded that it could be further
improved by differentiating winter and summer precipitation.
Six and Vincent (2014) studied the sensitivity of the SMB
to climate change in the French Alps from 1998 to 2014.
They found that the variance in summer SMB is responsible
for over 90 % of the variance in the annual glacier-wide
SMB. Rabatel et al. (2013, 2016) performed
an extensive sensitivity analysis of different topographical
variables (slope of the lowermost 20 % of the glacier area,
mean elevation, surface area, length, minimum elevation,
maximum elevation, surface area change and length change)
with respect to glacier ELA and annual glacier-wide SMBs
of French Alpine glaciers. Together with Huss (2012), who
performed a similar study with SMB, they found the most significant
statistical relationships for the slope of the lowermost 20 % of the
glacier area, the mean elevation, the glacier surface area, the aspect,
and the easting and northing. Rabatel et al. (2013) also determined
that the climatic interannual variability is mainly responsi-
ble for driving the glacier equilibrium-line altitude temporal
variability, whereas the topographical characteristics are re-
sponsible for the spatial variations in the mean ELA.
Summer ablation is often accounted for by means of
CPDDs. However, in the vast majority of studies, accumu-
lation and ablation periods are defined between fixed dates
(e.g. 1 October–30 April for the accumulation period in the
northern mid-latitudes) based on optimizations. As discussed
in Zekollari and Huybrechts (2018), these fixed periods may
not be the best to describe SMB variability through statis-
tical correlation. Moreover, the ablation season will likely
evolve in the coming century due to climate warming. In or-
der to overcome these limitations, we dynamically calculate
each year the transition between accumulation and ablation
seasons (and vice versa) based on a chosen quantile in the
CPDD (Fig. S3). We found higher correlations between an-
nual SMB and ablation-period CPDD calculated using this
dynamical ablation season. On the other hand, this was not
the case for the separation between summer and winter snow-
fall. Therefore, we decided to keep constant periods to ac-
count for winter (1 October–1 May) and summer (1 May–
1 October) snowfalls and to keep them dynamical for the
CPDD calculation.
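A dynamical ablation season of this kind can be sketched as below. The exact rule used in ALPGM is described in the Supplement (Fig. S3) and is not reproduced here; this example assumes that the season starts when the CPDD reaches a chosen quantile of its annual total and ends when it reaches the complementary quantile. The function names, the 5 % quantile and the synthetic temperature series are hypothetical.

```python
import math


def cumulative_pdd(daily_temps):
    """Running sum of positive degree days (degC day) over a hydrological year."""
    total, series = 0.0, []
    for temp in daily_temps:
        total += max(temp, 0.0)
        series.append(total)
    return series


def ablation_season(daily_temps, quantile=0.05):
    """Return (start, end) day indices bracketing the ablation season.

    Assumed rule: the season starts when the CPDD reaches `quantile` of its
    annual total and ends when it reaches `1 - quantile` of that total.
    """
    cpdd = cumulative_pdd(daily_temps)
    annual = cpdd[-1]
    if annual == 0.0:
        return None  # no positive temperatures, hence no ablation season
    start = next(i for i, c in enumerate(cpdd) if c >= quantile * annual)
    end = next(i for i, c in enumerate(cpdd) if c >= (1.0 - quantile) * annual)
    return start, end


# Synthetic daily mean temperatures (degC) for a hydrological year starting on
# 1 October, with the warmest day placed around mid-July (day index 289).
temps = [-2.0 + 8.0 * math.cos(2.0 * math.pi * (d - 289) / 365.0) for d in range(365)]
start, end = ablation_season(temps)
ablation_cpdd = cumulative_pdd(temps[start:end + 1])[-1]
```

With this rule the season boundaries follow the temperature series of each year instead of fixed calendar dates, so a warmer year lengthens the ablation period automatically.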
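Following this literature review, vectors $\hat{\Omega}$ and $\hat{C}$ (Eq. 1) read as

\[
\hat{\Omega} = \left[\, \bar{Z} \;\; Z_{\max} \;\; \alpha_{20\,\%} \;\; \mathrm{Area} \;\; \mathrm{Lat} \;\; \mathrm{Lon} \;\; \Phi \,\right], \tag{2}
\]
\[
\hat{C} = \left[\, \Delta\mathrm{CPDD} \;\; \Delta\mathrm{WS} \;\; \Delta\mathrm{SS} \;\; \Delta T_{\mathrm{mon}} \;\; \Delta S_{\mathrm{mon}} \,\right], \tag{3}
\]

where $\bar{Z}$ is the mean glacier altitude, $Z_{\max}$ is the maximum
glacier altitude, $\alpha_{20\,\%}$ is the slope of the lowermost 20 %
glacier altitudinal range, “Area” is the glacier surface area,
“Lat” is glacier latitude, “Lon” is glacier longitude, $\Phi$ is the
cosine of the glacier’s aspect (north is 0), $\Delta\mathrm{CPDD}$ is the
CPDD anomaly, $\Delta\mathrm{WS}$ is the winter snow anomaly, $\Delta\mathrm{SS}$ is
the summer snow anomaly, $\Delta T_{\mathrm{mon}}$ is the average temperature
anomaly for each month for the hydrological year and
$\Delta S_{\mathrm{mon}}$ is the average snowfall anomaly for each month for
the hydrological year.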
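For the linear machine learning model training, we chose
a function $f$ that linearly combines $\hat{\Omega}$ and $\hat{C}$, generating new
combined predictors (Eq. 4). In $\hat{C}$, only $\Delta\mathrm{CPDD}$, $\Delta\mathrm{WS}$ and
$\Delta\mathrm{SS}$ are combined to avoid generating an unnecessary number
of predictors with the combination of $\hat{\Omega}$ with $\Delta T_{\mathrm{mon}}$ and
$\Delta S_{\mathrm{mon}}$:

\[
\begin{aligned}
\mathrm{SMB}_{g,y} = \big( & (a_1 \bar{Z} + a_2 Z_{\max} + a_3 \alpha_{20\,\%} + a_4 \mathrm{Area} + a_5 \mathrm{Lat} + a_6 \mathrm{Lon} + a_7 \Phi + a_8)\, \Delta\mathrm{CPDD} \\
& + (b_1 \bar{Z} + b_2 Z_{\max} + b_3 \alpha_{20\,\%} + b_4 \mathrm{Area} + b_5 \mathrm{Lat} + b_6 \mathrm{Lon} + b_7 \Phi + b_8)\, \Delta\mathrm{SS} \\
& + (c_1 \bar{Z} + c_2 Z_{\max} + c_3 \alpha_{20\,\%} + c_4 \mathrm{Area} + c_5 \mathrm{Lat} + c_6 \mathrm{Lon} + c_7 \Phi + c_8)\, \Delta\mathrm{WS} \\
& + d_1 \bar{Z} + d_2 Z_{\max} + d_3 \alpha_{20\,\%} + d_4 \mathrm{Area} + d_5 \mathrm{Lat} + d_6 \mathrm{Lon} + d_7 \Phi + d_8 \\
& + d_n \Delta T_{\mathrm{mon}} + d_m \Delta S_{\mathrm{mon}} + \varepsilon \big)_{g,y}.
\end{aligned} \tag{4}
\]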
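To make the structure of Eq. (4) concrete, the sketch below expands one glacier-year into its combined predictors. It assumes that the intercept term is left to the regression library and is therefore not counted as a column; the predictor names and the numerical values are hypothetical placeholders.

```python
TOPO_NAMES = ["Z_mean", "Z_max", "slope20", "area", "lat", "lon", "aspect_cos"]
CLIM_NAMES = ["dCPDD", "dSS", "dWS"]


def expand_predictors(topo, clim, dT_mon, dS_mon):
    """Build the combined predictors of Eq. (4) for one glacier and one year."""
    names, values = [], []
    for c in CLIM_NAMES:
        # Topographical predictors multiplied by a seasonal climate anomaly.
        for t in TOPO_NAMES:
            names.append(f"{t}*{c}")
            values.append(topo[t] * clim[c])
        # The climate anomaly on its own (the a8, b8 and c8 terms).
        names.append(c)
        values.append(clim[c])
    # Topographical predictors on their own (the d1 to d7 terms).
    for t in TOPO_NAMES:
        names.append(t)
        values.append(topo[t])
    # Twelve monthly temperature and twelve monthly snowfall anomalies.
    for label, monthly in (("dT", dT_mon), ("dS", dS_mon)):
        for month, value in enumerate(monthly, start=1):
            names.append(f"{label}_{month:02d}")
            values.append(value)
    return names, values


# Hypothetical values for a single glacier-year.
topo = {"Z_mean": 2900.0, "Z_max": 3400.0, "slope20": 15.0, "area": 2.0,
        "lat": 45.1, "lon": 6.2, "aspect_cos": 0.8}
clim = {"dCPDD": 120.0, "dSS": -0.1, "dWS": 0.3}
names, row = expand_predictors(topo, clim, [0.5] * 12, [-0.02] * 12)
```

Under these assumptions each glacier-year yields 55 columns, which matches the width of the input matrix given to the linear models.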
Thirty-two glaciers over variable periods between 31 and
57 years result in 1048 glacier-wide SMB ground truth val-
ues. For each glacier-wide SMB value, 55 predictors were
produced following Eq. (4): 33 combined predictors, with
1T mon and 1Smon accounting for 12 predictors each, one
for each month of the year. All these values combined pro-
duce a 1048 ×55 matrix, given as input data to the OLS
and Lasso machine learning libraries. Early Lasso tests (not
shown here) using only the predictors from Eqs. (2) and (3)
demonstrated the benefits of expanding the number of pre-
dictors, as is later shown in Fig. 5. For the training of the
ANN, no combination of topo-climatic predictors is done as
previously mentioned (Sect. 2.2.4), since it is already done
internally by the ANN.
3.2.2 Causal analysis
By running the Lasso algorithm on the dataset based on
Eqs. (2) and (3), we obtain the contribution of each predic-
tor in order to explain the annual glacier-wide SMB vari-
ance. Regarding the climatic variables, accumulation-related
predictors (winter snowfall; summer snowfall; and several
winter, spring and even summer months) appear to be the
most important predictors. Ablation-related predictors also
seem to be relevant, mainly with CPDD and summer- and
shoulder-season months (Fig. 5). Interestingly, meteorologi-
cal conditions in the transition months are crucial for the an-
nual glacier-wide SMB in the French Alps. (1) October tem-
perature is determinant for the transition between the abla-
tion and the accumulation season, favouring a lengthening of
melting when temperature remains positive or conversely al-
lowing snowfalls that protect the ice and contribute to the ac-
cumulation when temperatures are negative. (2) March snow-
fall has a similar effect: positive anomalies contribute to the
total accumulation at the glacier surface, and a thicker snow-
pack will delay the snow–ice transition during the ablation
season, leading to a less negative ablation rate (e.g. Fig. 6b;
Réveillet et al., 2018). Therefore, meteorological conditions
of these transition months seem to strongly impact the annual
glacier-wide SMB variability, since their variability oscil-
lates between positive and negative values, unlike the months
in the heart of summer or winter.
Secondly, topographical predictors do play a role,
albeit a secondary one. The slope of the 20 % lowermost alti-
tudinal range, the glacier area, the glacier mean altitude and
aspect help to modulate the glacier-wide SMB signal, which,
unlike point or altitude-dependent SMB, partially depends
on glacier topography (Huss et al., 2012). Moreover, lati-
tude and longitude are among the most relevant topographi-
cal predictors, which for this case study are likely to be used
as bias correctors of precipitation of the SAFRAN climate
reanalysis. SAFRAN is suspected of having a precipitation
bias, with higher uncertainties for high-altitude precipitation
(Vionnet et al., 2016). Since the French Alps present an alti-
tudinal gradient, with higher altitudes towards the eastern and
the northern massifs, we found that the coefficients linked to
latitude and longitude enhanced glacier-wide SMBs with a
north-east gradient.
3.2.3 Spatial predictive analysis
In order to evaluate the performance of the machine learn-
ing SMB models in space, we perform a LOGO (leave-one-
glacier-out) cross validation. For relatively small datasets
like the one used in this study, cross validation ensures that
the model is validated on the full dataset. Such validation
aims at understanding the model’s performance for predic-
tions on other glaciers for the same time period as during the
Figure 5. Contribution to the total variance in the 30 top topo-
climatic predictors out of 55 predictors using Lasso. Green bars
indicate predictors including topographical features, blue ones in-
clude accumulation-related features and red ones include ablation-
related features.
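The leave-one-glacier-out splitting used in this section can be sketched with a small generator. The function name and the toy sample list are hypothetical; the same function produces leave-one-year-out folds when the year is used as the group label.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) for each distinct group label."""
    for held_out in sorted(set(groups)):
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Toy dataset: one sample per (glacier, year) pair; the years are illustrative.
glaciers = ["Argentière", "Saint-Sorlin", "Mer de Glace"]
years = [2001, 2002, 2003, 2004]
samples = [(g, y) for g in glaciers for y in years]

# LOGO: every fold holds out all years of one glacier.
logo_folds = list(leave_one_group_out([g for g, _ in samples]))
# LOYO: every fold holds out one year for all glaciers.
loyo_folds = list(leave_one_group_out([y for _, y in samples]))
```

Each sample appears in exactly one test fold, so the model is evaluated on the full dataset while never being tested on a glacier it was trained on. Libraries such as scikit-learn provide an equivalent splitter (`LeaveOneGroupOut`).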
An important aspect is the comparison between the lin-
ear and nonlinear machine learning algorithms used in this
study. Steiner et al. (2005) already showed that a nonlinear
ANN improved the results with respect to a classic stepwise
multiple linear regression. Here, we draw a similar comparison
using more advanced methods for a larger dataset:
OLS and Lasso as linear machine learning algorithms and a
deep ANN as a nonlinear one. We observed significant dif-
ferences between OLS, Lasso and deep learning in terms of
both explained variance (r2) and accuracy (RMSE) of pre-
dicted glacier-wide SMBs. On average, we found improve-
ments between +55 % and +61 % in the explained variance
(from 0.49 to 0.76–0.79) using the nonlinear deep ANN com-
pared to Lasso, whereas the accuracy was improved up to
45 % (from 0.74 to 0.51–0.62). This means that 27 % more
variance is explained with a nonlinear model in the spatial
dimension for glacier-wide SMB in this region. See Fig. 6
for a full summary of the results. An interesting consequence
of the nonlinearity of the ANN is the fact that it captures
extreme SMB values better compared to a linear model. A
linear model can correctly approximate the main cluster of
values around the median, but the linear approximation per-
forms poorly for extreme annual glacier-wide SMB values.
The ANN solves this problem, with an increased explained
variance which translates into a better accuracy for extreme
SMB values, even without the use of sample weights (Fig. 6).
As a consequence, the added value of deep learning is es-
pecially relevant on glaciers with steeper annual changes in
glacier-wide SMB (Fig. 7a). The use of sample weights can
scale this factor up or down, thus playing with a performance
trade-off depending on how much one wants to improve the
model’s behaviour for extreme SMB values.
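The percentages quoted in this section mix relative and absolute differences, which a few lines of arithmetic make explicit. The sketch below uses only the r2 and RMSE values given in the text. It assumes that the 27 % figure is the absolute difference in r2 and that the 45 % accuracy gain is the ratio of the two RMSE values minus one; both readings are inferred from the numbers, not stated by the authors.

```python
def relative_gain(reference, improved):
    """Relative change of `improved` with respect to `reference`, in percent."""
    return 100.0 * (improved - reference) / reference


r2_lasso, r2_ann_low, r2_ann_high = 0.49, 0.76, 0.79
rmse_lasso, rmse_ann_best = 0.74, 0.51

# Relative gain in explained variance: +55 % and +61 %.
r2_relative = [round(relative_gain(r2_lasso, r2)) for r2 in (r2_ann_low, r2_ann_high)]

# Absolute gain in explained variance: 0.27, i.e. 27 % more variance explained.
r2_absolute = round(r2_ann_low - r2_lasso, 2)

# Two possible readings of the accuracy gain from the same RMSE values.
rmse_ratio_gain = round(100.0 * (rmse_lasso / rmse_ann_best - 1.0))  # 45
rmse_reduction = round(-relative_gain(rmse_lasso, rmse_ann_best))    # 31
```

The ratio-based reading reproduces the 45 % quoted for accuracy, whereas expressing the same change as a reduction of the RMSE relative to Lasso would give about 31 %.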
Overall, deep learning results in a lower error throughout
all the glaciers in the dataset when evaluated using LOGO
cross validation (Fig. 8). Moreover, the bias is also system-
atically reduced, but it is strongly correlated to the one from
3.2.4 Temporal predictive analysis
In order to evaluate the performance of the machine learn-
ing SMB models in time, we perform a LOYO (leave-one-
year-out) cross validation. This validation serves to under-
stand the model’s performance for past or future periods out-
side the training time period. The best results achieved for
Lasso make no use of any monthly average temperature or
snowfall, suggesting that these features are not relevant for
temporal predictions unlike the spatial case.
As in Sect. 3.2.3, the results between the linear and non-
linear machine learning algorithms were compared. Inter-
estingly, using LOYO, the differences between the different
models were even greater than for spatial validation, reveal-
ing the more complex nature of the information in the tempo-
ral dimension. As illustrated by Fig. 9, we found remarkable
improvements between the linear Lasso and the nonlinear
deep learning in both the explained variance (between +94 %
and +108 %) and accuracy (between +32 % and +58 %).
This implies that 35 % more variance is explained using a
nonlinear model in the temporal dimension for glacier-wide
SMB in this region. Deep learning manages to keep
very similar performances between the spatial and tempo-
ral dimensions, whereas the linear methods see their perfor-
mance affected most likely due to the increased nonlinearity
of the SMB reaction to meteorological conditions.
A more detailed year-by-year analysis reveals interest-
ing information about the glacier-wide SMB data structure.
As seen in Fig. 10, the years with the worst deep learning
precision are 1984, 1985 and 1990. All three hydrological
years present high spatial variability in observed (or remotely
sensed) SMBs: very positive SMB values in general for 1984
and 1985, with few slightly negative values, and extremely
negative SMB values in general for 1990, with few almost
neutral values. These complex configurations are clearly out-
liers within the dataset which push the limits of the nonlin-
ear patterns found by the ANN. The situation becomes even
more evident with Lasso, which struggles to resolve these
complex patterns and often performs poorly where the ANN
succeeds (e.g. years 1996, 2012 or 2014). The important bias
present only with Lasso is representative of its lack of com-
plexity towards nonlinear structures, which results in an un-
derfitting of the data. The average error is not bad, but it
shows a high negative bias for the first half of the period,
which mostly has slightly negative glacier-wide SMBs, and
a high positive bias for the second half of the period, which
mostly has very negative glacier-wide SMB values.
3.2.5 Spatiotemporal predictive analysis
Once the specific performances in the spatial and temporal
dimensions have been assessed, the performance in both di-
mensions at the same time is evaluated using leave-some-
years-and-glaciers-out (LSYGO) cross validation. Sixty-four
folds were built, with test folds being comprised of data for
two random glaciers on two random years and train folds of
all the data except the two years (for all glaciers) and the two
glaciers (for all years) present in the test fold. These combi-
nations are quite strict, implying that for every four tested
values, we need to drop between 123 and 126 values for
training, depending on the glacier and year, to respect the
spatiotemporal independence (Roberts et al., 2017).
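The construction of one such fold can be sketched as follows. The function name, the random seed and the regular 5 x 6 toy grid are hypothetical; with gaps in the data, a test fold may contain fewer than four values.

```python
import random


def lsygo_fold(samples, n_glaciers=2, n_years=2, rng=None):
    """Build one leave-some-years-and-glaciers-out fold.

    `samples` is a list of (glacier, year) pairs. Returns (test_idx, train_idx).
    """
    if rng is None:
        rng = random.Random()
    glaciers = sorted({g for g, _ in samples})
    years = sorted({y for _, y in samples})
    test_glaciers = set(rng.sample(glaciers, n_glaciers))
    test_years = set(rng.sample(years, n_years))
    # Test: the selected glaciers in the selected years only.
    test_idx = [i for i, (g, y) in enumerate(samples)
                if g in test_glaciers and y in test_years]
    # Train: neither a selected glacier nor a selected year.
    train_idx = [i for i, (g, y) in enumerate(samples)
                 if g not in test_glaciers and y not in test_years]
    return test_idx, train_idx


# Toy grid of 5 glaciers x 6 years without gaps.
samples = [(g, y) for g in "ABCDE" for y in range(2000, 2006)]
test_idx, train_idx = lsygo_fold(samples, rng=random.Random(42))
dropped = len(samples) - len(test_idx) - len(train_idx)
```

On this toy grid, 4 values are tested, 12 are used for training and 14 are discarded, which illustrates how strict the independence requirement is.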
The performance of LSYGO is similar to LOYO, with an
RMSE of 0.51 m w.e. a⁻¹ and a coefficient of determination
of 0.77 (Fig. S5). This is reflected in the fact that very similar
ANN hyperparameters were used for the training. This
means that the deep learning SMB model is successful in
generalizing, and it does not overfit the training data.
3.3 Glacier geometry evolution: validation and results
As mentioned in Sect. 2.3, the Δh parameterization has been
widely used in many studies (e.g. Huss et al., 2008, 2010;
Vincent et al., 2014; Huss and Hock, 2015, 2018; Hanzer
et al., 2018; Vincent et al., 2019). It is not in the scope of
this study to evaluate the performance of this method, but we
present the approach developed in ALPGM to compute the
Δh functions and show some examples for single glaciers to
illustrate how these glacier-specific functions perform com-
pared to observations. For the studied French Alpine glaciers,
the 1979–2011 period is used. This period was proved by
Vincent et al. (2014) to be representative of Mer de Glace’s
secular trend. Other sub-periods could have been used, but
it was shown that they did not necessarily improve the per-
formance. In addition, the 1979 and 2011 DEMs are the
only ones available that cover all the French Alpine glaciers.
Within this period, some years with neutral to even positive
surface mass balances in the late 1970s and early 1980s can
be found as well as a remarkable change from 2003 onward
with strongly negative surface mass balances, following the
heatwave that severely affected the western Alps in summer
2003.
Figure 6. Evaluation of modelled annual glacier-wide SMB against the ground truth SMB data (both in m w.e. a⁻¹) using leave-one-glacier-out
cross validation. The colour (purple–orange for linear; blue–green for nonlinear) indicates frequency based on the probability density
function. The black line indicates the reference one-to-one line. (a) Scatter plot of the OLS model results. (b) Scatter plot of the Lasso linear
model results. Scatter plots of the deep artificial neural network nonlinear models without (c) and with (d) sample weights.
Figure 7. Examples of cumulative glacier-wide SMB (m w.e.) simulations against the ground truth SMB data. The pink envelope indicates
the accumulated uncertainties from the ground truth data. The deep learning SMB model has not been trained with sample weights in these
Figure 8. Mean absolute error (MAE) and bias (vertical bars) for each glacier of the training dataset structured by clusters for the 1984–2014
LOGO glacier-wide SMB simulation. No clear regional error patterns arise.
Figure 9. Evaluation of modelled annual glacier-wide SMB against the ground truth SMB data (both in m w.e. a⁻¹) using leave-one-year-out
cross validation. The colour (purple–orange for linear; blue–green for nonlinear) indicates frequency based on the probability density
function. The black line indicates the reference one-to-one line. (a) Scatter plot of the OLS model results. (b) Scatter plot of the Lasso linear
model results. Scatter plots of the deep artificial neural network nonlinear models without (c) and with (d) sample weights.
Figure 10. Mean absolute error (MAE) and bias (vertical bars) for each year of the training dataset for the 1984–2014 LOYO glacier-wide
SMB simulation.
The glacier-specific Δh functions are computed for
glaciers ≥ 0.5 km2, which represented about 80 % of the
whole glacierized surface of the French Alps in 2015 (some
examples are illustrated in the Supplement Fig. S4). For the
rest of very small glaciers (<0.5 km2), a standardized flat
function is used in order to make them shrink equally at all
altitudes. This is done to simulate the fact that generally, the
equilibrium line of very small glaciers surpasses the glacier’s
maximum altitude, thus shrinking from all directions and al-
titudes in summer. Moreover, due to their reduced size and
altitudinal range, the ice flow no longer has the same impor-
tance as for larger- or medium-sized glaciers.
In order to evaluate the performance of the parameterized
glacier dynamics of ALPGM, coupled with the glacier-wide
SMB component, we compared the simulated glacier area of
the 32 studied glaciers with the observed area in 2015 from
the most up-to-date glacier inventory in the French Alps.
Simulations were started in 2003, for which we used the F19
ice thickness dataset. In order to take into account the ice
thickness uncertainties, we ran three simulations with different
versions of the initial ice thickness: the original data and
the original ice thickness changed by −30 % and +30 %, in agreement
with the uncertainty estimated by the authors. Moreover, in
order to take into account the uncertainties in the Δh glacier
geometry update function computation, we added a ±10 %
variation in the parameterized functions (Fig. 11).
Overall, the results illustrated in Fig. 11 show good agree-
ment with the observations. Even for a 12-year period, the
initial ice thickness still has the largest uncertainty, with al-
most all glaciers falling within the observed area when tak-
ing it into account. The mean error in simulated surface area
was 10.7 % with the original F19 ice thickness dataset. Other
studies using the Δh parameterization have already shown that
the initial ice thickness is the most important uncertainty in
glacier evolution simulations, together with the choice of a
global climate model (GCM) for future projections (Huss and
Hock, 2015).
4 Discussion and perspectives
4.1 Linear methods still matter
Despite the fact that deep learning often outperforms linear
machine learning and statistical methods, there is still a place
for such methods in modelling. Indeed, unlike ANNs, simpler
regularized linear models such as Lasso allow an easy
interpretation of the coefficients associated with each input feature,
which helps to understand the contribution of each of
the chosen variables to the model. This means that linear ma-
chine learning methods can be used for both prediction and
causal analysis. Training a linear model in parallel with an
ANN therefore has the advantage of providing a simpler lin-
ear alternative which can be used to understand the dataset.
Moreover, seeing the contribution of each coefficient, one
can reduce the complexity of the dataset by keeping only
the most significant predictors. Finally, a linear model also
serves as a reference to highlight and quantify the nonlinear
gains obtained by deep learning.
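The way Lasso discards weak predictors can be illustrated with a minimal coordinate-descent solver. This is a didactic sketch, not the library used in this study: it minimizes (1/(2n))·||y − Xw||² + α·||w||₁ for centred data without an intercept, and the three orthogonal toy predictors are hypothetical.

```python
def soft_threshold(rho, alpha):
    """Shrink rho towards zero by alpha, returning exactly 0 inside [-alpha, alpha]."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw for the initial w = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(x * x for x in col) / n
            if z == 0.0:
                continue  # constant-zero column carries no information
            # Correlation of column j with the residual that ignores feature j.
            rho = sum(x * (r + x * w[j]) for x, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - x * delta for r, x in zip(resid, col)]
                w[j] = new_w
    return w


# Three orthogonal toy predictors: two informative and one nearly irrelevant.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [2.0 * a - 1.0 * b + 0.05 * c for a, b, c in X]
coefs = lasso_coordinate_descent(X, y, alpha=0.1)
```

The two informative coefficients are shrunk by α (to about 1.9 and −0.9), while the third is set exactly to zero, which is the property that allows only the most significant predictors to be retained.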
4.2 Training deep learning models with spatiotemporal
The creation and training of a deep ANN requires a certain
knowledge and strategy with respect to the data and study fo-
cus. When working with spatiotemporal data, the separation
between training and validation becomes tricky. The spatial
and temporal dimensions in the dataset cannot be ignored
and strongly affect the independence between training and
validation data (Roberts et al., 2017; Oliveira et al., 2019).
Depending on how the cross validation is performed, the ob-
tained performance will be indicative of one of these two di-
Figure 11. Simulated glacier areas for the 2003–2015 period for the 32 study glaciers using a deep learning SMB model without weights.
Squares indicate the different F19 initial ice thicknesses used, taking into account their uncertainties, and triangles indicate the uncertainties
linked to the glacier-specific geometry update functions. For better visualization, the figure is split into two, with the two largest French
glaciers on the right.
mensions. As is shown in Sect. 3.2.3, the ANNs and espe-
cially the linear modelling approaches had more success in
predicting SMB values in space than in time. This is mostly
due to the fact that the glacier-wide SMB signal has a greater
variability and nonlinearities in time than in space, with cli-
mate being the main driver of the annual fluctuations in SMB,
whereas geography, and in particular the local topography,
modulates the signal between glaciers (Huss, 2012; Raba-
tel et al., 2016; Vincent et al., 2017). Consequently, linear
models find it easier to make predictions in a given period of
time for other glaciers elsewhere in space than for time peri-
ods outside the training. Nonetheless, the deep learning SMB
models were capable of equally capturing the complex non-
linear patterns in both the spatial and temporal dimensions.
In order to cope with the specific challenges related to
each type of cross validation, there are several hyperparam-
eters that can be modified to adapt the ANN’s behaviour.
Due to the long list of hyperparameters intervening in an
ANN, it is not advisable to select them using brute force
with a grid search or cross validation. Instead, initial tests
are performed in a subset of random folds to narrow down
the range of best-performing values before moving to the
full final cross validations for the final hyperparameter se-
lection. Moreover, the ANN architecture plays an important
role: the number of neurons as well as the number of hidden
layers will determine the ANN’s complexity and its capabil-
ities to capture hidden patterns in the data. But the larger the
architecture, the higher the chances are to overfit the data.
This undesired effect can be counterbalanced using regular-
ization. The amount of regularization (dropout and Gaussian
noise in our case; see Sect. 2.2.4) used in the training of
the ANN necessarily introduces some trade-offs. The greater
the dropout, the more we will constrain the learning of the
ANN, so the higher the generalization will be, until a cer-
tain point, where relevant information will start to become
lost and performance will drop. On the other hand, the learning
rate used in the stochastic gradient descent, which
tries to minimize the loss function, also plays an important
role: smaller learning rates generally result in a slower convergence
towards a minimum of the loss function, which tends to produce
models with better generalization. By balancing all these different
effects, one can achieve the ratio of accuracy to generaliza-
tion that best suits a certain dataset and model in terms of
performance. Nonetheless, one key aspect in machine learn-
ing models is data: expanding the training dataset in the fu-
ture will allow us to increase the complexity of the model and
its performance. Consequently, machine learning models see
their performance improved as time goes by, with new data
becoming available for training.
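To make the regularization trade-off described above more concrete, the following is a minimal sketch of the two mechanisms named in the text, dropout and Gaussian noise, written in plain Python. It assumes the common "inverted dropout" formulation and is not taken from ALPGM; the function names and the chosen rates are illustrative assumptions only.

```python
import random


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate`, rescale survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)  # regularization is only active during training
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def gaussian_noise(values, stddev, rng, training=True):
    """Add zero-mean Gaussian noise to each value during training only."""
    if not training:
        return list(values)
    return [v + rng.gauss(0.0, stddev) for v in values]


rng = random.Random(0)
activations = [1.0] * 10000
dropped = dropout(activations, rate=0.3, rng=rng)
noisy = gaussian_noise(activations, stddev=0.1, rng=rng)
```

A higher `rate` removes more information per training step, which constrains learning more strongly; beyond some point too little signal survives, in line with the drop in performance described in the text.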
Although the features used as input for the model are
classical descriptors of the topographical and meteorological
conditions of the glaciers, it is worth mentioning that apply-
ing the model in different areas or with different data sources
would likely require a re-training of the model due to pos-
sible biases: different regions on the globe may have other
descriptors of importance, but also different measuring tech-
niques will likely have different biases.
4.3 Perspectives on future applications of deep learning
in glaciology
The currently used meteorological variables in the deep ANN
of ALPGM’s SMB component are based on the classic
degree-day approach, which relies only on temperature and
precipitation. However, the model could be trained with vari-
ables involved in more complex models, such as SEB-type
models, for which the longwave and shortwave radiation as
well as the turbulent fluxes and albedo intervene. The cur-
rent model framework allows flexibility in the choice and
number of input variables that can reflect different degrees
of complexity for the resolved processes. Despite the fact
that it has been shown that for glaciers in the European Alps
there is almost no added value in transitioning from a sim-
ple degree day to a SEB model for annual glacier-wide SMB
simulations (e.g. Réveillet et al., 2017), it could be an in-
teresting way to expand the training dataset for glaciers in
tropical and subtropical regions, where shortwave radiation
plays a much more important role (Benn and Evans, 2014).
Maussion et al. (2015) followed a similar approach with lin-
ear machine learning in order to calibrate a regression-based
downscaling model that linked local SEB–SMB fluxes to at-
mospheric reanalysis variables.
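As a small illustration of the degree-day type predictors mentioned above, the sketch below derives a cumulative positive degree-day (CPDD) sum and a cumulative snowfall total from daily series. It assumes the usual definition of CPDD as the sum of daily mean air temperatures above 0 °C; the accumulation period, altitude adjustments and units actually used in ALPGM are not stated in this section, so the function names and sample values are hypothetical.

```python
def cumulative_positive_degree_days(daily_mean_temps_c):
    """Sum of daily mean air temperatures above 0 degrees C."""
    return sum(t for t in daily_mean_temps_c if t > 0.0)


def cumulative_snowfall(daily_snowfall):
    """Total snowfall over the period, in the units of the input series."""
    return sum(daily_snowfall)


# Four illustrative days (values are made up).
temps = [-2.0, 0.0, 1.5, 3.0]
snow = [4.0, 0.0, 0.0, 1.0]

cpdd = cumulative_positive_degree_days(temps)
snow_total = cumulative_snowfall(snow)
```

Predictors of this kind need only temperature and precipitation, which is what distinguishes them from the radiation and turbulent-flux terms an SEB-type input set would require.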
In this work, we also evaluated the resilience of the deep
learning approach: since many glacierized regions in the
world do not have the same amount of data as used in this
study, we trained an ANN only with monthly average temperature
and snowfall, without any topographical predictors,
to see to what extent the algorithm is capable of learning
from minimal data. The reduced model reached
a coefficient of determination of 0.68 (against 0.76 from
the full model) and an RMSE of 0.59 m w.e. a⁻¹ (against 0.51
from the full model). These results indicate that meteorolog-
ical data are the primary source of information, determin-
ing the interannual high-frequency variability in the glacier-
wide SMB signal. On the other hand, the “bonus” of topo-
graphical data helps to modulate the high-frequency climate
signal by adding a low-frequency component to better dif-
ferentiate glaciers and the topographical characteristics in-
cluded in the glacier-wide SMB data (Huss et al., 2012).
The fact that glacier-wide SMB is influenced by glacier topography
raises the question of whether the simulated
glacier geometries can correctly reproduce topographical observations,
which is needed to represent the topographical feedback
present in glacier-wide SMB signals. These aspects are anal-
ysed and discussed in Sect. S3 of the Supplement, showing
small differences between the observed and simulated topo-
graphical parameters for the 2003–2015 period (Table S1).
Additionally, the simulated glacier-wide SMBs using simu-
lated topographical parameters show very small differences
(0.069 m w.e. a⁻¹ on average) compared to simulations using
topographical observations (Fig. S6). Since glacier ice
thickness estimates date from the year 2003 (Farinotti et al.,
2019), our validation period can only encompass 12 years.
According to all the available data for validation, our model
seems to be able to correctly reproduce the glacier geometry
evolution, but since the 2003–2015 validation period is quite
short, the validation performance might not be representa-
tive when dealing with future glacier evolution projections
of several decades. Consequently, these aspects will have to
be taken into account for future studies using this modelling
approach for projections. Moreover, the cross-validation re-
sults of the SMB model or models (Figs. 6–10) are represen-
tative of the performance of predictions using topographical
observations. Despite the small differences found between
simulated and observed topographical parameters, the SMB
model’s performance might be slightly different than the per-
formance found in the cross-validation analysis. Therefore, it
would be interesting for future studies to investigate the use
of point SMB data, which could avoid the complexities related
to the influence of glacier topography in glacier-wide SMB.
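The two scores used in this comparison, the coefficient of determination and the RMSE, can be computed from paired observed and simulated SMB values as sketched below. This assumes the standard definition r² = 1 − SS_res/SS_tot; the section does not state which exact formulation was used, and the sample values are made up for illustration.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between paired observations and simulations."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def r_squared(observed, simulated):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not data from the study).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]

score_rmse = rmse(obs, sim)
score_r2 = r_squared(obs, sim)
```

The RMSE carries the units of the SMB series, while r² is dimensionless, so the two are complementary when comparing a reduced model against the full one.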
A nonlinear deep learning SMB component like the one
used for ALPGM could provide an interesting alternative to
classical SMB models used for regional modelling. The com-
parison with other SMB models is beyond the scope of this
study, but it would be worth investigating to quantify the spe-
cific gains that could be achieved by switching to a deep
learning modelling approach. Nonetheless, the linear machine
learning models trained with the CPDD and cumulative
snowfall used in this study behave in a similar way to a calibrated
temperature-index model. Even so, we believe that future
efforts should be directed towards physics-informed data-science
glacier SMB and evolution modelling. Adding physical
constraints in ANNs, with the use of physics-based loss
functions and/or architectures (e.g. Karpatne et al., 2018),
would allow us to improve our understanding and confidence
in predictions, to reduce our dependency on big datasets, and
to start bridging the gap between data science and physi-
cal methods (Karpatne et al., 2017; de Bezenac et al., 2018;
Lguensat et al., 2019; Rackauckas et al., 2020). Deep learning
can be of special interest when applied to the reconstruction
of SMB time series. More and more SMB data are becoming
available thanks to the advances in remote sensing
(e.g. Brun et al., 2017; Zemp et al., 2019; Dussaillant et al.,
2019), but these datasets often cover limited areas and the
most recent time period in the studied regions. An interesting
way of expanding a dataset would be to use a deep learning
approach to fill the data gaps based on the relationships found
in a subset of glaciers as in the case study presented here. Past
SMB time series of vast glacierized regions could thereby be
reconstructed, with potential applications in remote glacier-
ized regions such as the Andes or High Mountain Asia.
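As a rough illustration of what a physics-based loss function, as mentioned in this paragraph, could look like, the sketch below adds a penalty term to an ordinary mean-squared error. The constraint is purely hypothetical and is not taken from this study or from the cited works: it assumes that, all else being equal, a prediction made with a higher CPDD should not yield a more positive SMB than the one made with a lower CPDD. All function and parameter names are assumptions introduced for this example.

```python
def mse(predicted, observed):
    """Mean-squared error: the usual data-fit term."""
    n = len(predicted)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n


def monotonicity_penalty(pred_low_cpdd, pred_high_cpdd):
    """Mean violation of the assumed rule SMB(high CPDD) <= SMB(low CPDD)."""
    violations = [max(0.0, hi - lo) for lo, hi in zip(pred_low_cpdd, pred_high_cpdd)]
    return sum(violations) / len(violations)


def physics_guided_loss(predicted, observed, pred_low_cpdd, pred_high_cpdd, weight=1.0):
    """Data-fit term plus a weighted physical-inconsistency term (hypothetical)."""
    return mse(predicted, observed) + weight * monotonicity_penalty(
        pred_low_cpdd, pred_high_cpdd
    )


# Perfect data fit, but one of two perturbed pairs violates the assumed rule.
loss = physics_guided_loss(
    predicted=[1.0, 2.0],
    observed=[1.0, 2.0],
    pred_low_cpdd=[0.5, 0.5],
    pred_high_cpdd=[0.4, 0.7],
    weight=2.0,
)
```

Because the penalty is evaluated on model outputs alone, it can in principle be computed on inputs without SMB observations, which is one way such constraints might reduce the dependency on large labelled datasets.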
5 Conclusions
We presented a novel approach to simulate and reconstruct
glacier-wide SMB series using deep learning for individ-
ual glaciers at a regional scale. This method has been in-
cluded as a SMB component in ALPGM (Bolibar, 2019),
a parameterized regional glacier evolution model, follow-
ing an alternative approach to most physical and process-
based glacier models. The data-driven glacier-wide SMB
modelling component is coupled with a glacier geometry
update component, based on glacier-specific parameterized
functions. Deep learning is shown to outperform linear meth-
ods for the simulation of glacier-wide SMB with a case study
of French Alpine glaciers. By means of cross validation,
we demonstrated how important nonlinear structures (up to
35 %) coming from the glacier and climate systems in both
the spatial and temporal dimensions are captured by the deep
ANN. This nonlinearity substantially
improved the explained variance and accuracy compared to
linear statistical models, especially in the more complex temporal
dimension. As we have shown in our case study, deep
ANNs are capable of dealing with relatively small datasets,
and they present a wide range of configurations to generalize
and prevent overfitting. Machine learning models benefit
from the increasing amount of available data, so
their performance can be expected to improve as more data become available.
Deep learning should be seen as an opportunity by the
glaciology community. Its good performance for SMB mod-
elling in both the spatial and temporal dimensions shows
how relevant it can be for a broad range of applications.
Combined with in situ or remote-sensing SMB estimations,
it can serve to reconstruct SMB time series for regions or
glaciers with already-available data for past and future pe-
riods, with potential applications in remote regions such as
the Andes or High Mountain Asia. Moreover, deep learning
can be used as an alternative to classical SMB models,
as is done in ALPGM: important nonlinearities from
the glacier and climate systems are potentially ignored by
these mostly linear models, which could give an advantage to
deep learning models in regional studies. It might still be too
early for the development of such models in certain regions
which lack consistent datasets with good spatial and tempo-
ral coverage. Nevertheless, upcoming methods adding phys-
ical knowledge to constrain neural networks (e.g. Karpatne
et al., 2018; Rackauckas et al., 2020) could provide inter-
esting solutions to the limitations of our current method.
By incorporating prior physical knowledge into neural net-
works, the dependency on big datasets would be reduced,
and it would allow transitioning towards more interpretable
physics-informed data-science models.
Code and data availability. The source code of
ALPGM is available at
ALPGM (last access: February 2020) with its DOI
(, Bolibar, 2020) for
the v1.1 release. All scripts used to generate plots and results are
included in the repository. The detailed information regarding
glaciers used in the case study and the SMB model performance
per glacier are included in the Supplement.
Supplement. The supplement related to this article is available on-
line at:
Author contributions. JB developed ALPGM, analysed the results
and wrote the paper. AR conducted the glaciological analysis, con-
tributed with the remote-sensing SMB data and conceived the study,
together with IG, TC and JB. CG and JB developed the deep and
machine learning modelling approach, and ES contributed in the
statistical analysis. All authors discussed the results and helped with
developing the paper.
Competing interests. The authors declare that they have no conflict
of interest.
Acknowledgements. This study has been made possible through funding
from the ANR VIP_Mont-Blanc (ANR-14 CE03-0006-03), the
BERGER project (Pack Ambition Recherche funded by the Région
Auvergne-Rhône-Alpes), the LabEx OSUG@2020 (Investissements
d’avenir – ANR10 LABX56), and the CNES via the KALEIDOS-
Alpes and ISIS (Initiative for Space Innovative Standard) projects.
Data have been provided by Météo-France (SAFRAN dataset) and
the GLACIOCLIM National Observation Service (in situ glaciolog-
ical data). The authors would like to thank Eduardo Pérez-Pellitero
(Max Planck Institute for Intelligent Systems) for the interesting
discussions regarding deep learning, Benjamin Renard (Irstea Lyon)
for his insightful comments on statistics, and Fabien Maussion and
an anonymous reviewer for their constructive review comments
which have helped to improve the overall quality and clarity of the paper.
Financial support. This research has been supported by the ANR
VIP_Mont-Blanc (grant no. ANR-14 CE03-0006-03), the Pack Am-
bition Recherche funded by the AuRA region (BERGER project
grant), the LabEx OSUG@2020 (grant no. ANR10 LABX56) and
the CNES (KALEIDOS-Alpes and ISIS projects grants).
Review statement. This paper was edited by Valentina Radic and
reviewed by Fabien Maussion and one anonymous referee.
References
Beniston, M., Farinotti, D., Stoffel, M., Andreassen, L. M., Cop-
pola, E., Eckert, N., Fantini, A., Giacona, F., Hauck, C., Huss,
M., Huwald, H., Lehning, M., López-Moreno, J.-I., Magnusson,
J., Marty, C., Morán-Tejéda, E., Morin, S., Naaim, M., Proven-
zale, A., Rabatel, A., Six, D., Stötter, J., Strasser, U., Terzago, S.,
and Vincent, C.: The European mountain cryosphere: a review of
its current state, trends, and future challenges, The Cryosphere,
12, 759–794, 2018.
Benn, D. I. and Evans, D. J. A.: Glaciers & glaciation,
Routledge, New York, NY, USA, 2nd edn., available
patron&extendedid=P_615876_0 (last access: February 2020),
oCLC: 878863282, 2014.
Bolibar, J.: JordiBolibar/ALPGM: ALPGM v1.0, 2019.
Bolibar, J.: JordiBolibar/ALPGM: ALPGM v1.1, 2020.
Brun, F., Berthier, E., Wagnon, P., Kääb, A., and Treichler, D.:
A spatially resolved estimate of High Mountain Asia glacier
mass balances from 2000 to 2016, Nature Geosci., 10, 668–673, 2017.
Carlson, B. Z., Georges, D., Rabatel, A., Randin, C. F., Renaud,
J., Delestrade, A., Zimmermann, N. E., Choler, P., and Thuiller,
W.: Accounting for tree line shift, glacier retreat and primary
succession in mountain plant distribution models, Diversity and
Distributions, 20, 1379–1391,,
Chollet, F.: Keras, available at: (last access: Febru-
ary 2020), 2015.
Consortium, R. G. I.: Randolph Glacier Inventory 6.0, type: dataset, 2017.
de Bezenac, E., Pajot, A., and Gallinari, P.: Deep Learning
for Physical Processes: Incorporating Prior Scientific Knowledge,
arXiv:1711.07970 [cs, stat], 2018.
Ducournau, A. and Fablet, R.: Deep learning for ocean re-
mote sensing: an application of convolutional neural net-
works for super-resolution on satellite-derived SST data,
in: 2016 9th IAPR Workshop on Pattern Recognition in
Remote Sensing (PRRS), 1–6, IEEE, Cancun, Mexico, 2016.
Durand, Y., Laternser, M., Giraud, G., Etchevers, P., Lesaf-
fre, B., and Mérindol, L.: Reanalysis of 44 Yr of Climate
in the French Alps (1958–2002): Methodology, Model Val-
idation, Climatology, and Trends for Air Temperature and
Precipitation, J. Appl. Meteorol. Climatol., 48, 429–449, 2009.
Dussaillant, I., Berthier, E., Brun, F., Masiokas, M., Hugonnet,
R., Favier, V., Rabatel, A., Pitte, P., and Ruiz, L.: Two
decades of glacier mass loss along the Andes, Nature Geosci., 2019.
Farinotti, D., Huss, M., Fürst, J. J., Landmann, J., Machguth, H.,
Maussion, F., and Pandit, A.: A consensus estimate for the ice
thickness distribution of all glaciers on Earth, Nature Geosci.,
12, 168–173, 2019.
Fausett, L. V.: Fundamentals of neural networks: architectures,
algorithms, and applications, Prentice Hall, Englewood Cliffs,
N.J., oCLC: 28215780, 1994.
Gagliardini, O., Zwinger, T., Gillet-Chaulet, F., Durand, G., Favier,
L., de Fleurian, B., Greve, R., Malinen, M., Martín, C., Råback,
P., Ruokolainen, J., Sacchettini, M., Schäfer, M., Seddik, H.,
and Thies, J.: Capabilities and performance of Elmer/Ice, a new-generation
ice sheet model, Geosci. Model Dev., 6, 1299–1318, 2013.
Gardent, M., Rabatel, A., Dedieu, J.-P., and Deline, P.: Multi-
temporal glacier inventory of the French Alps from the late
1960s to the late 2000s, Global Planet. Change, 120, 24–37, 2014.
Gerbaux, M., Genthon, C., Etchevers, P., Vincent, C., and Dedieu,
J.: Surface mass balance of glaciers in the French Alps: dis-
tributed modeling and sensitivity to climate change, J. Glaciol.,
51, 561–572,,
Hanzer, F., Förster, K., Nemec, J., and Strasser, U.: Projected
cryospheric and hydrological impacts of 21st century climate
change in the Ötztal Alps (Austria) simulated using a physically
based approach, Hydrol. Earth Syst. Sci., 22, 1593–1614, 2018.
Hastie, T., Tibshirani, R., and Friedman, J.: The Elements of Sta-
tistical Learning, Springer Series in Statistics, Springer New
York, New York, NY, 2009.
Hawkins, D. M.: The Problem of Overfitting, Journal of
Chemical Information and Computer Sciences, 44, 1–12, 2004.
He, K., Zhang, X., Ren, S., and Sun, J.: Delving Deep into Rectifiers:
Surpassing Human-Level Performance on ImageNet Classification,
2015 IEEE International Conference on Computer Vision (ICCV), 2015.
Hock, R.: Temperature index melt modelling in mountain areas,
J. Hydrol., 282, 104–115, 2003.
Hock, R., Bliss, A., Marzeion, B., Giesen, R. H., Hirabayashi,
Y., Huss, M., Radić, V., and Slangen, A. B. A.: GlacierMIP –
A model intercomparison of global-scale glacier mass-balance
models and projections, J. Glaciol., 65, 453–467, 2019.
Hoerl, A. E. and Kennard, R. W.: Ridge Regression: Biased Estimation
for Nonorthogonal Problems, Technometrics, 12, 55–67, 1970.
Hoinkes, H. C.: Glacier Variation and Weather, J. Glaciol., 7, 3–18, 1968.
Houghton, J. T., Ding, Y., Griggs, D. J., Noguer, M., van der Lin-
den, P. J., Dai, X., Maskell, K., and Johnson, C.: Climate change
2001: the scientific basis, The Press Syndicate of the University
of Cambridge, 2001.
Huss, M.: Extrapolating glacier mass balance to the mountain-range
scale: the European Alps 1900–2100, The Cryosphere, 6, 713–727, 2012.
Huss, M. and Hock, R.: A new model for global
glacier change and sea-level rise, Front. Earth Sci., 3, 2015.
Huss, M. and Hock, R.: Global-scale hydrological response to
future glacier mass loss, Nature Clim. Change, 8, 135–140, 2018.
Huss, M., Farinotti, D., Bauder, A., and Funk, M.: Mod-
elling runoff from highly glacierized alpine drainage basins
in a changing climate, Hydrol. Process., 22, 3888–3902, 2008.
Huss, M., Jouvet, G., Farinotti, D., and Bauder, A.: Fu-
ture high-mountain hydrology: a new parameterization of
glacier retreat, Hydrol. Earth Syst. Sci., 14, 815–829, 2010.
Huss, M., Hock, R., Bauder, A., and Funk, M.: Conventional versus
reference-surface mass balance, J. Glaciol., 58, 278–286, 2012.
Ingrassia, S. and Morlini, I.: Neural Network Modeling
for Small Datasets, Technometrics, 47, 297–311, 2005.
Ioffe, S. and Szegedy, C.: Batch Normalization: Accelerating Deep
Network Training by Reducing Internal Covariate Shift, 2015.
IPCC: Climate Change 2013: The Physical Science Basis. Contri-
bution of Working Group I to the Fifth Assessment Report of the
Intergovernmental Panel on Climate Change, 2018.
Jiang, G.-Q., Xu, J., and Wei, J.: A Deep Learning Algorithm
of Neural Network for the Parameterization of Typhoon-Ocean
Feedback in Typhoon Forecast Models, Geophys. Res. Lett., 45,
3706–3716, 2018.
Jouvet, G., Huss, M., Blatter, H., Picasso, M., and Rap-
paz, J.: Numerical simulation of Rhonegletscher from
1874 to 2100, J. Comput. Phys., 228, 6426–6439, 2009.
Jóhannesson, T., Raymond, C., and Waddington, E.: Time–Scale for
Adjustment of Glaciers to Changes in Mass Balance, J. Glaciol.,
35, 355–369,,
Karpatne, A., Atluri, G., Faghmous, J. H., Steinbach, M.,
Banerjee, A., Ganguly, A., Shekhar, S., Samatova, N.,
and Kumar, V.: Theory-Guided Data Science: A New
Paradigm for Scientific Discovery from Data, IEEE Transactions
on Knowledge and Data Engineering, 29, 2318–2331, 2017.
Karpatne, A., Watkins, W., Read, J., and Kumar, V.: Physics-guided
Neural Networks (PGNN): An Application in Lake Temperature
Modeling, arXiv:1710.11431 [physics, stat], 2018.
Lguensat, R., Sun, M., Fablet, R., Tandeo, P., Mason, E., and Chen,
G.: EddyNet: A Deep Neural Network For Pixel-Wise Classifi-
cation of Oceanic Eddies, in: IGARSS 2018–2018 IEEE Interna-
tional Geoscience and Remote Sensing Symposium, 1764–1767,
IEEE, Valencia,,
Lguensat, R., Sommer, J. L., Metref, S., Cosme, E., and Fablet,
R.: Learning Generalized Quasi-Geostrophic Models Using
Deep Neural Numerical Models, arXiv:1911.08856 [physics,
stat], 2019.
Martin, S.: Correlation bilans de masse annuels-facteurs
météorologiques dans les Grandes Rousses, Zeitschrift für
Gletscherkunde und Glazialgeologie, 1974.
Marzeion, B., Jarosch, A. H., and Hofer, M.: Past and future sea-
level change from the surface mass balance of glaciers, The
Cryosphere, 6, 1295–1322, 2012.
Marçais, J. and de Dreuzy, J.-R.: Prospective Interest of Deep
Learning for Hydrological Inference, Groundwater, 55, 688–692, 2017.
Maussion, F., Gurgiser, W., Großhauser, M., Kaser, G., and
Marzeion, B.: ENSO influence on surface energy and mass
balance at Shallap Glacier, Cordillera Blanca, Peru, The
Cryosphere, 9, 1663–1683, 2015.
Maussion, F., Butenko, A., Champollion, N., Dusch, M., Eis, J.,
Fourteau, K., Gregor, P., Jarosch, A. H., Landmann, J., Oesterle,
F., Recinos, B., Rothenpieler, T., Vlug, A., Wild, C. T., and
Marzeion, B.: The Open Global Glacier Model (OGGM) v1.1,
Geosci. Model Dev., 12, 909–931, 2019.
NSIDC: Global Land Ice Measurements from Space glacier
database. Compiled and made available by the international
GLIMS community and the National Snow and Ice Data Cen-
ter, 2005.
Nussbaumer, S., Steiner, D., and Zumbühl, H.: Réseau neuronal
et fluctuations des glaciers dans les Alpes occidentales, avail-
able at:
Alpes_occidentales (last access: February 2020) 2012.
Oliveira, M., Torgo, L., and Santos Costa, V.: Evaluation Procedures
for Forecasting with Spatio-Temporal Data, in: Machine Learn-
ing and Knowledge Discovery in Databases, edited by: Berlin-
gerio, M., Bonchi, F., Gärtner, T., Hurley, N., and Ifrim, G., vol.
11051, pp. 703–718, Springer International Publishing, Cham, 2019.
Olson, M., Wyner, A. J., and Berk, R.: Modern Neural Networks
Generalize on Small Data Sets, NeurIPS, NIPS Proceedings,
available at: (last access: February 2020),
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion,
B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg,
V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Per-
rot, M., Duchesnay, E., and Louppe, G.: Scikit-learn: Machine
Learning in Python, Journal of Machine Learning Research, 12,
2825–2830, 2011.
Rabatel, A., Letréguilly, A., Dedieu, J.-P., and Eckert, N.: Changes
in glacier equilibrium-line altitude in the western Alps from
1984 to 2010: evaluation by remote sensing and modeling of
the morpho-topographic and climate controls, The Cryosphere,
7, 1455–1471, 2013.
Rabatel, A., Dedieu, J. P., and Vincent, C.: Spatio-temporal changes
in glacier-wide mass balance quantified by optical remote sensing
on 30 glaciers in the French Alps for the period 1983–2014, J.
Glaciol., 62, 1153–1166, 2016.
Rabatel, A., Sanchez, O., Vincent, C., and Six, D.: Estimation of
Glacier Thickness From Surface Mass Balance and Ice Flow
Velocities: A Case Study on Argentière Glacier, France, Front.
Earth Sci., 6, 2018.
Rackauckas, C., Ma, Y., Martensen, J., Warner, C., Zubov, K., Su-
pekar, R., Skinner, D., and Ramadhan, A.: Universal Differential
Equations for Scientific Machine Learning, arXiv:2001.04385
[cs, math, q-bio, stat], 2020.
Radić, V., Bliss, A., Beedlow, A. C., Hock, R., Miles, E., and
Cogley, J. G.: Regional and global projections of twenty-first
century glacier mass changes in response to climate scenarios
from global climate models, Clim. Dynam., 42, 37–58, 2014.
Rasp, S., Pritchard, M. S., and Gentine, P.: Deep learning to repre-
sent subgrid processes in climate models, P. Natl. Acad. Sci., 115,
9684–9689, 2018.
Roberts, D. R., Bahn, V., Ciuti, S., Boyce, M. S., Elith, J., Guillera-
Arroita, G., Hauenstein, S., Lahoz-Monfort, J. J., Schröder, B.,
Thuiller, W., Warton, D. I., Wintle, B. A., Hartig, F., and Dor-
mann, C. F.: Cross-validation strategies for data with temporal,
spatial, hierarchical, or phylogenetic structure, Ecography, 40,
913–929, 2017.
Réveillet, M., Rabatel, A., Gillet-Chaulet, F., and Soruco, A.:
Simulations of changes to Glaciar Zongo, Bolivia (16° S),
over the 21st century using a 3-D full-Stokes model
and CMIP5 climate projections, Ann. Glaciol., 56, 89–97, 2015.
Réveillet, M., Vincent, C., Six, D., and Rabatel, A.: Which empir-
ical model is best suited to simulate glacier mass balances?, J.
Glaciol., 63, 39–54, 2017.
Réveillet, M., Six, D., Vincent, C., Rabatel, A., Dumont, M.,
Lafaysse, M., Morin, S., Vionnet, V., and Litt, M.: Relative
performance of empirical and physical models in assessing the
seasonal and annual glacier surface mass balance of Saint-Sorlin
Glacier (French Alps), The Cryosphere, 12, 1367–1386, 2018.
Seabold, S. and Perktold, J.: Statsmodels: Econometric and Statistical
Modeling with Python, Proc. of the 9th Python in Science
Conf., 2010.
Shen, C.: A Transdisciplinary Review of Deep Learning Research
and Its Relevance for Water Resources Scientists, Water Resour.
Res., 54, 8558–8593,,
Six, D. and Vincent, C.: Sensitivity of mass balance and
equilibrium-line altitude to climate change in the French Alps, J.
Glaciol., 60, 867–878,,
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and
Salakhutdinov, R.: Dropout: a simple way to prevent neural net-
works from overfitting, J. Mach. Learn. Res., 15, 1929–1958,
Steiner, D., Walter, A., and Zumbühl, H.: The application of a non-
linear back-propagation neural network to study the mass bal-
ance of Grosse Aletschgletscher, Switzerland, J. Glaciol., 51,
313–323, 2005.
Steiner, D., Pauling, A., Nussbaumer, S. U., Nesje, A., Luterbacher,
J., Wanner, H., and Zumbühl, H. J.: Sensitivity of European
glaciers to precipitation and temperature – two case studies,
Clim. Change, 90, 413–441, 2008.
Thibert, E., Dkengne Sielenou, P., Vionnet, V., Eckert, N.,
and Vincent, C.: Causes of Glacier Melt Extremes in
the Alps Since 1949, Geophys. Res. Lett., 45, 817–825, 2018.
Tibshirani, R.: Regression Shrinkage and Selection via the Lasso, J.
Roy. Stat. Soc. B, 58, 267–288, 1996.
Tibshirani, R., Johnstone, I., Hastie, T., and Efron,
B.: Least angle regression, Ann. Stat., 32, 407–499, 2004.
Vincent, C., Harter, M., Gilbert, A., Berthier, E., and Six, D.: Future
fluctuations of Mer de Glace, French Alps, assessed using a pa-
rameterized model calibrated with past thickness changes, Ann.
Glaciol., 55, 15–24,,
Vincent, C., Fischer, A., Mayer, C., Bauder, A., Galos, S. P., Funk,
M., Thibert, E., Six, D., Braun, L., and Huss, M.: Common cli-
matic signal from glaciers in the European Alps over the last 50
years: Common Climatic Signal in the Alps, Geophys. Res. Lett.,
44, 1376–1383, 2017.
Vincent, C., Peyaud, V., Laarman, O., Six, D., Gilbert, A.,
Gillet-Chaulet, F., Berthier, E., Morin, S., Verfaillie, D., Ra-
batel, A., Jourdain, B., and Bolibar, J.: Déclin des deux
plus grands glaciers des Alpes françaises au cours du XXIe
siècle: Argentière et Mer de Glace, La Météorologie, p. 49, 2019.
Vionnet, V., Dombrowski-Etchevers, I., Lafaysse, M., Quéno, L.,
Seity, Y., and Bazile, E.: Numerical Weather Forecasts at Kilo-
meter Scale in the French Alps: Evaluation and Application
for Snowpack Modeling, J. Hydrometeorol., 17, 2591–2614, 2016.
Vuille, M., Carey, M., Huggel, C., Buytaert, W., Rabatel, A., Ja-
cobsen, D., Soruco, A., Villacis, M., Yarleque, C., Elison Timm,
O., Condom, T., Salzmann, N., and Sicart, J.-E.: Rapid decline
of snow and ice in the tropical Andes – Impacts, uncertainties
and challenges ahead, Earth-Sci. Rev., 176, 195–213, 2018.
Weisberg, S.: Applied linear regression, Wiley series in probability
and statistics, Wiley, Hoboken, NJ, fourth edition edn., 2014.
Whittingham, M. J., Stephens, P. A., Bradbury, R. B., and Freck-
leton, R. P.: Why do we still use stepwise modelling in ecology
and behaviour?: Stepwise modelling in ecology and behaviour,
J. Anim. Ecol., 75, 1182–1189, 2006.
Xu, B., Wang, N., Chen, T., and Li, M.: Empirical Evalua-
tion of Rectified Activations in Convolutional Network, CoRR,
abs/1505.00853, available at:
(last access: February 2020), 2015.
Zekollari, H. and Huybrechts, P.: Statistical modelling of the surface
mass-balance variability of the Morteratsch glacier, Switzerland:
strong control of early melting season meteorological conditions,
J. Glaciol., 64, 275–288,,
Zekollari, H., Huss, M., and Farinotti, D.: Modelling the future
evolution of glaciers in the European Alps under the EURO-CORDEX
RCM ensemble, The Cryosphere, 13, 1125–1146, 2019.
Zemp, M., Haeberli, W., Hoelzle, M., and Paul, F.: Alpine
glaciers to disappear within decades?, Geophys. Res. Lett., 33, 2006.
Zemp, M., Huss, M., Thibert, E., Eckert, N., McNabb, R., Huber,
J., Barandun, M., Machguth, H., Nussbaumer, S. U., Gärtner-
Roer, I., Thomson, L., Paul, F., Maussion, F., Kutuzov, S., and
Cogley, J. G.: Global glacier mass changes and their contributions
to sea-level rise from 1961 to 2016, Nature, 568, 382–386, 2019.
The Cryosphere, 14, 565–584, 2020
... OLS modelling is a common machine learning technique for estimating linear regressions equations with minimum squares error. In this work, we applied this method to investigate linear glacier SMB and ELA shift response to climatic forcing (summer skin temperature and IVT), as in previous studies [46,78]. Using the OLS model, calculation of glacier SMB response to climate change was carried out with Equation (14), and glacier ELA shift sensitivity to meteorological factors was calculated using Equation (15): ...
... Unlike Bolibar et al. [46], we applied only four layers (two hidden layers) since we did not consider glacial topography impact on SMB and ELA changes. For the two hidden layers, we assigned 20 neurons to each. ...
... Furthermore, Bolibar et al. [43,46] have confirmed that glacier mass change is connected with glacier topography and climatic change. Thus, we can conclude that glacier ELA shift is also connected with these two parameters. ...
Full-text available
Research into glacial mass change in West Kunlun (WK) has been sufficient, but most of the existing studies were based on geodetic methods, which are not suitable for specific health state analyses of each glacier. In this paper, we utilize Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) imagery, applying the continuity equation to obtain altitudinal specific mass balance (SMB) for 615 glaciers (>2 km2 ) during 2002–2011, 2011–2020, and 2002–2020 to research glacial health and its response to climatic forcing. The results show dissimilar glacier SMB patterns between 2002–2011 (0.10 ± 0.14 m w.e. a–1), 2011–2020 (–0.12 ± 0.14 m w.e. a–1) and 2002–2020 (−0.01 ± 0.07 m w.e. a–1). Additionally, the glacier equilibrium line altitude (ELA) in WK was 5788 m, 5744 m, and 5786 m, respectively, and the corresponding accumulation area ratios (AARs) were 0.59, 0.62, and 0.58, during 2002–2011, 2011–2020, and 2002–2020, respectively. Regarding glacier response, compared with the ordinary-least-square (OLS) model, the artificial neural network (ANN) model revealed a respectively less and more sensitive glacier SMB response to extreme negative and positive summer skin temperatures. In addition, the ANN model indicated that the glacier ELA was less sensitive when the integrated water vapor transport (IVT) change exceeded 0.7 kg m−1s−1. Moreover, compared with IVT (−121.57 m/kg m−1s−1), glacier ELA shifts were chiefly dominated by summer skin temperature (+154.66 m/℃) in the last two decades. From 2002–2011 and 2011–2020, glacier SMB was more susceptible to summer skin temperature (−0.38 m w.e./℃ and −0.16 m w.e./℃, respectively), while during 2002–2020, it was more influenced by IVT (0.45 m w.e./kg m−1s−1). In contrast with eastern WK, glaciers in western WK were healthier, although mitigation measures are still needed to safeguard glacier health and prevent possible natural hazards in this region. 
Finally, we believe that the inconsistent change between glacier SMB and ELAs from 2002–2020 was connected with ice rheology and that the combined effects of skin temperature and IVT can explain the WK glacier anomaly.
... This behaviour is particularly clear for summer snowfall, for which the differences are the largest (Fig. 3). In summary, the linear approximations used by the Lasso manage to correctly fit the main cluster of average values but perform poorly for extreme values [31]. This has the strongest impact under RCP 2.6, where positive MB rates are more frequent (Fig. 4), as the linear model tends to over-estimate positive MB rates both from air temperature and snowfall (Fig. 3). ...
... The effect of glaciers shrinking to smaller extents is not captured by these synthetic experiments, but this effect is less important for flat glaciers that are dominated by thinning (Fig. 5). Additionally, glacier surface area was found to be a minor predictor in our MB models [31]. These synthetic experiments suggest that, for equal climatic conditions, flatter glaciers and ice caps will experience substantially more negative MB rates than steeper mountain glaciers. ...
... Glacier-wide MB is simulated annually for individual glaciers using deep learning (i.e. a deep artificial neural network) or the Lasso (regularized multilinear regression) [30]. This modelling approach was described in detail in a previous publication dedicated to the methods, where the ALpine Parameterized Glacier Model (ALPGM [43]) was presented [31]. ALPGM uses a feedforward fully connected multilayer perceptron, with an architecture (40-20-10-5-1) with Leaky-ReLu [44] activation functions and a single linear function at the output. ...
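The architecture quoted in this excerpt (hidden layers of 40, 20, 10 and 5 neurons, Leaky-ReLU activations, one linear output neuron) can be illustrated with a minimal forward-pass sketch in plain Python. This is not the ALPGM code: the number of input predictors (`N_INPUTS = 8`), the leak slope (`LEAK = 0.01`), the random initialization and all function names are assumptions made for illustration only.

```python
import random

LAYER_SIZES = [40, 20, 10, 5, 1]  # neurons per layer, as quoted in the excerpt
N_INPUTS = 8                      # hypothetical number of topo-climatic predictors
LEAK = 0.01                       # assumed negative-side slope of Leaky-ReLU


def leaky_relu(x, leak=LEAK):
    """Pass positive values through; scale negative values by `leak`."""
    return x if x > 0 else leak * x


def init_network(n_inputs, layer_sizes, seed=0):
    """Create random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    fan_in = n_inputs
    for size in layer_sizes:
        scale = (2.0 / fan_in) ** 0.5  # He-style scaling (an assumption)
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(size)]
        biases = [0.0] * size
        layers.append((weights, biases))
        fan_in = size
    return layers


def forward(layers, x):
    """Leaky-ReLU on every hidden layer, identity (linear) on the output layer."""
    activations = list(x)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        activations = z if i == last else [leaky_relu(v) for v in z]
    return activations[0]  # single output: one annual glacier-wide MB value


def count_parameters(layers):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


network = init_network(N_INPUTS, LAYER_SIZES)
smb = forward(network, [0.1] * N_INPUTS)  # untrained, so the value is meaningless
n_params = count_parameters(network)
```

With the assumed 8 inputs, the network has only about 1,500 trainable parameters, which gives a sense of how small such a model is compared with typical deep learning applications.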
Glaciers and ice caps are experiencing strong mass losses worldwide, challenging water availability, hydropower generation, and ecosystems. Here, we perform the first-ever glacier evolution projections based on deep learning by modelling the 21st century glacier evolution in the French Alps. By the end of the century, we predict a glacier volume loss between 75 and 88%. Deep learning captures a nonlinear response of glaciers to air temperature and precipitation, improving the representation of extreme mass balance rates compared to linear statistical and temperature-index models. Our results confirm an over-sensitivity of temperature-index models, often used by large-scale studies, to future warming. We argue that such models can be suitable for steep mountain glaciers. However, glacier projections under low-emission scenarios and the behaviour of flatter glaciers and ice caps are likely to be biased by mass balance models with linear sensitivities, introducing long-term biases in sea-level rise and water resources projections.
... However, these studies did not quantify the relevance of morphometric variables to estimate glacier changes such as elevation and aspect, or glacier surface area and slope; these variables have already been significantly correlated to glacier changes in studies either dedicated to the Tropical and Southern Andes (e.g., Soruco et al., 2009; Rabatel et al., 2011) or in other mountain ranges (e.g., Rabatel et al., 2016; Brun et al., 2019; Bolibar et al., 2020; Davaze et al., 2020). In addition, simulations of glacier changes are traditionally conducted using geodetic mass balance products and few in situ glacier measurements available for calibration/validation purposes. ...
... Our approach is based on machine learning tools. The main explanatory variables of GAV and GMB will be identified at watershed scale using the Least Absolute Shrinkage and Selection Operator (LASSO) linear regression algorithm (Tibshirani, 1996), which has shown good results at glacier scale in the Alps (Bolibar et al., 2020; Davaze et al., 2020). These results will be used to determine new glaciological zones (hereafter named "clusters") across the Andes, composed of glacierized watersheds with similar morphometric and climatic characteristics. ...
... This consideration is associated with the existence of glacier change data inside of each watershed, and because LASSO needs a minimum sample size greater than the number of predictive variables of GAV and GMB. Recently, Bolibar et al. (2020) and Davaze et al. (2020) have shown satisfactory results using this algorithm in the Alps but at glacier scale and using temporal series. Classical linear regression methods calculate coefficient values that maximize the r² value and minimize the error using all available explanatory variables, which results in a high variance and low bias model. ...
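The contrast drawn in this excerpt between classical regression, which keeps every explanatory variable, and the Lasso, which can discard weak ones, can be made concrete with a small sketch. It minimizes the usual Lasso objective (1/(2n))·‖y − Xb‖² + α·‖b‖₁ by coordinate descent with soft-thresholding. The toy data, the penalty `alpha = 0.1` and the function names are illustrative assumptions, not taken from any of the cited studies.

```python
def soft_threshold(rho, alpha):
    """Shrink rho towards zero by alpha; return exactly 0.0 inside [-alpha, alpha]."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of predictor j with the residual that ignores predictor j
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, alpha) / z
    return b


# Toy data: y depends strongly on the first predictor and barely on the second.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0 * x1 + 0.05 * x2 for x1, x2 in X]

ols_like = lasso_coordinate_descent(X, y, alpha=0.0)  # no penalty: least squares
lasso = lasso_coordinate_descent(X, y, alpha=0.1)     # penalty removes weak predictor
```

Without a penalty, both coefficients are retained (about 2 and 0.05). With the penalty, the strong coefficient is shrunk slightly and the weak one is set exactly to zero, which is the variable-selection behaviour that makes the Lasso lower in variance than ordinary least squares.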
Over the last decades, glaciers across the Andes have been strongly affected by a loss of mass and surface areas. This increases risks of water scarcity for the Andean population and ecosystems. However, the factors controlling glacier changes in terms of surface area and mass loss remain poorly documented at watershed scale across the Andes. Using machine learning methods (Least Absolute Shrinkage and Selection Operator, known as LASSO), we explored climatic and morphometric variables that explain the spatial variance of glacier surface area variations in 35 watersheds (1980–2019), and of glacier mass balances in 110 watersheds (2000–2018), with data from 2,500 and 21,000 glaciers, respectively, distributed between 8 and 55°S in the Andes. Based on these results and by applying the Partitioning Around Medoids (PAM) algorithm we identified new glacier clusters. Overall, spatial variability of climatic variables presents a higher explanatory power than morphometric variables with regards to spatial variance of glacier changes. Specifically, the spatial variability of precipitation dominates spatial variance of glacier changes from the Outer Tropics to the Dry Andes (8–37°S) explaining between 49 and 93% of variances, whereas across the Wet Andes (40–55°S) the spatial variability of temperature is the most important climatic variable and explains between 29 and 73% of glacier changes spatial variance. However, morphometric variables such as glacier surface area show a high explanatory power for spatial variance of glacier mass loss in some watersheds (e.g., Achacachi with r² = 0.6 in the Outer Tropics, Río del Carmen with r² = 0.7 in the Dry Andes). Then, we identified a new spatial framework for hydro-glaciological analysis composed of 12 glaciological zones, derived from a clustering analysis, which includes 274 watersheds containing 32,000 glaciers.
These new zones better take into account different seasonal climate and morphometric characteristics of glacier diversity. Our study shows that the exploration of variables that control glacier changes, as well as the new glaciological zones calculated based on these variables, would be very useful for analyzing hydro-glaciological modelling results across the Andes (8–55°S).
... Problems arising from this approach may be bypassed, for example, by using a different family of regression procedures for the transfer functions, such as Random Forests (Breiman, 2001). Bolibar and others (2020) demonstrate that a deep-learning approach is able to capture non-linear behaviour that is missed by classic regression-based techniques. 4. Temporal stationarity of transfer functions: Our approach assumes that the models' transfer functions are stationary in time. ...
... As mass-balance records continue to grow, the inclusion of topographic predictors in a deep-learning approach (e.g. Bolibar and others, 2020) represents another promising way to address this limitation. ...
We investigate relationships between synoptic-scale atmospheric variability and the mass-balance of 13 Andean glaciers (located 16-55°S) using Pearson correlation coefficients (PCCs) and multiple regressions. We then train empirical glacier mass-balance models (EGMs) in a cross-validated multiple regression procedure for each glacier. We find four distinct glaciological zones with regard to their climatic controls: (1) The mass-balance of the Outer Tropics glaciers is linked to temperature and the El Niño-Southern Oscillation (PCC ⩽ 0.6), (2) glaciers of the Desert Andes are mainly controlled by zonal wind intensity (PCC ⩽ 0.9) and the Antarctic Oscillation (PCC ⩽ 0.6), (3) the mass-balance of the Central Andes glaciers is primarily correlated with precipitation anomalies (PCC ⩽ 0.8), and (4) the glacier of the Fuegian Andes is controlled by winter precipitation (PCC ≈ 0.7) and summer temperature (PCC ≈ −0.9). Mass-balance data in the Lakes District and Patagonian Andes zones, where most glaciers are located, are too sparse for a robust detection of synoptic-scale climatic controls. The EGMs yield R² values of ∼ 0.45 on average and ⩽ 0.74 for the glaciers of the Desert Andes. The EGMs presented here do not consider glacier dynamics or geometry and are therefore only suitable for short-term predictions.
... Consequently, this makes it difficult for machine learning models to predict the glaciers when applied to data outside the given training period. Although deep learning methods can model very complex relationships both in spatial and temporal dimensions (Bolibar et al., 2020), these models require a large amount of training data. As mentioned earlier, obtaining a large amount of labeled training data is very costly in terms of the time and effort needed to prepare it. ...
In recent years, deep learning (DL) methods have proven their efficiency for various computer vision (CV) tasks such as image classification, natural language processing, and object detection. However, training a DL model is expensive in terms of both complexities of the network structure and the amount of labeled data needed. In addition, the imbalance among available labeled data for different classes of interest may also adversely affect the model accuracy. This paper addresses these issues using a new convolutional neural network (CNN) based architecture. The proposed network incorporates both spatial and spectral information that combines two sub-networks: spatial-CNN and spectral-CNN. The spectral-CNN extracts spectral information, while spatial-CNN captures spatial information. Moreover, to make the features more robust, a multiscale spatial CNN architecture is introduced using different kernels. The final feature vector is formed by concatenating the outputs obtained from both spatial-CNN and spectral-CNN. To address the data imbalance problem, a generative adversarial network (GAN) was used to generate data for the underrepresented class. Finally, a relatively shallow network architecture was used to reduce the number of parameters in the network and improve the processing speed. The proposed model was trained and tested on Sentinel-2 images for the classification of the debris-covered glacier. The results showed that the proposed method is well-suited for mapping and monitoring debris-covered glaciers at a large scale with high classification accuracy. In addition, we compared the proposed method with conventional machine learning approaches, support vector machine (SVM), random forest (RF) and multilayer perceptron (MLP).
... ML techniques on remote sensing data have been used successfully in areas such as land cover classification (Talukdar et al., 2020), building extraction (Ghaffarian and Emtehani, 2021), tree species diversity (Mallinis et al., 2020), agriculture (Aghighi et al., 2018), and canopy extraction (Csillik et al., 2020). Similarly, ML approaches have also been highly effective in glacier classification and delineation (Nijhawan et al., 2018; Khan et al., 2020; Alifu et al., 2015; Bolibar et al., 2020a; Xie et al., 2020; Dirscherl et al., 2020; Lin et al., 2020; Bolibar et al., 2020b). While traditional remote sensing methods used image-classification based on band ratios and other classification methods to classify the glaciers, the current trend is to apply ML and deep learning approaches. ...
Machine learning image classification algorithms offer a potential for effective and efficient classification of remotely sensed images covering glaciated areas. The Columbia Icefield in Canada is one such place where glaciers are retreating and losing mass over the years. The Columbia Icefield plays an important role in the region's water budget. In this study, the accuracy of three machine learning algorithms, namely, SVM, RF and MLC, were assessed for the classification of snow/ice area on 2020 Landsat 8 OLI image. All three algorithms classified the image with over 99 percent accuracies, but the SVM classifier showed a higher accuracy in debris covered areas on glaciers. Further, we used SVM algorithm to classify Landsat 5 TM - Sept 10, 1985, Landsat 5 TM - Sept 27, 1991, Landsat 8 OLI - Aug 22, 2013, and Landsat 8 OLI - Sept 10, 2020 images in the Columbia Icefield. Among nine glaciers, Saskatchewan (−4.57 km²), Dome (−2.03 km²), Columbia (−2.06 km²), Stutfield (−2.17 km²), G242655E52112N (−1.39 km²), Athabasca (−1.39 km²), Castleguard (−1.3 km²), and G242614E52109N (−0.54 km²) measured less ice and snow-covered areas between 1985 and 2020. For these nine glaciers, there was a total decrease of 2.01 ± 0.24 km³ volume between 1985 and 2020, which is about 1.81 ± 0.22 km³ water equivalent or 0.12 ± 0.015 km³ water equivalent per year. On average, Saskatchewan (−0.699 km³) and Columbia (−0.307 km³) Glaciers lost the highest volume of snow and ice between 1985 and 2020. This study also concluded that SVM, RF and MLC all produce highly accurate satellite image classification in the glaciated areas.
The interpretation of deep learning (DL) hydrological models is a key challenge in data-driven modeling of streamflow, as the DL models are often seen as “black box” models despite often outperforming process-based models in streamflow prediction. Here we explore the interpretability of a convolutional long short-term memory network (CNN-LSTM) previously trained to successfully predict streamflow at 226 stream gauge stations across southwestern Canada. To this end, we develop a set of sensitivity experiments to characterize how the CNN-LSTM model learns to map spatiotemporal fields of temperature and precipitation to streamflow across three streamflow regimes (glacial, nival, and pluvial) in the region, and we uncover key spatiotemporal patterns of model learning. The results reveal that the model has learned basic physically-consistent principles behind runoff generation for each streamflow regime, without being given any information other than temperature, precipitation, and streamflow data. In particular, during periods of dynamic streamflow, the model is more sensitive to perturbations within/nearby the basin where streamflow is being modeled, than to perturbations far away from the basins. The sensitivity of modeled streamflow to the magnitude and timing of the perturbations, as well as the sensitivity of day-to-day increases in streamflow to daily weather anomalies, are found to be specific for each streamflow regime. For example, during summer months in the glacial regime, modeled daily streamflow is increasingly generated by warm daily temperature anomalies in basins with a larger fraction of glacier coverage. This model's learning of “glacier runoff” contributions to streamflow, without any explicit information given about glacier coverage, is enabled by a set of cell states that learned to strongly map temperature to streamflow only in glacierized basins in summer. 
Our results demonstrate that the model's decision making, when mapping temperature and precipitation to streamflow, is consistent with a basic physical understanding of the system.
Glaciers play a crucial role in the Earth System: they are important water suppliers to lower‐lying areas during hot and dry periods, and they are major contributors to the observed present‐day sea‐level rise. Glaciers can also act as a source of natural hazards and have a major touristic value. Given their societal importance, there is large scientific interest in better understanding and accurately simulating the temporal evolution of glaciers, both in the past and in the future. Here, we give an overview of the state of the art of simulating the evolution of individual glaciers over decadal to centennial time scales with ice‐dynamical models. We hereby highlight recent advances in the field and emphasize how these go hand‐in‐hand with an increasing availability of on‐site and remotely sensed observations. We also focus on the gap between simplified studies that use parameterizations, typically used for regional and global projections, and detailed assessments for individual glaciers, and explain how recent advances now allow including ice dynamics when modeling glaciers at larger spatial scales. Finally, we provide concrete recommendations concerning the steps and factors to be considered when modeling the evolution of glaciers. We suggest paying particular attention to the model initialization, analyzing how related uncertainties in model input influence the modeled glacier evolution and strongly recommend evaluating the simulated glacier evolution against independent data.
This work introduces the S2M (SAFRAN–SURFEX/ISBA–Crocus–MEPRA) meteorological and snow cover reanalysis in the French Alps, Pyrenees and Corsica, spanning the time period from 1958 to 2021. The simulations are made over elementary areas, referred to as massifs, designed to represent the main drivers of the spatial variability observed in mountain ranges (elevation, slope and aspect). The meteorological reanalysis is performed by the SAFRAN system, which combines information from numerical weather prediction models (ERA-40 reanalysis from 1958 to 2002, ARPEGE from 2002 to 2021) and the best possible set of available in situ meteorological observations. SAFRAN outputs are used to drive the Crocus detailed snow cover model, which is part of the land surface scheme SURFEX/ISBA. This model chain provides simulations of the evolution of the snow cover, underlying ground and the associated avalanche hazard using the MEPRA model. This contribution describes and discusses the main climatological characteristics (climatology, variability and trends) and the main limitations of this dataset. We provide a short overview of the scientific applications using this reanalysis in various scientific fields related to meteorological conditions and the snow cover in mountain areas. An evaluation of the skill of S2M is also displayed, in particular through comparison to 665 independent in situ snow depth observations. Further, we describe the technical handling of this open-access dataset, available at The S2M data are provided by Météo-France – CNRS, CNRM, Centre d'Études de la Neige, through AERIS (Vernay et al., 2022).
Andean glaciers are among the fastest shrinking and largest contributors to sea level rise on Earth. They also represent crucial water resources in many tropical and semi-arid mountain catchments. Yet the magnitude of the recent ice loss is still debated. Here we present Andean glacier mass changes (from 10° N to 56° S) between 2000 and 2018 using time series of digital elevation models derived from ASTER stereo images. The total mass change over this period was −22.9 ± 5.9 Gt yr⁻¹ (−0.72 ± 0.22 m w.e. yr⁻¹ (m w.e., metres of water equivalent)), with the most negative mass balances in the Patagonian Andes (−0.78 ± 0.25 m w.e. yr⁻¹) and the Tropical Andes (−0.42 ± 0.24 m w.e. yr⁻¹), compared to relatively moderate losses (−0.28 ± 0.18 m w.e. yr⁻¹) in the Dry Andes. Subperiod analysis (2000–2009 versus 2009–2018) revealed a steady mass loss in the tropics and south of 45° S. Conversely, a shift from a slightly positive to a strongly negative mass balance was measured between 26 and 45° S. In the latter region, the drastic glacier loss in recent years coincides with the extremely dry conditions since 2010 and partially helped to mitigate the negative hydrological impacts of this severe and sustained drought. These results provide a comprehensive, high-resolution and multidecadal data set of recent Andes-wide glacier mass changes that constitutes a relevant basis for the calibration and validation of hydrological and glaciological models intended to project future glacier changes and their hydrological impacts.
Modelling was carried out on the two largest glaciers in the French Alps to estimate their evolution over the 21st century. For an intermediate climate scenario with a reduction in greenhouse gas emissions before the end of the 21st century (RCP 4.5), the simulations indicate that the Argentière glacier should disappear towards the end of the 21st century and that the surface area of the Mer de Glace could decrease by 80 %. Under the most pessimistic hypothesis of uninterrupted growth in greenhouse gas emissions (RCP 8.5), the Mer de Glace could disappear before 2100 and the Argentière glacier about twenty years earlier.
We present a parameterized glacier evolution model, with a surface mass balance (SMB) component based on a deep artificial neural network (i.e. deep learning). While most glacier models tend to incorporate more and more physical processes, here we take an alternative approach by creating a parameterized model based on data science. Annual glacier-wide SMBs can be simulated using either deep learning or Lasso (regularized multilinear regression), whereas the glacier geometry is updated using a glacier-specific parameterization. We compare and cross-validate our nonlinear deep learning SMB model against other standard linear statistical methods on a dataset of 32 French alpine glaciers. Deep learning is found to outperform linear methods, with improved explained variance (up to +64% in space and +108% in time) and accuracy (up to +47% in space and +58% in time), resulting in an estimated r² of 0.77 and RMSE of 0.51 m w.e. Substantial nonlinear structures are captured by deep learning, with around 35% of nonlinear behaviour in the temporal dimension. For the glacier geometry evolution, the main uncertainties come from the ice thickness data used to initialize the model. These results should encourage the use of deep learning in glacier modelling as a powerful nonlinear tool, capable of capturing the nonlinearities of the climate and glacier systems, that can serve to reconstruct or simulate SMB time series for individual glaciers at a regional scale for past and future climates.
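The two skill scores reported in this abstract, r² and RMSE, can be computed as in the sketch below. It assumes r² means the coefficient of determination (one minus the ratio of residual to total sum of squares); the abstract does not say which definition was used, and the sample values are invented for illustration.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error, in the same units as the data (e.g. m w.e.)."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def r_squared(observed, simulated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented annual SMB values (m w.e.), for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.0, 3.0, 3.5]

score_rmse = rmse(obs, sim)
score_r2 = r_squared(obs, sim)
```

RMSE keeps the units of the target variable, which is why it is quoted in m w.e., whereas r² is dimensionless and measures the fraction of observed variance that the model explains.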
Global-scale 21st-century glacier mass change projections from six published global glacier models are systematically compared as part of the Glacier Model Intercomparison Project. In total 214 projections of annual glacier mass and area forced by 25 General Circulation Models (GCMs) and four Representative Concentration Pathways (RCP) emission scenarios and aggregated into 19 glacier regions are considered. Global mass loss of all glaciers (outside the Antarctic and Greenland ice sheets) by 2100 relative to 2015 averaged over all model runs varies from 18 ± 7% (RCP2.6) to 36 ± 11% (RCP8.5) corresponding to 94 ± 25 and 200 ± 44 mm sea-level equivalent (SLE), respectively. Regional relative mass changes by 2100 correlate linearly with relative area changes. For RCP8.5 three models project global rates of mass loss (multi-GCM means) of >3 mm SLE per year towards the end of the century. Projections vary considerably between regions, and also among the glacier models. Global glacier mass changes per degree global air temperature rise tend to increase with more pronounced warming indicating that mass-balance sensitivities to temperature change are not constant. Differences in glacier mass projections among the models are attributed to differences in model physics, calibration and downscaling procedures, initial ice volumes and varying ensembles of forcing GCMs.
Glaciers in the European Alps play an important role in the hydrological cycle, act as a source for hydroelectricity and have a large touristic importance. The future evolution of these glaciers is driven by surface mass balance and ice flow processes, of which the latter is to date not included explicitly in regional glacier projections for the Alps. Here, we model the future evolution of glaciers in the European Alps with GloGEMflow, an extended version of the Global Glacier Evolution Model (GloGEM), in which both surface mass balance and ice flow are explicitly accounted for. The mass balance model is calibrated with glacier-specific geodetic mass balances and forced with high-resolution regional climate model (RCM) simulations from the EURO-CORDEX ensemble. The evolution of the total glacier volume in the coming decades is relatively similar under the various representative concentrations pathways (RCP2.6, 4.5 and 8.5), with volume losses of about 47 %-52 % in 2050 with respect to 2017. We find that under RCP2.6, the ice loss in the second part of the 21st century is relatively limited and that about one-third (36.8 % ± 11.1 %, multi-model mean ±1σ) of the present-day (2017) ice volume will still be present in 2100. Under a strong warming (RCP8.5) the future evolution of the glaciers is dictated by a substantial increase in surface melt, and glaciers are projected to largely disappear by 2100 (94.4 ± 4.4 % volume loss vs. 2017). For a given RCP, differences in future changes are mainly determined by the driving global climate model (GCM), rather than by the RCM, and these differences are larger than those arising from various model parameters (e.g. flow parameters and cross-section parameterisation). 
We find that under a limited warming, the inclusion of ice dynamics reduces the projected mass loss and that this effect increases with the glacier elevation range, implying that the inclusion of ice dynamics is likely to be important for global glacier evolution projections.
The largest collection so far of glaciological and geodetic observations suggests that glaciers contributed about 27 millimetres to sea-level rise from 1961 to 2016, at rates of ice loss that could see the disappearance of many glaciers this century.
Despite their importance for sea-level rise, seasonal water availability, and as a source of geohazards, mountain glaciers are one of the few remaining subsystems of the global climate system for which no globally applicable, open source, community-driven model exists. Here we present the Open Global Glacier Model (OGGM), developed to provide a modular and open-source numerical model framework for simulating past and future change of any glacier in the world. The modeling chain comprises data downloading tools (glacier outlines, topography, climate, validation data), a preprocessing module, a mass-balance model, a distributed ice thickness estimation model, and an ice-flow model. The monthly mass balance is obtained from gridded climate data and a temperature index melt model. To our knowledge, OGGM is the first global model to explicitly simulate glacier dynamics: the model relies on the shallow-ice approximation to compute the depth-integrated flux of ice along multiple connected flow lines. In this paper, we describe and illustrate each processing step by applying the model to a selection of glaciers before running global simulations under idealized climate forcings. Even without an in-depth calibration, the model shows very realistic behavior. We are able to reproduce earlier estimates of global glacier volume by varying the ice dynamical parameters within a range of plausible values. At the same time, the increased complexity of OGGM compared to other prevalent global glacier models comes at a reasonable computational cost: several dozen glaciers can be simulated on a personal computer, whereas global simulations realized in a supercomputing environment take up to a few hours per century. Thanks to the modular framework, modules of various complexity can be added to the code base, which allows for new kinds of model intercomparison studies in a controlled environment. 
Future developments will add new physical processes to the model as well as automated calibration tools. Extensions or alternative parameterizations can be easily added by the community thanks to comprehensive documentation. OGGM spans a wide range of applications, from ice–climate interaction studies at millennial timescales to estimates of the contribution of glaciers to past and future sea-level change. It has the potential to become a self-sustained community-driven model for global and regional glacier evolution.
Knowledge of the ice thickness distribution of the world’s glaciers is a fundamental prerequisite for a range of studies. Projections of future glacier change, estimates of the available freshwater resources or assessments of potential sea-level rise all need glacier ice thickness to be accurately constrained. Previous estimates of global glacier volumes are mostly based on scaling relations between glacier area and volume, and only one study provides global-scale information on the ice thickness distribution of individual glaciers. Here we use an ensemble of up to five models to provide a consensus estimate for the ice thickness distribution of all the about 215,000 glaciers outside the Greenland and Antarctic ice sheets. The models use principles of ice flow dynamics to invert for ice thickness from surface characteristics. We find a total volume of 158 ± 41 × 10³ km³, which is equivalent to 0.32 ± 0.08 m of sea-level change when the fraction of ice located below present-day sea level (roughly 15%) is subtracted. Our results indicate that High Mountain Asia hosts about 27% less glacier ice than previously suggested, and imply that the timing by which the region is expected to lose half of its present-day glacier area has to be moved forward by about one decade.
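The conversion from ice volume to sea-level equivalent quoted in this abstract can be approximated with a back-of-the-envelope calculation. The sketch below is not the authors' method: the ice density (900 kg m⁻³), the water density (1000 kg m⁻³) and the ocean area (3.625 × 10⁸ km²) are commonly used round values assumed here, and the ice below sea level is simply removed from the total.

```python
ICE_VOLUME_KM3 = 158e3            # total glacier ice volume from the abstract
FRACTION_BELOW_SEA_LEVEL = 0.15   # "roughly 15%" from the abstract
ICE_DENSITY = 900.0               # kg m^-3, assumed
WATER_DENSITY = 1000.0            # kg m^-3, assumed
OCEAN_AREA_KM2 = 3.625e8          # km^2, assumed global ocean surface area


def sea_level_equivalent_m(ice_volume_km3=ICE_VOLUME_KM3,
                           fraction_below=FRACTION_BELOW_SEA_LEVEL,
                           ice_density=ICE_DENSITY,
                           water_density=WATER_DENSITY,
                           ocean_area_km2=OCEAN_AREA_KM2):
    """Spread the meltwater from ice above sea level evenly over the ocean."""
    ice_above_km3 = ice_volume_km3 * (1.0 - fraction_below)
    water_km3 = ice_above_km3 * ice_density / water_density
    return water_km3 / ocean_area_km2 * 1000.0  # km -> m


sle_m = sea_level_equivalent_m()
print(f"{sle_m:.2f} m")
```

Under these assumptions the result is roughly 0.33 m, close to but not identical with the 0.32 m quoted above. The small difference is expected, since the study's treatment of ice below sea level and its constants are not given in this abstract.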
Conference Paper
The amount of available spatio-temporal data has been increasing as large-scale data collection (e.g., from geosensor networks) becomes more prevalent. This has led to an increase in spatio-temporal forecasting applications using geo-referenced time series data motivated by important domains such as environmental monitoring (e.g., air pollution index, forest fire risk prediction). Being able to properly assess the performance of new forecasting approaches is fundamental to achieve progress. However, the dependence between observations that the spatio-temporal context implies, besides being challenging in the modelling step, also raises issues for performance estimation as indicated by previous work. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures that respect data ordering, using both artificially generated and real-world spatio-temporal data sets. Our results show both CV and OOS reporting useful estimates. Further, they suggest that blocking may be useful in addressing CV’s bias to underestimate error. OOS can be very sensitive to test size, as expected, but estimates can be improved by careful management of the temporal dimension in training. Code related to this paper is available at:
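The idea of "blocking" mentioned in this abstract can be sketched as follows: instead of assigning observations to folds at random, each test fold is a contiguous block of time-ordered samples, so that strongly correlated neighbours are less likely to be split between training and test sets. The function name and the fold layout are illustrative assumptions; the paper compares several variants that are not reproduced here.

```python
def blocked_folds(n_samples, n_blocks):
    """Split indices 0..n_samples-1 into contiguous test blocks.

    Returns a list of (train_indices, test_indices) pairs, one per block.
    Samples are assumed to be ordered in time (or along one spatial axis).
    """
    base, extra = divmod(n_samples, n_blocks)
    folds = []
    start = 0
    for k in range(n_blocks):
        size = base + (1 if k < extra else 0)  # earlier blocks absorb the remainder
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train_idx, test_idx))
        start = stop
    return folds


# Ten time-ordered samples split into three contiguous blocks.
folds = blocked_folds(10, 3)
for train_idx, test_idx in folds:
    print("test block:", test_idx)
```

The same principle underlies leave-one-year-out or leave-one-glacier-out schemes, where a whole year or a whole glacier is held out as one block rather than individual observations.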