
Forecasting Lake Nokoué Water Levels Using Long Short-Term Memory Network

Abstract

The prediction of hydrological flows (rainfall-depth or rainfall-discharge) is becoming increasingly important in the management of hydrological risks such as floods. In this study, the Long Short-Term Memory (LSTM) network, a state-of-the-art algorithm dedicated to time series, is applied to predict the daily water level of Lake Nokoué in Benin. This paper aims to provide an effective and reliable method capable of reproducing the future daily water level of Lake Nokoué, which is influenced by a combination of two phenomena: rainfall and river flow (runoff from the Ouémé River, the Sô River, the Porto-Novo lagoon, and the Atlantic Ocean). Performance analysis based on the forecasting horizon indicates that LSTM can predict the water level of Lake Nokoué up to a forecast horizon of t+10 days. Performance metrics such as Root Mean Square Error (RMSE), coefficient of determination (R²), Nash-Sutcliffe Efficiency (NSE), and Mean Absolute Error (MAE) agree on a forecast horizon of up to t+3 days. The values of these metrics remain stable for forecast horizons of t+1 days, t+2 days, and t+3 days. The values of R² and NSE are at least 0.97 during the training and testing phases in the Lake Nokoué basin. Based on the evaluation indices used to assess the model's performance, the forecast horizon of t+3 days is chosen for predicting future daily water levels in the Lake Nokoué basin.
Article Not peer-reviewed version
Namwinwelbere Dabire * , Eugene C. Ezin , Adandedji M. Firmin
Posted Date: 20 March 2024
doi: 10.20944/preprints202403.1218.v1
Keywords: forecasting; machine learning algorithms; recurrent artificial neural network; Lake Nokoué
Copyright: This is an open access article distributed under the Creative Commons
Attribution License which permits unrestricted use, distribution, and reproduction in any
medium, provided the original work is properly cited.
Namwinwelbere Dabire 1,2,*, Eugene C. Ezin 3 and Adandedji M. Firmin 4
1 Institut National de l’Eau (INE), Centre d’Excellence d’Afrique pour l’Eau et l’Assainissement (C2EA),
Université d’Abomey Calavi (UAC), Bénin, namwinwelbere@gmail.com
2 Ecole Doctorale des Sciences de l’Ingénieur (ED-SDI), Université d’Abomey Calavi, Bénin,
namwinwelbere@gmail.com
3 Institut de Formation et de Recherche en Informatique (IFRI), Université d’Abomey Calavi, Bénin,
eugene.ezin@uac.bj
4 Laboratoire d’Hydrologie Appliquée (LHA), Institut National de l’Eau, Université d’Abomey Calavi,
Bénin, firminelite@gmail.com
* Correspondence: namwinwelbere@gmail.com
1. Introduction
Lake Nokoué is at the center of important socio-economic and ecological issues in Benin. The lake
hosts lacustrine villages and is bordered by three major urban centers (Cotonou, Abomey-Calavi, and
Sèmè-Podji). Planning and implementing flood management strategies therefore require a deep
understanding of the processes involved in the physical dynamics of Lake Nokoué, of which water
level variation is a key parameter. This variation, which can occasionally lead to floods with dramatic
repercussions on local populations, is primarily influenced by (i) ocean tides, (ii) hydrological
variability of the watershed, and (iii) direct contributions from precipitation. These floods are linked
to major river basin floods and the regular overflow of Lake Nokoué. Because of its complex hydrological
configuration and the non-linearity of the explanatory variables, Lake Nokoué is challenging to
represent with conceptual hydrodynamic models [1]. A high water level is associated with a
strong freshwater inflow from the Sô river and the Ouémé river, while a low water level is
associated with periods when saltwater from the ocean enters Lake Nokoué. Hydrological prediction
models have long been devoted to forecasting river discharge for the proper management of water
Preprints.org (www.preprints.org) | NOT PEER-REVIEWED | Posted: 20 March 2024 doi:10.20944/preprints202403.1218.v1
© 2024 by the author(s). Distributed under a Creative Commons CC BY license.
resource systems. As a result, there is a vast body of literature on the development and application
of a wide range of methods for predicting river flow, which is primarily governed by rainfall. Two types
of models can be identified [2,3]: (a) physical models apply deterministic equations to a set of input
variables (such as physiographic characteristics or precipitation) to obtain the desired river flow values,
while (b) stochastic (statistical) models represent hydrological phenomena probabilistically, taking into
account the uncertainty of observed data and non-linearity. However, physical models are subject to
considerable uncertainties, such as the lack of data on the physical processes of the hydrological
system and the limited scientific knowledge of natural systems like water bodies. These limitations
degrade the quality of forecasts, especially as domain experts require accurate results with minimal
computational time for optimal decision-making. According to [4], these techniques are limited by
the understanding of flood dynamics (variations in flood wave propagation time in hydrographic
networks) or by the different types of hydrological fluxes that influence water bodies.
Unlike physical models, stochastic models function as black boxes on observational
data without any consideration of the internal structure of the system [5-10]. Furthermore, stochastic
models adapt to the non-linearity of hydrological processes and address uncertainties in parameter
estimation [11,12]. They also offer various techniques for flood
estimation, ranging from simple regression of discharge to detailed modeling of hydrological
processes. However, stochastic models are criticized for being more data-intensive, requiring in-situ
observation data to ensure reliable predictions [13,14]. A variety of stochastic models have been
proposed for hydrological flow prediction. Two major categories can be distinguished: time series
models and regression models. The former is primarily based on modeling the autocorrelation
structure of hydrological flows (water level or discharge), while the latter focuses on the correlation
between input and output variables regardless of the temporal structure [15,16]. According to [17],
typical input variables such as precipitation forecasts are used to predict the output
variable (future water levels of water bodies or future streamflow). Some regression models, such as
linear regression, principal component regression, partial least squares regression, and wavelet
regression, are commonly used for long-term forecasts [18]. However, these models only forecast
water levels at large time scales (typically seasonal or annual). They provide no information
about the shape of the water level curve during these long periods and therefore cannot indicate when the
water level will reach a critical level that could lead to disasters. There are also a number of regression
models, such as highly non-linear recurrent neural networks represented by Long Short-Term
Memory (LSTM), the generic recurrent cell or standard Recurrent Neural Network (RNN), and the Gated
Recurrent Unit (GRU), that are widely used to predict hydrological flows [19-21]. However, these
models can only model and predict a few days (or data points) at a time, which is useful for detecting
future extremes but not for forecasting the overall trends of the water level evolution process. In
particular, LSTM has proven powerful for hydrological forecasting, which is beneficial for managing
hydrological extremes such as floods [22].
LSTMs belong to the category of stochastic models that learn features to extract useful information
from sequential data for future predictions of hydrological behavior, but without any understanding
of the internal structure of the hydrographic basin. This technique has been successfully applied in
hydrological modeling and has shown significant computational power in several studies of
hydrological flow modeling, such as streamflow forecasting, yielding more accurate results [23-25].
Despite the importance of monitoring the water level of Lake Nokoué and the use of conceptual
models to explain its fluctuations, the inefficiency of these conceptual models in accounting for the
non-linearity of variables, coupled with the complexity of this water body, remains a major challenge
in implementing appropriate flood management measures.
To address the challenges posed by the non-linearity of variables coupled with the complexity
of Lake Nokoué and to guide decision-making in the implementation of prevention plans, the study
aims to apply LSTM to Lake Nokoué water level prediction. LSTM belongs to the category of
recurrent neural network methods that not only solve the gradient instability problem but also
address the issue of preserving information over long sequences in time series data. This model will
be run at different forecast horizons to identify the suitable horizon for predicting the water level
of Lake Nokoué.
The contributions of this study are the following:
(i) It is a first attempt to apply artificial intelligence models to a complex water body in order to
assess the performance of these models in establishing the nonlinear relationship between input
variables and the output.
(ii) It proposes and implements a recurrent neural network model that leverages the set of
input variables to predict the water level of Lake Nokoué.
2. Materials and Methods
2.1. Study Area
Lake Nokoué (Figure 1) is located in the southeast of Benin, at approximately 6°25'N and 2°36'E, covering
an area that varies between 150 km² and 170 km² during the low-water and high-water
periods, respectively [26-29]. It stretches for approximately 20 km from east to west along the
coast and 11 km from south to north, as confirmed by multiple authors such as [27,28]. The average
and maximum depths of the lake are approximately 1.3 m and 2.9 m, respectively. Towards the
Cotonou channel, the lake deepens, and the average and maximum depths reach around 3 m
and 8 m, respectively. Two rivers flow into Lake Nokoué: on its northern bank, the Sô-Ava river,
which drains a watershed area of approximately 10,000 km², and the Ouémé river, the largest river in
Benin, which drains a watershed area of approximately 50,000 km² [26,28]. The Djonou
river, with a smaller extent and flow, also contributes to the freshwater input in the southwestern
part of Lake Nokoué. In the southern part, Lake Nokoué is connected to the Atlantic Ocean through
the Cotonou channel, which is 280 m wide and approximately 4 km long [26,29]. Through
this channel, constructed in 1885, exchanges of freshwater and saltwater occur in accordance with the
tides and hydrological regime [1]. The canal of Tochè, approximately 4 km long, connects the Porto-Novo
lagoon, with an area of approximately 35 km², to Lake Nokoué on the eastern side, with little
effect on the dynamics of Lake Nokoué. At a seasonal scale, the hydrological regime of Lake Nokoué
is determined by the West African summer monsoon, resulting in two rainy seasons and two dry
seasons [1]. These seasons are linked to the north-south movement of the intertropical convergence
zone and the associated belt of intense tropical rainfall. The main rainy season extends from April-May
to the end of July, when the intertropical convergence zone moves northward from its southern
position near the equator. The second rainy season, shorter and less intense, occurs from late
September to November, when the intertropical convergence zone migrates southward from its
northernmost position. However, this dual rainy season, along with local precipitation, has only a
weak influence on the water level of Lake Nokoué. The lake is more influenced by the hydrology
of the central part of Benin, where the main basin of the Ouémé river is located [29]. This region is
characterized by a single rainy season with peak precipitation occurring between July and October
[29,30]. The period from September to October is when the maximum flow enters Lake Nokoué.
Figure 1. Location map of the study area.
2.2. Data Acquisition
The data used in this study include a time series of daily water level observations capturing
flood-recession events, a series of rainfall observations, and a series of discharge observations
provided by the National Meteorological Agency of Benin (METEO-Benin) and the General
Directorate of Water of Benin (DGEau).
2.3. Data Preprocessing
Preprocessing of the variables is necessary to account for the full range of values (both low and
high) and to avoid sigmoid saturation by the high values in the database [31]. Indeed, applying the
sigmoid function directly to the weighted sums of the raw rainfall-discharge inputs causes the
information carried by low numerical values (rainfall and water levels) to be neglected relative to high
numerical values (discharge). This preprocessing step normalizes all values in the database
to the range -1 to 1, using Equation 1:

 (1)
where X_i is the actual value to be normalized, X_mean is its mean, and X_normalized is the normalized value.
This transformation scales the input data to the interval [-1, 1].
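The normalization step can be illustrated with a short Python sketch. The text defines the transformation only through X_i and X_mean and states that the result lies in [-1, 1]; the divisor used below (the range, max minus min) and the function name mean_normalize are assumptions made for illustration, not a form confirmed by the source.

```python
def mean_normalize(values):
    """Mean-centred scaling: (x - mean) / (max - min).

    Assumption: the range (max - min) is used as the divisor so that every
    scaled value lies within [-1, 1]; the source only names X_i and X_mean.
    """
    values = list(values)
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mean) / spread for x in values]


# Illustrative discharge-like values (hypothetical, not taken from the dataset).
discharge = [0.61, 8.2, 25.6, 376.9, 1064.0]
scaled = mean_normalize(discharge)
```

Because the mean always lies between the minimum and the maximum, every scaled value stays inside [-1, 1], so large discharge values no longer dominate the small rainfall and water level values.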
2.4. Structure of the Long Short-Term Memory Model
The Long Short-Term Memory (LSTM) cell is an enhancement of the optimization proposed in
[32]. Considered as a black box, the LSTM cell, widely used in time series forecasting, is an
architecture of artificial recurrent neural network (RNN) capable of memorizing the temporal
order of data. The two main problems of the standard RNN architecture are gradient instability and
its inability to retain information from long sequences of temporal data. The LSTM overcomes both
issues through its cell state and gates [33]. As a deep learning predictive model, the LSTM cell receives the latent
states from the previous step and has a self-evaluation mechanism that offers better performance in
time series forecasting. The internal structure of the LSTM consists of three main gates that control
the flow of information: (i) forgetting unwanted information in the current cell state through the
forget gate (f_t), (ii) adding new information to the current cell state through the input gate, also known
as the temporal attention module (i_t), and (iii) producing an output from the current cell state through
the output gate (o_t). These gates perform specific operations on the cell states. The state of the LSTM
network is divided into two states: h_t and c_t. The hidden state h_t is considered the
short-term memory of the network, while the cell state c_t is considered its long-term memory. The
operations performed within the LSTM cells help the model retain information from sequential data.
The LSTM network uses cells as memory units for the model. The gates, as shown in Figure 2 and
described in Equations 2 to 7, determine the data to be carried.
a) The forget gate
The forget gate, f_t, determines how much information from the previous time step
should be carried forward and is the essence of the LSTM architecture: it sets the amount of
memory preserved from the previous cell state, c_{t-1}. In Equation 2, the previous hidden state,
h_{t-1}, and the current input, x_t, are passed through the sigmoid activation function.
Information associated with 0 is forgotten, while information associated with 1 continues to be
carried through the cell state.
$f_t = \sigma(w_{fx} x_t + w_{fh} h_{t-1} + b_f)$  (2)
Where:
σ is the sigmoid activation function;
w_{fh} and w_{fx} are the weight matrices of the forget gate, f_t, for its connections to the
previous hidden state h_{t-1} and to the input vector x_t, respectively;
b_f is the bias term of the forget gate.
b) Long-term state
The long-term state, c_{t-1}, flows through the network from left to right. It first passes
through the forget gate, which discards certain information; new information, selected by the input
gate, is then added through an addition operation, and the resulting c_t is sent
on without any further transformation. Therefore, at each time step, some information is removed and
new information is added. The update of c_t is illustrated in Figure 2 and
Equations 3 and 4:
$c_t = f_t \otimes c_{t-1} + i_t \otimes g_t$  (3)
$g_t = \tanh(w_{gx} x_t + w_{gh} h_{t-1} + b_g)$  (4)
Where:
⊗ denotes element-wise multiplication;
w_{gh} and w_{gx} are the weight matrices of the main layer g_t for its connections to the
previous short-term state h_{t-1} and to the input vector x_t, respectively;
b_g is the bias term of the main layer.
c) The input gate
The input gate, i_t, in Equation 5 determines which parts of the main layer, g_t, are added to the
long-term state. The previous and current information is updated based on the result of the sigmoid
operation (σ). Information associated with 0 is considered trivial, while information associated with
1 is deemed essential. Additionally, the hyperbolic tangent activation function (tanh), which
compresses data between -1 and 1, is used to regulate the network. The outputs of the sigmoid and
tanh functions are then multiplied to select the information that will be updated.
$i_t = \sigma(w_{ix} x_t + w_{ih} h_{t-1} + b_i)$  (5)
Where:
w_{ih} and w_{ix} are the weight matrices of the input gate for its connections to the
short-term state h_{t-1} and to the input vector x_t, respectively;
b_i is the bias term of the input gate.
d) The output gate
In Equations 6 and 7, the output gate, o_t, is also used for inference and determines which parts
of the long-term state should be read and output at this time step, both in h_t and in y_t. After the
addition operation, the long-term state is copied and passed through the hyperbolic tangent
function (tanh), and the result is filtered by the output gate. This yields the short-term state, h_t,
which is equal to the output y_t of the LSTM cell.
$o_t = \sigma(w_{ox} x_t + w_{oh} h_{t-1} + b_o)$  (6)
$y_t = h_t = o_t \otimes \tanh(c_t)$  (7)
Where:
w_{oh} and w_{ox} are the weight matrices of the output gate for its connections to the
short-term state h_{t-1} and to the input vector x_t, respectively;
b_o is the bias term of the output gate;
h_t is the output of the hidden layer at time step t.
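The gate operations described in a) to d) can be combined into a single time step. The following pure-Python sketch is illustrative only: the helper names (affine, lstm_step), the layout of the params dictionary, and the all-zero toy weights are assumptions made for readability, not the authors' implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w_x, w_h, b, x_t, h_prev):
    """Row-wise w_x @ x_t + w_h @ h_prev + b for plain Python lists."""
    out = []
    for row_x, row_h, bias in zip(w_x, w_h, b):
        total = bias
        total += sum(w * x for w, x in zip(row_x, x_t))
        total += sum(w * h for w, h in zip(row_h, h_prev))
        out.append(total)
    return out


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following the gate description (Equations 2 to 7)."""
    f_t = [sigmoid(z) for z in affine(*params["f"], x_t, h_prev)]    # forget gate
    g_t = [math.tanh(z) for z in affine(*params["g"], x_t, h_prev)]  # main layer
    i_t = [sigmoid(z) for z in affine(*params["i"], x_t, h_prev)]    # input gate
    o_t = [sigmoid(z) for z in affine(*params["o"], x_t, h_prev)]    # output gate
    # Long-term state: forget part of the old state, then add gated candidates.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Short-term state, which is also the cell output y_t.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy setup: 2 inputs (e.g. scaled rainfall and discharge), 1 hidden unit.
# All-zero weights are a hypothetical choice that makes the result easy to check.
zero = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero for name in ("f", "g", "i", "o")}
h_t, c_t = lstm_step([0.2, -0.1], [0.0], [1.0], params)
```

With zero weights every gate equals sigmoid(0) = 0.5 and the main layer equals tanh(0) = 0, so half of the previous long-term state is kept and nothing new is added, which makes the roles of the forget and input gates easy to see.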
Figure 2. Internal architecture of an LSTM cell (fully connected layer).
2.5. Long Short-Term Memory Model Configuration
In this work, optimizing the performance of the LSTM model involves selecting the input
variables, determining the appropriate network architecture, optimizing network learning, and using
a reliable validation methodology. The LSTM network, as explained earlier, consists of an input layer,
a single hidden layer, and an output layer, with sigmoid activation functions for the artificial neurons
and hyperbolic tangent for the hidden states. The parameters of the model's learning
algorithm, such as the number of hidden neurons, the optimization function, the number
of iterations, and the batch size, are selected using the random search
cross-validation method (randomSearchCV), during which different
combinations of inputs (rainfall, discharge, water level) are tested and evaluated. The preprocessed
database is divided into two parts:
(i) the training part (80%), the larger of the two, which is used to learn the system's dynamics;
(ii) the testing part (20%), which guards against overfitting by tracking the evolution of the loss
function during training and validation. Once training is stopped, the weights of the
best-performing model are saved. The validation dataset then allows for
confirmation of the LSTM model's performance.
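The 80/20 partition and the notion of a forecast horizon can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names make_windows and chronological_split, the lookback length of 5 days, and the toy series are hypothetical, and the source does not say how the input windows were built.

```python
def make_windows(inputs, target, lookback, horizon):
    """Build (X, y) pairs: `lookback` past days of inputs -> target `horizon` days ahead."""
    X, y = [], []
    last_start = len(target) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback            # first index after the input window
        X.append(inputs[start:end])
        y.append(target[end + horizon - 1])
    return X, y


def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling so the test block follows the training block in time."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


# Hypothetical toy series: 20 days of (rainfall, discharge) pairs and water levels.
inputs = [(float(d % 3), 10.0 + d) for d in range(20)]
levels = [3.0 + 0.01 * d for d in range(20)]

# Forecast horizon t+3 with an assumed lookback of 5 days.
X, y = make_windows(inputs, levels, lookback=5, horizon=3)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Keeping the split chronological, rather than shuffling, is a design choice that avoids leaking future observations into the training part of a time series.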
2.6. Model Performance Assessment
To ensure synchronization between the observed flood and the one estimated by the LSTM
model, we rely on a quantitative evaluation of the LSTM model using various evaluation criteria. There
is a wide range of performance evaluation criteria for hydrological models proposed by the World
Meteorological Organization and other authors [9,31,32,34]. To ensure the reliability of the LSTM
model results in this study, we used the four most relevant metrics (Equations 8 to 11): the
Nash-Sutcliffe Efficiency (NSE, the Nash criterion), the Root Mean Square Error (RMSE), the Mean
Absolute Error (MAE), and the coefficient of determination (R²).
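$\mathrm{NSE} = 1 - \dfrac{\sum_{i=1}^{N} (O_i - P_i)^2}{\sum_{i=1}^{N} (O_i - \bar{O})^2}$  (8)
$\mathrm{RMSE} = \sqrt{\dfrac{1}{N} \sum_{i=1}^{N} (O_i - P_i)^2}$  (9)
$\mathrm{MAE} = \dfrac{1}{N} \sum_{i=1}^{N} \left| O_i - P_i \right|$  (10)
$R^2 = \dfrac{\left( \sum_{i=1}^{N} (O_i - \bar{O})(P_i - \bar{P}) \right)^2}{\sum_{i=1}^{N} (O_i - \bar{O})^2 \, \sum_{i=1}^{N} (P_i - \bar{P})^2}$  (11)
where O_i and P_i are the observed and predicted water levels at time step i, \bar{O} and \bar{P} are their
means, and N is the number of observations.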
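The four criteria can be computed directly from paired observed and predicted series. The sketch below uses the standard textbook definitions of NSE, RMSE, and MAE; taking R² as the squared Pearson correlation is an assumption, and the function names and sample values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    o_bar = _mean(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((o - o_bar) ** 2 for o in observed)
    return 1.0 - num / den


def rmse(observed, predicted):
    return math.sqrt(_mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def mae(observed, predicted):
    return _mean([abs(o - p) for o, p in zip(observed, predicted)])


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    o_bar, p_bar = _mean(observed), _mean(predicted)
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(observed, predicted))
    var_o = sum((o - o_bar) ** 2 for o in observed)
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical water levels in metres, for illustration only.
observed = [3.0, 3.1, 3.3, 3.6, 3.9]
predicted = [3.0, 3.2, 3.3, 3.5, 3.9]
scores = {
    "NSE": nse(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "R2": r_squared(observed, predicted),
}
```

NSE and R² approach 1 for a good fit, while RMSE and MAE are expressed in the unit of the water level and approach 0.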
3. Results and Discussion
3.1. Variable Selection and Statistics
3.1.1. Selection of the Variables
The variables selected for the model following forward feature selection are presented in Figure
3 and Figure 4. Rainfall, discharge of the Ouémé river, and water level in Lake Nokoué were the
predominant factors for Lake Nokoué. Rainfall and discharge were selected as inputs
for the model, and the output was the water level of Lake Nokoué.
Figure 3. Water level of lake Nokoue (Output variable).
Figure 4. Rainfall and discharge (selected input variables).
3.1.2. Statistics of the Variables
The descriptive statistics of the variables are given in Table 1. The discharge variable
has the largest values, with a mean of 219.113 and a maximum of 1064.000.
The variation in the water level of Lake Nokoué is strongly influenced by rainfall [37,38].
Table 1. Statistics of all variables.
statistics    rainfall    discharge    water level
mean          3.392       219.113      3.173
std           10.852      318.670      0.195
min           0.000       0.610        2.736
25%           0.000       8.225        3.045
50%           0.000       25.590       3.115
75%           0.200       376.900      3.257
max           158.200     1064.000     3.981
3.2. Model Performance Analysis
The evaluation of a model’s performance is an integral step in modeling. In this study, the RMSE,
NSE, R², and MAE were used to assess model accuracy. The resulting scores
for the LSTM model are shown in Table 2 for different forecast horizons. Across the prediction
horizons from t+1 day to t+10 days, the values of the performance criteria show a nearly stable pattern
during both phases, and they remain essentially constant for the forecast horizons
of t+1 days, t+2 days, and t+3 days (Table 2). Additionally, the R² value is at least 0.97 in the
calibration phase and at least 0.96 in the validation phase. These satisfactory results demonstrate
that the LSTM model performs well in both periods (calibration phase, also known as the learning
phase, and validation phase, also known as the testing phase). It is capable of reliably reproducing
the observed water levels in the Lake Nokoué basin up to a forecast horizon of t+10 days. This enables
flood events to be anticipated well in advance.
Table 2. Performance metric values for different prediction horizons.
                    Training (calibration)          Testing (validation)
Forecast horizon    RMSE   NSE    R²     MAE        RMSE   NSE    R²     MAE
t+1 day             0.03   0.98   0.98   0.02       0.04   0.97   0.97   0.03
t+2 days            0.03   0.98   0.98   0.02       0.04   0.97   0.97   0.03
t+3 days            0.03   0.98   0.98   0.02       0.04   0.97   0.97   0.02
t+4 days            0.03   0.94   0.98   0.02       0.03   0.96   0.97   0.02
t+5 days            0.03   0.98   0.98   0.02       0.03   0.97   0.97   0.02
t+10 days           0.03   0.92   0.97   0.02       0.04   0.90   0.96   0.03
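One way to read Table 2 when choosing a horizon is to keep the longest run of leading horizons whose score stays close to the t+1 value. The sketch below applies this rule to the NSE values as read from Table 2 (training values 0.98, 0.98, 0.98, 0.94, 0.98, 0.92 and testing values 0.97, 0.97, 0.97, 0.96, 0.97, 0.90). The rule itself, the tolerance of 0.005, and the function name stable_horizon are assumptions for illustration; the text does not state the exact selection procedure.

```python
def stable_horizon(scores, tolerance=0.005):
    """Longest run of leading horizons whose score stays within `tolerance` of the first."""
    horizons = sorted(scores)
    reference = scores[horizons[0]]
    chosen = horizons[0]
    for h in horizons:
        if abs(scores[h] - reference) > tolerance:
            break
        chosen = h
    return chosen


# NSE values as read from Table 2 (forecast horizon in days -> NSE).
nse_training = {1: 0.98, 2: 0.98, 3: 0.98, 4: 0.94, 5: 0.98, 10: 0.92}
nse_testing = {1: 0.97, 2: 0.97, 3: 0.97, 4: 0.96, 5: 0.97, 10: 0.90}

# Keep the horizon that is stable in both phases.
horizon = min(stable_horizon(nse_training), stable_horizon(nse_testing))
```

Under these assumptions the rule stops at t+3 days in both phases, because the NSE drops at t+4 days, which is consistent with the horizon retained in the study.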
The adaptability of LSTM to the Lake Nokoué basin during the calibration and validation periods
is also supported by the convergence of the loss function during both phases, as shown in
Figure 5. Similarly, Ling et al. [39] reported that the method proposed in their
article was more effective than other artificial recurrent neural network methods. The loss
function values decrease with the number of iterations and stabilize below 0.1, indicating that the
model is well optimized during the calibration and validation phases. Given the
architecture of the LSTM model, Hu et al. [40] noted a tendency of the LSTM
model towards a local optimum. The hydrographs of observed and predicted water levels by the
LSTM model during the calibration and validation phases are depicted in Figures 6a and 6b. It is
evident that the observed and predicted floods and recessions are nearly identical. This
synchronization between observed and estimated floods demonstrates the LSTM model's ability to
reproduce critical water levels that could lead to flooding.
Recently, a number of studies have addressed streamflow forecasting using recurrent neural
network models. Thapa et al. [41] developed a deep learning long short-term memory (LSTM)-based
model in the Himalayan basin for snowmelt-based discharge modeling. Ni et al. [42] applied the deep
learning method for daily flow simulation, and used data from previous years for flow prediction.
The model was evaluated from several perspectives. At the end of the study, it was found
that the LSTM model was advantageous in processing constant flow data in the dry season and gave
satisfying results in capturing data features in rapidly fluctuating flow data in rainy seasons. Luo et
al. [43] built a new hybrid model based on the long short-term memory approach for predicting
streamflow. In that study, the linear regression model, one of the classical methods, was used
as a benchmark to assess the performance of the hybrid model.
The satisfactory forecasting results are further evident in Figures 7a, 7b, 8a, and 8b, which present
scatter plots of predicted (estimated) water levels against observed water levels. The linear trend line
of the scatter plots during the training (calibration) and testing (validation) phases highlights the
strong correlation between observed and predicted water levels, with an R² of 0.98
during calibration and 0.97 during validation (Table 2). This linear alignment, particularly for water
levels between 3.5 meters and 3.9 meters, indicates that the LSTM model accurately predicts extreme
water levels in Lake Nokoué that could lead to flooding. Figures 7b and 8b illustrate a good
distribution of the LSTM model's residual errors, with values ranging from -0.1 to 0.4 during the
training phase and -0.5 to 0.1 during the testing phase. Negative residual errors indicate
overestimation of water levels, while positive residual errors indicate underestimation of water levels
in Lake Nokoué by the LSTM model. These underestimations and overestimations remain acceptable
for predicting extreme water levels in Lake Nokoué.
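The sign convention for the residual errors implies that a residual is computed as the observed value minus the predicted value. The short sketch below makes this explicit; the function name classify_residuals and the sample water levels are hypothetical.

```python
def classify_residuals(observed, predicted):
    """Residual = observed - predicted; negative means the model overestimated."""
    labels = []
    for o, p in zip(observed, predicted):
        residual = o - p
        if residual < 0:
            labels.append((residual, "overestimation"))
        elif residual > 0:
            labels.append((residual, "underestimation"))
        else:
            labels.append((residual, "exact"))
    return labels


# Hypothetical water levels in metres.
result = classify_residuals([3.5, 3.8, 3.2], [3.6, 3.7, 3.2])
```

In this toy case the first prediction is too high (negative residual) and the second is too low (positive residual), matching the interpretation given for Figures 7b and 8b.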
Figure 5. Comparison of the loss function during the calibration and validation phase.
Figure 6. (a) Combined training and testing phase of the LSTM model; (b) Separated training and
testing phase of the LSTM model.
Figure 7. (a) Comparison between observed water levels and water levels predicted by the LSTM
model during the calibration phase; (b) correlation and residual error during the calibration phase.
Figure 8. (a) Comparison between observed water levels and water levels predicted by the LSTM
model during the validation phase; (b) correlation and residual error during the validation phase.
4. Conclusion
Our approach offers a tool for water level assessment at the scale of Lake Nokoué. This can help
decision-makers implement appropriate preventive and adaptive measures, thereby
contributing to more effective flood management. It is important to note that LSTM-based prediction
models are not infallible, because the LSTM model does not consider the initial hydrological
conditions; it relies on the available training data and underlying assumptions. Therefore, it is essential
to continuously monitor the model's performance and update the training data if necessary.
Additionally, the expertise of forecasters, combined with the analysis of the model's results, is crucial
for interpreting and properly using the generated predictions. In terms of future prospects, the
study suggests applying the LSTM-based prediction method to other lakes and rivers to improve
water level forecasting. Exploring hybrid machine-learning-based prediction
methods is also recommended to improve the accuracy of forecasts. Furthermore, it is important to
continue collecting data on water levels, precipitation, and river flow rates to enhance the quality of
forecasts.
Acknowledgments: This work is supported in part by the World Bank and the French Development Agency
through the “Centre d’Excellence pour l’Eau et l’Assainissement en Afrique (C2EA)” of the University of
Abomey-Calavi in Benin. The authors would like to thank the reviewers for their constructive comments, which have
certainly improved the quality and readability of the article.
Conflicts of interest: The authors declare that there are no conflicts of interest.
References
1. Chaigneau, A., Okpeitcha, O. V., Morel, Y., Stieglitz, T., Assogba, Morgane, B., Allamel, P., Honfos, J.,
Thierry Derol Awoulmbang, S., Retif, F., Duhaut, T., Peugeot, C. From seasonal flood pulse to seiche: Multi-
frequency water-level fluctuations in a large shallow tropical lagoon (Nokoue Lagoon, Benin).ecss 2022,
267, 107-767.
2. Ngoc, D. V. Deterministic hydrological modeling for flood risk assessment and climate change in large
catchment: Application to Vu Gia Thu Bon catchment, Vietnam. Ph.D, Université Nice Sophia Antipolis,
2015.
3. Rebolho, C. Modélisation conceptuelle de l’aa inondation à l’échelle du bassin versant. Hydrology thesis
doctorate, Ph.D, 2018.
4. Golob, R., Štokelj, T., Grgič, D. Neural-network-based water inflow forecasting. Control Engineering Practice
1998, 6(5), 37-98.
5. Ancona, M., Corradi, N., Dellacasa, A., Delzanno, G., Dugelay, J.-L., Federici, B., Gourbesville, P., Guerrini,
G., La Caméra, A., Rosso, P., Stephens, J., Tacchella, A., Zolezzi, G. On the Design of an Intelligent Sensor
Network for Flash Flood Monitoring, Diagnosis and Management in Urban Areas Position Paper. PCS
2014, 32, 941-946.
6. Chen, L., & Singh, V. P. Flood forecasting and error simulation using copula entropy method. In P. Sharma
& D. Machiwal (Éds.), Advances in Streamflow Forecasting 2021,6, 331-368.
7. Chu, H., Wu, W., Wang, Q. J., Nathan, R., Wei, J. An ANN-based emulation modeling framework for flood
inundation modeling: Application, challenges and future direction. envsoft 2019, 19, 104-587.
8. Audrey Bornancin Plantier. Conception de modèles de prévision des crues éclair par apprentissage
artificiel, informatic thesis doctorate. Universi Pierre et Marie Curie, Paris, 2013.
9. Kharroubi, O., Blanpain, O., Masson, E., Lallahem, S. Application du seau des neurones artificiels à la
prévision des bits horaires: Cas du bassin versant de lEure, France. Hydrological Sciences Journal 2016,
61(3), 933-225.
10. Peredo, D., Ramos, M.-H., Andréassian, V., Oudin, L. Investigating hydrological model versatility to
simulate extreme flood events. Hydrological Sciences Journal 2022, 67(4), 628-645.
11. Modeste Meliho. Spatial prediction of flood susceptible zones in the Ourika watershed of Morocco using
machine learning algorithms. Aci 2022, 09,2021-0264.
12. Noor, F., Haq, S., Rakib, M., Ahmed, T., Jamal, Z., Siam, Z. S., Hasan, R. T., Adnan, M. S. G., Dewan, A., &
Rahman, R. M. Water Level Forecasting Using Spatiotemporal Attention-Based Long Short-Term Memory
Network. Water 2022,14(4), 4040-612
Preprints.org (www.preprints.org) | NOT PEER-REVIEWED | Posted: 20 March 2024 doi:10.20944/preprints202403.1218.v1
12
13. Alliau, D., De Saint Seine, J., Lang, M., Sauquet, E., Renard, B. Étude du risque d’inondation d’un site
industriel par des crues extrêmes: De l’évaluation des valeurs extrêmes aux incertitudes hydrologiques et
hydrauliques. La Houille Blanche 2015, 101(2), 67-74.
14. Morel, Y., Chaigneau, A., Okpeitcha, V. O., Stieglitz, T., Assogba, A., Duhaut, T., Rétif, F., Peugeot, C., &
Sohou, Z. Terrestrial or oceanic forcing ? Water level variations in coastal lagoons constrained by river
inflow and ocean tides. Hal open science 2022, 169, 104-309.
15. Fathian, F. Introduction of multiple/multivariate linear and nonlinear time series models in forecasting
streamflow process. In P. Sharma & D. Machiwal ds.), ASF 2021,12, 87-113
16. Wang, L., Dong, H., Cao, Y., Hou, D., Zhang, G. Real-time water quality detection based on fluctuation
feature analysis with the LSTM model. Journal of Hydroinformatics 2023, 5, 127-305
17. Masselot, P., Dabo-Niang, S., Chebana, F., Ouarda, T. B. M. J. Streamflow forecasting using functional
regression. J.Hydrol 2016, 538, 754-766.
18. Luo, X., Yuan, X., Zhu, S., Xu, Z., Meng, L., Peng, J. A hybrid support vector regression framework for
streamflow forecast. Journal of Hydrology 2019, 568, 184-193.
19. Douvinet, J., Serra-Llobet, A., Radke, J., Kondolf, M. Quels enseignements tirer des coulées de débris post-
incendie survenues le 9 janvier 2018 à Montecito (Californie, USA)? La Houille Blanche 2020, 106(6), 25-35.
20. Lang, M., Arnaud, P., Carreau, J., Deaux, N., Dezileau, L., Garavaglia, F., Latapie, A., Neppel, L., Paquet,
E., Renard, B., Soubeyroux, J.-M., Terrier, B., Veysseire, J.-M., Aubert, Y., Auffray, A., Borchi, F., Bernardara,
P., Carre, J.-C., Chambon, D., Tramblay, Y. Résultats du projet ExtraFlo (ANR 2009-2013) sur l’estimation
des pluies et crues extrêmes. La Houille Blanche 2014, 2, 5-13.
21. Viatgé, J., Berthet, L., Marty, R., Bourgin, F., Piotte, O., Ramos, M.-H., & Perrin, C. Vers une production en
temps réel d’intervalles prédictifs associés aux prévisions de crue dans Vigicrues en France. La Houille
Blanche 2019, 105(2), 63-71.
22. Hossein Hosseiny. A deep learning model for predicting river flood depth and extent. j.envsoft 2021, 145,
105-186.
23. Ji, H., Chen, Y., Fang, G., Li, Z., Duan, W., Zhang, Q. Adaptability of machine learning methods and
hydrological models to discharge simulations in data-sparse glaciated watersheds. Journal of Arid Land
2021, 13(6), 549-567.
24. Maier, H. R., & Dandy, G. C. Neural networks for the prediction and forecasting of water resources
variables: A review of modeling issues and applications. J.envsoft 2000 15(1),101-124.
25. Malik, A., Kumar, A., Tikhamarine, Y., Souag-Gamane, D., Kişi, Ö. Hybrid artificial intelligence models for
predicting daily runoff. In P. Sharma, D. Machiwal (Éds.). A SF 2021, 12, 305-329.
26. Barbe, Millet, Texier, Borel, Gualde. Les ressources en eaux superficielles de la République du nin. 1993,
540.
27. Daouda Mama, Véronique Deluchat, James Bowen, Waris Chouti, Benjamin Yao, Baba Gnon,Michel
Baudu. Caractérisation d’un Système Lagunaire en Zone Tropicale: Cas du lac Nokoué (Bénin). EJSR 2011,
56(4), 516-528.
28. Metogbe Belfrid Djihouessi, Martin pin Aina. A review of hydrodynamics and water quality of Lake
Nokoué: Current state of knowledge and prospects for further research. j.rsma 2018, 17, 2352-4855
29. Texier, H., Colleuil, B., Profizi, J. P., Dossou, C. (1980). Le lac Nokoué, environnement du domaine margino-
littoral sud-ninois: Bathymétrie, lithofaciès, salinité, mollusque et peuplements végétaux (No 28 ; p.
115-142).
30. Tore, D. B., Alamou, A. E., Obada, E., Biao, E. I., Zandagba, E. B. J. Assessment of Intra-Seasonal Variability
and Trends of Precipitations in a Climate Change Framework in West Africa. ACS 2022, 12(01), 150-171.
31. Sedai, A., Dhakal, R., Gautam, S., Dhamala, A., Bilbao, A., Wang, Q., Wigington, A., & Pol, S. Performance
Analysis of Statistical, Machine Learning and Deep Learning Models in Long-Term Forecasting of Solar
Power Production. Forecasting 2023, 5(1), 1-14
32. Murray, K., Rossi, A., Carraro, D., Visentin, A. On Forecasting Cryptocurrency Prices: A Comparison of
Machine Learning, Deep Learning, and Ensembles. Forecasting 2023, 5(1), 1-10
33. Zhu, X., Guo, H., Huang, J. J., Tian, S., Xu, W., Mai, Y. An ensemble machine learning model for water
quality estimation in coastal areas based on remote sensing imagery. JEM, 2022, 323, 116-187.
34. Wood, M., Ogliari, E., Nespoli, A., Simpkins, T., Leva, S. Day Ahead Electric Load Forecast: A
Comprehensive LSTM-EMD Methodology and Several Diverse Case Studies. Forecasting 2023, 5(1), 1-16
35. Ömer Faruk, D. A hybrid neural network and ARIMA model for water quality time series prediction. EAAI
2010, 23(4), 586-594.
36. Sharma, P., Machiwal, D. Streamflow forecasting: Overview of advances in data-driven techniques. In P.
Sharma, D. Machiwal (Éds.). ASF 2021, 82(6), 1-50.
Preprints.org (www.preprints.org) | NOT PEER-REVIEWED | Posted: 20 March 2024 doi:10.20944/preprints202403.1218.v1
13
37. Calèche Nehemie Nounagnon Avahouin, Henri Sourou Totin Vodounon, Ernest, Amoussou. Variabilité
climatique et production halieutique du lac Nokoué dans les Aguégués au Bénin. 2018, 8(2), 51-61.
38. Kawoun Alagbe Gildas, Ahamide Bernard, Chabi Amédée, Ayena Abraham, Adandedji Firminn, & Vissin
Expédit. Variabilité Pluvio-Hydrologique et Incidences sur les Eaux de Surface dans la Basse Vale de
l’Ouémé au Sud-Est nin. IJPSAT 2020, 23(2), 52-65.
39. Line Kong A Siou, Anne Johannet, Valérie Borrell, Séverin Pistre. Complexity selection of a neural network
model for karst flood forecasting: The case of the Lez Basin (southern France). jhydrol 2011, 403, 367-380.
40. Hu, C., Q. Wu, H. Li, S. Jian, N. Li, and Z. Lou. Deep learning with a long short-term memory networks
approach for rainfall-runoff simulation. Water 2018,10 (11), 1543.
41. Thapa, S.; Zhao, Z.; Li, B.; Lu, L.; Fu, D.; Shi, X.; Qi, H. Snowmelt-driven streamflow prediction using
machine learning techniques (LSTM, NARX, GPR, and SVR). Water 2020, 12, 1734.
42. Ni, L.; Wang, D.; Singh, V.P.; Wu, J.; Wang, Y.; Tao, Y.; Zhang, J. Streamflow and rainfall forecasting by
two long short-term memory-based models. J. Hydrol. 2020, 583, 124296.
43. Luo, B.; Fang, Y.; Wang, H.; Zang, D. Reservoir inflow prediction using a hybrid model based on deep
learning. IOP Conf. Ser. Mater. Sci. Eng. 2020, 715, 012044.