Article Not peer-reviewed version
Forecasting Lake Nokoué Water Levels
Using Long Short-Term Memory
Network
Namwinwelbere Dabire * , Eugene C. Ezin , Adandedji M. Firmin
Posted Date: 20 March 2024
doi: 10.20944/preprints202403.1218.v1
Keywords: forecasting; Machine learning algorithms; recurrent artificial neural network; lake Nokoue.
Copyright: This is an open access article distributed under the Creative Commons
Attribution License which permits unrestricted use, distribution, and reproduction in any
medium, provided the original work is properly cited.
Namwinwelbere Dabire 1,2,*, Eugene C. Ezin 3 and Adandedji M. Firmin 4
1 Institut National de l’Eau (INE), Centre d’Excellence d’Afrique pour l’Eau et l’Assainissement (C2EA),
Université d’Abomey Calavi (UAC), Bénin, namwinwelbere@gmail.com
2 Ecole Doctorale des Sciences de l’Ingénieur (ED-SDI), Université d’Abomey Calavi, Bénin,
namwinwelbere@gmail.com
3 Institut de Formation et de Recherche en Informatique (IFRI), Université d’Abomey Calavi, Bénin,
eugene.ezin@uac.bj
4 Laboratoire d’Hydrologie Appliquée (LHA), Institut National de l’Eau, Université d’Abomey Calavi,
Bénin, firminelite@gmail.com
* Correspondence: namwinwelbere@gmail.com
Abstract: The prediction of hydrological flows (rainfall-depth or rainfall-discharge) is becoming increasingly important in the management of hydrological risks such as floods. In this study, the Long Short-Term Memory (LSTM) network, a state-of-the-art algorithm dedicated to time series, is applied to predict the daily water level of Lake Nokoué in Benin. This paper aims to provide an effective and reliable method capable of reproducing the future daily water level of Lake Nokoué, which is influenced by a combination of two phenomena: rainfall and river flow (runoff from the Ouémé River, the Sô River, the Porto-Novo lagoon, and the Atlantic Ocean). Performance analysis based on the forecasting horizon indicates that LSTM can predict the water level of Lake Nokoué up to a forecast horizon of t+10 days. Performance metrics such as Root Mean Square Error (RMSE), coefficient of determination (R²), Nash-Sutcliffe Efficiency (NSE), and Mean Absolute Error (MAE) agree on a forecast horizon of up to t+3 days. The values of these metrics remain stable for forecast horizons of t+1 day, t+2 days, and t+3 days. The values of R² and NSE are greater than 0.97 during the training and testing phases in the Lake Nokoué basin. Based on the evaluation indices used to assess the model's performance, the forecast horizon of t+3 days is chosen for predicting future daily water levels in the Lake Nokoué basin.
1. Introduction
Lake Nokoué is at the center of important socio-economic and ecological issues in Benin. Because the lake hosts lacustrine villages and is bordered by three major urban centers (Cotonou, Abomey-Calavi, and Sèmè-Podji), the planning and implementation of flood management strategies require a deep understanding of the processes involved in its physical dynamics, of which water level variation is a key parameter. This variation, which can occasionally lead to floods with dramatic repercussions on local populations, is primarily influenced by (i) ocean tides, (ii) hydrological variability of the watershed, and (iii) direct contributions from precipitation. These floods are linked to major river basin floods and the regular overflow of Lake Nokoué. Because of the lake's complex hydrological configuration and the non-linearity of the explanatory variables, it is difficult to represent with conceptual hydrodynamic models [1]. A high water level is associated with strong freshwater inflow from the Sô and Ouémé rivers, while a low water level is associated with periods when saltwater from the ocean enters the lake. Hydrological prediction models have long been devoted to forecasting river discharge for the proper management of water
resource systems. As a result, there is a vast body of literature on the development and application of a wide range of methods for predicting river flow, which is primarily governed by rainfall. Two types of models can be identified [2,3]: (a) physical models apply deterministic equations to a set of input variables (such as physiographic characteristics or precipitation) to obtain the desired river flow values, while (b) stochastic (statistical) models describe hydrological phenomena probabilistically, taking into account the uncertainty of observed data and non-linearity. However, physical models are subject to considerable uncertainties, such as the lack of data on the physical processes of the hydrological system and the limited scientific knowledge of natural systems like water bodies. These limitations degrade the quality of forecasts, especially as domain experts require accurate results with minimal computational time for optimal decision-making. According to [4], these techniques are limited by the understanding of flood dynamics (variations in flood wave propagation time in hydrographic networks) or by the different types of hydrological fluxes that influence water bodies.
In contrast to physical models, stochastic models operate as black boxes on observational data without any consideration of the internal structure of the system [5–10]. Furthermore, stochastic models adapt to the non-linearity of hydrological processes and address uncertainties in parameter estimation [11,12]. They also offer various techniques for flood estimation, ranging from simple regression of discharge to detailed modeling of hydrological processes. However, stochastic models are criticized for being more data-intensive, requiring in-situ observation data to ensure reliable predictions [13,14]. A variety of stochastic models have been proposed for hydrological flow prediction. Two major categories can be distinguished: time series models and regression models. The former are primarily based on modeling the autocorrelation structure of hydrological flows (water level or discharge), while the latter focus on the correlation between input and output variables regardless of the temporal structure [15,16]. Thus, according to [17], typical input variables such as precipitation forecasts are used to predict the output variable (future water levels of water bodies or future streamflow). Some regression models, such as linear regression, principal component regression, partial least squares regression, and wavelet regression, are commonly used for long-term forecasts [18]. However, these models only forecast water levels at large time scales (typically seasonal or annual). They provide no information about the shape of the water level curve within these long periods, and therefore cannot indicate when the water level will reach a critical value that could lead to disasters. There are also a number of regression models based on highly non-linear recurrent neural networks, such as Long Short-Term Memory (LSTM), the generic recurrent cell or standard Recurrent Neural Network (RNN), and the Gated Recurrent Unit (GRU), that are widely used to predict hydrological flows [19–21]. However, these models can only predict a few days (or data points) at a time, which is useful for detecting future extremes but not for forecasting the overall trend of the water level. In particular, the application of LSTM in several domains has proven powerful for hydrological forecasting, which is beneficial for managing hydrological extremes such as floods [22]. LSTMs belong to the category of stochastic models that learn features from sequential data in order to predict future hydrological behavior, without any understanding of the internal structure of the hydrographic basin. This technique has been successfully applied in hydrological modeling and has shown significant computational power in several studies of hydrological flow modeling, such as streamflow forecasting, yielding more accurate results [23–25].
Despite the importance of monitoring the water level of Lake Nokoué and the use of conceptual models to explain its fluctuations, the inability of these models to account for the non-linearity of the variables, coupled with the complexity of this water body, remains a major challenge in implementing appropriate flood management measures.
To address the challenges posed by the non-linearity of the variables and the complexity of Lake Nokoué, and to guide decision-making in the implementation of prevention plans, this study applies LSTM to the prediction of the lake's water level. LSTM belongs to the category of recurrent neural network methods that not only mitigate the gradient instability problem but also address the issue of preserving information over long sequences in time series data. This model will
be run at different forecast horizons to identify the most suitable horizon for predicting the water level of Lake Nokoué.
The contributions of this study are the following:
– It is a first attempt to apply artificial intelligence models to a complex water body in order to assess how well these models establish the nonlinear relationship between the input variables and the output.
– It proposes and implements a recurrent neural network model that leverages the set of input variables to predict the water level of Lake Nokoué.
2. Materials and Methods
2.1. Study Area
Lake Nokoué (Figure 1) is located in the southeast of Benin, at approximately 6°25'N and 2°36'E, and covers an area that varies between 150 km² and 170 km² during the low-water and high-water periods, respectively [26–29]. It stretches approximately 20 km from east to west along the coast and 11 km from south to north [27,28]. The average and maximum depths of the lake are approximately 1.3 m and 2.9 m, respectively. Towards the Cotonou channel, the lake deepens, and the average and maximum depths reach around 3 m and 8 m, respectively. Two rivers flow into Lake Nokoué: the Sô-Ava river on its northern bank, which drains a watershed of approximately 10,000 km², and the Ouémé river, the largest river in Benin, which drains a watershed of approximately 50,000 km² [26,28]. The Djonou river, smaller in extent and flow, also contributes freshwater to the southwestern part of the lake. In the south, Lake Nokoué is connected to the Atlantic Ocean through the Cotonou channel, which is 280 m wide and approximately 4 km long [26,29]. Through this channel, constructed in 1885, freshwater and saltwater are exchanged according to the tides and the hydrological regime [1]. The Tochè canal, approximately 4 km long, connects the Porto-Novo lagoon, with an area of approximately 35 km², to Lake Nokoué on the eastern side, with little effect on the lake's dynamics. At the seasonal scale, the hydrological regime of Lake Nokoué is determined by the West African summer monsoon, resulting in two rainy seasons and two dry seasons [1]. These seasons are linked to the north-south movement of the intertropical convergence zone and the associated belt of intense tropical rainfall. The main rainy season extends from April-May to the end of July, when the intertropical convergence zone moves northward from its southern position near the equator. The second rainy season, shorter and less intense, occurs from late September to November, when the intertropical convergence zone migrates southward from its northernmost position. However, this dual rainy season, along with local precipitation, has only a weak influence on the water level of Lake Nokoué. The lake is more strongly influenced by the hydrology of the central part of Benin, where the main basin of the Ouémé river is located [29]. This region is characterized by a single rainy season with peak precipitation occurring between July and October [29,30]. The maximum inflow to Lake Nokoué occurs from September to October.
Figure 1. Location map of the study area.
2.2. Data Acquisition
The data used in this study include a time series of daily water level observations covering flood-recession events, a series of rainfall observations, and a series of discharge observations, provided by the National Meteorological Agency of Benin (METEO-Benin) and the General Directorate of Water of Benin (DGEau).
2.3. Data Preprocessing
Preprocessing of the variables is necessary to account for the full range of values (both low and high) and to avoid saturating the sigmoid function with the high values in the database [31]. Indeed, applying the sigmoid function directly to the weighted sums of the rainfall-discharge inputs causes the information carried by low numerical values (rainfall and water levels) to be neglected relative to high numerical values (discharge). This preprocessing step consists of normalizing all values in the database between -1 and 1, using Equation 1:
(1)
where Xi is the actual value to be normalized, Xmean is its mean, and Xnormalized is the normalized value. This transformation scales the input data to the interval [-1, 1].
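The body of Equation 1 is not reproduced in this text, so the exact formula cannot be confirmed. One normalization consistent with the description (centering on the mean, output within [-1, 1]) is mean normalization, sketched below. The denominator (max minus min) is an assumption, as are the function name and the sample values.

```python
def mean_normalize(values):
    """Scale a series with (x - mean) / (max - min).

    The denominator (max - min) is an assumption; the text only names
    the mean. With it, every scaled value falls within [-1, 1].
    """
    x_mean = sum(values) / len(values)
    x_range = max(values) - min(values)
    if x_range == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(x - x_mean) / x_range for x in values]


# Illustrative discharge values only, not data from the study.
discharge = [0.5, 10.0, 25.0, 400.0, 1000.0]
scaled = mean_normalize(discharge)
print(scaled)
```

Applying the same function separately to rainfall, discharge, and water level would put the three variables on a comparable scale before they reach the sigmoid activations.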
2.4. Structure of the Long Short-Term Memory Model
The Long Short-Term Memory (LSTM) cell is an enhancement of the optimization proposed in [32]. Treated as a black box, the LSTM cell, widely used in time series forecasting, is a recurrent neural network (RNN) architecture capable of memorizing the temporal order of data. The two main problems of the standard RNN architecture are gradient instability and its inability to retain information from long sequences of temporal data. The LSTM overcomes both issues through its cell state and its gates [33]. As a deep learning predictive model, the LSTM cell receives the latent states from the previous step and has a self-evaluation mechanism that offers better performance in time series forecasting. The internal structure of the LSTM consists of three main gates that control the flow of information: (i) the forget gate ($f_t$) discards unwanted information from the current cell state, (ii) the input gate ($i_t$), also known as the temporal attention module, adds new information to the current cell state, and (iii) the output gate ($o_t$) produces an output from the current cell state. The state of the LSTM network is divided into two parts: the hidden state $h_t$, regarded as the short-term memory of the network, and the cell state $c_t$, regarded as its long-term memory. The LSTM network uses cells as memory units, and the operations performed within these cells help the model retain information from sequential data. The gates, shown in Figure 2 and defined in Equations 2, 5, and 6, determine which data are carried forward.
a) The forget gate
The forget gate, $f_t$, determines how much information from the previous time step should be transmitted and is the essence of the LSTM architecture. It sets the amount of memory preserved from the previous cell state, $c_{t-1}$. In Equation 2, the previous hidden state, $h_{t-1}$, and the current input, $x_t$, are passed through the sigmoid activation function. Information associated with 0 is forgotten, while information associated with 1 continues to be carried through the cell state.
$$f_t = \sigma\left(w_{fx}\,x_t + w_{fh}\,h_{t-1} + b_f\right) \qquad (2)$$
Where:
$\sigma$ is the sigmoid activation function;
$w_{fh}$ and $w_{fx}$ are the weight matrices of the forget gate, $f_t$, for its connections to the previous hidden state $h_{t-1}$ and to the input vector $x_t$, respectively;
$b_f$ is the bias term of the forget gate.
b) Long-term state
The long-term state, $c_{t-1}$, flows through the network from left to right. It first passes through the forget gate, which discards certain information, and then receives new information through an addition operation (the added information is selected by the input gate). The resulting $c_t$ is sent on directly without any further transformation. Therefore, at each time step, some information is removed and some new information is added. The update of $c_t$ is illustrated in Figure 2 and given by Equations 3 and 4:
$$g_t = \tanh\left(w_{gx}\,x_t + w_{gh}\,h_{t-1} + b_g\right) \qquad (3)$$
$$c_t = f_t \otimes c_{t-1} + i_t \otimes g_t \qquad (4)$$
Where:
$w_{gh}$ and $w_{gx}$ are the weight matrices of the main layer $g_t$ for its connections to the previous short-term state $h_{t-1}$ and to the input vector $x_t$, respectively;
$b_g$ is the bias term of the main layer;
$\otimes$ denotes element-wise multiplication.
c) The input gate
The input gate, $i_t$, in Equation 5 determines which parts of the main layer, $g_t$, are added to the long-term state. The previous and current information is weighted by the result of the sigmoid operation ($\sigma$): information associated with 0 is considered trivial, while information associated with 1 is deemed essential. Additionally, the hyperbolic tangent activation function (tanh), which compresses data between -1 and 1, is used to regulate the network. The outputs of the sigmoid and tanh functions are then multiplied to select the information that will be updated.
$$i_t = \sigma\left(w_{ix}\,x_t + w_{ih}\,h_{t-1} + b_i\right) \qquad (5)$$
Where:
$w_{ih}$ and $w_{ix}$ are the weight matrices of the input gate for its connections to the short-term state $h_{t-1}$ and to the input vector $x_t$, respectively;
$b_i$ is the bias term of the input gate.
d) The output gate
In Equations 6 and 7, the output gate, $o_t$, is also used for inference and determines which parts of the long-term state should be read and output at this time step, both in $h_t$ and in $y_t$. After the addition operation, the long-term state is copied and passed through the hyperbolic tangent function (tanh), and the result is filtered by the output gate. This yields the short-term state, $h_t$, which is equal to the output $y_t$ of the LSTM cell.
$$o_t = \sigma\left(w_{ox}\,x_t + w_{oh}\,h_{t-1} + b_o\right) \qquad (6)$$
$$y_t = h_t = o_t \otimes \tanh\left(c_t\right) \qquad (7)$$
Where:
$w_{oh}$ and $w_{ox}$ are the weight matrices of the output gate for its connections to the short-term state $h_{t-1}$ and to the input vector $x_t$, respectively;
$b_o$ is the bias term of the output gate;
$h_t$ is the output of the hidden layer at time step t.
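To make the interplay of the gates concrete, the following minimal sketch runs one time step of a standard LSTM cell in plain Python. It follows the usual textbook formulation of the forget, input, and output gates and the main layer described in this section; it is not the authors' implementation. The function names, the `params` layout, and the all-zero weights are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w_x, w_h, b, x, h):
    """Return w_x @ x + w_h @ h + b as a list, one entry per hidden unit."""
    return [
        sum(wx * xj for wx, xj in zip(wx_row, x))
        + sum(wh * hj for wh, hj in zip(wh_row, h))
        + b_i
        for wx_row, wh_row, b_i in zip(w_x, w_h, b)
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'o', 'g' to (w_x, w_h, b)."""
    f_t = [sigmoid(z) for z in affine(*params["f"], x_t, h_prev)]    # forget gate
    i_t = [sigmoid(z) for z in affine(*params["i"], x_t, h_prev)]    # input gate
    o_t = [sigmoid(z) for z in affine(*params["o"], x_t, h_prev)]    # output gate
    g_t = [math.tanh(z) for z in affine(*params["g"], x_t, h_prev)]  # main layer
    # Long-term state: keep part of the old memory, add the selected new part.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Short-term state (also the cell output): filtered view of the memory.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical setup: 2 inputs (e.g. scaled rainfall and discharge), 1 hidden unit.
# With all-zero weights every gate equals 0.5 and the main layer equals 0.
params = {k: ([[0.0, 0.0]], [[0.0]], [0.0]) for k in "fiog"}
h_t, c_t = lstm_step([0.3, -0.2], [0.0], [1.0], params)
print(h_t, c_t)
```

In this toy configuration the forget gate halves the previous long-term state and nothing new is added, which shows how the gates alone decide what survives from one time step to the next.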
Figure 2. Internal architecture of an LSTM cell (fully connected layer).
2.5. Long Short-Term Memory Model Configuration
In this work, optimizing the performance of the LSTM model involves selecting the input variables, determining the appropriate network architecture, optimizing network learning, and using a reliable validation methodology. The LSTM network, as explained earlier, consists of an input layer, a single hidden layer, and an output layer, with sigmoid activation functions for the artificial neurons and hyperbolic tangent for the hidden states. The hyperparameters of the learning algorithm, such as the number of hidden neurons, the optimization function, the number of iterations, and the batch size, are tuned using the random search cross-validation method (randomSearchCV), during which different combinations of inputs (rainfall, discharge, water level) are tested and evaluated. The preprocessed database is divided into two parts:
– the training part (80%), the larger of the two, which is used to learn the system's dynamics;
– the testing part (20%), which guards against overfitting by tracking the evolution of the loss function during training and validation. Once training is stopped, the weights of the best-performing model are saved. The validation dataset then allows the LSTM model's performance to be confirmed.
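The 80/20 division and the notion of a forecast horizon can be illustrated with a short sketch that turns aligned daily series into supervised samples. The look-back window length is not stated in the text, so `lookback=3` is an assumption; the function names and the toy series are likewise hypothetical. The split is chronological, so the testing part always follows the training part in time.

```python
def make_windows(inputs, target, lookback, horizon):
    """Pair the last `lookback` input rows with the target `horizon` days ahead."""
    samples, labels = [], []
    for t in range(lookback - 1, len(target) - horizon):
        samples.append(inputs[t - lookback + 1 : t + 1])
        labels.append(target[t + horizon])
    return samples, labels


def chronological_split(samples, labels, train_fraction=0.8):
    """Split without shuffling so that no future day leaks into training."""
    cut = int(len(samples) * train_fraction)
    return (samples[:cut], labels[:cut]), (samples[cut:], labels[cut:])


# Toy series for illustration only: 20 days of [rainfall, discharge] and water level.
inputs = [[float(d), 10.0 * d] for d in range(20)]
water_level = [3.0 + 0.01 * d for d in range(20)]

samples, labels = make_windows(inputs, water_level, lookback=3, horizon=3)
(train_x, train_y), (test_x, test_y) = chronological_split(samples, labels)
print(len(train_x), len(test_x))
```

Repeating the construction with `horizon` set to 1, 2, 3, 4, 5, and 10 would produce one dataset per forecast horizon, which is one way to organize a comparison across horizons.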
2.6. Model Performance Assessment
To verify the synchronization between the observed flood and the one estimated by the LSTM model, we rely on a quantitative evaluation of the model using several criteria. A wide range of performance evaluation criteria for hydrological models has been proposed by the World Meteorological Organization and other authors [9,31,32,34]. To ensure the reliability of the LSTM model results in this study, we used the four most relevant metrics (Equations 8 to 11): the Nash-Sutcliffe Efficiency (NSE, also called the Nash criterion), the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE), and the coefficient of determination (R²).
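$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(O_i - P_i\right)^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2} \qquad (8)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - P_i\right)^2} \qquad (9)$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - P_i\right| \qquad (10)$$
$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)\right]^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2\,\sum_{i=1}^{n}\left(P_i - \bar{P}\right)^2} \qquad (11)$$
where $O_i$ and $P_i$ are the observed and predicted water levels, $\bar{O}$ and $\bar{P}$ are their means, and $n$ is the number of observations.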
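The four criteria can be computed with a few lines of plain Python, as in the sketch below. It uses the standard definitions of NSE, RMSE, and MAE. R² is taken here as the squared Pearson correlation between observations and predictions, which is an assumption suggested by the fact that the reported NSE and R² values differ. The function names and sample values are illustrative.

```python
import math


def nse(obs, pred):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Squared Pearson correlation (assumed definition of R² here)."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


# Illustrative water levels in metres: predictions biased upward by 0.1 m.
observed = [3.0, 3.1, 3.2, 3.3]
predicted = [3.1, 3.2, 3.3, 3.4]
print(nse(observed, predicted), rmse(observed, predicted),
      mae(observed, predicted), r2(observed, predicted))
```

With a constant bias the correlation-based R² stays at 1 while NSE drops to about 0.2, which shows why the two criteria are reported side by side: R² measures co-variation, whereas NSE also penalizes systematic offsets.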
3. Results and Discussion
3.1. Variable Selection and Statistics
3.1.1. Selection of the Variables
The variables selected for the model by forward feature selection are presented in Figure 3 and Figure 4. Rainfall, the discharge of the Ouémé river, and the water level of Lake Nokoué were the predominant predisposing factors for the lake. Rainfall and discharge were selected as inputs to the model, and the output was the water level of Lake Nokoué.
Figure 3. Water level of Lake Nokoué (output variable).
Figure 4. Rainfall and discharge (selected input variables).
3.1.2. Statistics of the Variable
The descriptive statistics of the variables are given in Table 1. The discharge variable has the highest values, with a mean of 219.113 and a maximum of 1064.000. The variation in the water level of Lake Nokoué is strongly influenced by rainfall [37,38].
Table 1. Statistics of all variables.

| Statistic | Rainfall | Discharge | Water level |
|---|---|---|---|
| mean | 3.392 | 219.113 | 3.173 |
| std | 10.852 | 318.670 | 0.195 |
| min | 0.000 | 0.610 | 2.736 |
| 25% | 0.000 | 8.225 | 3.045 |
| 50% | 0.000 | 25.590 | 3.115 |
| 75% | 0.200 | 376.900 | 3.257 |
| max | 158.200 | 1064.000 | 3.981 |
3.2. Model Performance Analysis
The evaluation of a model's performance is an integral step in modeling. In this study, RMSE, NSE, R² and MAE were used to assess model accuracy. The resulting scores for the LSTM model are shown in Table 2 for different forecast horizons. From t+1 day to t+10 days, the values of all performance criteria show a nearly stable pattern during both phases, and they remain nearly constant for the forecast horizons of t+1 day, t+2 days, and t+3 days (Table 2). Additionally, the R² value is at least 0.97 in the calibration phase and at least 0.96 in the validation phase. These satisfactory results demonstrate that the LSTM model performs well in both periods (the calibration phase, also known as the learning phase, and the validation phase, also known as the testing phase). The model is capable of reliably reproducing the observed water levels in the Lake Nokoué basin up to a forecast horizon of t+10 days, which makes it possible to anticipate flood events well in advance.
Table 2. Performance metric values for different forecast horizons.

| Forecast horizon | Training RMSE | Training NSE | Training R² | Training MAE | Testing RMSE | Testing NSE | Testing R² | Testing MAE |
|---|---|---|---|---|---|---|---|---|
| t+1 day | 0.03 | 0.98 | 0.98 | 0.02 | 0.04 | 0.97 | 0.97 | 0.03 |
| t+2 days | 0.03 | 0.98 | 0.98 | 0.02 | 0.04 | 0.97 | 0.97 | 0.03 |
| t+3 days | 0.03 | 0.98 | 0.98 | 0.02 | 0.04 | 0.97 | 0.97 | 0.02 |
| t+4 days | 0.03 | 0.94 | 0.98 | 0.02 | 0.03 | 0.96 | 0.97 | 0.02 |
| t+5 days | 0.03 | 0.98 | 0.98 | 0.02 | 0.03 | 0.97 | 0.97 | 0.02 |
| t+10 days | 0.03 | 0.92 | 0.97 | 0.02 | 0.04 | 0.90 | 0.96 | 0.03 |
The adaptability of LSTM to the Lake Nokoué basin during the calibration and validation periods is also supported by the convergence of the loss function during both phases, as shown in Figure 5. Correspondingly, Ling et al. [39] showed that the method proposed in their article was more effective than other recurrent neural network methods. The loss function values decrease with the number of iterations and stabilize below 0.1, indicating that the model is well optimized during the calibration and validation phases. Hu et al. [40] noted that, given its architecture, the LSTM model has a tendency to converge towards a local optimum. The hydrographs of observed and predicted water levels during the calibration and validation phases are depicted in Figures 6a and 6b. The observed and predicted floods and recessions are nearly identical. This synchronization demonstrates the LSTM model's ability to reproduce critical water levels that could lead to flooding. Recently, a number of studies have addressed streamflow forecasting using recurrent neural network models. Thapa et al. [41] developed a deep learning LSTM-based model for snowmelt-driven discharge modeling in the Himalayan basin. Ni et al. [42] applied deep learning to daily flow simulation, using data from previous years for flow prediction. The model was evaluated from several perspectives, and the study found that the LSTM model was advantageous in processing steady flow data in the dry season and gave satisfactory results in capturing the features of rapidly fluctuating flow data in the rainy season. Luo et al. [43] built a new hybrid model based on the LSTM approach for predicting streamflow; in that study, linear regression, one of the classical methods, served as the benchmark against which the hybrid model's performance was compared. The satisfactory forecasting results are further evident in Figures 7a, 7b, 8a, and 8b, which present scatter plots of predicted (estimated) water levels against observed water levels. The linear trend line of the scatter plots during the training (calibration) and testing (validation) phases highlights the strong correlation between observed and predicted water levels, with an R² of 0.98 during calibration and 0.97 during validation (Table 2). This linear alignment, particularly for water levels between 3.5 m and 3.9 m, indicates that the LSTM model accurately predicts extreme water levels in Lake Nokoué that could lead to flooding. Figures 7b and 8b illustrate a good
distribution of the LSTM model's residual errors, with values ranging from -0.1 to 0.4 during the
training phase and -0.5 to 0.1 during the testing phase. Negative residual errors indicate
overestimation of water levels, while positive residual errors indicate underestimation of water levels
in Nokoué lake by the LSTM model. These underestimations and overestimations remain acceptable
for predicting extreme water levels in Nokoué lake.
Figure 5. Comparison of the loss function during the calibration and validation phase.
Figure 6. (a) Combined training and testing phase of the LSTM model; (b) Separated training and
testing phase of the LSTM model.
Figure 7. (a) Comparison between observed water levels and water levels predicted by the LSTM
model during the calibration phase; (b) correlation and residual error during the calibration phase.
Figure 8. (a) Comparison between observed water levels and water levels predicted by the LSTM
model during the validation phase; (b) correlation and residual error during the validation phase.
4. Conclusion
Our approach offers a tool for water level assessment at the scale of Lake Nokoué. It can help decision-makers implement appropriate preventive and adaptive measures, thereby contributing to more effective flood management. It is important to note that LSTM-based prediction models are not infallible: the LSTM model does not consider the initial hydrological conditions, and it relies on the available training data and underlying assumptions. Therefore, it is essential to continuously monitor the model's performance and update the training data if necessary. Additionally, the expertise of forecasters, combined with the analysis of the model's results, is crucial for interpreting and properly using the generated predictions. In terms of future prospects, the study suggests applying the LSTM-based prediction method to other lakes and rivers to improve water level forecasting. Exploring hybrid machine-learning-based prediction methods is also recommended to improve forecast accuracy. Furthermore, it is important to continue collecting data on water levels, precipitation, and river discharge to enhance the quality of forecasts.
Acknowledgments: This work is supported in part by the World Bank and the French Development Agency
through “Centre d’Excellence pour l’Eau et l’Assainissement en Afrique (C2EA)” of University of Abomey-
Calavi in Benin. The authors would like to thank the reviewers for their constructive comments, which have
certainly improved the quality and readability of the article.
Conflicts of interest: The authors declare that there are no conflicts of interest.
References
1. Chaigneau, A., Okpeitcha, O. V., Morel, Y., Stieglitz, T., Assogba, Morgane, B., Allamel, P., Honfos, J.,
Thierry Derol Awoulmbang, S., Retif, F., Duhaut, T., Peugeot, C. From seasonal flood pulse to seiche: Multi-frequency water-level fluctuations in a large shallow tropical lagoon (Nokoue Lagoon, Benin). ecss 2022,
267, 107-767.
2. Ngoc, D. V. Deterministic hydrological modeling for flood risk assessment and climate change in large
catchment: Application to Vu Gia Thu Bon catchment, Vietnam. Ph.D, Université Nice Sophia Antipolis,
2015.
3. Rebolho, C. Modélisation conceptuelle de l’aléa inondation à l’échelle du bassin versant. Hydrology thesis
doctorate, Ph.D, 2018.
4. Golob, R., Štokelj, T., Grgič, D. Neural-network-based water inflow forecasting. Control Engineering Practice
1998, 6(5), 37-98.
5. Ancona, M., Corradi, N., Dellacasa, A., Delzanno, G., Dugelay, J.-L., Federici, B., Gourbesville, P., Guerrini,
G., La Caméra, A., Rosso, P., Stephens, J., Tacchella, A., Zolezzi, G. On the Design of an Intelligent Sensor
Network for Flash Flood Monitoring, Diagnosis and Management in Urban Areas Position Paper. PCS
2014, 32, 941-946.
6. Chen, L., & Singh, V. P. Flood forecasting and error simulation using copula entropy method. In P. Sharma
& D. Machiwal (Eds.), Advances in Streamflow Forecasting 2021, 6, 331-368.
7. Chu, H., Wu, W., Wang, Q. J., Nathan, R., Wei, J. An ANN-based emulation modeling framework for flood
inundation modeling: Application, challenges and future direction. envsoft 2019, 19, 104-587.
8. Audrey Bornancin Plantier. Conception de modèles de prévision des crues éclair par apprentissage
artificiel. Ph.D. thesis in computer science, Université Pierre et Marie Curie, Paris, 2013.
9. Kharroubi, O., Blanpain, O., Masson, E., Lallahem, S. Application du réseau des neurones artificiels à la
prévision des débits horaires : Cas du bassin versant de l’Eure, France. Hydrological Sciences Journal 2016,
61(3), 933-225.
10. Peredo, D., Ramos, M.-H., Andréassian, V., Oudin, L. Investigating hydrological model versatility to
simulate extreme flood events. Hydrological Sciences Journal 2022, 67(4), 628-645.
11. Modeste Meliho. Spatial prediction of flood susceptible zones in the Ourika watershed of Morocco using
machine learning algorithms. Aci 2022, 09, 2021-0264.
12. Noor, F., Haq, S., Rakib, M., Ahmed, T., Jamal, Z., Siam, Z. S., Hasan, R. T., Adnan, M. S. G., Dewan, A., &
Rahman, R. M. Water Level Forecasting Using Spatiotemporal Attention-Based Long Short-Term Memory
Network. Water 2022, 14(4), 4040-612.
13. Alliau, D., De Saint Seine, J., Lang, M., Sauquet, E., Renard, B. Étude du risque d’inondation d’un site
industriel par des crues extrêmes : De l’évaluation des valeurs extrêmes aux incertitudes hydrologiques et
hydrauliques. La Houille Blanche 2015, 101(2), 67-74.
14. Morel, Y., Chaigneau, A., Okpeitcha, V. O., Stieglitz, T., Assogba, A., Duhaut, T., Rétif, F., Peugeot, C., &
Sohou, Z. Terrestrial or oceanic forcing? Water level variations in coastal lagoons constrained by river
inflow and ocean tides. Hal open science 2022, 169, 104-309.
15. Fathian, F. Introduction of multiple/multivariate linear and nonlinear time series models in forecasting
streamflow process. In P. Sharma & D. Machiwal (Eds.), ASF 2021, 12, 87-113.
16. Wang, L., Dong, H., Cao, Y., Hou, D., Zhang, G. Real-time water quality detection based on fluctuation
feature analysis with the LSTM model. Journal of Hydroinformatics 2023, 5, 127-305.
17. Masselot, P., Dabo-Niang, S., Chebana, F., Ouarda, T. B. M. J. Streamflow forecasting using functional
regression. J.Hydrol 2016, 538, 754-766.
18. Luo, X., Yuan, X., Zhu, S., Xu, Z., Meng, L., Peng, J. A hybrid support vector regression framework for
streamflow forecast. Journal of Hydrology 2019, 568, 184-193.
19. Douvinet, J., Serra-Llobet, A., Radke, J., Kondolf, M. Quels enseignements tirer des coulées de débris post-
incendie survenues le 9 janvier 2018 à Montecito (Californie, USA) ? La Houille Blanche 2020, 106(6), 25-35.
20. Lang, M., Arnaud, P., Carreau, J., Deaux, N., Dezileau, L., Garavaglia, F., Latapie, A., Neppel, L., Paquet,
E., Renard, B., Soubeyroux, J.-M., Terrier, B., Veysseire, J.-M., Aubert, Y., Auffray, A., Borchi, F., Bernardara,
P., Carre, J.-C., Chambon, D., … Tramblay, Y. Résultats du projet ExtraFlo (ANR 2009-2013) sur l’estimation
des pluies et crues extrêmes. La Houille Blanche 2014, 2, 5-13.
21. Viatgé, J., Berthet, L., Marty, R., Bourgin, F., Piotte, O., Ramos, M.-H., & Perrin, C. Vers une production en
temps réel d’intervalles prédictifs associés aux prévisions de crue dans Vigicrues en France. La Houille
Blanche 2019, 105(2), 63-71.
22. Hossein Hosseiny. A deep learning model for predicting river flood depth and extent. j.envsoft 2021, 145,
105-186.
23. Ji, H., Chen, Y., Fang, G., Li, Z., Duan, W., Zhang, Q. Adaptability of machine learning methods and
hydrological models to discharge simulations in data-sparse glaciated watersheds. Journal of Arid Land
2021, 13(6), 549-567.
24. Maier, H. R., & Dandy, G. C. Neural networks for the prediction and forecasting of water resources
variables: A review of modeling issues and applications. J.envsoft 2000, 15(1), 101-124.
25. Malik, A., Kumar, A., Tikhamarine, Y., Souag-Gamane, D., Kişi, Ö. Hybrid artificial intelligence models for
predicting daily runoff. In P. Sharma, D. Machiwal (Eds.). ASF 2021, 12, 305-329.
26. Barbe, Millet, Texier, Borel, Gualde. Les ressources en eaux superficielles de la République du Bénin. 1993,
540.
27. Daouda Mama, Véronique Deluchat, James Bowen, Waris Chouti, Benjamin Yao, Baba Gnon, Michel
Baudu. Caractérisation d’un Système Lagunaire en Zone Tropicale : Cas du lac Nokoué (Bénin). EJSR 2011,
56(4), 516-528.
28. Metogbe Belfrid Djihouessi, Martin Pépin Aina. A review of hydrodynamics and water quality of Lake
Nokoué: Current state of knowledge and prospects for further research. j.rsma 2018, 17, 2352-4855.
29. Texier, H., Colleuil, B., Profizi, J. P., Dossou, C. (1980). Le lac Nokoué, environnement du domaine margino-
littoral sud-béninois : Bathymétrie, lithofaciès, salinité, mollusque et peuplements végétaux (No 28 ; p.
115-142).
30. Tore, D. B., Alamou, A. E., Obada, E., Biao, E. I., Zandagba, E. B. J. Assessment of Intra-Seasonal Variability
and Trends of Precipitations in a Climate Change Framework in West Africa. ACS 2022, 12(01), 150-171.
31. Sedai, A., Dhakal, R., Gautam, S., Dhamala, A., Bilbao, A., Wang, Q., Wigington, A., & Pol, S. Performance
Analysis of Statistical, Machine Learning and Deep Learning Models in Long-Term Forecasting of Solar
Power Production. Forecasting 2023, 5(1), 1-14.
32. Murray, K., Rossi, A., Carraro, D., Visentin, A. On Forecasting Cryptocurrency Prices: A Comparison of
Machine Learning, Deep Learning, and Ensembles. Forecasting 2023, 5(1), 1-10.
33. Zhu, X., Guo, H., Huang, J. J., Tian, S., Xu, W., Mai, Y. An ensemble machine learning model for water
quality estimation in coastal areas based on remote sensing imagery. JEM 2022, 323, 116-187.
34. Wood, M., Ogliari, E., Nespoli, A., Simpkins, T., Leva, S. Day Ahead Electric Load Forecast: A
Comprehensive LSTM-EMD Methodology and Several Diverse Case Studies. Forecasting 2023, 5(1), 1-16.
35. Ömer Faruk, D. A hybrid neural network and ARIMA model for water quality time series prediction. EAAI
2010, 23(4), 586-594.
36. Sharma, P., Machiwal, D. Streamflow forecasting: Overview of advances in data-driven techniques. In P.
Sharma, D. Machiwal (Eds.). ASF 2021, 82(6), 1-50.
37. Calèche Nehemie Nounagnon Avahouin, Henri Sourou Totin Vodounon, Ernest Amoussou. Variabilité
climatique et production halieutique du lac Nokoué dans les Aguégués au Bénin. 2018, 8(2), 51-61.
38. Kawoun Alagbe Gildas, Ahamide Bernard, Chabi Amédée, Ayena Abraham, Adandedji Firmin, & Vissin
Expédit. Variabilité Pluvio-Hydrologique et Incidences sur les Eaux de Surface dans la Basse Vallée de
l’Ouémé au Sud-Est Bénin. IJPSAT 2020, 23(2), 52-65.
39. Line Kong A Siou, Anne Johannet, Valérie Borrell, Séverin Pistre. Complexity selection of a neural network
model for karst flood forecasting: The case of the Lez Basin (southern France). jhydrol 2011, 403, 367-380.
40. Hu, C., Q. Wu, H. Li, S. Jian, N. Li, and Z. Lou. Deep learning with a long short-term memory networks
approach for rainfall-runoff simulation. Water 2018, 10(11), 1543.
41. Thapa, S.; Zhao, Z.; Li, B.; Lu, L.; Fu, D.; Shi, X.; Qi, H. Snowmelt-driven streamflow prediction using
machine learning techniques (LSTM, NARX, GPR, and SVR). Water 2020, 12, 1734.
42. Ni, L.; Wang, D.; Singh, V.P.; Wu, J.; Wang, Y.; Tao, Y.; Zhang, J. Streamflow and rainfall forecasting by
two long short-term memory-based models. J. Hydrol. 2020, 583, 124296.
43. Luo, B.; Fang, Y.; Wang, H.; Zang, D. Reservoir inflow prediction using a hybrid model based on deep
learning. IOP Conf. Ser. Mater. Sci. Eng. 2020, 715, 012044.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those
of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s)
disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or
products referred to in the content.
Preprints.org (www.preprints.org) | NOT PEER-REVIEWED | Posted: 20 March 2024 doi:10.20944/preprints202403.1218.v1