Uploaded by fatinrazali

2023 Q2 PLOS One Machine Learning models development Great Lakes

advertisement
PLOS ONE
RESEARCH ARTICLE
Machine learning models development for
accurate multi-months ahead drought
forecasting: Case study of the Great Lakes,
North America
Mohammed Majeed Hameed ID1, Siti Fatin Mohd Razali ID1,2*, Wan Hanna Melini
Wan Mohtar1,2, Norinah Abd Rahman1,2, Zaher Mundher Yaseen3,4
a1111111111
a1111111111
a1111111111
a1111111111
a1111111111
1 Department of Civil Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan
Malaysia (UKM), Bangi, Selangor, Malaysia, 2 Green Engineering and Net Zero Solution (GREENZ),
Universiti Kebangsaan Malaysia (UKM), Bangi, Selangor, Malaysia, 3 Civil and Environmental Engineering
Department, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia, 4 Interdisciplinary
Research Center for Membranes and Water Security, King Fahd University of Petroleum & Minerals,
Dhahran, Saudi Arabia
* fatinrazali@ukm.edu.my
Abstract
OPEN ACCESS
Citation: Hameed MM, Razali SFM, Mohtar
WHMW, Rahman NA, Yaseen ZM (2023) Machine
learning models development for accurate multimonths ahead drought forecasting: Case study of
the Great Lakes, North America. PLoS ONE 18(10):
e0290891. https://doi.org/10.1371/journal.
pone.0290891
Editor: Sani Isah Abba, King Fahd University of
Petroleum and Minerals, SAUDI ARABIA
Received: May 3, 2023
Accepted: August 15, 2023
Published: October 31, 2023
Copyright: © 2023 Hameed et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: All data used in this
study are available in the supplementary file and
provided in Appendix B.
The Great Lakes are critical freshwater sources, supporting millions of people, agriculture,
and ecosystems. However, climate change has worsened droughts, leading to significant
economic and social consequences. Accurate multi-month drought forecasting is, therefore,
essential for effective water management and mitigating these impacts. This study introduces the Multivariate Standardized Lake Water Level Index (MSWI), a modified drought
index that utilizes water level data collected from 1920 to 2020. Four hybrid models are
developed: Support Vector Regression with Beluga whale optimization (SVR-BWO), Random Forest with Beluga whale optimization (RF-BWO), Extreme Learning Machine with
Beluga whale optimization (ELM-BWO), and Regularized ELM with Beluga whale optimization (RELM-BWO). The models forecast droughts up to six months ahead for Lake Superior
and Lake Michigan-Huron. The best-performing model is then selected to forecast droughts
for the remaining three lakes, which have not experienced severe droughts in the past 50
years. The results show that incorporating the BWO improves the accuracy of all classical
models, particularly in forecasting drought turning and critical points. Among the hybrid models, the RELM-BWO model achieves the highest level of accuracy, surpassing both classical
and hybrid models by a significant margin (7.21 to 76.74%). Furthermore, Monte-Carlo simulation is employed to analyze uncertainties and ensure the reliability of the forecasts.
Accordingly, the RELM-BWO model reliably forecasts droughts for all lakes, with a lead time
ranging from 2 to 6 months. The study’s findings offer valuable insights for policymakers,
water managers, and other stakeholders to better prepare drought mitigation strategies.
Funding: The author(s) received no specific
funding for this work.
Competing interests: The authors have declared
that no competing interests exist.
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
1 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
1. Introduction
Drought is a harmful and creeping climatic crisis that can occur in all climates and negatively
influences ecosystems and human societies. Unlike rapidly developing natural disasters (e.g.,
hurricanes, floods, and storms), drought is a slow-developing phenomenon that poses a significant and enduring threat to human daily life, agriculture, ecology, and the economy [1, 2].
The lack of precipitation for a long time is the direct and leading cause of drought [3]. Since
two decades ago, droughts have become increasingly common and severe due to climate
change brought on by global warming, causing food and water crises in some regions worldwide [4–6]. Moreover, the occurrence of drought leads to various issues, including desertification, water scarcity, and forest fires, which have negative repercussions at both local and global
scales [7–10]. Furthermore, drought that occurs in areas located in the North American continent and other countries has a great relationship with forest fires [11, 12]. Natural disasters
that struck almost all world countries have severely affected enormous numbers of people all
over the globe. Statistically analyzed data estimates that about 3.8 billion people worldwide
have suffered from these catastrophes and that half of those have suffered due to droughts [13].
Also, the drought disaster has directly or indirectly killed 1.3 million people out of 3.5 million
who died due to natural disasters.
Drought is typically categorized into four types: meteorological drought, agricultural
drought, hydrological drought, and socio-economic drought [14–16]. The three physical
drought categories (i.e., metrological, agricultural, and hydrological droughts) self-explanatorily describe the lack of water in the hydrological cycle. It can be said that drought is associated
significantly with precipitation, and thus the onset of metrological drought begins when there
is a rainfall deficit. Hydrological droughts often deal with the decrease in the river flow,
groundwater level, dam reservoir storage, and lake water level due to long-term meteorological
droughts [17]. In addition, persistent hydrological drought causes agricultural droughts. In
agricultural drought, soil moisture is deficient, resulting in a reduction in crop yields.
Several drought indices have been developed since the 1960s for monitoring and forecasting
droughts based on individual or multiple meteorological or hydrological variables (e.g., rainfall, evapotranspiration, streamflow, groundwater level, and runoff) [18]. It is worth mentioning that among all drought indexes, the "standardized precipitation index (SPI)", proposed by
[19], is broadly used to quantify drought across the world and receives considerable attention
from researchers [5, 20, 21]. Based on SPI principles, several hydrological parameters have
been derived and widely used to assess and monitor droughts, such as standardized streamflow
index (SSI) [22, 23], standardized hydrological drought index (SHDI) [24], standardized
groundwater level index (SGI) [25], and standardized runoff index (SRI) [26, 27]. These indices’ popularity and widespread use can be attributed to their ability to effectively evaluate and
track drought conditions across diverse timeframes and geographic locations while requiring
minimal data inputs and maintaining strong comparability in time and space [28, 29].
Since drought is a manifestation of climate variability events, scholars have developed
drought forecasting models to minimize the adverse effects of drought on agriculture, water
resource systems, socio-economic aspects, and hydropower generation. Notably, climate variability events refer to natural fluctuations in climate patterns that can affect temperature, precipitation, and other climate factors over time. Climate change can intensify the impacts of
drought, leading to more frequent occurrences of water scarcity [30, 31]. This poses significant
challenges for ecosystems and communities that depend on reliable access to water resources.
Thus, drought forecasting is substantial and necessary to make effective decisions in managing
water resources, executing drought mitigation strategies, and establishing robust early warning
systems [32]. Therefore, selecting an appropriate model is the most crucial factor in drought
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
2 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
forecasting after identifying the drought type. The forecast of droughts is frequently implemented using physical and data-driven models. Conceptual and physical models that necessitate substantial data about a catchment or basin process, resulting in complicated predicting
models [24]. However, data-driven models are more popular because they are flexible, require
less data, and do not consider the process of a studied catchment. In recent decades, different
machine learning (ML) models and statistical approaches were intensively utilized, such as
autoregressive integrated moving average [33, 34], support vector regression [35–37], adaptive
neuro-fuzzy inference system [38–40], artificial neural network (ANN) [41–43], deep learning
models [44, 45], and assembling based approaches like random forest [46, 47] for forecasting
drought over the world. Some researchers have developed hybrid models based on natureinspired algorithms that are effective and adaptable in forecasting drought. The meta-heuristic
algorithms can help produce models with high prediction accuracy by training or optimizing
the parameters of ML models, which substantially affects the model performance [48, 49].
Notably, fourteen meta-algorithms have been intensively used to improve the ML model’s performance in drought forecasting [24, 50–54].
Enhancing the structure of ML-based models in drought forecasting has aroused the interest of researchers in the last few years. Some scholars tackled fundamental issues of some ML
models, such as ANN. Despite being a popular approach among ML-based models in drought
sectors, this model encounters difficulties related to generalization, complexity, and learning
speed, along with the challenge of becoming stuck in a local minimum. Thus, an extreme
learning machine (ELM) was introduced to address the drawbacks of ANN (i.e., complexity,
slow convergence, and overfitting) and then used widely to solve engineering problems and
drought forecasting. ELM has several advantages compared to ANN, such as better generalization, fast learning, simpler structure, and superior performance [55, 56]. Accordingly, ELM
has been applied for drought forecasting in different climate regions and obtained promising
results [1, 30, 57, 58]. Nevertheless, it is possible to adjust the model structure to enhance its
performance and stability by using regularization parameters and nature-inspired algorithms
to train the model effectively.
Researchers have not paid enough attention to monitoring and forecasting hydrological
drought, despite its importance in managing water resources [59]. The main objective of this
study is to evaluate and forecast hydrological drought in the Great Lakes, situated in North
America, which contains over one-fifth of the world’s fresh water. The Great Lakes are an
essential source of fresh water for people, and droughts in these water bodies can significantly
impact water availability and the ecosystem. Notably, the recent droughts in the Great Lakes
have been mainly attributed to climate change-induced elevated water temperatures, resulting
in increased evaporation due to a substantial decline in winter ice cover [60]. This study used
Multivariate Standardized Lake Water Level Index (MSWI) to assess the hydrological drought
over the Grate Lakes. MSWI can simultaneously display the drought condition from the standpoints of many time scales. To forecast hydrological drought for different lead-time horizons
(from one to six months ahead), four hybrid models were developed by combining regularized
extreme learning machine (RELM), RF, ELM, and SVR with the Beluga whale optimization
(BWO) algorithm. The BWO algorithm is a novel nature-inspired algorithm with robust and
efficient performance in solving engineering problems. Moreover, it displays excellent and
superior results when tested against fifteen meta-heuristic algorithms in solving different engineering problems [61]. The hybrid models were compared and validated against established
standard models (RF, ELM, and SVR).
Previous drought works have suggested using standard performance metrics such as correlation of determinations (R2), root mean square error (RMSE), and others to choose the best
prediction models [62, 63]. However, drought time series data, particularly at higher time
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
3 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
scales, include a significantly high auto-correlation [64–66]. This can result in very high R2 values and lower forecasting errors, rendering such metrics less effective in selecting the best
models. Accordingly, this paper’s proposed advanced evaluation method focuses on accurately
detecting the turning points in drought data and evaluating the model’s ability to capture them
effectively. Finally, uncertainty analysis was used to evaluate the forecasting accuracy and select
which period lead times can be reliably forecasted.
2. Methodological overview
2.1 Random forest (RF)
The principle of an ensemble and bagging learning technique served as the foundation for creating the RF. The decision tree approach utilizing the bagging methodology is a crucial factor
in the effectiveness of this model in solving engineering problems, especially those related to
regression [67]. In RF, nodes are nominated randomly based on the most significant input features, enhancing the learning, forecasting, and maintaining reliability to improve the generalization capacity. The following procedures are considered to establish the RF model [57]:
i. From training data observations, select random i data samples bootstrap sample.
ii. Establish the decision tree related to the data selected (i) from the above step.
iii. Select the n-trees (ntrees) that require to be built.
iv. Repeat the steps from one to three.
v. Forecast MSWI by accumulating the aggregative forecasts of ntrees.
The RF model’s capacity has gained a good reputation in solving hydrological and environmental problems and also conducted in drought forecasting [46, 68–71]. Fig 1 provides the
random forest model’s flowchart.
2.2 Support vector regression (SVR)
SVR is a supervisor learning machine used widely in water resource applications and drought
forecasting. The main idea behind SVR is to map the input vectors to higher feature dimensional
spaces through a nonlinear mapping and then conduct a regression model based on the calculated feature spaces. Even with limited observations from the training datasets, the approach can
effectively find an excellent solution to nonlinear problems [48, 72]. The structural risk minimization (SRM) concept, which SVM employs, seeks to minimize the upper and lower bounds of the
generalization error, which is the sum of the training error and a confidence level [73, 74]. This
concept is distinct from the more common empirical risk minimization (ERM) method, which
focuses only on reducing the total error generated during the training stage.
Assuming the MSWIi in the datapoints is the actual normalized value of the ith sample
while the input parameter of that target is Xi. The sample set accordingly can be derived as
{(Xi, Yi)}i=1N where the N is the total samples, SVR uses the form in the following Equation to
approximate the function. Finally, the term Yi is this case is MSWIi.
MSWI ¼ f ðXÞ ¼ W:;ðXÞ þ b
ð1Þ
The ;(X) is the high-dimensional space. Estimating coefficients (W and b) requires using a
regularized risk function, represented by Eq (2).
Minimize :
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
1
1 XN
2
kWk þ C
L ðYi ; f ðXi ÞÞ
i¼1 ε
2
N
ð2Þ
4 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 1. Schematic of RF model.
https://doi.org/10.1371/journal.pone.0290891.g001
(
Lε ðMSWIi ; f ðXi ÞÞ ¼
jYi
f ðXi Þj � ε
f ðXi Þj
Others
0;
jMSWIi
ð3Þ
The regularized term in the above equations is kWk2, and C to calculate the trade-off
between training error and model flatness. In Eq (2), the second term represents the empirical
error determined by the intensity loss function (Eq 3). The loss function equals zero if the
anticipated value corresponds with the tube (a region around the predicted regression line).
Conversely, the loss is a value representing the difference between the tube residual of the tube
(ε) and the projected value [75]. The Equation above is converted into the principal objective
function represented by Eq 4 to estimate the coefficients (W and b).
Xn
�1
minmize W; x1 ; x∗1 ; b k Wk2 þ C
ðx1 þ x∗1 Þ
i¼1
2
8
>
<
Subject to :
>
:
MSWIi
b � ε þ x1 ;
W:;ðxi Þ
∗
1
W:;ðxi Þ þ b � ε þ x ; i ¼ 1; 2; . . . ; N
ð4Þ
∗
1
x1 ; x � 0
In the Equation above the slack variables are ξ1, and x∗1 . Eq (5) is expressed as follows once
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
5 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
the kernel function K(Xi, Xj) is introduced;
�
Minimize ai ; a∗i
�
�
1 XN XN
∗
∗
:
ða
a
Þða
a
Þ:K
X
;
X
i
j
i
j
i
j
i¼1
j¼1
2
Subject to :
ε
8 N
X
>
<
ða
i
>
:
XN
i¼1
ðai
a∗i Þ þ
XN
a∗i Þ ¼ 0
i¼1
i¼1
MSWIi ðai
a∗i Þ
ð5Þ
∗
i
ai ; a 2 ½0; C�
In the above Equation, the αi, and a∗i are the Lagrange multipliers while the i, j are different
samples that are used in the training dataset. Accordingly, Eq (3) can be expressed as below:
MSWI ¼ f ðXÞ ¼
N
X
ðai
a∗i ÞKðXi ; Xj Þ þ b
ð6Þ
i¼1
Notably, the Gaussian kernel density function is used in this study [76].
2.3 Extreme learning machine (ELM)
ELM is a cutting-edge algorithm invented in 2006 to train the single-layer feedforward neural
network. ELM’s inventor affirmed that this algorithm could replace the conventional algorithm (i.e., backpropagation algorithms) to train ANN due to its quick learning and superior
generalization [77]. Fig 2 shows the general schematic of the ELM model. The construction of
Fig 2. ELM model for forecasting MSWI.
https://doi.org/10.1371/journal.pone.0290891.g002
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
6 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
the ELM model can be expressed mathematically as follows:
M
X
f ðxÞ ¼
yi hi ðxi Þ ¼ hðxÞy
ð7Þ
i¼1
where θi = [θ1, θ2, I, θM]T is the output layer weight value. These values are significant because
they connect the m output nodes in the hidden layer of M nodes. The nonlinear ELM feature
mapping represents as hi(xi), hi(xi) = [hi(x), h2(xI. . .,hm(x)]T. The final ELM output or forecast
is f(x). The xi is the independent variable, and hi(x) is the output magnitude of the ith hidden
note output. It is possible in the ELM model that hidden nodes do not have unique output
functions. It can be said that each hidden neuron may have a different output function. However, Hi(xi) in practical problems can be expressed as:
Hi ðxi Þ ¼ Gðai ; bi ; xÞ; ai 2 Rd ; bi 2 R
ð8Þ
Parameters a and b are the hidden node parameters (weights and bias values), xi is the
input variable, R refers to the set of real number values, Rd is the d-dimensional set of real
numbers, and G is the nonlinear transfer function. The transfer function is essential in the
ELM model because it is responsible for mapping input data to a complex and higher dimensional space; hence, the model can find a satisfactory solution for complex engineering problems [78]. ELM algorithm through training ANN is based on two essential phases, the first
stage is random features mapping, and the second one is called linear parameters solving.
ELM uses several nonlinear activation functions to convert the inputs (i.e., variables) into feature space by randomly adjusting weights and bias values of t’e ELM’s hidden layer. Accordingly, the hidden layer parameters (a, b) are assigned randomly. The other significant stage in
ELM is to calculate the weight values (θ) which connect the hidden layer of the ELM model
with the output layer. These wights are linearly calculated using least square error sense by
minimizing the approximations error:
min kHy
y
Yk
ð9Þ
The term Y is this case is MSWI (training data) and the H matrix depends on input variables, and hidden layer parameters in the above formula can be calculated below [79]:
2
6
H¼6
4
hðx1 Þ
3
2
Gða1 x1 þ b1 Þ
���
GðaM x1 þ bM Þ
..
.
..
..
.
7 6
7¼6
5 4
..
.
hðxN Þ
Gða1 xN þ b1 Þ
.
3
7
7
5
ð10Þ
� � � GðaM xN þ bM Þ
The Y is the training data response, which can be expressed as:
2
y1 T
3
2
t11
6 . 7 6 .
7 6
Y¼6
4 .. 5 ¼ 4 ..
y1 T
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
tN1
���
..
.
t1M
3
.. 7
7
. 5
ð11Þ
� � � tNM
7 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
The k.k is the Frobenius norm, and thus, the calculated output weights |(θ*) can be determined as follow:
y∗ ¼ H þ Y
ð12Þ
In the above mathematical expression, the H+ is the Moore–Penrose generalized inverse of
the matrix of H. ELM differs from conventional neural networks in that there is no requirement to fine-tune every parameter of the model, including the weights and biases of the hidden
layer. After random selection of the hidden layer weight and bias values, the ELM can be considered a linear system. Then, the weight values that connect the hidden layer with the output
layer of ELM are systematically determined by the generalized inverse procedure of the hidden
layer output matrices. As a result, the ELM model is several times quicker than classical algorithms used for training feedforward networks.
2.4 Regularized extreme learning machine
As discussed in the previous section, the only output weight matrix of the ELM model is calculated from the training procedure. The training error should decrease to a minimum value to
obtain these weights. However, minimizing only the training error might reduce the generalization capacity of the developed model when estimating samples that were not included in
training data points [80]. Thus, the classical ELM solutions would have some serious weaknesses. The first problem is that ELM, based on the empirical risk minimization (ERM) concept, frequently leads to over-fitting models [81]. Due to its direct calculation of the lowest
norm least-squares solutions, ELM has a poor control capability, which is regarded as the second disadvantage. Another drawback is that it might lead to less reliable estimates, for example, if data includes outlier values or heteroskedasticity.
The highest level of generalizability is achieved in some advanced models when the weights’
norm values and forecasting training -error rate are minimized simultaneously [82]. Accordingly, this paper suggested RELM based on the structural risk minimization (SRM) concept of
statistical learning theory to address mentioned disadvantages [83]. The uniqueness of this
characteristic in the RELM model is attributed to its regularization parameter (γ). It is important to mention that the loss function for the ELM model is modified in this work using γ
parameter to reduce both the training error and the weight values norm simultaneously. The
mathematical expression of the cost function for RELM model can be illustrated below:
ERELM ¼ Min gkY
y
2
Hyk2 þ kyk2
2
ð13Þ
Subject to error ε:
ε¼Y
Hy
ð14Þ
Thus, the relation can be expressed as the following mathematical expression:
2
ERELM ¼ Min gkεk2 þ kyk2
2
ð15Þ
y
The Lagrangian Equation is defined as the follow to calculate the optimum solution:
2
2
Lðy; ε; lÞ ¼ gkεk2 þ kyk2 þ lT ðY
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
Hy
εÞ
ð16Þ
8 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
8
@L
>
¼ 0 ) 2y
>
>
>
@y
>
>
<
@L
¼ 0 ) 2y
>
@ε
>
>
>
>
>
: @L ¼ 0 ) Y
@l
HT l ¼ 0
HT l ¼ 0
ð17Þ
Hy ¼ 0
In the above equations, the Lagrangian multiplier matrix is λ, and ε is the training error, ε
= [ε1, ε2,. . .εN]T. The N is the total number of training samples. According to the previous
equations, the optimal weights of the RELM output layer are computed as the following Equation:
1 I
y^ ¼ ðH T HÞ ∗ H T Y
g
ð18Þ
The term I in the above Equation is called the unit matrix. The formula of calculated y^ vector is suitable when the hidden neurons are more than the training samples. Otherwise, the following Equation can be used to find the weights.
�
� 1
I
y^ ¼ H T HH T þ
∗Y
g
ð19Þ
2.5 Beluga whale algorithm (BWO)
BWO is a sophisticated metaheuristic algorithm invented in 2022 to solve engineering and
optimization problems [61]. The algorithm was employed to improve the efficiency of drought
forecasting by training the RELM model. BWO simulates the behaviour of beluga whales that
live in regular groups, searching for and catching prey in the sea [84]. The mathematical
model of BWO depends on three essential stages: exploration, exploitation, and whale fall. The
first stage (exploration phase) in the mathematical model simulates the swimming behaviour
of the Beluga whale, while the exploitation stage is inspired by preying behaviour of these animals. The third stage is inspired by the whale’s fall into the depths of the sea, or it was hunted
by people or predators such as polar bears; hence it’s called "whale fall". It is worth noting that
during the optimization process, the whale fall stage is executed in each iteration after the
exploration and exploitation stages are completed. The main steps needed to implement this
algorithm can be summarized as follows:
i. Initialization: input the algorithm parameters, such as the maximum number of iterations
(Tmax) and population size (n). The initial position of each population size (beluga whales)
is randomly assigned in the space search. Besides, the fitness values are computed according
to the given objective function.
ii. Extraction and exploration phases update: according to the balance factor (Bf) value, each
beluga whale is appointed to be involved in the exploration or exploitation stage. If the Bf
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
9 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
value exceeds 0.5, that means the Beluga whale is entering into the exploration phase, and
thus its position will be updated according to the following Equation:
(
xi;j Tþ1 ¼ xi;pj T þ ðxr;p1 T
xi;pj T Þ þ ð1 þ r1 Þsinð2pr2 Þ;
j ¼ even
xi;j Tþ1 ¼ xi;pj T þ ðxr;p1 T
xi;pj T Þ þ ð1 þ r1 Þcosð2pr2 Þ;
j ¼ odd
)
ð20Þ
The current iteration in the evacuation is T while the new position of ith whale on jth
dimensional space is xi,jT+1. The symbols r1, and r2 are random numbers between (0, 1). xi;pj T is
the position of the ith beluga whale on pj dimension.
Otherwise, when the Bf value is less than 0.5, the updating mechanism is controlled by the
exploitation stage, and eventually, the position of each whale is updated using Eq 21. After
that, the fitness values of new positions are determined and sorted to achieve the best outcome
in the current iteration.
Xi Tþ1 ¼ r3 Xbest T
r4 Xi T þ C1 :LF :ðXr T
Xi T Þ
ð21Þ
XiT and XrT are current positions for the ith beluga whale and a random beluga whale,
respectively. The two terms in the Equation (r3 and r4) are random numbers between (0, 1)
while C1 is a constant number that depends on r4 current and maximum iteration (T and
�
�
T
Tmax); C1 ¼ 2r4 1 Tmax
. Finally, the LF is the Levy flight function [85].
LF ¼ 0:05 �
�
s¼
m�s
ð22Þ
1
jvjb
Gð1 þ bÞ � sinðpb=2Þ
Gðð1 þ bÞ=2Þ � b � 2ð1 bÞ=2
�b1
ð23Þ
β is the default constant (1.5), while μ, and v are normally distributed random numbers.
iii. Whale fall phases update: The probability of whale fall (Wf) is determined in each iteration,
considering that some beluga whales may fall into the deep sea or die. Consequently, it is
essential in such cases to update the position of a beluga whale.
X Tþ1 i ¼ r5 Xi T
r6 Xr T þ r7 Xstep
ð24Þ
In the above equations, the r5, r6, and r7 are assigned randomly between one and zero.
Besides, Xstep is the step size of a whale fall.
Xstep ¼ ðub
lb Þ � expð C2 T=Tmax Þ
ð25Þ
In the above Equation, the ub, and lb are upper and lower bond variables. The C2 is a step
factor that depends mainly on Wf.
C2 ¼ 2Wf � n
ð26Þ
Finally, the probability of whale fall can be calculated using the following formula.
Wf ¼ 0:1
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
0:05 T=Tmax
ð27Þ
10 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
iv. Terminating condition check: If the current iteration exceeds the maximum number of iterations, the BWO algorithm terminates. Otherwise, repeat steps ii to iii.
2.6 Model development
Multiple ML-based models have been created to forecast droughts in the Great Lakes region
with the ability to forecast several months in advance. This study integrated a modified version
of ELM called RELM with BWO to create a hybrid model known as RELM-BWO. The performance of this model was compared to classical models such as ELM, SVR, and RF. The
RELM-BWO model was also validated against other hybrid models that combined classical
models with BWO, namely SVR-BWO, RF-BWO, and ELM-BWO. All models were trained
using past values (5-month lag) to forecast MSWI for several time steps ranging from 1 to 6
months ahead. In this study, time series drought data is divided into a training set (75% of the
data from December 1921 to January 1997) and a testing set (data points from February 1997
to December 2021). This division ratio balances the amount of data for training and testing in
machine learning models, resulting in good results [86, 87]. The method aims to train the
model on historical data and assess its predictive efficiency on the remaining portion. This
approach provides insights into the accuracy of future forecasts for drought data using historical information and is considered more challenging than random data division in time series
analysis.
Once the input selection was determined, all input vectors and their corresponding targets
were normalized to a range between 0 and 1. For the hybrid models, the primary objective of
BWO was to optimize the hyperparameters of the classical models, as they greatly influence the
accuracy of the model’s predictions. The objective function used was the root mean square
error, and BWO aimed to minimize this function by selecting the best hyperparameters for
each model. In the case of RELM-BWO, BWO was employed to determine the optimal value of
the regularization parameter (γ) and the weight and bias values of the input layer. The remaining parameters for the other models calculated by BWO are listed in Table 1. All the models
were implemented using MATLAB 2020a, and the primary process can be observed in Fig 3.
2.7 Statistical parameters
The forecasting results were evaluated to identify the best forecasting model. Several statistical
indicators have been used to assess the forecasting results, such as Mean absolute error (MAE),
Root means square error (RMSE), Relative error (RE%), and Correlation of determination
Table 1. Hyperparameters of the applied models.
Algorithm/
Model
RF
Parameter
Number of trees ranges from 26 to 70, while the leaf 2 [3,8]
SVR
Kernel Scale range from 0.2441 to o.91061, and Box Constrain range from 0.1449 to 0.9554.
ELM
Weights range from -1 to 1, Hidden Layer Nodes are 7.
BWO
The maximum iteration is 100, and the number population is 50.
RELM-BWO
SVR-BWO
RF-BWO
ELM-BWO
Weights range from -1 to 1, Hidden Layer Nodes are 7. BWO applied to compute hidden layer’s
weights, biases values, and the Regularization parameter (γ)
BWO applied to compute Kernel Scale, Box Constrain, and Epsilon
BWO applied to determine Number of trees, and leaf numbers.
Weights range from -1 to 1, Hidden Layer Nodes are 7. BWO applied to compute the hidden
layer’s weights, and biases values,
https://doi.org/10.1371/journal.pone.0290891.t001
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
11 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 3. Flow chart of model development.
https://doi.org/10.1371/journal.pone.0290891.g003
(R2). MAE is used widely to assess regression models in hydrological studies. This criterion
measures the mean absolute forecasting error. RMSE is one of the most famous indices used to
evaluate comparable models. RMSE measures the mean square forecasting error and has a
higher value than MAE. Besides, RE% is the absolute error divided by the actual data value,
while R2 is a measurement used to explain how much variability of the forecasted values can be
caused by its relationship to the measured data. The mathematical expression of the statistical
metrics can be seen as following [76]:
MAE ¼
n
1X
jMSWIobsi
n i¼1
MSWIpredi j
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
N
1X
2
RMSE ¼
ðMSWIobsi MSWIpredi Þ
n i¼1
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
ð28Þ
ð29Þ
12 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
RE% ¼
MSWIobsi MSWIpredi
� 100%
MSWIobsi
Pn
2
R ¼1
2
i¼1
ðMSWIobsi
i¼1
ðMSWIpredi
Pn
ð30Þ
MSWIpredi Þ
� pred Þ2
MSWI
ð31Þ
The best forecasting models provide the possible lower value of RMSE, RE%, and the highest R2. In the above formulas, the n is the total number of samples, MSWIobsi is the observed
� pred is the mean forecasted value.
drought value, MSWIpred is the forecasted values, and MSWI
i
2.8 Uncertainty analysis
The forecasting accuracy decreases when the lead time horizon increases [59, 88]; hence,
adopting an efficient norm to assess forecasting efficiency is crucial. Since it is difficult to evaluate the reliability of the ML-based models via classical performance measures (i.e., RMSE, R2,
etc.) in forecasting the drought at longer lead time scales, UN is used because it is an efficient
tool to examine the dependability of forecasting results. Based on the concept of Monte Carlo
Simulation, the UN of the adopted model is quantified. This study’s consideration of input
parameter uncertainty relates to the accuracy and representativeness of the input data used to
make forecasts [89]. The applied approach comprises producing random input data that
matches the empirical probability distribution utilized to represent the input parameter, executing the forecasting model, and generating output from the data obtained [90]. The current
study depends on randomly resampling the input data set without replacement 100000 times
while maintaining the same ratio rate between the training and testing sets [91, 92]. In other
words, for each observation, 100000 data points are generated from their corresponding
empirical probability distribution and then fed to the applied model to generate the responses.
To calculate the 95% confidence intervals for each response, the 2.5th(XL) and 97.5th(XU) percentiles of the cumulative distribution comprising 100000 datapoints are determined. The
validity of the final drought model is assessed by calculating its robustness measure, which is
determined by the proportion of observed drought values that fall within the 95% confidence
interval. A higher ratio indicates greater validity, while a lower ratio indicates the opposite.
The Equation for the 95% prediction uncertainty (95PPU) is provided below.
�
1
Bracketted by 95 PPU ¼ count mjX i L � X i Obs � X l U � 100
n
ð32Þ
In the Eq 32, the term 95PPU denotes 95% predicted uncertainty; n is the total number of
drought data records; i is the current month that can be changed from 1 to m; XiL, and XiU are
lower and upper bands of uncertainty, and XiObs is the observed data of ith month.
Additionally, the average width of the confidence interval is calculated using the d−factor,
which can be expressed through the following Equation:
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
d
factor ¼
1
d�x ¼
n
n
X
ðXU i
d�x
s�x
XL i Þ
ð33Þ
ð34Þ
i¼1
13 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
The s�x is the standard deviation of the observed data while the average distance between
the upper band (97.5th percentile) and the lower band (2.5th percentile). The higher value of
the d-factor represents a higher degree of uncertainty. Theoretically, the d-factor is zero, but it
normally cannot be attained due to the model uncertainty. Thus, the ideal value of the d-factor
is thus considered to be smaller than one [93].
3. Case study and data description
The Great Lakes are a collection of interconnected freshwater lakes with some sea features.
They are in the central-eastern part of North America and are linked to the Atlantic Ocean by
the Saint Lawrence River. All five lakes—Superior, Michigan, Huron, Erie, and Ontario—are
located inside or close to the international boundary between Canada and the United States
(see Fig 4 [94]). Lakes Michigan and Huron (Lake Michigan- Huron) are hydrologically considered one body of water since they are connected at the Straits of Mackinac. Furthermore,
the Great Lakes Waterway qualifies modern ships to navigate between the lakes. Since approximately one-fifth (21%) of the world’s surface freshwater is contained in the Great Lakes, they
are considered the largest group of freshwater lakes on the earth in terms of the total surface
area (244,106 km2) [95]. The Great Lakes are one of the most significant water resources in the
world, and they provide more than fifty million people in eastern North America with water
that can be used for various purposes [96]. The five lakes’ cumulative capacity is approximately
6 x 1015 gallons, which is enough water to drown North America to an average depth of one
meter and indicates how huge the lakes are. It is essential to mention that Lake Superior is the
most immense, profound, and upstream of the Lakes [97]. It is important to note that the
Fig 4. Location of the Great Lakes [1]. 1. Great Lake Map. Available: https://kids.britannica.com/kids/article/Great-Lakes/
346130#:—:text = The Great Lakes are five,of fresh water on Earth.
https://doi.org/10.1371/journal.pone.0290891.g004
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
14 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
water level data were acquired and gathered from open access source website [98]. The data
covers a long period of time from 1918 to 2021. Notably, the raw data for all lakes are available
in B1 Table in S1 File, which can be found in Appendix B.
4. Drought calculation
4.1 Multivariate standardized water level index (MSWI)
Standardized water level index (SWI) is calculated using the same method as the SPI, but
instead of using monthly rain, the water level is used as input [99]. The MSWI is calculated
based on several SWI time series for a specific station, and each time scale represents a particular time horizon. The MSWI considers every subset of SWI time scales (i.e., SWI-1, SWI-2,
SWI-3, . . ., to SWI-48 months) for studied lakes. Examining multiple SWI time series in detail
can offer valuable insights into drought events. However, comparing and analyzing these time
series at different scales can be challenging for scholars and may lead to confusion and inefficiencies in addressing the issues. In addition, it is challenging to precisely define the duration
of the timescale, such as long or short-term, and accurately connect it to a specific category of
drought impact. For example, short-term drought can be represented on a three-month time
scale, which raises the logical question of why a standard drought index cannot define two or
four-month scales. Accordingly, the standard drought index may not be derived from a specific criterion to classify some drought time scales (e.g., SWI-4, SWT-5, etc.) as short, medium,
or long scale. Given the concerns, some researchers suggested decreasing the number of examined drought aggregation time scales and accompanying series by employing a multivariate
approach [100]. Notably, a reduction in SWI series does not mean that the SWI for some time
scales are not calculated, but rather that a substantial portion of the variability of the SWI on
different time scales is summed up into certain series. Thus, the reduction of these drought
time scales is computed by using an effective method called principal component analysis
(PCA).
PCA is a widely used prominent approach for analyzing datasets with high-dimensional
space (features) [101], providing several benefits such as retaining maximum information
while reducing data features, increasing data interpretability, and enabling visualization of
multidimensional data. PCA displays the following linear combinations of the I original variables:
PCk ¼ Ek T X ¼
I
X
eik Xi ; i ¼ 1; 2; 3; . . . ;
ð35Þ
i¼1
In the above Equation, the PCk is the kth principal component while, Ek is the kth eigenvector. The original ith variable is represented in the above formula as Xi and finally, the eik is the
ith element of the kth eigenvector. In PCA, linear combinations are constructed from the original variables. These combinations, known as principal components, are formed in such a way
that they are mutually uncorrelated, meaning that they do not have any shared information.
The number of principal components that can be generated is determined by the number of
original variables in the dataset. The first principal component (PC1) can account for a significant proportion of the variance in the original variables. PCA is particularly useful when there
is a high correlation between variables in the dataset. A cross-correlation matrix of the predictors must be constructed to perform PCA, from which the eigenvalues and eigenvectors are
calculated. The eigenvector corresponding to the highest eigenvalue represents the weighted
coefficients of the first principal component, and this process is repeated for subsequent
components.
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
15 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 5. Drought categories based on the MSWI value.
https://doi.org/10.1371/journal.pone.0290891.g005
As PC1 holds the most essential information of the data, the MSWI is derived from this
component. However, unlike the standard index (e.g., SWI), which has a mean of zero and a
standard deviation of one, PC1 does not meet these requirements due to its algebraic characteristics, which make its values incomparable across different months or locations. Therefore, the
time series of PC1 must be standardized using the following Equation based on the mean and
standard deviation for various months throughout the year.
Z1YM ¼
PC1YM PC�1M
SD1M
ð36Þ
In the Eq 36, the Z1YM (MSWI) is the PC1 standardized value corresponding to Mth month
and Yth year, PC�1M , and SD1M are the average and PC1 component for Mth month. Notably,
the PC�1M value is very small and close to zero; therefore, it can be neglected. The data of MSWI
is organized in ascending order, and a graphic of its corresponding empirical probability distribution is shown to identify the categories of drought and wet severity (see Fig 5)[102].
4.2. Describing droughts in the Great Lakes
The main focus of this study was on two prominent lakes, specifically Lake Superior and Lake
Michigan-Huron, within the forecasting models. These lakes were given significant attention
due to their larger size and greater depth, distinguishing them from the other lakes. Besides,
they are located upstream, as the water flows from them to the rest of the lakes and then to the
sea (see A1 Fig in S1 Appendix). The figures attached in the appendix (A2 Fig-A4 Figs in
S1 Appendix) show that in the last fifty years, the three lakes (Lake Erie, Lake Ontario, and
Lake St. Clair) did not witness any severe drought events and that the lowest value of the
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
16 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 6. Water level elevation VS drought. a), and d) Boxplot diagrams showing the monthly variation of water level for Lake Superior and Lake MichiganHuron. b), and c) line graphs showing the yearly water level elevation while c), and f) presenting the MSWI from 1920 to 2020.
https://doi.org/10.1371/journal.pone.0290891.g006
drought indicator is around -0.5. The reason may be related to the dams constructed and the
lakes’ applied regulation process. Notably, Lake Superior and Lake Ontario are currently regulated under some plans implemented by the International Joint Commission. Since 1921, Lake
Superior’s levels have been regulated, while the water level of Lake Ontario was regulated in
1958. Because there is a significant difference in the elevation, the regulation process of Lake
Ontario does not affect the upper lakes’ water level. Also, the regulation of Lake Superior has
an important influence on the whole Great Lakes System. Notably, the detailed drought analysis via MSWI for all lakes can be found in B2 Table in S2 File, which is located in Appendix B.
According to Fig 6, there is a large variation in the monthly levels of Lake Superior compared to Lake Michigan- Huron. Both lakes have a large area, but the first one is located
upstream, and the water flows from it to the other lakes, which significantly varies the monthly
levels of stored water. Based on Fig 6C, and 6F, both lakes suffered from severe drought from
mid-1921 to 1929 and late 1998 to 2015. Lake Michigan-Huron generally experienced more
frequent drought events than Lake Superior; however, the latter faced the severest drought.
The analysis also found that Lake Michigan-Huron experienced four severe drought events
between 1920 and 2020, while Lake Superior Lake has only faced two drought events during
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
17 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 7. Temporal analysis of drought intensity in the Great Lakes region (1921–2021).
https://doi.org/10.1371/journal.pone.0290891.g007
the same period. The droughts of the 1921s and 1929s were primarily caused by reduced precipitation, whereas the drought of the 2000s was mainly attributed to the elevated water temperatures induced by climate change, resulting in enhanced evaporation due to a substantial
reduction in winter ice cove [60]. It is essential to mention that the MSWI can explain more
than 95% of the variability for the nominated SWI time scales in the observed lakes.
The temporal analysis of drought intensity in the Great Lakes region, as depicted in Fig 7,
unveils a dynamic pattern of fluctuating drought conditions from 1921 to 2021. Notably, Lake
Superior and Lake Michigan-Huron emerge as critical focal points with notable variations in
drought intensity. Conversely, Lake Ontario transitions from drier to wetter conditions, indicating an overall increase in drought values (MSWI) and a positive drought trend. Similarly,
Lake Erie experiences a substantial transition towards positive drought conditions. Lake
St. Clair also exhibits patterns consistent with the other lakes. This comprehensive analysis
demonstrates the diverse drought intensities observed across the Great Lakes over the analyzed
period, particularly emphasizing the significant decrease in drought observed in Lake Superior
and Lake Michigan-Huron from 1998 to 2021. This pronounced finding highlights a significant shift in drought conditions for these lakes, underscoring the region’s evolving nature of
drought dynamics.
5. Results and discussion
The results section is divided into two main scenarios. In the first scenario, the RELM-BWO
model is validated against standalone models, while the second scenario compares the performance of the RELM-BWO model with other hybrid models. Since severe drought periods
were observed in only Lake Superior and Lake Michigan-Huron over the past fifteen years, the
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
18 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
models were compared, and the best performing one was selected to forecast drought over the
remaining three lakes.
5.1 First Scenario: A comparison of RELM-BWO with classical models
5.1.1 Modelling results. This section discusses the evaluated perfo7rmance of four
applied models (RF, SVR, RLM, and RELM-BWO) in forecasting MSWI time series for different lead times using various statistical measures and comparable graphs for Lake Superior and
Lake Michigan-Huron. The performance measures over the testing phase for both lakes are
provided in Tables 2 and 3. The provided assessment shows that the best forecasting result for
Lake Superior was obtained for RELM-BWO (R2 = 0.9687, RMSE = 0.0202, and
MAE = 0.0156) compared to the other drought forecasting models. Similarly, the adopted
model performs better in forecasting one-step month ahead for Lake Michigan-Huron, attaining higher accuracy (R2 = 0.9825, RMSE = 0.0117, and MAE = 0.0093). Generally, the quantitative assessment showed that the performance of the forecasting models in Lake Superior is
slightly lower than in Lake Michigan-Huron. Furthermore, the second-best model for forecasting drought is ELM, followed by SVR and the RF shows the poorest performance.
The evaluated results showed that the forecasting accuracy of the models reduces when the
lead time increases. The forecasting results generally deteriorate with the increase in forecasting multistep month ahead drought due to accumulated errors. For example, for Lake Superior, the RF model is considered among the models the most influenced by the change of time
interval where the MARE% varies from 32.14 to 56.20% followed by SVR (5.53 to 35.74%),
ELM (2.46 to 36.72%), and RELM-BWO (2 to 30.62%). Regarding Lake Michigan-Huron, all
models, except the RF, suffered from a gradual decrease in performance with the increase in
lead time. It can be observed that these models have similar characteristics in forecasting the
drought for both lakes, producing the peak errors in the forecasting drought for a six-month
ahead lead time. The RF has an abnormal performance for forecasting drought at Lake Michigan-Huron as it generates the maximum error at the fourth-step lead time. However, the RF
Table 2. Performance indicators of machine learning-based models for Lake superior.
Model
SVR
Performance metrics
2
3
4
5
6
MAE
0.0497
0.0706
0.0867
0.0867
0.1596
0.1956
RMSE
0.0771
0.0887
0.1089
0.1289
0.1974
0.2380
MAPE %
5.53
9.34
12.11
12.14
26.62
35.74
2
RF
ELM
RELM-BWO
Period lead time
1
R
0.9320
0.9299
0.9256
0.9256
0.9002
0.8854
MAE
0.2126
0.2338
0.2768
0.3305
0.3148
0.3191
RMSE
0.2746
0.3086
0.3521
0.3969
0.3822
0.3838
MAPE %
32.14
37.61
41.83
49.44
43.05
56.20
R2
0.9280
0.9210
0.9000
0.8983
0.8825
0.8720
MAE
0.0206
0.0462
0.0767
0.0944
0.1443
0.1873
RMSE
0.0265
0.0601
0.0976
0.1229
0.1774
0.2329
MAPE %
2.46
6.15
10.45
14.31
28.98
36.72
R2
0.9561
0.9499
0.9561
0.9375
0.9252
0.8942
MAE
0.0156
0.0371
0.0587
0.0886
0.1191
0.1524
RMSE
0.0202
0.0489
0.0773
0.1149
0.1523
0.1919
MAPE %
2.00
4.93
8.44
13.61
20.31
30.62
R2
0.9687
0.9601
0.9572
0.9457
0.9342
0.9179
https://doi.org/10.1371/journal.pone.0290891.t002
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
19 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Table 3. Performance indicators of machine learning-based models for Lake Michigan-Huron.
Model
SVR
RF
ELM
RELM-BWO
Performance metrics
Period lead time
1
2
3
4
5
6
MAE
0.0222
0.0361
0.0438
0.0610
0.0806
0.1084
RMSE
0.0275
0.0434
0.0533
0.0751
0.0994
0.1338
MAPE %
6.28
7.41
7.77
10.41
13.31
17.11
R2
0.9709
0.9669
0.9652
0.9752
0.9529
0.9352
MAE
0.0798
0.1159
0.1303
0.0939
0.1093
0.1309
RMSE
0.1151
0.1748
0.1846
0.1162
0.1401
0.1688
MAPE%
15.19
24.11
28.01
18.89
20.01
23.06
R2
0.9608
0.9379
0.9300
0.9568
0.9465
0.9278
MAE
0.0138
0.0259
0.0396
0.0528
0.0696
0.0913
RMSE
0.0175
0.0321
0.0488
0.0652
0.0862
0.1129
MAPE %
3.17
6.31
7.23
8.43
14.37
16.19
R2
0.9737
0.9684
0.9640
0.9597
0.9569
0.9503
MAE
0.0093
0.0209
0.0339
0.0490
0.0670
0.0812
RMSE
0.0117
0.0258
0.0416
0.0611
0.0822
0.1005
MAPE %
2.82
5.76
7.69
8.16
13.44
15.45
R2
0.9825
0.9791
0.9772
0.9736
0.9681
0.9639
https://doi.org/10.1371/journal.pone.0290891.t003
has the lowest forecasting accuracy for all time intervals compared to other models for both
lakes.
The quantitative analysis, in general, shows that the most difficult case for the adopted
models is forecasting drought at a six-period lead time. In other words, forecasting drought for
the next six months is a major challenge for the proposed models because of the loss of important information as well as the sharp changes that occur in drought patterns through such a
period, which was revealed by the analyzes previously reported, so it is necessary to test the
efficacy of the models in this case. Accordingly, scatter plot diagrams, as shown in Figs 8 and 9
for both lakes, are created. Such a diagram allows us to visualize each data record and provides
much information on how the forecasting values deviate from their corresponding actual values. The figures show that the suggested model offers fewer scatter points than comparable
models. It is significant to conclude that the forecasting accuracy of Lake Michigan-Huron is
higher than Lake Superior. The reasons may relate to the training conditions of the forecasting
models. In this study, the models were calibrated using 75% of the drought time series data,
while the remaining 25% of the hydrological drought data was reserved for testing purposes.
The training data for Lake Michigan-Huron includes many positive and negative drought values, and the characteristics of the training data are more similar to the test data compared to
Lake Superior (see Fig 6C and 6F). In more detail, the training data for Lake Superior contained a small percentage (5.88%) of the severe drought data (MSWI � -1), while that value
approximately tripled (16.32%) when it came to Lake Michigan-Huron. To sum it up, the presence of multiple periods of severe drought conditions enhances the chance of obtaining accurate forecasts. Conversely, when such observations are few in the training data, it is normal for
a forecasting model to observe a decrease in accuracy, especially when asked to simulate severe
drought patterns.
Assessing the efficacy of the proposed models is crucial, particularly during the severe
drought period (MSWI � -1). The RE% values for each model are calculated (as illustrated in
Fig 10) across various lead-time intervals to determine the practical benefits of the
RELM-BWO model concerning its forecasting precision and performance compared to other
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
20 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 8. A comparison of observed MSWI and forecast values throughout a six-lead period time for Lake Superior.
https://doi.org/10.1371/journal.pone.0290891.g008
models. The proposed RELM-BWO model generally outperforms other comparable methods,
with the median relative error (depicted as a horizontal line within the boxplot) being lower.
5.1.2 Analyzing forecasts of turning points. The performance metrics listed in Tables 2
and 3 evaluate a model’s overall performance by considering all predicted points. However,
when forecasting drought, metrics such as RMSE, R2, and others may not be ineffective indicators due to high auto-correlation. An overly smoothed time series is unsuitable for testing a
prediction model’s effectiveness or robustness due to high auto-correlation. This can lead to
R2 values exceeding 0.9 for all models, which may not be significant for drought forecasting
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
21 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 9. A comparison of observed MSWI and forecast values throughout a six-lead period time for Lake Michigan-Huron.
https://doi.org/10.1371/journal.pone.0290891.g009
unless the model can accurately forecast turning points. Hence, assessing the error in turning
points of the drought time series data is critical to gauge the effectiveness of models used in
forecasting drought.
The primary objective of this study’s method was to evaluate the model’s ability to accurately simulate turning points in the drought data. An illustration of the process for identifying
critical and turning points in time series data can be found in Fig 11A. Once the these points
were detected, the corresponding observations from multiple models were collected. The final
step involved calculating the average absolute error of the turning and critical points (AAETP)
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
22 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 10. Relative error showing the forecasting accuracy of SVR, RF, ELM and RELM-BWO models over different lead times for Lake Superior and Lake
Michigan-Huron.
https://doi.org/10.1371/journal.pone.0290891.g010
for each model separately. It is significant to mention that the suggested process was conducted
for both lakes to assess the model’s predictability in forecasting hydrological drought up to six
months in advance.
AAETP ¼ average ðjEi jÞ
ð37Þ
where Ei is a vector is computed for each model by subtracting the actual values of the critical
and turning points from the values forecasted by the models.
In Fig 11B and 11C, it was demonstrated that the RELM-BWO model outperformed other
models in forecasting turning points and yielded fewer AAETP values. The RELM-BWO model’s effectiveness is measured by its ability to decrease the AAETP in drought forecast for both
lakes at different lead times. The study’s findings show that the RELM-BWO model outperforms traditional models like SVR, RF, and ELM for Lake Superior, resulting in a 37.33%,
76.74%, and 32.21% improvement in forecasting accuracy. However, the accuracy of drought
forecasting for Lake Michigan-Huron decreases slightly, but the RELM-BWO model still outperforms SVR, RF, and ELM by reducing forecasting errors by 16.18%, 56.27%, and 22.93%,
respectively.
5.2 Second Scenario: Validation of the hybrid models
This section presents a comparison of four hybrid models, namely RELM-BWO, SVR-BWO,
RF-BWO, and ELM-BWO, to forecast drought conditions over one to six months in advance
for the two largest lakes. The performance of each model in forecasting drought, considering
various lead times, is demonstrated in Table 4. The statistical analysis revealed that the
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
23 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 11. Analysis of the errors in the turning and critical points at different lead time forecasting. a) an example of detecting the turning points,
b) results of AAETP for Lake Superior, and c) results of AAETP for Lake Michigan-Huron.
https://doi.org/10.1371/journal.pone.0290891.g011
RELM-BWO model outperforms other models in terms of drought forecasting, exhibiting the
lowest values for MAE (ranging from 0.0156 to 0.1524), RMSE (ranging from 0.020 to 0.191),
MAPE (ranging from 2 to 30.62%), and highest R2 (ranging from 0.9179 to 0.9687) for Lake
Superior. Similarly, for Lake Michigan-Huron, RELM-BWO demonstrates the lowest values
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
24 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Table 4. Performance indicators of the hybrid machine learning-based models.
Model
ELM-BWO
SVR-BWO
RF-BWO
RELM-BWO
Lake Superior
Lead Time
MAE
RMSE
MAPE%
t+1
0.0194
0.0238
2.34
t+2
0.0381
0.0576
5.62
Lake Michigan-Huron
R2
AAETP
MAE
RMSE
MAPE%
R2
0.9602
0.0165
0.0119
0.0156
2.94
0.9801
0.0140
0.9579
0.0335
0.0233
0.0293
6.07
0.9745
0.0275
AAETP
t+3
0.0701
0.0942
9.06
0.9434
0.0450
0.0370
0.0463
7.12
0.9730
0.0427
t+4
0.0865
0.1218
12.63
0.9291
0.0793
0.0508
0.0647
8.38
0.9703
0.0441
t+5
0.1323
0.1659
25.90
0.9239
0.0949
0.0684
0.0831
14.28
0.9666
0.0581
t+6
0.1675
0.2073
35.35
0.9101
0.1095
0.0891
0.1097
15.97
0.9602
0.0655
0.0420
Overall
0.0857
0.1118
15.15
0.9374
0.0631
0.0468
0.0581
9.13
0.9708
t+1
0.0432
0.0686
4.04
0.9524
0.0384
0.0175
0.0216
3.85
0.9794
0.0203
t+2
0.0557
0.0736
5.96
0.9502
0.0496
0.0268
0.0330
5.59
0.9744
0.0309
t+3
0.0709
0.0942
12.23
0.9473
0.0572
0.0403
0.0498
7.73
0.9725
0.0391
t+4
0.0684
0.1188
11.15
0.9446
0.086
0.0592
0.0725
10.01
0.9694
0.0411
t+5
0.1505
0.1848
24.83
0.9323
0.0985
0.0768
0.0943
13.32
0.9649
0.0489
t+6
0.1848
0.2136
32.45
0.9091
0.1412
0.1020
0.1252
17.48
0.9573
0.0609
Overall
0.0956
0.1256
15.11
0.9393
0.0785
0.0538
0.0661
9.66
0.9697
0.0402
t+1
0.0626
0.0844
10.84
0.9514
0.0545
0.0280
0.0352
7.63
0.9754
0.0280
t+2
0.1026
0.1282
17.65
0.9484
0.0824
0.0454
0.0564
11.78
0.9739
0.0433
t+3
0.1329
0.1700
22.48
0.9442
0.0804
0.0661
0.0818
13.74
0.9711
0.0601
t+4
0.1907
0.2324
33.72
0.9404
0.1387
0.0860
0.1089
14.41
0.9665
0.0698
t+5
0.2230
0.2690
37.27
0.9232
0.1564
0.1062
0.1376
16.98
0.9599
0.0792
t+6
0.2713
0.3287
46.08
0.8907
0.1785
0.1249
0.1604
24.65
0.9487
0.0883
Overall
0.1639
0.2021
0.0761
0.0967
0.0615
28.01
0.9331
0.1152
14.87
0.9659
t+1
0.0156
0.0202
2.00
0.9687
0.0135
0.0093
0.0117
2.82
0.9825
0.0112
t+2
0.0371
0.0489
4.93
0.9601
0.0262
0.0209
0.0258
5.76
0.9791
0.0228
t+3
0.0587
0.0773
8.44
0.9572
0.0350
0.0339
0.0416
7.69
0.9772
0.0383
t+4
0.0886
0.1149
13.61
0.9457
0.0652
0.0490
0.0611
8.16
0.9736
0.0437
t+5
0.1191
0.1523
20.31
0.9342
0.0865
0.0670
0.0822
13.44
0.9681
0.0494
t+6
0.1524
0.1910
30.62
0.9179
0.0870
0.0812
0.1005
15.45
0.9639
0.0582
Overall
0.0786
0.1008
13.32
0.9473
0.0522
0.0436
0.0538
8.89
0.9741
0.0373
https://doi.org/10.1371/journal.pone.0290891.t004
for MAE (ranging from 0.0093 to 0.0812), RMSE (ranging from 0.0117 to 0.1005), MAPE
(ranging from 2.82 to 15.45%), and highest R2 (ranging from 0.9639 to 9825). Overall, the
RELM-BWO model is considered the best-performing model, followed by ELM-BWO,
SVR-BWO and RF-BWO, respectively.
The RELM-BWO model’s accuracy has been tested against other hybrid models by measuring its ability to reduce the AAETP indicator for various lead times in the selected lakes. The
overall results, which are presented in Table 4, demonstrate that the RELM-BWO model outperforms the ELM-BWO, SVR-BWO, and RF-BWO models, respectively, for Lake Superior.
This results in a significant improvement in forecasting accuracy of 17.27%, 29.65%, and
54.69%. Similarly, for Lake Michigan-Huron, the RELM-BWO model was able to significantly
decrease the AAETP indicator percentage values, resulting in an 11.19%, 7.21%, and 39.35%
improvement compared to the ELM-BWO, SVR-BWO, and RF-BWO models, respectively.
The accuracy of drought forecasting six months in advance, as exemplified by the hybrid
models, has been visually analyzed using Taylor plots (refer to Fig 12) and Violin plots (refer
to Fig 13). Taylor plots reveal that the RELM-BWO model demonstrates a closer standard
deviation to the calculated drought than other models, with the highest R2 value and the lowest
RMSE for both lakes. Evaluating the effectiveness of these models, especially during critical
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
25 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 12. Comparing hybrid models in forecasting MSWI using Taylor diagrams. a) Lake Superior and b) Lake Michigan-Huron.
https://doi.org/10.1371/journal.pone.0290891.g012
drought periods (MSWI � -1), is essential. The RE% values of each model have been calculated
(as depicted in Fig 13) to ascertain the practical advantages of the RELM-BWO model in terms
of its forecasting accuracy and performance compared to other models. The proposed
RELM-BWO model generally outperforms hybrid models, with a lower median relative error
(indicated by the white hollow point near the Figure’s centre) and a smaller interquartile range
(a grey line at the Figure’s core).
5.3 Validation of RELM-BWO: A comparative analysis against previous
models in literature
A more thorough analysis is needed to confirm the superiority of the proposed RELM-BWO
model in forecasting drought one month ahead. This requires verifying the accuracy of the
RELM-BWO model by comparing its performance with effective models developed in previous studies. One such study employed sophisticated models that combined wavelet transform
(WA) with ANFIS, resulting in the WANFIS model [103]. The study’s conclusion indicates
that the WANFIS model performs satisfactorily in forecasting mid-term drought, achieving
the highest accuracy with an R2 value of 0.9133. Another study examined a model created by
merging WA and ANN, which achieved an impressive R2 of 0.938 [104]. Additionally, an
innovative approach was developed by combining LSTM and ARIMA models to enhance the
accuracy of hydrological drought forecasts [105]. The integration of these models resulted in a
hybrid model that achieved an impressive R2 score exceeding 0.91. Furthermore, a separate
study utilized an advanced forecasting model that combined Online Sequential ELM with two
decomposing filters, namely "Kalman filter regression and coupled with the boundary corrected maximal overlap discrete wavelet transform" [1], to forecast drought in Iran and the
model demonstrated a high accuracy with an R2 of 0.93. Furthermore, a hybrid model
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
26 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 13. Comparing hybrid models in forecasting MSWI using Violin plots. a) Lake Superior and b) Lake Michigan-Huron.
https://doi.org/10.1371/journal.pone.0290891.g013
incorporating ANN, SARIMA, and a genetic algorithm was employed to forecast hydrological
drought, yielding a good R2 of 0.945 [106].
Other researchers have introduced advanced models using Fuzzy-SVR, Boosted-SVR, and
classical SVR for drought forecasting, with the Fuzzy-SVR model performing the best with an
R2 of 0.903 [107]. Additionally, other researchers have examined individual models for drought
forecasting, such as ANN, achieving good results (R2 > 0.9) [62]. Lastly, a sophisticated combined model was developed by integrating WA, ARIMA, and ANN, resulting in a higher R2 of
0.872 [108]. In summary, the models assessed demonstrated satisfactory accuracy, with R2 values ranging from 0.872 to 0.945. However, these models’ forecasting accuracy remains lower
than that of the RELM-SO model, which boasts a superior forecasting accuracy (R2 > 0.968).
5.4 Results of uncertainty analysis (UA)
According to the previous discussions and the results presented, it can be confirmed that the
RELM-BWO has the best performance among the employed models, obtaining the most accurate forecasts. Even though this model is superior, the forecasting accuracy gradually decreases
when the lead time increases. Hence, it is critical to determine the maximum lead time for reliable drought forecasting that the proposed model can provide. Therefore, it is necessary to
conduct the uncertainty analysis to ensure the reliability and validity of the results obtained.
UA results provide much information on how the variations of forecasting results take
place due to the input variations. The robust model has solid stability in its outcomes when
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
27 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 14. Uncertainty analysis showing the confidence intervals of drought index for A) Lake Superior, and B) Lake Michigan-Huron.
https://doi.org/10.1371/journal.pone.0290891.g014
there are a few chances in the input variables. In this study, the uncertainty analysis was performed for the best models (RELM-BWO) throughout the testing phase. The values bracketed
by lower and upper 95PPU for both lakes exceeded 88.5% (see Fig 14). Notably if the value
bracketed by 95PPU is more than 80%, the models have produced good forecasting [109].
However, Fig 15 shows that the d-factor values generally have a gradual increase when the lead
time increases. For Lake Superior, the model provides excellent forecasting for the one-and
two-month step-ahead drought because the forecasting results have less uncertainty with d-factor values ranging from 0.493 to 0.595. Conversely, the model provides a higher uncertainty
(d-factor is more than one) in three-, four-, five, and six-month ahead, indicating that the
drought forecasting results in these time intervals are not good, and the model performs
poorly. According to the same Figure, it can be observed that the forecasting accuracy of Lake
Michigan-Huron is slightly better than Lake Superior because it provides a lower value of the
d-factor, supporting outcomes of previously described statistical indicators. The slight discrepancy in the accuracy of drought prediction may be due to the different hydrological characteristics of the two studied lakes. Other factors, such as the volume, surface area of the studied
lakes, and regulation operations, may affect the fluctuation of the levels of those lakes and consequently influence the drought patterns. It can be said that the suggested model has a solid
level of accuracy when forecasting a drought of one-, two-, and three-month ahead because
the d-factor (less than one) ranges from 0.403 to 0.885.
Forecasting hydrological drought for extended time horizons is challenging due to various
factors that can increase uncertainty and errors in the forecast. As the forecasting horizon
increases, the relationship between inputs and outputs becomes more complicated, hindering
prediction accuracy. This increasing complexity often results in higher levels of uncertainty,
making it more challenging to predict future values accurately. Other factors impacting forecast accuracy include changes in underlying data patterns, seasonal effects, and unexpected
events that affect data patterns.
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
28 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Fig 15. Values of d-factor showing the reliable accuracy of forecasted drought event based on lead time intervals for Lake Superior and Lake
Michigan-Huron.
https://doi.org/10.1371/journal.pone.0290891.g015
5.5 Investigating the use of RELM-BWO for forecasting drought in Lake
Ontario, Lake Erie, and Lake St. Clair
It was also demonstrated that the proposed model (RELM-BWO) effectively forecasts drought
conditions in two greatest lakes. Therefore, it is crucial to utilize this model for forecasting
droughts in the remaining three lakes (Lake Ontario, Lake Erie, and Lake St. Clair) that have
not experienced severe droughts in the past fifty years. The performance metrics of the model’s
drought forecasts, ranging from one to six months in advance for all lakes, are provided in
Table 5. The quantitative results indicate that the model exhibits the highest accuracy in forecasting droughts for Lake St. Clair (MAE ranging from 0.0184 to 0.1139, RMSE ranging from
0.0240 to 0.1383, and MAPE ranging from 5.96 to 33.75%). The model also demonstrates the
lowest AAETP values (ranging from 0.0178 to 0.1091) and the highest R2 values of 0.9905 to
0.9635). The model’s accuracy is satisfactory for the other two lakes, with Lake Ontario exhibiting the least accurate forecasts compared to the other lakes due to higher uncertainty values.
Uncertainty analysis shows that the model reliably forecasts droughts up to six months in
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
29 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Table 5. Performance indicators of the hybrid model for Lake Ontario, Lake Erie, and Lake St. Clair, the use of bold numbers indicates that the forecasting is
deemed unreliable due to significant uncertainty in the projected values.
Lake/ lead time forecasting
Lake Ontario
Lake Erie
Lake St. Clair
Statistical Parameter
Uncertainty Analysis
2
Interval
MAE
RMSE
MAPE%
R
AAETP
PPU95
t+1
0.0260
0.0327
15.31
0.9853
0.0209
85.62
0.4271
t+2
0.0574
0.0717
24.15
0.9701
0.0347
69.23
0.59.15
d-factor
t+3
0.0896
0.1153
45.46
0.9383
0.0592
61.74
0.6545
t+4
0.1276
0.1603
65.74
0.8906
0.1026
49.33
0.6704
t+5
0.1542
0.1946
95.46
0.8446
0.1432
48.32
0.7790
t+6
0.1805
0.2287
114.47
0.7974
0.1703
57.38
1.0757
t+1
0.0207
0.0269
17.33
0.9891
0.0189
0.88
0.2666
t+2
0.0400
0.0499
22.38
0.9871
0.0304
79.60
0.3746
t+3
0.0626
0.0774
55.53
0.9830
0.0495
80.27
0.3753
t+4
0.0849
0.1036
88.54
0.9776
0.0659
67.45
0.5009
t+5
0.0997
0.1239
110.82
0.9719
0.0860
68.46
0.6170
t+6
0.1265
0.1515
171.17
0.9634
0.1134
65.77
0.6624
t+1
0.0184
0.0240
5.96
0.9905
0.0178
86.96
0.2524
t+2
0.0345
0.0442
11.15
0.9884
0.0219
84.28
0.3722
t+3
0.0537
0.0673
17.97
0.9841
0.0421
72.48
0.4156
t+4
0.0716
0.0902
23.54
0.9784
0.0758
68.12
0.4491
t+5
0.0929
0.1146
28.31
0.9723
0.0963
64.77
0.4667
t+6
0.1139
0.1383
33.75
0.9635
0.1091
64.09
0.5195
https://doi.org/10.1371/journal.pone.0290891.t005
advance, except for Lake Ontario, where dependable forecasts are generated only up to three
months ahead.
5.6 Discussion
The study’s results indicate that including the applied metaheuristic algorithm (BWO) significantly improves the accuracy of classical approaches like RF, SVR, and ELM in drought forecasting across different lead times. These results are consistent with previous research [51,
110], indicating that employing hybridized machine learning methods improves forecast accuracy and overall predictive performance. By utilizing the BWO algorithm to optimize hyperparameters of standalone models, the forecasting of MSWI in the studied lakes becomes more
accurate. Both statistical indices and visual inspections confirm that the BWO algorithm significantly enhances the accuracy of all models when forecasting hydrological drought based on
MSWI. The BWO algorithm provides precise parameters to the models, resulting in an
increase in their generalization capacity. The RELM-BWO model emerges as the most effective
among the hybrid models tested. The study illustrates that the optimal computation of the regularization parameter using the BWO algorithm has made the RELM-BWO model superior to
the ELM-BWO model. By enhancing accuracy and performance, the BWO algorithm solidifies
the RELM-BWO model as the top-performing hybrid model for MSWI-based drought
forecasting.
As the forecasting lead time increases, the accuracy of all hybrid models gradually decreases,
making it challenging to select reliable forecasts. To address this, uncertainty analysis proves to
be an effective tool for identifying reliable forecasts. The study reveals that some lakes can reliably forecast drought up to 2 months ahead, while others require 3- or 6-months lead time.
The varying accuracy may be attributed to the data quality, indicating that models trained with
different drought conditions yield more accurate forecasting results.
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
30 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
6. Conclusion
This study utilized four hybrid ML-based models (SVR-BWO, ELM-BWO, RF-BWO, and
RELM-BWO) and three classical models (ELM, RF, and SVR) to forecast drought in the Great
Lakes region for several months in advance (1 to 6 months). Drought calculations were performed via MSWI using water level data from 1918 to 2020. The dataset was split into a training set (December 1921 to January 1997) and a testing set (February 1997 to December 2021)
for model development. Over the past 50 years, severe drought events have been limited to
Lake Superior and Lake Michigan-Huron. Thus, the selected models were employed to forecast droughts for several months in advance, specifically for these two lakes, and the best
model was used to forecast the drought for the other three lakes. To assess the models’ ability
to capture turning points, a new indicator called AAETP was introduced, as traditional criteria
(RMSE and R2) were sometimes insufficient due to high auto-correlation in the drought dataset. The hybrid models, with BWO optimization, outperformed the classical models. Among
them, RELM-BWO was the most accurate and reliable model. Evaluating the recommended
model’s effectiveness in reducing AAETP for various lead times, it was observed that, in the
case of Lake Superior, it achieved substantial improvements of 37.33%, 76.74%, and 32.21%
compared to SVR, RF, and ELM, respectively. Similarly, for Lake Michigan-Huron, the
RELM-BWO model demonstrated reductions of 16.18%, 56.27%, and 22.93% in AAETP compared to SVR, RF, and ELM, respectively. Also, the RELM-BWO consistently outperformed
other hybrid models, with enhancements of 17.27% to 54.69% for Lake Superior and 7.21% to
39.35% for Lake Michigan-Huron.
Generally, as the forecast lead time increased, all models’ accuracy declined. However, the
proposed model showed a slightly smaller accuracy deterioration than the others. To ensure
the reliability of the forecasting results, uncertainty analysis was conducted using d-factor and
95PPU criteria. The analysis exhibited that RELM-BWO reliably forecasts droughts up to two
months in advance for Lake Superior and up to three months ahead for Lake MichiganHuron. Furthermore, uncertainty analysis confirmed that RELM-BWO can estimate droughts
up to six months in advance for Lake Erie and Lake St. Clair, and up to three months ahead for
Lake Ontario.
Supporting information
S1 Appendix.
(DOCX)
S1 File. Water level data.
(CSV)
S2 File. Droughts.
(XLSX)
Acknowledgments
The authors would like to thank the Universiti Kebangsaan Malaysia for supporting this
research.
Author Contributions
Conceptualization: Mohammed Majeed Hameed, Siti Fatin Mohd Razali.
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
31 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
Data curation: Siti Fatin Mohd Razali, Wan Hanna Melini Wan Mohtar, Norinah Abd
Rahman.
Formal analysis: Wan Hanna Melini Wan Mohtar.
Funding acquisition: Norinah Abd Rahman.
Investigation: Wan Hanna Melini Wan Mohtar, Zaher Mundher Yaseen.
Methodology: Mohammed Majeed Hameed.
Project administration: Siti Fatin Mohd Razali.
Software: Mohammed Majeed Hameed.
Validation: Siti Fatin Mohd Razali, Wan Hanna Melini Wan Mohtar, Zaher Mundher Yaseen.
Visualization: Mohammed Majeed Hameed, Norinah Abd Rahman.
Writing – original draft: Mohammed Majeed Hameed, Siti Fatin Mohd Razali.
Writing – review & editing: Zaher Mundher Yaseen.
References
1.
Jamei M, Ahmadianfar I, Karbasi M, Malik A, Kisi O, Yaseen ZM. Development of wavelet-based Kalman Online Sequential Extreme Learning Machine optimized with Boruta-Random Forest for drought
index forecasting. Eng Appl Artif Intell. 2023; 117: 105545. https://doi.org/10.1016/j.engappai.2022.
105545
2.
Kafy A- Al, Bakshi ASaha, Faisal A Al, Almulhim AI, Rahaman ZA, et al. Assessment and prediction of
index based agricultural drought vulnerability using machine learning algorithms. Sci Total Environ.
2023; 867: 161394. https://doi.org/10.1016/j.scitotenv.2023.161394 PMID: 36634773
3.
Fathi-Taperasht A, Shafizadeh-Moghadam H, Minaei M, Xu T. Influence of drought duration and
severity on drought recovery period for different land cover types: evaluation using MODIS-based indices. Ecol Indic. 2022; 141: 109146. https://doi.org/10.1016/j.ecolind.2022.109146
4.
Hasan HH, Razali SF, Muhammad NS, Ahmad A. Modified Hydrological Drought Risk Assessment
Based on Spatial and Temporal Approaches. Sustainability. 2022. https://doi.org/10.3390/
su14106337
5.
Malik A, Kumar A, Salih SQ, Kim S, Kim NW, Yaseen ZM, et al. Drought index prediction using
advanced fuzzy logic model: Regional case study over Kumaon in India. PLoS One. 2020; 15:
e0233280. https://doi.org/10.1371/journal.pone.0233280 PMID: 32437386
6.
Halder B, Haghbin M, Farooque AA. An Assessment of Urban Expansion Impacts on Land Transformation of Rajpur-Sonarpur Municipality. Knowledge-Based Eng Sci. 2021; 2: 34–53.
7.
Ghebrezgabher MG, Yang T, Yang X, Wang C. Assessment of desertification in Eritrea: land degradation based on Landsat images. J Arid Land. 2019; 11: 319–331.
8.
Long-term Mishra V. (1870–2018) drought reconstruction in context of surface water security in India.
J Hydrol. 2020; 580: 124228. https://doi.org/10.1016/j.jhydrol.2019.124228
9.
Awadh SM, Al-Mimar H, Yaseen ZM. Groundwater availability and water demand sustainability over
the upper mega aquifers of Arabian Peninsula and west region of Iraq. Environment, Development
and Sustainability. 2020. https://doi.org/10.1007/s10668-019-00578-z
10.
Bandyopadhyay J, Karar D. Modeling fire station establishment of industrial area using geo-spatial science. Knowledge-Based Eng Sci. 2023; 4: 19–36.
11.
Littell JS, Peterson DL, Riley KL, Liu Y, Luce CH. A review of the relationships between drought and
forest fire in the United States. Glob Chang Biol. 2016; 22: 2353–2369. https://doi.org/10.1111/gcb.
13275 PMID: 27090489
12.
Richardson D, Black AS, Irving D, Matear RJ, Monselesan DP, Risbey JS, et al. Global increase in
wildfire potential from compound fire weather and drought. npj Clim Atmos Sci. 2022; 5: 23. https://doi.
org/10.1038/s41612-022-00248-4
13.
Obasi GOP. WMO’s role in the international decade for natural disaster reduction. Bull Am Meteorol
Soc. 1994; 75: 1655–1661.
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
32 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
14.
Wilhite DA, Glantz MH. Understanding: the drought phenomenon: the role of definitions. Water Int.
1985; 10: 111–120.
15.
Heim RR Jr. A review of twentieth-century drought indices used in the United States. Bull Am Meteorol
Soc. 2002; 83: 1149–1166.
16.
Beyaztas U, Yaseen ZM. Drought interval simulation using functional data analysis. J Hydrol. 2019;
579: 124141. https://doi.org/10.1016/j.jhydrol.2019.124141
17.
Hasan HH, Mohd Razali SF, Muhammad NS, Ahmad A. Research Trends of Hydrological Drought: A
Systematic Review. Water. 2019. https://doi.org/10.3390/w11112252
18.
Hosseini TSM, Hosseini SA, Ghermezcheshmeh B, Sharafati A. Drought hazard depending on elevation and precipitation in Lorestan, Iran. Theor Appl Climatol. 2020. https://doi.org/10.1007/s00704020-03386-y
19.
McKee TB, Doesken NJ, Kleist J. The relationship of drought frequency and duration to time scales.
Proceedings of the 8th Conference on Applied Climatology. Boston, MA, USA; 1993. pp. 179–183.
20.
Abd Alraheem E, Jaber NA, Jamei M, Tangang F. Assessment of Future Meteorological Drought
Under Representative Concentration Pathways (RCP8. 5) Scenario: Case Study of Iraq. KnowledgeBased Eng Sci. 2022; 3: 64–82.
21.
Kumar S, Ahmed SA, Harishnaika N, Arpitha M. Spatial and Temporal Pattern Assessment of Meteorological Drought in Tumakuru District of Karnataka during 1951–2019 using Standardized Precipitation Index. J Geol Soc India. 2022; 98: 822–830. https://doi.org/10.1007/s12594-022-2073-3
22.
Shamshirband S, Hashemi S, Salimi H, Samadianfard S, Asadi E, Shadkani S, et al. Predicting Standardized Streamflow index for hydrological drought using machine learning models. Eng Appl Comput
Fluid Mech. 2020; 14: 339–350. https://doi.org/10.1080/19942060.2020.1715844
23.
Aghelpour P, Bahrami-Pichaghchi H, Varshavian V. Hydrological drought forecasting using multi-scalar streamflow drought index, stochastic models and machine learning approaches, in northern Iran.
Stoch Environ Res Risk Assess. 2021; 35: 1615–1635. https://doi.org/10.1007/s00477-020-01949-z
24.
Nabipour N, Dehghani M, Mosavi A, Shamshirband S. Short-Term Hydrological Drought Forecasting
Based on Different Nature-Inspired Optimization Algorithms Hybridized with Artificial Neural Networks.
IEEE Access. 2020; 8: 15210–15222. https://doi.org/10.1109/ACCESS.2020.2964584
25.
Babre A, Kalvāns A, Avotniece Z, Retiķe I, Bikše J, Jemeljanova KPM, et al. The use of predefined
drought indices for the assessment of groundwater drought episodes in the Baltic States over the
period 1989–2018. J Hydrol Reg Stud. 2022; 40: 101049. https://doi.org/10.1016/j.ejrh.2022.101049
26.
Kubiak-Wójcicka K, Pilarska A, Kamiński D. The Analysis of Long-Term Trends in the Meteorological
and Hydrological Drought Occurrences Using Non-Parametric Methods—Case Study of the Catchment of the Upper Noteć River (Central Poland). Atmosphere. 2021. https://doi.org/10.3390/
atmos12091098
27.
Salimian N, Nazari S, Ahmadi A. Assessment of the uncertainties of global climate models in the evaluation of standardized precipitation and runoff indices: a case study. Hydrol Sci J. 2021; 66: 1419–
1436. https://doi.org/10.1080/02626667.2021.1937178
28.
Achite M, Jehanzaib M, Elshaboury N, Kim T-W. Evaluation of Machine Learning Techniques for
Hydrological Drought Modeling: A Case Study of the Wadi Ouahrane Basin in Algeria. Water. 2022.
https://doi.org/10.3390/w14030431
29.
Jehanzaib M, Bilal Idrees M, Kim D, Kim T-W. Comprehensive Evaluation of Machine Learning Techniques for Hydrological Drought Forecasting. J Irrig Drain Eng. 2021;147. https://doi.org/10.1061/
(ASCE)IR.1943-4774.0001575
30.
Wang GC, Zhang Q, Band SS, Dehghani M, wing Chau K, Tho QT, et al. Monthly and seasonal hydrological drought forecasting using multiple extreme learning machine models. Eng Appl Comput Fluid
Mech. 2022; 16: 1364–1381. https://doi.org/10.1080/19942060.2022.2089732
31.
Henao Casas JD, Fernández Escalante E, Ayuga F. Alleviating drought and water scarcity in the Mediterranean region through managed aquifer recharge. Hydrogeol J. 2022; 30: 1685–1699. https://doi.
org/10.1007/s10040-022-02513-5
32.
Vélez-Nicolás M, Garcı́a-López S, Ruiz-Ortiz V, Zazo S, Molina JL. Precipitation Variability and
Drought Assessment Using the SPI: Application to Long-Term Series in the Strait of Gibraltar Area.
Water. 2022. https://doi.org/10.3390/w14060884
33.
Mishra AK, Desai VR. Drought forecasting using feed-forward recursive neural network. Ecol Modell.
2006; 198: 127–138. https://doi.org/10.1016/j.ecolmodel.2006.04.017
34.
Mossad A, Alazba AA. Drought forecasting using stochastic models in a hyper-arid climate. Atmosphere. 2015. pp. 410–430. https://doi.org/10.3390/atmos6040410
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
33 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
35.
Borji M, Malekian A, Salajegheh A, Ghadimi M. Multi-time-scale analysis of hydrological drought forecasting using support vector regression (SVR) and artificial neural networks (ANN). Arab J Geosci.
2016; 9: 725. https://doi.org/10.1007/s12517-016-2750-x
36.
Deo RC, Salcedo-Sanz S, Carro-Calvo L, Saavedra-Moreno B. Chapter 10—Drought Prediction With
Standardized Precipitation and Evapotranspiration Index and Support Vector Regression Models. In:
Samui P, Kim D, Ghosh CBT-IDS and M, editors. Elsevier; 2018. pp. 151–174. https://doi.org/10.
1016/B978-0-12-812056-9.00010-5
37.
Tian Y, Xu Y-P, Wang G. Agricultural drought prediction using climate indices based on Support Vector Regression in Xiangjiang River basin. Sci Total Environ. 2018;622–623: 710–720. https://doi.org/
10.1016/j.scitotenv.2017.12.025 PMID: 29223897
38.
Mokhtarzad M, Eskandari F, Jamshidi Vanjani N, Arabasadi A. Drought forecasting by ANN, ANFIS,
and SVM and comparison of the models. Environ Earth Sci. 2017; 76: 729. https://doi.org/10.1007/
s12665-017-7064-0
39.
Ali M, Deo RC, Downs NJ, Maraseni T. An ensemble-ANFIS based uncertainty assessment model for
forecasting multi-scalar standardized precipitation index. Atmos Res. 2018; 207: 155–180. https://doi.
org/10.1016/j.atmosres.2018.02.024
40.
Choubin B, Zehtabian G, Azareh A, Rafiei-Sardooi E, Sajedi-Hosseini F, Kişi Ö. Precipitation forecasting using classification and regression trees (CART) model: a comparative study of different
approaches. Environ Earth Sci. 2018; 77: 314. https://doi.org/10.1007/s12665-018-7498-z
41.
Achour K, Meddi M, Zeroual A, Bouabdelli S, Maccioni P, Moramarco T. Spatio-temporal analysis and
forecasting of drought in the plains of northwestern Algeria using the standardized precipitation index.
J Earth Syst Sci. 2020; 129: 42. https://doi.org/10.1007/s12040-019-1306-3
42.
Dikshit A, Pradhan B, Alamri AM. Temporal hydrological drought index forecasting for New South
Wales, Australia using machine learning approaches. Atmosphere (Basel). 2020;11. https://doi.org/
10.3390/atmos11060585
43.
Hakam O, Baali A, Belhaj Ali A. Modeling drought-related yield losses using new geospatial technologies and machine learning approaches: case of the Gharb plain, North-West Morocco. Model Earth
Syst Environ. 2023; 9: 647–667. https://doi.org/10.1007/s40808-022-01523-2
44.
Griebel A, Metzen D, Pendall E, Nolan RH, Clarke H, Renchon AA, et al. Recovery from Severe Mistletoe Infection After Heat- and Drought-Induced Mistletoe Death. Ecosystems. 2022; 25: 1–16. https://
doi.org/10.1007/s10021-021-00635-7
45.
Vo TQ, Kim S-H, Nguyen DH, Bae D-H. LSTM-CM: a hybrid approach for natural drought prediction
based on deep learning and climate models. Stoch Environ Res Risk Assess. 2023; 37: 2035–2051.
https://doi.org/10.1007/s00477-022-02378-w
46.
Dikshit A, Pradhan B, Alamri AM. Short-Term Spatio-Temporal Drought Forecasting Using Random
Forests Model at New South Wales, Australia. Applied Sciences. 2020. https://doi.org/10.3390/
app10124254
47.
Feng P, Wang B, Luo J-J, Liu DL, Waters C, Ji F, et al. Using large-scale climate drivers to forecast
meteorological drought condition in growing season across the Australian wheatbelt. Sci Total Environ. 2020;724. https://doi.org/10.1016/j.scitotenv.2020.138162 PMID: 32247977
48.
Hameed MM, Abed MA, Al-Ansari N, Alomar MK. Predicting Compressive Strength of Concrete Containing Industrial Waste Materials: Novel and Hybrid Machine Learning Model. Yin D, editor. Adv Civ
Eng. 2022; 2022: 5586737. https://doi.org/10.1155/2022/5586737
49.
Mamata RC, Ramlia A, Yazidb MRM, Kasab A, Razalib SFM, Bastamc MN. Slope Stability Prediction
of Road Embankment using Artificial Neural Network Combined with Genetic Algorithm. J Kejuruter.
2022; 34: 165–173.
50.
Kisi O, Docheshmeh Gorgij A, Zounemat-Kermani M, Mahdavi-Meymand A, Kim S. Drought forecasting using novel heuristic methods in a semi-arid environment. J Hydrol. 2019; 578: 124053. https://doi.
org/10.1016/j.jhydrol.2019.124053
51.
Malik A, Tikhamarine Y, Souag-Gamane D, Rai P, Sammen SS, Kisi O. Support vector regression
integrated with novel meta-heuristic algorithms for meteorological drought prediction. Meteorol Atmos
Phys. 2021; 133: 891–909. https://doi.org/10.1007/s00703-021-00787-0
52.
Aghelpour P, Mohammadi B, Mehdizadeh S, Bahrami-Pichaghchi H, Duan Z. A novel hybrid dragonfly
optimization algorithm for agricultural drought prediction. Stoch Environ Res Risk Assess. 2021.
https://doi.org/10.1007/s00477-021-02011-2
53.
Danandeh Mehr A, Vaheddoost B, Mohammadi B. ENN-SA: A novel neuro-annealing model for multistation drought prediction. Comput Geosci. 2020; 145: 104622. https://doi.org/10.1016/j.cageo.2020.
104622
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
34 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
54.
Rahmati O, Panahi M, Kalantari Z, Soltani E, Falah F, Dayal KS, et al. Capability and robustness of
novel hybridized models used for drought hazard modeling in southeast Queensland, Australia. Sci
Total Environ. 2020; 718: 134656. https://doi.org/10.1016/j.scitotenv.2019.134656 PMID: 31839310
55.
AlOmar MK, Khaleel F, AlSaadi AA, Hameed MM, AlSaadi MA, Al-Ansari N. The Influence of Data
Length on the Performance of Artificial Intelligence Models in Predicting Air Pollution. Rathnayake U,
editor. Adv Meteorol. 2022; 2022: 5346647. https://doi.org/10.1155/2022/5346647
56.
Deo RC, Şahin M. Application of the extreme learning machine algorithm for the prediction of monthly
Effective Drought Index in eastern Australia. Atmos Res. 2015; 153: 512–525.
57.
Yaseen ZM, Ali M, Sharafati A, Al-Ansari N, Shahid S. Forecasting standardized precipitation index
using data intelligence models: regional investigation of Bangladesh. Sci Rep. 2021;11. https://doi.org/
10.1038/s41598-021-82977-9 PMID: 33564055
58.
Liu Y, Wang LH, Yang LB, Liu XM. Drought prediction based on an improved VMD-OS-QR-ELM
model. PLoS One. 2022; 17: e0262329. https://doi.org/10.1371/journal.pone.0262329 PMID:
34990468
59.
Fung KF, Huang YF, Koo CH, Soh YW. Drought forecasting: A review of modelling approaches 2007–
2017. J Water Clim Chang. 2019; 11: 771–799. https://doi.org/10.2166/wcc.2019.236
60.
Assani AA, Landry R, Azouaoui O, Massicotte P, Gratton D. Comparison of the Characteristics (Frequency and Timing) of Drought and Wetness Indices of Annual Mean Water Levels in the Five North
American Great Lakes. Water Resour Manag. 2016; 30: 359–373. https://doi.org/10.1007/s11269015-1166-9
61.
Zhong C, Li G, Meng Z. Beluga whale optimization: A novel nature-inspired metaheuristic algorithm.
Knowledge-Based Syst. 2022; 251: 109215. https://doi.org/10.1016/j.knosys.2022.109215
62.
Tareke KA, Awoke AG. Hydrological drought forecasting and monitoring system development using
artificial neural network (ANN) in Ethiopia. Heliyon. 2023; 9: e13287. https://doi.org/10.1016/j.heliyon.
2023.e13287 PMID: 36816247
63.
Ghasemi P, Karbasi M, Zamani Nouri A, Sarai Tabrizi M, Azamathulla HM. Application of Gaussian
process regression to forecast multi-step ahead SPEI drought index. Alexandria Eng J. 2021; 60:
5375–5392. https://doi.org/10.1016/j.aej.2021.04.022
64.
Madadgar S, Moradkhani H. A Bayesian Framework for Probabilistic Seasonal Drought Forecasting. J
Hydrometeorol. 2013; 14: 1685–1705. https://doi.org/10.1175/JHM-D-13-010.1
65.
Malik A, Kumar A. Meteorological drought prediction using heuristic approaches based on effective
drought index: a case study in Uttarakhand. Arab J Geosci. 2020; 13: 276. https://doi.org/10.1007/
s12517-020-5239-6
66.
Özger M, Başakın EE, Ekmekcioğlu Ö, Hacısüleyman V. Comparison of wavelet and empirical mode
decomposition hybrid models in drought prediction. Comput Electron Agric. 2020;179. https://doi.org/
10.1016/j.compag.2020.105851
67.
Khosravi K, Daggupati P, Alami MT, Awadh SM, Ghareb MI, Panahi M, et al. Meteorological data mining and hybrid data-intelligence models for reference evaporation simulation: A case study in Iraq.
Comput Electron Agric. 2019; 167: 105041. https://doi.org/10.1016/j.compag.2019.105041
68.
Hameed MM, AlOmar MK, Khaleel F, Al-Ansari N. An Extra Tree Regression Model for Discharge
Coefficient Prediction: Novel, Practical Applications in the Hydraulic Sector and Future Research
Directions. Armaghani D, editor. Math Probl Eng. 2021; 2021: 7001710. https://doi.org/10.1155/2021/
7001710
69.
Hameed MM, Alomar MK, Mohd Razali SF, Kareem Khalaf MA, Baniya WJ, Sharafati A, et al. Application of Artificial Intelligence Models for Evapotranspiration Prediction along the Southern Coast of Turkey. Xu H, editor. Complexity. 2021; 2021: 1–20. https://doi.org/10.1155/2021/8850243
70.
Elbeltagi A, Kumar M, Kushwaha NL, Pande CB, Ditthakit P, Vishwakarma DK, et al. Drought indicator
analysis and forecasting using data driven models: case study in Jaisalmer, India. Stoch Environ Res
Risk Assess. 2023; 37: 113–131. https://doi.org/10.1007/s00477-022-02277-0
71.
Carranza C, Nolet C, Pezij M, van der Ploeg M. Root zone soil moisture estimation with Random Forest. J Hydrol. 2021; 593: 125840. https://doi.org/10.1016/j.jhydrol.2020.125840
72.
Ahmad MW, Mourshed M, Rezgui Y. Tree-based ensemble methods for predicting PV power generation and their comparison with support vector regression. Energy. 2018; 164: 465–474. https://doi.org/
10.1016/j.energy.2018.08.207
73.
Dong B, Cao C, Lee SE. Applying support vector machines to predict building energy consumption in
tropical region. Energy Build. 2005; 37: 545–553. https://doi.org/10.1016/j.enbuild.2004.09.009
74.
AlOmar MK, Hameed MM, Al-Ansari N, AlSaadi MA. Data-Driven Model for the Prediction of Total Dissolved Gas: Robust Artificial Intelligence Approach. Jiang Y-Z, editor. Adv Civ Eng. 2020; 2020:
6618842. https://doi.org/10.1155/2020/6618842
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
35 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
75.
Li Q, Meng Q, Cai J, Yoshino H, Mochida A. Applying support vector machine to predict hourly cooling
load in the building. Appl Energy. 2009; 86: 2249–2256. https://doi.org/10.1016/j.apenergy.2008.11.
035
76.
Alomar MK, Khaleel F, Aljumaily MM, Masood A, Razali SFM, AlSaadi MA, et al. Data-driven models
for atmospheric air temperature forecasting at a continental climate region. PLoS One. 2022; 17:
e0277079. https://doi.org/10.1371/journal.pone.0277079 PMID: 36327280
77.
Huang G-B, Zhu Q-Y, Siew C-K. Extreme learning machine: theory and applications. Neurocomputing.
2006; 70: 489–501.
78.
AlOmar MK, Hameed MM, AlSaadi MA. Multi hours ahead prediction of surface ozone gas concentration: Robust artificial intelligence approach. Atmos Pollut Res. 2020. https://doi.org/10.1016/j.apr.
2020.06.024
79.
Hameed MM, Khaleel F, AlOmar MK, Mohd Razali SF, AlSaadi MA. Optimising the Selection of Input
Variables to Increase the Predicting Accuracy of Shear Strength for Deep Beams. Naser MZ, editor.
Complexity. 2022; 2022: 6532763. https://doi.org/10.1155/2022/6532763
80.
Moghadam RG, Yaghoubi B, Rajabi A, Shabanlou S, Izadbakhsh MA. Evaluation of discharge coefficient of triangular side orifices by using regularized extreme learning machine. Appl Water Sci. 2022;
12: 145. https://doi.org/10.1007/s13201-022-01665-9
81.
Hameed MM, AlOmar MK, Al-Saadi AAA, AlSaadi MA. Inflow forecasting using regularized extreme
learning machine: Haditha reservoir chosen as case study. Stoch Environ Res Risk Assess. 2022.
https://doi.org/10.1007/s00477-022-02254-7
82.
Bartlett P. For valid generalization the size of the weights is more important than the size of the network. Adv Neural Inf Process Syst. 1996;9.
83.
Deng W, Zheng Q, Chen L. Regularized Extreme Learning Machine. 2009 IEEE Symposium on
Computational Intelligence and Data Mining. 2009. pp. 389–395. https://doi.org/10.1109/CIDM.2009.
4938676
84.
Ewees AA, Gaheen MA, Yaseen ZM, Ghoniem RM. Grasshopper Optimization Algorithm with Crossover Operators for Feature Selection and Solving Engineering Problems. IEEE Access. 2022.
85.
Mantegna RN. Fast, accurate algorithm for numerical simulation of L\’evy stable stochastic processes.
Phys Rev E. 1994; 49: 4677–4683. https://doi.org/10.1103/PhysRevE.49.4677 PMID: 9961762
86.
Pouya A, Ozgur K, Vahid V. Multivariate Drought Forecasting in Short- and Long-Term Horizons
Using MSPI and Data-Driven Approaches. J Hydrol Eng. 2021; 26: 4021006. https://doi.org/10.1061/
(ASCE)HE.1943-5584.0002059
87.
Pande CB, Kushwaha NL, Orimoloye IR, Kumar R, Abdo HG, Tolche AD, et al. Comparative Assessment of Improved SVM Method under Different Kernel Functions for Predicting Multi-scale Drought
Index. Water Resour Manag. 2023. https://doi.org/10.1007/s11269-023-03440-0
88.
Sharafati A, Yasa R, Azamathulla HM. Assessment of stochastic approaches in prediction of waveinduced pipeline scour depth. J Pipeline Syst Eng Pract. 2018;9. https://doi.org/10.1061/(ASCE)PS.
1949-1204.0000347
89.
Antanasijević D, Pocajt V, Perić-Grujić A, Ristić M. Modelling of dissolved oxygen in the Danube River
using artificial neural networks and Monte Carlo Simulation uncertainty analysis. J Hydrol. 2014; 519:
1895–1907. https://doi.org/10.1016/j.jhydrol.2014.10.009
90.
Noori R, Hoshyaripour G, Ashrafi K, Araabi BN. Uncertainty analysis of developed ANN and ANFIS
models in prediction of carbon monoxide daily concentration. Atmos Environ. 2010; 44: 476–482.
https://doi.org/10.1016/j.atmosenv.2009.11.005
91.
Dehghani M, Saghafian B, Nasiri Saleh F, Farokhnia A, Noori R. Uncertainty analysis of streamflow
drought forecast using artificial neural networks and Monte-Carlo simulation. Int J Climatol. 2014; 34:
1169–1180. https://doi.org/10.1002/joc.3754
92.
Gao M, Yin L, Ning J. Artificial neural network model for ozone concentration estimation and Monte
Carlo analysis. Atmos Environ. 2018; 184: 129–139. https://doi.org/10.1016/j.atmosenv.2018.03.027
93.
Nur Adli Zakaria M, Abdul Malek M, Zolkepli M, Najah Ahmed A. Application of artificial intelligence
algorithms for hourly river level forecast: A case study of Muda River, Malaysia. Alexandria Eng J.
2021; 60: 4015–4028. https://doi.org/10.1016/j.aej.2021.02.046
94.
Great Lake Map. Available: https://kids.britannica.com/kids/article/Great-Lakes/346130#:~:text=
TheGreatLakesarefive,offreshwateronEarth.
95.
de Loë RC, Kreutzwiser RD. Climate Variability, Climate Change and Water Resource Management in
the Great Lakes. Clim Change. 2000; 45: 163–179. https://doi.org/10.1023/A:1005649219332
96.
McBean E, Motiee H. Assessment of impact of climate change on water resources: a long term analysis of the Great Lakes of North America. Hydrol Earth Syst Sci. 2008; 12: 239–255. https://doi.org/10.
5194/hess-12-239-2008
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
36 / 37
PLOS ONE
Accurate multi-months ahead drought forecasting
97.
Hunter TS, Croley TE. Great Lakes Monthly Hydrologic Data. NOAA Data Report ERL GLERL. 1993.
98.
GrateLakes. Available: https://www.lre.usace.army.mil/Missions/Great-Lakes-Information/GreatLakes-Information-2/Water-Level-Data/
99.
Saase R, Schütt B, Bebermeier W. Analyzing the Dependence of Major Tanks in the Headwaters of
the Aruvi Aru Catchment on Precipitation. Applying Drought Indices to Meteorological and Hydrological Data. Water. 2020. https://doi.org/10.3390/w12102941
100.
Bazrafshan J, Hejabi S, Rahimi J. Drought Monitoring Using the Multivariate Standardized Precipitation Index (MSPI). Water Resour Manag. 2014; 28: 1045–1060. https://doi.org/10.1007/s11269-0140533-2
101.
Najafia T, Jaafara R, Remlib R, Zaidib AW, Chellappana K. The Role of Brain Signal Processing and
Neuronal Modelling in Epilepsy–A Review. J Kejuruter. 2021; 33: 801–815.
102.
Aghelpour P, Varshavian V. Forecasting Different Types of Droughts Simultaneously Using Multivariate Standardized Precipitation Index (MSPI), MLP Neural Network, and Imperialistic Competitive Algorithm (ICA). Shahid S, editor. Complexity. 2021; 2021: 6610228. https://doi.org/10.1155/2021/
6610228
103.
Soh YW, Koo CH, Huang YF, Fung KF. Application of artificial intelligence models for the prediction of
standardized precipitation evapotranspiration index (SPEI) at Langat River Basin, Malaysia. Comput
Electron Agric. 2018; 144: 164–173. https://doi.org/10.1016/j.compag.2017.12.002
104.
Khan MMH, Muhammad NS, El-Shafie A. Wavelet-ANN versus ANN-Based Model for Hydrometeorological Drought Forecasting. Water. 2018. https://doi.org/10.3390/w10080998
105.
Xu D, Zhang Q, Ding Y, Zhang D. Application of a hybrid ARIMA-LSTM model based on the SPEI for
drought forecasting. Environ Sci Pollut Res. 2022; 29: 4128–4144. https://doi.org/10.1007/s11356021-15325-z PMID: 34403057
106.
Alquraish M, Ali. Abuhasel K, S. Alqahtani A, Khadr M. SPI-Based Hybrid Hidden Markov–GA,
ARIMA–GA, and ARIMA–GA–ANN Models for Meteorological Drought Forecasting. Sustainability.
2021. https://doi.org/10.3390/su132212576
107.
Fung KF, Huang YF, Koo CH, Mirzaei M. Improved SVR machine learning models for agricultural
drought prediction at downstream of Langat River Basin, Malaysia. J Water Clim Chang. 2019; 11:
1383–1398. https://doi.org/10.2166/wcc.2019.295
108.
Khan MMH, Muhammad NS, El-Shafie A. Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting. J Hydrol. 2020; 590: 125380. https://doi.org/10.1016/j.jhydrol.2020.125380
109.
Sattar Hanoon M, Najah Ahmed A, Razzaq A, Oudah AY, Alkhayyat A, Feng Huang Y, et al. Prediction
of hydropower generation via machine learning algorithms at three Gorges Dam, China. Ain Shams
Eng J. 2023; 14: 101919. https://doi.org/10.1016/j.asej.2022.101919
110.
Malik A, Tikhamarine Y, Sammen SS, Abba SI, Shahid S. Prediction of meteorological drought by
using hybrid support vector regression optimized with HHO versus PSO algorithms. Environ Sci Pollut
Res. 2021; 28: 39139–39158. https://doi.org/10.1007/s11356-021-13445-0 PMID: 33751346
PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023
37 / 37
Download