PLOS ONE RESEARCH ARTICLE Machine learning models development for accurate multi-months ahead drought forecasting: Case study of the Great Lakes, North America Mohammed Majeed Hameed ID1, Siti Fatin Mohd Razali ID1,2*, Wan Hanna Melini Wan Mohtar1,2, Norinah Abd Rahman1,2, Zaher Mundher Yaseen3,4 a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 1 Department of Civil Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia (UKM), Bangi, Selangor, Malaysia, 2 Green Engineering and Net Zero Solution (GREENZ), Universiti Kebangsaan Malaysia (UKM), Bangi, Selangor, Malaysia, 3 Civil and Environmental Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia, 4 Interdisciplinary Research Center for Membranes and Water Security, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia * fatinrazali@ukm.edu.my Abstract OPEN ACCESS Citation: Hameed MM, Razali SFM, Mohtar WHMW, Rahman NA, Yaseen ZM (2023) Machine learning models development for accurate multimonths ahead drought forecasting: Case study of the Great Lakes, North America. PLoS ONE 18(10): e0290891. https://doi.org/10.1371/journal. pone.0290891 Editor: Sani Isah Abba, King Fahd University of Petroleum and Minerals, SAUDI ARABIA Received: May 3, 2023 Accepted: August 15, 2023 Published: October 31, 2023 Copyright: © 2023 Hameed et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability Statement: All data used in this study are available in the supplementary file and provided in Appendix B. The Great Lakes are critical freshwater sources, supporting millions of people, agriculture, and ecosystems. However, climate change has worsened droughts, leading to significant economic and social consequences. Accurate multi-month drought forecasting is, therefore, essential for effective water management and mitigating these impacts. This study introduces the Multivariate Standardized Lake Water Level Index (MSWI), a modified drought index that utilizes water level data collected from 1920 to 2020. Four hybrid models are developed: Support Vector Regression with Beluga whale optimization (SVR-BWO), Random Forest with Beluga whale optimization (RF-BWO), Extreme Learning Machine with Beluga whale optimization (ELM-BWO), and Regularized ELM with Beluga whale optimization (RELM-BWO). The models forecast droughts up to six months ahead for Lake Superior and Lake Michigan-Huron. The best-performing model is then selected to forecast droughts for the remaining three lakes, which have not experienced severe droughts in the past 50 years. The results show that incorporating the BWO improves the accuracy of all classical models, particularly in forecasting drought turning and critical points. Among the hybrid models, the RELM-BWO model achieves the highest level of accuracy, surpassing both classical and hybrid models by a significant margin (7.21 to 76.74%). Furthermore, Monte-Carlo simulation is employed to analyze uncertainties and ensure the reliability of the forecasts. Accordingly, the RELM-BWO model reliably forecasts droughts for all lakes, with a lead time ranging from 2 to 6 months. The study’s findings offer valuable insights for policymakers, water managers, and other stakeholders to better prepare drought mitigation strategies. Funding: The author(s) received no specific funding for this work. Competing interests: The authors have declared that no competing interests exist. PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 1 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 1. Introduction Drought is a harmful and creeping climatic crisis that can occur in all climates and negatively influences ecosystems and human societies. Unlike rapidly developing natural disasters (e.g., hurricanes, floods, and storms), drought is a slow-developing phenomenon that poses a significant and enduring threat to human daily life, agriculture, ecology, and the economy [1, 2]. The lack of precipitation for a long time is the direct and leading cause of drought [3]. Since two decades ago, droughts have become increasingly common and severe due to climate change brought on by global warming, causing food and water crises in some regions worldwide [4–6]. Moreover, the occurrence of drought leads to various issues, including desertification, water scarcity, and forest fires, which have negative repercussions at both local and global scales [7–10]. Furthermore, drought that occurs in areas located in the North American continent and other countries has a great relationship with forest fires [11, 12]. Natural disasters that struck almost all world countries have severely affected enormous numbers of people all over the globe. Statistically analyzed data estimates that about 3.8 billion people worldwide have suffered from these catastrophes and that half of those have suffered due to droughts [13]. Also, the drought disaster has directly or indirectly killed 1.3 million people out of 3.5 million who died due to natural disasters. Drought is typically categorized into four types: meteorological drought, agricultural drought, hydrological drought, and socio-economic drought [14–16]. The three physical drought categories (i.e., metrological, agricultural, and hydrological droughts) self-explanatorily describe the lack of water in the hydrological cycle. It can be said that drought is associated significantly with precipitation, and thus the onset of metrological drought begins when there is a rainfall deficit. Hydrological droughts often deal with the decrease in the river flow, groundwater level, dam reservoir storage, and lake water level due to long-term meteorological droughts [17]. In addition, persistent hydrological drought causes agricultural droughts. In agricultural drought, soil moisture is deficient, resulting in a reduction in crop yields. Several drought indices have been developed since the 1960s for monitoring and forecasting droughts based on individual or multiple meteorological or hydrological variables (e.g., rainfall, evapotranspiration, streamflow, groundwater level, and runoff) [18]. It is worth mentioning that among all drought indexes, the "standardized precipitation index (SPI)", proposed by [19], is broadly used to quantify drought across the world and receives considerable attention from researchers [5, 20, 21]. Based on SPI principles, several hydrological parameters have been derived and widely used to assess and monitor droughts, such as standardized streamflow index (SSI) [22, 23], standardized hydrological drought index (SHDI) [24], standardized groundwater level index (SGI) [25], and standardized runoff index (SRI) [26, 27]. These indices’ popularity and widespread use can be attributed to their ability to effectively evaluate and track drought conditions across diverse timeframes and geographic locations while requiring minimal data inputs and maintaining strong comparability in time and space [28, 29]. Since drought is a manifestation of climate variability events, scholars have developed drought forecasting models to minimize the adverse effects of drought on agriculture, water resource systems, socio-economic aspects, and hydropower generation. Notably, climate variability events refer to natural fluctuations in climate patterns that can affect temperature, precipitation, and other climate factors over time. Climate change can intensify the impacts of drought, leading to more frequent occurrences of water scarcity [30, 31]. This poses significant challenges for ecosystems and communities that depend on reliable access to water resources. Thus, drought forecasting is substantial and necessary to make effective decisions in managing water resources, executing drought mitigation strategies, and establishing robust early warning systems [32]. Therefore, selecting an appropriate model is the most crucial factor in drought PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 2 / 37 PLOS ONE Accurate multi-months ahead drought forecasting forecasting after identifying the drought type. The forecast of droughts is frequently implemented using physical and data-driven models. Conceptual and physical models that necessitate substantial data about a catchment or basin process, resulting in complicated predicting models [24]. However, data-driven models are more popular because they are flexible, require less data, and do not consider the process of a studied catchment. In recent decades, different machine learning (ML) models and statistical approaches were intensively utilized, such as autoregressive integrated moving average [33, 34], support vector regression [35–37], adaptive neuro-fuzzy inference system [38–40], artificial neural network (ANN) [41–43], deep learning models [44, 45], and assembling based approaches like random forest [46, 47] for forecasting drought over the world. Some researchers have developed hybrid models based on natureinspired algorithms that are effective and adaptable in forecasting drought. The meta-heuristic algorithms can help produce models with high prediction accuracy by training or optimizing the parameters of ML models, which substantially affects the model performance [48, 49]. Notably, fourteen meta-algorithms have been intensively used to improve the ML model’s performance in drought forecasting [24, 50–54]. Enhancing the structure of ML-based models in drought forecasting has aroused the interest of researchers in the last few years. Some scholars tackled fundamental issues of some ML models, such as ANN. Despite being a popular approach among ML-based models in drought sectors, this model encounters difficulties related to generalization, complexity, and learning speed, along with the challenge of becoming stuck in a local minimum. Thus, an extreme learning machine (ELM) was introduced to address the drawbacks of ANN (i.e., complexity, slow convergence, and overfitting) and then used widely to solve engineering problems and drought forecasting. ELM has several advantages compared to ANN, such as better generalization, fast learning, simpler structure, and superior performance [55, 56]. Accordingly, ELM has been applied for drought forecasting in different climate regions and obtained promising results [1, 30, 57, 58]. Nevertheless, it is possible to adjust the model structure to enhance its performance and stability by using regularization parameters and nature-inspired algorithms to train the model effectively. Researchers have not paid enough attention to monitoring and forecasting hydrological drought, despite its importance in managing water resources [59]. The main objective of this study is to evaluate and forecast hydrological drought in the Great Lakes, situated in North America, which contains over one-fifth of the world’s fresh water. The Great Lakes are an essential source of fresh water for people, and droughts in these water bodies can significantly impact water availability and the ecosystem. Notably, the recent droughts in the Great Lakes have been mainly attributed to climate change-induced elevated water temperatures, resulting in increased evaporation due to a substantial decline in winter ice cover [60]. This study used Multivariate Standardized Lake Water Level Index (MSWI) to assess the hydrological drought over the Grate Lakes. MSWI can simultaneously display the drought condition from the standpoints of many time scales. To forecast hydrological drought for different lead-time horizons (from one to six months ahead), four hybrid models were developed by combining regularized extreme learning machine (RELM), RF, ELM, and SVR with the Beluga whale optimization (BWO) algorithm. The BWO algorithm is a novel nature-inspired algorithm with robust and efficient performance in solving engineering problems. Moreover, it displays excellent and superior results when tested against fifteen meta-heuristic algorithms in solving different engineering problems [61]. The hybrid models were compared and validated against established standard models (RF, ELM, and SVR). Previous drought works have suggested using standard performance metrics such as correlation of determinations (R2), root mean square error (RMSE), and others to choose the best prediction models [62, 63]. However, drought time series data, particularly at higher time PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 3 / 37 PLOS ONE Accurate multi-months ahead drought forecasting scales, include a significantly high auto-correlation [64–66]. This can result in very high R2 values and lower forecasting errors, rendering such metrics less effective in selecting the best models. Accordingly, this paper’s proposed advanced evaluation method focuses on accurately detecting the turning points in drought data and evaluating the model’s ability to capture them effectively. Finally, uncertainty analysis was used to evaluate the forecasting accuracy and select which period lead times can be reliably forecasted. 2. Methodological overview 2.1 Random forest (RF) The principle of an ensemble and bagging learning technique served as the foundation for creating the RF. The decision tree approach utilizing the bagging methodology is a crucial factor in the effectiveness of this model in solving engineering problems, especially those related to regression [67]. In RF, nodes are nominated randomly based on the most significant input features, enhancing the learning, forecasting, and maintaining reliability to improve the generalization capacity. The following procedures are considered to establish the RF model [57]: i. From training data observations, select random i data samples bootstrap sample. ii. Establish the decision tree related to the data selected (i) from the above step. iii. Select the n-trees (ntrees) that require to be built. iv. Repeat the steps from one to three. v. Forecast MSWI by accumulating the aggregative forecasts of ntrees. The RF model’s capacity has gained a good reputation in solving hydrological and environmental problems and also conducted in drought forecasting [46, 68–71]. Fig 1 provides the random forest model’s flowchart. 2.2 Support vector regression (SVR) SVR is a supervisor learning machine used widely in water resource applications and drought forecasting. The main idea behind SVR is to map the input vectors to higher feature dimensional spaces through a nonlinear mapping and then conduct a regression model based on the calculated feature spaces. Even with limited observations from the training datasets, the approach can effectively find an excellent solution to nonlinear problems [48, 72]. The structural risk minimization (SRM) concept, which SVM employs, seeks to minimize the upper and lower bounds of the generalization error, which is the sum of the training error and a confidence level [73, 74]. This concept is distinct from the more common empirical risk minimization (ERM) method, which focuses only on reducing the total error generated during the training stage. Assuming the MSWIi in the datapoints is the actual normalized value of the ith sample while the input parameter of that target is Xi. The sample set accordingly can be derived as {(Xi, Yi)}i=1N where the N is the total samples, SVR uses the form in the following Equation to approximate the function. Finally, the term Yi is this case is MSWIi. MSWI ¼ f ðXÞ ¼ W:;ðXÞ þ b ð1Þ The ;(X) is the high-dimensional space. Estimating coefficients (W and b) requires using a regularized risk function, represented by Eq (2). Minimize : PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 1 1 XN 2 kWk þ C L ðYi ; f ðXi ÞÞ i¼1 ε 2 N ð2Þ 4 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 1. Schematic of RF model. https://doi.org/10.1371/journal.pone.0290891.g001 ( Lε ðMSWIi ; f ðXi ÞÞ ¼ jYi f ðXi Þj � ε f ðXi Þj Others 0; jMSWIi ð3Þ The regularized term in the above equations is kWk2, and C to calculate the trade-off between training error and model flatness. In Eq (2), the second term represents the empirical error determined by the intensity loss function (Eq 3). The loss function equals zero if the anticipated value corresponds with the tube (a region around the predicted regression line). Conversely, the loss is a value representing the difference between the tube residual of the tube (ε) and the projected value [75]. The Equation above is converted into the principal objective function represented by Eq 4 to estimate the coefficients (W and b). Xn �1 minmize W; x1 ; x∗1 ; b k Wk2 þ C ðx1 þ x∗1 Þ i¼1 2 8 > < Subject to : > : MSWIi b � ε þ x1 ; W:;ðxi Þ ∗ 1 W:;ðxi Þ þ b � ε þ x ; i ¼ 1; 2; . . . ; N ð4Þ ∗ 1 x1 ; x � 0 In the Equation above the slack variables are ξ1, and x∗1 . Eq (5) is expressed as follows once PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 5 / 37 PLOS ONE Accurate multi-months ahead drought forecasting the kernel function K(Xi, Xj) is introduced; � Minimize ai ; a∗i � � 1 XN XN ∗ ∗ : ða a Þða a Þ:K X ; X i j i j i j i¼1 j¼1 2 Subject to : ε 8 N X > < ða i > : XN i¼1 ðai a∗i Þ þ XN a∗i Þ ¼ 0 i¼1 i¼1 MSWIi ðai a∗i Þ ð5Þ ∗ i ai ; a 2 ½0; C� In the above Equation, the αi, and a∗i are the Lagrange multipliers while the i, j are different samples that are used in the training dataset. Accordingly, Eq (3) can be expressed as below: MSWI ¼ f ðXÞ ¼ N X ðai a∗i ÞKðXi ; Xj Þ þ b ð6Þ i¼1 Notably, the Gaussian kernel density function is used in this study [76]. 2.3 Extreme learning machine (ELM) ELM is a cutting-edge algorithm invented in 2006 to train the single-layer feedforward neural network. ELM’s inventor affirmed that this algorithm could replace the conventional algorithm (i.e., backpropagation algorithms) to train ANN due to its quick learning and superior generalization [77]. Fig 2 shows the general schematic of the ELM model. The construction of Fig 2. ELM model for forecasting MSWI. https://doi.org/10.1371/journal.pone.0290891.g002 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 6 / 37 PLOS ONE Accurate multi-months ahead drought forecasting the ELM model can be expressed mathematically as follows: M X f ðxÞ ¼ yi hi ðxi Þ ¼ hðxÞy ð7Þ i¼1 where θi = [θ1, θ2, I, θM]T is the output layer weight value. These values are significant because they connect the m output nodes in the hidden layer of M nodes. The nonlinear ELM feature mapping represents as hi(xi), hi(xi) = [hi(x), h2(xI. . .,hm(x)]T. The final ELM output or forecast is f(x). The xi is the independent variable, and hi(x) is the output magnitude of the ith hidden note output. It is possible in the ELM model that hidden nodes do not have unique output functions. It can be said that each hidden neuron may have a different output function. However, Hi(xi) in practical problems can be expressed as: Hi ðxi Þ ¼ Gðai ; bi ; xÞ; ai 2 Rd ; bi 2 R ð8Þ Parameters a and b are the hidden node parameters (weights and bias values), xi is the input variable, R refers to the set of real number values, Rd is the d-dimensional set of real numbers, and G is the nonlinear transfer function. The transfer function is essential in the ELM model because it is responsible for mapping input data to a complex and higher dimensional space; hence, the model can find a satisfactory solution for complex engineering problems [78]. ELM algorithm through training ANN is based on two essential phases, the first stage is random features mapping, and the second one is called linear parameters solving. ELM uses several nonlinear activation functions to convert the inputs (i.e., variables) into feature space by randomly adjusting weights and bias values of t’e ELM’s hidden layer. Accordingly, the hidden layer parameters (a, b) are assigned randomly. The other significant stage in ELM is to calculate the weight values (θ) which connect the hidden layer of the ELM model with the output layer. These wights are linearly calculated using least square error sense by minimizing the approximations error: min kHy y Yk ð9Þ The term Y is this case is MSWI (training data) and the H matrix depends on input variables, and hidden layer parameters in the above formula can be calculated below [79]: 2 6 H¼6 4 hðx1 Þ 3 2 Gða1 x1 þ b1 Þ ��� GðaM x1 þ bM Þ .. . .. .. . 7 6 7¼6 5 4 .. . hðxN Þ Gða1 xN þ b1 Þ . 3 7 7 5 ð10Þ � � � GðaM xN þ bM Þ The Y is the training data response, which can be expressed as: 2 y1 T 3 2 t11 6 . 7 6 . 7 6 Y¼6 4 .. 5 ¼ 4 .. y1 T PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 tN1 ��� .. . t1M 3 .. 7 7 . 5 ð11Þ � � � tNM 7 / 37 PLOS ONE Accurate multi-months ahead drought forecasting The k.k is the Frobenius norm, and thus, the calculated output weights |(θ*) can be determined as follow: y∗ ¼ H þ Y ð12Þ In the above mathematical expression, the H+ is the Moore–Penrose generalized inverse of the matrix of H. ELM differs from conventional neural networks in that there is no requirement to fine-tune every parameter of the model, including the weights and biases of the hidden layer. After random selection of the hidden layer weight and bias values, the ELM can be considered a linear system. Then, the weight values that connect the hidden layer with the output layer of ELM are systematically determined by the generalized inverse procedure of the hidden layer output matrices. As a result, the ELM model is several times quicker than classical algorithms used for training feedforward networks. 2.4 Regularized extreme learning machine As discussed in the previous section, the only output weight matrix of the ELM model is calculated from the training procedure. The training error should decrease to a minimum value to obtain these weights. However, minimizing only the training error might reduce the generalization capacity of the developed model when estimating samples that were not included in training data points [80]. Thus, the classical ELM solutions would have some serious weaknesses. The first problem is that ELM, based on the empirical risk minimization (ERM) concept, frequently leads to over-fitting models [81]. Due to its direct calculation of the lowest norm least-squares solutions, ELM has a poor control capability, which is regarded as the second disadvantage. Another drawback is that it might lead to less reliable estimates, for example, if data includes outlier values or heteroskedasticity. The highest level of generalizability is achieved in some advanced models when the weights’ norm values and forecasting training -error rate are minimized simultaneously [82]. Accordingly, this paper suggested RELM based on the structural risk minimization (SRM) concept of statistical learning theory to address mentioned disadvantages [83]. The uniqueness of this characteristic in the RELM model is attributed to its regularization parameter (γ). It is important to mention that the loss function for the ELM model is modified in this work using γ parameter to reduce both the training error and the weight values norm simultaneously. The mathematical expression of the cost function for RELM model can be illustrated below: ERELM ¼ Min gkY y 2 Hyk2 þ kyk2 2 ð13Þ Subject to error ε: ε¼Y Hy ð14Þ Thus, the relation can be expressed as the following mathematical expression: 2 ERELM ¼ Min gkεk2 þ kyk2 2 ð15Þ y The Lagrangian Equation is defined as the follow to calculate the optimum solution: 2 2 Lðy; ε; lÞ ¼ gkεk2 þ kyk2 þ lT ðY PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 Hy εÞ ð16Þ 8 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 8 @L > ¼ 0 ) 2y > > > @y > > < @L ¼ 0 ) 2y > @ε > > > > > : @L ¼ 0 ) Y @l HT l ¼ 0 HT l ¼ 0 ð17Þ Hy ¼ 0 In the above equations, the Lagrangian multiplier matrix is λ, and ε is the training error, ε = [ε1, ε2,. . .εN]T. The N is the total number of training samples. According to the previous equations, the optimal weights of the RELM output layer are computed as the following Equation: 1 I y^ ¼ ðH T HÞ ∗ H T Y g ð18Þ The term I in the above Equation is called the unit matrix. The formula of calculated y^ vector is suitable when the hidden neurons are more than the training samples. Otherwise, the following Equation can be used to find the weights. � � 1 I y^ ¼ H T HH T þ ∗Y g ð19Þ 2.5 Beluga whale algorithm (BWO) BWO is a sophisticated metaheuristic algorithm invented in 2022 to solve engineering and optimization problems [61]. The algorithm was employed to improve the efficiency of drought forecasting by training the RELM model. BWO simulates the behaviour of beluga whales that live in regular groups, searching for and catching prey in the sea [84]. The mathematical model of BWO depends on three essential stages: exploration, exploitation, and whale fall. The first stage (exploration phase) in the mathematical model simulates the swimming behaviour of the Beluga whale, while the exploitation stage is inspired by preying behaviour of these animals. The third stage is inspired by the whale’s fall into the depths of the sea, or it was hunted by people or predators such as polar bears; hence it’s called "whale fall". It is worth noting that during the optimization process, the whale fall stage is executed in each iteration after the exploration and exploitation stages are completed. The main steps needed to implement this algorithm can be summarized as follows: i. Initialization: input the algorithm parameters, such as the maximum number of iterations (Tmax) and population size (n). The initial position of each population size (beluga whales) is randomly assigned in the space search. Besides, the fitness values are computed according to the given objective function. ii. Extraction and exploration phases update: according to the balance factor (Bf) value, each beluga whale is appointed to be involved in the exploration or exploitation stage. If the Bf PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 9 / 37 PLOS ONE Accurate multi-months ahead drought forecasting value exceeds 0.5, that means the Beluga whale is entering into the exploration phase, and thus its position will be updated according to the following Equation: ( xi;j Tþ1 ¼ xi;pj T þ ðxr;p1 T xi;pj T Þ þ ð1 þ r1 Þsinð2pr2 Þ; j ¼ even xi;j Tþ1 ¼ xi;pj T þ ðxr;p1 T xi;pj T Þ þ ð1 þ r1 Þcosð2pr2 Þ; j ¼ odd ) ð20Þ The current iteration in the evacuation is T while the new position of ith whale on jth dimensional space is xi,jT+1. The symbols r1, and r2 are random numbers between (0, 1). xi;pj T is the position of the ith beluga whale on pj dimension. Otherwise, when the Bf value is less than 0.5, the updating mechanism is controlled by the exploitation stage, and eventually, the position of each whale is updated using Eq 21. After that, the fitness values of new positions are determined and sorted to achieve the best outcome in the current iteration. Xi Tþ1 ¼ r3 Xbest T r4 Xi T þ C1 :LF :ðXr T Xi T Þ ð21Þ XiT and XrT are current positions for the ith beluga whale and a random beluga whale, respectively. The two terms in the Equation (r3 and r4) are random numbers between (0, 1) while C1 is a constant number that depends on r4 current and maximum iteration (T and � � T Tmax); C1 ¼ 2r4 1 Tmax . Finally, the LF is the Levy flight function [85]. LF ¼ 0:05 � � s¼ m�s ð22Þ 1 jvjb Gð1 þ bÞ � sinðpb=2Þ Gðð1 þ bÞ=2Þ � b � 2ð1 bÞ=2 �b1 ð23Þ β is the default constant (1.5), while μ, and v are normally distributed random numbers. iii. Whale fall phases update: The probability of whale fall (Wf) is determined in each iteration, considering that some beluga whales may fall into the deep sea or die. Consequently, it is essential in such cases to update the position of a beluga whale. X Tþ1 i ¼ r5 Xi T r6 Xr T þ r7 Xstep ð24Þ In the above equations, the r5, r6, and r7 are assigned randomly between one and zero. Besides, Xstep is the step size of a whale fall. Xstep ¼ ðub lb Þ � expð C2 T=Tmax Þ ð25Þ In the above Equation, the ub, and lb are upper and lower bond variables. The C2 is a step factor that depends mainly on Wf. C2 ¼ 2Wf � n ð26Þ Finally, the probability of whale fall can be calculated using the following formula. Wf ¼ 0:1 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 0:05 T=Tmax ð27Þ 10 / 37 PLOS ONE Accurate multi-months ahead drought forecasting iv. Terminating condition check: If the current iteration exceeds the maximum number of iterations, the BWO algorithm terminates. Otherwise, repeat steps ii to iii. 2.6 Model development Multiple ML-based models have been created to forecast droughts in the Great Lakes region with the ability to forecast several months in advance. This study integrated a modified version of ELM called RELM with BWO to create a hybrid model known as RELM-BWO. The performance of this model was compared to classical models such as ELM, SVR, and RF. The RELM-BWO model was also validated against other hybrid models that combined classical models with BWO, namely SVR-BWO, RF-BWO, and ELM-BWO. All models were trained using past values (5-month lag) to forecast MSWI for several time steps ranging from 1 to 6 months ahead. In this study, time series drought data is divided into a training set (75% of the data from December 1921 to January 1997) and a testing set (data points from February 1997 to December 2021). This division ratio balances the amount of data for training and testing in machine learning models, resulting in good results [86, 87]. The method aims to train the model on historical data and assess its predictive efficiency on the remaining portion. This approach provides insights into the accuracy of future forecasts for drought data using historical information and is considered more challenging than random data division in time series analysis. Once the input selection was determined, all input vectors and their corresponding targets were normalized to a range between 0 and 1. For the hybrid models, the primary objective of BWO was to optimize the hyperparameters of the classical models, as they greatly influence the accuracy of the model’s predictions. The objective function used was the root mean square error, and BWO aimed to minimize this function by selecting the best hyperparameters for each model. In the case of RELM-BWO, BWO was employed to determine the optimal value of the regularization parameter (γ) and the weight and bias values of the input layer. The remaining parameters for the other models calculated by BWO are listed in Table 1. All the models were implemented using MATLAB 2020a, and the primary process can be observed in Fig 3. 2.7 Statistical parameters The forecasting results were evaluated to identify the best forecasting model. Several statistical indicators have been used to assess the forecasting results, such as Mean absolute error (MAE), Root means square error (RMSE), Relative error (RE%), and Correlation of determination Table 1. Hyperparameters of the applied models. Algorithm/ Model RF Parameter Number of trees ranges from 26 to 70, while the leaf 2 [3,8] SVR Kernel Scale range from 0.2441 to o.91061, and Box Constrain range from 0.1449 to 0.9554. ELM Weights range from -1 to 1, Hidden Layer Nodes are 7. BWO The maximum iteration is 100, and the number population is 50. RELM-BWO SVR-BWO RF-BWO ELM-BWO Weights range from -1 to 1, Hidden Layer Nodes are 7. BWO applied to compute hidden layer’s weights, biases values, and the Regularization parameter (γ) BWO applied to compute Kernel Scale, Box Constrain, and Epsilon BWO applied to determine Number of trees, and leaf numbers. Weights range from -1 to 1, Hidden Layer Nodes are 7. BWO applied to compute the hidden layer’s weights, and biases values, https://doi.org/10.1371/journal.pone.0290891.t001 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 11 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 3. Flow chart of model development. https://doi.org/10.1371/journal.pone.0290891.g003 (R2). MAE is used widely to assess regression models in hydrological studies. This criterion measures the mean absolute forecasting error. RMSE is one of the most famous indices used to evaluate comparable models. RMSE measures the mean square forecasting error and has a higher value than MAE. Besides, RE% is the absolute error divided by the actual data value, while R2 is a measurement used to explain how much variability of the forecasted values can be caused by its relationship to the measured data. The mathematical expression of the statistical metrics can be seen as following [76]: MAE ¼ n 1X jMSWIobsi n i¼1 MSWIpredi j sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi N 1X 2 RMSE ¼ ðMSWIobsi MSWIpredi Þ n i¼1 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 ð28Þ ð29Þ 12 / 37 PLOS ONE Accurate multi-months ahead drought forecasting RE% ¼ MSWIobsi MSWIpredi � 100% MSWIobsi Pn 2 R ¼1 2 i¼1 ðMSWIobsi i¼1 ðMSWIpredi Pn ð30Þ MSWIpredi Þ � pred Þ2 MSWI ð31Þ The best forecasting models provide the possible lower value of RMSE, RE%, and the highest R2. In the above formulas, the n is the total number of samples, MSWIobsi is the observed � pred is the mean forecasted value. drought value, MSWIpred is the forecasted values, and MSWI i 2.8 Uncertainty analysis The forecasting accuracy decreases when the lead time horizon increases [59, 88]; hence, adopting an efficient norm to assess forecasting efficiency is crucial. Since it is difficult to evaluate the reliability of the ML-based models via classical performance measures (i.e., RMSE, R2, etc.) in forecasting the drought at longer lead time scales, UN is used because it is an efficient tool to examine the dependability of forecasting results. Based on the concept of Monte Carlo Simulation, the UN of the adopted model is quantified. This study’s consideration of input parameter uncertainty relates to the accuracy and representativeness of the input data used to make forecasts [89]. The applied approach comprises producing random input data that matches the empirical probability distribution utilized to represent the input parameter, executing the forecasting model, and generating output from the data obtained [90]. The current study depends on randomly resampling the input data set without replacement 100000 times while maintaining the same ratio rate between the training and testing sets [91, 92]. In other words, for each observation, 100000 data points are generated from their corresponding empirical probability distribution and then fed to the applied model to generate the responses. To calculate the 95% confidence intervals for each response, the 2.5th(XL) and 97.5th(XU) percentiles of the cumulative distribution comprising 100000 datapoints are determined. The validity of the final drought model is assessed by calculating its robustness measure, which is determined by the proportion of observed drought values that fall within the 95% confidence interval. A higher ratio indicates greater validity, while a lower ratio indicates the opposite. The Equation for the 95% prediction uncertainty (95PPU) is provided below. � 1 Bracketted by 95 PPU ¼ count mjX i L � X i Obs � X l U � 100 n ð32Þ In the Eq 32, the term 95PPU denotes 95% predicted uncertainty; n is the total number of drought data records; i is the current month that can be changed from 1 to m; XiL, and XiU are lower and upper bands of uncertainty, and XiObs is the observed data of ith month. Additionally, the average width of the confidence interval is calculated using the d−factor, which can be expressed through the following Equation: PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 d factor ¼ 1 d�x ¼ n n X ðXU i d�x s�x XL i Þ ð33Þ ð34Þ i¼1 13 / 37 PLOS ONE Accurate multi-months ahead drought forecasting The s�x is the standard deviation of the observed data while the average distance between the upper band (97.5th percentile) and the lower band (2.5th percentile). The higher value of the d-factor represents a higher degree of uncertainty. Theoretically, the d-factor is zero, but it normally cannot be attained due to the model uncertainty. Thus, the ideal value of the d-factor is thus considered to be smaller than one [93]. 3. Case study and data description The Great Lakes are a collection of interconnected freshwater lakes with some sea features. They are in the central-eastern part of North America and are linked to the Atlantic Ocean by the Saint Lawrence River. All five lakes—Superior, Michigan, Huron, Erie, and Ontario—are located inside or close to the international boundary between Canada and the United States (see Fig 4 [94]). Lakes Michigan and Huron (Lake Michigan- Huron) are hydrologically considered one body of water since they are connected at the Straits of Mackinac. Furthermore, the Great Lakes Waterway qualifies modern ships to navigate between the lakes. Since approximately one-fifth (21%) of the world’s surface freshwater is contained in the Great Lakes, they are considered the largest group of freshwater lakes on the earth in terms of the total surface area (244,106 km2) [95]. The Great Lakes are one of the most significant water resources in the world, and they provide more than fifty million people in eastern North America with water that can be used for various purposes [96]. The five lakes’ cumulative capacity is approximately 6 x 1015 gallons, which is enough water to drown North America to an average depth of one meter and indicates how huge the lakes are. It is essential to mention that Lake Superior is the most immense, profound, and upstream of the Lakes [97]. It is important to note that the Fig 4. Location of the Great Lakes [1]. 1. Great Lake Map. Available: https://kids.britannica.com/kids/article/Great-Lakes/ 346130#:—:text = The Great Lakes are five,of fresh water on Earth. https://doi.org/10.1371/journal.pone.0290891.g004 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 14 / 37 PLOS ONE Accurate multi-months ahead drought forecasting water level data were acquired and gathered from open access source website [98]. The data covers a long period of time from 1918 to 2021. Notably, the raw data for all lakes are available in B1 Table in S1 File, which can be found in Appendix B. 4. Drought calculation 4.1 Multivariate standardized water level index (MSWI) Standardized water level index (SWI) is calculated using the same method as the SPI, but instead of using monthly rain, the water level is used as input [99]. The MSWI is calculated based on several SWI time series for a specific station, and each time scale represents a particular time horizon. The MSWI considers every subset of SWI time scales (i.e., SWI-1, SWI-2, SWI-3, . . ., to SWI-48 months) for studied lakes. Examining multiple SWI time series in detail can offer valuable insights into drought events. However, comparing and analyzing these time series at different scales can be challenging for scholars and may lead to confusion and inefficiencies in addressing the issues. In addition, it is challenging to precisely define the duration of the timescale, such as long or short-term, and accurately connect it to a specific category of drought impact. For example, short-term drought can be represented on a three-month time scale, which raises the logical question of why a standard drought index cannot define two or four-month scales. Accordingly, the standard drought index may not be derived from a specific criterion to classify some drought time scales (e.g., SWI-4, SWT-5, etc.) as short, medium, or long scale. Given the concerns, some researchers suggested decreasing the number of examined drought aggregation time scales and accompanying series by employing a multivariate approach [100]. Notably, a reduction in SWI series does not mean that the SWI for some time scales are not calculated, but rather that a substantial portion of the variability of the SWI on different time scales is summed up into certain series. Thus, the reduction of these drought time scales is computed by using an effective method called principal component analysis (PCA). PCA is a widely used prominent approach for analyzing datasets with high-dimensional space (features) [101], providing several benefits such as retaining maximum information while reducing data features, increasing data interpretability, and enabling visualization of multidimensional data. PCA displays the following linear combinations of the I original variables: PCk ¼ Ek T X ¼ I X eik Xi ; i ¼ 1; 2; 3; . . . ; ð35Þ i¼1 In the above Equation, the PCk is the kth principal component while, Ek is the kth eigenvector. The original ith variable is represented in the above formula as Xi and finally, the eik is the ith element of the kth eigenvector. In PCA, linear combinations are constructed from the original variables. These combinations, known as principal components, are formed in such a way that they are mutually uncorrelated, meaning that they do not have any shared information. The number of principal components that can be generated is determined by the number of original variables in the dataset. The first principal component (PC1) can account for a significant proportion of the variance in the original variables. PCA is particularly useful when there is a high correlation between variables in the dataset. A cross-correlation matrix of the predictors must be constructed to perform PCA, from which the eigenvalues and eigenvectors are calculated. The eigenvector corresponding to the highest eigenvalue represents the weighted coefficients of the first principal component, and this process is repeated for subsequent components. PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 15 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 5. Drought categories based on the MSWI value. https://doi.org/10.1371/journal.pone.0290891.g005 As PC1 holds the most essential information of the data, the MSWI is derived from this component. However, unlike the standard index (e.g., SWI), which has a mean of zero and a standard deviation of one, PC1 does not meet these requirements due to its algebraic characteristics, which make its values incomparable across different months or locations. Therefore, the time series of PC1 must be standardized using the following Equation based on the mean and standard deviation for various months throughout the year. Z1YM ¼ PC1YM PC�1M SD1M ð36Þ In the Eq 36, the Z1YM (MSWI) is the PC1 standardized value corresponding to Mth month and Yth year, PC�1M , and SD1M are the average and PC1 component for Mth month. Notably, the PC�1M value is very small and close to zero; therefore, it can be neglected. The data of MSWI is organized in ascending order, and a graphic of its corresponding empirical probability distribution is shown to identify the categories of drought and wet severity (see Fig 5)[102]. 4.2. Describing droughts in the Great Lakes The main focus of this study was on two prominent lakes, specifically Lake Superior and Lake Michigan-Huron, within the forecasting models. These lakes were given significant attention due to their larger size and greater depth, distinguishing them from the other lakes. Besides, they are located upstream, as the water flows from them to the rest of the lakes and then to the sea (see A1 Fig in S1 Appendix). The figures attached in the appendix (A2 Fig-A4 Figs in S1 Appendix) show that in the last fifty years, the three lakes (Lake Erie, Lake Ontario, and Lake St. Clair) did not witness any severe drought events and that the lowest value of the PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 16 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 6. Water level elevation VS drought. a), and d) Boxplot diagrams showing the monthly variation of water level for Lake Superior and Lake MichiganHuron. b), and c) line graphs showing the yearly water level elevation while c), and f) presenting the MSWI from 1920 to 2020. https://doi.org/10.1371/journal.pone.0290891.g006 drought indicator is around -0.5. The reason may be related to the dams constructed and the lakes’ applied regulation process. Notably, Lake Superior and Lake Ontario are currently regulated under some plans implemented by the International Joint Commission. Since 1921, Lake Superior’s levels have been regulated, while the water level of Lake Ontario was regulated in 1958. Because there is a significant difference in the elevation, the regulation process of Lake Ontario does not affect the upper lakes’ water level. Also, the regulation of Lake Superior has an important influence on the whole Great Lakes System. Notably, the detailed drought analysis via MSWI for all lakes can be found in B2 Table in S2 File, which is located in Appendix B. According to Fig 6, there is a large variation in the monthly levels of Lake Superior compared to Lake Michigan- Huron. Both lakes have a large area, but the first one is located upstream, and the water flows from it to the other lakes, which significantly varies the monthly levels of stored water. Based on Fig 6C, and 6F, both lakes suffered from severe drought from mid-1921 to 1929 and late 1998 to 2015. Lake Michigan-Huron generally experienced more frequent drought events than Lake Superior; however, the latter faced the severest drought. The analysis also found that Lake Michigan-Huron experienced four severe drought events between 1920 and 2020, while Lake Superior Lake has only faced two drought events during PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 17 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 7. Temporal analysis of drought intensity in the Great Lakes region (1921–2021). https://doi.org/10.1371/journal.pone.0290891.g007 the same period. The droughts of the 1921s and 1929s were primarily caused by reduced precipitation, whereas the drought of the 2000s was mainly attributed to the elevated water temperatures induced by climate change, resulting in enhanced evaporation due to a substantial reduction in winter ice cove [60]. It is essential to mention that the MSWI can explain more than 95% of the variability for the nominated SWI time scales in the observed lakes. The temporal analysis of drought intensity in the Great Lakes region, as depicted in Fig 7, unveils a dynamic pattern of fluctuating drought conditions from 1921 to 2021. Notably, Lake Superior and Lake Michigan-Huron emerge as critical focal points with notable variations in drought intensity. Conversely, Lake Ontario transitions from drier to wetter conditions, indicating an overall increase in drought values (MSWI) and a positive drought trend. Similarly, Lake Erie experiences a substantial transition towards positive drought conditions. Lake St. Clair also exhibits patterns consistent with the other lakes. This comprehensive analysis demonstrates the diverse drought intensities observed across the Great Lakes over the analyzed period, particularly emphasizing the significant decrease in drought observed in Lake Superior and Lake Michigan-Huron from 1998 to 2021. This pronounced finding highlights a significant shift in drought conditions for these lakes, underscoring the region’s evolving nature of drought dynamics. 5. Results and discussion The results section is divided into two main scenarios. In the first scenario, the RELM-BWO model is validated against standalone models, while the second scenario compares the performance of the RELM-BWO model with other hybrid models. Since severe drought periods were observed in only Lake Superior and Lake Michigan-Huron over the past fifteen years, the PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 18 / 37 PLOS ONE Accurate multi-months ahead drought forecasting models were compared, and the best performing one was selected to forecast drought over the remaining three lakes. 5.1 First Scenario: A comparison of RELM-BWO with classical models 5.1.1 Modelling results. This section discusses the evaluated perfo7rmance of four applied models (RF, SVR, RLM, and RELM-BWO) in forecasting MSWI time series for different lead times using various statistical measures and comparable graphs for Lake Superior and Lake Michigan-Huron. The performance measures over the testing phase for both lakes are provided in Tables 2 and 3. The provided assessment shows that the best forecasting result for Lake Superior was obtained for RELM-BWO (R2 = 0.9687, RMSE = 0.0202, and MAE = 0.0156) compared to the other drought forecasting models. Similarly, the adopted model performs better in forecasting one-step month ahead for Lake Michigan-Huron, attaining higher accuracy (R2 = 0.9825, RMSE = 0.0117, and MAE = 0.0093). Generally, the quantitative assessment showed that the performance of the forecasting models in Lake Superior is slightly lower than in Lake Michigan-Huron. Furthermore, the second-best model for forecasting drought is ELM, followed by SVR and the RF shows the poorest performance. The evaluated results showed that the forecasting accuracy of the models reduces when the lead time increases. The forecasting results generally deteriorate with the increase in forecasting multistep month ahead drought due to accumulated errors. For example, for Lake Superior, the RF model is considered among the models the most influenced by the change of time interval where the MARE% varies from 32.14 to 56.20% followed by SVR (5.53 to 35.74%), ELM (2.46 to 36.72%), and RELM-BWO (2 to 30.62%). Regarding Lake Michigan-Huron, all models, except the RF, suffered from a gradual decrease in performance with the increase in lead time. It can be observed that these models have similar characteristics in forecasting the drought for both lakes, producing the peak errors in the forecasting drought for a six-month ahead lead time. The RF has an abnormal performance for forecasting drought at Lake Michigan-Huron as it generates the maximum error at the fourth-step lead time. However, the RF Table 2. Performance indicators of machine learning-based models for Lake superior. Model SVR Performance metrics 2 3 4 5 6 MAE 0.0497 0.0706 0.0867 0.0867 0.1596 0.1956 RMSE 0.0771 0.0887 0.1089 0.1289 0.1974 0.2380 MAPE % 5.53 9.34 12.11 12.14 26.62 35.74 2 RF ELM RELM-BWO Period lead time 1 R 0.9320 0.9299 0.9256 0.9256 0.9002 0.8854 MAE 0.2126 0.2338 0.2768 0.3305 0.3148 0.3191 RMSE 0.2746 0.3086 0.3521 0.3969 0.3822 0.3838 MAPE % 32.14 37.61 41.83 49.44 43.05 56.20 R2 0.9280 0.9210 0.9000 0.8983 0.8825 0.8720 MAE 0.0206 0.0462 0.0767 0.0944 0.1443 0.1873 RMSE 0.0265 0.0601 0.0976 0.1229 0.1774 0.2329 MAPE % 2.46 6.15 10.45 14.31 28.98 36.72 R2 0.9561 0.9499 0.9561 0.9375 0.9252 0.8942 MAE 0.0156 0.0371 0.0587 0.0886 0.1191 0.1524 RMSE 0.0202 0.0489 0.0773 0.1149 0.1523 0.1919 MAPE % 2.00 4.93 8.44 13.61 20.31 30.62 R2 0.9687 0.9601 0.9572 0.9457 0.9342 0.9179 https://doi.org/10.1371/journal.pone.0290891.t002 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 19 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Table 3. Performance indicators of machine learning-based models for Lake Michigan-Huron. Model SVR RF ELM RELM-BWO Performance metrics Period lead time 1 2 3 4 5 6 MAE 0.0222 0.0361 0.0438 0.0610 0.0806 0.1084 RMSE 0.0275 0.0434 0.0533 0.0751 0.0994 0.1338 MAPE % 6.28 7.41 7.77 10.41 13.31 17.11 R2 0.9709 0.9669 0.9652 0.9752 0.9529 0.9352 MAE 0.0798 0.1159 0.1303 0.0939 0.1093 0.1309 RMSE 0.1151 0.1748 0.1846 0.1162 0.1401 0.1688 MAPE% 15.19 24.11 28.01 18.89 20.01 23.06 R2 0.9608 0.9379 0.9300 0.9568 0.9465 0.9278 MAE 0.0138 0.0259 0.0396 0.0528 0.0696 0.0913 RMSE 0.0175 0.0321 0.0488 0.0652 0.0862 0.1129 MAPE % 3.17 6.31 7.23 8.43 14.37 16.19 R2 0.9737 0.9684 0.9640 0.9597 0.9569 0.9503 MAE 0.0093 0.0209 0.0339 0.0490 0.0670 0.0812 RMSE 0.0117 0.0258 0.0416 0.0611 0.0822 0.1005 MAPE % 2.82 5.76 7.69 8.16 13.44 15.45 R2 0.9825 0.9791 0.9772 0.9736 0.9681 0.9639 https://doi.org/10.1371/journal.pone.0290891.t003 has the lowest forecasting accuracy for all time intervals compared to other models for both lakes. The quantitative analysis, in general, shows that the most difficult case for the adopted models is forecasting drought at a six-period lead time. In other words, forecasting drought for the next six months is a major challenge for the proposed models because of the loss of important information as well as the sharp changes that occur in drought patterns through such a period, which was revealed by the analyzes previously reported, so it is necessary to test the efficacy of the models in this case. Accordingly, scatter plot diagrams, as shown in Figs 8 and 9 for both lakes, are created. Such a diagram allows us to visualize each data record and provides much information on how the forecasting values deviate from their corresponding actual values. The figures show that the suggested model offers fewer scatter points than comparable models. It is significant to conclude that the forecasting accuracy of Lake Michigan-Huron is higher than Lake Superior. The reasons may relate to the training conditions of the forecasting models. In this study, the models were calibrated using 75% of the drought time series data, while the remaining 25% of the hydrological drought data was reserved for testing purposes. The training data for Lake Michigan-Huron includes many positive and negative drought values, and the characteristics of the training data are more similar to the test data compared to Lake Superior (see Fig 6C and 6F). In more detail, the training data for Lake Superior contained a small percentage (5.88%) of the severe drought data (MSWI � -1), while that value approximately tripled (16.32%) when it came to Lake Michigan-Huron. To sum it up, the presence of multiple periods of severe drought conditions enhances the chance of obtaining accurate forecasts. Conversely, when such observations are few in the training data, it is normal for a forecasting model to observe a decrease in accuracy, especially when asked to simulate severe drought patterns. Assessing the efficacy of the proposed models is crucial, particularly during the severe drought period (MSWI � -1). The RE% values for each model are calculated (as illustrated in Fig 10) across various lead-time intervals to determine the practical benefits of the RELM-BWO model concerning its forecasting precision and performance compared to other PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 20 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 8. A comparison of observed MSWI and forecast values throughout a six-lead period time for Lake Superior. https://doi.org/10.1371/journal.pone.0290891.g008 models. The proposed RELM-BWO model generally outperforms other comparable methods, with the median relative error (depicted as a horizontal line within the boxplot) being lower. 5.1.2 Analyzing forecasts of turning points. The performance metrics listed in Tables 2 and 3 evaluate a model’s overall performance by considering all predicted points. However, when forecasting drought, metrics such as RMSE, R2, and others may not be ineffective indicators due to high auto-correlation. An overly smoothed time series is unsuitable for testing a prediction model’s effectiveness or robustness due to high auto-correlation. This can lead to R2 values exceeding 0.9 for all models, which may not be significant for drought forecasting PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 21 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 9. A comparison of observed MSWI and forecast values throughout a six-lead period time for Lake Michigan-Huron. https://doi.org/10.1371/journal.pone.0290891.g009 unless the model can accurately forecast turning points. Hence, assessing the error in turning points of the drought time series data is critical to gauge the effectiveness of models used in forecasting drought. The primary objective of this study’s method was to evaluate the model’s ability to accurately simulate turning points in the drought data. An illustration of the process for identifying critical and turning points in time series data can be found in Fig 11A. Once the these points were detected, the corresponding observations from multiple models were collected. The final step involved calculating the average absolute error of the turning and critical points (AAETP) PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 22 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 10. Relative error showing the forecasting accuracy of SVR, RF, ELM and RELM-BWO models over different lead times for Lake Superior and Lake Michigan-Huron. https://doi.org/10.1371/journal.pone.0290891.g010 for each model separately. It is significant to mention that the suggested process was conducted for both lakes to assess the model’s predictability in forecasting hydrological drought up to six months in advance. AAETP ¼ average ðjEi jÞ ð37Þ where Ei is a vector is computed for each model by subtracting the actual values of the critical and turning points from the values forecasted by the models. In Fig 11B and 11C, it was demonstrated that the RELM-BWO model outperformed other models in forecasting turning points and yielded fewer AAETP values. The RELM-BWO model’s effectiveness is measured by its ability to decrease the AAETP in drought forecast for both lakes at different lead times. The study’s findings show that the RELM-BWO model outperforms traditional models like SVR, RF, and ELM for Lake Superior, resulting in a 37.33%, 76.74%, and 32.21% improvement in forecasting accuracy. However, the accuracy of drought forecasting for Lake Michigan-Huron decreases slightly, but the RELM-BWO model still outperforms SVR, RF, and ELM by reducing forecasting errors by 16.18%, 56.27%, and 22.93%, respectively. 5.2 Second Scenario: Validation of the hybrid models This section presents a comparison of four hybrid models, namely RELM-BWO, SVR-BWO, RF-BWO, and ELM-BWO, to forecast drought conditions over one to six months in advance for the two largest lakes. The performance of each model in forecasting drought, considering various lead times, is demonstrated in Table 4. The statistical analysis revealed that the PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 23 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 11. Analysis of the errors in the turning and critical points at different lead time forecasting. a) an example of detecting the turning points, b) results of AAETP for Lake Superior, and c) results of AAETP for Lake Michigan-Huron. https://doi.org/10.1371/journal.pone.0290891.g011 RELM-BWO model outperforms other models in terms of drought forecasting, exhibiting the lowest values for MAE (ranging from 0.0156 to 0.1524), RMSE (ranging from 0.020 to 0.191), MAPE (ranging from 2 to 30.62%), and highest R2 (ranging from 0.9179 to 0.9687) for Lake Superior. Similarly, for Lake Michigan-Huron, RELM-BWO demonstrates the lowest values PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 24 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Table 4. Performance indicators of the hybrid machine learning-based models. Model ELM-BWO SVR-BWO RF-BWO RELM-BWO Lake Superior Lead Time MAE RMSE MAPE% t+1 0.0194 0.0238 2.34 t+2 0.0381 0.0576 5.62 Lake Michigan-Huron R2 AAETP MAE RMSE MAPE% R2 0.9602 0.0165 0.0119 0.0156 2.94 0.9801 0.0140 0.9579 0.0335 0.0233 0.0293 6.07 0.9745 0.0275 AAETP t+3 0.0701 0.0942 9.06 0.9434 0.0450 0.0370 0.0463 7.12 0.9730 0.0427 t+4 0.0865 0.1218 12.63 0.9291 0.0793 0.0508 0.0647 8.38 0.9703 0.0441 t+5 0.1323 0.1659 25.90 0.9239 0.0949 0.0684 0.0831 14.28 0.9666 0.0581 t+6 0.1675 0.2073 35.35 0.9101 0.1095 0.0891 0.1097 15.97 0.9602 0.0655 0.0420 Overall 0.0857 0.1118 15.15 0.9374 0.0631 0.0468 0.0581 9.13 0.9708 t+1 0.0432 0.0686 4.04 0.9524 0.0384 0.0175 0.0216 3.85 0.9794 0.0203 t+2 0.0557 0.0736 5.96 0.9502 0.0496 0.0268 0.0330 5.59 0.9744 0.0309 t+3 0.0709 0.0942 12.23 0.9473 0.0572 0.0403 0.0498 7.73 0.9725 0.0391 t+4 0.0684 0.1188 11.15 0.9446 0.086 0.0592 0.0725 10.01 0.9694 0.0411 t+5 0.1505 0.1848 24.83 0.9323 0.0985 0.0768 0.0943 13.32 0.9649 0.0489 t+6 0.1848 0.2136 32.45 0.9091 0.1412 0.1020 0.1252 17.48 0.9573 0.0609 Overall 0.0956 0.1256 15.11 0.9393 0.0785 0.0538 0.0661 9.66 0.9697 0.0402 t+1 0.0626 0.0844 10.84 0.9514 0.0545 0.0280 0.0352 7.63 0.9754 0.0280 t+2 0.1026 0.1282 17.65 0.9484 0.0824 0.0454 0.0564 11.78 0.9739 0.0433 t+3 0.1329 0.1700 22.48 0.9442 0.0804 0.0661 0.0818 13.74 0.9711 0.0601 t+4 0.1907 0.2324 33.72 0.9404 0.1387 0.0860 0.1089 14.41 0.9665 0.0698 t+5 0.2230 0.2690 37.27 0.9232 0.1564 0.1062 0.1376 16.98 0.9599 0.0792 t+6 0.2713 0.3287 46.08 0.8907 0.1785 0.1249 0.1604 24.65 0.9487 0.0883 Overall 0.1639 0.2021 0.0761 0.0967 0.0615 28.01 0.9331 0.1152 14.87 0.9659 t+1 0.0156 0.0202 2.00 0.9687 0.0135 0.0093 0.0117 2.82 0.9825 0.0112 t+2 0.0371 0.0489 4.93 0.9601 0.0262 0.0209 0.0258 5.76 0.9791 0.0228 t+3 0.0587 0.0773 8.44 0.9572 0.0350 0.0339 0.0416 7.69 0.9772 0.0383 t+4 0.0886 0.1149 13.61 0.9457 0.0652 0.0490 0.0611 8.16 0.9736 0.0437 t+5 0.1191 0.1523 20.31 0.9342 0.0865 0.0670 0.0822 13.44 0.9681 0.0494 t+6 0.1524 0.1910 30.62 0.9179 0.0870 0.0812 0.1005 15.45 0.9639 0.0582 Overall 0.0786 0.1008 13.32 0.9473 0.0522 0.0436 0.0538 8.89 0.9741 0.0373 https://doi.org/10.1371/journal.pone.0290891.t004 for MAE (ranging from 0.0093 to 0.0812), RMSE (ranging from 0.0117 to 0.1005), MAPE (ranging from 2.82 to 15.45%), and highest R2 (ranging from 0.9639 to 9825). Overall, the RELM-BWO model is considered the best-performing model, followed by ELM-BWO, SVR-BWO and RF-BWO, respectively. The RELM-BWO model’s accuracy has been tested against other hybrid models by measuring its ability to reduce the AAETP indicator for various lead times in the selected lakes. The overall results, which are presented in Table 4, demonstrate that the RELM-BWO model outperforms the ELM-BWO, SVR-BWO, and RF-BWO models, respectively, for Lake Superior. This results in a significant improvement in forecasting accuracy of 17.27%, 29.65%, and 54.69%. Similarly, for Lake Michigan-Huron, the RELM-BWO model was able to significantly decrease the AAETP indicator percentage values, resulting in an 11.19%, 7.21%, and 39.35% improvement compared to the ELM-BWO, SVR-BWO, and RF-BWO models, respectively. The accuracy of drought forecasting six months in advance, as exemplified by the hybrid models, has been visually analyzed using Taylor plots (refer to Fig 12) and Violin plots (refer to Fig 13). Taylor plots reveal that the RELM-BWO model demonstrates a closer standard deviation to the calculated drought than other models, with the highest R2 value and the lowest RMSE for both lakes. Evaluating the effectiveness of these models, especially during critical PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 25 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 12. Comparing hybrid models in forecasting MSWI using Taylor diagrams. a) Lake Superior and b) Lake Michigan-Huron. https://doi.org/10.1371/journal.pone.0290891.g012 drought periods (MSWI � -1), is essential. The RE% values of each model have been calculated (as depicted in Fig 13) to ascertain the practical advantages of the RELM-BWO model in terms of its forecasting accuracy and performance compared to other models. The proposed RELM-BWO model generally outperforms hybrid models, with a lower median relative error (indicated by the white hollow point near the Figure’s centre) and a smaller interquartile range (a grey line at the Figure’s core). 5.3 Validation of RELM-BWO: A comparative analysis against previous models in literature A more thorough analysis is needed to confirm the superiority of the proposed RELM-BWO model in forecasting drought one month ahead. This requires verifying the accuracy of the RELM-BWO model by comparing its performance with effective models developed in previous studies. One such study employed sophisticated models that combined wavelet transform (WA) with ANFIS, resulting in the WANFIS model [103]. The study’s conclusion indicates that the WANFIS model performs satisfactorily in forecasting mid-term drought, achieving the highest accuracy with an R2 value of 0.9133. Another study examined a model created by merging WA and ANN, which achieved an impressive R2 of 0.938 [104]. Additionally, an innovative approach was developed by combining LSTM and ARIMA models to enhance the accuracy of hydrological drought forecasts [105]. The integration of these models resulted in a hybrid model that achieved an impressive R2 score exceeding 0.91. Furthermore, a separate study utilized an advanced forecasting model that combined Online Sequential ELM with two decomposing filters, namely "Kalman filter regression and coupled with the boundary corrected maximal overlap discrete wavelet transform" [1], to forecast drought in Iran and the model demonstrated a high accuracy with an R2 of 0.93. Furthermore, a hybrid model PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 26 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 13. Comparing hybrid models in forecasting MSWI using Violin plots. a) Lake Superior and b) Lake Michigan-Huron. https://doi.org/10.1371/journal.pone.0290891.g013 incorporating ANN, SARIMA, and a genetic algorithm was employed to forecast hydrological drought, yielding a good R2 of 0.945 [106]. Other researchers have introduced advanced models using Fuzzy-SVR, Boosted-SVR, and classical SVR for drought forecasting, with the Fuzzy-SVR model performing the best with an R2 of 0.903 [107]. Additionally, other researchers have examined individual models for drought forecasting, such as ANN, achieving good results (R2 > 0.9) [62]. Lastly, a sophisticated combined model was developed by integrating WA, ARIMA, and ANN, resulting in a higher R2 of 0.872 [108]. In summary, the models assessed demonstrated satisfactory accuracy, with R2 values ranging from 0.872 to 0.945. However, these models’ forecasting accuracy remains lower than that of the RELM-SO model, which boasts a superior forecasting accuracy (R2 > 0.968). 5.4 Results of uncertainty analysis (UA) According to the previous discussions and the results presented, it can be confirmed that the RELM-BWO has the best performance among the employed models, obtaining the most accurate forecasts. Even though this model is superior, the forecasting accuracy gradually decreases when the lead time increases. Hence, it is critical to determine the maximum lead time for reliable drought forecasting that the proposed model can provide. Therefore, it is necessary to conduct the uncertainty analysis to ensure the reliability and validity of the results obtained. UA results provide much information on how the variations of forecasting results take place due to the input variations. The robust model has solid stability in its outcomes when PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 27 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 14. Uncertainty analysis showing the confidence intervals of drought index for A) Lake Superior, and B) Lake Michigan-Huron. https://doi.org/10.1371/journal.pone.0290891.g014 there are a few chances in the input variables. In this study, the uncertainty analysis was performed for the best models (RELM-BWO) throughout the testing phase. The values bracketed by lower and upper 95PPU for both lakes exceeded 88.5% (see Fig 14). Notably if the value bracketed by 95PPU is more than 80%, the models have produced good forecasting [109]. However, Fig 15 shows that the d-factor values generally have a gradual increase when the lead time increases. For Lake Superior, the model provides excellent forecasting for the one-and two-month step-ahead drought because the forecasting results have less uncertainty with d-factor values ranging from 0.493 to 0.595. Conversely, the model provides a higher uncertainty (d-factor is more than one) in three-, four-, five, and six-month ahead, indicating that the drought forecasting results in these time intervals are not good, and the model performs poorly. According to the same Figure, it can be observed that the forecasting accuracy of Lake Michigan-Huron is slightly better than Lake Superior because it provides a lower value of the d-factor, supporting outcomes of previously described statistical indicators. The slight discrepancy in the accuracy of drought prediction may be due to the different hydrological characteristics of the two studied lakes. Other factors, such as the volume, surface area of the studied lakes, and regulation operations, may affect the fluctuation of the levels of those lakes and consequently influence the drought patterns. It can be said that the suggested model has a solid level of accuracy when forecasting a drought of one-, two-, and three-month ahead because the d-factor (less than one) ranges from 0.403 to 0.885. Forecasting hydrological drought for extended time horizons is challenging due to various factors that can increase uncertainty and errors in the forecast. As the forecasting horizon increases, the relationship between inputs and outputs becomes more complicated, hindering prediction accuracy. This increasing complexity often results in higher levels of uncertainty, making it more challenging to predict future values accurately. Other factors impacting forecast accuracy include changes in underlying data patterns, seasonal effects, and unexpected events that affect data patterns. PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 28 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Fig 15. Values of d-factor showing the reliable accuracy of forecasted drought event based on lead time intervals for Lake Superior and Lake Michigan-Huron. https://doi.org/10.1371/journal.pone.0290891.g015 5.5 Investigating the use of RELM-BWO for forecasting drought in Lake Ontario, Lake Erie, and Lake St. Clair It was also demonstrated that the proposed model (RELM-BWO) effectively forecasts drought conditions in two greatest lakes. Therefore, it is crucial to utilize this model for forecasting droughts in the remaining three lakes (Lake Ontario, Lake Erie, and Lake St. Clair) that have not experienced severe droughts in the past fifty years. The performance metrics of the model’s drought forecasts, ranging from one to six months in advance for all lakes, are provided in Table 5. The quantitative results indicate that the model exhibits the highest accuracy in forecasting droughts for Lake St. Clair (MAE ranging from 0.0184 to 0.1139, RMSE ranging from 0.0240 to 0.1383, and MAPE ranging from 5.96 to 33.75%). The model also demonstrates the lowest AAETP values (ranging from 0.0178 to 0.1091) and the highest R2 values of 0.9905 to 0.9635). The model’s accuracy is satisfactory for the other two lakes, with Lake Ontario exhibiting the least accurate forecasts compared to the other lakes due to higher uncertainty values. Uncertainty analysis shows that the model reliably forecasts droughts up to six months in PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 29 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Table 5. Performance indicators of the hybrid model for Lake Ontario, Lake Erie, and Lake St. Clair, the use of bold numbers indicates that the forecasting is deemed unreliable due to significant uncertainty in the projected values. Lake/ lead time forecasting Lake Ontario Lake Erie Lake St. Clair Statistical Parameter Uncertainty Analysis 2 Interval MAE RMSE MAPE% R AAETP PPU95 t+1 0.0260 0.0327 15.31 0.9853 0.0209 85.62 0.4271 t+2 0.0574 0.0717 24.15 0.9701 0.0347 69.23 0.59.15 d-factor t+3 0.0896 0.1153 45.46 0.9383 0.0592 61.74 0.6545 t+4 0.1276 0.1603 65.74 0.8906 0.1026 49.33 0.6704 t+5 0.1542 0.1946 95.46 0.8446 0.1432 48.32 0.7790 t+6 0.1805 0.2287 114.47 0.7974 0.1703 57.38 1.0757 t+1 0.0207 0.0269 17.33 0.9891 0.0189 0.88 0.2666 t+2 0.0400 0.0499 22.38 0.9871 0.0304 79.60 0.3746 t+3 0.0626 0.0774 55.53 0.9830 0.0495 80.27 0.3753 t+4 0.0849 0.1036 88.54 0.9776 0.0659 67.45 0.5009 t+5 0.0997 0.1239 110.82 0.9719 0.0860 68.46 0.6170 t+6 0.1265 0.1515 171.17 0.9634 0.1134 65.77 0.6624 t+1 0.0184 0.0240 5.96 0.9905 0.0178 86.96 0.2524 t+2 0.0345 0.0442 11.15 0.9884 0.0219 84.28 0.3722 t+3 0.0537 0.0673 17.97 0.9841 0.0421 72.48 0.4156 t+4 0.0716 0.0902 23.54 0.9784 0.0758 68.12 0.4491 t+5 0.0929 0.1146 28.31 0.9723 0.0963 64.77 0.4667 t+6 0.1139 0.1383 33.75 0.9635 0.1091 64.09 0.5195 https://doi.org/10.1371/journal.pone.0290891.t005 advance, except for Lake Ontario, where dependable forecasts are generated only up to three months ahead. 5.6 Discussion The study’s results indicate that including the applied metaheuristic algorithm (BWO) significantly improves the accuracy of classical approaches like RF, SVR, and ELM in drought forecasting across different lead times. These results are consistent with previous research [51, 110], indicating that employing hybridized machine learning methods improves forecast accuracy and overall predictive performance. By utilizing the BWO algorithm to optimize hyperparameters of standalone models, the forecasting of MSWI in the studied lakes becomes more accurate. Both statistical indices and visual inspections confirm that the BWO algorithm significantly enhances the accuracy of all models when forecasting hydrological drought based on MSWI. The BWO algorithm provides precise parameters to the models, resulting in an increase in their generalization capacity. The RELM-BWO model emerges as the most effective among the hybrid models tested. The study illustrates that the optimal computation of the regularization parameter using the BWO algorithm has made the RELM-BWO model superior to the ELM-BWO model. By enhancing accuracy and performance, the BWO algorithm solidifies the RELM-BWO model as the top-performing hybrid model for MSWI-based drought forecasting. As the forecasting lead time increases, the accuracy of all hybrid models gradually decreases, making it challenging to select reliable forecasts. To address this, uncertainty analysis proves to be an effective tool for identifying reliable forecasts. The study reveals that some lakes can reliably forecast drought up to 2 months ahead, while others require 3- or 6-months lead time. The varying accuracy may be attributed to the data quality, indicating that models trained with different drought conditions yield more accurate forecasting results. PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 30 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 6. Conclusion This study utilized four hybrid ML-based models (SVR-BWO, ELM-BWO, RF-BWO, and RELM-BWO) and three classical models (ELM, RF, and SVR) to forecast drought in the Great Lakes region for several months in advance (1 to 6 months). Drought calculations were performed via MSWI using water level data from 1918 to 2020. The dataset was split into a training set (December 1921 to January 1997) and a testing set (February 1997 to December 2021) for model development. Over the past 50 years, severe drought events have been limited to Lake Superior and Lake Michigan-Huron. Thus, the selected models were employed to forecast droughts for several months in advance, specifically for these two lakes, and the best model was used to forecast the drought for the other three lakes. To assess the models’ ability to capture turning points, a new indicator called AAETP was introduced, as traditional criteria (RMSE and R2) were sometimes insufficient due to high auto-correlation in the drought dataset. The hybrid models, with BWO optimization, outperformed the classical models. Among them, RELM-BWO was the most accurate and reliable model. Evaluating the recommended model’s effectiveness in reducing AAETP for various lead times, it was observed that, in the case of Lake Superior, it achieved substantial improvements of 37.33%, 76.74%, and 32.21% compared to SVR, RF, and ELM, respectively. Similarly, for Lake Michigan-Huron, the RELM-BWO model demonstrated reductions of 16.18%, 56.27%, and 22.93% in AAETP compared to SVR, RF, and ELM, respectively. Also, the RELM-BWO consistently outperformed other hybrid models, with enhancements of 17.27% to 54.69% for Lake Superior and 7.21% to 39.35% for Lake Michigan-Huron. Generally, as the forecast lead time increased, all models’ accuracy declined. However, the proposed model showed a slightly smaller accuracy deterioration than the others. To ensure the reliability of the forecasting results, uncertainty analysis was conducted using d-factor and 95PPU criteria. The analysis exhibited that RELM-BWO reliably forecasts droughts up to two months in advance for Lake Superior and up to three months ahead for Lake MichiganHuron. Furthermore, uncertainty analysis confirmed that RELM-BWO can estimate droughts up to six months in advance for Lake Erie and Lake St. Clair, and up to three months ahead for Lake Ontario. Supporting information S1 Appendix. (DOCX) S1 File. Water level data. (CSV) S2 File. Droughts. (XLSX) Acknowledgments The authors would like to thank the Universiti Kebangsaan Malaysia for supporting this research. Author Contributions Conceptualization: Mohammed Majeed Hameed, Siti Fatin Mohd Razali. PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 31 / 37 PLOS ONE Accurate multi-months ahead drought forecasting Data curation: Siti Fatin Mohd Razali, Wan Hanna Melini Wan Mohtar, Norinah Abd Rahman. Formal analysis: Wan Hanna Melini Wan Mohtar. Funding acquisition: Norinah Abd Rahman. Investigation: Wan Hanna Melini Wan Mohtar, Zaher Mundher Yaseen. Methodology: Mohammed Majeed Hameed. Project administration: Siti Fatin Mohd Razali. Software: Mohammed Majeed Hameed. Validation: Siti Fatin Mohd Razali, Wan Hanna Melini Wan Mohtar, Zaher Mundher Yaseen. Visualization: Mohammed Majeed Hameed, Norinah Abd Rahman. Writing – original draft: Mohammed Majeed Hameed, Siti Fatin Mohd Razali. Writing – review & editing: Zaher Mundher Yaseen. References 1. Jamei M, Ahmadianfar I, Karbasi M, Malik A, Kisi O, Yaseen ZM. Development of wavelet-based Kalman Online Sequential Extreme Learning Machine optimized with Boruta-Random Forest for drought index forecasting. Eng Appl Artif Intell. 2023; 117: 105545. https://doi.org/10.1016/j.engappai.2022. 105545 2. Kafy A- Al, Bakshi ASaha, Faisal A Al, Almulhim AI, Rahaman ZA, et al. Assessment and prediction of index based agricultural drought vulnerability using machine learning algorithms. Sci Total Environ. 2023; 867: 161394. https://doi.org/10.1016/j.scitotenv.2023.161394 PMID: 36634773 3. Fathi-Taperasht A, Shafizadeh-Moghadam H, Minaei M, Xu T. Influence of drought duration and severity on drought recovery period for different land cover types: evaluation using MODIS-based indices. Ecol Indic. 2022; 141: 109146. https://doi.org/10.1016/j.ecolind.2022.109146 4. Hasan HH, Razali SF, Muhammad NS, Ahmad A. Modified Hydrological Drought Risk Assessment Based on Spatial and Temporal Approaches. Sustainability. 2022. https://doi.org/10.3390/ su14106337 5. Malik A, Kumar A, Salih SQ, Kim S, Kim NW, Yaseen ZM, et al. Drought index prediction using advanced fuzzy logic model: Regional case study over Kumaon in India. PLoS One. 2020; 15: e0233280. https://doi.org/10.1371/journal.pone.0233280 PMID: 32437386 6. Halder B, Haghbin M, Farooque AA. An Assessment of Urban Expansion Impacts on Land Transformation of Rajpur-Sonarpur Municipality. Knowledge-Based Eng Sci. 2021; 2: 34–53. 7. Ghebrezgabher MG, Yang T, Yang X, Wang C. Assessment of desertification in Eritrea: land degradation based on Landsat images. J Arid Land. 2019; 11: 319–331. 8. Long-term Mishra V. (1870–2018) drought reconstruction in context of surface water security in India. J Hydrol. 2020; 580: 124228. https://doi.org/10.1016/j.jhydrol.2019.124228 9. Awadh SM, Al-Mimar H, Yaseen ZM. Groundwater availability and water demand sustainability over the upper mega aquifers of Arabian Peninsula and west region of Iraq. Environment, Development and Sustainability. 2020. https://doi.org/10.1007/s10668-019-00578-z 10. Bandyopadhyay J, Karar D. Modeling fire station establishment of industrial area using geo-spatial science. Knowledge-Based Eng Sci. 2023; 4: 19–36. 11. Littell JS, Peterson DL, Riley KL, Liu Y, Luce CH. A review of the relationships between drought and forest fire in the United States. Glob Chang Biol. 2016; 22: 2353–2369. https://doi.org/10.1111/gcb. 13275 PMID: 27090489 12. Richardson D, Black AS, Irving D, Matear RJ, Monselesan DP, Risbey JS, et al. Global increase in wildfire potential from compound fire weather and drought. npj Clim Atmos Sci. 2022; 5: 23. https://doi. org/10.1038/s41612-022-00248-4 13. Obasi GOP. WMO’s role in the international decade for natural disaster reduction. Bull Am Meteorol Soc. 1994; 75: 1655–1661. PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 32 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 14. Wilhite DA, Glantz MH. Understanding: the drought phenomenon: the role of definitions. Water Int. 1985; 10: 111–120. 15. Heim RR Jr. A review of twentieth-century drought indices used in the United States. Bull Am Meteorol Soc. 2002; 83: 1149–1166. 16. Beyaztas U, Yaseen ZM. Drought interval simulation using functional data analysis. J Hydrol. 2019; 579: 124141. https://doi.org/10.1016/j.jhydrol.2019.124141 17. Hasan HH, Mohd Razali SF, Muhammad NS, Ahmad A. Research Trends of Hydrological Drought: A Systematic Review. Water. 2019. https://doi.org/10.3390/w11112252 18. Hosseini TSM, Hosseini SA, Ghermezcheshmeh B, Sharafati A. Drought hazard depending on elevation and precipitation in Lorestan, Iran. Theor Appl Climatol. 2020. https://doi.org/10.1007/s00704020-03386-y 19. McKee TB, Doesken NJ, Kleist J. The relationship of drought frequency and duration to time scales. Proceedings of the 8th Conference on Applied Climatology. Boston, MA, USA; 1993. pp. 179–183. 20. Abd Alraheem E, Jaber NA, Jamei M, Tangang F. Assessment of Future Meteorological Drought Under Representative Concentration Pathways (RCP8. 5) Scenario: Case Study of Iraq. KnowledgeBased Eng Sci. 2022; 3: 64–82. 21. Kumar S, Ahmed SA, Harishnaika N, Arpitha M. Spatial and Temporal Pattern Assessment of Meteorological Drought in Tumakuru District of Karnataka during 1951–2019 using Standardized Precipitation Index. J Geol Soc India. 2022; 98: 822–830. https://doi.org/10.1007/s12594-022-2073-3 22. Shamshirband S, Hashemi S, Salimi H, Samadianfard S, Asadi E, Shadkani S, et al. Predicting Standardized Streamflow index for hydrological drought using machine learning models. Eng Appl Comput Fluid Mech. 2020; 14: 339–350. https://doi.org/10.1080/19942060.2020.1715844 23. Aghelpour P, Bahrami-Pichaghchi H, Varshavian V. Hydrological drought forecasting using multi-scalar streamflow drought index, stochastic models and machine learning approaches, in northern Iran. Stoch Environ Res Risk Assess. 2021; 35: 1615–1635. https://doi.org/10.1007/s00477-020-01949-z 24. Nabipour N, Dehghani M, Mosavi A, Shamshirband S. Short-Term Hydrological Drought Forecasting Based on Different Nature-Inspired Optimization Algorithms Hybridized with Artificial Neural Networks. IEEE Access. 2020; 8: 15210–15222. https://doi.org/10.1109/ACCESS.2020.2964584 25. Babre A, Kalvāns A, Avotniece Z, Retiķe I, Bikše J, Jemeljanova KPM, et al. The use of predefined drought indices for the assessment of groundwater drought episodes in the Baltic States over the period 1989–2018. J Hydrol Reg Stud. 2022; 40: 101049. https://doi.org/10.1016/j.ejrh.2022.101049 26. Kubiak-Wójcicka K, Pilarska A, Kamiński D. The Analysis of Long-Term Trends in the Meteorological and Hydrological Drought Occurrences Using Non-Parametric Methods—Case Study of the Catchment of the Upper Noteć River (Central Poland). Atmosphere. 2021. https://doi.org/10.3390/ atmos12091098 27. Salimian N, Nazari S, Ahmadi A. Assessment of the uncertainties of global climate models in the evaluation of standardized precipitation and runoff indices: a case study. Hydrol Sci J. 2021; 66: 1419– 1436. https://doi.org/10.1080/02626667.2021.1937178 28. Achite M, Jehanzaib M, Elshaboury N, Kim T-W. Evaluation of Machine Learning Techniques for Hydrological Drought Modeling: A Case Study of the Wadi Ouahrane Basin in Algeria. Water. 2022. https://doi.org/10.3390/w14030431 29. Jehanzaib M, Bilal Idrees M, Kim D, Kim T-W. Comprehensive Evaluation of Machine Learning Techniques for Hydrological Drought Forecasting. J Irrig Drain Eng. 2021;147. https://doi.org/10.1061/ (ASCE)IR.1943-4774.0001575 30. Wang GC, Zhang Q, Band SS, Dehghani M, wing Chau K, Tho QT, et al. Monthly and seasonal hydrological drought forecasting using multiple extreme learning machine models. Eng Appl Comput Fluid Mech. 2022; 16: 1364–1381. https://doi.org/10.1080/19942060.2022.2089732 31. Henao Casas JD, Fernández Escalante E, Ayuga F. Alleviating drought and water scarcity in the Mediterranean region through managed aquifer recharge. Hydrogeol J. 2022; 30: 1685–1699. https://doi. org/10.1007/s10040-022-02513-5 32. Vélez-Nicolás M, Garcı́a-López S, Ruiz-Ortiz V, Zazo S, Molina JL. Precipitation Variability and Drought Assessment Using the SPI: Application to Long-Term Series in the Strait of Gibraltar Area. Water. 2022. https://doi.org/10.3390/w14060884 33. Mishra AK, Desai VR. Drought forecasting using feed-forward recursive neural network. Ecol Modell. 2006; 198: 127–138. https://doi.org/10.1016/j.ecolmodel.2006.04.017 34. Mossad A, Alazba AA. Drought forecasting using stochastic models in a hyper-arid climate. Atmosphere. 2015. pp. 410–430. https://doi.org/10.3390/atmos6040410 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 33 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 35. Borji M, Malekian A, Salajegheh A, Ghadimi M. Multi-time-scale analysis of hydrological drought forecasting using support vector regression (SVR) and artificial neural networks (ANN). Arab J Geosci. 2016; 9: 725. https://doi.org/10.1007/s12517-016-2750-x 36. Deo RC, Salcedo-Sanz S, Carro-Calvo L, Saavedra-Moreno B. Chapter 10—Drought Prediction With Standardized Precipitation and Evapotranspiration Index and Support Vector Regression Models. In: Samui P, Kim D, Ghosh CBT-IDS and M, editors. Elsevier; 2018. pp. 151–174. https://doi.org/10. 1016/B978-0-12-812056-9.00010-5 37. Tian Y, Xu Y-P, Wang G. Agricultural drought prediction using climate indices based on Support Vector Regression in Xiangjiang River basin. Sci Total Environ. 2018;622–623: 710–720. https://doi.org/ 10.1016/j.scitotenv.2017.12.025 PMID: 29223897 38. Mokhtarzad M, Eskandari F, Jamshidi Vanjani N, Arabasadi A. Drought forecasting by ANN, ANFIS, and SVM and comparison of the models. Environ Earth Sci. 2017; 76: 729. https://doi.org/10.1007/ s12665-017-7064-0 39. Ali M, Deo RC, Downs NJ, Maraseni T. An ensemble-ANFIS based uncertainty assessment model for forecasting multi-scalar standardized precipitation index. Atmos Res. 2018; 207: 155–180. https://doi. org/10.1016/j.atmosres.2018.02.024 40. Choubin B, Zehtabian G, Azareh A, Rafiei-Sardooi E, Sajedi-Hosseini F, Kişi Ö. Precipitation forecasting using classification and regression trees (CART) model: a comparative study of different approaches. Environ Earth Sci. 2018; 77: 314. https://doi.org/10.1007/s12665-018-7498-z 41. Achour K, Meddi M, Zeroual A, Bouabdelli S, Maccioni P, Moramarco T. Spatio-temporal analysis and forecasting of drought in the plains of northwestern Algeria using the standardized precipitation index. J Earth Syst Sci. 2020; 129: 42. https://doi.org/10.1007/s12040-019-1306-3 42. Dikshit A, Pradhan B, Alamri AM. Temporal hydrological drought index forecasting for New South Wales, Australia using machine learning approaches. Atmosphere (Basel). 2020;11. https://doi.org/ 10.3390/atmos11060585 43. Hakam O, Baali A, Belhaj Ali A. Modeling drought-related yield losses using new geospatial technologies and machine learning approaches: case of the Gharb plain, North-West Morocco. Model Earth Syst Environ. 2023; 9: 647–667. https://doi.org/10.1007/s40808-022-01523-2 44. Griebel A, Metzen D, Pendall E, Nolan RH, Clarke H, Renchon AA, et al. Recovery from Severe Mistletoe Infection After Heat- and Drought-Induced Mistletoe Death. Ecosystems. 2022; 25: 1–16. https:// doi.org/10.1007/s10021-021-00635-7 45. Vo TQ, Kim S-H, Nguyen DH, Bae D-H. LSTM-CM: a hybrid approach for natural drought prediction based on deep learning and climate models. Stoch Environ Res Risk Assess. 2023; 37: 2035–2051. https://doi.org/10.1007/s00477-022-02378-w 46. Dikshit A, Pradhan B, Alamri AM. Short-Term Spatio-Temporal Drought Forecasting Using Random Forests Model at New South Wales, Australia. Applied Sciences. 2020. https://doi.org/10.3390/ app10124254 47. Feng P, Wang B, Luo J-J, Liu DL, Waters C, Ji F, et al. Using large-scale climate drivers to forecast meteorological drought condition in growing season across the Australian wheatbelt. Sci Total Environ. 2020;724. https://doi.org/10.1016/j.scitotenv.2020.138162 PMID: 32247977 48. Hameed MM, Abed MA, Al-Ansari N, Alomar MK. Predicting Compressive Strength of Concrete Containing Industrial Waste Materials: Novel and Hybrid Machine Learning Model. Yin D, editor. Adv Civ Eng. 2022; 2022: 5586737. https://doi.org/10.1155/2022/5586737 49. Mamata RC, Ramlia A, Yazidb MRM, Kasab A, Razalib SFM, Bastamc MN. Slope Stability Prediction of Road Embankment using Artificial Neural Network Combined with Genetic Algorithm. J Kejuruter. 2022; 34: 165–173. 50. Kisi O, Docheshmeh Gorgij A, Zounemat-Kermani M, Mahdavi-Meymand A, Kim S. Drought forecasting using novel heuristic methods in a semi-arid environment. J Hydrol. 2019; 578: 124053. https://doi. org/10.1016/j.jhydrol.2019.124053 51. Malik A, Tikhamarine Y, Souag-Gamane D, Rai P, Sammen SS, Kisi O. Support vector regression integrated with novel meta-heuristic algorithms for meteorological drought prediction. Meteorol Atmos Phys. 2021; 133: 891–909. https://doi.org/10.1007/s00703-021-00787-0 52. Aghelpour P, Mohammadi B, Mehdizadeh S, Bahrami-Pichaghchi H, Duan Z. A novel hybrid dragonfly optimization algorithm for agricultural drought prediction. Stoch Environ Res Risk Assess. 2021. https://doi.org/10.1007/s00477-021-02011-2 53. Danandeh Mehr A, Vaheddoost B, Mohammadi B. ENN-SA: A novel neuro-annealing model for multistation drought prediction. Comput Geosci. 2020; 145: 104622. https://doi.org/10.1016/j.cageo.2020. 104622 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 34 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 54. Rahmati O, Panahi M, Kalantari Z, Soltani E, Falah F, Dayal KS, et al. Capability and robustness of novel hybridized models used for drought hazard modeling in southeast Queensland, Australia. Sci Total Environ. 2020; 718: 134656. https://doi.org/10.1016/j.scitotenv.2019.134656 PMID: 31839310 55. AlOmar MK, Khaleel F, AlSaadi AA, Hameed MM, AlSaadi MA, Al-Ansari N. The Influence of Data Length on the Performance of Artificial Intelligence Models in Predicting Air Pollution. Rathnayake U, editor. Adv Meteorol. 2022; 2022: 5346647. https://doi.org/10.1155/2022/5346647 56. Deo RC, Şahin M. Application of the extreme learning machine algorithm for the prediction of monthly Effective Drought Index in eastern Australia. Atmos Res. 2015; 153: 512–525. 57. Yaseen ZM, Ali M, Sharafati A, Al-Ansari N, Shahid S. Forecasting standardized precipitation index using data intelligence models: regional investigation of Bangladesh. Sci Rep. 2021;11. https://doi.org/ 10.1038/s41598-021-82977-9 PMID: 33564055 58. Liu Y, Wang LH, Yang LB, Liu XM. Drought prediction based on an improved VMD-OS-QR-ELM model. PLoS One. 2022; 17: e0262329. https://doi.org/10.1371/journal.pone.0262329 PMID: 34990468 59. Fung KF, Huang YF, Koo CH, Soh YW. Drought forecasting: A review of modelling approaches 2007– 2017. J Water Clim Chang. 2019; 11: 771–799. https://doi.org/10.2166/wcc.2019.236 60. Assani AA, Landry R, Azouaoui O, Massicotte P, Gratton D. Comparison of the Characteristics (Frequency and Timing) of Drought and Wetness Indices of Annual Mean Water Levels in the Five North American Great Lakes. Water Resour Manag. 2016; 30: 359–373. https://doi.org/10.1007/s11269015-1166-9 61. Zhong C, Li G, Meng Z. Beluga whale optimization: A novel nature-inspired metaheuristic algorithm. Knowledge-Based Syst. 2022; 251: 109215. https://doi.org/10.1016/j.knosys.2022.109215 62. Tareke KA, Awoke AG. Hydrological drought forecasting and monitoring system development using artificial neural network (ANN) in Ethiopia. Heliyon. 2023; 9: e13287. https://doi.org/10.1016/j.heliyon. 2023.e13287 PMID: 36816247 63. Ghasemi P, Karbasi M, Zamani Nouri A, Sarai Tabrizi M, Azamathulla HM. Application of Gaussian process regression to forecast multi-step ahead SPEI drought index. Alexandria Eng J. 2021; 60: 5375–5392. https://doi.org/10.1016/j.aej.2021.04.022 64. Madadgar S, Moradkhani H. A Bayesian Framework for Probabilistic Seasonal Drought Forecasting. J Hydrometeorol. 2013; 14: 1685–1705. https://doi.org/10.1175/JHM-D-13-010.1 65. Malik A, Kumar A. Meteorological drought prediction using heuristic approaches based on effective drought index: a case study in Uttarakhand. Arab J Geosci. 2020; 13: 276. https://doi.org/10.1007/ s12517-020-5239-6 66. Özger M, Başakın EE, Ekmekcioğlu Ö, Hacısüleyman V. Comparison of wavelet and empirical mode decomposition hybrid models in drought prediction. Comput Electron Agric. 2020;179. https://doi.org/ 10.1016/j.compag.2020.105851 67. Khosravi K, Daggupati P, Alami MT, Awadh SM, Ghareb MI, Panahi M, et al. Meteorological data mining and hybrid data-intelligence models for reference evaporation simulation: A case study in Iraq. Comput Electron Agric. 2019; 167: 105041. https://doi.org/10.1016/j.compag.2019.105041 68. Hameed MM, AlOmar MK, Khaleel F, Al-Ansari N. An Extra Tree Regression Model for Discharge Coefficient Prediction: Novel, Practical Applications in the Hydraulic Sector and Future Research Directions. Armaghani D, editor. Math Probl Eng. 2021; 2021: 7001710. https://doi.org/10.1155/2021/ 7001710 69. Hameed MM, Alomar MK, Mohd Razali SF, Kareem Khalaf MA, Baniya WJ, Sharafati A, et al. Application of Artificial Intelligence Models for Evapotranspiration Prediction along the Southern Coast of Turkey. Xu H, editor. Complexity. 2021; 2021: 1–20. https://doi.org/10.1155/2021/8850243 70. Elbeltagi A, Kumar M, Kushwaha NL, Pande CB, Ditthakit P, Vishwakarma DK, et al. Drought indicator analysis and forecasting using data driven models: case study in Jaisalmer, India. Stoch Environ Res Risk Assess. 2023; 37: 113–131. https://doi.org/10.1007/s00477-022-02277-0 71. Carranza C, Nolet C, Pezij M, van der Ploeg M. Root zone soil moisture estimation with Random Forest. J Hydrol. 2021; 593: 125840. https://doi.org/10.1016/j.jhydrol.2020.125840 72. Ahmad MW, Mourshed M, Rezgui Y. Tree-based ensemble methods for predicting PV power generation and their comparison with support vector regression. Energy. 2018; 164: 465–474. https://doi.org/ 10.1016/j.energy.2018.08.207 73. Dong B, Cao C, Lee SE. Applying support vector machines to predict building energy consumption in tropical region. Energy Build. 2005; 37: 545–553. https://doi.org/10.1016/j.enbuild.2004.09.009 74. AlOmar MK, Hameed MM, Al-Ansari N, AlSaadi MA. Data-Driven Model for the Prediction of Total Dissolved Gas: Robust Artificial Intelligence Approach. Jiang Y-Z, editor. Adv Civ Eng. 2020; 2020: 6618842. https://doi.org/10.1155/2020/6618842 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 35 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 75. Li Q, Meng Q, Cai J, Yoshino H, Mochida A. Applying support vector machine to predict hourly cooling load in the building. Appl Energy. 2009; 86: 2249–2256. https://doi.org/10.1016/j.apenergy.2008.11. 035 76. Alomar MK, Khaleel F, Aljumaily MM, Masood A, Razali SFM, AlSaadi MA, et al. Data-driven models for atmospheric air temperature forecasting at a continental climate region. PLoS One. 2022; 17: e0277079. https://doi.org/10.1371/journal.pone.0277079 PMID: 36327280 77. Huang G-B, Zhu Q-Y, Siew C-K. Extreme learning machine: theory and applications. Neurocomputing. 2006; 70: 489–501. 78. AlOmar MK, Hameed MM, AlSaadi MA. Multi hours ahead prediction of surface ozone gas concentration: Robust artificial intelligence approach. Atmos Pollut Res. 2020. https://doi.org/10.1016/j.apr. 2020.06.024 79. Hameed MM, Khaleel F, AlOmar MK, Mohd Razali SF, AlSaadi MA. Optimising the Selection of Input Variables to Increase the Predicting Accuracy of Shear Strength for Deep Beams. Naser MZ, editor. Complexity. 2022; 2022: 6532763. https://doi.org/10.1155/2022/6532763 80. Moghadam RG, Yaghoubi B, Rajabi A, Shabanlou S, Izadbakhsh MA. Evaluation of discharge coefficient of triangular side orifices by using regularized extreme learning machine. Appl Water Sci. 2022; 12: 145. https://doi.org/10.1007/s13201-022-01665-9 81. Hameed MM, AlOmar MK, Al-Saadi AAA, AlSaadi MA. Inflow forecasting using regularized extreme learning machine: Haditha reservoir chosen as case study. Stoch Environ Res Risk Assess. 2022. https://doi.org/10.1007/s00477-022-02254-7 82. Bartlett P. For valid generalization the size of the weights is more important than the size of the network. Adv Neural Inf Process Syst. 1996;9. 83. Deng W, Zheng Q, Chen L. Regularized Extreme Learning Machine. 2009 IEEE Symposium on Computational Intelligence and Data Mining. 2009. pp. 389–395. https://doi.org/10.1109/CIDM.2009. 4938676 84. Ewees AA, Gaheen MA, Yaseen ZM, Ghoniem RM. Grasshopper Optimization Algorithm with Crossover Operators for Feature Selection and Solving Engineering Problems. IEEE Access. 2022. 85. Mantegna RN. Fast, accurate algorithm for numerical simulation of L\’evy stable stochastic processes. Phys Rev E. 1994; 49: 4677–4683. https://doi.org/10.1103/PhysRevE.49.4677 PMID: 9961762 86. Pouya A, Ozgur K, Vahid V. Multivariate Drought Forecasting in Short- and Long-Term Horizons Using MSPI and Data-Driven Approaches. J Hydrol Eng. 2021; 26: 4021006. https://doi.org/10.1061/ (ASCE)HE.1943-5584.0002059 87. Pande CB, Kushwaha NL, Orimoloye IR, Kumar R, Abdo HG, Tolche AD, et al. Comparative Assessment of Improved SVM Method under Different Kernel Functions for Predicting Multi-scale Drought Index. Water Resour Manag. 2023. https://doi.org/10.1007/s11269-023-03440-0 88. Sharafati A, Yasa R, Azamathulla HM. Assessment of stochastic approaches in prediction of waveinduced pipeline scour depth. J Pipeline Syst Eng Pract. 2018;9. https://doi.org/10.1061/(ASCE)PS. 1949-1204.0000347 89. Antanasijević D, Pocajt V, Perić-Grujić A, Ristić M. Modelling of dissolved oxygen in the Danube River using artificial neural networks and Monte Carlo Simulation uncertainty analysis. J Hydrol. 2014; 519: 1895–1907. https://doi.org/10.1016/j.jhydrol.2014.10.009 90. Noori R, Hoshyaripour G, Ashrafi K, Araabi BN. Uncertainty analysis of developed ANN and ANFIS models in prediction of carbon monoxide daily concentration. Atmos Environ. 2010; 44: 476–482. https://doi.org/10.1016/j.atmosenv.2009.11.005 91. Dehghani M, Saghafian B, Nasiri Saleh F, Farokhnia A, Noori R. Uncertainty analysis of streamflow drought forecast using artificial neural networks and Monte-Carlo simulation. Int J Climatol. 2014; 34: 1169–1180. https://doi.org/10.1002/joc.3754 92. Gao M, Yin L, Ning J. Artificial neural network model for ozone concentration estimation and Monte Carlo analysis. Atmos Environ. 2018; 184: 129–139. https://doi.org/10.1016/j.atmosenv.2018.03.027 93. Nur Adli Zakaria M, Abdul Malek M, Zolkepli M, Najah Ahmed A. Application of artificial intelligence algorithms for hourly river level forecast: A case study of Muda River, Malaysia. Alexandria Eng J. 2021; 60: 4015–4028. https://doi.org/10.1016/j.aej.2021.02.046 94. Great Lake Map. Available: https://kids.britannica.com/kids/article/Great-Lakes/346130#:~:text= TheGreatLakesarefive,offreshwateronEarth. 95. de Loë RC, Kreutzwiser RD. Climate Variability, Climate Change and Water Resource Management in the Great Lakes. Clim Change. 2000; 45: 163–179. https://doi.org/10.1023/A:1005649219332 96. McBean E, Motiee H. Assessment of impact of climate change on water resources: a long term analysis of the Great Lakes of North America. Hydrol Earth Syst Sci. 2008; 12: 239–255. https://doi.org/10. 5194/hess-12-239-2008 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 36 / 37 PLOS ONE Accurate multi-months ahead drought forecasting 97. Hunter TS, Croley TE. Great Lakes Monthly Hydrologic Data. NOAA Data Report ERL GLERL. 1993. 98. GrateLakes. Available: https://www.lre.usace.army.mil/Missions/Great-Lakes-Information/GreatLakes-Information-2/Water-Level-Data/ 99. Saase R, Schütt B, Bebermeier W. Analyzing the Dependence of Major Tanks in the Headwaters of the Aruvi Aru Catchment on Precipitation. Applying Drought Indices to Meteorological and Hydrological Data. Water. 2020. https://doi.org/10.3390/w12102941 100. Bazrafshan J, Hejabi S, Rahimi J. Drought Monitoring Using the Multivariate Standardized Precipitation Index (MSPI). Water Resour Manag. 2014; 28: 1045–1060. https://doi.org/10.1007/s11269-0140533-2 101. Najafia T, Jaafara R, Remlib R, Zaidib AW, Chellappana K. The Role of Brain Signal Processing and Neuronal Modelling in Epilepsy–A Review. J Kejuruter. 2021; 33: 801–815. 102. Aghelpour P, Varshavian V. Forecasting Different Types of Droughts Simultaneously Using Multivariate Standardized Precipitation Index (MSPI), MLP Neural Network, and Imperialistic Competitive Algorithm (ICA). Shahid S, editor. Complexity. 2021; 2021: 6610228. https://doi.org/10.1155/2021/ 6610228 103. Soh YW, Koo CH, Huang YF, Fung KF. Application of artificial intelligence models for the prediction of standardized precipitation evapotranspiration index (SPEI) at Langat River Basin, Malaysia. Comput Electron Agric. 2018; 144: 164–173. https://doi.org/10.1016/j.compag.2017.12.002 104. Khan MMH, Muhammad NS, El-Shafie A. Wavelet-ANN versus ANN-Based Model for Hydrometeorological Drought Forecasting. Water. 2018. https://doi.org/10.3390/w10080998 105. Xu D, Zhang Q, Ding Y, Zhang D. Application of a hybrid ARIMA-LSTM model based on the SPEI for drought forecasting. Environ Sci Pollut Res. 2022; 29: 4128–4144. https://doi.org/10.1007/s11356021-15325-z PMID: 34403057 106. Alquraish M, Ali. Abuhasel K, S. Alqahtani A, Khadr M. SPI-Based Hybrid Hidden Markov–GA, ARIMA–GA, and ARIMA–GA–ANN Models for Meteorological Drought Forecasting. Sustainability. 2021. https://doi.org/10.3390/su132212576 107. Fung KF, Huang YF, Koo CH, Mirzaei M. Improved SVR machine learning models for agricultural drought prediction at downstream of Langat River Basin, Malaysia. J Water Clim Chang. 2019; 11: 1383–1398. https://doi.org/10.2166/wcc.2019.295 108. Khan MMH, Muhammad NS, El-Shafie A. Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting. J Hydrol. 2020; 590: 125380. https://doi.org/10.1016/j.jhydrol.2020.125380 109. Sattar Hanoon M, Najah Ahmed A, Razzaq A, Oudah AY, Alkhayyat A, Feng Huang Y, et al. Prediction of hydropower generation via machine learning algorithms at three Gorges Dam, China. Ain Shams Eng J. 2023; 14: 101919. https://doi.org/10.1016/j.asej.2022.101919 110. Malik A, Tikhamarine Y, Sammen SS, Abba SI, Shahid S. Prediction of meteorological drought by using hybrid support vector regression optimized with HHO versus PSO algorithms. Environ Sci Pollut Res. 2021; 28: 39139–39158. https://doi.org/10.1007/s11356-021-13445-0 PMID: 33751346 PLOS ONE | https://doi.org/10.1371/journal.pone.0290891 October 31, 2023 37 / 37