Creation of a Physics-Inspired Neural Network for Predicting Weather Parameters

Dax Fabella, Jasler Sapetin, Rommel Yray
Advisers: John Paul Vergara, Clark Kendrick Go
November 25, 2023

1 Introduction

The intricate nature of atmospheric processes poses a significant challenge in predicting weather parameters [7]. Accurate weather prediction is important because it helps mitigate the effects of natural disasters; for instance, predicting rainfall supports preparation for floods, landslides, and droughts [6]. Similarly, reliable weather models alert people to potential health risks; predicting extreme values of relative humidity serves as a warning for overheating, which can in turn lead to strokes [8]. Given the unpredictability of weather parameters such as precipitation and humidity, accurate prediction models are essential for human health and safety.

Today, the most commonly used method in weather forecasting is numerical weather prediction (NWP), which employs mathematical equations to represent physical processes. While NWP models have advanced the understanding of these processes, they rely on simplifying assumptions that may fail to capture complex atmospheric relationships [4]. Recognizing these limitations, an emerging approach to weather forecasting uses machine learning (ML) to capture complex relationships within weather data [5]. Previous studies have harnessed ML methods, particularly neural networks, to model physical processes [3]; however, there has been limited success when using these techniques for weather prediction. Consequently, this study aims to predict weather parameters with artificial neural networks, focusing on either relative humidity or rainfall.
2 Materials and Methods

2.1 Artificial Neural Network

The initial phase of this research focuses on the implementation of an Artificial Neural Network (ANN), a computational model inspired by the structure of the human brain. Once the ANN framework is established, the researchers first aim to identify which target variable, rainfall or relative humidity, exhibits better predictability.

The ANN's architecture comprises three layers (input, hidden, and output), with each layer containing 'neurons' through which data passes, connecting one layer to another [9]. The input layer takes in the input data, and its dimension is determined by the number of features in the dataset. The hidden layer then transforms the input data through a dense layer of neurons followed by an activation function. In this study, the activation function is the rectified linear unit (ReLU), a piecewise linear function given by $f(x) = \max\{0, x\}$, where $x$ is the input to the function. ReLU introduces the non-linearity that the network needs in order to learn complex patterns. Lastly, the output layer produces the desired continuous numeric predictions from the features extracted by the hidden layer.

The Adaptive Moment Estimation (Adam) algorithm is used as the optimizer for the ANN. Adam updates the weights of the neurons in each layer during training so as to minimize the 'loss function', which helps the network learn patterns from the training data that carry over to unseen (testing) data [2].

2.2 Loss Function

The loss function is an equation that measures the disparity between the model's target and predicted output; it measures how well a neural network models the data [10].
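To make the pieces above concrete, the following is a minimal NumPy sketch, not the authors' implementation, of a network with one ReLU hidden layer trained with Adam against a squared-error loss. The hidden-layer width, learning rate, number of steps, synthetic data, and all names (`relu`, `forward`, `gradients`, `params`) are illustrative assumptions; the study does not report these settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data standing in for scaled weather features (assumption).
X = rng.uniform(0.0, 1.0, size=(256, 4))
true_w = np.array([0.5, -0.2, 0.3, 0.1])
y = (X @ true_w + 0.1 * np.maximum(X[:, 0] - 0.5, 0.0))[:, None]

def relu(z):
    """Rectified linear unit: f(x) = max{0, x}, applied element-wise."""
    return np.maximum(0.0, z)

# Parameters: input (4 features) -> hidden (16 neurons, ReLU) -> output (1, linear).
params = {
    "W1": rng.normal(0.0, np.sqrt(2 / 4), size=(4, 16)),
    "b1": np.zeros(16),
    "W2": rng.normal(0.0, np.sqrt(2 / 16), size=(16, 1)),
    "b2": np.zeros(1),
}

def forward(p, X):
    """Return hidden pre-activations, hidden activations, and predictions."""
    z1 = X @ p["W1"] + p["b1"]
    h = relu(z1)
    return z1, h, h @ p["W2"] + p["b2"]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def gradients(p, X, y):
    """Backpropagate the MSE loss through the two layers."""
    z1, h, y_hat = forward(p, X)
    d_out = 2.0 * (y_hat - y) / y.size       # dMSE / dy_hat
    d_h = (d_out @ p["W2"].T) * (z1 > 0)     # ReLU derivative is 1 where z1 > 0
    return {
        "W1": X.T @ d_h, "b1": d_h.sum(axis=0),
        "W2": h.T @ d_out, "b2": d_out.sum(axis=0),
    }

# Adam state: running averages of the gradient (m) and squared gradient (v).
m = {k: np.zeros_like(val) for k, val in params.items()}
v = {k: np.zeros_like(val) for k, val in params.items()}
lr, beta1, beta2, eps = 1e-2, 0.9, 0.999, 1e-8

loss_start = mse(y, forward(params, X)[2])
for t in range(1, 501):
    g = gradients(params, X, y)
    for k in params:
        m[k] = beta1 * m[k] + (1 - beta1) * g[k]
        v[k] = beta2 * v[k] + (1 - beta2) * g[k] ** 2
        m_hat = m[k] / (1 - beta1 ** t)      # bias-corrected first moment
        v_hat = v[k] / (1 - beta2 ** t)      # bias-corrected second moment
        params[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)
loss_end = mse(y, forward(params, X)[2])

print(f"MSE before training: {loss_start:.4f}, after training: {loss_end:.4f}")
```

In practice a deep-learning framework would supply the layers, automatic differentiation, and optimizer; writing them out here only exposes where ReLU, the loss, and the Adam update each enter the training loop.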
For the initial phase of the research, the loss function used is the Mean Squared Error (MSE); in later phases it will be modified by incorporating a suitable partial differential equation (PDE). Specifically, the researchers plan to improve the precision and interpretability of the predictive models by integrating physical laws. This will be achieved by adding a PDE-based term to the loss function, hence the 'Physics-Informed' loss function.

2.3 Data Preprocessing

This study uses the weather dataset acquired from the Manila Observatory, which contains 11 years of weather data (2010-2020) captured at 5-minute intervals. The raw data comprises 38 unique weather parameters and spans over a million entries.

The first preprocessing step was to examine the distribution of null values in the raw data. Three features (highest wind speed direction, wind speed direction, and solar index) were dropped because of their high percentages of missing values (23%, 43%, and 78%, respectively) and because their null values were spread widely across the dataset. Following this, the dataset was sliced to a timeframe with relatively fewer missing entries. The resulting sliced data covers the 7 years from 2011 to 2017. Next, the remaining data was aggregated from 5-minute to daily granularity to condense the dataset in preparation for modeling.

To handle the missing data, the Multiple Imputation by Chained Equations (MICE) algorithm was used to iteratively impute the null values in the aggregated dataset based on Bayesian Ridge regression. Compared with a single imputation, this procedure is more effective since it can "learn" the underlying pattern of the data and takes into account data variability and uncertainty [11].
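The aggregation and imputation steps can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: the data are synthetic, the daily aggregate is taken to be the mean (the source does not say which statistic was used), and scikit-learn's `IterativeImputer`, a MICE-style imputer that returns a single completed dataset by default, stands in for the MICE implementation.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (must precede the next import)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)

# Synthetic stand-in for the 5-minute data: 60 days x 288 readings/day x 3 features.
n_days, per_day, n_features = 60, 288, 3
daily_signal = rng.normal(size=(n_days, 1, 1))          # shared day-to-day variation
raw = daily_signal + 0.1 * rng.normal(size=(n_days, per_day, n_features))
raw[rng.random(raw.shape) < 0.05] = np.nan               # scattered missing readings
raw[10:15, :, 1] = np.nan                                # a five-day outage in one feature

# Aggregate 5-minute readings to daily means, ignoring NaNs.
# A day with no valid readings for a feature stays NaN.
counts = (~np.isnan(raw)).sum(axis=1)
sums = np.nansum(raw, axis=1)
daily = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

# Chained-equations imputation: each feature with gaps is regressed on the
# others with Bayesian Ridge, cycling for up to max_iter rounds.
imputer = IterativeImputer(estimator=BayesianRidge(), max_iter=10, random_state=0)
daily_imputed = imputer.fit_transform(daily)

observed = ~np.isnan(daily)
print("missing daily values before:", int((~observed).sum()))
print("missing daily values after: ", int(np.isnan(daily_imputed).sum()))
```

Observed daily values pass through unchanged; only the gaps are filled. Drawing several completed datasets, as full multiple imputation does, would require setting `sample_posterior=True` and repeating the fit with different seeds.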
For feature selection, a correlation matrix was used to identify the highly correlated variables (those with correlation coefficients ≥ 0.8) in the imputed dataset; these variables were dropped to reduce multicollinearity and improve model efficiency. Min-max scaling was applied to the resulting dataset to place all weather variables on a common scale.

Finally, the TimeSeriesSplit method was employed for data splitting. This method ensures that test sets always occur after their training sets, an important consideration for the Manila Observatory dataset, which consists of time-series weather data. The split data serves as the training and testing sets that are fed into the ANN model.

2.4 Evaluation Metrics

Four metrics are used to assess model performance: $R^2$, Root Mean Squared Error (RMSE), Weighted Absolute Percentage Error (WAPE), and Mean Absolute Percentage Error (MAPE). Their formulas are given below, where $\hat{y}_i$ represents the predicted values, $y_i$ denotes the actual observations, $\bar{y}$ is the mean of the actual observations, and $n$ is the number of observations:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

$$\mathrm{WAPE} = \frac{\sum_{i=1}^{n} |y_i - \hat{y}_i|}{\sum_{i=1}^{n} |y_i|}$$

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

$R^2$ measures how well the predicted values fit the actual values. It equals 1 for a perfect fit and usually lies between 0 and 1, although it can be negative when a model fits worse than simply predicting the mean. RMSE, WAPE, and MAPE, on the other hand, are error metrics concerned with the difference between predicted and actual values; lower values indicate better accuracy.

In the initial phase of this study, the researchers compare the performance of rainfall and humidity ANN models to determine their predictability. This comparative analysis seeks to pinpoint the weather variable that will shape the research focus in the following months.

3 Results and Discussion

The $R^2$ value (0.3234) for the Rainfall ANN model (R-ANN) suggests that it only accounts for a small proportion of the variation in true rainfall. While it has a relatively small RMSE (0.0409), it exhibits an extremely large MAPE ($2.4214 \times 10^{13}$), indicating significant inaccuracy in percentage errors. Because MAPE divides each error by $y_i$, a value this large likely reflects days with zero or near-zero observed rainfall.

On the other hand, the $R^2$ value (0.5044) for the Humidity ANN model (H-ANN) is an improvement over the rainfall model, which explains less variation in the data. The slightly higher RMSE (0.0527) for the H-ANN indicates that its errors are slightly larger than those observed in the R-ANN. However, the H-ANN performs better on percentage errors, as evidenced by its considerably lower MAPE (0.1436). The H-ANN also has a lower WAPE (0.0785) than the R-ANN (1.0412), implying smaller errors relative to the total magnitude of the observed values. Below are the model scores:

ANN for Predicting Weather Parameters

Model    R^2      RMSE     WAPE     MAPE
R-ANN    0.3234   0.0409   1.0412   2.4214 × 10^13
H-ANN    0.5044   0.0527   0.0785   0.1436

In summary, the H-ANN demonstrates superiority due to a higher $R^2$, a significantly reduced MAPE, and a lower WAPE. Its RMSE is also comparable to that of the R-ANN model (0.0527 vs 0.0409). Moreover, this version of the H-ANN outperforms all other models tested.

4 Conclusion and Future Work

Based on the assessment criteria, the ANN model demonstrates potential for forecasting critical meteorological parameters such as rainfall and humidity. In terms of the $R^2$, MAPE, and WAPE criteria, the H-ANN model showed better performance than the R-ANN model. From this, it can be concluded that the humidity model provides more accurate predictions than the rainfall model. Humidity can also be depicted better by the ANN model, having a lower percentage error in comparison with other approaches. Having identified greater predictability in humidity compared to rainfall using the ANN approach, the researchers will proceed to enhance the ANN model with humidity as the forecasting variable.

4.1 Future Considerations

An integral next phase will involve including PDEs in the loss function to increase the physics awareness and predictive skill of the ANN model. This will lead to the development of the 'Physics-Informed Neural Network' (PINN), which incorporates insights from atmospheric dynamics and constraints. The PDEs would act as a regularization measure and thus as a guiding function during model training.

In addition, modifying the model to focus on nowcasting with recent, area-specific data could yield highly accurate short-term forecasts and real-time storm tracking. Consequently, through an iterative process, the current model could evolve into a nowcasting PINN, with the potential to combine ensembles for long-term forecasting. Through physical knowledge injection using PDEs and a transition to recurrent nowcasting frameworks, the ANN model could become a multi-timescale weather prediction tool for various forecast horizons.

References

[1] An Introduction to the ReLU Activation Function. builtin.com. https://builtin.com/machine-learning/relu-activation-function. [Accessed 17-11-2023].
[2] Jason Brownlee. Gentle Introduction to the Adam Optimization Algorithm for Deep Learning. machinelearningmastery.com. https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/. [Accessed 17-11-2023].
[3] Salvatore Cuomo et al. "Scientific Machine Learning Through Physics–Informed Neural Networks: Where we are and What's Next". In: J. Sci. Comput. 92.3 (Sept. 2022).
[4] Arka Daw et al. "Physics-guided neural networks (PGNN): An application in lake temperature modeling". In: (2017).
[5] K Kashinath et al. "Physics-informed machine learning: case studies for weather and climate modelling". In: Philos. Trans. A Math. Phys. Eng. Sci. 379.2194 (Apr. 2021), p. 20200093.
[6] Sarmad Dashti Latif et al. "Assessing rainfall prediction models: Exploring the advantages of machine learning and remote sensing approaches". In: Alex. Eng. J. 82 (Nov. 2023), pp. 16–25.
[7] Manmeet Singh et al. "Deep learning for improved global precipitation in numerical weather prediction systems". In: (2021).
[8] Enes Slatina et al. "Correlation between change in air humidity and the incidence of stroke". In: Mater. Sociomed. 25.4 (Dec. 2013), pp. 242–245.
[9] What are Neural Networks? IBM. ibm.com. https://www.ibm.com/topics/neural-networks. [Accessed 17-11-2023].
[10] Vishal Yathish. Loss Functions and Their Use In Neural Networks. towardsdatascience.com. https://towardsdatascience.com/loss-functions-and-their-use-in-neural-networks-a470e703f1e9. [Accessed 17-11-2023].
[11] Zhongheng Zhang. "Multiple imputation with multivariate imputation by chained equation (MICE) package". In: Ann. Transl. Med. 4.2 (Jan. 2016), p. 30.