Bulletin of TUIT: Management and Communication Technologies

The Implementation of Machine Learning and Deep Learning Algorithms for Crop Yield Prediction in Agriculture

Nodir Rahimov 1
1 Software Engineering, Tashkent University of Information Technologies, nikobek82@gmail.com
Dilmurod Khasanov 2
2 Software Engineering, Tashkent University of Information Technologies, tatusf2015@gmail.com

Abstract. In many Asian countries the economy relies heavily on agriculture, which makes the agricultural system one of their most important sectors. Crop yield prediction is a crucial task in agriculture that can help farmers make informed decisions and optimize their crop production. Accurate predictions can help farmers better plan their resources and reduce waste, ultimately leading to higher profits and a more sustainable agricultural industry. This article presents a comprehensive study on the use of machine learning and deep learning techniques for crop yield prediction in agriculture, in which several AI algorithms are implemented and compared on a given dataset. To this end, dynamic analysis data were collected for crop yield prediction and used to construct regression prediction models with multivariate regression (MR), a deep neural network (DNN), multiple linear regression (MLR), and a gradient boosting regressor tree (GBRT), analyzing a range of agricultural factors that affect wheat crop yields. These factors include soil moisture, temperature, rainfall, and crop growth stages. The models are trained on a large dataset of wheat crop yields and the corresponding agricultural factors, allowing them to learn patterns and make accurate predictions. The experiments conducted on the dataset demonstrate the effectiveness of the proposed approach. The models outperform traditional statistical methods for crop yield prediction and achieve an accuracy of up to 90%. The results show that the use of both deep learning and machine learning techniques can significantly improve the accuracy of crop yield prediction in agriculture. The proposed approach has the potential to revolutionize the agricultural industry by providing farmers and agricultural organizations with a more accurate and efficient means of predicting crop yields. This, in turn, can help reduce waste and optimize resources, leading to a more sustainable and profitable agricultural industry. The model can be integrated into existing agricultural systems and used to make timely and informed decisions about crop management.

Keywords: machine learning, deep learning, MR, MLR, DNN, GBRT, gradient descent.

Introduction. Artificial intelligence (AI) has become a crucial technology in the Fourth Industrial Revolution, gaining substantial recognition in domains such as finance, healthcare, and manufacturing. The subfields of AI, specifically machine learning and deep learning, have become widely prevalent in diverse areas, including speech recognition, computer vision, language models, and industrial fault diagnosis [1]. Consequently, AI has attracted significant attention as a revolutionary force capable of driving these fields forward and enhancing human abilities, presenting enormous potential for industry transformation. Given the critical role of agriculture in the global economy, understanding global crop yield patterns is essential for addressing food security challenges and mitigating the impact of climate change amid a growing human population.
Accurately predicting crop yields is a significant agricultural challenge that depends on multiple factors such as weather conditions (e.g., rainfall, temperature) and pesticide application. Therefore, precise knowledge of crop yield history is crucial when making decisions related to agricultural risk management and yield forecasting [1][2]. Crop yield prediction poses a challenge for decision-makers at various levels, from global to local scales. Farmers, for instance, can leverage reliable crop yield prediction models to determine optimal planting schedules and crop selection. There are various approaches to forecasting crop yields [2]. Machine learning represents a practical approach that can facilitate improved crop yield prediction by leveraging multiple attributes. As a subdivision of artificial intelligence (AI) that emphasizes learning, machine learning (ML) is capable of extracting insights from datasets by identifying correlations and patterns. During the training phase, ML models are trained on datasets that capture prior experiential outcomes, and the resulting predictive models incorporate a range of features and parameters calculated from previous data. During the testing phase, previously unused historical data are employed to evaluate model performance. Depending on the research question and topic, ML models can be descriptive or predictive. Predictive models leverage past data to forecast future events, while descriptive models help to characterize current conditions or historical trends. Machine learning techniques have been instrumental in improving crop yield prediction and crop management decision-making. In recent years, a range of machine learning algorithms such as multivariate regression, decision trees, association rule mining, and artificial neural networks have been deployed to enhance crop yield forecasting in agriculture [5][6].

This paper's primary contributions are as follows:
1. The primary objective of this study is to compare the performance of various machine learning and deep learning models, including multivariate regression (MR), a deep neural network (DNN), and multiple linear regression (MLR), in predicting the wheat yield for the next season using previously collected data.
2. We provide a detailed description of the data preprocessing procedure used in our study. This process includes importing raw data and constructing a dataset that includes soil moisture, rainfall, temperature, and mineral content (nitrogen, phosphorus, and natural minerals). Additionally, we remove unnecessary data and classify the specifications for model training and validation.
3. To validate and evaluate the effectiveness of our study, we conduct a comprehensive comparative analysis of the models used to predict crop yield for the next season. This analysis includes an assessment of the accuracy of these models, allowing us to determine which approach is most effective in accurately predicting crop yield.

1. Related works
In recent years, there has been significant research interest and activity focused on the topic of crop yield prediction. Numerous studies have been conducted in this area, exploring various techniques and methodologies for predicting crop yields with greater accuracy and precision. Koirala et al. (2019) reviewed the use of deep learning methods for fruit counting and yield estimation.
They highlighted the ability of deep learning methods to extract important features and recommended approaches such as CNN detectors, deep regression, and LSTM for estimating the fruit load [3]. Dharani et al. (2021) conducted a review of crop yield prediction using deep learning and found that hybrid networks and RNN-LSTM networks outperformed other networks. The superior performance of RNN and LSTM can be attributed to their storage and feedback-loop capabilities, which enable accurate predictions from time-series data on crop yield [4]. In their study of crop yield prediction using machine learning, van Klompenburg et al. (2020) found that neural networks, specifically CNN, LSTM, and DNN, were the most commonly used models. They also noted that the number of features used varied depending on the study and that in some cases yield prediction relied on object counting and detection instead of tabular data [5]. Amit et al. proposed a model that predicts winter wheat yield using a DNN, a convolutional neural network (CNN), and XGBoost. Their proposed CNN model outperformed all other baseline models used for winter wheat yield prediction (7 to 14% lower RMSE, 3 to 15% lower MAE, and 4 to 50% higher correlation coefficient than the best-performing baseline across the test data) [2].

Table 1. Targets and methods of related works [6].
Reference | Target | Method
Koirala et al. (2019) | Fruit detection for yield estimation | Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM)
Dharani et al. (2021) | Crop prediction using deep learning techniques | Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM)
van Klompenburg et al. (2020) | Crop yield prediction with machine learning | Long Short-Term Memory (LSTM), Deep Neural Network (DNN)
Amit et al. (2022) | Winter wheat yield prediction | Convolutional Neural Network (CNN)

2. Methods
2.1. Multivariate regression (MR)
Multivariate regression is a statistical method that is widely used in various fields of research, such as economics, finance, psychology, and the social sciences. The primary goal of multivariate regression is to model the relationship between multiple independent variables and one or more dependent variables. This is done by fitting a linear equation to the data, which allows the value of each dependent variable to be predicted for any given combination of values of the independent variables. Multivariate regression is a more general statistical method than multiple linear regression, which focuses on modeling linear relationships between a single dependent variable and two or more independent variables. In contrast, multivariate regression allows for the analysis of complex relationships among multiple variables and can account for correlations among the dependent variables. One of the significant advantages of multivariate regression is its ability to analyze the relationships among multiple variables simultaneously, which can lead to more accurate and robust results compared to analyzing each variable separately. For example, in economics, multivariate regression is used to model the relationship between multiple economic indicators, such as inflation, interest rates, and GDP, to predict the behaviour of the economy as a whole.

Another advantage of multivariate regression is its ability to handle missing data and outliers, which occur in real-world data. By considering multiple variables simultaneously, multivariate regression can better handle missing data and outliers, leading to more accurate results. The structure of multivariate regression involves modeling the relationship between multiple independent variables (X1, X2, X3, ...) and one or more dependent variables (Y1, Y2, Y3, ...) by fitting a linear equation to the data. The general form of the multivariate regression equation is as follows:

Y = β0 + β1X1 + β2X2 + β3X3 + ... + ε

where Y is the dependent variable; X1, X2, X3, ... are the independent variables; β0 is the intercept or constant term; β1, β2, β3, ... are the coefficients or regression weights that represent the impact of each independent variable on the dependent variable; and ε is the error term or residual. The coefficients (β1, β2, β3, ...) are estimated from the data using ordinary least squares (OLS) regression, which minimizes the sum of the squared residuals to find the best-fitting line through the data. The OLS method finds the values of the coefficients that minimize the difference between the predicted and actual values of the dependent variable. In multivariate regression, the number of independent variables can vary, and there can be more than one dependent variable. In cases where there are multiple dependent variables, the regression equations take the form:

Y1 = β01 + β11X1 + β12X2 + β13X3 + ... + ε1
Y2 = β02 + β21X1 + β22X2 + β23X3 + ... + ε2
Y3 = β03 + β31X1 + β32X2 + β33X3 + ... + ε3
...
Yn = β0n + βn1X1 + βn2X2 + βn3X3 + ... + εn

where Y1, Y2, Y3, ..., Yn are the n dependent variables; X1, X2, X3, ... are the independent variables; β01, β11, β12, β13, ..., βn1, βn2, βn3, ... are the coefficients or regression weights; and ε1, ε2, ε3, ..., εn are the error terms. Overall, the structure of multivariate regression involves fitting a linear equation to the data to model the relationship between multiple independent variables and one or more dependent variables, and estimating the coefficients using OLS regression.
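To make the estimation step above concrete, the following short Python/NumPy sketch fits a multivariate (multi-output) regression by ordinary least squares. The feature matrix X and target matrix Y are synthetic placeholders rather than the wheat dataset described later in this paper, so the listing only illustrates how the coefficient matrix of the equations above can be obtained.

# A minimal multivariate (multi-output) OLS sketch; X and Y are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_targets = 200, 4, 2

X = rng.normal(size=(n_samples, n_features))          # independent variables X1..X4
true_B = rng.normal(size=(n_features, n_targets))     # unknown coefficient matrix
Y = X @ true_B + rng.normal(scale=0.1, size=(n_samples, n_targets))  # dependent variables Y1, Y2

# Add an intercept column and estimate all coefficients jointly by least squares,
# i.e. minimise the sum of squared residuals for every dependent variable at once.
X1 = np.hstack([np.ones((n_samples, 1)), X])
B_hat, *_ = np.linalg.lstsq(X1, Y, rcond=None)

Y_pred = X1 @ B_hat                                   # predictions for both targets
print("estimated intercepts:", B_hat[0])
print("estimated coefficients:\n", B_hat[1:])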
2.2. Multiple Linear Regression (MLR)
Multiple linear regression (MLR), also referred to as multiple regression, is a statistical approach that employs several explanatory variables to forecast the outcome of a response variable. The objective of MLR is to establish a linear relationship between the independent (explanatory) variables and the dependent (response) variable. Essentially, multiple regression is an extension of ordinary least squares (OLS) regression, as it involves more than one explanatory variable [8]. MLR can be a powerful tool for analyzing data and drawing conclusions that are supported by statistical evidence. For example, MLR can be used to investigate the relationship between various demographic factors and a specific health outcome, or to analyze the relationship between different types of marketing strategies and sales outcomes. To use MLR effectively, researchers must carefully choose their independent and dependent variables and ensure that each variable is measured accurately and consistently. They must also ensure that the sample size is sufficient to achieve statistically significant results. Once the data are collected, researchers can use MLR to determine the strength and direction of the relationships between the independent variables and the dependent variable. They can also use MLR to create predictive models that estimate the value of the dependent variable for specific values of the independent variables [8].

The structure of a multiple linear regression model can be represented as follows:

Y = β0 + β1X1 + β2X2 + ... + βnXn + ε

where Y is the dependent variable; X1, X2, ..., Xn are the independent variables; β0 is the intercept or constant term; β1, β2, ..., βn are the regression coefficients, which represent the expected change in Y for a one-unit change in X1, X2, ..., Xn while holding all other independent variables constant; and ε is the error term, which represents the unexplained variability in Y that is not accounted for by the independent variables. The multiple linear regression model aims to estimate the values of the regression coefficients that best fit the data, in order to make predictions about the dependent variable based on the independent variables. The quality of the model fit can be assessed using measures such as the R-squared value, which indicates the proportion of variance in the dependent variable that is explained by the independent variables [8]. In practice, multiple linear regression models can be complex and may involve interactions or nonlinear relationships between the independent variables and the dependent variable. However, the basic structure remains the same, with the aim of modeling and predicting the relationship between a dependent variable and multiple independent variables.
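As an illustration of the MLR model described above, the sketch below fits Y = β0 + β1X1 + ... + βnXn + ε with scikit-learn, the library used in this study, on synthetic data. The three feature columns stand in for agronomic variables such as soil moisture, temperature, and rainfall; they are illustrative assumptions, not the paper's actual schema.

# A minimal multiple linear regression sketch with scikit-learn on synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 3))                 # e.g. soil moisture, temperature, rainfall
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.2, size=150)

model = LinearRegression().fit(X, y)          # estimates beta_0 ... beta_n by OLS
y_pred = model.predict(X)

print("intercept (beta_0):", model.intercept_)
print("coefficients (beta_1..beta_3):", model.coef_)
print("R^2:", r2_score(y, y_pred))            # share of variance explained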
2.3. Deep Neural Network (DNN)
A deep neural network (DNN) is a type of artificial neural network (ANN) that includes multiple hidden layers situated between the input and output layers, as depicted in Figure 1. Learning in DNNs involves a repeated error-backpropagation procedure, which adjusts the weights to minimize the value of the loss function through optimization methods such as stochastic gradient descent [1]. Nonetheless, increasing the depth of a neural network can lead to vanishing or exploding gradients, while increasing the number of neurons may lead to overfitting. To tackle vanishing or exploding gradients, an appropriate weight-initialization technique based on the type of activation function can be employed. Additionally, overfitting can be reduced by using techniques such as dropout and batch normalization. Furthermore, advancements in hardware, such as improved graphics processing units (GPUs), have significantly reduced the computation time of the complex matrix operations in deep learning. DNNs that address these challenges can perform complex nonlinear modeling. Therefore, these techniques are highly effective in developing accurate machine learning models capable of handling complex, high-dimensional data. In conclusion, DNNs are a powerful tool for addressing complex machine learning problems, and their ability to learn complex nonlinear mappings from high-dimensional data makes them highly effective in various fields [1].

Figure 1. Construction of the deep neural network (DNN) model [9].
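The paper does not state which deep learning framework was used, so the following sketch builds a small feed-forward regressor with scikit-learn's MLPRegressor purely as a stand-in for the DNN of Figure 1; the layer sizes, activation, and optimizer are illustrative assumptions, and the data are synthetic.

# A small feed-forward network for tabular regression, sketched with MLPRegressor.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))                          # six agronomic input features
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

dnn = make_pipeline(
    StandardScaler(),                                  # scaling stabilises gradient descent
    MLPRegressor(hidden_layer_sizes=(64, 32, 16),      # three hidden layers
                 activation="relu",
                 solver="adam",                        # stochastic gradient-based optimiser
                 max_iter=2000,
                 random_state=0),
)
dnn.fit(X, y)
print("training R^2:", dnn.score(X, y))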
2.4. Gradient Boosting Regressor Tree (GBRT)
Boosting is a type of ensemble machine learning technique that combines multiple weak learners to create a strong learner, as demonstrated in Figure 2. Gradient boosting is one of the most popular and commonly used boosting algorithms; it focuses on improving the accuracy of the model by enhancing the predictions made by prior models [1].

Figure 2. The typical structure of the GBRT model [16].

To start the gradient boosting algorithm, the first model calculates the average prediction value of the target variable across the entire dataset and computes the residual. This residual is then used to train additional decision trees that together create a stronger model. The model is enhanced iteratively by obtaining the gradient of the residual and using it to reduce the residual even further in the next model. Gradient boosting has been found to be highly effective in improving the accuracy of machine learning models [15-17]. It can be applied to a broad range of data types and has been extensively used to address regression problems. Therefore, gradient boosting is a robust and powerful ensemble technique that can significantly enhance the prediction accuracy of machine learning models.

3. Data preprocessing
Figure 2 illustrates the data preprocessing process used for model learning. Initially, the dataset was imported into Python from kaggle.com. Additional features were added to create a new dataset. Next, normalization was performed to analyze the data. Finally, certain specification data were identified as model learning data, while the base specification data were set aside to evaluate the performance of the generated predictive model.

Figure 2. The preprocessing process for crop yield prediction data.
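The listing below is a minimal sketch of the preprocessing flow just described, followed by fitting the GBRT model of Section 2.4 with scikit-learn. The file name crop_yield.csv, the column names, and the hyperparameter values are assumptions made for illustration; the paper only states that the raw data were downloaded from kaggle.com and normalized before part of the data was held out for evaluation.

# Sketch of the preprocessing pipeline (Section 3) followed by a GBRT fit (Section 2.4).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import GradientBoostingRegressor

df = pd.read_csv("crop_yield.csv")                       # hypothetical raw dataset
features = ["soil_moisture", "temperature", "rainfall",
            "nitrogen", "phosphorus"]                    # assumed feature columns
target = "yield"

X = df[features].values
y = df[target].values

# Normalise the features, then hold out part of the data to evaluate the model.
X = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Gradient boosting builds an ensemble of shallow trees, each one fitted to the
# residuals (negative gradient of the squared loss) of the trees before it.
gbrt = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                 max_depth=3, random_state=0)
gbrt.fit(X_train, y_train)
print("test R^2:", gbrt.score(X_test, y_test))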
4. Results and discussion
4.1. Evaluation metrics
This study uses a dataset collected over more than 20 years, including rainfall measurements, the productivity of each year, and temperature. Machine learning techniques, including multivariate regression (MR), a deep neural network (DNN), multiple linear regression (MLR), and a gradient boosting regressor tree (GBRT), were employed to construct the predictive models in Python using the Scikit-learn and Seaborn libraries. The predictive performance of the models was evaluated using the mean absolute error (MAE) and the root mean squared error (RMSE), while a separate dataset was used to test and verify the selected model. The test included assessing the performance of each prediction model on a separate dataset and generating graphs to compare the predicted and actual values of crop yield under changing temperature and rainfall. For regression problems, MAE and RMSE are the most commonly used metrics. In this section we compare, through MAE and RMSE, the results obtained from the four models described in Section 2.

MAE = (1/n) Σ_{i=1}^{n} |y_act − y_pred|                         (1)

RMSE = √( (1/n) Σ_{i=1}^{n} (y_act − y_pred)² )                  (2)

where n is the number of data points, y_pred is the predicted value of the dependent variable for the i-th data point, and y_act is the actual value of the dependent variable for the i-th data point.
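The MAE and RMSE values reported for each model below follow Eqs. (1) and (2). A small helper along these lines, shown here with scikit-learn and assuming a fitted model together with the train/test split from the preprocessing sketch in Section 3, reproduces the train/test layout of Tables 2 through 5.

# Computing the MAE of Eq. (1) and the RMSE of Eq. (2) on the train and test splits.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

def report(model, X_train, y_train, X_test, y_test):
    for name, X, y in [("Train", X_train, y_train), ("Test", X_test, y_test)]:
        pred = model.predict(X)
        mae = mean_absolute_error(y, pred)               # Eq. (1)
        rmse = np.sqrt(mean_squared_error(y, pred))      # Eq. (2)
        print(f"{name}: MAE = {mae:.1f}, RMSE = {rmse:.1f}")

# Example usage with the GBRT sketch from Section 3:
# report(gbrt, X_train, y_train, X_test, y_test)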
4.2. Multivariate Regression Prediction Model Performance
The performance evaluation of the predictive model is presented in Table 2, where the MR model exhibits MAE values of 63365.8 and 64242.0, and RMSE values of 83256.2 and 84955.1, on the training and test datasets, respectively. The prediction results of the MR model for the test dataset are visualized in Figure 3 and Figure 4.

Table 2. The performance evaluation results of the MR prediction model.
Metric | Train | Test
MAE | 63365.8 | 64242.0
RMSE | 83256.2 | 84955.1

Figure 3. High correlation among features.
Figure 4. True values (blue) and predictions (orange).

4.3. Multiple Linear Regression Prediction Model Performance
The performance evaluation of the predictive model is presented in Table 3, where the MLR model exhibits MAE values of 63879.3 and 64099.9, and RMSE values of 84145.8 and 84254.6, on the training and test datasets, respectively. The prediction results of the MLR model for the test dataset are visualized in Figure 5 and Figure 6.

Table 3. The performance evaluation results of the MLR prediction model.
Metric | Train | Test
MAE | 63879.3 | 64099.9
RMSE | 84145.8 | 84254.6

Figure 5. The dynamics of crop yield by year.
Figure 6. True values and predictions.

4.4. Deep Neural Network Prediction Model Performance
The performance evaluation of the predictive model is presented in Table 4, where the DNN model exhibits MAE values of 63713.9 and 63747.2, and RMSE values of 83510.5 and 83493.9, on the training and test datasets, respectively.

Table 4. The performance evaluation results of the DNN prediction model.
Metric | Train | Test
MAE | 63713.9 | 63747.2
RMSE | 83510.5 | 83493.9

The study has revealed that multiple linear regression (MLR) outperforms the other algorithms evaluated in terms of dataset size, sorting, and key features. Although models based on the deep neural network (DNN) and multivariate regression (MR) algorithms have been observed to be highly effective in certain circumstances, MLR has been identified as the most suitable of these algorithms for predicting crop yield based on the research findings.

4.5. Gradient Boosting Regressor Tree Model Performance
The performance evaluation of the predictive model is presented in Table 5, where the GBRT model exhibits MAE values of 61378.7 and 61749.5, and RMSE values of 79139.3 and 79641.6, on the training and test datasets, respectively.

Table 5. The performance evaluation results of the GBRT prediction model.
Metric | Train | Test
MAE | 61378.7 | 61749.5
RMSE | 79139.3 | 79641.6

Table 6. The SOTA comparison of models.
Model | Root mean squared error (/1000 ha) | Mean absolute error (/1000 ha) | Mean percentage error (%)
MR | 84.10 | 61.8 | 83
MLR | 84.2 | 63.98 | 80
DNN | 83.5 | 62.7 | 82
GBRT | 70.39 | 59.5 | 88

5. Conclusion
In recent years, machine learning techniques such as multiple linear regression (MLR), multivariate regression (MR), and deep neural networks (DNN) have shown promising results in crop yield prediction. In this paper, we evaluated the performance of MLR, MR, DNN, and GBRT models in predicting crop yield using a publicly available dataset. Our results show that GBRT outperforms the MLR, DNN, and MR models in terms of prediction accuracy, with lower mean absolute error (MAE) and root mean squared error (RMSE) values. This indicates that GBRT is better suited to modeling the complex relationships between crop yield and various environmental and management factors. However, we also note that the choice of algorithm for crop yield prediction depends on several factors, including the complexity of the problem, the amount and quality of data, and the specific application requirements. While GBRT may perform better in some cases, MLR, DNN, and MR models can be more interpretable and easier to implement in certain scenarios. In future work, we aim to expand the dataset by collecting more data with varying specifications, including new features that affect crops, creating a new application to collect agricultural data from farmers, and narrowing the learning area to a specific region of Central Asia. By doing so, we can improve the generalization and prediction performance of the model, making it more effective in the real world.

References
1. Lee, W.; Jung, T.-Y.; Lee, S. Dynamic Characteristics Prediction Model for Diesel Engine Valve Train Design Parameters Based on Deep Learning. Electronics 2023, 12, 1806. https://doi.org/10.3390/electronics12081806
2. Srivastava, A.K.; Safaei, N.; Khaki, S.; Lopez, G.; Zeng, W.; Ewert, F.; Gaiser, T.; Rahimi, J. Winter wheat yield prediction using convolutional neural networks from environmental and phenological data. 2022.
3. Koirala, A.; Walsh, K.B.; Wang, Z.; McCarthy, C. Deep learning: method overview and review of use for fruit detection and yield estimation. 2019.
4. Dharani, M.; Thamilselvan, R.; Natesan, P.; Kalaivaani, P.; Santhoshkumar, S. Review on crop prediction using deep learning techniques. 2021.
5. van Klompenburg, T.; Kassahun, A.; Catal, C. Crop yield prediction using machine learning: a systematic literature review. 2020.
6. Oikonomidis, A.; Catal, C.; Kassahun, A. Deep learning for crop yield prediction: a systematic literature review. 2022.
7. Huang, Y.; Liu, Y.; Chenhui; Wang, C. GBRTVis: online analysis of gradient boosting regression tree. 2018.
8. Huang, H.; Rong, J.; Shi, X. Feature selection and hyper parameters optimization for short-term wind power forecast. 2021.
9. Lin, C.; Chang, Q.; Li, X. A Deep Learning Approach for MIMO-NOMA Downlink Signal Detection. 2019.
10. Nie, P.; Roccotelli, M.; Fanti, M.P.; Ming, Z.; Li, Z. Prediction of home energy consumption based on gradient boosting regression tree. Energy Rep. 2021, 7, 1246-1255.
11. Jiang, S.; Li, J.; Zhang, S.; Gu, Q.; Lu, C.; Liu, H. Landslide risk prediction by using GBRT algorithm: Application of artificial intelligence in disaster prevention of energy mining. Process. Saf. Environ. Prot. 2022, 166, 384-392.
12. Khaki, S.; Wang, L. Crop Yield Prediction Using Deep Neural Networks. 2019.
13. Rahimov, N.; Khasanov, D. The application of multiple linear regression algorithm and python for crop yield prediction in agriculture. Harvard Educational and Scientific Review, Vol. 2, Issue 1, pp. 181-187.
14. Rahimov, N.; Khasanov, D.; Kuvandikov, J. Structural-functional organization correctness of knowledge models of product systems. Harvard Educational and Scientific Review, Vol. 2, Issue 2, pp. 1-9.
15. Rahimov, N.; Khasanov, D. The mathematical essence of logistic regression for machine learning. International Journal of Contemporary Scientific and Technical Research, pp. 102-105.
16. Hui, H.; Rong, J.; Xiaoyu, S.; Jun, L.; Jian, D. Feature selection and hyper parameters optimization for short-term wind power forecast. https://doi.org/10.1007/s10489-021-02191-y