International Journal of Engineering Trends and Technology (IJETT) – Volume 29 Number 2 - November 2015
ISSN: 2231-5381, http://www.ijettjournal.org

Modeling for Prediction of Tomato Yield and Its Deviation using Artificial Neural Network

Syed Abou Iltaf Hussain#1, Diganta Hatibaruah*2
1 ME student, Department of Mechanical Engineering, Jorhat Engineering College, Jorhat, Assam, India
2 Associate Professor, Department of Mechanical Engineering, Jorhat Engineering College, Jorhat, Assam, India

I. INTRODUCTION
Prediction is an estimate of how things will occur in the future. It is a difficult job, especially in agriculture, where the quantity to be predicted is the crop yield. The yield depends upon various factors, ranging from the weather to the amount of fertilizers applied. Some of these factors change from day to day, such as the temperature of the day, rainfall and sunshine. Other factors, such as the pH of the soil, remain nearly constant, while factors such as the amount of fertilizers used depend upon how much the farmer chooses to apply.

According to FAOSTAT, India is the second largest producer of tomatoes, producing approximately 12 million tonnes in 2010, a value that rose to 17.5 million tonnes annually in 2014. Out of all the states growing tomatoes, Andhra Pradesh leads tomato growing in India, covering approximately 35% of the total production, followed by Karnataka. Assam is one of the lowest tomato growing states of the country, producing approximately 402.49 thousand tonnes in 2013.

The tomato cultivators of Phesual, Assam are not economically strong, so they cannot invest large amounts of money in their farming. Moreover, the farmers of Assam have small land holdings. Their main motive is to increase their production with minimum effort, less labour and minimum investment. Hence a need arises for a predicting tool that can give a rough idea of the amount of tomato yield at the end of the season, so that a decision can be made whether to cultivate or not. The Artificial Neural Network (ANN) is one of the many computing models that can predict such real-life situations accurately.

II. LITERATURE SURVEY
In 2005 Kaul et al. used an ANN to develop a model to predict the yield of corn and soybeans by considering various environmental factors as inputs. In 2006 Ushada and Murase used an artificial neural network to develop a model that showed the relationship of the minimum temperature Tmin, maximum temperature Tmax, optimum temperature Topt and ambient temperature Tamb with the heat unit accumulation, relative rate of growth, leaf area index, height of moss, mass of moss and temperature stress factor. The specific leaf area and the ground area had the best experimental values of 1.498 m2/kg and 28 mm2/plant respectively in the leaf area index sub-model. In 2007, B. Ji et al. developed a model that performed better than the multiple linear regression-based yield models in predicting the rice yield of the Fujian province of China from location-specific rainfall data, soil fertility data and weather variables such as sunshine hours per day, solar radiation per day and temperature sum per day. The value of R2 was higher and the value of RMSE was lower for the ANN rice yield model than for the multiple linear regression-based yield models. In 2010, Rahman and Bala modeled a network to predict jute production from i) Julian day, ii) solar radiation, iii) maximum temperature, iv) minimum temperature, v) rainfall and vi) type of biomass. Jiří Šťastný et al. in 2011 predicted the crop yield level using an artificial neural network. The input values were the density of nurslings per square meter and the average onion yield. The model was less complex than the other existing models and had higher accuracy, so that it is easy to use and its prediction is more accurate. In 2014, Saisunee Jabjone and Sura Wannasang developed a model that could predict the rice production of Phimai district, Thailand from the technique used for irrigation, rice breed, season, rice-field area and characteristics, cultivation technique and damage area. The output from the model was compared stepwise with the linear regression models, and the result obtained from the neural network was better than that of the linear regression method.

III. OBJECTIVE
The main objective of this paper is to develop a model to predict the tomato yield, and its deviation from the maximum possible amount of production, from the amount of fertilizers used by the farmers, the pH of the soil and the land available for tomato cultivation for each farmer.

IV. METHODOLOGY
The data were collected through an interview survey of randomly selected tomato cultivators from the Phesual region of Jorhat district.

A. Artificial Neural Network
Artificial Neural Networks are a family of statistical learning models inspired by the central nervous system of animals, particularly the brain. The brain learns from past experience. It is a complex network of neurons which process the signals received by the sensory organs and react according to the situation. Similarly, an artificial neural network is a system of interconnected neurons which exchange messages with each other. The connections between the neurons carry numeric weights, and these weights can be readjusted through training. Training is the process of learning a job by doing it repeatedly. When an artificial neural network model is trained, the predicted output obtained is compared with the actual output and the numeric weights are updated.
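The compare-and-update cycle described above can be illustrated with a minimal sketch. It assumes a single linear neuron, a squared-error loss and plain gradient descent; the function names, learning rate and sample values are hypothetical and are not taken from the paper, whose networks are larger and trained by backpropagation.

```python
def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias (one linear neuron)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def train_step(weights, bias, inputs, target, learning_rate=0.1):
    """Compare the prediction with the target and adjust the weights.

    The update follows the gradient of 0.5 * (prediction - target) ** 2.
    """
    error = predict(weights, bias, inputs) - target
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


def train(samples, n_inputs, epochs=200, learning_rate=0.1):
    """Repeat the compare-and-update cycle over all samples."""
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            weights, bias = train_step(weights, bias, inputs, target, learning_rate)
    return weights, bias


# Made-up, already normalized samples: (inputs, target). Not survey data.
samples = [([0.0, 0.0], 0.1), ([1.0, 0.0], 0.6), ([0.0, 1.0], 0.4), ([1.0, 1.0], 0.9)]
weights, bias = train(samples, n_inputs=2)
for inputs, target in samples:
    print(inputs, target, round(predict(weights, bias, inputs), 4))
```

Each call to train_step performs one comparison and one weight adjustment; repeating it over many passes is what drives the predicted outputs towards the actual ones.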
When an artificial neural network model is trained repeatedly, the numeric weights are updated until the predicted output and the actual output are similar, or until the error between the two is the least.

The fundamental unit of a neural network is known as a neuron. An artificial neural network is represented by a set of interconnected neurons which exchange messages with each other. The artificial neuron receives one or more inputs and combines them to produce an output. The inputs of each node are weighted and then summed. The sum is passed through an activation function, or transfer function, which is a non-linear function. Mathematically, for neuron k:

$$u_k = \sum_{j} w_{kj} x_j \qquad \text{and} \qquad y_k = \varphi(u_k + b_k)$$

where $x_j$ are the input signals, $w_{kj}$ are the synaptic weights of neuron k, $u_k$ is the linear combiner output due to the input signals, $b_k$ is the bias, $\varphi$ is the activation function and $y_k$ is the output signal of the neuron. The use of the bias $b_k$ has the effect of applying an affine transformation to the output $u_k$ of the linear combiner in the model of Figure 1, as shown by the following equation:

$$v_k = u_k + b_k$$

[Figure 1: A basic artificial neuron. Labelled parts: inputs, weights wij, sum, transfer function, output paths.]

Backpropagation algorithm
Backpropagation is a form of supervised learning. When using a supervised learning method, the network is provided with both sample inputs and the corresponding actual outputs. The outputs predicted by the network are compared against the actual outputs for the given inputs. The backpropagation method calculates the gradient of the loss function with respect to the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the loss function.

Feedforward backpropagation algorithm
A feedforward neural network consists of three kinds of layer. The first is the input layer, the second or middle part consists of one or more hidden layers, and the third is the output layer. Each of the input layer, hidden layers and output layer consists of a number of neurons. Every neuron of the input layer is connected to the neurons of the first hidden layer, every neuron of the first hidden layer is connected to the neurons of the second hidden layer, and so on. The neurons of the last hidden layer are connected to the neurons of the output layer.

[Figure 2: Typical feedforward neural network. Labelled parts: 1. Input layer, 2. Hidden layer, 3. Output layer, 4. Neurons.]

Cascade-forward backpropagation algorithm
A cascade-forward neural network also consists of three kinds of layer. The first is the input layer, the second or middle part consists of one or more hidden layers, and the third is the output layer. Each of the input layer, hidden layers and output layer consists of a set of neurons. Every neuron of the input layer is connected to each neuron of the hidden layer and of the output layer. Every neuron of the first hidden layer is connected to the neurons of the second hidden layer, and so on. The neurons of the last hidden layer are connected to the neurons of the output layer.

[Figure 3: Typical cascade-forward neural network. Labelled parts: 1. Input layer, 2. Hidden layer, 3. Output layer.]

B. Preprocessing of data
Data pre-processing is one of the many steps in data mining. It is done before the actual processing of the data and prepares the raw data for further processing.

Data Normalization
The pre-processing step used here is data normalization by feature scaling. The reason for using the feature scaling method is that gradient descent converges faster. Mathematically, the normalized data x' is given by:

$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$$

where
x is the input data of a parameter,
min(x) is the minimum input data of the parameter,
max(x) is the maximum input data of the parameter.

Deviation of tomato yield
The tomato cultivators are cultivating the variety Avinash-2. The main characteristic of this variety is high yield under controlled conditions. The yield of this variety is about 1200 quintal per hectare. For farmer i:

$$\text{Deviation of tomato yield} = 1200 \cdot (la)_i - ty_i$$

where
$(la)_i$ is the land available for tomato cultivation with each farmer,
$ty_i$ is the tomato yield from that particular land.

V. ANALYSIS AND DISCUSSIONS
Data were collected through an interview survey amongst the professional tomato cultivators of the Phesual region of Jorhat district. The collected data were normalized using the feature scaling method so that the network converges faster. The ANN consists of 3 layers. The first layer is the input layer. It consists of five neurons, viz. i) pH of soil, ii) amount of superphosphate used, iii) amount of potash used, iv) amount of urea used and v) land available for tomato cultivation. The second layer is a single hidden layer consisting of 10 neurons. The final layer is the output layer and it consists of two output neurons, viz. i) tomato yield and ii) deviation of tomato yield from the maximum. Table-I and Table-II give the predicted tomato yield and the predicted deviation of tomato yield respectively, each as obtained from the feed-forward neural network and from the cascade-forward neural network.

TABLE-I
PREDICTED TOMATO YIELD (quintal)

Farmer   Actual yield   Predicted yield    Predicted yield
                        (Feed-forward)     (Cascade-forward)
1        45             46.48              51.88
2        200            179.77             185.83
3        190            178.36             169.90
4        26.25          31.77              33.13
5        90             95.22              88.66
6        227.5          228.27             219.79
7        70             66.93              65.86
8        90             105.03             111.72
9        85             82.25              86.80
10       105            107.53             114.82
11       112.5          113.21             115.78
12       165            164.65             154.61
13       27.5           31.42              25.83
14       37.5           32.20              34.31
15       40             34.67              41.98
16       78.75          76.57              75.52
17       100            97.33              109.18
18       82.5           68.68              76.09
19       35             33.92              27.36
20       25             36.04              44.80
21       90             91.83              87.26
22       325            193.64             317.50
23       162.5          185.28             190.76
24       157.5          157.67             160.63
25       127.5          106.28             113.25
26       52.5           52.39              51.46
27       126            107.53             114.82
28       44             34.62              33.87
29       44             34.64              32.84
30       150            162.00             169.36
31       75             67.05              74.76
32       10             33.95              20.93
33       26.25          31.72              32.82
34       25             34.57              30.79
35       65             62.83              62.17
36       26.25          32.73              21.45
37       175            162.00             169.36
38       225            223.58             238.42
39       165            192.79             181.82
40       50             41.81              42.65
41       61.25          68.85              66.96
42       37.5           60.22              68.58
43       80             86.22              84.64
44       30             32.47              28.33
45       255            254.74             234.19
46       45             34.38              37.89
47       68.75          48.51              60.95
48       105            101.07             95.16
49       78.75          74.91              74.01
50       225            194.95             183.65

A scatter plot diagram is plotted for tomato yield by taking the actual values along the abscissa and the predicted values along the ordinate. The equation of best fit is given below for each network.

For the feed-forward neural network:
y = 0.8476x + 11.76
where
y is the predicted tomato yield,
x is the actual tomato yield.

[Figure 4: Line of best fit between the actual and predicted tomato yield from the feed-forward backpropagation algorithm. Scatter plot "Predicted vs Actual": actual total yield (quintal) on the horizontal axis, predicted total yield (quintal) on the vertical axis, with a linear trend line.]

From the scatter plot diagram the following points are observed:
1. The predicted output showed a linear relation with the actual output.
2. The co-efficient of determination for the line of best fit is 0.911, i.e. the predicted output from the model is taken to have 91.1% accuracy.

For the cascade-forward neural network:
y = 0.9476x + 4.6771
where
y is the predicted tomato yield,
x is the actual tomato yield.

[Figure 5: Line of best fit between the actual and predicted tomato yield from the cascade-forward backpropagation algorithm. Scatter plot "Predicted vs Actual": actual total yield (quintal) on the horizontal axis, predicted total yield (quintal) on the vertical axis, with a linear trend line.]

From the scatter plot diagram the following points are observed:
1. The predicted output showed a linear relation with the actual output.
2. The co-efficient of determination for the line of best fit is 0.9685. Hence the predicted output from the model is taken to have 96.85% accuracy.

TABLE-II
PREDICTED DEVIATION OF TOMATO YIELD (quintal)

Farmer   Actual deviation   Predicted deviation   Predicted deviation
                            (Feed-forward)        (Cascade-forward)
1        276.07             250.05                250.17
2        602.68             627.10                633.43
3        452.15             469.68                486.93
4        214.55             208.91                217.43
5        552.15             540.57                556.38
6        896.26             809.36                934.07
7        251.07             256.46                232.12
8        391.61             357.94                354.88
9        236.07             256.09                217.07
10       376.61             354.27                352.73
11       690.18             681.47                716.58
12       798.22             803.01                848.42
13       133.04             176.70                176.10
14       203.3              207.86                215.61
15       281.07             251.09                263.77
16       483.13             472.50                486.17
17       702.68             694.92                727.51
18       399.11             385.64                387.15
19       286.07             292.85                281.57
20       296.07             248.66                260.44
21       552.15             541.84                559.31
22       1280.36            816.87                1215.23
23       640.18             621.55                629.82
24       1287.33            1095.80               1163.51
25       354.11             356.13                353.81
26       429.11             399.84                409.48
27       355.61             354.27                352.73
28       277.07             268.14                272.69
29       277.07             271.33                273.91
30       652.68             643.59                646.28
31       406.61             385.61                388.62
32       150.54             186.30                185.73
33       214.55             209.30                217.90
34       296.07             278.44                276.51
35       256.07             255.56                236.02
36       214.55             252.64                237.17
37       627.68             643.59                646.28
38       979.02             815.56                994.58
39       798.22             771.93                802.76
40       351.34             336.78                342.29
41       500.63             476.86                495.56
42       444.11             385.63                395.20
43       562.15             544.08                564.37
44       210.8              219.57                224.67
45       708.22             726.94                778.18
46       276.07             257.92                268.22
47       332.59             316.59                326.22
48       456.88             472.60                463.78
49       483.13             473.36                488.02
50       738.22             770.58                801.27

A scatter plot diagram is plotted for the deviation of tomato yield by taking the actual values along the abscissa and the predicted values along the ordinate. The equation of best fit is given below for each network.

For the feed-forward neural network:
y = 0.8083x + 67.813
where
y is the predicted deviation of tomato yield,
x is the actual deviation of tomato yield.

[Figure 6: Line of best fit between the actual and predicted deviation of tomato yield from the feed-forward backpropagation algorithm. Scatter plot "Predicted vs Actual": actual deviation from optimal (quintal) on the horizontal axis, predicted deviation (quintal) on the vertical axis, with a linear trend line.]

From the scatter plot diagram the following points are observed:
1. The actual output shows a linear relation with the predicted output.
2. The co-efficient of determination for the line of best fit is 0.9341. Hence the predicted output from the model is taken to have 93.41% accuracy.
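The line of best fit and its co-efficient of determination can be recomputed from the tabulated values with an ordinary least-squares fit. The sketch below is only an illustration under that assumption: the function names are hypothetical, the paper does not state which tool produced its fits, and the sample input uses only the first five rows of Table-II, so it will not reproduce the values reported for all 50 farmers.

```python
def best_fit_line(actual, predicted):
    """Least-squares line: predicted = slope * actual + intercept."""
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(actual, predicted):
    """Co-efficient of determination of the fitted line."""
    slope, intercept = best_fit_line(actual, predicted)
    mean_y = sum(predicted) / len(predicted)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in predicted)
    return 1.0 - ss_res / ss_tot


# First five rows of Table-II (feed-forward column), used only as sample input.
actual = [276.07, 602.68, 452.15, 214.55, 552.15]
feed_forward = [250.05, 627.10, 469.68, 208.91, 540.57]

slope, intercept = best_fit_line(actual, feed_forward)
print(f"y = {slope:.4f}x + {intercept:.3f}")
print(f"R^2 = {r_squared(actual, feed_forward):.4f}")
```

Passing all 50 actual values together with one predicted column of Table-I or Table-II gives the corresponding fit for that network.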
For the cascade-forward neural network:
y = 0.9835x + 7.2749
where
y is the predicted deviation of tomato yield,
x is the actual deviation of tomato yield.

[Figure 7: Line of best fit between the actual and predicted deviation of tomato yield from the cascade-forward backpropagation algorithm. Scatter plot "Predicted vs Actual": actual deviation from optimal (quintal) on the horizontal axis, predicted deviation (quintal) on the vertical axis, with a linear trend line.]

From the scatter plot diagram the following points are observed:
1. The actual output shows a linear relation with the predicted output.
2. The co-efficient of determination for the line of best fit is 0.9853, i.e. the predicted output from the model is taken to have 98.53% accuracy.

VI. CONCLUSION
The purpose of the study was to model a network that would predict the tomato yield and its deviation in a specific region. Two different networks were created, one a feed-forward and the other a cascade-forward neural network. Both neural networks were trained by the backpropagation algorithm.

In case of the feed-forward neural network the following points are observed:
1. The predicted tomato yield and its deviation vary linearly with the actual tomato yield and its deviation respectively.
2. From the scatter plots it was found that the predicted tomato yield showed 91.1% accuracy and the predicted deviation showed 93.41% accuracy.

In case of the cascade-forward neural network the following points are observed:
1. The predicted tomato yield and its deviation vary linearly with the actual tomato yield and its deviation respectively.
2. From the scatter plots it was found that the predicted tomato yield showed 96.85% accuracy and the predicted deviation showed 98.53% accuracy.

The cascade-forward neural network can predict the tomato yield and its deviation more accurately than the feed-forward neural network because in the cascade-forward neural network the input layer is directly linked with the output layer. As a result the synaptic weights can be updated better, which is not possible in the feed-forward neural network. Hence it can be concluded that the cascade-forward neural network can be used efficiently and effectively for the prediction of tomato yield and its deviation.

REFERENCES
1. Temeyer Bradley R. and Gallus William A. Jr., "Using an Artificial Neural Network to Predict Parameters for Frost Deposition on Iowa Bridgeways".
2. Kaul M., Hill R.L., Walthall C., "Artificial neural network for corn and soybean prediction", Agricultural Systems 85 (2005) 1-18.
3. Ushada M., Murase H., "Identification of a Moss Growth System using an Artificial Neural Network Model", Biosystems Engineering 94 (2) (2006) 179-189.
4. Ji B., Sun Y., Yang S. and Wan J. (2007), "Artificial neural networks for rice yield prediction in mountainous regions", 249-261.
5. Rahman M.M., Bala B.K., "Modelling of jute production using artificial neural networks", Biosystems Engineering 105 (2010) 350-356.
6. Qiao D.M., Shi H.B., Pang H.B., Qi X.B., Plauborg F., "Estimating plant root water uptake using a neural network approach", Agricultural Water Management 98 (2) (2010) 251-260.
7. Wang Xin-Zheng, Duan Xiao-chen and Liu Jing-yan, "Application of Neural Network in the Cost Estimation of Highway Engineering", Journal of Computers, Vol. 5, No. 11, November 2010, 1762-1766.
8. Šťastný Jiří, Konečný Vladimír, Trenz Oldřich, "Agricultural data prediction by means of neural network", Agric. Econ. – Czech, 57, 2011 (7): 356–361.
9. Khan Zabir H., Alin Tasnim S. and Hussain Md. Akter, "Price Prediction of Share Market using Artificial Neural Network (ANN)", International Journal of Computer Applications (0975 – 8887), Volume 22, No. 2, May 2011, 42-47.
10. Dahikar Snehal S. and Rode Sandeep V. (2014), "Agricultural Crop Yield Prediction Using Artificial Neural Network Approach", International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering, Vol. 2, Issue 1, January 2014, 683-686.
11. Jabjone Saisunee and Wannasang Sura (2014), "Decision Support System Using Artificial Neural Network to Predict Rice Production in Phimai District, Thailand", International Journal of Computer and Electrical Engineering, Vol. 6, No. 2, April 2014, 162-166.
12. Tengeleng Siddi and Nzeukou Armand (2014), "Performance of Using Cascade Forward Back Propagation Neural Networks for Estimating Rain Parameters with Rain Drop Size Distribution", www.mdpi.com/journal/atmosphere, ISSN 2073-4433, 454-472.
13. http://www.cropnutrition.com/efu-soil-ph
14. http://www.indiaagronet.com/tomato/resources/3/seed1.htm