Master of Science in Analytics
Blue Book for Bulldozers – Predicting Bulldozer Sales Price
MSiA 420 Predictive Analytics
Emily Eunhee Ko, Yoojong Bang, Samuel Hillis Benedict Lim, Joon Lim

Contents
Executive Summary
Data Overview
Detailed Model Analysis
  Linear Regression
  Ridge Regression
  K-Nearest Neighbor Classification (KNN)
  Support Vector Machines Classification (SVM)
  Naïve Bayes Classification
  Neural Networks
  Boosting (GBM)
  Classification Trees (CART)
  GAM
  MARS
  Random Forest
  PPR
Conclusion

Executive Summary

Competition Overview

Competition Details
Kaggle is a crowdsourcing platform for predictive modeling and analytics competitions, where companies post problems and data sets for researchers and analysts all over the world to apply data mining techniques to. We entered the "Blue Book for Bulldozers" competition, where the goal is to predict the sale price of a particular piece of heavy equipment at auction based on the equipment's attributes, such as the year it was made and the number of machine hours on the equipment.

Forecasting Goal
The goal of the competition is to produce the most accurate forecasts of the equipment sale price. The accuracy of a model is evaluated using the Root Mean Squared Log Error (RMSLE), calculated using the formula:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\log(Y_i + 1) - \log(\hat{Y}_i + 1)\right)^2}$$

where $Y_i$ is the actual sale price of the $i$th piece of equipment and $\hat{Y}_i$ is the predicted sale price. Kaggle provides a test set on its website where participants can use their models to forecast sale price and submit their predictions online. Kaggle then calculates the RMSLE using the formula above and maintains a leaderboard ranking the performance of participants' models. Currently the top-ranked model has an RMSLE of 0.2209.
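For reference, the metric can be computed directly in R. The following is a minimal sketch; the function and argument names are our own, not part of the competition code.

```r
# Root Mean Squared Log Error, as defined above.
# 'actual' and 'predicted' are numeric vectors of sale prices (hypothetical names).
rmsle <- function(actual, predicted) {
  sqrt(mean((log(actual + 1) - log(predicted + 1))^2))
}

# Example usage with made-up prices:
# rmsle(c(24000, 66000), c(26000, 58000))
```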
Data Overview

There are initially 53 variables in this data set: one response variable, sale price, and 52 predictors. Most of the variables are categorical, but there are some continuous variables as well. The response variable, sale price, is continuous, and three of the 52 predictors are also continuous: MachineHoursCurrentMeter, YearMade, and Tire_Size. The full set of variables and their descriptions is given in Table 1.

Table 1. Data Description

SalesID – unique identifier of a particular sale of a machine at auction
MachineID – identifier for a particular machine; machines may have multiple sales
ModelID – identifier for a unique machine model (i.e. fiModelDesc)
datasource – source of the sale record; some sources are more diligent about reporting attributes of the machine than others. Note that a particular datasource may report on multiple auctioneerIDs.
auctioneerID – identifier of a particular auctioneer, i.e. the company that sold the machine at auction. Not the same as datasource.
YearMade – year of manufacture of the machine
MachineHoursCurrentMeter – current usage of the machine in hours at time of sale (saledate); null or 0 means no hours have been reported for that sale
UsageBand – value (low, medium, high) calculated by comparing this particular machine-sale's hours to the average usage for the fiBaseModel; e.g. 'Low' means this machine has fewer hours, given its lifespan, than the average for its fiBaseModel
Saledate – time of sale
Saleprice – cost of sale in USD
fiModelDesc – description of a unique machine model (see ModelID); concatenation of fiBaseModel, fiSecondaryDesc, fiModelSeries, and fiModelDescriptor
fiBaseModel – disaggregation of fiModelDesc
fiSecondaryDesc – disaggregation of fiModelDesc
fiModelSeries – disaggregation of fiModelDesc
fiModelDescriptor – disaggregation of fiModelDesc
ProductSize – don't know what this is
ProductClassDesc – description of 2nd-level hierarchical grouping (below ProductGroup) of fiModelDesc
State – US state in which the sale occurred
ProductGroup – identifier for top-level hierarchical grouping of fiModelDesc
ProductGroupDesc – description of top-level hierarchical grouping of fiModelDesc
Drive_System – machine configuration; typically describes whether 2 or 4 wheel drive
Enclosure – machine configuration; does the machine have an enclosed cab or not
Forks – machine configuration; attachment used for lifting
Pad_Type – machine configuration; type of treads a crawler machine uses
Ride_Control – machine configuration; optional feature on loaders to make the ride smoother
Stick – machine configuration; type of control
Transmission – machine configuration; type of transmission, typically automatic or manual
Turbocharged – machine configuration; engine naturally aspirated or turbocharged
Blade_Extension – machine configuration; extension of standard blade
Blade_Width – machine configuration; width of blade
Enclosure_Type – machine configuration; does the machine have an enclosed cab or not
Engine_Horsepower – machine configuration; engine horsepower rating
Hydraulics – machine configuration; type of hydraulics
Pushblock – machine configuration; option
Ripper – machine configuration; implement attached to machine to till soil
Scarifier – machine configuration; implement attached to machine to condition soil
Tip_control – machine configuration; type of blade control
Tire_Size – machine configuration; size of primary tires
Coupler – machine configuration; type of implement interface
Coupler_System – machine configuration; type of implement interface
Grouser_Tracks – machine configuration; describes ground contact interface
Hydraulics_Flow – machine configuration; normal or high flow hydraulic system
Track_Type – machine configuration; type of treads a crawler machine uses
Undercarriage_Pad_Width – machine configuration; width of crawler treads
Stick_Length – machine configuration; length of machine digging implement
Thumb – machine configuration; attachment used for grabbing
Pattern_Changer – machine configuration; can adjust the operator control configuration to suit the user
Grouser_Type – machine configuration; type of treads a crawler machine uses
Backhoe_Mounting – machine configuration; optional interface used to add a backhoe attachment
Blade_Type – machine configuration; describes type of blade
Travel_Controls – machine configuration; describes operator control configuration
Differential_Type – machine configuration; differential type, typically locking or standard
Steering_Controls – machine configuration; describes operator control configuration
Descriptive Statistics

Out of the initial 53 variables, we cut the data set down to five main predictor variables to predict the numeric response variable, SalePrice. We reduced the data set mainly because of computational issues: filling in the sparse data set grew it to about 200 MB, and R was not able to handle a data set of this size even for a simple algorithm such as linear regression. We therefore used the Gradient Boosting Method (GBM) algorithm to identify the most important variables in the data set and selected those as the main variables for this study. In all, we identified five predictors (YearMade, State, MachineHoursCurrentMeter, Enclosure, and fiProductClassDesc) and one response variable (Sale Price) to run the following 12 models, and we elaborate their descriptive statistics in this section. Of the five predictors, machine hours and year made are numeric variables, while the rest are categorical. Each algorithm used in this paper has specific limitations on the types of input it accepts, so for each algorithm a variation of this data set was used; the variables selected are briefly described in the "Detailed Model Analysis" section. Moreover, transformations of individual variables were performed as needed.

Sale Price (Response Variable)

Sale Price is the cost of sale (USD) at auction. This is a numeric variable, and its descriptive statistics follow.

Table 2. Descriptive Statistics – Sale Price
Min. 4,750 | 1st Qu. 14,500 | Median 24,000 | Mean 31,100 | 3rd Qu. 40,000 | Max. 142,000

The histogram of the response variable shows that it is right-skewed. Taking the log (base 10) of this variable reduces the skewness and makes the histogram closer to normal, so we decided to use log10 of the response variable.

Figure 1. Histogram – Sale Price
Figure 2. Histogram – Log10 Sale Price
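As an illustration of this exploration step, the following is a minimal R sketch of the summary statistics and the log transform; the file name Train.csv and the object name train are our own assumptions, not taken from the report.

```r
# Minimal sketch of the response-variable exploration described above.
# 'Train.csv' and the object name 'train' are hypothetical.
train <- read.csv("Train.csv", stringsAsFactors = TRUE)

summary(train$SalePrice)                              # descriptive statistics (Table 2)

hist(train$SalePrice, main = "Sale Price")            # right-skewed (Figure 1)
train$LogSalePrice <- log10(train$SalePrice)          # log10 transform of the response
hist(train$LogSalePrice, main = "Log10 Sale Price")   # closer to normal (Figure 2)
```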
MachineHoursCurrentMeter (Predictor)

MachineHoursCurrentMeter is the current usage of the machine in hours at the time of sale (saledate); null or 0 means no hours have been reported for that sale. This is a numeric variable, and its descriptive statistics follow.

Table 3. Descriptive Statistics – MachineHoursCurrentMeter
Min. 0 | 1st Qu. 0 | Median 0 | Mean 3,458 | 3rd Qu. 3,025 | Max. 248,300 | NA's 258,360

YearMade (Predictor)

YearMade is the year of manufacture of the machine and is another numeric variable in our data set. Its descriptive statistics follow.

Table 4. Descriptive Statistics – YearMade
Min. 1919 | 1st Qu. 1988 | Median 1996 | Mean 1994 | 3rd Qu. 2001 | Max. 2013 | NA's 38,185

Enclosure (Predictor)

Enclosure is one of the machine configurations and is a categorical variable. According to the data description on the Kaggle website, it indicates whether the machine has an enclosed cab or not. There are five levels under Enclosure, and it has two missing values.

Figure 3. Histogram – Enclosure

State (Predictor)

State is a categorical variable giving the state in which the sale occurred. There are in total 53 levels under State, including one 'unspecified' level, which we treated as missing. The number of missing values for the State variable is 2,801.

Figure 4. Histogram – State

fiProductClassDesc (Predictor)

Based on the data description from the Kaggle website, fiProductClassDesc is the description of the 2nd-level hierarchical grouping (below ProductGroup) of fiModelDesc. It is a categorical variable with 74 levels in total and no missing values.

Figure 5. Histogram – fiProductClassDesc

Detailed Model Analysis

Linear Regression

Model Setup
Parameters
For the linear regression model, there are no tuning parameters on which to perform a sensitivity analysis.

Data Description
The linear model allows for categorical variables; however, in our data set the categorical variables were relatively sparse, so when we performed 10-fold cross-validation there were cases where the training set did not contain certain categorical levels that the test set did, causing the prediction to fail. To take advantage of the available categorical data, we split the data set by product class and then, within each class, performed a linear regression of SalePrice on the standardized: 1) number of machine hours on the current meter, and 2) year made (a sketch of this per-class fit appears after the Results below).

Results
We ran this model for all 57 product classes; the RMSLE of the best model for each class ranged from 0.220586345 to 0.730789603, with an average RMSLE of 0.400408126. A detailed breakdown of the RMSLE for each product class can be found in the appendix. A closer look at the linear regression models created for each product class shows that the average coefficient on the number of machine hours was -245.7014, while the average coefficient on year made was 11,585.12. These coefficients make sense: the longer the machine has been used, the lower the price, and the 'younger' the machine, the more valuable it is.
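The per-class fitting described above can be sketched as follows in R. The column names follow Table 1, but the loop itself is our own illustrative reconstruction, not the exact project code.

```r
# Hedged sketch: fit a separate linear regression of SalePrice on the two
# standardized numeric predictors within each product class.
# Assumes 'train' holds the cleaned data with the columns named in Table 1.
fits <- lapply(split(train, train$fiProductClassDesc), function(d) {
  d <- d[complete.cases(d[, c("SalePrice", "MachineHoursCurrentMeter", "YearMade")]), ]
  d$hours <- scale(d$MachineHoursCurrentMeter)   # standardize predictors
  d$year  <- scale(d$YearMade)
  lm(SalePrice ~ hours + year, data = d)
})

# Average coefficients across the per-class models:
coefs <- t(sapply(fits, coef))
colMeans(coefs)
```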
Ridge Regression

Model Setup
Parameters
For the ridge regression model, we ran cross-validation over 20 different lambda values ranging from 0.0004510930 to 1.

Data Description
As with the linear model, the sparse categorical variables caused problems in 10-fold cross-validation (training folds missing levels present in the test fold), so we again split the data set by product class and, within each class, fit a ridge regression of SalePrice on the standardized: 1) number of machine hours on the current meter, and 2) year made.

Results
The results from the ridge regression were significantly worse than those from the simple linear regression. We ran this model for all 57 product classes; the RMSLE of the best model for each class ranged from 0.511227413 to 1.818759302, with an average RMSLE of 1.01693174. A detailed breakdown of the RMSLE for each product class can be found in the appendix. This result could stem from the fact that there is very little correlation between the two predictor variables, so the additional bias ridge regression places on the coefficients only weakened the predictions. It is also important to note that the lambda with the lowest RMSLE for all 57 product classes was the smallest lambda, 0.00045, implying that the models would likely have been better off without the additional constraint on the size of the coefficients.

K-Nearest Neighbor Classification (KNN)

Model Setup
Parameters
For the KNN model, we experimented with a range of 3 to 10 nearest neighbors.

Data Description
The KNN model only allows numerical variables, so we were unable to include the categorical variables in the model. However, we split the data set by product class and then found the number of nearest neighbors that minimized the RMSLE for each product class. The variables used for each subset of data were: 1) number of machine hours on the current meter, and 2) year made.

Results
We ran this model for all 57 product classes; the RMSLE of the best model for each class ranged from 0.206888639 to 0.657082542, with an average RMSLE of 0.34215316. A detailed breakdown of the RMSLE for each product class can be found in the appendix. For most of the 57 product classes, the KNN algorithm selected 9 or 10 nearest neighbors as the value that minimizes RMSLE.

Number of nearest neighbors, K:                 5   6   7   8   9   10
Product classes with that optimal K:            1   1   4   6   12  33

Support Vector Machines Classification (SVM)

Model Setup
Parameters
An SVM model represents the examples as points in space, mapped so that the examples of the separate categories are divided by a gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall on. In addition to performing linear classification, SVMs can efficiently perform non-linear classification using the kernel trick, which implicitly maps the inputs into high-dimensional feature spaces. This allows the SVM algorithm to fit the maximum-margin hyperplane in a transformed feature space. In this prediction model, in addition to the linear kernel, we focused on three types of kernels: sigmoid, polynomial, and radial. Each kernel transforms the feature space in a different non-linear way so as to capture some of the non-linearity in the data set. The tunable parameters for each kernel are:

• Polynomial: degree and gamma
• Sigmoid: gamma and coefficient
• Radial: gamma
• Linear: gamma

The values of gamma we experimented with range from 10^-6 to 0.1, the degree ranges from 2 to 6, and the coefficients range from 0 to 3. We selected these ranges because they are commonly used by other researchers in the field. We tuned the model over these parameters, and for each combination we performed 10-fold cross-validation to obtain the average cross-validated error for each kernel.

Data Description
Due to the size of the data, we were unable to run SVM on the entire 401,125 observations, so we created subsets of the data based on the 57 product class descriptions and performed cross-validation within each subset to find the optimal parameters for each product class. The variables used for each subset of data were: 1) state of sale, 2) type of enclosure, 3) number of machine hours on the current meter, and 4) year made.

Results
We ran this model for all 57 product classes; the RMSLE of the best model for each class ranged from 0.208401176 to 0.519523311, with an average RMSLE of 0.329685653. A detailed breakdown of the RMSLE for each product class can be found in the appendix.
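A hedged sketch of the kernel tuning described above is given below, using the e1071 package. The package choice, the example product class, and the object names are our assumptions; the report does not state which SVM implementation was used.

```r
# Hedged sketch of the SVM tuning step for one product-class subset.
# Assumes the 'e1071' package and a data frame 'train' with the columns from Table 1.
library(e1071)

d <- subset(train,
            fiProductClassDesc == "Wheel Loader - 110.0 to 120.0 Horsepower",
            select = c(SalePrice, State, Enclosure, MachineHoursCurrentMeter, YearMade))
d <- na.omit(d)

# 10-fold cross-validation (the default in tune.control) over the radial kernel's gamma;
# the polynomial and sigmoid kernels would add 'degree' and 'coef0' grids.
tuned <- tune.svm(SalePrice ~ ., data = d,
                  kernel = "radial",
                  gamma  = 10^seq(-6, -1))
tuned$best.parameters
```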
Neural Networks

Model Setup
Parameters
Data Description
Results

Boosting (GBM)

Model Setup
Parameters
The parameters we chose to vary for cross-validation purposes were interaction depth and shrinkage. An interaction depth of 1 implies the model is simply additive, while a depth of k implies that interactions among combinations of up to k variables may be present; we explored interaction depths up to 4. The shrinkage parameter is essentially the lambda value used when developing trees for the model; we tested shrinkage values {0.1, 0.2, ..., 1.0}. For each model we also chose to fit 100 trees; we determined heuristically that this was sufficient by testing the prediction power of models with 10, 100, and 1,000 trees. While 1,000 trees performed slightly better, the marginal improvement did not justify the large increase in computational time given the size of our data set.

Data Description
The generalized boosting model can handle both quantitative and categorical data. Additionally, the "gbm" function in R allows data sets that contain null values, which was especially useful since many of our predictors were very sparse. We began with 52 predictor variables and reduced the set to 40 by removing meaningless and duplicated variables. From these 40 variables, several initial boosting models were fit to determine which variables were useful, and we chose to retain all variables with a relative influence greater than 0.1; this amounted to 24 predictors.

Results
For the 30 different combinations of shrinkage parameter and interaction depth, we ran 10-fold cross-validation to estimate the RMSLE of each pairing. The largest RMSLE came from the model with a shrinkage of 0.1 and an interaction depth of 2 (0.1597), while the smallest came from the model with a shrinkage of 0.6 and an interaction depth of 4 (0.1447). The average value across all combinations was 0.1483. Even so, the relative influence of a few predictors (in order: product class, year made, and enclosure type) dominates that of the remaining 21 variables in the model.
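A hedged sketch of one fit from the grid described above (shrinkage 0.6, interaction depth 4, 100 trees) is shown below. The data frame name train24 (the 24 retained predictors plus the log sale price) is our own placeholder.

```r
# Hedged sketch of a single GBM fit from the tuning grid described above.
# Assumes the 'gbm' package and a hypothetical data frame 'train24'.
library(gbm)

fit <- gbm(LogSalePrice ~ ., data = train24,
           distribution      = "gaussian",
           n.trees           = 100,
           interaction.depth = 4,
           shrinkage         = 0.6,
           cv.folds          = 10)        # 10-fold cross-validation

summary(fit)                   # relative influence of each predictor
gbm.perf(fit, method = "cv")   # CV-optimal number of trees
```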
Classification Trees (CART)

Data Description
Since a regression tree can handle categorical variables with many categories (more than 100 for some variables) and variables with many missing values (more than 75% null for some variables), we decided to use all 52 predictor variables and prune the tree afterwards to manage overfitting.

Model Setup
In R, the "rpart" package uses the complexity parameter Cp to penalize the tree and prevent overfitting. As the pruning plot shows, the relative error converges to about 0.43 once Cp reaches 0.1, so we chose Cp = 0.1 to prune the tree; the corresponding pruned tree has 8 nodes. In the final regression tree, fiProductClassDesc is the most influential variable for determining the sale price of a bulldozer: the first split divides the data based on whether the product class description falls in the level group coded "abcdefiprtvwBCDEFGHIJOQSZZ" (rpart's letter coding of factor levels) or not. From the plot, we can clearly see that fiProductClassDesc, YearMade, and Enclosure are significant predictors of sale price.

Results
With this regression tree model, we ran repeated 10-fold cross-validation. This was manageable because each 10-fold cross-validation took only 115.892 seconds, slightly less than two minutes. Interestingly, the RMSLE did not vary much across repetitions, and the residual plot shows the errors are randomly distributed. The repeated 10-fold cross-validated RMSLE from the regression tree is 0.3318572.
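The tree fitting and Cp-based pruning can be sketched as follows; the initial growth setting (cp = 0.001) and the use of the log10 response are our assumptions.

```r
# Hedged sketch of the regression tree and Cp-based pruning described above.
# Assumes the 'rpart' package and the full data frame 'train'.
library(rpart)

tree <- rpart(log10(SalePrice) ~ ., data = train,
              method  = "anova",
              control = rpart.control(cp = 0.001, xval = 10))  # grow a large tree first

plotcp(tree)                     # cross-validated relative error vs. Cp
pruned <- prune(tree, cp = 0.1)  # prune at Cp = 0.1, as chosen in the report
plot(pruned); text(pruned)       # final 8-node tree
```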
GAM

Model Setup

Data Description
The generalized additive model only allows numerical variables, just like the KNN model, so we could not include categorical variables. However, we split the data set by product class and then found the RMSLE for each product class. The variables used for each subset of data were: 1) number of machine hours on the current meter, and 2) year made.

Results
We ran the GAM for all 57 product classes and found that the generalized additive model predicted better than linear regression and ridge regression on our data, but was not very different from K-nearest neighbors. The 10-fold cross-validated RMSLE from GAM ranged from 0.214957079 to 0.683211532, with an average of 0.35196.

MARS

Model Setup
Multivariate adaptive regression spline (MARS) models use additive local linear regression models that can handle both numerical and categorical data. The technique creates basis functions for each variable called "hinge functions"; these are essentially piecewise linear functions, and the value at which the piecewise function is divided is referred to as the "knot". Like most other non-parametric models, MARS seeks to minimize the sum of squared errors and is fit iteratively; the predicted value is a weighted sum of these hinge functions. The algorithm consists of a forward and a backward pass, similar to CART in that the forward pass is meant to overfit the data while the backward pass prunes the least effective hinge functions. MARS software contains a built-in cross-validation technique that is used to determine when the backward pass should stop.

Data Description
Unfortunately, the R implementation of multivariate adaptive regression splines cannot handle null values. For this reason, we subset our predictor variables to the five least sparse: state, enclosure type, machine hours, year made, and product class. Of the roughly 400,000 observations, only about 120,000 had non-null values for all of these variables. The only parameter we considered modifying for the MARS model was the number of interaction terms; as in other models, this is the maximum number of variables allowed to interact, and values 1 through 4 were tested using 10-fold cross-validation.

Results
For interaction degrees 1 to 4, we ran 10-fold cross-validation to estimate the RMSLE of each setting. The largest RMSLE came from the model with an interaction depth of 1 (0.1512), while identical RMSLE values were found for interaction degrees of 2, 3, and 4 (0.1497). The average value across all settings was 0.1501.

Random Forest

Model Setup
Recursive partitioning methods are among the most popular and widely used supervised learning methods for nonparametric regression and classification. In theory, random forests are very powerful since they can handle large numbers of predictor variables even in the presence of complex interactions.

To use random forests in R we have two packages, "randomForest" and "party", which have slightly different characteristics and different abilities to handle our data. The random forest in the party package uses the unbiased conditional inference tree algorithm, which enables learning unbiased forests. It is important to note that the randomForest and party packages also differ in their variable importance measures: randomForest uses the Gini importance, based on the Gini gain criterion employed in most traditional classification algorithms, while party uses conditional importance. Generally speaking, the party package's random forest can be treated as an updated version of the traditional random forest algorithm offered by the randomForest package.

Data Description
The randomForest package does not accept null values, and categorical variables must have fewer than 8 categories, but it offers parallel computing through the 'foreach' package, which significantly reduces running time. The party package's random forest can handle missing values and many categories but takes a long time to run. Random forests from both packages require a large amount of memory and high computing power. We tested both random forest models on a 2012 MacBook Pro Retina; since we have a considerably large sample of 401,125 observations, this laptop could not handle more than 4 variables due to RAM and running-time issues. (Even the random forest benchmark given by Kaggle uses only one predictor variable.) We therefore chose 4 variables based on the CART and GBM variable importance results and the presence of null values: 1) MachineID, 2) ProductGroup, 3) YearMade, and 4) Saledate. As the variable importance plot shows, ProductGroup is the most important variable, followed by YearMade and MachineID; Saledate is the least significant of our 4 variables for predicting sale price.

Results
We ran 10-fold cross-validation on the random forest from both packages: the randomForest package gave an RMSLE of 0.4819, while the party package gave 0.4889.
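A hedged sketch of the randomForest fit on the four selected variables is shown below. Saledate is assumed to have already been converted to a numeric value (e.g. days since a reference date), and the object names are our own.

```r
# Hedged sketch of the random forest fit on the four selected variables.
# Assumes the 'randomForest' package; column names follow the report.
library(randomForest)

rfdat <- na.omit(train[, c("SalePrice", "MachineID", "ProductGroup", "YearMade", "Saledate")])

rf <- randomForest(log10(SalePrice) ~ MachineID + ProductGroup + YearMade + Saledate,
                   data = rfdat, ntree = 500, importance = TRUE)

varImpPlot(rf)   # importance plot: ProductGroup, then YearMade, MachineID, Saledate
```

The party-package equivalent would fit the forest with cforest() and compute conditional variable importance with varimp(..., conditional = TRUE).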
PPR

Model Setup
Parameters
Data Description
Results

Conclusion

The Kaggle competition is a fierce one, with sophisticated data scientists and analysts across the world using cutting-edge algorithms and techniques to achieve the best score. In this project we attempted to apply as many of the algorithms we learned in class (and more) as possible, to get comfortable with the data set and understand the intricacies of the problem. One major issue we faced was the size and sparseness of the data: R was unable to run many of the algorithms without running out of memory. Moreover, some algorithms were limited in whether they could take categorical data, numeric data, or both. Therefore, for the purposes of this project, we ran three types of analysis for different types of algorithms:
1) Prediction using a subset of variables on all 400,000 observations
2) Prediction using all 4 variables, splitting on one of the categorical variables into 57 subsets
3) Prediction using 2 continuous variables, splitting on one of the categorical variables into 57 subsets
This approach allowed us to explore as many predictive analytics techniques as possible, giving us a sense of which algorithms would be most effective going forward and what approach we could take after accounting for data and computational limitations.

We summarize our findings as follows.

Key results:
1) GBM and MARS performed the best (algorithms using the first type of analysis), with RMSLEs of 0.1447 and 0.1497 respectively.
2) We are able to identify which algorithm works best for each of the 57 subsets: given a new observation, we can identify which product class it belongs to and use the algorithm with the lowest RMSLE. The best algorithm for each product class can be found in the appendix.

In conclusion, we have taken a deep dive into exploring potential predictive methods for the Kaggle data set. We find that even after removing many of the sparse categorical variables, we are still able to get good results after cross-validation within our training set. Future steps would include [Joon to add in some sentences]

Appendix
Per-product-class model comparison. For each of the 57 product classes, the appendix table reports the tuned SVM kernel and parameters, the optimal number of KNN neighbors, the best ridge lambda, the best GAM and neural network parameter settings, the best number of PPR terms, the 10-fold cross-validated RMSLE of each method (linear regression, ridge regression, GAM, neural networks, PPR, SVM, and KNN), and the best-performing model for that class with its RMSLE. The best ridge lambda was 0.00045109 for every class. For example, for the class "Wheel Loader - 110.0 to 120.0 Horsepower" the best model is an SVM with a radial kernel (gamma 0.1), with an RMSLE of 0.33575332.