Bonus Assignment 1
Results of Cluster-wise Regression R Code Execution

For this assignment, I used the flexmix package to perform cluster-wise regression on the car.test.frame data from the rpart package. I first fit a cluster-wise regression with a single cluster and then with 2, 3, 4 and finally 5 clusters. I stopped at 5 because, among the models tried, the AIC and BIC values are lowest with 5 clusters. The model fits are shown below. The coefficient estimates seem reasonable: an increase in mileage (miles per gallon) translates to a decrease in price, and an increase in horsepower generally results in an increase in price (the HP coefficient is positive in four of the five components and close to zero in component 4). I used all continuous predictors for the cluster-wise regression: Mileage, Weight, Disp. and HP.

> library(ISLR)
> library(flexmix)
> library(rpart)
>
> cartestframe <- car.test.frame
> str(cartestframe)
'data.frame':	60 obs. of  8 variables:
 $ Price      : int  8895 7402 6319 6635 6599 8672 7399 7254 9599 5866 ...
 $ Country    : Factor w/ 8 levels "France","Germany",..: 8 8 5 4 3 6 4 5 3 3 ...
 $ Reliability: int  4 2 4 5 5 4 5 1 5 NA ...
 $ Mileage    : int  33 33 37 32 32 26 33 28 25 34 ...
 $ Type       : Factor w/ 6 levels "Compact","Large",..: 4 4 4 4 4 4 4 4 4 4 ...
 $ Weight     : int  2560 2345 1845 2260 2440 2285 2275 2350 2295 1900 ...
 $ Disp.      : int  97 114 81 91 113 97 97 98 109 73 ...
 $ HP         : int  113 90 63 92 103 82 90 74 90 73 ...
>
> # Model with 1 cluster
> set.seed(1)
> model1 <- flexmix(Price ~ Mileage + Weight + Disp. + HP, data=cartestframe, k=1)
> summary(model1)

Call:
flexmix(formula = Price ~ Mileage + Weight + Disp. + HP, data = cartestframe, k = 1)

       prior size post>0 ratio
Comp.1     1   60     60     1

'log Lik.' -555.404 (df=6)
AIC: 1122.808   BIC: 1135.374

> parameters(model1, component=1)
                      Comp.1
coef.(Intercept)  822.181804
coef.Mileage     -165.400112
coef.Weight         4.602914
coef.Disp.        -40.567694
coef.HP            70.908074
sigma            2642.448365
>
> # Model with 2 clusters
> set.seed(1)
> model2 <- flexmix(Price ~ Mileage + Weight + Disp. + HP, data=cartestframe, k=2)
> summary(model2)

Call:
flexmix(formula = Price ~ Mileage + Weight + Disp. + HP, data = cartestframe, k = 2)

       prior size post>0 ratio
Comp.1   0.7   46     55 0.836
Comp.2   0.3   14     55 0.255

'log Lik.' -538.7509 (df=13)
AIC: 1103.502   BIC: 1130.728

>
> # Model with 3 clusters
> set.seed(1)
> model3 <- flexmix(Price ~ Mileage + Weight + Disp. + HP, data=cartestframe, k=3)
> summary(model3)

Call:
flexmix(formula = Price ~ Mileage + Weight + Disp. + HP, data = cartestframe, k = 3)

       prior size post>0 ratio
Comp.1 0.150    7     37 0.189
Comp.2 0.472   32     52 0.615
Comp.3 0.378   21     47 0.447

'log Lik.' -528.0994 (df=20)
AIC: 1096.199   BIC: 1138.086

>
> # Model with 4 clusters
> set.seed(1)
> model4 <- flexmix(Price ~ Mileage + Weight + Disp. + HP, data=cartestframe, k=4)
> summary(model4)

Call:
flexmix(formula = Price ~ Mileage + Weight + Disp. + HP, data = cartestframe, k = 4)

       prior size post>0 ratio
Comp.1 0.387   29     42 0.690
Comp.2 0.400   21     52 0.404
Comp.3 0.213   10     41 0.244

'log Lik.' -527.0793 (df=20)
AIC: 1094.159   BIC: 1136.046

>
> # Model with 5 clusters. Lowest AIC and BIC values compared to the cluster-wise regressions above
> set.seed(1)
> model5 <- flexmix(Price ~ Mileage + Weight + Disp. + HP, data=cartestframe, k=5)
> summary(model5)

Call:
flexmix(formula = Price ~ Mileage + Weight + Disp. + HP, data = cartestframe, k = 5)

       prior size post>0 ratio
Comp.1 0.109    7      7 1.000
Comp.2 0.307   17     37 0.459
Comp.3 0.137    9     12 0.750
Comp.4 0.174    9     42 0.214
Comp.5 0.272   18     28 0.643

'log Lik.' -490.3728 (df=34)
AIC: 1048.746   BIC: 1119.953

>
> # The parameters are aligned with the expected relationships.
> # For example, the coefficient of Mileage is negative in all 5 clusters. That means an increase in gas mileage is associated with a lower vehicle price.
> parameters(model5, component=1)
                        Comp.1
coef.(Intercept) 24504.8392315
coef.Mileage      -788.9528115
coef.Weight          0.5570891
coef.Disp.         -75.3209703
coef.HP            140.8121374
sigma                8.9184939
> parameters(model5, component=2)
                      Comp.2
coef.(Intercept) 6916.112200
coef.Mileage     -223.699851
coef.Weight         3.995574
coef.Disp.        -13.787570
coef.HP             2.370265
sigma             351.799042
> parameters(model5, component=3)
                        Comp.3
coef.(Intercept)  1.021417e+04
coef.Mileage     -1.604458e+02
coef.Weight      -7.851282e-02
coef.Disp.       -1.899640e+01
coef.HP           5.976686e+01
sigma             8.845261e+01
> parameters(model5, component=4)
                        Comp.4
coef.(Intercept)  6301.9717124
coef.Mileage      -668.1869203
coef.Weight         18.4157791
coef.Disp.        -182.2142509
coef.HP             -0.1179906
sigma             1675.9284938
> parameters(model5, component=5)
                       Comp.5
coef.(Intercept) 12014.705768
coef.Mileage      -431.141374
coef.Weight          1.340096
coef.Disp.          -1.676479
coef.HP             60.965397
sigma              275.968118
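
As a cross-check on the model comparison, the AIC and BIC values reported above can be recomputed in base R from the log-likelihoods and degrees of freedom, using the standard definitions AIC = -2 logLik + 2 df and BIC = -2 logLik + df log(n). This is only a sketch: the data frame `fits` and the variables `best_aic` and `best_bic` are names introduced here for illustration, and the numbers are copied by hand from the summaries above rather than extracted from the fitted objects.

```r
# Log-likelihoods and degrees of freedom copied from the flexmix summaries above.
# Note: the k = 4 run reports only three components, hence the same df (20) as k = 3.
fits <- data.frame(
  k      = 1:5,
  logLik = c(-555.404, -538.7509, -528.0994, -527.0793, -490.3728),
  df     = c(6, 13, 20, 20, 34)
)

n <- 60  # number of observations in car.test.frame, per str() above

# Standard information criteria: smaller is better
fits$AIC <- -2 * fits$logLik + 2 * fits$df
fits$BIC <- -2 * fits$logLik + log(n) * fits$df

print(round(fits, 3))

# Requested number of clusters with the lowest value of each criterion
best_aic <- fits$k[which.min(fits$AIC)]
best_bic <- fits$k[which.min(fits$BIC)]
cat("Lowest AIC at k =", best_aic, "\n")
cat("Lowest BIC at k =", best_bic, "\n")
```

Both criteria select k = 5 among the five fits, which matches the choice made above. BIC penalizes the 34 parameters of the 5-cluster model more heavily than AIC does, so its margin over the 2-cluster model is much smaller, and larger values of k were not tried.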