Uploaded by Yuvansh saini

KNN-Regression and Classification

1. Consider the GIVEN data set, which can be used for regression using KNN (if the prediction is numerical).
For classification, use the same data and set the prediction attribute to categorical values.
2. First do regression, then do classification
3. Import the data into a data frame X
4. List a few records
5. Pre-process
1. Remove the attribute to be predicted from the data frame and add it to another data frame
called Y (house_value_median)
2. Show the count, mean, std deviation, min, max etc. metrics about the dataset in frame X
3. What do you infer from the values of the mean and std deviation?
4. If the mean or the std deviation is large, is it good to use the values directly without
scaling them? Justify
5. Split the data into train and test set using train_test_split of sklearn, set seed to some value,
let the test size be .25
6. Print the length of the train frame, test frame
7. Using StandardScaler of sklearn, transform (scale) the values of the X train and X test data frames
8. Show the count, mean, std deviation, min, max etc. metrics about the scaled dataset in frame
X train
9. Check the std deviation now and describe what has happened
10. Using KNeighborsRegressor with K=5, train the model
11. Predict on the test data
12. Use MAE, MSE and R2 score to evaluate the model
13. Show all the statistics of the predicted variable such as count, mean, std. deviation etc.
14. What do you infer from the mean and std deviation?
15. Is the MAE value good?
16. Find the best value of K (1 to 35) for the given data set
17. Plot the error values against K, as obtained in Q16
18. Report the lowest MAE and the K value at which it is obtained
19. Print the score of the classifier. What do you interpret from it?
20. Create a confusion matrix to know how much we got right or wrong for each class
21. Also print the precision, recall and f1 score. What do you infer from them?
22.
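
The regression steps listed above (separate the target, split, scale, fit with K=5, evaluate, then search K from 1 to 35) could be sketched roughly as follows. This is only an illustration under stated assumptions: the GIVEN data set is not reproduced here, so a small synthetic frame stands in for it. The feature column names (median_income, house_age, population), the sample size and the seed 42 are hypothetical; only house_value_median comes from the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the given data set. The three feature
# names are hypothetical and deliberately live on very different scales.
rng = np.random.default_rng(42)
n = 400
X = pd.DataFrame({
    "median_income": rng.uniform(1, 10, n),
    "house_age": rng.uniform(1, 50, n),
    "population": rng.uniform(100, 5000, n),
})
X["house_value_median"] = (
    40_000 * X["median_income"] - 500 * X["house_age"] + rng.normal(0, 20_000, n)
)

print(X.head())                      # list a few records
Y = X.pop("house_value_median")      # move the target out of X into Y
print(X.describe())                  # count, mean, std, min, max per column

# 75/25 split with a fixed seed so the result is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.25, random_state=42
)
print(len(X_train), len(X_test))

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler()
X_train_s = pd.DataFrame(scaler.fit_transform(X_train), columns=X.columns)
X_test_s = pd.DataFrame(scaler.transform(X_test), columns=X.columns)
print(X_train_s.describe())          # means near 0, std near 1

# Train with K=5 and predict on the test split.
knn = KNeighborsRegressor(n_neighbors=5)
knn.fit(X_train_s, y_train)
y_pred = knn.predict(X_test_s)

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE={mae:.2f}  MSE={mse:.2f}  R2={r2:.3f}")
print(pd.Series(y_pred).describe())  # statistics of the predicted variable

# Search K from 1 to 35 and record the test MAE for each value.
errors = {}
for k in range(1, 36):
    model = KNeighborsRegressor(n_neighbors=k).fit(X_train_s, y_train)
    errors[k] = mean_absolute_error(y_test, model.predict(X_test_s))

best_k = min(errors, key=errors.get)
print(f"lowest MAE {errors[best_k]:.2f} at K={best_k}")
```

For the plot, the `errors` dictionary can be handed to a plotting library, for example matplotlib's `plt.plot(list(errors.keys()), list(errors.values()))`. Note that `describe()` reports the sample standard deviation, so the scaled columns show a std slightly above 1 rather than exactly 1.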
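
For the classification part (score, confusion matrix, precision, recall and f1), one possible sketch is shown here. It rests on several assumptions that the assignment does not fix: the data is a synthetic stand-in with hypothetical feature names, the numeric target is turned into three categories (low, medium, high) by equal-frequency binning with `pd.qcut`, and K=5 is reused for `KNeighborsClassifier`.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the given data set; the feature names
# and the formula behind the house value are hypothetical.
rng = np.random.default_rng(42)
n = 400
X = pd.DataFrame({
    "median_income": rng.uniform(1, 10, n),
    "house_age": rng.uniform(1, 50, n),
    "population": rng.uniform(100, 5000, n),
})
house_value = (
    40_000 * X["median_income"] - 500 * X["house_age"] + rng.normal(0, 20_000, n)
)

# ASSUMPTION: make the prediction attribute categorical by cutting it into
# three equally sized bins. Any other binning rule would work the same way.
labels = ["low", "medium", "high"]
Y = pd.qcut(house_value, q=3, labels=labels).astype(str)

X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.25, random_state=42
)

# Scale with statistics learned from the training split only.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

clf = KNeighborsClassifier(n_neighbors=5)   # K=5 is an assumption here
clf.fit(X_train_s, y_train)
y_pred = clf.predict(X_test_s)

# For a classifier, score() returns the mean accuracy on the given data.
accuracy = clf.score(X_test_s, y_test)
print(f"accuracy={accuracy:.3f}")

# Rows are true classes, columns are predicted classes, in `labels` order.
cm = confusion_matrix(y_test, y_pred, labels=labels)
print(pd.DataFrame(cm, index=labels, columns=labels))

# Per-class precision, recall and f1 score.
report = classification_report(y_test, y_pred, labels=labels)
print(report)
```

The diagonal of the confusion matrix counts the correctly classified rows, so the accuracy printed by `score` equals the diagonal sum divided by the number of test rows.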