1. Suppose you are given the following 2-class data, where "+" represents the positive class and "−" represents the negative class.

Which of the following would be the leave-one-out cross-validation error for k=5?
A) 2/14
B) 4/14
C) 6/14
D) 8/14
E) None of the above
Please select the correct answer and explain your work with words and/or a diagram.
Answer: B) For k=5, the leave-one-out cross-validation error is 4/14.

Which of the following values of k in k-NN would minimize the leave-one-out cross-validation error?
A) 3
B) 5
C) Both have the same error
D) None of these
Please select the correct answer and explain your work with words and/or a diagram.
Answer: B) k=5, i.e. 5-NN, has the lowest leave-one-out cross-validation error.

Which of the following is true about k in k-NN in terms of bias?
A) Increasing k increases the bias
B) Decreasing k increases the bias
C) Can't say
D) None of these
Please select the correct answer and explain your work with words and/or a diagram.
Answer: A) When we increase k, bias increases because the model becomes simpler; a lower k decreases bias.

Which of the following is true about k in k-NN in terms of variance?
A) Increasing k increases the variance
B) Decreasing k increases the variance
C) Can't say
D) None of these
Please select the correct answer and explain your work with words and/or a diagram.
Answer: B) When we decrease k, variance increases, because each prediction depends on fewer neighbors and is therefore more sensitive to the particular training points.

Why might using too large a value of k be bad in this dataset? Why might too small a value of k also be bad? Please answer and explain with words and/or a diagram.
Answer: When k is too large, the model over-generalizes: the neighborhood starts to include points that belong to the other class, so the majority vote produces errors. When k is too small, there is not much room for error: the prediction rests on only a few neighbors, so a single noisy point or any small change in the dataset can produce errors.

2. Suppose you are given the following data, where x and y are the 2 input variables and Class is the dependent variable.
Below is a scatter plot which shows the above data in 2D space.

Suppose you want to predict the class of a new data point x=1, y=1 using Euclidean distance in 3-NN. To which class does this data point belong?
A) + Class
B) − Class
C) Can't say
D) None of these
Answer: A) The 3 nearest points are all +, as shown by the red circle in the figure above.

In the previous question, suppose you now want to use 7-NN instead of 3-NN. To which class will x=1, y=1 belong?
A) + Class
B) − Class
C) Can't say
Answer: B) − Class. 4 out of the 7 nearest points are −, as shown by the blue circle in the figure.

3. Draw a picture of the false positive rate as a fraction, as was done for precision and recall in class:
FPR = FP / (FP + TN) = 3/10
Precision = TP / (TP + FP) = 5/8
Recall = TP / (TP + FN) = 5/12
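
As a quick check of the arithmetic in question 3, the three fractions can be recomputed from confusion-matrix counts. The counts below (TP=5, FP=3, FN=7, TN=7) are not stated explicitly in the answer; they are an assumption, inferred as the counts implied by 3/10, 5/8 and 5/12 taken together. The function name `rates` is made up for this sketch.

```python
from fractions import Fraction

# Assumed counts, inferred from the fractions in the answer to question 3:
# FPR 3/10 -> FP=3, TN=7; precision 5/8 -> TP=5, FP=3; recall 5/12 -> TP=5, FN=7
TP, FP, FN, TN = 5, 3, 7, 7


def rates(tp, fp, fn, tn):
    """Return FPR, precision and recall as exact fractions."""
    return {
        "FPR": Fraction(fp, fp + tn),        # false positives / all actual negatives
        "Precision": Fraction(tp, tp + fp),  # true positives / all predicted positives
        "Recall": Fraction(tp, tp + fn),     # true positives / all actual positives
    }


metrics = rates(TP, FP, FN, TN)
for name, value in metrics.items():
    print(f"{name} = {value} = {float(value):.3f}")
```

Note that each denominator is a sum, which is why the parentheses in FP / (FP + TN) matter: without them the expression would read as FP/FP + TN.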