
Dr. Eick
COSC 6342 “Machine Learning” Assignment 2, Spring 2009
First Draft
Due: Tuesday, April 21, 11 p.m. (electronic submission); problem 9 is due Saturday, April 25, 11 p.m.
8. I---Topic8
Construct a one-dimensional classification dataset for which the leave-one-out
cross-validation error for 1NN is always 1; in other words, the 1NN algorithm never
predicts the held-out example correctly.
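A candidate dataset can be checked mechanically. The following is a minimal sketch of leave-one-out cross-validation for 1NN on one-dimensional data; the function name `loocv_error_1nn`, the tie-breaking rule (lowest index wins), and the sample data are assumptions made for illustration, and the sample is deliberately not a solution to the problem.

```python
def loocv_error_1nn(xs, labels):
    """Fraction of held-out points that 1NN misclassifies (1-D, absolute distance)."""
    errors = 0
    for i, (x, y) in enumerate(zip(xs, labels)):
        # Nearest neighbor among all points except the held-out one.
        # Ties are broken by the lowest index (an arbitrary, assumed choice).
        nearest = min(
            (j for j in range(len(xs)) if j != i),
            key=lambda j: abs(xs[j] - x),
        )
        errors += labels[nearest] != y
    return errors / len(xs)


# Hypothetical sanity check: two well-separated clusters, so every
# held-out point has a same-class nearest neighbor.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["A", "A", "A", "B", "B", "B"]
print(loocv_error_1nn(xs, labels))  # 0.0
```

A dataset that answers the problem should make the same function return 1.0.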
9. G---Topic8 (don’t start too late solving this problem!!)
a) Download the arsenic dataset arsenic_ds1_D1.txt ignoring the class label!
b) Compute the average 5-nearest neighbor distance, called d5, using Euclidean
distance
c) Compute and visualize the Gaussian kernel density function for 2σ²=0.5*d5,
2σ²=d5, and 2σ²=d5*2 (see Topic8d.ppt)
d) Compute and visualize the k-NN density function for k=3, k=5 and k=7 (see
Topic8d.ppt)
e) Analyze the differences in the 6 created density functions!
f) Explain the differences in the distance functions (try your best!)!
g) Submit a report that contains your software, visualizations, and answers to
questions e and f!
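As a starting point for steps b) to d), here is a minimal, standard-library-only sketch. It rests on several assumptions that should be checked against Topic8d.ppt: d5 is taken to be the distance to the 5th nearest neighbor averaged over all points, the Gaussian kernel density is the unnormalized sum of exp(-d²/(2σ²)) with the width parameter read as 2σ², and the k-NN density is k divided by N times the volume of the ball reaching the k-th neighbor. All function names and the sample points are hypothetical; parsing arsenic_ds1_D1.txt is left out because its layout is not described here.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kth_neighbor_distance(query, points, k, skip_self=False):
    dists = sorted(euclidean(query, p) for p in points)
    if skip_self:
        dists = dists[1:]  # drop the zero distance of the query to itself
    return dists[k - 1]


def average_knn_distance(points, k=5):
    # Assumed definition: mean over all points of the distance to the k-th neighbor.
    total = sum(kth_neighbor_distance(p, points, k, skip_self=True) for p in points)
    return total / len(points)


def gaussian_density(query, points, two_sigma_sq):
    # Unnormalized sum of Gaussian influences; two_sigma_sq plays the role of 2*sigma^2.
    return sum(math.exp(-euclidean(query, p) ** 2 / two_sigma_sq) for p in points)


def knn_density(query, points, k):
    # k / (N * volume of the d-ball whose radius is the k-th neighbor distance).
    # Raises ZeroDivisionError if that radius is 0 (e.g. duplicated points).
    d = len(query)
    r = kth_neighbor_distance(query, points, k)
    unit_ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return k / (len(points) * unit_ball * r ** d)


# Hypothetical 2-D points standing in for the arsenic dataset.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
          (2.0, 2.0), (3.0, 1.0), (2.0, 0.0)]
d5 = average_knn_distance(points, k=5)
for two_sigma_sq in (0.5 * d5, d5, 2 * d5):
    print(two_sigma_sq, gaussian_density((1.0, 0.5), points, two_sigma_sq))
for k in (3, 5, 7):
    print(k, knn_density((1.0, 0.5), points, k))
```

For the visualizations, the two density functions can be evaluated on a regular grid of query points and passed to any plotting tool.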
10. I---Topic 10
Assume the following dataset is given with two nominal attributes A and B, and 3
different classes C1, C2, and C3. Compute the information gain for A and B. Based on
your answers to the last question, which test should be used as the root of a decision tree?
A   B   Class
1   2   C3
1   1   C3
1   2   C1
1   2   C1
2   2   C1
2   1   C2
3   1   C2
3   1   C2
3   1   C2
3   2   C2
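Hand calculations for this problem can be cross-checked with a short script. The sketch below assumes the usual entropy-based definition, Gain(D, X) = H(D) - Σ_v (|D_v|/|D|) H(D_v) with logarithms to base 2; the names `rows`, `entropy` and `information_gain` are chosen here for illustration, and the rows are copied from the table above.

```python
import math
from collections import Counter

# (A, B, Class) rows copied from the table above.
rows = [
    (1, 2, "C3"), (1, 1, "C3"), (1, 2, "C1"), (1, 2, "C1"), (2, 2, "C1"),
    (2, 1, "C2"), (3, 1, "C2"), (3, 1, "C2"), (3, 1, "C2"), (3, 2, "C2"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the class column minus the weighted entropy after the split."""
    labels = [r[-1] for r in rows]
    groups = {}
    for r in rows:
        groups.setdefault(r[attr_index], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


for name, idx in (("A", 0), ("B", 1)):
    print(f"Gain({name}) = {information_gain(rows, idx):.4f}")
```

The attribute with the larger gain is the one an information-gain-based tree learner would test at the root.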
11. I---Topic13+14
a) Support vector machines maximize margins when creating hyperplanes. What is the
motivation for doing that? Why are large margins desirable?
b) What role does C play in the Soft Margin Hyperplane Approach (section 10.9.3 of the
textbook); what do slack variables measure? Assume the obtained hyperplane for a
dataset of 100 examples has the following values for the slack variables: ξ_1=2, ξ_2=3,
ξ_4=0.8, ξ_17=0.2; ξ_i is 0 for all other examples in the dataset; what does this mean?
c) Why do most support vector machines map examples to a higher dimensional
space?
d) What is a support vector? If we know what the support vectors are—how can this
knowledge be used to speed up support vector learning?
e) What are kernel functions? Why are kernel functions popular in conjunction with
support vector machines—what is their contribution in speeding up the learning process?
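As background for questions c) and e), the following minimal sketch illustrates the identity that kernel functions rely on, using one concrete case: for 2-D inputs, the degree-2 polynomial kernel K(x, z) = (x·z)² equals the dot product of the explicit feature vectors φ(x) = (x1², √2·x1·x2, x2²). The choice of this particular kernel, the function names, and the sample vectors are assumptions made for illustration and are not taken from the textbook.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """K(x, z) = (x . z)^2, computed directly in the original 2-D space."""
    return dot(x, z) ** 2


def poly2_features(x):
    """Explicit 3-D feature map phi with phi(x) . phi(z) = (x . z)^2 for 2-D x."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# Hypothetical sample vectors.
x, z = (1.0, 2.0), (3.0, -1.0)
print(poly2_kernel(x, z))                          # kernel value, no mapping needed
print(dot(poly2_features(x), poly2_features(z)))   # same value via the explicit map
```

Both lines print the same number up to floating-point rounding, although the first never constructs the higher dimensional vectors.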