Data Mining Assignment: SVM, Clustering, LIBSVM

Data Mining and Knowledge Discovery (KSE525)
Assignment #4 (May 21, 2013, due: June 4)

1. [10 points] The effectiveness of the SVM depends on the selection of kernels.
(a) What is the kernel trick?
(b) Consider the quadratic kernel $K(u, v) = (u \cdot v + 1)^2$. Show that this is a kernel, i.e., $K(u, v) = \Phi(u) \cdot \Phi(v)$ for some $\Phi$. [Hint: I did the proof for $K(u, v) = (u \cdot v)^2$ in class.]

2. [5 points] What is boosting? Why does boosting improve classification accuracy?

3. [10 points] Discuss the advantages and disadvantages of the four clustering methods: k-means, EM, BIRCH, and DBSCAN. You may find it helpful to fill out the table below.

            Advantages      Disadvantages
k-means
EM
BIRCH
DBSCAN

4. [10 points] Suppose that the data mining task is to cluster points (with (x, y) representing a location) into three clusters, where the points are A1(2,10), A2(2,5), A3(8,4), B1(5,8), B2(7,5), B3(6,4), C1(1,2), C2(4,9). The distance function is the Euclidean distance. Suppose initially we assign A1, B1, and C1 as the center of each cluster, respectively. Use the k-means algorithm.
(a) Show the three cluster centers after the first round of execution.
(b) Show the final three clusters.

5. [15 points] LIBSVM is one of the most popular tools for the SVM. Let's practice using LIBSVM. Download the Wine data set available at the URL below. Then, arbitrarily divide wine.scale into a training set and a test set of approximately the same size. Run svm-train to build a classification model using the training set, and run svm-predict to test the accuracy of the model using the test set. Identify the misclassified objects in the test set and report the accuracy of the model. You need to mention which kernel is used (the default is the Gaussian kernel).

LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
Wine data set: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#wine
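For the data handling in Problem 5, the following is a minimal sketch of one possible approach, not a prescribed method. It assumes wine.scale is in the usual LIBSVM text format (one object per line, class label first) and that svm-predict writes one predicted label per line to its output file. The helper names split_libsvm_lines and misclassified, the fixed random seed, and the file names wine.train, wine.test, wine.model, and wine.out are all hypothetical choices made for illustration.

```python
import random


def split_libsvm_lines(lines, seed=0):
    """Shuffle LIBSVM-format lines and split them into two near-equal halves.

    Hypothetical helper: the assignment only asks for an arbitrary split,
    so a seeded shuffle is just one reproducible way to get it.
    """
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    rng = random.Random(seed)
    rng.shuffle(rows)
    half = len(rows) // 2
    return rows[:half], rows[half:]


def misclassified(test_lines, predicted_lines):
    """Return (accuracy, 0-based indices of misclassified test objects).

    Assumes the first whitespace-separated token on each test line is the
    true label and each prediction line starts with the predicted label.
    """
    truth = [float(ln.split()[0]) for ln in test_lines if ln.strip()]
    pred = [float(ln.split()[0]) for ln in predicted_lines if ln.strip()]
    if len(truth) != len(pred):
        raise ValueError("test set and prediction output differ in length")
    wrong = [i for i, (t, p) in enumerate(zip(truth, pred)) if t != p]
    return 1.0 - len(wrong) / len(truth), wrong


# Intended workflow (file names are assumptions, not part of the assignment):
#   1. Read wine.scale, call split_libsvm_lines, and write the two halves
#      to wine.train and wine.test, one object per line.
#   2. svm-train wine.train wine.model
#   3. svm-predict wine.test wine.model wine.out
#   4. Call misclassified on the lines of wine.test and wine.out.
```

Running svm-train without a kernel option uses the default Gaussian (RBF) kernel mentioned in the problem statement. The indices returned by misclassified refer to positions in the test file, so they can be mapped back to the corresponding lines when reporting the misclassified objects.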
