Machine Learning for Language Technology (2015) – DRAFT July 2015
Lecture 03: Lab Assignment
Weka: Decision Trees (1): Reading the output

Acknowledgements: This lab assignment is based on the content of the Weka book. Tasks have been borrowed from Martin D. Sykora's tutorials (<http://homepages.lboro.ac.uk/~comds2/coc131/>).

Required Reading for this Lab Assignment
- Daume III (2014): 10-16
- Witten et al. (2011): Ch 17: 562-565

Note: the datasets can be downloaded from here:
<http://stp.lingfil.uu.se/~santinim/ml/2015/datasets/>

Free material: Free Weka book (2005), 2nd edition:
<http://home.etf.rs/~vm/os/dmsw/Morgan.Kaufman.Publishers.Weka.2nd.Edition.2005.Elsevier.pdf>

Additional reading (optional):
Witten et al. (2011):
- Section 4.3: Divide-and-Conquer: Constructing Decision Trees;
- Section 4.4: Covering Algorithms: Constructing Rules.

Learning objectives
In this lab assignment you are going to:
- experience supervised machine learning classification;
- use a Weka implementation of the decision tree classifier called J48;
- use a decision tree classifier on two different datasets;
- familiarize yourself with the presentation of the results in Weka.

Tasks
G tasks: please provide comprehensive answers to all the questions below.

(1) Start Weka, launch the Explorer window and select the "Preprocess" tab. Open the iris dataset. Select the "Classify" tab. Under "Classifier", select J48. What main parameters can be specified for this classifier?

(2) Under "Test options", select "Cross-validation" and, under "More options", check "Output predictions". Click "Start" to train the model. You should see a stream of output appear in the window named "Classifier output". What does each of the following sections tell you about the model? (An optional programmatic sketch of this run is given in the appendix at the end of this handout.)
(a) "Predictions on ..."
(b) "Summary"
(c) "Detailed accuracy by class"
(d) "Confusion matrix"

(3) Go to the graphical representation of the decision tree; it can be displayed in a pop-up tree visualizer. Which feature is tested at the root node, i.e. which is the most discriminative feature?

(4) Once you have finished with the iris dataset, repeat the same steps for the English past tense dataset. What is the performance (accuracy, precision/recall, and F-measure) of the decision tree classifier on this dataset? Try to explain why you get this performance on the past tense dataset (suggestion: look at the distribution of the classes and analyse the confusion matrix; the appendix also recalls how precision, recall and F-measure are computed from the confusion matrix).

(5) Theoretical question: what is a loss function? Give an informal definition and example(s).

VG tasks: please provide comprehensive answers to all the questions below.

(6) Under "Result list" you should see the model created at each run. Right-click on the model created for the iris dataset and select "Visualize classifier errors". Points marked with a square are errors, i.e. incorrectly classified instances. How do you think the classifier performed? Once you have finished with the iris dataset, repeat the same action with the English past tense dataset. How do you think the classifier performed on this larger dataset?

(7) Analyse the graphical representations of the decision trees for both the iris dataset and the English past tense dataset. What do you notice? Describe what you see and interpret the trees.

To be submitted
A one-page written report containing reasoned answers to the questions above and a short section in which you summarize your reflections and experience. Submit the report to santinim@stp.lingfil.uu.se no later than 22 Nov 2015.
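
Appendix (optional, for reference)

The following is a minimal sketch of how the cross-validation run from task (2) could be reproduced programmatically with the Weka 3 Java API instead of the Explorer. The file name iris.arff and the class name J48CrossValidation are assumptions made for illustration only; you are not required to use this code for the assignment.

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class J48CrossValidation {
        public static void main(String[] args) throws Exception {
            // Load the dataset (path assumed; point it at your local copy of iris.arff)
            Instances data = new DataSource("iris.arff").getDataSet();
            // In these ARFF files the class attribute is the last one
            data.setClassIndex(data.numAttributes() - 1);

            // J48 with the defaults shown in the Explorer's parameter panel
            J48 tree = new J48();
            tree.setConfidenceFactor(0.25f); // pruning confidence
            tree.setMinNumObj(2);            // minimum number of instances per leaf

            // 10-fold cross-validation, as selected under "Test options"
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(tree, data, 10, new Random(1));

            // The report sections discussed in task (2), minus the per-instance predictions
            System.out.println(eval.toSummaryString());      // "Summary"
            System.out.println(eval.toClassDetailsString()); // "Detailed accuracy by class"
            System.out.println(eval.toMatrixString());       // "Confusion matrix"
        }
    }

Running this class prints roughly the same summary, per-class statistics and confusion matrix that the Explorer shows in its "Classifier output" window.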
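
As a reminder for task (4), the per-class figures in "Detailed accuracy by class" can be recomputed from the confusion matrix. For one class, let TP be its correctly classified instances, FP the instances of other classes predicted as it, and FN its instances predicted as something else (all read off the confusion matrix). The standard definitions of precision, recall and F-measure are then:

    \[
      P = \frac{TP}{TP + FP}, \qquad
      R = \frac{TP}{TP + FN}, \qquad
      F = \frac{2 P R}{P + R}
    \]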