Machine Learning for Language Technology (2015) – DRAFT

Machine Learning for Language Technology (2015) – DRAFT July 2015
Detailed Outline:
Last updated: Thu 30 July 2015
Legend:
D2014 = Daumé III, Hal (2014). A Course in Machine Learning (online version 0.9).
W2011 = Witten, Ian; Frank, Eibe; Hall, Mark (2011). Data Mining: Practical
Machine Learning Tools and Techniques. Morgan Kaufmann, 3rd edition.
Lecture 1
Online: N/A
In class: Opening Lecture
• Introduction to the course Scalable Learning
• What is machine learning?
  o It draws from exploratory statistics, inferential statistics, information theory, pattern recognition, linear algebra, calculus, etc.
  o Its basic characteristics:
    ▪ Training data & test data
    ▪ Generalization (hypothesis space, overfitting, underfitting)
    ▪ Evaluation
    ▪ etc.
Lecture 2
Online: Preliminaries
• Exploratory Statistics
  o variables: numeric/nominal/categorical
  o raw data and feature representation
  o sampling, mean, variance, standard deviation, outliers, noise, etc.
  o graphs: how to read a histogram, scatter plot, etc.
  o feature standardization/normalization
  o etc.
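As a rough illustration of the last items in this list, the sketch below computes the mean and standard deviation of a feature and then rescales it in two ways. It assumes that "standardization" means rescaling to zero mean and unit standard deviation and that "normalization" means min-max rescaling to [0, 1]; the function names and the sample values are made up for the example.

```python
import statistics


def standardize(values):
    """Rescale values to zero mean and unit (sample) standard deviation."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Assumes the values are not all identical (stdev would be 0).
    return [(v - mean) / stdev for v in values]


def normalize(values):
    """Min-max rescale values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    # Assumes hi > lo, i.e. the feature is not constant.
    return [(v - lo) / (hi - lo) for v in values]


# Made-up numeric feature values, for illustration only.
feature = [4.0, 5.0, 6.0, 7.0, 8.0]
standardized = standardize(feature)
normalized = normalize(feature)
print(standardized)
print(normalized)
```

Standardized values are expressed in units of standard deviations from the mean, which makes outliers easy to spot; min-max normalization keeps the original shape but is sensitive to extreme values.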
Reading:
- D2014: 8-10
- W2011: Ch. 1
- W2011: Ch. 2; Ch. 10; Ch. 11: 407-410; Ch. 17: 559-562
Lab: Weka - Data Exploration and Preprocessing
<http://stp.lingfil.uu.se/~santinim/ml/2015/labs/Lect02_LAB_Assignment.pdf>
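Weka reads datasets in its ARFF format, which differs from plain CSV mainly in that a header declares the relation name and the type of every attribute before the data rows. The sketch below renders a few rows as ARFF text to make the difference visible. The helper `to_arff`, the attribute names and the three sample rows (written in the style of the iris data) are illustrative assumptions, not material taken from the lab.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes is a list of (name, kind) pairs, where kind is either the
    string "numeric" or a list of allowed nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            lines.append(f"@attribute {name} {kind}")
        else:
            # Nominal attributes list their possible values in braces.
            lines.append(f"@attribute {name} {{{','.join(kind)}}}")
    lines += ["", "@data"]
    for row in rows:
        # The data section itself is comma-separated, just like CSV.
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines)


# Illustrative attribute declarations and rows (assumed, not from the lab sheet).
attributes = [
    ("sepallength", "numeric"),
    ("sepalwidth", "numeric"),
    ("petallength", "numeric"),
    ("petalwidth", "numeric"),
    ("class", ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]),
]
rows = [
    (5.1, 3.5, 1.4, 0.2, "Iris-setosa"),
    (7.0, 3.2, 4.7, 1.4, "Iris-versicolor"),
    (6.3, 3.3, 6.0, 2.5, "Iris-virginica"),
]
arff_text = to_arff("iris", attributes, rows)
print(arff_text)
```

Everything after `@data` would be a valid CSV body; the header is what lets Weka know, without guessing, which attributes are numeric and which are nominal.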
Required concepts:
• Concepts, attributes, instances
• Understanding the ARFF format (cf. CSV)
• The iris dataset (3 classes)
• Standard deviation
• Standardization
• Normalization
• Data visualization: graphs (W2011: 562)
Lecture 3
Online: Decision Trees (1)
• Learning model
• Loss function
Reading:
- D2014: 10-16; 60-62
- W2011: 562-565
Lab: Weka – Decision Tree (1): Reading the output
<http://stp.lingfil.uu.se/~santinim/ml/2015/labs/Lect03_LAB_Assignment.pdf>
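Reading a classifier's output largely comes down to reading a confusion matrix and the scores derived from it. The sketch below is a hand-rolled illustration, not Weka's own code: it counts (gold, predicted) pairs and derives accuracy, precision, recall and F-measure for one class. The labels `reg` and `irr` and the two label sequences are invented for the example.

```python
from collections import Counter


def confusion_matrix(gold, predicted):
    """Count how often each (gold label, predicted label) pair occurs."""
    return Counter(zip(gold, predicted))


def accuracy(gold, predicted):
    """Fraction of instances whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def precision_recall_f1(gold, predicted, label):
    """Precision, recall and F-measure for a single class."""
    cm = confusion_matrix(gold, predicted)
    tp = cm[(label, label)]
    fp = sum(n for (g, p), n in cm.items() if p == label and g != label)
    fn = sum(n for (g, p), n in cm.items() if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented gold and predicted labels, for illustration only.
gold = ["reg", "reg", "reg", "irr", "irr", "reg"]
pred = ["reg", "reg", "irr", "irr", "reg", "reg"]
acc = accuracy(gold, pred)
p, r, f1 = precision_recall_f1(gold, pred, "irr")
print(acc, p, r, f1)
```

Accuracy summarizes all classes at once, while precision, recall and F-measure are computed per class, which is why a classifier can look good on accuracy and still do poorly on a rare class.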
Required concepts:
• Cross-validation
• Accuracy, precision/recall (P/R), F-measure
• Confusion matrix
• English Past Tense dataset (several classes; description: http://coltekin.net/cagri/ml08/lab3.html)
Lecture 4
Online: Decision Trees (2)
• Inductive bias
• Models, parameters, hyper-parameters
• ID3, C4.5 (= J48 in Weka), etc.
• Pruning
Reading:
- D2014: 16-23; 51-58
- W2011: 487-494; 567; 575-577
Digression: Information theory: entropy, surprisal
Digression: Math: logarithms
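The two digressions fit together: surprisal is a base-2 logarithm, and entropy is the average surprisal of a label distribution. A minimal sketch, assuming class probabilities are estimated as relative frequencies; the function names and toy label lists are illustrative.

```python
import math
from collections import Counter


def surprisal(p):
    """Surprisal, in bits, of an outcome that has probability p."""
    return -math.log2(p)


def entropy(labels):
    """Entropy, in bits, of the empirical distribution of the labels."""
    counts = Counter(labels)
    total = len(labels)
    return sum((n / total) * surprisal(n / total) for n in counts.values())


# Toy label lists, for illustration only.
print(surprisal(0.25))            # a 1-in-4 outcome
print(entropy(["a", "b"]))        # two equally likely classes
print(entropy(["a", "a", "a"]))   # a pure node
```

A node that contains a single class has entropy 0, and entropy grows as the classes become more evenly mixed, which is why tree learners such as ID3 prefer splits that reduce it.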
Lab: Weka – Decision Tree (2): Feature Selection and
Reduction
<http://stp.lingfil.uu.se/~santinim/ml/2015/labs/Lect04_LAB_Assignment.pdf>
Required concepts:
• Entropy
• Pruning
Lecture 5
Online: Perceptron (1)
• Numerical features
• Perceptron learning
• Convergence & separability
Reading:
- D2014: 37-46
- W2011: 314-316; 574-575
Math Digression: Dot product
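The dot product is the core of the perceptron: the activation is the dot product of the weights and the features plus a bias, and the weights change only when an example is misclassified. The following is a minimal sketch of the standard mistake-driven update, assuming labels coded as +1 and -1; the function names, the epoch limit and the toy dataset are illustrative choices, not taken from the course material.

```python
def dot(w, x):
    """Dot product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(data, epochs=20):
    """Train on (features, label) pairs with labels in {+1, -1}."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            # Wrong side of (or exactly on) the boundary: update.
            if y * (dot(w, x) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            # A full pass without errors: the training data is separated.
            break
    return w, b


def predict(w, b, x):
    return 1 if dot(w, x) + b > 0 else -1


# Toy, linearly separable data (logical AND), for illustration only.
data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w, b = train_perceptron(data)
predictions = [predict(w, b, x) for x, _ in data]
print(w, b, predictions)
```

On linearly separable data such as this the loop stops after a finite number of updates; on non-separable data it would simply run until the epoch limit is reached.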
Lab: Weka – Perceptron (1): Discretizing numeric
attributes
<http://stp.lingfil.uu.se/~santinim/ml/2015/labs/Lect05_LAB_Assignment.pdf>
Required concepts:
• Discretization
Lecture 6
Online: Perceptron (2)
• Voted and averaged perceptron
• Limitations
Digression: the “kernel trick” (W2011: 229-230)
Lab: Weka – Perceptron (2): Parameter tuning
<http://stp.lingfil.uu.se/~santinim/ml/2015/labs/Lect06_LAB_Assignment.pdf>
Required concepts:
• Kernel
Reading:
- D2014: 46-50; 60-62
- W2011: 147-156; 577-578
Lecture 7
Online: Practical Issues (1)
• The importance of good features
• Evaluating model performance (ROC curves, etc.)
Reading:
- D2014: 51-60
- W2011: 172-177; 580-581
Lab: Weka – Practical Issues (1): Evaluating Model
performance
<http://stp.lingfil.uu.se/~santinim/ml/2015/labs/Lect07_LAB_Assignment.pdf>
Required concepts:
• ROC curves
Lecture 8
Online: Practical Issues (2)
• Hypothesis testing & statistical significance
Lab: Weka – Practical Issues (2): Testing with Paired T-Test
<http://stp.lingfil.uu.se/~santinim/ml/2015/labs/Lect08_LAB_Assignment.pdf>
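For orientation, the statistic behind a paired t-test can be computed by hand from the per-fold differences between two classifiers. The sketch below uses the plain textbook form, which is not necessarily the exact variant that Weka reports; the accuracy figures are invented, and 2.262 is the standard two-tailed critical value at the 0.05 level for 9 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired samples: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    # Assumes the differences are not all identical (standard error would be 0).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err


# Invented per-fold accuracies (in percent) for two classifiers on the same 10 folds.
acc_a = [80, 82, 78, 85, 81, 79, 83, 80, 84, 82]
acc_b = [77, 80, 78, 81, 79, 78, 80, 79, 80, 80]

t = paired_t_statistic(acc_a, acc_b)
CRITICAL_T = 2.262  # two-tailed, alpha = 0.05, 10 - 1 = 9 degrees of freedom
significant = abs(t) > CRITICAL_T
print(t, significant)
```

The test is "paired" because both classifiers are scored on the same folds, so only the fold-by-fold differences enter the statistic.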
Required concepts:
• Paired t-test
Lecture 9
Online: Beyond binary classification
• Learning with unbalanced data
• Multiclass classification
• Ranking
Digression: kappa statistic (κ)
Lab: Bringing it all together
Lecture 10
Online: Statistical learning (1)
• Theory of probability (quick review)
• Density estimation
• Statistical estimation (MLE, etc.)
• Statistical inference
Lab: Weka – ???
Lecture 11
Online: Statistical learning (2)
• Naïve Bayes
• Prediction
Lab: Weka – ???
Lecture 12
Online: Statistical learning (3)
• Conditional models (logistic regression)
• Priors
Lab: Weka – ???
- D2014: 63-65
- W2011: 505-515