
COMP 328: Final Review
Spring 2010
Nevin L. Zhang
Department of Computer Science & Engineering
The Hong Kong University of Science & Technology
http://www.cse.ust.hk/~lzhang/
Can be used as a cheat sheet
Pre-Midterm


Algorithms for supervised learning

Decision trees

Instance-based learning

Naïve Bayes classifiers

Neural networks

Support vector machines
General issues regarding supervised learning

Classification error and confidence interval

Bias-Variance tradeoff

PAC learning theory
Post-Midterm


Clustering

Distance-Based Clustering

Model-Based Clustering
Dimension Reduction

Principal Component Analysis

Reinforcement Learning

Ensemble Learning
Clustering
Distance/Similarity Measures
Distance-Based Clustering

Partitional and Hierarchical clustering
K-Means: Partitional Clustering
K-Means: Partitional Clustering

Different initial points might lead to different partitions

Solution:

Multiple runs,

Use evaluation criteria such as SSE to pick the best one
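A minimal NumPy sketch of the multiple-run strategy above; the number of runs, the helper names, and the convergence check are assumptions for illustration, not the course's reference code.

```python
import numpy as np

def kmeans(X, k, n_iter=100, rng=None):
    """One run of K-Means from random initial points; returns (centers, labels, SSE)."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its cluster
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    sse = ((X - centers[labels]) ** 2).sum()
    return centers, labels, sse

def kmeans_multi(X, k, n_runs=10):
    """Multiple runs from different initial points; keep the partition with the lowest SSE."""
    return min((kmeans(X, k, rng=r) for r in range(n_runs)), key=lambda res: res[2])
```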
Hierarchical Clustering

Agglomerative and Divisive
Cluster Similarity
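A small sketch of agglomerative clustering with SciPy; the linkage method ('single' here) is just one choice of cluster-similarity criterion, and the toy data is made up.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])  # toy data (assumed)

# Agglomerative clustering: 'single' = nearest-neighbour cluster similarity;
# 'complete' and 'average' are the other common linkage criteria.
Z = linkage(X, method='single')

# Cut the dendrogram to obtain 2 flat clusters
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)   # e.g. [1 1 2 2]
```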
Cluster Validation

External indices

Entropy: Average purity of clusters obtained

Mutual Information between class label and cluster label
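A sketch of the two external indices above, computed directly from class and cluster labels with NumPy; it assumes labels are nonnegative integers.

```python
import numpy as np

def cluster_entropy(class_labels, cluster_labels):
    """Size-weighted average entropy of the class distribution within each cluster (lower = purer)."""
    class_labels, cluster_labels = np.asarray(class_labels), np.asarray(cluster_labels)
    n, total = len(class_labels), 0.0
    for c in np.unique(cluster_labels):
        members = class_labels[cluster_labels == c]
        p = np.bincount(members) / len(members)
        p = p[p > 0]
        total += (len(members) / n) * -(p * np.log2(p)).sum()
    return total

def mutual_information(class_labels, cluster_labels):
    """I(class; cluster) from the joint empirical distribution (higher = better agreement)."""
    class_labels, cluster_labels = np.asarray(class_labels), np.asarray(cluster_labels)
    mi = 0.0
    for y in np.unique(class_labels):
        for c in np.unique(cluster_labels):
            p_yc = np.mean((class_labels == y) & (cluster_labels == c))
            if p_yc > 0:
                mi += p_yc * np.log2(p_yc / (np.mean(class_labels == y) * np.mean(cluster_labels == c)))
    return mi
```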
Cluster Validation

External Measure

Jaccard Index

Rand Index
Measure similarity between two relations: in-same-class & in-same-cluster

                         # pairs in same cluster   # pairs in diff cluster
# pairs w/ same label              a                          b
# pairs w/ diff label              c                          d
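In terms of the pair counts above (a = same label & same cluster, d = different label & different cluster), the standard definitions are:

$$\text{Rand} = \frac{a + d}{a + b + c + d}, \qquad \text{Jaccard} = \frac{a}{a + b + c}$$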
Cluster Validation

Internal Measure

Dunn’s index
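For reference (this is the usual formulation, not necessarily the exact notation on the slide), Dunn's index is the ratio of the smallest between-cluster distance to the largest within-cluster diameter; larger values indicate compact, well-separated clusters:

$$D = \frac{\min_{i \neq j}\, d(C_i, C_j)}{\max_k\, \operatorname{diam}(C_k)}$$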
Cluster Validation

Internal Measure
Post-Midterm


Clustering

Distance-Based Clustering

Model-Based Clustering
Dimension Reduction

Principal Component Analysis

Reinforcement Learning

Ensemble Learning
Model-Based Clustering

Assume data are generated from a mixture model with K components

Estimate parameters of the model from data

Assign objects to clusters based on posterior probability: soft assignment
Gaussian Mixtures
Learning Gaussian Mixture Models
EM
EM
EM

l(t): log-likelihood of the model after the t-th iteration

l(t) increases monotonically with t

But might go to infinity in case of singularity


Solution: place a lower bound on the eigenvalues of the covariance matrices
Local maximum

Multiple restart

Use likelihood to pick best model
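A sketch using scikit-learn's GaussianMixture, which packages the points above: n_init gives multiple restarts scored by likelihood, reg_covar keeps the covariance eigenvalues bounded away from zero (the singularity fix), and predict_proba returns the soft assignments. The toy data is assumed.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two Gaussian blobs (assumed, for illustration only)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

gmm = GaussianMixture(
    n_components=2,   # K mixture components
    n_init=10,        # multiple restarts; the best run by likelihood is kept
    reg_covar=1e-6,   # keeps covariance eigenvalues away from 0 (avoids singularities)
    random_state=0,
).fit(X)

print(gmm.lower_bound_)          # log-likelihood lower bound (per sample) of the best run
print(gmm.predict_proba(X)[:3])  # soft assignment: posterior P(component | x)
```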
EM and K-Means

K-Means is hard-assignment EM
Mixture Variable for Discrete Data
Latent Class Model
Learning Latent Class Models
Always converges
Post-Midterm


Clustering

Distance-Based Clustering

Model-Based Clustering
Dimension Reduction

Principal Component Analysis

Reinforcement Learning

Ensemble Learning
Dimension Reduction

Necessary because data sets with large numbers of attributes are difficult for learning algorithms to handle.
Principal Component Analysis
PCA Solution
PCA Illustration
Eigenvalues and Projection Error
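A sketch of PCA via eigendecomposition of the sample covariance; it also illustrates the "Eigenvalues and Projection Error" point, namely that the discarded eigenvalues account for (up to the sample-covariance scaling factor) the average squared projection error. The data is made up.

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                      # centre the data
    cov = np.cov(Xc, rowvar=False)               # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Z = Xc @ eigvecs[:, :k]                      # k-dimensional projection
    # The average squared projection error equals the sum of the discarded eigenvalues
    # (up to the n/(n-1) factor from using the sample covariance).
    return Z, eigvals

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # toy data (assumed)
Z, eigvals = pca(X, k=2)
print(eigvals, eigvals[2:].sum())    # discarded eigenvalues ~ projection error
```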
Post-Midterm


Clustering

Distance-Based Clustering

Model-Based Clustering
Dimension Reduction

Principal Component Analysis

Reinforcement Learning

Ensemble Learning
Reinforcement Learning
Markov Decision Process

A model of how an agent interacts with its environment
Markov Decision Process
Value Iteration
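A minimal value-iteration sketch for a finite MDP, iterating V(s) <- max_a [R(s,a) + γ Σ_s' P(s'|s,a) V(s')]; the two-state transition and reward arrays are invented for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """P[a, s, s']: transition probabilities; R[s, a]: immediate rewards."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Q[s, a] = expected return of taking a in s, then acting optimally
        Q = R + gamma * np.einsum('ast,t->sa', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)   # optimal values and greedy policy
        V = V_new

# Tiny 2-state, 2-action MDP (all numbers assumed, for illustration)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],     # action 0
              [[0.5, 0.5], [0.0, 1.0]]])    # action 1
R = np.array([[0.0, 1.0],                   # R[s=0, a]
              [2.0, 0.0]])                  # R[s=1, a]
V, policy = value_iteration(P, R)
print(V, policy)
```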
Reinforcement Learning
Q-Learning
Q-Learning

From Q-function-based value iteration

Ideas

In-place/asynchronous value iteration

Approximate expectation using samples

ε-greedy policy (for the exploration/exploitation tradeoff)
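A minimal tabular Q-learning sketch combining the ideas above (sample-based updates, ε-greedy exploration); the env.reset()/env.step() interface is an assumption, roughly in the style of Gym-like environments.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.9, eps=0.1, rng=None):
    """Tabular Q-learning with an epsilon-greedy behaviour policy.
    `env` is assumed to expose reset() -> s and step(a) -> (s_next, r, done)."""
    rng = np.random.default_rng(rng)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy: explore with probability eps, otherwise exploit
            a = rng.integers(n_actions) if rng.random() < eps else Q[s].argmax()
            s_next, r, done = env.step(a)
            # Temporal-difference update using the max over next actions (off-policy)
            target = r + gamma * (0 if done else Q[s_next].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```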
Temporal Difference Learning
Sarsa is also a temporal difference learning method
Post-Midterm


Clustering

Distance-Based Clustering

Model-Based Clustering
Dimension Reduction

Principal Component Analysis

Reinforcement Learning

Ensemble Learning
Ensemble Learning
Bagging: Reduce Variance
Boosting: Reduce Classification Error
AdaBoost: Exponential Error
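A sketch contrasting bagging and AdaBoost with scikit-learn; the dataset and hyperparameters are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Toy classification data (assumed, for illustration only)
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Bagging: average many deep (high-variance) trees trained on bootstrap samples
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)

# AdaBoost: sequentially reweight examples, minimising an exponential loss,
# typically with weak learners such as decision stumps
boost = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                           n_estimators=50, random_state=0)

for name, clf in [("bagging", bag), ("adaboost", boost)]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```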