Label Embedding Trees for Large Multi-class Tasks
Samy Bengio, Jason Weston, David Grangier
Presented by Zhengming Xing
Outline
• Introduction
• Label Trees
• Label Embeddings
• Experiment results
Introduction
Large-scale problems can be large along three axes:
• the number of examples
• the feature dimension
• the number of classes
Main idea: propose a fast, memory-saving multi-class classifier for large datasets, based on a tree-structured method.
Introduction
Label Tree: T = (N, E, F, L)
Indexed nodes: N = {0, 1, ..., n}, with node 0 the root
Edges: E = {(p, c)}, the parent/child pairs of the tree
Label predictors: F = {f1, ..., fn}, one scoring function fj(x) per non-root node
Label sets: L = {ℓ0, ..., ℓn}, with ℓj ⊆ {1, ..., K}
The root contains all classes (ℓ0 = {1, ..., K}), and each child label set is a subset of its parent's; K is the number of classes.
Disjoint tree: any two nodes at the same depth share no labels.
Introduction
Classifying an example x: start at the root; at each node, move to the child c with the highest predictor score f_c(x); repeat until a leaf is reached and predict its label.
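The traversal can be sketched as follows (a minimal sketch: the `Node` structure and the linear scorers f_c(x) = w_c · x are illustrative assumptions, not the paper's exact data structures):

```python
# Sketch of label-tree prediction with linear per-node scorers.
import numpy as np

class Node:
    def __init__(self, w=None, label=None, children=None):
        self.w = w                  # weight vector of this node's predictor f_c
        self.label = label          # class label if this node is a leaf
        self.children = children or []

def predict(root, x):
    """Walk from the root, always following the child with the
    highest predictor score f_c(x), until a leaf is reached."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: float(np.dot(c.w, x)))
    return node.label
```

Because only the predictors along one root-to-leaf path are evaluated, prediction cost grows with tree depth rather than with the number of classes K.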
Label Trees
Tree loss:
R(f_tree) = ∫ I(f_tree(x) ≠ y) dP(x, y)
Equivalently, an error occurs iff some visited node's label set misses y:
R(f_tree) = ∫ max_{j=1..D(x)} I(y ∉ ℓ_{s_j(x)}) dP(x, y)
where I is the indicator function, s_j(x) is the node visited at depth j, and D(x) is the depth in the tree of the final prediction for x.
Label tree
Learning with a fixed label tree: N, E, and L are chosen in advance.
Goal: minimize the tree loss over the label predictors F,
given training data {(x_i, y_i)}, i = 1, ..., m.
Relaxation 1: replace the indicator function with its hinge upper bound, requiring the child containing y to outscore, by a margin, the children that do not.
Relaxation 2: further bound the max over the path by a sum, decoupling the nodes so that each predictor f_j can be trained independently (a convex problem per node).
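The chain of bounds can be sketched for one example (x, y) as follows (a reconstruction in hedged notation; the paper's exact objective may weight or index the terms differently):

```latex
% 0/1 tree loss for one example: an error occurs iff some visited node's
% label set misses y.
\max_{j = 1, \dots, D(x)} I\bigl(y \notin \ell_{s_j(x)}\bigr)
% Relaxation 1: bound the indicator by a margin hinge on score differences
% between nodes that miss y and nodes that contain it:
\;\le\; \max_{\substack{j :\, y \in \ell_j \\ k :\, y \notin \ell_k}}
   \max\bigl(0,\; 1 + f_k(x) - f_j(x)\bigr)
% Relaxation 2: bound the max by a sum of nonnegative terms, so each
% predictor can be trained independently:
\;\le\; \sum_{j :\, y \in \ell_j} \; \sum_{k :\, y \notin \ell_k}
   \max\bigl(0,\; 1 + f_k(x) - f_j(x)\bigr)
```

Each bound is valid: if the tree errs, some wrong node k outscored a correct sibling j, so the corresponding hinge term is at least 1; and a sum of nonnegative terms dominates their max.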
Label tree
Learning the label tree structure (disjoint tree)
Basic idea: group together into the same label set the labels that are likely to be confused at test time.
Define the affinity A = ½ (C + Cᵀ), where C is the class confusion matrix of a surrogate classifier on validation data.
Treat A as the affinity matrix and apply steps similar to spectral clustering to split the labels into the child label sets.
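The clustering step can be sketched in Python as follows (a minimal sketch: the symmetrized affinity follows the slide, but the unnormalized Laplacian, the eigen-solver, and the plain k-means here are illustrative choices, not the paper's exact procedure):

```python
# Sketch: group confusable labels by spectral clustering of the
# symmetrized confusion matrix of a surrogate classifier.
import numpy as np

def spectral_label_sets(C, n_groups, n_iter=20):
    """Partition K labels into n_groups label sets.
    C[i, j] counts how often class j is (confused as) class i."""
    A = 0.5 * (C + C.T)                       # symmetric affinity between labels
    L = np.diag(A.sum(axis=1)) - A            # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)
    X = vecs[:, :n_groups]                    # rows embed each label in R^n_groups
    # farthest-point initialization, then plain Lloyd's k-means
    idx = [0]
    for _ in range(1, n_groups):
        d2 = np.min(((X[:, None] - X[idx]) ** 2).sum(-1), axis=1)
        idx.append(int(np.argmax(d2)))
    centers = X[idx].copy()
    for _ in range(n_iter):
        assign = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for g in range(n_groups):
            if np.any(assign == g):
                centers[g] = X[assign == g].mean(axis=0)
    return assign                             # group index for each label
```

Labels that the surrogate classifier confuses heavily end up in the same group, so hard distinctions are deferred to deeper nodes of the tree.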
Label embeddings
E(y) is a K-dimensional one-hot vector: a 1 in the y-th position and 0 otherwise.
Prediction: solve f(x) = argmax_i (W x) · V_i, i.e. project the input with W and score it against each label embedding V_i.
Problem: how to learn W and V.
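The prediction rule can be sketched as follows (a minimal sketch: the dot-product scoring and the shapes chosen for W and V are assumptions made for illustration):

```python
# Sketch of label-embedding prediction: project the input into the
# low-dimensional embedding space and pick the closest-scoring label.
import numpy as np

def embed_predict(W, V, x):
    """W: (e, d) input projection, V: (K, e) label embeddings.
    Returns argmax_i (W x) . V_i."""
    z = W @ x                     # embed the input in R^e
    return int(np.argmax(V @ z))  # score against every label embedding
```

Note the memory saving: only an (e × d) projection and K short e-dimensional label vectors are stored, instead of K full d-dimensional weight vectors.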
Method 1: learn V and W sequentially.
Label embeddings
Learn V
Apply the same two steps as Algorithm 2 (spectral decomposition of the affinity matrix A): minimize an embedding objective that places confusable labels close together.
Learn W
With V fixed, minimize a regression objective that maps each input x_i close to the embedding of its label, V_{y_i}.
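Assuming the regression step takes the common ridge form min_W Σ_i ||V_{y_i} − W x_i||² + λ||W||² (an assumption for illustration; the slide only says "minimize"), the W step has a closed form:

```python
# Sketch of Method 1, step 2: with label embeddings V fixed, fit the
# input projection W by ridge regression so that W x_i ≈ V_{y_i}.
import numpy as np

def learn_W(X, y, V, lam=1e-3):
    """X: (m, d) inputs, y: (m,) integer labels, V: (K, e) embeddings.
    Returns W: (e, d), the ridge-regression solution."""
    T = V[y]                       # (m, e) target embedding per example
    d = X.shape[1]
    # Normal equations: (X^T X + lam I) W^T = X^T T
    Wt = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ T)
    return Wt.T
```

Because V is fixed, this step is an ordinary least-squares problem and can be solved per output dimension, which scales to large m.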
Label embedding
Method 2: jointly learn W and V.
Minimize a margin-based ranking objective so that, for each training example, the correct label's embedding outscores all other labels' embeddings.
Combining all the methods discussed above gives the label embedding tree:
minimize the joint objective over the tree predictors and the label embeddings.
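A hedged SGD sketch of such a joint objective (the pairwise ranking hinge max(0, 1 − (Wx)·V_y + (Wx)·V_k), the learning rate, epoch count, and update order are illustrative assumptions, not the paper's exact procedure):

```python
# Sketch of Method 2: jointly learn W and V by SGD on a margin ranking loss.
import numpy as np

def train_joint(X, y, K, e, lr=0.1, epochs=100, seed=0):
    """X: (m, d) inputs, y: (m,) labels. Returns (W, V) with
    W: (e, d) input projection, V: (K, e) label embeddings."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    W = 0.01 * rng.standard_normal((e, d))
    V = 0.01 * rng.standard_normal((K, e))
    for _ in range(epochs):
        for i in rng.permutation(m):
            z = W @ X[i]                       # embedded input
            scores = V @ z
            for k in range(K):
                if k != y[i] and 1.0 - scores[y[i]] + scores[k] > 0.0:
                    gV = V[y[i]] - V[k]        # direction for the W update
                    V[y[i]] += lr * z          # pull the correct label toward z
                    V[k] -= lr * z             # push the wrong label away
                    W += lr * np.outer(gV, X[i])
    return W, V
```

Joint training lets the input projection and the label embeddings adapt to each other, at the cost of a non-convex objective.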
Experiment
Dataset