Hierarchical Semi-supervised Classification with Incomplete Class Hierarchies

Bhavana Dalvi ¶*, Aditya Mishra †, and William W. Cohen *
¶ Allen Institute for Artificial Intelligence
* School of Computer Science, Carnegie Mellon University
† Department of Computer Science & Software Engineering, Seattle University
Motivation
Method: OptDAC Exploratory EM
Inputs: $X_l$: labeled glosses; $Y_l$: category labels of $X_l$;
➢ In an entity classification task, topic or concept
hierarchies are often incomplete. This can lead to
semantic drift of known classes or topics.
➢ Our previous work on Exploratory Learning (Dalvi
et al., ECML 2013) extends the semi-supervised EM
algorithm by dynamically adding new classes when
appropriate. In this paper, we present exploratory
learning techniques for hierarchical semi-supervised learning tasks.
This dataset is made publicly available at
http://rtw.ml.cmu.edu/wk/WebSets/hierarchical_ExploratoryLearning_WSDM2016/index.html

Ontology statistics (Small and Medium ontologies): #classes, #levels in the hierarchy, #classes per level

[Figure: FLAT-ExploreEM vs. OptDAC ExploreEM, by hierarchy level (Level = 2, 3, 4)]
[Figure: OptDAC with varying amount of training data; panels: Text-Small, Table-Small]

Initialize the model $\theta_j^0$ with a few seeds per class $C_j$
▪ E step (iteration t): Assign a bit vector of categories to each gloss
    For i = 1 : N
        Find $P(C_j \mid X_i; \theta_j^{t-1})$ for all classes $C_j$
        $Y_i^t$ = Optimal-Label-Assignment($P(C_j \mid X_i; \theta_j^{t-1})$, $Z^{(t)}$)
        {If a new class is created, then class constraints are updated accordingly.}
        $Z^{(t)}$ = UpdateConstraints($X_l$, $Y_l$, $X_u$, $Y_u$, $Z^{(t)}$)
▪ M step: Re-compute model parameters
    Re-compute $\theta_j^t$ based on current label assignments $Y_i^t$.
▪ Do model selection
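The constraint update in the E step can be illustrated with a small sketch. The poster does not spell out how UpdateConstraints works, so this assumes that a new class is attached as a child of an existing class, receives a subset constraint to that parent, and is treated as mutually exclusive with its siblings. The function name `add_new_class` and the dictionary-based hierarchy are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Subset = Set[Tuple[str, str]]   # (child, parent): child is a subset of parent
Mutex = Set[FrozenSet[str]]     # unordered pairs of disjoint classes


def add_new_class(children: Dict[str, List[str]], subset: Subset, mutex: Mutex,
                  parent: str, new_class: str):
    """Return updated copies of (children, subset, mutex) with new_class under parent.

    Assumption (illustrative only): siblings are mutually exclusive.
    """
    new_children = {p: list(cs) for p, cs in children.items()}
    new_subset = set(subset)
    new_mutex = set(mutex)

    # The new class is a subset of its parent.
    new_subset.add((new_class, parent))
    # The new class is disjoint from every existing child of that parent.
    for sibling in new_children.get(parent, []):
        new_mutex.add(frozenset((new_class, sibling)))

    new_children.setdefault(parent, []).append(new_class)
    new_children.setdefault(new_class, [])
    return new_children, new_subset, new_mutex


# Toy hierarchy (hypothetical class names).
children = {"Root": ["Location", "Person"], "Location": ["City"],
            "Person": [], "City": []}
subset = {("Location", "Root"), ("Person", "Root"), ("City", "Location")}
mutex = {frozenset(("Location", "Person"))}

children2, subset2, mutex2 = add_new_class(children, subset, mutex,
                                           parent="Location", new_class="C_new")
```

The function returns copies instead of mutating its inputs, so the constraint set from the previous iteration stays available for comparison.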
Optimal Label Assignment given Class Constraints
Input: $P(C_j \mid X_i)$; class constraints: Subset, Mutex (disjoint)
Output: Consistent bit vector $y_{ji}$ for $X_i$
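For a handful of classes, the input/output contract above can be illustrated by exhaustive search instead of a mixed integer program. This is a minimal sketch under stated assumptions: the score of an assignment is taken to be the sum of $P(C_j \mid X_i)$ over selected classes, each violated constraint costs a fixed `penalty`, and the function name and integer class indices are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def optimal_label_assignment(probs: Sequence[float],
                             subset: List[Tuple[int, int]],
                             mutex: List[Tuple[int, int]],
                             penalty: float = 1.0) -> Tuple[int, ...]:
    """Brute-force search over all bit vectors (only practical for few classes)."""
    best_score, best_y = float("-inf"), None
    for y in product((0, 1), repeat=len(probs)):
        # Assumed score: sum of class probabilities for the selected classes.
        score = sum(prob for prob, bit in zip(probs, y) if bit)
        # Subset (child, parent): violated when child is on and parent is off.
        violations = sum(1 for child, parent in subset if y[child] and not y[parent])
        # Mutex (a, b): violated when both classes are on.
        violations += sum(1 for a, b in mutex if y[a] and y[b])
        score -= penalty * violations
        if score > best_score:
            best_score, best_y = score, y
    return best_y


# Hypothetical classes: 0 = Location, 1 = City (subset of Location), 2 = Person.
subset_constraints = [(1, 0)]
mutex_constraints = [(0, 2), (1, 2)]
print(optimal_label_assignment([0.5, 0.4, 0.3], subset_constraints, mutex_constraints))
```

Exhaustive search costs 2^K evaluations for K classes, which is why an optimization solver is the practical choice for larger ontologies.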
Runtime of Flat vs. OptDAC method on different datasets

Dataset        Avg. runtime in sec. (FLAT Semi-supervised EM)
Text-Small     53.5
Table-Small    50.7
Text-Medium    524.7
Table-Medium   5932.4
Small Ontology: 11 classes; 3 levels in the hierarchy; 1, 3, 7 classes per level
denotes statistically significant improvements
(0.05 significance level) w.r.t. FLAT-ExploreEM
▪ Iterate till convergence (until both the data likelihood and the number of classes converge)
Inputs: $X_u$: unlabeled glosses; N: |X|; K: number of classes;
$Z_k$: class constraints (subclass or disjointness constraints)
Outputs: $Y_u$: labels for $X_u$; $\theta_1 \ldots \theta_{k+m}$: parameters for k seed and m
newly added classes; $Z_{k+m}$: set of constraints between k+m classes

➢ We focus on the entity classification task, where each
entity is represented by either text context or table
co-occurrence features. Given a few seed
examples per Knowledge Base (KB) category, the
task is to classify unlabeled entities into KB
categories.
➢ KB categories are arranged in an ontology. There
are subset and disjointness constraints defined
between these classes. Further, the class hierarchy
can be incomplete.
➢ Our proposed method (OptDAC) can learn new
examples of existing classes, as well as extend the
class hierarchy, in a single unified framework.
OptDAC reduces semantic drift of seeded classes.

Datasets
Comparison: macro-averaged seeded-class F1
Experimental Results
Medium Ontology: 39 classes; 4 levels in the hierarchy; 1, 4, 24, 10 classes per level
Avg. runtime in multiples of FLAT Semi-supervised EM

Dataset        FLAT Exploratory EM   OptDAC Semi-supervised EM   OptDAC Exploratory EM
Text-Small     8                     7                           17
Table-Small    3                     10                          21
Text-Medium    5                     11                          25
Table-Medium   4                     7                           10
Evaluation of extended class hierarchies
[Figure panels: Small Ontology, Medium Ontology]

Maximize {likelihood of assignment - constraint violation penalty}
[Annotated terms of the objective: score of label assignment; subset constraint penalty; mutex constraint penalty. Annotated constraints: subset constraint; mutex constraint]
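One way to write an objective of this shape is sketched below. The poster names only the terms, so the exact form is an assumption: the score is taken to be $\sum_j y_{ji} P(C_j \mid X_i)$, and the slack variables $\zeta_{jk}$ and $\delta_{jk}$ (hypothetical names) measure how much each subset or mutex constraint is violated.

```latex
\begin{aligned}
\max_{y,\ \zeta,\ \delta}\quad
  & \sum_{j} y_{ji}\, P(C_j \mid X_i)
    \;-\; \sum_{(j,k) \in \mathrm{Subset}} \zeta_{jk}
    \;-\; \sum_{(j,k) \in \mathrm{Mutex}} \delta_{jk} \\
\text{s.t.}\quad
  & y_{ji} \le y_{ki} + \zeta_{jk}
    && \forall (j,k) \in \mathrm{Subset}\ (C_j \subseteq C_k) \\
  & y_{ji} + y_{ki} \le 1 + \delta_{jk}
    && \forall (j,k) \in \mathrm{Mutex} \\
  & y_{ji} \in \{0,1\},\quad \zeta_{jk} \ge 0,\quad \delta_{jk} \ge 0
\end{aligned}
```

With binary $y_{ji}$, each slack variable is 1 exactly when its constraint is violated and 0 otherwise, so the two sums count violated constraints.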
When Are New Classes Created?
Conclusions
Dataset statistics

Dataset        #Entities   #Features   #(Entity, label) pairs
Text-Small     2.5K        3.4M        7.2K
Text-Medium    12.9K       6.7M        42.2K
Table-Small    4.3K        0.96M       12.2K
Table-Medium   33.4K       2.2M        126.K
➢ An example text pattern feature for entity “Pittsburgh” is
(“lives in ARG”, 1000), indicating that the entity “Pittsburgh”
appeared in position ARG of the text context “lives in ARG”
1000 times in sentences from the ClueWeb09 dataset.
➢ An example table context feature for entity “Pittsburgh” is
(“clueweb09-en0011-94-04::2:1”, 1), indicating that the entity
“Pittsburgh” appeared once in HTML table 2, column 1 of the
ClueWeb09 document with id “clueweb09-en0011-94-04”.
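The table context identifier in the second example can be split into its parts as sketched below. The format `<document id>::<table>:<column>` is inferred from that single example, so treat it as an assumption; the names `TableContext` and `parse_table_feature` are hypothetical.

```python
from typing import NamedTuple


class TableContext(NamedTuple):
    doc_id: str   # ClueWeb09 document id
    table: int    # index of the HTML table within the document
    column: int   # index of the column within that table


def parse_table_feature(feature_id: str) -> TableContext:
    """Split '<doc_id>::<table>:<column>' into typed fields."""
    doc_id, position = feature_id.split("::")
    table, column = position.split(":")
    return TableContext(doc_id, int(table), int(column))


ctx = parse_table_feature("clueweb09-en0011-94-04::2:1")
```

Here `ctx.table` is 2 and `ctx.column` is 1, matching the description of the example feature.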
[Figure: class hierarchy tree with numbered nodes]
Near uniform? → Cnew
Test: Best assignment using the mixed integer program should pick Cnew
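The poster asks only whether the class distribution is "near uniform" and does not state the test. A max/min probability ratio is one simple heuristic for that question; the sketch below uses it, and both the function name and the default threshold of 2.0 are assumptions.

```python
from typing import Sequence


def is_near_uniform(posterior: Sequence[float], max_ratio: float = 2.0) -> bool:
    """Return True when the largest and smallest class probabilities are close.

    A near-uniform posterior suggests that no existing class explains the
    data point well, which makes it a candidate for a new class.
    """
    lowest, highest = min(posterior), max(posterior)
    if lowest <= 0.0:
        # A zero probability means the ratio is unbounded: not near uniform.
        return False
    return highest / lowest < max_ratio


candidates = {
    "ambiguous": [0.34, 0.33, 0.33],   # close to uniform
    "confident": [0.80, 0.10, 0.10],   # one class dominates
}
flags = {name: is_near_uniform(p) for name, p in candidates.items()}
```

In this toy input only the "ambiguous" posterior is flagged, so only that data point would be considered for Cnew.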
➢ In this paper, we propose the Hierarchical Exploratory EM approach
that can take an incomplete class ontology as input, along with a few
seed examples of each class, to populate new instances of seeded
classes and extend the ontology with newly discovered classes.
➢ Our proposed hierarchical exploratory EM method, named OptDAC-ExploreEM,
performs better than flat classification and hierarchical semi-supervised
EM methods at all levels of the hierarchy, especially as we go
further down the hierarchy.
➢ Experiments show that OptDAC-ExploreEM outperforms its semi-supervised
variant on average by 13% in terms of seed class F1 scores.
It also outperforms both previously proposed exploratory learning
approaches, FLAT-ExploreEM and DAC-ExploreEM, in terms of seed class
F1 on average by 10% and 7% respectively.
➢ In the future, we would like to apply our method to datasets with
non-tree structured class hierarchies.
Acknowledgements: This work is supported in part by a Google PhD Fellowship in Information Extraction and by NSF grant No. IIS1250956-NSFCOHEN.