Event - VNLP

Final talk
Automatically Acquiring a Dictionary of Emotion-Provoking Events
Student: Hoa Vu-Trong – VNU
Supervisor: Graham sensei - NAIST
1/20
Can Twitter benefit a dialogue system?
Twitter users
Dialog System
Machine: Hello!
User: Hello!
User: A guy next to me today is too noisy!
Machine: That's so annoying!
User:
2/20
Motivation
Text emotion classifier
● Emotion is not always carried by a specific word:
1. I feel happy today
2. I met my friend today
● Only 4% of words imply emotion [1]
Simple architecture of a dialogue system with emotion adaptation.
[1] Pennebaker, J.W., Mehl, M.R., Niederhoffer, K.: Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology 54, 547–577 (2003)
3/20
Motivation
● An arbitrarily large set of emotion-provoking events can be collected from Twitter
You must be very happy
400M tweets/day
4/20
Method
● Emotions and events are related.
● Pattern learning is an effective way to harvest semantic relations
  – Espresso (Pantel and Pennacchiotti, 2006).
Ex: “I'm happy that I have the support of my friends. I love all of them!”
“I'm sad that tomorrow is Monday and I have to work. It's bad day”
Pattern: I be EMOTION that EVENT
Instances:
happy – I have the support of my friends
sad – tomorrow is Monday and I have to work
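As a rough illustration of how the seed pattern could be applied, the sketch below matches a simplified surface form of "I be EMOTION that EVENT" with a regular expression. The emotion word list, the function name `extract_instances`, and the handling of "be" through a few fixed surface forms (rather than the lemmatization implied by the slide) are all assumptions made for this example.

```python
import re

# Hypothetical, tiny emotion vocabulary used only for this illustration.
EMOTION_WORDS = ["happy", "sad"]

# Simplified surface form of "I be EMOTION that EVENT":
# "be" is approximated by 'm / am / was, and the event runs up to
# the next sentence-ending punctuation mark.
SEED_PATTERN = re.compile(
    r"\bI(?:'m| am| was)\s+(?P<emotion>" + "|".join(EMOTION_WORDS) + r")"
    r"\s+that\s+(?P<event>[^.!?]+)",
    re.IGNORECASE,
)

def extract_instances(text):
    """Return (emotion, event) pairs matched by the seed pattern."""
    return [
        (m.group("emotion").lower(), m.group("event").strip())
        for m in SEED_PATTERN.finditer(text)
    ]

tweets = [
    "I'm happy that I have the support of my friends. I love all of them!",
    "I'm sad that tomorrow is Monday and I have to work. It's bad day",
]
instances = [pair for tweet in tweets for pair in extract_instances(tweet)]
```

With the two example tweets, `instances` holds the same two emotion/event pairs listed above.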
5/20
Espresso Algorithm
● Used to mine semantic relations (e.g. is-a, has-a, …); begins with some seed instances.
● Each iteration contains 3 phases:
  – Pattern induction
  – Pattern ranking
  – Instance extraction
● Stopping criterion: enough patterns have been found, the average reliability of the patterns decreases by t%, or a defined number of iterations is exceeded.
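The three stopping conditions can be written as a single check evaluated at the end of each iteration. This is only a sketch: the function name `should_stop` and every default threshold (`max_patterns`, `t_percent`, `max_iterations`) are hypothetical values chosen for illustration, not settings taken from the talk.

```python
def should_stop(num_patterns, prev_avg_rel, cur_avg_rel, iteration,
                max_patterns=50, t_percent=50.0, max_iterations=10):
    """Return True if any of the three stopping conditions holds."""
    # 1. Enough patterns have been collected.
    if num_patterns >= max_patterns:
        return True
    # 2. Average pattern reliability dropped by at least t percent
    #    relative to the previous iteration.
    if prev_avg_rel > 0:
        drop = (prev_avg_rel - cur_avg_rel) / prev_avg_rel * 100.0
        if drop >= t_percent:
            return True
    # 3. The iteration budget is used up.
    return iteration >= max_iterations
```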
6/20
Espresso Algorithm
● Pattern Induction: infers all the patterns P that connect the seed instances. Ex:
I'm happy that I have the support of my friends. I love all of them!
I'm sad that tomorrow is Monday and I have to work. It's bad day
I be EMOTION that EVENT . I love all of them
I be EMOTION that EVENT . It be bad day
I be EMOTION that EVENT – 2 times
EMOTION that EVENT . – 2 times
EMOTION that EVENT . I love all – 1 time
…
…
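One way to read the counts above is that every contiguous token span containing both placeholders is a candidate pattern, and candidates are counted across the generalized sentences. The sketch below follows that reading; it assumes the sentences have already been lemmatized and had their emotion word and event replaced by `EMOTION` and `EVENT`, and the helper name `candidate_patterns` is made up for this example.

```python
from collections import Counter

def candidate_patterns(tokens):
    """Yield every contiguous token span that contains both placeholders."""
    n = len(tokens)
    for start in range(n):
        for end in range(start + 1, n + 1):
            span = tokens[start:end]
            if "EMOTION" in span and "EVENT" in span:
                yield " ".join(span)

# Generalized sentences from the slide, already tokenized by whitespace.
generalized = [
    "I be EMOTION that EVENT . I love all of them".split(),
    "I be EMOTION that EVENT . It be bad day".split(),
]

# Count how often each candidate pattern occurs over all sentences.
pattern_counts = Counter(
    pattern for sentence in generalized for pattern in candidate_patterns(sentence)
)
```

Under this reading, `pattern_counts` reproduces the frequencies listed on the slide for the three patterns shown.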
7/20
Espresso Algorithm
● Pattern ranking: rank all the patterns and keep the top K most reliable ones.
● Reliable pattern: one that is both highly precise and extracts many instances (more in the next slides).
8/20
Espresso Algorithm
● Instance extraction: retrieves the top M reliable instances that match the K patterns kept in the previous phase.
● Reliable instance: one that is highly associated with as many reliable patterns as possible (more in the next slides).
9/20
Espresso Algorithm
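● The strength of association between an instance i = (x, y) and a pattern p is measured by pointwise mutual information (PMI):
$$\mathrm{pmi}(i, p) = \log \frac{\mathrm{count}(i, p)}{\mathrm{count}(i) \times \mathrm{count}(p)}$$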
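A direct transcription of this formula into code might look as follows. The function name and argument names are assumptions, the counts in the usage line are invented, and returning negative infinity for an unseen instance/pattern pair is a convention chosen here rather than something stated in the talk.

```python
import math

def pmi(count_ip, count_i, count_p):
    """log(count(i, p) / (count(i) * count(p))), following the slide's formula."""
    if count_ip == 0:
        # The pair never co-occurs; the logarithm is undefined, so use -inf.
        return float("-inf")
    return math.log(count_ip / (count_i * count_p))

# Invented counts: the pair occurs 2 times, the instance 2 times, the pattern 4 times.
example_score = pmi(2, 2, 4)
```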
10/20
Espresso Algorithm
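● Pattern reliability:
$$r(p) = \frac{\sum_{i \in I} \frac{\mathrm{pmi}(i, p)}{\mathrm{maxPMI}} \cdot r(i)}{\mathrm{count}(I)}, \qquad 0 < r(p) \le 1$$
● Instance reliability:
$$r'(i) = \frac{\sum_{p \in P} \frac{\mathrm{pmi}(i, p)}{\mathrm{maxPMI}} \cdot r(p)}{\mathrm{count}(P)}, \qquad 0 < r'(i) \le 1$$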
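The two reliability scores feed each other: instance reliabilities determine pattern reliabilities, which in turn re-score instances. The sketch below computes one such round from a precomputed PMI table. The function names, the instance and pattern labels, and the PMI values are all invented for illustration, and treating a missing pair as PMI 0 is an assumption.

```python
def pattern_reliability(pattern, instances, pmi_table, r_instance, max_pmi):
    """r(p): average over instances of pmi(i, p) / maxPMI * r(i)."""
    total = sum(
        pmi_table.get((i, pattern), 0.0) / max_pmi * r_instance[i]
        for i in instances
    )
    return total / len(instances)

def instance_reliability(instance, patterns, pmi_table, r_pattern, max_pmi):
    """r'(i): average over patterns of pmi(i, p) / maxPMI * r(p)."""
    total = sum(
        pmi_table.get((instance, p), 0.0) / max_pmi * r_pattern[p]
        for p in patterns
    )
    return total / len(patterns)

# Invented toy data: two seed instances (reliability 1) and one pattern.
instances = ["inst_a", "inst_b"]
patterns = ["pat_1"]
pmi_table = {("inst_a", "pat_1"): 2.0, ("inst_b", "pat_1"): 1.0}
max_pmi = max(pmi_table.values())
r_instance = {"inst_a": 1.0, "inst_b": 1.0}

r_pattern = {
    p: pattern_reliability(p, instances, pmi_table, r_instance, max_pmi)
    for p in patterns
}
r_instance_new = {
    i: instance_reliability(i, patterns, pmi_table, r_pattern, max_pmi)
    for i in instances
}
```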
11/20
Grouping events
● Relieves sparsity issues to some extent by sharing statistics among the events in a single group
● Allows humans to understand the events better, highlighting the important events shared by many people
● Uses hierarchical agglomerative clustering with the single-linkage criterion and cosine similarity as the similarity measure
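A minimal sketch of single-linkage agglomerative grouping with cosine similarity is given below. It makes several assumptions not stated in the talk: events are represented as bag-of-words count vectors, merging stops once the best similarity falls below a hypothetical `threshold` of 0.5, and the example events are invented.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def single_linkage(events, threshold=0.5):
    """Greedily merge the two closest clusters until none is similar enough."""
    vectors = [Counter(e.lower().split()) for e in events]
    clusters = [[i] for i in range(len(events))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster similarity is that of the closest pair.
                sim = max(cosine(vectors[i], vectors[j])
                          for i in clusters[a] for j in clusters[b])
                if sim > best:
                    best, pair = sim, (a, b)
        if best < threshold:
            break
        a, b = pair
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [[events[i] for i in cluster] for cluster in clusters]

groups = single_linkage([
    "i meet my friend",
    "i meet my old friend",
    "tomorrow is monday",
])
```

Here the first two events share most of their words and end up in one group, while the third stays on its own.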
Experiments
● Data corpus: 30 million tweets from Neubig and Duh (2013) [1]
● Tweet normalization following Han et al. (2012) [2]
● The Stanford parser [3] was used to ensure that each event is a sentence
[1] Graham Neubig, Kevin Duh. How Much is Said in a Tweet? A Multilingual, Information-theoretic Perspective in AAAI Sprin
[2] Han et al. Automatically Constructing a Normalisation Dictionary for Microblogs in EMNLP 2012
[3] http://nlp.stanford.edu/software/lex-parser.shtml
13/20
Experiments
● 6 basic emotion classes defined by Ekman [1]:
  – Anger: angry, mad
  – Disgust: disgusted, terrible
  – Fear: afraid, scared
  – Happiness: happy, glad
  – Sadness: sad, upset
  – Surprise: surprised, astonished
[1] Ekman, P.: Universals and cultural differences in facial expressions of emotions. Nebraska Symposium on Motivation 19, 207–283 (1972)
14/20
Experiments
● We start the system with the seed instances collected by the pattern: “I be EMOTION that EVENT”
● The reliability of each seed instance is 1.
● Stopping criterion: a limit on the number of iterations.
15/20
Result
● Happiness: 14027 events
● Sadness: 3909 events
● Fear: 8798 events
● Anger: 2133 events
● Surprise: 2466 events
● Disgust: 26 events
16/20
Result
● Some new patterns:
I feel EMOTION when EVENT
I be EMOTION because EVENT
I be EMOTION EVENT
I get so EMOTION when EVENT
Make me EMOTION when EVENT
Get really EMOTION that EVENT
Be really EMOTION to hear that EVENT
Be EMOTION to know that EVENT
EMOTION at the fact that EVENT
be EMOTION to death that EVENT
…
17/20
Evaluation
● Using Mean Reciprocal Rank (MRR):
$$\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$$
Predicted                    | Human annotation | Rank | Reciprocal rank
Happiness, Surprise, Sadness | Surprised        | 2    | 1/2
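The MRR computation can be sketched in a few lines. The function name and the convention that a gold label missing from the predicted list contributes 0 are assumptions; the single query in the usage example mirrors the slide's example, where the annotated emotion is ranked second.

```python
def mean_reciprocal_rank(ranked_predictions, gold_labels):
    """Average of 1 / rank of the gold label in each ranked prediction list."""
    total = 0.0
    for predictions, gold in zip(ranked_predictions, gold_labels):
        if gold in predictions:
            # list.index is 0-based, ranks are 1-based.
            total += 1.0 / (predictions.index(gold) + 1)
        # Assumption: a gold label that was never predicted contributes 0.
    return total / len(gold_labels)

score = mean_reciprocal_rank(
    [["happiness", "surprise", "sadness"]],
    ["surprise"],
)
```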
18/20
Evaluation
● Measuring recall
  – Asking 30 people about 5 events that provoke each of five emotions
Emotions  | Events
happiness | meeting friends; buying/getting something I want; going on a date
sadness   | a plan gets cancelled; someone dies/gets sick; failing a test
anger     | someone breaks a promise; someone insults me; someone breaks something of mine
fear      | getting a sudden phone call; seeing an insect; walking at night
surprise  | seeing a friend unexpectedly; seeing a car suddenly appear; hearing a loud noise
19/20
Evaluation
● Evaluation of emotion-provoking events
● Human evaluation on the top 100 groups.
Methods               | MRR  | Recall
Seed                  | 51.8 | 5.21
Seed + clustering     | 66.1 | 9.40
Espresso              | 51.5 | 8.55
Espresso + clustering | 74.7 | 16.2
Emotions  | MRR  | Recall
Happiness | 100  | 26.9
Sadness   | 82.3 | 10.0
Anger     | 82.4 | 15.8
Fear      | 46.3 | 27.3
Surprise  | 58.3 | 0.0
20/20
Discussion
● Recall is still relatively low
● Events extracted from Twitter were somewhat biased towards everyday events or events regarding love and dating
● For surprise, we did not manage to extract any of the events created by the annotators
21/20
In Conclusion
● This work focuses on acquiring emotion-provoking events
● The Espresso algorithm is used to learn patterns and extract events; similar events are then grouped to create a dictionary.
● Paper submitted to EACL 2014
22/20
Thank you very much
23/20