Week 10 Lecture 1: Recognizing Sarcasm with a Bootstrapping Approach

Computational Models of Discourse Analysis
Carolyn Penstein Rosé
Language Technologies Institute / Human-Computer Interaction Institute
Warm Up Discussion

Look at the analysis I have passed out.

Note: inscribed sentiment is underlined, invoked sentiment is italicized, and relatively frequent words that appear in either of these types of expressions have been marked in bold.

Do you see any sarcastic comments here? What, if any, connection do you see between sentiment and sarcasm?

Keeping in mind the style of templates you read about, do you see any snippets of text in these examples that you think would make good templates?
Patterns I see

Inscribed sentiment
- About 24% of words in underlined segments are relatively high frequency
- 3 “useful” patterns out of 18 underlined portions
- Examples:
  - Like
  - Good
  - More CW than CW

Invoked sentiment
- About 39% of words were relatively high frequency
- About 7 possibly useful patterns out of 17, but only 3 really look unambiguous
- Examples:
  - CW like one
  - CW the CW of the CW
  - Makes little CW to CW to the CW
  - CW and CW of an CW
  - Like CW on CW
  - Leave you CW a little CW and CW
  - CW more like a CW
Unit 3 Plan

The 3 papers we will discuss all give ideas for using context (at different grain sizes):
- Local: patterns without syntax, using bootstrapping
- Local: patterns with syntax, using a parser
- Rhetorical: patterns within documents, using a statistical modeling technique

The first two papers introduce techniques that could feasibly be used in your Unit 3 assignment.
Student Comment: Point of Discussion

To improve performance, language technologies seem to approach the task in one of two ways. First, some approaches attempt to generate a better abstract model that provides the translation mechanism between a string of terms (a sentence) and our human mental model of sentiment in language. Alternatively, some start with a baseline and try to find a corpus or dictionary of terms that provides evidence for sentiment.

Please clarify.
Connection between Appraisal and Sarcasm
A sarcastic example of invoked negative sentiment from Martin and White, p. 72

Student Comment: I’m not exactly sure how one would go about applying appraisal theory to something as elusive as sarcasm.
Inscribed versus Invoked

Do we see signposts that tell us how to interpret invoked appraisals?
Overview of Approach

- Start with a small amount of labeled data
- Generate patterns from examples
- Select those that appear in training data more than once and don’t appear in both a 1-labeled and a 5-labeled example
- Expand data through search, using examples from labeled data as queries (take the top 50 snippet results)
- Represent data in terms of templatized patterns
- Modified kNN classification approach

How could you do this with SIDE?
1. Build a feature extractor to generate the set of patterns
2. Use search to set up an expanded set of data
3. Apply the generated patterns to the expanded set of data
4. Use kNN classification

(A skeleton of the full pipeline is sketched below.)
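To make the flow of these steps concrete, here is a minimal Python skeleton of the pipeline. It is not the authors' code: every function passed in (`generate_patterns`, `select_patterns`, `vectorize`, `search_fn`) is a hypothetical placeholder for one of the steps above, several of which are sketched under the slides that follow.

```python
def sarcasm_bootstrap(seed_texts, seed_labels,
                      generate_patterns, select_patterns,
                      vectorize, search_fn):
    """Skeleton of the bootstrapping approach; all function arguments
    are placeholders for the individual steps listed on the slide."""
    # 1-2. Generate candidate patterns from the small labeled seed set,
    #      then keep only those that pass the selection rule above.
    patterns = select_patterns(generate_patterns(seed_texts),
                               seed_texts, seed_labels)

    # 3. Expand the corpus through search: each labeled example becomes
    #    a query, and the top 50 snippet results are kept.
    corpus = list(seed_texts)
    for text in seed_texts:
        corpus.extend(search_fn(text)[:50])

    # 4. Represent every text as a vector over the templatized patterns.
    vectors = [vectorize(text, patterns) for text in corpus]

    # 5. The vectors then feed a modified kNN classifier
    #    (see the Modified kNN slide below).
    return vectors, patterns
```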
Pattern Generation

- Classify words into high-frequency words (HFW) versus content words (CW)
  - HFWs occur at least 100 times per million words
  - CWs occur no more than 1000 times per million words
- Also add [product], [company], [title] as additional HFWs
- Constraints on patterns: 2-6 HFWs, 1-6 slots for CWs, and patterns must start and end with HFWs (see the sketch below)

Would Appraisal theory suggest other categories of words?
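A minimal sketch of this pattern-generation step in Python, using the thresholds from the slide. Note that a word whose frequency falls between the two thresholds could serve as either an HFW or a CW; for simplicity, this sketch labels any such word an HFW.

```python
HFW_MIN = 100    # HFWs: at least 100 occurrences per million words
CW_MAX = 1000    # CWs: at most 1000 occurrences per million words
EXTRA_HFWS = {"[product]", "[company]", "[title]"}

def candidate_patterns(tokens, freq_per_million):
    """Enumerate templatized patterns from one tokenized sentence:
    2-6 HFWs, 1-6 CW slots, starting and ending with an HFW.
    HFW tokens are kept verbatim; CW tokens become the 'CW' slot."""
    labels = ["HFW" if (t in EXTRA_HFWS or
                        freq_per_million.get(t, 0) >= HFW_MIN)
              else "CW" for t in tokens]
    patterns = set()
    for i, li in enumerate(labels):
        if li != "HFW":
            continue                      # patterns must start with an HFW
        for j in range(i + 1, len(tokens)):
            if labels[j] != "HFW":
                continue                  # ...and end with an HFW
            window = labels[i:j + 1]
            if 2 <= window.count("HFW") <= 6 and 1 <= window.count("CW") <= 6:
                patterns.add(" ".join(
                    tok if lab == "HFW" else "CW"
                    for tok, lab in zip(tokens[i:j + 1], window)))
    return patterns
```

The double loop simply enumerates every span that starts and ends with an HFW and checks the slot constraints before templatizing it; templates like “More CW than CW” above have exactly this shape.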
Expand Data: “Great for Insomniacs…”
What could they have done instead?
Pattern Selection
- Approach: select those that appear in training data more than once and don’t appear in both a 1-labeled and a 5-labeled example (sketched below)
- Could have used an attribute selection technique like chi-squared attribute evaluation
- What do you see as the trade-offs between these approaches?
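A minimal sketch of the paper's selection rule, assuming pattern occurrences have already been paired with the star label of the example they came from:

```python
from collections import Counter, defaultdict

def select_patterns(occurrences):
    """occurrences: list of (pattern, star_label) pairs, one per
    appearance of a pattern in a labeled training example. Keep patterns
    that appear more than once and never in both a 1-star and a
    5-star example."""
    counts = Counter(pattern for pattern, _ in occurrences)
    stars = defaultdict(set)
    for pattern, label in occurrences:
        stars[pattern].add(label)
    return {p for p in counts
            if counts[p] > 1 and not {1, 5} <= stars[p]}
```

By contrast, chi-squared attribute evaluation (e.g., `sklearn.feature_selection.chi2`) would score every pattern and keep the top-ranked ones: a tunable, ranked cutoff instead of the paper's hard consistency rule, at the cost of possibly admitting patterns that occur with both extreme labels.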
Representing Data as a Vector


- Most of the features were from the generated patterns
- Also included punctuation-based features: number of !, number of ?, number of quotes, number of capitalized words (see the sketch below)

What other features would you use? What modifications to feature weights would you propose?
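A sketch of the punctuation-based features from the slide; counting only double quotes and treating any word with an initial capital as “capitalized” are simplifying assumptions of this sketch. In the full representation, these counts would sit alongside the pattern-match features.

```python
def punctuation_features(text):
    """Punctuation-based features listed on the slide: counts of '!',
    '?', quotes, and capitalized words."""
    words = text.split()
    return {
        "num_exclamation": text.count("!"),
        "num_question": text.count("?"),
        "num_quotes": text.count('"'),  # assumption: double quotes only
        "num_capitalized": sum(1 for w in words if w[:1].isupper()),
    }
```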
Modified kNN

Is there a simpler approach?

Weighted average, so majority-class matches count more (sketched below).
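A sketch of one way to read the slide's weighting idea: each of the k nearest neighbors votes with its label, weighted by how many of those k neighbors share that label, so matches from the majority class count more. The paper's exact weighting scheme may differ.

```python
import numpy as np
from collections import Counter

def weighted_knn_label(train_X, train_y, x, k=5):
    """Return a weighted-average label for x: each of the k nearest
    neighbors is weighted by the number of neighbors sharing its label."""
    dists = np.linalg.norm(np.asarray(train_X) - np.asarray(x), axis=1)
    nearest = np.argsort(dists)[:k]
    label_counts = Counter(train_y[i] for i in nearest)
    weights = [label_counts[train_y[i]] for i in nearest]
    labels = [train_y[i] for i in nearest]
    return sum(w * y for w, y in zip(weights, labels)) / sum(weights)
```

A plain majority vote over the k neighbors would be the simpler approach the slide asks about; the weighted average instead yields a graded score between the star labels.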
Evaluation

Student comment: I am … rather wary of the effectiveness of their approach because it seems that they cherry-picked a heuristic ‘star-sentiment’ baseline to compare their results to in Table 3, but do not offer a similar baseline for Table 2.

Baseline technique: count as positive examples those that have a highly negative star rating but lots of positive words (a sketch follows below).

Is this really a strong baseline? Look at the examples from the paper.
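A minimal sketch of that heuristic baseline; the specific thresholds (at most 2 stars, at least 3 positive words) are illustrative assumptions, not values from the paper.

```python
def star_sentiment_baseline(star_rating, num_positive_words,
                            min_positive=3):
    """Flag a review as sarcastic (a positive example) when its star
    rating is highly negative but it contains lots of positive
    sentiment words. Thresholds here are illustrative only."""
    return star_rating <= 2 and num_positive_words >= min_positive
```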
Evaluation

What do you conclude from this?

What surprises you?
Revisit: Overview of Approach

- Start with a small amount of labeled data
- Generate patterns from examples
- Select those that appear in training data more than once and don’t appear in both a 1-labeled and a 5-labeled example
- Expand data through search, using examples from labeled data as queries (take the top 50 snippet results)
- Represent data in terms of templatized patterns
- Modified kNN classification approach

How could you do this with SIDE?
1. Build a feature extractor to generate the set of patterns
2. Use search to set up an expanded set of data
3. Apply the generated patterns to the expanded set of data
4. Use kNN classification
What would it take to achieve inter-rater reliability?

You can find definitions and examples on the website, just like in the book, but it’s not enough…

Strategies (are there distinctions that don’t buy us much anyway?):
- Add constraints
- Identify borderline cases
- Use decision trees
- Simplify
What would it take to achieve inter-rater reliability?

Look at Beka and Elijah’s analyses in comparison with mine:
- What were our big disagreements?
- How would we resolve them?

(One standard way to quantify our agreement is sketched below.)
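As a companion to this discussion, here is a minimal Python sketch of Cohen's kappa, a standard measure of inter-rater agreement that corrects raw agreement for chance; the lecture does not prescribe this particular statistic.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items: observed
    agreement corrected for the agreement expected by chance, given
    each annotator's label distribution."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c]
                   for c in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - expected) / (1 - expected)
```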
Questions?