Lexical and semantic selection
Options for grammar engineers and what they might mean linguistically

Outline and acknowledgements

1. Selection in constraint-based approaches
   i. types of selection and overview of methods used in LKB/ERG
   ii. denotation
2. The collocation problem
   i. collocation in general
   ii. corpus data on magnitude adjectives
   iii. possible accounts
3. Conclusions

Acknowledgements: LinGO/DELPH-IN, especially Dan Flickinger; also Generative Lexicon 2005

1(i): Types of grammatical selection

• syntactic: e.g., the preposition among selects for an NP (like other prepositions)
• lexical: e.g., spend selects for a PP headed by on
  • Kim spent the money on a car
• semantic: e.g., temporal at selects for times of day (and meals)
  • at 3am
  • at three thirty five and ten seconds precisely

Lexical selection

• lexical selection requires a method of specifying a lexeme
• in the ERG, this is via the PRED value
• spend (e.g., spend the money on Kim):

spend_v2 := v_np_prep_trans_le &
 [ STEM < "spend" >,
   SYNSEM [ LKEYS [ --OCOMPKEY _on_p_rel,
                    KEYREL.PRED "_spend_v_rel" ]]].
Lexical selection

ERG relies on convention that different lexemes have
different relations




`lexical’ selection is actually semantic. cf Wechsler
no true synonyms assumption, or assume that grammar
makes distinctions that are more fine-grained than realworld denotation justifies.
near-synonymy would have to be recorded elsewhere: ERG
does (some) morphology, syntax and compositional
semantics
alternatives?



orthography: but ambiguity or non-monotonic semantics
lexical identifier: requires new feature
PFORM: requires features, values

Semantic selection

• requires a method of specifying a semantically-defined phrase
• in the ERG, done by specifying a higher node in the hierarchy of relations:

at_temp := p_temp_le &
 [ STEM < "at" >,
   SYNSEM [ LKEYS [ --COMPKEY hour_or_time_rel,
                    KEYREL.PRED _at_p_temp_rel ]]].

Hierarchy of relations

[figure: hierarchy of relations]
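
A toy model of how selection via the hierarchy works (a sketch in Python; only hour_or_time_rel comes from the at_temp entry above, and the other relation names and the flat parent table are invented for illustration): a --COMPKEY constraint is satisfied by any relation the specified node subsumes.

# Invented fragment of a relation hierarchy (child -> parent):
PARENT = {
    "numbered_hour_rel": "hour_or_time_rel",  # e.g., "3am"
    "meal_rel": "hour_or_time_rel",           # e.g., "lunch" (at lunch)
    "hour_or_time_rel": "time_rel",
    "day_rel": "time_rel",                    # e.g., "Tuesday" (*at Tuesday)
}

def subsumes(ancestor, rel):
    # True if `ancestor' is `rel' itself or dominates it in the hierarchy.
    while rel is not None:
        if rel == ancestor:
            return True
        rel = PARENT.get(rel)
    return False

# temporal `at' selects for hour_or_time_rel:
print(subsumes("hour_or_time_rel", "numbered_hour_rel"))  # True: at 3am
print(subsumes("hour_or_time_rel", "day_rel"))            # False: *at Tuesday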

Semantic selection

• semantic selection allows for an indefinitely large set of alternative phrases
  • compositionally constructed time expressions
  • productive with respect to new words, but exceptions allowable
    • approach wouldn't be falsified if, e.g., *at tiffin
• ERG lexical selection is a special case of ERG semantic selection!
• could assume featural encoding of semantic properties (alternatively, or in addition to, the hierarchy)
• TFS semantic selection is relatively limited practically (see later)
• also an idiom mechanism in ERG

1(ii): Denotation, grammar engineering perspective

• Denotation is truth-conditional, logically formalisable (in principle), refers to the `real world' (extension)
  • not necessarily decomposable
  • naive physics, biology, etc.
  • must interface with non-linguistic components
• Minimising lexical complexity in broad-coverage grammars is practically necessary
• Plausible input to generator:
  • reasonable to expect real-world constraints to be obeyed (except in context)
    • the goat read the book
• Potential disambiguation is not a sufficient condition for lexical encoding
  • The vet treated the rabbit and the guinea pig with dietary Vitamin C deficiency

Denotation, continued

• Assume linkage to a domain, with a richer knowledge representation language available
• TFS language is for syntax etc., not intended for general inference
• Talmy example: the baguette lay across the road
  • across: Figure's length > Ground's width
  • identifying F and G and the location for comparison in the grammar?
  • coding the average length of all nouns?
  • allowing for massive baguettes and tiny roads?

But ...

• Trend in KR is towards description logics rather than richer languages.
• Need to think about the denotation to justify grammaticization (or otherwise)
• Linguistic criteria: denotation versus grammaticization?
  • if temporal in/on/at have the same denotation, a selectional account is required for the different distribution
  • unreasonable to expect lexical choice for in/on/at in input to a generator
  • effect found cross-linguistically?
  • predictable on the basis of world knowledge?
  • closed class vs open class
• Practical considerations about interfacing go along with linguistic criteria
  • non-linguists expect some information about word meaning!
  • allow generalisation over e.g., in/on/at in generator input, while keeping the possibility of a distinction

2(i) Collocation: assumptions

• Significant co-occurrences of words in syntactically interesting relationships
  • `syntactically interesting': for the examples in this talk, attributive adjectives and the nouns they immediately precede
  • `significant': statistically significant (but on what assumptions about the baseline? one candidate measure is sketched below)
• Compositional, no idiosyncratic syntax etc. (as opposed to multiword expressions)
• About language rather than the real world
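
One way the `significant' notion is often operationalised (a sketch, not from the talk; the counts passed in below are invented for illustration) is pointwise mutual information, which compares the observed pair frequency with an independence baseline:

import math

def pmi(pair_count, adj_count, noun_count, total_pairs):
    # PMI of an adjective-noun pair against the baseline assumption
    # that adjective and noun co-occur independently.
    p_pair = pair_count / total_pairs
    p_adj = adj_count / total_pairs
    p_noun = noun_count / total_pairs
    return math.log2(p_pair / (p_adj * p_noun))

# Invented counts (not BNC figures): a large positive PMI marks a
# candidate collocation such as heavy rain.
print(pmi(pair_count=198, adj_count=5000, noun_count=2000, total_pairs=10000000))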

Collocation versus denotation

• Whether an unusually frequent word pair is a collocation or not depends on assumptions about denotation: fix denotation to investigate collocation
• Empirically: investigations using WordNet synsets (Pearce, 2001)
• Anti-collocation: words that might be expected to go together but tend not to
  • e.g., flawless behaviour (Cruse, 1986); big rain (unless explained by denotation)
  • e.g., buy house is predictable on the basis of denotation, shake fist is not

2(ii): Distribution of `magnitude' adjectives

• some very frequent adjectives have magnitude-related meanings (e.g., heavy, high, big, large)
• basic meaning with simple concrete entities
• extended meaning with abstract nouns and non-concrete physical entities (high taxation, heavy rain)
• extended uses more common than basic
• not all magnitude adjectives (e.g., tall)
• nouns tend to occur with a limited subset of these extended adjectives
• some apparent semantic groupings of nouns which go with particular adjectives, but not easily specified

Some adjective-noun frequencies in the BNC

        number  proportion  quality  problem  part  winds  rain
large     1790         404        0       10   533      0     0
high        92         501      799        0     3     90     0
big         11           1        0       79    79      3     1
heavy        0           0        1        0     1      2   198

Grammaticality judgments

[table: intuitive grammaticality judgments (? = marginal, * = ungrammatical) for large, high, heavy and big with number, proportion, quality, problem, part, winds and rain]

More examples

        importance  success  majority  number  proportion  quality  role  problem  part  winds  support  rain
great          310      360       382     172           9       11     3       44    71      0       22     0
large            1        1       112    1790         404        0    13       10   533      0        1     0
high             8        0         0      92         501      799     1        0     3     90        2     0
major           62       60         0       0           7        0   272      356   408      1        8     0
big              0       40         5      11           1        0     3       79    79      3        1     1
strong           0        0         2       0           0        1     8        0     3    132      147     0
heavy            0        0         1       0           0        1     0        0     1      2        4   198

Judgments

[table: intuitive grammaticality judgments (? = marginal, * = ungrammatical) for great, large, high, major, big, strong and heavy with importance, success, majority, number, proportion, quality, role, problem, part, winds, support and rain]

Distribution

• Investigated the distribution of heavy, high, big, large, strong, great, major with the most common co-occurring nouns in the BNC
• Nouns tend to occur with up to three of these adjectives with high frequency, and with low or zero frequency with the rest
• My intuitive grammaticality judgments correlate, but allow some unseen combinations and disallow a few observed but very infrequent ones
• big, major and great are grammatical with many nouns (but not frequent with most); strong and heavy are ungrammatical with most nouns; high and large are intermediate

heavy: groupings?

magnitude: dew, rainstorm, downpour, rain, rainfall, snowfall, fall, snow, shower; frost, spindrift; clouds, mist, fog; flow, flooding, bleeding, period, traffic; demands, reliance, workload, responsibility, emphasis, dependence; irony, sarcasm, criticism; infestation, soiling; loss, price, cost, expenditure, taxation, fine, penalty, damages, investment; punishment, sentence; fire, bombardment, casualties, defeat, fighting; burden, load, weight, pressure; crop; advertising; use, drinking
magnitude of verb: drinker, smoker
magnitude related? odour, perfume, scent, smell, whiff; lunch; sea, surf, swell

high: groupings?

magnitude: esteem, status, regard, reputation, standing, calibre, value, priority; grade, quality, level; proportion, degree, incidence, frequency, number, prevalence, percentage; volume, speed, voltage, pressure, concentration, density, performance, temperature, energy, resolution, dose, wind; risk, cost, price, rate, inflation, tax, taxation, mortality, turnover, wage, income, productivity, unemployment, demand
magnitude of verb: earner

heavy and high

• 50 nouns in the BNC occur with the extended magnitude use of heavy with frequency 10 or more
• 160 such nouns with high
• only 9 such nouns with both adjectives: price, pressure, investment, demand, rainfall, cost, costs, concentration, taxation

2(iii): Possible empirical accounts of distribution

1. Difference in denotation between `extended' uses of adjectives
2. Grammaticized selectional restrictions/preferences
3. Lexical selection
   • stipulate a Magn function with nouns (Meaning-Text Theory)
4. Semi-productivity / collocation
   • plus semantic back-off

1 - Denotation account of distribution

• Denotation of the adjective simply prevents it being possible with the noun. Implies that heavy and high have different denotations:

heavy'(x) => MF(x) > norm(MF, type(x), c) & (precipitation(x) or cost(x) or flow(x) or consumption(x) or ...)
(where rain(x) -> precipitation(x) and so on)

• But: messy disjunction or multiple senses; open-ended, unlikely to be tractable
  • e.g., heavy shower only for the rain sense, not the bathroom sense
• Not falsifiable, but no motivation other than distribution
• Dictionary definitions can be seen as doing this (informally), but none account for the observed distribution
• Input to generator?

2 - Selectional restrictions and distribution

• Assume the adjectives have the same denotation
• Distribution via features in the lexicon
  • e.g., literal high selects for [ANIMATE false]
  • cf. the approach used in the ERG for in/on/at in temporal expressions
  • grammaticized, so doesn't need to be determined by denotation (though assume consistency)
  • could utilise qualia structure
• Problem: can't find a reasonable set of cross-cutting features!
• Stipulative approach possible, but unattractive.

3 - Lexical selection

• MTT approach: the noun specifies its Magn adjective
  • in Mel'čuk and Polguère (1987), Magn is a function, but it could be modified to return a set, or to vary meanings (a sketch follows)
  • could also make the adjective specify a set of nouns, though not directly in the LKB logic
  • stipulative: if we're going to do this, why not use a corpus directly?
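
A minimal sketch of the set-valued variant (the entries below are invented for illustration, not MTT data): each noun lists the adjectives that can realise Magn for it.

# Set-valued Magn lexical function (invented entries):
MAGN = {
    "rain":   {"heavy"},
    "winds":  {"high", "strong"},
    "number": {"large"},
    "esteem": {"high"},
}

def magn(noun):
    # Adjectives realising `intense degree' (Magn) for this noun.
    return MAGN.get(noun, set())

print(magn("winds"))      # {'high', 'strong'} (set order may vary)
print(magn("spindrift"))  # set(): unlisted nouns need back-off (see later)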

4 - Collocational account of distribution

• all the adjectives share a denotation corresponding to magnitude; distribution differences are due to collocation, soft rather than hard constraints
• linguistically:
  • adjective-noun combination is semi-productive
  • denotation and syntax allow heavy esteem etc., but speakers are sensitive to frequencies and prefer more frequent phrases with the same meaning
  • cf. morphology and sense extension: Briscoe and Copestake (1999); blocking (but weaker than with morphology)
  • anti-collocations as a reflection of semi-productivity

Collocational account of distribution

• computationally, fits with some current practice:
  • filter adjective-noun realisations according to n-grams (statistical generation, e.g., Langkilde and Knight; recent experiments with ERG) (a sketch follows)
  • use of co-occurrences in WSD
• back-off techniques
• requires an approach to clustering
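
A sketch of the n-gram filter idea (the bigram counts below are invented for illustration; a real system would use corpus-derived statistics): when the grammar licenses several realisations with the same meaning, prefer the one the corpus supports.

# Invented bigram counts:
BIGRAM = {("heavy", "rain"): 198, ("high", "rain"): 0,
          ("high", "winds"): 90, ("heavy", "winds"): 2}

def best_realisation(candidates):
    # Pick the adjective-noun realisation with the highest bigram count.
    return max(candidates, key=lambda pair: BIGRAM.get(pair, 0))

# Generator input just asks for magnitude(rain); the filter picks the phrasing:
print(best_realisation([("heavy", "rain"), ("high", "rain"), ("large", "rain")]))
# ('heavy', 'rain')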

semantic spaces

• acquired from corpora
• generally, collect vectors of words which co-occur with the target (a sketch follows the matrix below)
• best known is LSA: often used in psycholinguistics
• more sophisticated models incorporate syntactic relationships
• currently sexy, but severe limitations!

       dog  bark  house  cat
dog      -     1      0    0
bark     1     -      0    0
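
A minimal co-occurrence-vector sketch in the spirit of the matrix above (all counts invented): represent each word by its co-occurrence counts over some context words, and compare words by the cosine of the angle between their vectors.

import math

# Invented co-occurrence counts over the context words
# [bark, walk, pet, roof, garden]:
VEC = {
    "dog":   [5, 4, 6, 0, 1],
    "cat":   [0, 1, 7, 1, 2],
    "house": [0, 0, 0, 6, 5],
}

def cosine(u, v):
    # Cosine similarity between two co-occurrence vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(VEC["dog"], VEC["cat"]))    # relatively high: similar distributions
print(cosine(VEC["dog"], VEC["house"]))  # low: dissimilar distributions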
Back-off and analogy


back-off: decision for infrequent noun with no corpus
evidence for specific magnitude adjective
should be partly based on productivity of adjective:
number of nouns it occurs with


default to big
back-off also sensitive to word clusters



e.g., heavy spindrift because spindrift is semantically similar
to snow
semantic space models: i.e., group according to distribution
with other words
hence, adjective has some correlation with semantics of the
noun
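
Putting the two ideas together (a sketch; the similarity scores would come from a semantic space model such as the cosine measure above, but here they are invented stubs): copy the adjective of the nearest seen noun, and with no sufficiently close neighbour fall back to the most productive adjective, big.

# Seen noun -> attested magnitude adjective (invented):
SEEN_ADJ = {"snow": "heavy", "esteem": "high", "number": "large"}

# Stub semantic-space similarities (invented):
SIMILARITY = {("spindrift", "snow"): 0.8, ("spindrift", "esteem"): 0.1,
              ("spindrift", "number"): 0.05}

def choose_adjective(noun, threshold=0.5):
    # Back off to the nearest seen noun's adjective, else to `big'.
    if noun in SEEN_ADJ:
        return SEEN_ADJ[noun]
    best = max(SEEN_ADJ, key=lambda seen: SIMILARITY.get((noun, seen), 0.0))
    if SIMILARITY.get((noun, best), 0.0) >= threshold:
        return SEEN_ADJ[best]
    return "big"

print(choose_adjective("spindrift"))  # 'heavy', by analogy with snow
print(choose_adjective("zeppelin"))   # 'big': no close neighbour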

Metaphor

• Different metaphors for different nouns (cf. Lakoff et al.)
  • `high' nouns are measured with an upright scale: e.g., temperature: temperature is rising
  • `heavy' nouns are metaphorically like burdens: e.g., workload: her workload is weighing on her
• Doesn't lead to an empirical account of distribution, since we can't predict the classes. The assumption of literal denotation followed by coercion is implausible.
• But: the extended metaphor idea is consistent with the idea that clusters for back-off are based on semantic space

Collocation and linguistic theory

• Collocation plus semantic space clusters may account for some of the `messy' bits, at least for some speakers.
• in/on transport: in the car, on the bus
  • Talmy: presence of a walkway, `ragged lower end of hierarchy'
  • but trains without walkways, caravans with walkways?
  • in/on choice perhaps collocational, not a real exception to language-independent schema elements
• Potential to simplify linguistic theories considerably.
• Success of n-grams, LSA models of priming.
• Practically testable: assume the same denotation of heavy/high or in/on, and see if we can account for the distribution in a corpus.
• Alternative for temporal in/on/at?
  • Experiments with machine learning of temporal in/on/at (Mei Lin, MPhil thesis, 2004): very successful at predicting distribution, but used lots of Treebank-derived features.

Summary

• Selection in ERG
  • other aspects of ERG selection not described here: multiword expressions and idioms
• Collocational models as an adjunct to TFS encoding
• Role of denotation is crucial
• Practical considerations about grammar usability

Final remarks

• Grammar usability:
  • a good broad-coverage grammar should have an account of the denotation of closed-class words at least, but probably not within the TFS encoding
  • can we use semantic web languages for non-domain-specific encoding?
• Collocational techniques require much further investigation
• Can semantic space models be related to denotation (e.g., by somehow excluding the collocational component)?

Idioms

Idiom entry:

stand+guard := v_nbar_idiom &
 [ SYNSEM.LOCAL.CONT.RELS
   <! [ PRED "_stand_v_i_rel" ],
      [ PRED "_guard_n_i_rel" ] !> ].

Idiomatic lexical entries:

guard_n1_i := n_intr_nospr_le &
 [ STEM < "guard" >,
   SYNSEM [ LKEYS.KEYREL.PRED "_guard_n_i_rel" ]].

stand_v1_i := v_np_non_trans_idiom_le &
 [ STEM < "stand" >,
   SYNSEM [ LKEYS.KEYREL.PRED "_stand_v_i_rel" ]].

Idioms in ERG/LKB

• Account based on Wasow et al. (1982), Nunberg et al. (1994).
• Idiom entry specifies a set of coindexed MRS relations (coindexation specified by the idiom type, e.g., v_nbar_idiom)
• Relations may correspond to idiomatic lexical entries (but some may be literal uses: e.g., cat out of the bag, with literal out of the)
• Idiom is recognised if some phrase matches the idiom entry
• Allows for modification: e.g., stand watchful guard

Messy examples

• among: requires group or plural or ...?
  • among the family (BNC)
  • among the chaos (BNC)
• between: requires a plural denoting two objects, but not a group (?)
  • fudge sandwiched between sponge (BNC)
  • between each tendon (BNC)
  • ? the actor threw a dart between the couple
  • * the actor threw a dart between the audience (even if there are only two people in the audience)