Lexical Patterns: from Hornby to Hunston and beyond

Patrick Hanks
Faculty of Informatics, Masaryk University,
Brno, Czech Republic
hanks@fi.muni.cz
Afrilex, Stellenbosch June 30, 2008
1
Outline of the talk
• Palmer and Hornby (1942)
– the recognition of lexical patterns
– the dictionary as an aid to idiomatic and productive use of language.
• Refinements to OALD (Hornby 1964; Cowie 1989; Crowther
1995; Wehmeier 2000)
• The impact of corpus linguistics on lexicography
– The role of collocations (Sinclair)
• Verb patterns and sentence meaning; implicatures
– Pattern grammar contrasted with pattern dictionary
• Where do we go from here?
2
A. S. Hornby
1923: started as a teacher of English in Japan
1931: joined H. E. Palmer at the Tokyo Institute for Research into English
Teaching
1936: became head of research at the Institute
1941: Hornby repatriated in World War II; joined the British Council
1942: Idiomatic and Syntactic Dictionary (ISED), by Hornby, Gatenby, and
Wakefield, published in Japan by Kaitakusha
1948: ISED re-published (unchanged) by OUP as A Learner’s Dictionary of
English (LDE)
1954: A Guide to Patterns and Usage in English (lexically based)
1960: title of LDE changed to Advanced Learner’s Dictionary of Current
English
1963: Second edition of ALD (Hornby alone).
1974: Third edition of OALD (edited with A. P. Cowie)
1978: Death of A. S. Hornby.
3
Some insights of Palmer and Hornby
• Learners need idiomatic phraseology, not etymology
• English grammar is not Latin grammar
– E.g. Palmer identified ‘determinatives’ (determiners), adverbial particles,
‘anomalous finite verbs’ (auxiliaries, modals, and verbal pro-forms), as well as
classes inherited from Latin grammar (nouns, verbs, adjectives, adverbs)
– Palmer and Hornby also identified clause structure elements – for example
subject complements and object complements (He is mad & linguistics drove
him mad; he is the editor & they appointed him editor)
• Language in use is structured around the lexical patterns of
verbs
4
Hornby’s second edition (1963)
• Huge increase in the number of lexical items
– Many of them more useful for receptive (reading) than productive
(writing) use.
• Less user friendly than ISED
– influenced by the Concise Oxford Dictionary, 4th edition, 1951
• ‘Nested’ subentries with swung dashes
– E.g. blackbird became “~bird” under the main entry black
• See Cowie (1999), English Dictionaries for Foreign Learners:
A History, for more details
• Some of these editorial policies were reversed in OALD6
(2000).
5
Hornby’s original verb patterns
(1942)
• 25 patterns were identified.
– Succinct, but still hard for a learner to use easily.
– Only two subdivisions: transitive vs. intransitive.
– 14 of the 25 verb patterns take clausal complements:
• E.g. Pattern 5 (S V O Inf): I made him do it.
• Separating out the clausals makes things a lot simpler.
– Some very subtle (unnecessary?) distinctions.
– For details, see the version of this paper in the Proceedings.
• In the 4th edition (1989), A. P. Cowie introduced a more user-friendly organization of patterns.
6
Verb patterns in OALD now (1)
Intransitive verbs
• [V]
A large dog appeared.
• [V + adv/prep]
A group of swans floated by.
Transitive verbs
• [VN]
Jill’s behaviour annoyed me.
• [VN + adv/prep]
He kicked the ball into the net.
Transitive verbs + two objects
• [VNN]
I gave Sue a book for Christmas.
Linking verbs
• [V-ADJ]
His voice sounds hoarse.
• [V-N]
Elena became a doctor.
• [VN-ADJ]
She considered herself lucky.
• [VN-N]
They elected him president.
7
Verb patterns in OALD now (2)
Verbs used with clauses or phrases
• [V (that)]
He said that he would prefer to walk.
• [VN (that)]
Can you remind me that I need to buy some milk?
• [V wh-]
I wonder what the job will be like.
• [VN wh-]
I asked him where the hall was.
• [V to]
The goldfish need to be fed.
• [VN to]
He was forced to leave the keys.
• [VN inf]
Did you hear the phone ring?
• [V -ing]
She never stops talking.
• [VN -ing]
His comments set me thinking.
Verbs + direct speech
• [V speech]
“It’s snowing,” she said.
• [VN speech]
“Tom’s coming to lunch,” she told him.
8
Word classes vs. clause structure
• OALD now, like the Pattern Grammar of Hunston and Francis
(2000) and other works, expresses patterns in terms of word
classes, not clause roles.
– E.g. “VN” not “subject – verb – object”.
• I will argue that patterns of verb use need to be analysed in
terms of clause roles – a subtle but important distinction.
– Failure to observe this distinction has led to some confusion.
– Lexicographers defining verbs need to know about clause structure.
9
Clause roles
• S Subject
• P Predicator: the verb group.
• O Object: English clauses have 1, 2, or 0 objects.
• C Complement: co-referential with either the subject or the
object of the clause.
• A Adverbial (sometimes called Adjunct). A clause may
have any number of optional adverbials, but only 1 or 0
obligatory adverbial.
– It is necessary to distinguish obligatory adverbials (e.g. on the table in
He put the cup on the table) from optional adverbials (as in He died in
1974).
10
Pattern Grammar
• Hunston and Francis (2000): Pattern Grammar: a corpus-driven approach to the lexical grammar of English
• “One of the most important observations in a corpus-driven
description of English is that patterns and meanings are
connected.”
• PG is founded on real texts and is a real attempt at empirically
valid, i.e. corpus-driven, generalizations.
11
How is a Pattern Dictionary different
from a Pattern Grammar?
• Pattern Grammar seeks similarities – words with similar
meanings – and groups them together according to syntactic as
well as semantic similarity.
• By contrast, the Pattern Dictionary seeks systematic
differences:
– In particular, the differences in pattern that pick out different meanings
of a polysemous word.
– To do this, it needs to introduce semantic values of arguments
– At this point, all hell breaks loose!
12
Semantic values of arguments
(valencies)
• Not just “V n” but (e.g.):
[[Human]] polish {([[Surface]] of) [[Physical Object]]}
– Not “all possible values” but “all normal values”
• Lumping vs. splitting:
– Is polishing one’s nails the same pattern as polishing one’s
boots | the furniture | etc.?
• Semantic values shimmer – see Hanks and Jezek, this
conference, e.g.:
– calm a person, calm an animal
– calm someone’s fears, anxieties, nerves ...
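A minimal sketch of how a pattern with semantic values might be held as data, going beyond a bare "V n". The class names `Argument` and `Pattern` are hypothetical and not part of the Pattern Dictionary itself; the example simplifies the polish pattern by leaving out the optional "([[Surface]] of)" element.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Argument:
    """One argument slot: a clause role plus its normal semantic types."""
    role: str                # clause role label, e.g. "S" or "O"
    types: Tuple[str, ...]   # alternative semantic types for this slot

    def render(self) -> str:
        # Alternatives are written with "|" inside double brackets.
        return "[[" + " | ".join(self.types) + "]]"


@dataclass(frozen=True)
class Pattern:
    """A verb pattern: subject slot, verb, and an optional object slot."""
    verb: str
    subject: Argument
    obj: Optional[Argument] = None

    def render(self) -> str:
        # A missing object is written explicitly as [NO OBJ].
        obj = self.obj.render() if self.obj else "[NO OBJ]"
        return f"{self.subject.render()} {self.verb} {obj}"


polish = Pattern("polish",
                 Argument("S", ("Human",)),
                 Argument("O", ("Physical Object",)))
print(polish.render())  # [[Human]] polish [[Physical Object]]
```

Whether polishing one's nails belongs in the same pattern as polishing the furniture is then a decision about how many semantic types to allow in one `Argument`, which is the lumping vs. splitting question in data form.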
13
Focusing of arguments
– repair the house
– repair the roof of the house
– repair the damage to the roof
• Same event, different semantic types
14
Meanings are not just
concatenations
• Constructions (construction grammar), e.g. resultatives:
– He shook the rain off his umbrella
– What did he shake? (Not the rain!)
• He belched his way out of the room [example from Goldberg
and Jackendoff 2004]
– A noise-emission event AND a motion event
• Compare he belched out of the window.
– Only certain classes of verbs participate in the way construction
– Corpus analysis (not navel gazing) is needed to determine which
verbs participate in which constructions
– See Fillmore’s ‘Constructicon’ (plenary, this conference)
15
Apparatus of Pattern Analysis
• The Pattern Grammar has an admirably simple apparatus.
– target word; part of speech categories; word order; and certain function
words (mainly prepositions).
• Simplicity can be overdone.
• To represent the distinctive features of meaning in use, we also
need at least:
– Systematic analysis and categorization of ‘colligations’
– Lexical items grouped as collocates or by semantic type
– Valencies – a.k.a. clause roles (we use S P O C A)
16
execute
Verb: execute
Pattern Grammar: V n
Pattern Dictionary: [[Human 1]] execute [[Human 2]]
(passive n be V-ed)
Example sentence: Private Joseph Byers was the first Kitchener volunteer to
be executed.
In the Pattern Dictionary (but not in PG) semantic types distinguish this sense
from other “V n” patterns of the same verb, e.g. ‘execute an order’.
17
enlist
Verb: enlist
Pattern Grammar: V in n
Pattern Dictionary: [[Human]] enlist [NO OBJ] {in [[Human Group = Military]]}
Example sentence: He was 17 and under age when he enlisted in the 1st Royal
Scots Fusiliers.
Pattern Grammar and Pattern Dictionary agree in contrasting this sense with
other patterns such as “[[Human]] enlist [[Assistance]]” (V n).
18
go
Verb: go
Pattern Grammar: V adj
Pattern Dictionary: [[Human]] go [NO OBJ] {absent | AWOL}
Example sentence: His inexperience and the horrors he witnessed caused him
to go absent without leave.
This is a light verb (“delexical verb” in Sinclair’s terminology), with many
patterns.
The “adj” in PG is a Subject Complement.
The small lexical set, {absent | AWOL}, in the Pattern Dictionary activates a
particular meaning of go, contrasting with other patterns of go having a
Subject Complement, e.g. go {mad | bananas} .
19
plead
Verb: plead
Pattern Grammar: V adj
Pattern Dictionary: [[Human]] plead [NO OBJ] {guilty | {not guilty}}
Example sentence: Byers pleaded guilty.
The adj in this pattern is an Object Complement.
The Object Complement is populated by a lexical set of just two possible
(normal) items. (“plead innocent” is plausible but not idiomatic.)
20
fire
Verb: fire
Pattern Grammar: V adj
Pattern Dictionary: [[Human]] fire [NO OBJ] ([Advl[Direction]])
Example sentence: … the firing squad had fired wide to avoid killing the youth.
The “adj” in this sentence has the clause role of Adjunct or (in my terminology)
Adverbial of Direction, as in:
The police fired into the crowd.
They fired over their heads.
It’s not really an adj. at all.
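The contrast drawn across the five example verbs can be tabulated. The sketch below simply groups the Pattern Dictionary patterns from the preceding slides by their Pattern Grammar code; the function name `group_by_pg_code` is a hypothetical helper, and the entries are copied from the slides.

```python
from collections import defaultdict

# (verb, Pattern Grammar code, Pattern Dictionary pattern), as on the slides.
ENTRIES = [
    ("execute", "V n", "[[Human 1]] execute [[Human 2]]"),
    ("enlist", "V in n",
     "[[Human]] enlist [NO OBJ] {in [[Human Group = Military]]}"),
    ("go", "V adj", "[[Human]] go [NO OBJ] {absent | AWOL}"),
    ("plead", "V adj", "[[Human]] plead [NO OBJ] {guilty | {not guilty}}"),
    ("fire", "V adj", "[[Human]] fire [NO OBJ] ([Advl[Direction]])"),
]


def group_by_pg_code(entries):
    """Map each Pattern Grammar code to the Pattern Dictionary patterns it covers."""
    groups = defaultdict(list)
    for _verb, pg_code, pd_pattern in entries:
        groups[pg_code].append(pd_pattern)
    return dict(groups)


for code, patterns in group_by_pg_code(ENTRIES).items():
    print(code, "->", len(patterns), "pattern(s)")
```

The single code "V adj" covers three patterns whose final element plays a different role in each (a small lexical set after go, a two-member set after plead, an adverbial of direction after fire), which is the information a word-class notation does not record.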
21
Semantic values in patterns
• PATTERN: [[Human]] execute [[Command]]
• [[Command]] is a semantic type with many lexical
realizations: command, order, instruction, wish, ...
• IMPLICATURE:
[[Human 1]] acts in accordance with [[Human 2]]’s [[Wish]]
• A look-up table (an ontology) is needed, in which users can
find all the normal words (the ‘population’) of each semantic
type in each normal context
– and/or a procedure for recognizing type membership – e.g. ‘named
entity recognition’ – which recognizes all and only members of the set
[[Human]], and distinguishes them from names of places, businesses,
products, dogs, etc.
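A look-up table of this kind could be sketched as follows. This is a toy illustration under stated assumptions, not the actual ontology: the population of [[Command]] is taken from the slide, the population of [[Human]] is invented for the example, and the names `ONTOLOGY`, `build_index`, and `match_execute` are hypothetical.

```python
from collections import defaultdict

# Toy ontology: semantic type -> its 'population' of normal words.
ONTOLOGY = {
    "Command": {"command", "order", "instruction", "wish"},  # from the slide
    "Human": {"volunteer", "soldier", "prisoner"},           # assumed examples
}

# The two patterns of 'execute' mentioned in this talk, keyed by object type.
EXECUTE_PATTERNS = {
    "Human": "[[Human 1]] execute [[Human 2]]",
    "Command": "[[Human]] execute [[Command]]",
}


def build_index(ontology):
    """Invert the ontology: word -> set of semantic types it can realize."""
    index = defaultdict(set)
    for sem_type, words in ontology.items():
        for word in words:
            index[word].add(sem_type)
    return index


def match_execute(object_noun, index):
    """Return the patterns of 'execute' whose object type includes the noun."""
    types = index.get(object_noun, ())
    return sorted(EXECUTE_PATTERNS[t] for t in types if t in EXECUTE_PATTERNS)


index = build_index(ONTOLOGY)
print(match_execute("order", index))      # ['[[Human]] execute [[Command]]']
print(match_execute("volunteer", index))  # ['[[Human 1]] execute [[Human 2]]']
```

A word missing from the table returns no pattern at all, which is where a recognition procedure such as named entity recognition would have to take over.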
22
Collocations; lexical sets;
semantic types
• Pustejovsky (1995, 2008 (with Rumshisky)): meaning is expressed
in terms of verbs and their argument structures (with the semantic
types of the arguments).
• Sinclair (passim; followed by Kilgarriff and others): this is
unnecessary: just list the collocates found in a corpus; don’t try to
group them in terms of semantic types.
• Hanks (1996, 2007, elsewhere), Jezek: lexical sets typically share
semantic values, but with much variation.
– Grouping them by semantic type increases the power of the dictionary
to predict the meaning of a sentence correctly.
• Either way, sentence meaning depends on the sets of words that
normally, typically occur in particular clause roles in relation to a
particular verb (not just the word classes – i.e. not just “V n”).
23
Patterns, not senses
• Meanings taken from WordNet or a dictionary do not yield
reliable data for disambiguating senses (Ide and Wilks 2005).
– WordNet lists synonym sets and other semantic relations – but not
senses.
– WordNet did not do contrastive analysis of word senses.
– In standard dictionaries, word senses are not mutually exclusive.
– There is much fuzzy overlap between senses – which may be OK for
sophisticated human users, but not for learners or computers.
• The patterns of all and only the normal uses of a lexical item
are (normally) mutually exclusive.
– However, teasing them out from corpus data is hard.
24
Norms and exploitations
• The Pattern Dictionary records all and only the normal uses of
each verb.
– Exploitation of norms is a subject for separate analysis.
– Types of ‘exploitation’ include creative metaphor, ellipsis,
and anomalous arguments. Consider:
• The goat ate the newspaper.
• The verb eat has a preference for nouns of semantic type [[Food]]
in the direct object clause role.
• ‘[[Animate]] eat [[Document]]’ is not a normal pattern of English.
• Compare John devoured the newspaper.
• ‘[[Human]] devour [[Document]]’ is a normal pattern of English. It
is a conventional metaphor.
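The distinction can be sketched as a membership test against a list of recorded norms. Everything here is a toy assumption for illustration: the subject type in "[[Animate]] eat [[Food]]" is inferred from the slide rather than quoted, the word-to-type table is invented, and the sketch ignores type hierarchy (for example, that a [[Human]] is also [[Animate]]).

```python
# Recorded norms as (verb lemma, subject type, object type) triples.
NORMS = {
    ("eat", "Animate", "Food"),         # assumed form of the 'eat' norm
    ("devour", "Human", "Document"),    # conventional metaphor, hence a norm
}

# Toy word -> semantic type table (assumed for the example).
TOY_TYPES = {
    "goat": "Animate",
    "John": "Human",
    "newspaper": "Document",
    "grass": "Food",
}


def classify(subject, verb_lemma, obj):
    """Label a clause as a recorded norm or as a candidate exploitation."""
    triple = (verb_lemma, TOY_TYPES.get(subject), TOY_TYPES.get(obj))
    return "norm" if triple in NORMS else "possible exploitation"


print(classify("goat", "eat", "newspaper"))     # possible exploitation
print(classify("John", "devour", "newspaper"))  # norm
```

A clause that fails the test is only a candidate: deciding whether it is a creative metaphor, an ellipsis, or an anomalous argument is the separate analysis the slide refers to.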
25
Specifically, ...
The Pattern Dictionary of English Verbs will:
• list all normal patterns of each verb lemma in BNC.
• provide a benchmark for comparison and identification of
norms in other corpora, e.g.
– by time period: patterns in historical corpora, future corpora.
– by region: e.g. patterns in American English.
– by domain, e.g.:
• ‘[[Human]] abate [[Problem = Nuisance]]’ is a domain-specific
norm in the domain of legal jargon
• abate is not normally a transitive verb.
26
Details, details, details ....
• amble
PATTERN: [[Human | Animal]] amble [A[Direction]]
IMPLICATURE: [[Human | Animal]] walks slowly and in a relaxed manner in
the stated [[Direction]]
• Notes:
Even though this is a manner-of-motion verb (so the path and destination
are of little importance), normal, idiomatic phraseology requires that
the adverbial of direction be explicit (out of the house, down the hill,
along, into the restaurant, ....)
In the pattern dictionary, implicatures are ‘anchored’ to each pattern by
repetition of at least some of the arguments.
The metalanguage of implicatures here is English (same as the object
language), but it could easily be translated into Russian, Wierzbickan
primitives, or any other formalism.
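The anchoring of an implicature to its pattern can be checked mechanically. The sketch below, an illustration only, extracts the bracketed argument labels from both strings and reports the ones they share; the regular expression and the function name `anchors` are assumptions made for this example, covering just the `[[Type]]` and `[A[Type]]` notations used on this slide.

```python
import re

# Matches [[Type]] and also role-prefixed slots such as [A[Direction]].
SLOT = re.compile(r"\[A?\[(.+?)\]\]")


def anchors(pattern: str, implicature: str) -> list:
    """Return the argument labels that the implicature repeats from the pattern."""
    return sorted(set(SLOT.findall(pattern)) & set(SLOT.findall(implicature)))


PATTERN = "[[Human | Animal]] amble [A[Direction]]"
IMPLICATURE = ("[[Human | Animal]] walks slowly and in a relaxed manner "
               "in the stated [[Direction]]")

print(anchors(PATTERN, IMPLICATURE))  # ['Direction', 'Human | Animal']
```

An empty result would signal an implicature that is not tied to any argument of its pattern.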
27
Who did what to whom?
• Look at the numbered patterns for scratch, v., on pp. 119-122
of the Proceedings.
• They help users get started on addressing such questions as:
– Was the action more probably intentional (2, 3, 5, 10), accidental (6), or
neither (1)?
– Was the intention benign (3) or hostile (5)?
– Was the result beneficial (3, 7, 8, 9, 10) or damaging (1, 5)?
– What pragmatic implicatures are activated? – e.g.
• puzzlement (4), poverty (8, 9), reciprocity (10), superficiality (11), ...
• Compare entries for this verb in existing, traditional
dictionaries
28
Patterns for ‘urge’
29
So, what is a pattern dictionary?
• A pattern dictionary explains all normal uses of the words of a
language.
– Not all possible uses!
– It associates meanings with patterns of normal use, rather than with
words in isolation.
• A pattern dictionary is driven by Corpus Pattern Analysis
(CPA).
– as described in Hanks (Euralex Proceedings, 2004).
30
Purposes of a pattern dictionary
• A basic infrastructure resource:
• Showing how meaning maps onto use
• Showing which patterns of usage for each word are important
and which are rare.
• With more predictive power about what words mean in context
than any available resource
• For use by course-book writers, language teachers, advanced
learners, computational linguists, and, of course,
lexicographers.
31
Why focus on word uses, not
word meanings?
• Traditionally, lexicographers ask, “What is the meaning of this
or that word?”
• This assumes that words (in isolation) have meanings.
• I argue (Hanks 1994 and elsewhere) that, strictly speaking, this
assumption is wrong. Words in isolation don’t have meanings,
they have meaning potentials.
• For example, what does fire mean?
– It has the potential to mean lots of things.
• By studying the patterns in which it is normally used, we can
work out what it means in normal contexts.
32
Why only normal uses? Why not
all possible uses?
• The possibilities of usage of each word are infinite.
• Language in use is tremendously creative:
– People like to play with language
– They like to use words in new and interesting ways
– They need to be able to talk about new and unfamiliar things
• To do all this, people exploit the norms of word usage
– So lexicographers must say what these norms are.
– But we don’t know what they are – not even for English!
• Corpus Pattern Analysis enables us to discover the norms.
• Then we can associate meanings with norms – patterns – and
go on to say how the norms are exploited creatively.
33
Interpretations: probabilities, not
certainties
• ‘Google fired John’ = Google dismissed John from
employment
BUT
• ‘Google fired John with enthusiasm’ most probably
means ‘John began to feel enthusiasm for Google’s
products or services.’
• Of course it could mean that there are enthusiastic sadists
in the Human Resources department at Google ...
– Innumerable interpretations of texts are possible but
unlikely.
34
Nouns and verbs
• The apparatus required for analysing nouns is different from
that required for verbs.
– Nouns are grouped into lexical sets in relation to the verbs that they
normally colligate with.
– Typically, the lexical sets are united by a semantic type.
– A shallow ontology of nouns (grouped by their semantic type) is
therefore part of the apparatus of a pattern dictionary.
– Semantic typing in real texts is more complex than might be expected
from invented examples.
– Lexical sets include alternations, parts, and attributes of types.
35
Conclusions
• Corpus data now enables lexicography to go beyond
Hornby and Hunston in at least two directions:
– Adding semantic values to arguments, enabling the mapping of
meanings onto use
– Probable, statistical, normal – no absolute certainties
– Identification of constructions, going beyond the lexical item
• The lexicographical task here is to find out which lexical items
participate in which constructions
• So far, lexicography has been slow to respond to these
opportunities.
• Funding agencies have been even slower!
– They seem to be stuck in a conservative time warp.
36
Thanks
• To you, for listening,
• To the Hornby Trust, for inviting me,
• To U. Pompeu Fabra, as wonderfully efficient hosts,
• To fellow lexicographers, the late John Sinclair, and the (still
extant) James Pustejovsky, who have inspired this approach,
• To Karel Pala, Pavel Rychlý, Adam Rambousek, and Adam
Kilgarriff, who created tools that make this analysis possible,
• and to the Academy of Sciences of the Czech Republic
(project T100300419) and the Czech Ministry of Education
(National Research Program II project 2C06009), who are
funding this research.
37