Lexical Patterns: from Hornby to Hunston and beyond
Patrick Hanks
Faculty of Informatics, Masaryk University, Brno, Czech Republic
hanks@fi.muni.cz
Afrilex, Stellenbosch, June 30, 2008

Outline of the talk
• Palmer and Hornby (1942)
  – the recognition of lexical patterns
  – the dictionary as an aid to idiomatic and productive use of language.
• Refinements to OALD (Hornby 1964; Cowie 1989; Crowther 1995; Wehmeier 2000)
• The impact of corpus linguistics on lexicography
  – The role of collocations (Sinclair)
• Verb patterns and sentence meaning; implicatures
  – Pattern grammar contrasted with pattern dictionary
• Where do we go from here?

A. S. Hornby
1923: started as a teacher of English in Japan
1931: joined H. E. Palmer at the Tokyo Institute for Research into English Teaching
1936: became head of research at the Institute
1941: Hornby repatriated in World War II; joined the British Council
1942: Idiomatic and Syntactic Dictionary (ISED), by Hornby, Gatenby, and Wakefield, published in Japan by Kaitakusha
1948: ISED re-published (unchanged) by OUP as A Learner’s Dictionary of English (LDE)
1954: A Guide to Patterns and Usage in English (lexically based)
1960: title of LDE changed to Advanced Learner’s Dictionary of Current English
1963: Second edition of ALD (Hornby alone).
1974: Third edition of OALD (edited with A. P. Cowie)
1978: Death of A. S. Hornby.

Some insights of Palmer and Hornby
• Learners need idiomatic phraseology, not etymology
• English grammar is not Latin grammar
  – E.g. Palmer identified ‘determinatives’ (determiners), adverbial particles, ‘anomalous finite verbs’ (auxiliaries, modals, and verbal pro-forms), as well as classes inherited from Latin grammar (nouns, verbs, adjectives, adverbs)
  – Palmer and Hornby also identified clause structure elements – for example subject complements and object complements (He is mad & linguistics drove him mad; he is the editor & they appointed him editor)
• Language in use is structured around the lexical patterns of verbs

Hornby’s second edition (1963)
• Huge increase in the number of lexical items
  – Many of them more useful for receptive (reading) than productive (writing) use.
• Less user friendly than ISED – influenced by the Concise Oxford Dictionary, 4th edition, 1951
• ‘Nested’ subentries with swung dashes
  – E.g. blackbird became “~bird” under the main entry black
• See Cowie (1999), English Dictionaries for Foreign Learners: A History, for more details
• Some of these editorial policies were reversed in OALD6 (2000).

Hornby’s original verb patterns (1942)
• 25 patterns were identified.
  – Succinct, but still hard for a learner to use easily.
  – Only two subdivisions: transitive vs. intransitive.
  – 14 of the 25 verb patterns take clausal complements:
    • E.g. Pattern 5 (S V O Inf): I made him do it.
    • Separating out the clausals makes things a lot simpler.
  – Some very subtle (unnecessary?) distinctions.
  – For details, see the version of this paper in the Proceedings.
• In the 4th edition (1989), A. P. Cowie introduced a more user-friendly organization of patterns.

Verb patterns in OALD now (1)
Intransitive verbs
• [V] A large dog appeared.
• [V + adv/prep] A group of swans floated by.
Transitive verbs
• [VN] Jill’s behaviour annoyed me.
• [VN + adv/prep] He kicked the ball into the net.
Transitive verbs + two objects
• [VNN] I gave Sue a book for Christmas.
Linking verbs
• [V-ADJ] His voice sounds hoarse.
• [V-N] Elena became a doctor.
• [VN-ADJ] She considered herself lucky.
• [VN-N] They elected him president.

Verb patterns in OALD now (2)
Verbs used with clauses or phrases
• [V (that)] He said that he would prefer to walk.
• [VN (that)] Can you remind me that I need to buy some milk?
• [V wh-] I wonder what the job will be like.
• [VN wh-] I asked him where the hall was.
• [V to] The goldfish need to be fed.
• [VN to] He was forced to leave the keys.
• [VN inf] Did you hear the phone ring?
• [V -ing] She never stops talking.
• [VN -ing] His comments set me thinking.
Verbs + direct speech
• [V speech] “It’s snowing,” she said.
• [VN speech] “Tom’s coming to lunch,” she told him.

Word classes vs. clause structure
• OALD now, like the Pattern Grammar of Hunston and Francis (2000) and other works, expresses patterns in terms of word classes, not clause roles.
  – E.g. “VN” not “subject – verb – object”.
• I will argue that patterns of verb use need to be analysed in terms of clause roles – a subtle but important distinction.
  – Failure to observe this distinction has led to some confusion.
  – Lexicographers defining verbs need to know about clause structure.

Clause roles
• S  Subject
• P  Predicator. The verb group.
• O  Object. English clauses have 1, 2, or 0 objects.
• C  Complement. Co-referential with either the subject or the object of the clause.
• A  Adverbial (sometimes called Adjunct). A clause may have any number of optional adverbials, but only 1 or 0 obligatory adverbial.
  – It is necessary to distinguish obligatory adverbials (e.g. on the table in He put the cup on the table) from optional adverbials (as in He died in 1974).

Pattern Grammar
• Hunston and Francis (2000): Pattern Grammar: a corpus-driven approach to the lexical grammar of English
• “One of the most important observations in a corpus-driven description of English is that patterns and meanings are connected.”
• PG is founded on real texts and is a real attempt at empirically valid, i.e. corpus-driven, generalizations.
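
The constraints stated on the Clause roles slide (at most two objects, at most one obligatory adverbial, any number of optional adverbials) can be written down as a small check. This is only a sketch: the label "A!" for an obligatory adverbial and the function name check_clause are assumptions made for illustration, not notation from the talk or from any published pattern dictionary.

```python
from collections import Counter

# Role labels follow the talk: S, P, O, C, A. Marking an obligatory
# adverbial as "A!" (and an optional one as "A") is an assumption of
# this sketch.
KNOWN_ROLES = {"S", "P", "O", "C", "A", "A!"}


def check_clause(roles):
    """Return a list of violated constraints (empty if there are none)."""
    problems = []
    counts = Counter(roles)
    unknown = sorted(set(roles) - KNOWN_ROLES)
    if unknown:
        problems.append(f"unknown role labels: {unknown}")
    if counts["O"] > 2:
        problems.append("more than two objects")
    if counts["A!"] > 1:
        problems.append("more than one obligatory adverbial")
    return problems


# He put the cup on the table: the adverbial is obligatory.
put_clause = ["S", "P", "O", "A!"]
# He died in 1974: the adverbial is optional.
die_clause = ["S", "P", "A"]

print(check_clause(put_clause))
print(check_clause(die_clause))
```

Both example clauses pass, while a role sequence with three objects or two obligatory adverbials would be reported as a violation.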

How is a Pattern Dictionary different from a Pattern Grammar?
• Pattern Grammar seeks similarities – words with similar meanings – and groups them together according to syntactic as well as semantic similarity.
• By contrast, the Pattern Dictionary seeks systematic differences:
  – In particular, the differences in pattern that pick out different meanings of a polysemous word.
  – To do this, it needs to introduce semantic values of arguments
  – At this point, all hell breaks loose!

Semantic values of arguments (valencies)
• Not just “V n” but (e.g.): [[Human]] polish {([[Surface]] of) [[Physical Object]]}
  – Not “all possible values” but “all normal values”
• Lumping vs. splitting:
  – Is polishing one’s nails the same pattern as polishing one’s boots | the furniture | etc.?
• Semantic values shimmer – see Hanks and Jezek, this conference, e.g.:
  – calm a person, calm an animal
  – calm someone’s fears, anxieties, nerves ...

Focusing of arguments
  – repair the house
  – repair the roof of the house
  – repair the damage to the roof
• Same event, different semantic types

Meanings are not just concatenations
• Constructions (construction grammar), e.g. resultatives:
  – He shook the rain off his umbrella
  – What did he shake? (Not the rain!)
• He belched his way out of the room [example from Goldberg and Jackendoff 2004]
  – A noise-emission event AND a motion event
    • Compare he belched out of the window.
  – Only certain classes of verbs participate in the way construction
  – Corpus analysis (not navel gazing) is needed to determine which verbs participate in which constructions
  – See Fillmore’s ‘Constructicon’ (plenary, this conference)

Apparatus of Pattern Analysis
• The Pattern Grammar has an admirably simple apparatus.
  – target word; part of speech categories; word order; and certain function words (mainly prepositions).
• Simplicity can be overdone.
• To represent the distinctive features of meaning in use, we also need at least:
  – Systematic analysis and categorization of ‘colligations’
  – Lexical items grouped as collocates or by semantic type
  – Valencies – a.k.a. clause roles (we use S P O C A)

execute
Verb: execute
Pattern Grammar: V n
Pattern Dictionary: [[Human 1]] execute [[Human 2]]
(passive: n be V-ed)
Example sentence: Private Joseph Byers was the first Kitchener volunteer to be executed.
In the Pattern Dictionary (but not in PG) semantic types distinguish this sense from other “V n” patterns of the same verb, e.g. ‘execute an order’.

enlist
Verb: enlist
Pattern Grammar: V in n
Pattern Dictionary: [[Human]] enlist [NO OBJ] {in [[Human Group = Military]]}
Example sentence: He was 17 and under age when he enlisted in the 1st Royal Scots Fusiliers.
Pattern Grammar and Pattern Dictionary agree in contrasting this sense with other patterns such as “[[Human]] enlist [[Assistance]]” (V n).

go
Verb: go
Pattern Grammar: V adj
Pattern Dictionary: [[Human]] go [NO OBJ] {absent | AWOL}
Example sentence: His inexperience and the horrors he witnessed caused him to go absent without leave.
This is a light verb (“delexical verb” in Sinclair’s terminology), with many patterns. The “adj” in PG is a Subject Complement. The small lexical set, {absent | AWOL}, in the Pattern Dictionary activates a particular meaning of go, contrasting with other patterns of go having a Subject Complement, e.g. go {mad | bananas}.

plead
Verb: plead
Pattern Grammar: V adj
Pattern Dictionary: [[Human]] plead [NO OBJ] {guilty | {not guilty}}
Example sentence: Byers pleaded guilty.
The adj in this pattern is an Object Complement. The Object Complement is populated by a lexical set of just two possible (normal) items. (“plead innocent” is plausible but not idiomatic.)
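
The contrast drawn in the execute example, where one Pattern Grammar frame ("V n") corresponds to several Pattern Dictionary patterns separated by the semantic type of the object, can be sketched in a few lines. Everything named here is an assumption for illustration: the miniature ontology, its members, the type label Command, the glosses, and the function names are invented and are not data from the Pattern Dictionary itself.

```python
# Hypothetical miniature ontology: semantic type -> lexical set of nouns.
# Type names and members are illustrative assumptions, not CPA data.
ONTOLOGY = {
    "Human": {"volunteer", "soldier", "prisoner"},
    "Command": {"order", "command", "instruction", "wish"},
}

# Pattern Grammar would record one "V n" frame for both uses of execute.
# Each entry below adds the semantic type expected in the object role,
# plus an informal gloss (the glosses are assumptions of this sketch).
EXECUTE_PATTERNS = [
    {"object_type": "Human", "gloss": "put someone to death"},
    {"object_type": "Command", "gloss": "carry out an order"},
]


def semantic_type(noun):
    """Return the first semantic type whose lexical set contains the noun."""
    for type_name, members in ONTOLOGY.items():
        if noun in members:
            return type_name
    return None


def match_execute(direct_object):
    """Return the execute pattern whose object type fits the noun, or None."""
    noun_type = semantic_type(direct_object)
    for pattern in EXECUTE_PATTERNS:
        if pattern["object_type"] == noun_type:
            return pattern
    return None


print(match_execute("volunteer"))
print(match_execute("order"))
print(match_execute("umbrella"))
```

A noun outside every lexical set matches no pattern here; in corpus work that outcome would prompt a closer look (a gap in the ontology, or a non-normal use) rather than a silent failure.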

fire
Verb: fire
Pattern Grammar: V adj
Pattern Dictionary: [[Human]] fire [NO OBJ] ([Advl[Direction]])
Example sentence: … the firing squad had fired wide to avoid killing the youth.
The “adj” in this sentence has the clause role of Adjunct or (in my terminology) Adverbial of Direction, as in: The police fired into the crowd. They fired over their heads. It’s not really an adj. at all.

Semantic values in patterns
• PATTERN: [[Human]] execute [[Command]]
• [[Command]] is a semantic type with many lexical realizations: command, order, instruction, wish, ...
• IMPLICATURE: [[Human 1]] acts in accordance with [[Human 2]]’s [[Wish]]
• A look-up table (an ontology) is needed, in which users can find all the normal words (the ‘population’) of each semantic type in each normal context
  – and/or a procedure for recognizing type membership – e.g. ‘named entity recognition’ – which recognizes all and only members of the set [[Human]], and distinguishes them from names of places, businesses, products, dogs, etc.

Collocations; lexical sets; semantic types
• Pustejovsky (1995, 2008 (with Rumshisky)): meaning is expressed in terms of verbs and their argument structures (with the semantic types of the arguments).
• Sinclair (passim; followed by Kilgarriff and others): this is unnecessary: just list the collocates found in a corpus; don’t try to group them in terms of semantic types.
• Hanks (1996, 2007, elsewhere), Jezek: lexical sets typically share semantic values, but with much variation.
  – Grouping them by semantic type increases the power of the dictionary to predict the meaning of a sentence correctly.
• Either way, sentence meaning depends on the sets of words that normally, typically occur in particular clause roles in relation to a particular verb (not just the word classes – i.e. not just “V n”).

Patterns, not senses
• Meanings taken from WordNet or a dictionary do not yield reliable data for disambiguating senses (Ide and Wilks 2005).
  – WordNet lists synonym sets and other semantic relations – but not senses.
  – WordNet did not do contrastive analysis of word senses.
  – In standard dictionaries, word senses are not mutually exclusive.
  – There is much fuzzy overlap between senses – which may be OK for sophisticated human users, but not for learners or computers.
• The patterns of all and only the normal uses of a lexical item are (normally) mutually exclusive.
  – However, teasing them out from corpus data is hard.

Norms and exploitations
• The Pattern Dictionary records all and only the normal uses of each verb.
  – Exploitation of norms is a subject for separate analysis.
  – Types of ‘exploitation’ include creative metaphor, ellipsis, and anomalous arguments. Consider:
    • The goat ate the newspaper.
    • The verb eat has a preference for nouns of semantic type [[Food]] in the direct object clause role.
    • ‘[[Animate]] eat [[Document]]’ is not a normal pattern of English.
    • Compare John devoured the newspaper.
    • ‘[[Human]] devour [[Document]]’ is a normal pattern of English. It is a conventional metaphor.

Specifically, ...
The Pattern Dictionary of English Verbs will:
• list all normal patterns of each verb lemma in BNC.
• provide a benchmark for comparison and identification of norms in other corpora, e.g.
  – by time period: patterns in historical corpora, future corpora.
  – by region: e.g. patterns in American English.
  – by domain, e.g.:
    • ‘[[Human]] abate [[Problem = Nuisance]]’ is a domain-specific norm in the domain of legal jargon
    • abate is not normally a transitive verb.

Details, details, details ....
• amble
  PATTERN: [[Human | Animal]] amble [A[Direction]]
  IMPLICATURE: [[Human | Animal]] walks slowly and in a relaxed manner in the stated [[Direction]]
• Notes: Even though this is a manner-of-motion verb (so the path and destination are of little importance), normal, idiomatic phraseology requires that the adverbial of direction be explicit (out of the house, down the hill, along, into the restaurant, ....)
In the pattern dictionary, implicatures are ‘anchored’ to each pattern by repetition of at least some of the arguments.
The metalanguage of implicatures here is English (same as the object language), but it could easily be translated into Russian, Wierzbickan primitives, or any other formalism.

Who did what to whom?
• Look at the numbered patterns for scratch, v., on pp. 119-122 of the Proceedings.
• They help users get started on addressing such questions as:
  – Was the action more probably intentional (2, 3, 5, 10), accidental (6), or neither (1)?
  – Was the intention benign (3) or hostile (5)?
  – Was the result beneficial (3, 7, 8, 9, 10) or damaging (1, 5)?
  – What pragmatic implicatures are activated? – e.g.
    • puzzlement (4), poverty (8, 9), reciprocity (10), superficiality (11), ...
• Compare entries for this verb in existing, traditional dictionaries

Patterns for ‘urge’
[slide figure not reproduced]

So, what is a pattern dictionary?
• A pattern dictionary explains all normal uses of the words of a language.
  – Not all possible uses!
  – It associates meanings with patterns of normal use, rather than with words in isolation.
• A pattern dictionary is driven by Corpus Pattern Analysis (CPA).
  – as described in Hanks (Euralex Proceedings, 2004).

Purposes of a pattern dictionary
• A basic infrastructure resource:
• Showing how meaning maps onto use
• Showing which patterns of usage for each word are important and which are rare.
• With more predictive power about what words mean in context than any available resource
• For use by course-book writers, language teachers, advanced learners, computational linguists, and, of course, lexicographers.

Why focus on word uses, not word meanings?
• Traditionally, lexicographers ask, “What is the meaning of this or that word?”
• This assumes that words (in isolation) have meanings.
• I argue (Hanks 1994 and elsewhere) that, strictly speaking, this assumption is wrong. Words in isolation don’t have meanings, they have meaning potentials.
• For example, what does fire mean?
  – It has the potential to mean lots of things.
• By studying the patterns in which it is normally used, we can work out what it means in normal contexts.

Why only normal uses? Why not all possible uses?
• The possibilities of usage of each word are infinite.
• Language in use is tremendously creative:
  – People like to play with language
  – They like to use words in new and interesting ways
  – They need to be able to talk about new and unfamiliar things
• To do all this, people exploit the norms of word usage
  – So lexicographers must say what these norms are.
  – But we don’t know what they are – not even for English!
• Corpus Pattern Analysis enables us to discover the norms.
• Then we can associate meanings with norms – patterns – and go on to say how the norms are exploited creatively.

Interpretations: probabilities, not certainties
• ‘Google fired John’ = Google dismissed John from employment
BUT
• ‘Google fired John with enthusiasm’ most probably means ‘John began to feel enthusiasm for Google’s products or services.’
• Of course it could mean that there are enthusiastic sadists in the Human Resources department at Google ...
  – Innumerable interpretations of texts are possible but unlikely.

Nouns and verbs
• The apparatus required for analysing nouns is different from that required for verbs.
  – Nouns are grouped into lexical sets in relation to the verbs that they normally colligate with.
  – Typically, the lexical sets are united by a semantic type.
  – A shallow ontology of nouns (grouped by their semantic type) is therefore part of the apparatus of a pattern dictionary.
  – Semantic typing in real texts is more complex than might be expected from invented examples.
  – Lexical sets include alternations, parts, and attributes of types.

Conclusions
• Corpus data now enables lexicography to go beyond Hornby and Hunston in at least two directions:
  – Adding semantic values to arguments, enabling the mapping of meanings onto use
    – Probable, statistical, normal – no absolute certainties
  – Identification of constructions, going beyond the lexical item
    • The lexicographical task here is to find out which lexical items participate in which constructions
• So far, lexicography has been slow to respond to these opportunities.
• Funding agencies have been even slower!
  – They seem to be stuck in a conservative time warp.

Thanks
• To you, for listening,
• To the Hornby Trust, for inviting me,
• To U. Pompeu Fabra, as wonderfully efficient hosts,
• To fellow lexicographers, the late John Sinclair, and the (still extant) James Pustejovsky, who have inspired this approach,
• To Karel Pala, Pavel Rychlý, Adam Rambousek, and Adam Kilgarriff, who created tools that make this analysis possible,
• and to the Academy of Sciences of the Czech Republic (project T100300419) and the Czech Ministry of Education (National Research Program II project 2C06009), who are funding this research.