Language, Cognition and Optimality
Henriëtte de Swart
ESSLLI 2008, Hamburg

Bidirectional OT in natural language
• Foundational course: everyone welcome!
• Jointly offered by Helen de Hoop and Henriëtte de Swart, with a special guest appearance by Petra Hendriks.
• Course materials available through the website: http://www.let.uu.nl/~Henriette.deSwart/personal/Classes/otesslli/index.html

Course program I
• Day 1: Language, cognition and optimality (de Swart)
• Day 2: Case marking patterns in the languages of the world (de Hoop)
• Day 3: Expression and interpretation of negation: a bidirectional OT typology (de Swart)

Course program II
• Day 4: Scrambling in Dutch (de Hoop)
• Day 5: Language acquisition and production/comprehension asymmetries (Hendriks)

Today’s program
• Motivation for an optimization approach to language.
• Basics of optimality theory (OT): input-output, constraints, ranking.
• Illustrations: grammar, interpretation, language variation.
• Speaker-hearer interaction: from unidirectional to bidirectional OT.

Classical view of language
• Linguistic theory: representation of the implicit knowledge of the native speaker (competence).
• Morphology, syntax, semantics: ‘hard’ symbolic rules, generation, parsing (NLP).
• An algorithm determines well-formedness.
• Creativity, recursion.

Variation and learning
• Variation across languages: lexicon, parameters, universal grammar.
• Language acquisition: universal grammar is innate; the child learns the lexicon and the parameter settings of L1.

Problems with classical view I
• Parameter setting is insufficient to account for the interaction of multiple rules (see below).
• Hard rules often have exceptions.
• Semantic variation can only reside in the lexicon: no interaction with grammar (see day 3).
• The process of language acquisition is hard to describe; comprehension/production asymmetries (see day 5).

Problems with classical view II
• Strict separation of system (competence) and use (performance): little insight into processing, pragmatics, tendencies.
• Modular structure vs. parallel processing: language in the brain, newer insights into neurocognition.

McGurk effect
• http://www.media.uio.no/personer/arntm/McGurk_english.html
• In language perception, visual and auditory input work together.
• Interaction of different linguistic subsystems (cross-modularity).
• Embedding of the linguistic system in a broader cognitive model.

An alternative
• Optimality theory: optimal solutions to conflicting constraints in natural language:
  • Pronunciation of words (phonology)
  • Sentence construction (syntax)
  • Optimal interpretation in context (semantics)

‘Least effort’
• Least Effort: It takes less effort to talk if you choose a normal, ‘easy’ pronunciation of a sound in a particular position.
• Speaker oriented.

Devoice
• Voiceless: t k f s ch p. Voiced: d g v z g b.
• Voiced is ‘special’, ‘harder’: it requires action of the vocal cords.
• Voiceless is ‘normal’, ‘easier’: no action of the vocal cords.
• Devoice: Sounds are voiceless at the end of a word.

Faithfulness
• Faithfulness: A distinction in sound (phonology) needs to be preserved in pronunciation (phonetics).
• Voice: Voiced sounds are pronounced with voice.
• Hearer oriented.

Language variation
• Differences between languages: different ‘weight’ assigned to certain rules.
• Dutch: Devoice >> Voice
• English: Voice >> Devoice
• Dutch chooses an easy pronunciation. English chooses a clear pronunciation.

Dutch hoed ‘hat’
              Devoice   Voice
   [hoed]        *
☞  [hoet]                 *

English hood
              Voice     Devoice
☞  [hood]                 *
   [hoot]        *

Basic principles
• OT considers grammar as a relation between input and output (cf. neural network).
• Grammatical well-formedness is defined in terms of the harmony of the network.
• The optimal candidate wins; all other candidates are suboptimal (‘winner takes all’).

Pattern recognition
• Recognizing faces
• Music
• Recognizing handwritten letters

Handwritten letters
• Is this an A or an H?
• The question cannot be answered without context.

Letters in context
• Letters in context are not ambiguous.
• Pattern recognition is an optimization process.

Patterns and rules
• Optimization in context vs. symbolic rules. Are they completely separate cognitive processes?
• OT combines symbolic and subsymbolic levels: constraints are symbolic, but rules are soft and violable, and evaluation proceeds by optimization.
• The ‘harmonic’ pattern of activation in the network is mirrored in the ‘harmonic’ outcome of conflicting rules (Prince and Smolensky 1993).

Input and output
• Input: given.
• GEN: generates a possibly infinite set of output candidates (cf. activation pattern).
• Grammar: ranked set of constraints.
• Parallel evaluation of all constraints.
• Optimization: least important violations, maximal harmony.

Linguistic input and output
• Phonology: input is the underlying phonological representation, output is the actual pronunciation (cf. hoed vs. hood).
• Syntax: input is the intended meaning, output is the linguistic form (speaker oriented).
• Semantics: input is the actual form, output is a meaningful representation (hearer oriented).

Null subjects
• It is raining. [English]
• Piove. [Italian]
• Two violable constraints (Grimshaw and Samek-Lodovici 1998):
  • Subject: All clauses must have a subject.
  • Full-Interpretation: All constituents in the sentence must be interpreted.

English
                     Subject   Full-Int
   Is raining           *
☞  It is raining                   *

Italian
                     Full-Int  Subject
☞  Piove                           *
   ‘It’ piove           *

Universal grammar
• Constraints are universal, but soft and violable.
• Ranking is language-specific.
• The optimization process resolves conflicts between constraints.
• Reranking of constraints plays a role in language variation, language change, and language acquisition.

Interpretation in context
• Six candidates were invited for an interview. Three were rejected.
• Three of what?
• Six candidates were hired. Three were rejected.
• Three of what?

Anaphoric interpretation preferred
• DOAP: Do not overlook anaphoric possibilities.
• Six candidates were hired. Three were rejected.
• Three = three candidates (not ‘others’).
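The evaluation step described under ‘Input and output’ and illustrated by the null-subject tableaux can be sketched in a few lines of Python. This is a minimal sketch, assuming a finite candidate set (GEN may in principle be infinite) and strict domination modelled as lexicographic comparison of violation profiles. The function name `optimal` and the candidate labels are illustrative assumptions, not part of the course materials.

```python
from typing import Dict, List

# Violation counts per candidate, read off the null-subject tableaux.
# The candidate labels are illustrative names chosen for this sketch.
NULL_SUBJECT: Dict[str, Dict[str, int]] = {
    "null subject": {"Subject": 1},        # Is raining / Piove
    "expletive subject": {"Full-Int": 1},  # It is raining / 'It' piove
}

def optimal(violations: Dict[str, Dict[str, int]], ranking: List[str]) -> List[str]:
    """Return the candidate(s) with the most harmonic violation profile.

    A profile lists violation counts in ranking order, highest-ranked
    constraint first. Comparing tuples lexicographically means a single
    violation of a higher constraint outweighs any number of violations
    of lower constraints (strict domination).
    """
    def profile(candidate: str) -> tuple:
        return tuple(violations[candidate].get(c, 0) for c in ranking)

    best = min(profile(c) for c in violations)
    return [c for c in violations if profile(c) == best]

ENGLISH = ["Subject", "Full-Int"]   # Subject >> Full-Int
ITALIAN = ["Full-Int", "Subject"]   # Full-Int >> Subject

print(optimal(NULL_SUBJECT, ENGLISH))  # ['expletive subject']
print(optimal(NULL_SUBJECT, ITALIAN))  # ['null subject']
```

The same candidates and the same constraints yield different winners under the two rankings, which is the sense in which ranking, not the constraint set, is language-specific.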
Maximize anaphoricity
• Antecedent rule: the antecedent of an incomplete NP is the set A∩B of the preceding sentence.
• Six candidates were invited for an interview. Three were rejected.
• Three = three of the candidates invited for an interview (not ‘others’, not ‘other candidates’).

Avoid inconsistencies
• Why do we not always maximize anaphoricity?
• Six candidates were hired. Three were rejected.
• Three ≠ three of the candidates who were hired.
• *Inconsistencies: Avoid pragmatically inconsistent interpretations.

Emergence of the unmarked
Three candidates were hired. Three were rejected.
                                                      *Incons   Antec   Doap
   Three of the candidates hired were rejected           *
☞  Three candidates were rejected                                  *
   Three ‘others’ (not candidates) were rejected                   *       *

Bi-directional OT
• Speakers are also hearers (different roles alternate in the communication process).
• Syntax-semantics interface, production/comprehension: bi-directional OT.
• Optimization over form-meaning pairs, such that the intended meaning of the speaker corresponds with the actual interpretation by the hearer.
[Diagram: speaker (Intend, Phrase, Speak) → speech sound → hearer (Hear, Comprehend, Understand)]

Form + meaning = communication
• If a speaker wants to convey a ‘negative’ message, he uses a form marked for negation. The unmarked form is used for affirmation.
• It is raining. It is not raining.
• When the input for the hearer is a form marked for negation, he will understand this as a ‘negative’ message. The unmarked form is understood as affirmative.

Constraints about negation
• FNeg (faithfulness constraint): Non-affirmative input needs to be reflected in the output.
• *Neg (markedness constraint): Avoid negation in the output.
• Universal ranking: FNeg >> *Neg.
• Result: all languages express negation by means of a marked form.
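The effect of the ranking FNeg >> *Neg can be checked in both directions, production (meaning to form) and interpretation (form to meaning). The sketch below is an illustration under stated assumptions: the meaning labels `RAIN` and `NOT-RAIN`, the `NEGATIVE` lookup table, and the function names are hypothetical, and the two constraints are coded directly from their definitions on the slide.

```python
from typing import Callable, Dict, List

# Which forms and meanings count as negative (labels are assumptions of this sketch).
NEGATIVE: Dict[str, bool] = {
    "it is raining": False,
    "it is not raining": True,
    "RAIN": False,
    "NOT-RAIN": True,
}
FORMS = ["it is raining", "it is not raining"]
MEANINGS = ["RAIN", "NOT-RAIN"]

def fneg(inp: str, out: str) -> int:
    """FNeg: a non-affirmative input must be reflected in the output."""
    return int(NEGATIVE[inp] and not NEGATIVE[out])

def star_neg(inp: str, out: str) -> int:
    """*Neg: avoid negation in the output."""
    return int(NEGATIVE[out])

Constraint = Callable[[str, str], int]
UNIVERSAL: List[Constraint] = [fneg, star_neg]   # FNeg >> *Neg
REVERSED: List[Constraint] = [star_neg, fneg]    # hypothetical *Neg >> FNeg

def optimize(inp: str, candidates: List[str], ranking: List[Constraint]) -> str:
    """Pick the output whose violation profile is lexicographically smallest."""
    return min(candidates, key=lambda out: tuple(c(inp, out) for c in ranking))

# OT syntax: the speaker maps a meaning to a form.
print(optimize("NOT-RAIN", FORMS, UNIVERSAL))              # it is not raining
# OT semantics: the hearer maps a form to a meaning.
print(optimize("it is not raining", MEANINGS, UNIVERSAL))  # NOT-RAIN
# Under the reversed ranking, negation would never surface in the form.
print(optimize("NOT-RAIN", FORMS, REVERSED))               # it is raining
```

The last call shows why the ranking is taken to be universal: with *Neg on top, a negative meaning would be expressed by the unmarked form and negation could not be communicated.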
OT syntax: meaning → form
Input: negative meaning
                          FNeg   *Neg
   It is raining            *
☞  It is not raining               *

OT semantics: form → meaning
Input: It is not raining
                          FNeg   *Neg
   affirmative meaning      *
☞  negative meaning                *

OT syntax + OT semantics
[Diagram: speaker → message → ‘It is not raining’ → hearer]
• Bidirectional OT: optimization over form-meaning pairs.

Optimization over form-meaning pairs
f: it is raining    f’: it is not raining
m: affirmative meaning    m’: negative meaning
                        FNeg   *Neg
   <raining, m>
   <raining, m’>          *      *
   <not raining, m>       *      *
   <not raining, m’>             **

Arrow diagram
[Arrow diagram relating the forms raining and not raining to the meanings m and m’]

Strong bidirectional OT
• Strong bidirectional OT: blocks all form-meaning pairs that are suboptimal in one or the other direction. Blutner (2000):
• A form-meaning pair <f,m> is bidirectionally optimal iff:
  a. there is no other pair <f’,m> such that <f’,m> is more harmonic than <f,m>.
  b. there is no other pair <f,m’> such that <f,m’> is more harmonic than <f,m>.

Blocking
• Strong bidirectional OT accounts for blocking of certain meanings for certain forms (because a better form is available to convey that meaning) and blocking of certain forms for certain meanings (because a better meaning is available for that form).

Partial blocking
• Strong bidirectional OT accounts for total blocking, but not for partial blocking.
• Non-linguistic example: dance.
• A group of men and women needs to form pairs of a male and a female dancer. The best dancers start choosing their partners. The best m dancer chooses the best f dancer, the next-best m dancer chooses the next-best f dancer, etc.

Partial blocking in the lexicon I
• Competition between kill and cause to die.
• By lexical decomposition: Kill = [Cause [become [not alive]]] (Dowty 1979).
• But if this is what kill means, why does the periphrastic construction cause to die exist alongside kill?
• Severely handicapped newborn: ‘to let live’ or ‘cause to die’ (Google).

Partial blocking in the lexicon II
• Kill is typically used to convey direct causation; cause to die is used to convey indirect causation.
• Kill is shorter (unmarked form); cause to die is longer (marked form).
• Direct causation is the unmarked meaning; indirect causation is the marked meaning.

Preferred associations (arrow diagram)
[Arrow diagram relating the forms kill and cause to die to the meanings direct cause and indirect cause]

Weak bidirectional OT
                *f2   *m2
   <f1,m1>
   <f1,m2>             *
   <f2,m1>       *
   <f2,m2>       *     *

Weak bidirectional OT
• A form-meaning pair <f,m> is bidirectionally superoptimal iff:
  a. there is no other superoptimal pair <f’,m> such that <f’,m> is more harmonic than <f,m>.
  b. there is no other superoptimal pair <f,m’> such that <f,m’> is more harmonic than <f,m>.

Horn’s division of pragmatic labor
• Weak bidirectional OT is an implementation of Horn’s division of pragmatic labor.
• Horn (1984): Unmarked forms go with unmarked meanings; marked forms go with marked meanings.

Conclusions of the day
• We need a theory of grammar compatible with modern insights in neurocognition.
• Patterns of optimization are pervasive; language is no exception.
• Speaker-hearer interactions can be modeled in bidirectional OT: optimization over form-meaning pairs.
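The difference between the strong and weak definitions can be made concrete on the kill / cause to die example. The following is a minimal sketch, assuming the two markedness constraints from the tableau (*f2 penalizes the marked form, *m2 the marked meaning). The names `profile`, `competitors`, `strong_optimal` and `weak_superoptimal` are hypothetical. The recursive weak definition is evaluated by visiting pairs from most to least harmonic, so that every potential blocker has been decided before the pair it might block.

```python
from itertools import product
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (form, meaning)

FORMS = ["kill", "cause to die"]   # f1 (unmarked), f2 (marked)
MEANINGS = ["direct", "indirect"]  # m1 (unmarked), m2 (marked)
PAIRS: List[Pair] = list(product(FORMS, MEANINGS))

def profile(pair: Pair) -> Tuple[int, int]:
    """Violations of (*f2, *m2), as in the weak bidirectional tableau."""
    form, meaning = pair
    return (int(form == "cause to die"), int(meaning == "indirect"))

def more_harmonic(a: Pair, b: Pair) -> bool:
    return profile(a) < profile(b)

def competitors(pair: Pair, pairs: List[Pair]) -> List[Pair]:
    """Other pairs that share the form or the meaning of `pair`."""
    return [q for q in pairs if q != pair and (q[0] == pair[0] or q[1] == pair[1])]

def strong_optimal(pairs: List[Pair]) -> Set[Pair]:
    """Pairs not beaten by any competitor (strong bidirectional optimality)."""
    return {p for p in pairs
            if not any(more_harmonic(q, p) for q in competitors(p, pairs))}

def weak_superoptimal(pairs: List[Pair]) -> Set[Pair]:
    """Pairs not beaten by any superoptimal competitor (weak version)."""
    result: Set[Pair] = set()
    for p in sorted(pairs, key=profile):  # most harmonic first
        blockers = [q for q in competitors(p, pairs) if q in result]
        if not any(more_harmonic(q, p) for q in blockers):
            result.add(p)
    return result

print(sorted(strong_optimal(PAIRS)))     # [('kill', 'direct')]
print(sorted(weak_superoptimal(PAIRS)))  # [('cause to die', 'indirect'), ('kill', 'direct')]
```

Under the strong definition only the unmarked pair survives (total blocking). Under the weak definition the marked form pairs up with the marked meaning, because both of its competitors are themselves blocked. This is the pattern that Horn’s division of pragmatic labor describes.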