Henriëtte de Swart

Language, Cognition and
Optimality
Henriëtte de Swart
ESSLLI 2008, Hamburg
Bidirectional OT in natural language
• Foundational course: everyone welcome!
• Jointly offered by Helen de Hoop and
Henriëtte de Swart, with a special guest
appearance by Petra Hendriks.
• Course materials available through website:
• http://www.let.uu.nl/~Henriette.deSwart/personal/Classes/otesslli/index.html
Course program I
• day 1: Language, cognition and optimality
(de Swart)
• day 2: Case marking patterns in the
languages of the world (de Hoop)
• day 3: Expression and interpretation of
negation: a bidirectional OT typology (de
Swart)
Course program II
• day 4: Scrambling in Dutch (de Hoop)
• day 5: Language acquisition and
production/comprehension asymmetries
(Hendriks)
Today’s program
• Motivation for an optimization approach to
language.
• Basics of optimality theory (OT): input-output, constraints, ranking.
• Illustrations: grammar, interpretation,
language variation.
• Speaker-hearer interaction: from
unidirectional to bidirectional OT.
Classical view of language
• Linguistic theory: representation of implicit
knowledge of native speaker (competence)
• Morphology, syntax, semantics: ‘hard’
symbolic rules, generation, parsing (NLP).
• Algorithm determines well-formedness.
• Creativity, recursion.
Variation and learning
• Variation across languages: lexicon,
parameters, universal grammar.
• Language acquisition:
universal grammar is innate,
child learns lexicon and
parameter setting of L1.
Problems with classical view I
• Parameter setting is insufficient for the interaction
of multiple rules (see below).
• Hard rules often have exceptions.
• Semantic variation can only reside in the lexicon:
no interaction with grammar (see day 3).
• Process of language acquisition is hard to
describe; comprehension/production asymmetries
(see day 5).
Problems with classical view II
• Strict separation of system (competence)
and use (performance): little insight into
processing, pragmatics, tendencies.
• Modular structure vs. parallel processing:
language in the brain, newer insights into
neurocognition.
McGurk effect
• http://www.media.uio.no/personer/arntm/McGurk_english.html
• In language perception, visual and auditory
input work together.
• Interaction of different linguistic
subsystems (cross-modularity).
• Embedding of linguistic system in broader
cognitive model.
An alternative
• Optimality theory: optimal solutions of
conflicting constraints in natural language
Pronunciation of words (phonology)
Sentence construction (syntax)
Optimal interpretation in context
(semantics)
‘Least effort’
• Least Effort: It takes
less effort to talk if you
choose a normal, ‘easy’
pronunciation of a
sound in a particular
position.
• Speaker oriented
Devoice
• voiceless: t k f s ch p
• voiced: d g v z g b
• Voiced is ‘special’, ‘harder’, requires action
of vocal cords.
• Voiceless is ‘normal’, ‘easier’, no action of
vocal cords.
• Devoice: Sounds are voiceless at the end of
a word.
Faithfulness
• Faithfulness: A distinction in sound
(phonology) needs to be preserved in
pronunciation (phonetics).
• Voice: Voiced sounds are pronounced with
voice.
• Hearer oriented
Language variation
• Differences between languages: different
‘weight’ assigned to certain rules.
• Dutch: Devoice >> Voice
• English: Voice >> Devoice
• Dutch chooses an easy pronunciation.
English chooses a clear pronunciation.
Dutch
Input: hoed ‘hat’   | Devoice | Voice
   [hoed]           |    *    |
☞ [hoet]           |         |   *

English
Input: hood         | Voice | Devoice
☞ [hood]           |       |    *
   [hoot]           |   *   |
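The two tableaux above can be reproduced with a few lines of code. This is only a minimal sketch, assuming that each candidate is stored with its violation counts and that a ranking is a list of constraint names from highest to lowest; the names `optimal` and `VIOLATIONS` are illustrative and not part of the course materials.

```python
def optimal(candidates, ranking):
    """Return the candidate with the least serious violations.

    Violation profiles are compared constraint by constraint, from the
    highest-ranked constraint down (lexicographic tuple comparison).
    """
    def profile(cand):
        return tuple(candidates[cand].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

# A word-final voiced [d] violates Devoice; a devoiced [t] for an
# underlying /d/ violates Voice (faithfulness).
VIOLATIONS = {
    "final [d]": {"Devoice": 1},
    "final [t]": {"Voice": 1},
}

dutch = optimal(VIOLATIONS, ["Devoice", "Voice"])    # Devoice >> Voice
english = optimal(VIOLATIONS, ["Voice", "Devoice"])  # Voice >> Devoice
print("Dutch:", dutch)
print("English:", english)
```

The same candidates and the same constraints yield a different winner in each language; only the ranking differs.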
Basic principles
• OT considers grammar as a relation between
input and output (≈ neural network).
• Grammatical well-formedness is defined in
terms of harmony of the network.
• Optimal candidate wins, all other candidates
suboptimal (‘winner takes all’).
Pattern recognition
• Recognizing faces
• Music
• Recognizing handwritten
letters
Handwritten letters
• Is this an A or an H?
• Cannot answer question without context.
Letters in context
• Letters in context are not ambiguous.
• Pattern recognition is an optimization process.
Patterns and rules
• Optimization in context vs. symbolic rules.
Are they completely separated cognitive
processes?
• OT combines symbolic and subsymbolic levels:
constraints are symbolic, but rules are soft,
violable, and evaluation by optimization.
• ‘Harmonic’ pattern of activation by network
mirrored in ‘harmonic’ outcome of conflicting
rules (Prince and Smolensky 1993).
Input and output
• Input: given.
• GEN: generates a possibly infinite set of output
candidates (≈ activation pattern).
• Grammar: ranked set of constraints.
• Parallel evaluation of all constraints.
• Optimization: least important violations,
maximal harmony.
Linguistic input and output
• Phonology: input is underlying
phonological representation, output is actual
pronunciation (cf. hoed vs. hood).
• Syntax: input is intended meaning, output is
linguistic form (speaker oriented).
• Semantics: input is actual form, output is
meaningful representation (hearer oriented).
Null subjects
• It is raining.
[English]
• Piove.
[Italian]
• Two violable constraints (Grimshaw and
Samek-Lodovici 1998):
• Subject: All clauses must have a subject.
• Full-Interpretation: all constituents in the
sentence must be interpreted.
English
                   | Subject | Full-Int
   Is raining      |    *    |
☞ It is raining   |         |    *

Italian
                   | Full-Int | Subject
☞ Piove           |          |    *
   ‘It’ piove      |    *     |
Universal grammar
• Constraints are universal, but soft and
violable.
• Ranking is language-specific.
• Optimization process resolves conflicts
between constraints.
• Reranking of constraints plays role in
language variation, language change,
language acquisition.
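The idea that languages differ only in ranking can be made concrete by enumerating every ranking of a fixed constraint set and recording the winner under each. The sketch below does this for the two null-subject constraints; the candidate labels and the names `VIOLATIONS`, `winner` and `typology` are assumptions made for illustration.

```python
from itertools import permutations

CONSTRAINTS = ["Subject", "Full-Int"]

# Candidates for a weather clause: leaving the subject out violates
# Subject; inserting an uninterpreted expletive violates Full-Int.
VIOLATIONS = {
    "null subject": {"Subject": 1},
    "expletive subject": {"Full-Int": 1},
}

def winner(ranking):
    """Optimal candidate under one ranking (highest-ranked constraint first)."""
    return min(
        VIOLATIONS,
        key=lambda cand: tuple(VIOLATIONS[cand].get(c, 0) for c in ranking),
    )

# One predicted language type per possible ranking.
typology = {" >> ".join(r): winner(r) for r in permutations(CONSTRAINTS)}
for ranking, output in typology.items():
    print(ranking, "->", output)
```

With two constraints there are two rankings, matching the English-type and the Italian-type pattern; with n constraints the same loop runs over n! rankings.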
Interpretation in context
• Six candidates were invited for an
interview. Three were rejected.
• Three of what?
• Six candidates were hired. Three were
rejected.
• Three of what?
Anaphoric interpretation preferred
• DOAP: do not overlook anaphoric
possibilities
• Six candidates were hired. Three were
rejected.
• Three = three candidates (not ‘others’).
Maximize anaphoricity
• Antecedent rule: the antecedent of an
incomplete NP is the set A∩B of the
preceding sentence.
• Six candidates were invited for an
interview. Three were rejected.
• Three = three of the candidates invited for
an interview (not ‘others’, not ‘other
candidates’)
Avoid inconsistencies
• Why do we not always maximize
anaphoricity?
• Six candidates were hired. Three were
rejected.
• Three ≠ three of the candidates who were
hired.
• *Inconsistencies: Avoid pragmatically
inconsistent interpretations.
Emergence of the unmarked
Input: Three candidates were hired. Three were rejected.
                                                | *Incons | Antec | Doap
   Three of the candidates hired were rejected  |    *    |       |
☞ Three candidates were rejected               |         |   *   |
   Three ‘others’ (not candidates) were rejected |         |   *   |  *
Bi-directional OT
• Speakers are also hearers (different roles alternate
in the communication process).
• Syntax-semantics interface,
production/comprehension: bi-directional OT.
• Optimization over form-meaning pairs, such that
intended meaning of speaker corresponds with
actual interpretation by hearer.
[Diagram: speaker: Intend → Phrase → Speak → speech sound → hearer: Hear → Comprehend → Understand]
Form+meaning = communication
• If a speaker wants to convey a ‘negative’ message,
he uses a form marked for negation. The
unmarked form is used for affirmation.
• It is raining.
It is not raining.
• When the input for the hearer is a
form marked for negation, he will
understand this as a ‘negative’ message.
The unmarked form is understood as affirmative.
Constraints about negation
• FNeg (faithfulness constraint): Non-affirmative input needs to be reflected in the
output.
• *Neg (markedness constraint): avoid
negation in the output.
• Universal ranking: FNeg >> *Neg.
• Result: all languages express negation by
means of a marked form.
OT syntax
Input meaning: ¬p     | FNeg | *Neg
   It is raining      |  *   |
☞ It is not raining  |      |  *
OT semantics
Input form: It is not raining | FNeg | *Neg
   p                          |  *   |
☞ ¬p                         |      |  *
OT syntax + OT semantics

[Diagram: the speaker’s message ¬p is expressed as “It is not raining”, which the hearer interprets as the message ¬p.]
• Bidirectional OT: optimization over form-meaning pairs.
Optimization over form-meaning pairs
f: it is raining        m: p
f’: it is not raining   m’: ¬p
                      | FNeg | *Neg
☞ <raining, p>       |      |
   <raining, ¬p>      |  *   |  *
   <not raining, p>   |  *   |  *
☞ <not raining, ¬p>  |      |  **
Arrow diagram


[Arrow diagram: raining ↔ p; not raining ↔ ¬p]
Strong bidirectional OT
• Strong bidirectional OT: blocks all form-meaning pairs that are suboptimal in one or
the other direction. Blutner (2000):
• A form-meaning pair <f,m> is
bidirectionally optimal iff:
a. there is no other pair <f’,m> such that
<f’,m> is more harmonic than <f,m>.
b. there is no other pair <f,m’> such that
<f,m’> is more harmonic than <f,m>.
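Clauses (a) and (b) of this definition translate directly into a filter over form-meaning pairs. The following is a rough sketch, assuming each pair comes with a violation profile (FNeg, *Neg) in ranking order, where *Neg counts negation in the form as well as in the meaning; the names `PROFILES` and `strongly_optimal` are hypothetical.

```python
# Violation profiles (FNeg, *Neg) for the four form-meaning pairs,
# with FNeg ranked above *Neg.
PROFILES = {
    ("it is raining", "p"): (0, 0),
    ("it is raining", "not p"): (1, 1),
    ("it is not raining", "p"): (1, 1),
    ("it is not raining", "not p"): (0, 2),
}

def strongly_optimal(profiles):
    """Pairs not beaten by any pair sharing their form or their meaning."""
    winners = set()
    for (form, meaning), prof in profiles.items():
        blocked = any(
            # competitor shares exactly one of form / meaning ...
            (f2 == form) != (m2 == meaning)
            # ... and is strictly more harmonic (smaller profile)
            and other < prof
            for (f2, m2), other in profiles.items()
        )
        if not blocked:
            winners.add((form, meaning))
    return winners

result = strongly_optimal(PROFILES)
for pair in sorted(result):
    print(pair)
```

On this input the affirmative form pairs with p and the negative form with ¬p; the two mismatched pairs are blocked.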
Blocking
• Strong bidirectional OT accounts for
blocking of certain meanings for certain
forms (because a better form is available to
convey that meaning) and blocking of
certain forms for certain meanings (because
a better meaning is available for that form).
Partial blocking
• Strong bidirectional OT accounts for total
blocking, but not for partial blocking.
• Non-linguistic example: dance.
• A group of men and women needs to form
pairs of a male and a female dancer. The
best dancers start choosing their partners.
The best m dancer chooses the best f
dancer, the next-best m dancer chooses the
next-best f dancer, etc.
Partial blocking in the lexicon I
• Competition between kill and cause to die.
• By lexical decomposition: Kill = [Cause
[become [not alive]]] (Dowty 1979).
• But if this is what kill means, why does the
periphrastic construction cause to die exist
alongside kill?
• Severely handicapped newborn: ‘to let live’
or ‘cause to die’ (Google)
Partial blocking in the lexicon II
• Kill is typically used to convey direct
causation, cause to die is used to convey
indirect causation.
• Kill is shorter (unmarked form), cause to
die is longer (marked form).
• Direct causation is the unmarked meaning,
indirect causation is the marked meaning.
Preferred associations (arrow
diagram)
[Arrow diagram: kill ↔ direct cause; cause to die ↔ indirect cause]
Weak bidirectional OT
             | *f2 | *m2
☞ <f1,m1>   |     |
   <f1,m2>   |     |  *
   <f2,m1>   |  *  |
☞ <f2,m2>   |  *  |  *
Weak bidirectional OT
• A form-meaning pair <f,m> is
bidirectionally superoptimal iff:
a. there is no other superoptimal pair <f’,m>
such that <f’,m> is more harmonic than
<f,m>.
b. there is no other superoptimal pair <f,m’>
such that <f,m’> is more harmonic than
<f,m>.
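Because the definition refers to other superoptimal pairs, it can be evaluated by visiting pairs from most to least harmonic, so that every potential blocker has been decided before it is needed. The sketch below follows that idea for the abstract tableau with *f2 and *m2; treating *f2 as ranked above *m2 is an assumption made only so that profiles can be compared as tuples, and the names `PROFILES` and `superoptimal` are illustrative.

```python
# Violation profiles (*f2, *m2) for the four abstract pairs.
PROFILES = {
    ("f1", "m1"): (0, 0),
    ("f1", "m2"): (0, 1),
    ("f2", "m1"): (1, 0),
    ("f2", "m2"): (1, 1),
}

def superoptimal(profiles):
    """Weakly (super)optimal pairs, computed from most to least harmonic."""
    winners = []
    for pair in sorted(profiles, key=profiles.get):
        form, meaning = pair
        blocked = any(
            # only pairs already found superoptimal can block
            profiles[w] < profiles[pair]
            and (w[0] == form or w[1] == meaning)
            for w in winners
        )
        if not blocked:
            winners.append(pair)
    return winners

result = superoptimal(PROFILES)
print(result)
```

The pair <f2,m2> survives because its only competitors, <f1,m2> and <f2,m1>, are themselves blocked by <f1,m1>; under the strong definition only <f1,m1> would remain.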
Horn’s division of pragmatic
labor
• Weak bidirectional OT is an implementation
of Horn’s division of pragmatic labor.
• Horn (1984): Unmarked forms go with
unmarked meanings; marked forms go with
marked meanings.
Conclusions of the day
• We need a theory of grammar compatible with
modern insights in neurocognition.
• Patterns of optimization are pervasive; language is
no exception.
• Speaker-hearer interactions can be modeled in bidirectional OT: optimization over form-meaning
pairs.