Constructing Grammar: a computational model of the acquisition of early constructions

CS 182 Lecture
April 25, 2006
What constitutes learning a language?
• What are the sounds (Phonology)
• How to make words (Morphology)
• What do words mean (Semantics)
• How to put words together (Syntax)
• Social use of language (Pragmatics)
• Rules of conversations (Pragmatics)
2
What do we know about
language development?
(focusing mainly on first language acquisition
of English-speaking, normal population)
3
Children are amazing learners
[Timeline figure: 0 mos, 6 mos, 12 mos, 2 yrs, 3 yrs, 4 yrs, 5 yrs]
4
Phonology: Non-native contrasts
• Werker and Tees (1984)
• Thompson: velar vs. uvular, /k'i/-/q'i/
• Hindi: retroflex vs. dental, /ṭa/-/ta/
[Figure: bar chart with "yes" and "no" bars (vertical axis 0 to 20) for three age groups: 6-8 months, 8-10 months, 10-12 months]
5
Finding words: Statistical learning
• Saffran, Aslin and Newport (1996)
pretty baby
• /bidaku/, /padoti/, /golabu/
• /bidakupadotigolabubidaku/
• 2 minutes of this continuous speech stream
• By 8 months infants detect the words (vs. non-words and part-words)
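One common way to make "detecting the words" concrete is to compute transitional probabilities between adjacent syllables: they are high inside a word and lower across a word boundary. The sketch below is only an illustration, not the experimental procedure. The helper names, the two-letter syllable split, and the fixed word order are assumptions made so the numbers are reproducible.

```python
from collections import Counter

def syllabify(stream, width=2):
    # Assumption: every syllable in these nonsense words is two letters (CV).
    return [stream[i:i + width] for i in range(0, len(stream), width)]

def transitional_probabilities(syllables):
    """TP(a -> b) = count(a followed by b) / count(a followed by anything)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# A fixed (not random) word order, chosen only for this illustration.
word_order = ["bidaku", "padoti", "golabu", "bidaku", "golabu", "padoti", "bidaku"]
stream = "".join(word_order)  # one unbroken stream, no pauses between words
tps = transitional_probabilities(syllabify(stream))

within_word = tps[("bi", "da")]      # syllables inside the word "bidaku"
across_boundary = tps[("ku", "pa")]  # last syllable of one word, first of the next
print(within_word, across_boundary)
```

In this toy stream "bi" is always followed by "da", while "ku" is followed by different syllables depending on which word comes next, so a learner tracking these statistics has a cue to where words begin and end.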
6
Word order: agent and patient
• Hirsh-Pasek and Golinkoff (1996)
• Ages 1;4-1;7 (years;months)
• mostly still in the one-word stage
• Where is CM (Cookie Monster) tickling BB (Big Bird)?
7
Early syntax
• agent + action: ‘Daddy sit’
• action + object: ‘drive car’
• agent + object: ‘Mommy sock’
• action + location: ‘sit chair’
• entity + location: ‘toy floor’
• possessor + possessed: ‘my teddy’
• entity + attribute: ‘crayon big’
• demonstrative + entity: ‘this telephone’
8
From Single Words To Complex Utterances
Age 1;11.3
FATHER: Nomi are you climbing up the books?
NAOMI: up.
NAOMI: climbing.
NAOMI: books.
Age 2;0.18
MOTHER: what are you doing?
NAOMI: I climbing up.
MOTHER: you’re climbing up?
Age 4;9.3
FATHER: what’s the boy doing to the dog?
NAOMI: squeezing his neck.
NAOMI: and the dog climbed up the tree.
NAOMI: now they’re both safe.
NAOMI: but he can climb trees.
Sachs corpus (CHILDES)
9
How Can Children Be So Good At Learning Language?
• Gold’s Theorem:
  No superfinite class of languages is identifiable in the limit from positive data only
• Principles & Parameters
  Babies are born as blank slates but acquire language quickly (with noisy input and little correction) → Language must be innate: Universal Grammar + parameter setting
But babies aren’t born as blank slates!
And they do not learn language in a vacuum!
10
Modeling the acquisition
of grammar:
Theoretical assumptions
11
Language Acquisition
• Opulence of the substrate
  - Prelinguistic children already have rich sensorimotor representations and sophisticated social knowledge
  - intention inference, reference resolution
  - language-specific event conceptualizations
  (Bloom 2000, Tomasello 1995, Bowerman & Choi, Slobin, et al.)
• Children are sensitive to statistical information
  - Phonological transitional probabilities
  - Even dependencies between non-adjacent items
  (Saffran et al. 1996, Gomez 2002)
12
Language Acquisition
• Basic Scenes
  - Simple clause constructions are associated directly with scenes basic to human experience (Goldberg 1995, Slobin 1985)
• Verb Island Hypothesis
  - Children learn their earliest constructions (arguments, syntactic marking) on a verb-specific basis (Tomasello 1992)
throw frisbee, throw ball, … → throw OBJECT
get ball, get bottle, … → get OBJECT
(This should be reminiscent of your model merging assignment.)
13
Comprehension is partial.
(not just for dogs)
14
What children pick up from what they hear
what did you throw it into?
they’re throwing this in here.
they’re throwing a ball.
don’t throw it Nomi.
well you really shouldn’t throw things Nomi you know.
remember how we told you you shouldn’t throw things.
• Children use rich situational context / cues to fill in the gaps
• They also have at their disposal embodied knowledge and statistical correlations (i.e. experience)
15
Language Learning Hypothesis
Children learn constructions
that bridge the gap between
what they know from language
and
what they know from the rest of cognition
16
Modeling the acquisition
of (early) grammar:
Comprehension-driven,
usage-based
17
Embodied Construction Grammar (Bergen and Chang 2005)
construction THROWER-THROW-OBJECT
  constructional
    constituents
      t1 : REF-EXPRESSION
      t2 : THROW
      t3 : OBJECT-REF
  form
    t1f before t2f
    t2f before t3f
  meaning (role-filler bindings)
    t2m.thrower ↔ t1m
    t2m.throwee ↔ t3m
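To make the three blocks of the construction concrete, here is a minimal sketch of one way such a construction could be held as data. The `Construction` class, the dotted spelling of the slot chains, and the `satisfies_form` helper are hypothetical choices for illustration, not the ECG formalism's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Construction:
    name: str
    constituents: dict                           # local name -> constituent type
    form: list = field(default_factory=list)     # (left, relation, right) triples
    meaning: list = field(default_factory=list)  # (slot chain, slot chain) bindings

thrower_throw_object = Construction(
    name="THROWER-THROW-OBJECT",
    constituents={"t1": "REF-EXPRESSION", "t2": "THROW", "t3": "OBJECT-REF"},
    form=[("t1.f", "before", "t2.f"), ("t2.f", "before", "t3.f")],
    meaning=[("t2.m.thrower", "t1.m"), ("t2.m.throwee", "t3.m")],
)

def satisfies_form(cxn, positions):
    """Check every 'before' constraint, given constituent name -> position in the utterance."""
    return all(
        positions[left.split(".")[0]] < positions[right.split(".")[0]]
        for left, relation, right in cxn.form
        if relation == "before"
    )

# "you throw the ball": "you" at 0, "throw" at 1, "the ball" starting at 2.
ok = satisfies_form(thrower_throw_object, {"t1": 0, "t2": 1, "t3": 2})
# Swapping the first two constituents violates "t1f before t2f".
bad = satisfies_form(thrower_throw_object, {"t1": 1, "t2": 0, "t3": 2})
print(ok, bad)
```

Keeping form and meaning constraints as separate lists mirrors the slide's layout, where word order lives in the form block and role bindings live in the meaning block.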
18
Analyzing “You Throw The Ball”
FORM (sound): “you”, “throw”, “the”, “ball”, “block”
Constructions: you, throw, ball, block, Thrower-Throw-Object
Thrower-Throw-Object:
  form: t1 before t2, t2 before t3
  meaning: t2.thrower ↔ t1, t2.throwee ↔ t3
MEANING (stuff):
  schema Addressee, subcase of Human
  schema Throw, roles: thrower, throwee
  schema Ball, subcase of Object
  schema Block, subcase of Object
19
Learning-Analysis Cycle (Chang, 2004)
[Diagram labels: (Utterance, Situation); Constructions; Analyze; Semantic Specification, Constructional Analysis; Hypothesize; Reorganize]
1. Learner passes input (Utterance + Situation) and current grammar to Analyzer.
2. Analyzer produces SemSpec and Constructional Analysis.
3. Learner updates grammar:
   a. Hypothesize new map.
   b. Reorganize grammar (merge or compose).
   c. Reinforce (based on usage).
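The control flow of the cycle can be sketched as a loop. Everything below is a toy stand-in: the grammar is just a dictionary from construction name to usage count, and `toy_analyze`, `toy_hypothesize`, and `toy_reorganize` are hypothetical placeholders. In particular, pairing neighbouring recognized words is far cruder than the hypothesis step in the model.

```python
def learning_cycle(corpus, grammar, analyze, hypothesize, reorganize):
    """grammar maps construction name -> usage count (a stand-in for weight)."""
    for utterance, situation in corpus:
        # Steps 1-2: pass input and current grammar to the analyzer.
        semspec, analysis = analyze(utterance, situation, grammar)
        # Step 3a: hypothesize new mappings.
        for name in hypothesize(semspec, analysis, situation):
            grammar.setdefault(name, 0)
        # Step 3b: reorganize the grammar (merge or compose).
        reorganize(grammar)
        # Step 3c: reinforce the constructions that were used in the analysis.
        for name in analysis:
            grammar[name] += 1
    return grammar

def toy_analyze(utterance, situation, grammar):
    # Partial comprehension: only words already in the grammar are recognized.
    used = [w for w in utterance.split() if w in grammar]
    semspec = {w: situation.get(w) for w in used}
    return semspec, used

def toy_hypothesize(semspec, analysis, situation):
    # Placeholder: propose a construction for each pair of neighbouring recognized words.
    return [f"{a}-{b}" for a, b in zip(analysis, analysis[1:])]

def toy_reorganize(grammar):
    pass  # no merging or composing in this toy

grammar = {"you": 0, "throw": 0, "ball": 0}
corpus = [("you throw the ball", {"you": "Addressee", "throw": "Throw", "ball": "Ball"})]
learned = learning_cycle(corpus, grammar, toy_analyze, toy_hypothesize, toy_reorganize)
print(learned)
```

Passing the three steps in as functions keeps the loop itself fixed while the analysis, hypothesis, and reorganization strategies can be swapped out.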
20
Hypothesizing a new construction
through
relational mapping
21
Initial Single-Word Stage
FORM (sound): “you”, “throw”, “ball”, “block”
lexical constructions: you, throw, ball, block
MEANING (stuff):
  schema Addressee, subcase of Human
  schema Throw, roles: thrower, throwee
  schema Ball, subcase of Object
  schema Block, subcase of Object
22
New Data: “You Throw The Ball”
FORM: “you”, “throw”, “the”, “ball”, “block”
lexical constructions: you, throw, ball, block
MEANING:
  schema Addressee, subcase of Human
  schema Throw, roles: thrower, throwee
  schema Ball, subcase of Object
  schema Block, subcase of Object
SITUATION: Self, Addressee, Throw (thrower, throwee), Ball
Relations highlighted: before (form), role-filler (meaning); candidate mapping: throw-ball
23
New Construction Hypothesized
construction THROW-BALL
constructional
constituents
t : THROW
b : BALL
form
tf before bf
meaning
tm.throwee ↔ bm
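A rough sketch of how a construction like THROW-BALL might be proposed: look for two known words whose meanings stand in a role-filler relation in the situation, then pair that relation with their word order. The function name and data layout are hypothetical. The toy situation records only the throwee binding, an assumption made because the construction on the slide covers only that relation.

```python
def hypothesize_role_filler(utterance, lexicon, situation):
    """Propose two-word constructions where one word's meaning fills a role of the other's.

    lexicon:   word -> schema name
    situation: schema name -> {role: filler schema name}
    """
    words = utterance.split()
    known = [(i, w) for i, w in enumerate(words) if w in lexicon]
    proposals = []
    for i, a in known:
        roles = situation.get(lexicon[a], {})
        for j, b in known:
            if a == b:
                continue
            for role, filler in roles.items():
                if filler == lexicon[b]:
                    first, second = (a, b) if i < j else (b, a)
                    proposals.append({
                        "name": f"{a}-{b}".upper(),
                        "form": (first, "before", second),
                        "meaning": f"{a}.{role} <-> {b}",
                    })
    return proposals

lexicon = {"you": "Addressee", "throw": "Throw", "ball": "Ball", "block": "Block"}
situation = {"Throw": {"throwee": "Ball"}}  # assumption: thrower binding left out
proposals = hypothesize_role_filler("you throw the ball", lexicon, situation)
print(proposals)
```

Note that "the" is simply skipped because it is not in the lexicon, which matches the idea that comprehension at this stage is partial.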
24
Three kinds of meaning relations
1. When B.m fills a role of A.m
   throw ball: throw.throwee ↔ ball
2. When a role of A.m and a role of B.m are both filled by X
   put ball down: put.mover ↔ ball, down.tr ↔ ball
3. When A.m and B.m both fill roles of X
   Nomi ball: possession.possessor ↔ Nomi, possession.possessed ↔ ball
25
Reorganizing the current grammar
through
merge and compose
26
Merging Similar Constructions
throw the block: throw before block; Throw.throwee = Block
throw-ing the ball: throw before ball; throw before-s ing; Throw.throwee = Ball; Throw.aspect = ongoing
merged as THROW-OBJECT: throw before Objectf; THROW.throwee = Objectm
27
Resulting Construction
construction THROW-OBJECT
constructional
constituents
t : THROW
o : OBJECT
form
tf before of
meaning
tm.throwee ↔ om
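The generalization step can be sketched as finding, for each position where two constructions differ, the closest category they share in the ontology. This is a simplified illustration: constructions are reduced to lists of constituent types, and the `ONTOLOGY` table and function names are assumptions.

```python
# Assumed toy ontology: category -> parent category (None at the top).
ONTOLOGY = {"Ball": "Object", "Block": "Object", "Object": None}

def ancestors(category, ontology):
    """Return the category followed by its parents, nearest first."""
    chain = []
    while category is not None:
        chain.append(category)
        category = ontology.get(category)
    return chain

def least_common_ancestor(a, b, ontology):
    b_chain = set(ancestors(b, ontology))
    for candidate in ancestors(a, ontology):
        if candidate in b_chain:
            return candidate
    return None

def merge(cxn_a, cxn_b, ontology):
    """Generalize two constructions position by position; None if they cannot be merged."""
    if len(cxn_a) != len(cxn_b):
        return None
    merged = []
    for type_a, type_b in zip(cxn_a, cxn_b):
        general = least_common_ancestor(type_a, type_b, ontology)
        if general is None:
            return None
        merged.append(general)
    return merged

throw_object = merge(["Throw", "Ball"], ["Throw", "Block"], ONTOLOGY)
no_merge = merge(["Throw", "Ball"], ["Get", "Ball"], ONTOLOGY)
print(throw_object, no_merge)
```

Because Throw and Get share no ancestor in this toy ontology, the two verbs stay separate, which is in the spirit of verb-specific early constructions.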
28
Composing Co-occurring Constructions
throw the ball: throw before ball; Throw.throwee = Ball
ball off: ball before off; Motion m; m.mover = Ball; m.path = Off
composed as THROW-BALL-OFF: throw before ball; ball before off; THROW.throwee = Ball; Motion m; m.mover = Ball; m.path = Off
29
Resulting Construction
construction THROW-BALL-OFF
constructional
constituents
t : THROW
b : BALL
o : OFF
form
tf before bf
bf before of
meaning
evokes MOTION as m
tm.throwee ↔ bm
m.mover ↔ bm
m.path ↔ om
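Composition can be sketched as gluing together two constructions that share a constituent, keeping the union of their constituents and constraints. The dictionary layout and the `compose` function are hypothetical. The evoked MOTION schema is represented only through the `m.` prefix on its bindings.

```python
def compose(cxn_a, cxn_b, name):
    """Combine two constructions that share at least one constituent of the same type."""
    shared = {
        key for key, kind in cxn_a["constituents"].items()
        if cxn_b["constituents"].get(key) == kind
    }
    if not shared:
        return None
    constituents = dict(cxn_a["constituents"])
    constituents.update(cxn_b["constituents"])
    return {
        "name": name,
        "constituents": constituents,
        "form": cxn_a["form"] + [c for c in cxn_b["form"] if c not in cxn_a["form"]],
        "meaning": cxn_a["meaning"] + [c for c in cxn_b["meaning"] if c not in cxn_a["meaning"]],
    }

throw_ball = {
    "constituents": {"t": "THROW", "b": "BALL"},
    "form": [("t", "before", "b")],
    "meaning": [("t.throwee", "b")],
}
ball_off = {
    "constituents": {"b": "BALL", "o": "OFF"},
    "form": [("b", "before", "o")],
    "meaning": [("m.mover", "b"), ("m.path", "o")],
}
throw_ball_off = compose(throw_ball, ball_off, "THROW-BALL-OFF")
print(throw_ball_off)
```

The result has three constituents, two ordering constraints, and three meaning bindings, matching the shape of the THROW-BALL-OFF construction on the slide.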
30
Precisely defining the
learning algorithm
31
Language Learning Problem
• Prior knowledge
  - Initial grammar G (set of ECG constructions)
  - Ontology (category relations)
  - Language comprehension model (analysis/resolution)
• Hypothesis space: new ECG grammar G’
  - Search = processes for proposing new constructions
  - Relational Mapping, Merge, Compose
32
Language Learning Problem
• Performance measure
  - Goal: Comprehension should improve with training
  - Criterion: need some objective function to guide learning…
Probability of Model given Data:
$$P(M \mid X) = \frac{P(X \mid M)\,P(M)}{P(X)}$$
$$P(M \mid X) \propto P(X \mid M)\,P(M)$$
$$\log P(M \mid X) = \log P(X \mid M) + \log P(M) - \log P(X)$$
Minimum Description Length:
$$-\log P(M \mid X) = -\log P(X \mid M) - \log P(M) + \log P(X)$$
(The $\log P(X)$ term does not depend on $M$, so it can be ignored when comparing models.)
33
Minimum Description Length
• Choose grammar G to minimize cost(G|D):
  - cost(G|D) = α · size(G) + β · complexity(D|G)
  - Approximates Bayesian learning: minimizing cost(G|D) corresponds to maximizing the posterior probability P(G|D)
• Size of grammar = size(G), standing in for the prior P(G) (smaller grammar, higher prior)
  - favor fewer/smaller constructions/roles; isomorphic mappings
• Complexity of data given grammar, standing in for the likelihood P(D|G) (lower complexity, higher likelihood)
  - favor simpler analyses (fewer, more likely constructions)
  - based on derivation length + score of derivation
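The trade-off in the cost function can be illustrated with a small comparison between two candidate grammars. The size and complexity figures below are invented purely for illustration, and the grammar labels and the default α = β = 1 are assumptions, not values from the model.

```python
def cost(size_g, complexity_d_given_g, alpha=1.0, beta=1.0):
    """cost(G|D) = alpha * size(G) + beta * complexity(D|G)."""
    return alpha * size_g + beta * complexity_d_given_g

# Invented numbers: a larger grammar that explains the data more simply,
# versus a smaller grammar that needs more complex analyses.
candidates = {
    "lexical only": {"size": 12, "complexity": 40.0},
    "with THROW-OBJECT": {"size": 21, "complexity": 25.0},
}

def best_grammar(candidates, alpha=1.0, beta=1.0):
    return min(
        candidates,
        key=lambda name: cost(candidates[name]["size"], candidates[name]["complexity"], alpha, beta),
    )

balanced = best_grammar(candidates)                  # costs: 52.0 vs 46.0
size_dominated = best_grammar(candidates, beta=0.1)  # costs: 16.0 vs 23.5
print(balanced, size_dominated)
```

With equal weights the extra construction pays for itself by simplifying the analyses. When β is small, grammar size dominates and the smaller grammar wins.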
34
Size Of Grammar
• Size of the grammar G is the sum of the size of each construction:
$$\mathrm{size}(G) = \sum_{c \in G} \mathrm{size}(c)$$
• Size of each construction c is:
$$\mathrm{size}(c) = n_c + m_c + \sum_{e \in c} \mathrm{length}(e)$$
where
  - n_c = number of constituents in c,
  - m_c = number of constraints in c,
  - length(e) = slot chain length of element reference e
35
Example: The Throw-Ball Cxn
construction THROW-BALL
constructional
constituents
t : THROW
b : BALL
form
tf before bf
meaning
tm.throwee ↔ bm
$$\mathrm{size}(c) = n_c + m_c + \sum_{e \in c} \mathrm{length}(e)$$
size(THROW-BALL) = 2 + 2 + (2 + 3) = 9
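The arithmetic can be checked with a short sketch. It assumes one particular reading of "slot chain length": `tf`, `bf`, and `bm` each count as 1 and `tm.throwee` counts as 2, which gives 2 for the form constraint and 3 for the meaning constraint and so reproduces the slide's total. The function names and the tuple encoding of constraints are hypothetical.

```python
def slot_chain_length(reference):
    # Assumed reading: one unit per dot-separated slot, so "tm.throwee" has length 2.
    return len(reference.split("."))

def construction_size(constituents, constraints):
    """size(c) = n_c + m_c + sum of slot chain lengths of all element references."""
    n_c = len(constituents)
    m_c = len(constraints)
    lengths = sum(slot_chain_length(ref) for constraint in constraints for ref in constraint)
    return n_c + m_c + lengths

def grammar_size(grammar):
    """size(G) = sum of size(c) over the constructions in G."""
    return sum(construction_size(c["constituents"], c["constraints"]) for c in grammar)

throw_ball = {
    "constituents": ["t", "b"],
    "constraints": [
        ("tf", "bf"),          # form: tf before bf        -> lengths 1 + 1
        ("tm.throwee", "bm"),  # meaning: tm.throwee <-> bm -> lengths 2 + 1
    ],
}
size = construction_size(throw_ball["constituents"], throw_ball["constraints"])
print(size, grammar_size([throw_ball]))
```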
36
Complexity of Data Given Grammar
• Complexity of the data D given grammar G is the sum of the analysis score of each input token d:
$$\mathrm{complexity}(D \mid G) = \sum_{d \in D} \mathrm{score}(d)$$
• Analysis score of each input token d is:
$$\mathrm{score}(d) = \sum_{c \in d} \Big( \mathrm{weight}_c + \sum_{r \in c} |\mathrm{type}_r| \Big) + \mathrm{height}_d + \mathrm{semfit}_d$$
where
  - c is a construction used in the analysis of d,
  - weight_c ≈ relative frequency of c,
  - |type_r| = number of ontology items of type r used,
  - height_d = height of the derivation graph,
  - semfit_d = semantic fit provided by the analyzer
37
Preliminary Results
38
Experiment: Learning Verb Islands
• Subset of the CHILDES database of parent-child interactions (MacWhinney 1991; Slobin et al.)
• coded by developmental psychologists for
  - form: particles, deictics, pronouns, locative phrases, etc.
  - meaning: temporality, person, pragmatic function, type of motion (self-movement vs. caused movement; animate being vs. inanimate object, etc.)
• crosslinguistic (English, French, Italian, Spanish)
  - English motion utterances: 829 parent, 690 child utterances
  - English all utterances: 3160 adult, 5408 child
  - age span is 1;2 to 2;6
39
Learning Throw-Constructions
1. Don’t throw the bear. → throw-bear
2. you throw it → you-throw
3. throwing the thing. → throw-thing
4. Don’t throw them on the ground. → throw-them
5. throwing the frisbee. → throw-frisbee
   MERGE → throw-OBJ
6. Do you throw the frisbee? → COMPOSE → you-throw-frisbee
7. She’s throwing the frisbee. → COMPOSE → she-throw-frisbee
40
Learning Results
41
Summary
• Cognitively plausible situated learning processes
• What do kids start with?
  - perceptual, motor, social, world knowledge
  - meanings of single words
• What kind of input drives acquisition?
  - Social-pragmatic knowledge
  - Statistical properties of linguistic input
• What is the learning loop?
  - Use existing linguistic knowledge to analyze input
  - Use social-pragmatic knowledge to understand situation
  - Hypothesize new constructions to bridge the gap
42