Hao Wang, Toben Mintz
Department of Psychology
University of Southern California

The Problem of Learning Syntactic Categories
 Grammar involves operations on lexical items based on their syntactic categories.
 Learning syntactic categories is fundamental to the acquisition of language.
The Problem of Learning Syntactic Categories
 Nativist approach
 Children are innately endowed with the possible syntactic categories.
 How is a lexical item mapped to its syntactic category or categories?
 Empiricist approach
 Children have to figure out the syntactic categories in their target language and assign categories to lexical items.
 There is little or no help from syntactic constraints.
Approaches Based on Semantic Categories
 Grammatical Categories correspond to
Semantic/Conceptual Categories
(Macnamara, 1972; Bowerman, 1973; Bates &
MacWhinney, 1979; Pinker, 1984)
object → noun; action → verb
 But what about
 action, noise, love
 to think, to know
(Maratsos & Chalkley, 1980)
Grammatical Categories from Distributional Analyses
 Structural Linguistics
Grammatical categories are defined by similarities in word patterning (Bloomfield, 1933; Harris, 1951)
 Maratsos & Chalkley (1980):
Distributional learning theory
 lexical co-occurrence patterns
 (and morphology and semantics)
 the cat is on the mat
 cat, mat (words sharing co-occurrence contexts)
Grammatical Categories from Distributional Analyses
 Patterns across whole utterances
(Cartwright & Brent, 1997)
 My cat meowed.
 Your dog slept.
 Det N X/Y.
 Bigram co-occurrence patterns
(Mintz, Newport, & Bever, 1995, 2002; Redington, Chater & Finch,
1998)
 the cat is on the mat
Frequent Frames (Mintz, 2003)
 Frames are defined as “two jointly occurring words
with one word intervening”.
 “would you put the cans back ?”
 “you get the nuts .”
 “you take the chair back .”
 “you read the story to Mommy .”
 Frame: you_X_the
Sensitivity to Frame-like Units
 Frames lead to categorization in adults (Mintz, 2002)
 Fifteen-month-olds are sensitive to frame-like
sequences (Gómez & Maye, 2005)
Distributional Analyses Using
Frequent Frames (Mintz, 2003)
 Six corpora from CHILDES (MacWhinney, 2000).
 Analyzed utterances to children under 2;6.
 Accuracy results
[Bar chart: Mean Token Accuracy averaged over all corpora (y-axis, 0.2 to 1.0) for Actual Categorization vs. Chance Categorization, by Categorization Type.]
Limitation of the Frequent Frame Analyses
 Requires two passes through the corpus (sketched below)
 Step 1: identify the frequent frames by tallying frame frequencies.
 Step 2: categorize words using those frames.
 Tracks the frequency of all frames
 E.g., approximately 15,000 frame types in one of the corpora in Mintz (2003).
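For contrast, the two-pass procedure could be sketched roughly as follows. This is a self-contained illustration in Python; the function name is hypothetical, and the cutoff of 45 frames follows the top-45 comparison used later in this presentation.

from collections import Counter

def two_pass_categorize(utterances: list[str], n_frames: int = 45) -> dict[str, list[str]]:
    """Rough sketch of a two-pass frequent-frame analysis (illustrative only)."""
    def frames_with_fillers(utterance: str) -> list[tuple[str, str]]:
        # A frame pairs the words at positions i and i+2; the word at i+1 fills it.
        words = utterance.split()
        return [(f"{words[i]}_X_{words[i + 2]}", words[i + 1])
                for i in range(len(words) - 2)]

    # Pass 1: tally the frequency of every frame in the corpus.
    counts = Counter(frame for u in utterances for frame, _ in frames_with_fillers(u))
    frequent = {frame for frame, _ in counts.most_common(n_frames)}

    # Pass 2: categorize words by the frequent frame they fill.
    categories: dict[str, list[str]] = {frame: [] for frame in frequent}
    for u in utterances:
        for frame, filler in frames_with_fillers(u):
            if frame in frequent:
                categories[frame].append(filler)
    return categories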
Goal of the Current Study
 Provide a psychologically plausible model of word categorization
 Children possess limited memory and cognitive capacity.
 Human memory is imperfect.
 Children may not be able to track all the frames they have encountered.
Features of the Current Model
 It processes input and updates the categorization frames dynamically.
 Each frame is associated with, and ranked by, an activation value.
 It has a limited memory buffer for frames.
 Only the 150 most activated frames are stored.
 It implements a forgetting function on the memory (see the sketch below).
 After each new frame is processed, the activation of all frames in memory decreases by 0.0075.
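A minimal sketch of the memory buffer and the forgetting function, assuming the memory is a simple mapping from frames to activation values (names and representation are illustrative, not the authors' implementation):

BUFFER_SIZE = 150   # the buffer keeps only the 150 most activated frames
DECAY = 0.0075      # activation lost by every stored frame per frame processed

# memory maps a frame such as "you_X_the" to its current activation value
memory: dict[str, float] = {}

def decay_memory(memory: dict[str, float]) -> None:
    """Forgetting function: every stored frame's activation drops by DECAY."""
    for frame in memory:
        memory[frame] -= DECAY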
Child Input Corpora
 Six corpora from CHILDES (MacWhinney, 2000).
 Analyzed utterances to children under 2;6.
 Peter (Bloom, Hood, & Lightbown, 1974; Bloom, Lightbown, & Hood, 1975)
 Eve (Brown, 1973)
 Nina (Suppes, 1974)
 Naomi (Sachs, 1983)
 Anne (Theakston, Lieven, Pine, & Rowland, 2001)
 Aran (Theakston et al., 2001)
 Mean utterances per child: ~17,200 (min: 6,950; max: 20,857)
Procedure
 The child-directed utterances from each corpus were processed individually
 Utterances were presented to the model in their order of appearance in the corpus
 Each utterance was segmented into frames (a sketch of this step follows below)
 “you read the story to Mommy”
 you read the
 read the story
 the story to
 story to Mommy
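One possible implementation of this segmentation step, assuming whitespace-tokenized utterances (the helper name frames_in is hypothetical):

def frames_in(utterance: str) -> list[str]:
    """Return every frame (two words with one word intervening) in the utterance."""
    words = utterance.split()
    return [f"{words[i]}_X_{words[i + 2]}" for i in range(len(words) - 2)]

# frames_in("you read the story to Mommy")
# -> ["you_X_the", "read_X_story", "the_X_to", "story_X_Mommy"]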
Procedure continued…
 you read the
 read the story
 the story to
 story to Mommy

Memory
Activation   Frame
1.0000       you_X_the
1.0000       read_X_story
1.0000       the_X_to
1.0000       story_X_Mommy
Procedure continued…
 The memory buffer only stores the 150 most activated frames.
 It becomes full very quickly, after processing only a few utterances.

Memory
Activation   Frame
1.0000       you_X_the
1.0000       read_X_story
1.0000       the_X_to
1.0000       story_X_Mommy
1.0000       to_X_it
1.0000       the_X_on
…            …
Procedure continued…
 “you put the”
 Frame: you_X_the
 Look up the you_X_the frame in memory
 Increase the activation of the you_X_the frame by 1
 Re-rank the memory by activation

Memory
Activation   Frame
1.0000       you_X_the
1.0000       read_X_story
1.0000       the_X_to
1.0000       story_X_Mommy
1.0000       to_X_it
1.0000       the_X_on
…            …
Procedure continued…
 “you have a”
 Frame: you_X_a
 Look up the you_X_a frame in memory
 story_X_Mommy has an activation < 1
 Remove story_X_Mommy
 Add you_X_a to memory, setting its activation to 1
 Re-rank the memory by activation

Memory
Activation   Frame
1.0000       you_X_the
1.0000       read_X_story
1.0000       the_X_to
1.0000       to_X_it
1.0000       the_X_on
0.8175       story_X_Mommy
…            …
Procedure continued…
 A new frame is encountered that is not in memory
 The activation of every frame in memory is at least 1
 There is no change to the memory.

Memory
Activation   Frame
1.0000       you_X_the
1.0000       read_X_story
1.0000       the_X_to
1.0000       to_X_it
1.0000       the_X_on
0.8175       story_X_Mommy
…            …
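Putting the preceding steps together, the update performed for each frame encountered in the input could look roughly like this. This is a sketch under the assumptions above; the exact eviction and tie-breaking rules are not spelled out in the slides, so they are guesses.

BUFFER_SIZE, DECAY = 150, 0.0075  # as in the earlier sketch

def process_frame(memory: dict[str, float], frame: str) -> None:
    """One update step of the sketched model for a single input frame."""
    if frame in memory:
        # Frame already in memory: increase its activation by 1.
        memory[frame] += 1.0
    elif len(memory) < BUFFER_SIZE:
        # Buffer not yet full: store the new frame with activation 1.
        memory[frame] = 1.0
    else:
        # Buffer full: replace the least activated frame, but only if its
        # activation has decayed below 1; otherwise memory is unchanged.
        weakest = min(memory, key=memory.get)
        if memory[weakest] < 1.0:
            del memory[weakest]
            memory[frame] = 1.0
    # Forgetting: every stored frame loses DECAY after each frame is processed.
    for stored in memory:
        memory[stored] -= DECAY

Processing a corpus would then amount to calling a segmentation helper such as frames_in on each utterance and process_frame on each resulting frame, in order of appearance.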
Evaluating Model Performance
Accuracy = hits / (hits + false alarms)
 Hit: two words from the same linguistic category
grouped together
 False Alarm: two words from different linguistic
categories grouped together
 Upper bound of 1
Accuracy Example
 Words grouped together by a frame: V, V, V, ADV, V, V
 Hits: 10
 False Alarms: 5
 Accuracy: 10 / (10 + 5) = .67
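A small sketch of this pairwise accuracy computation (category labels and the helper name are illustrative):

from itertools import combinations

def pairwise_accuracy(labels: list[str]) -> float:
    """Accuracy = hits / (hits + false alarms) over all pairs of grouped words."""
    pairs = list(combinations(labels, 2))
    hits = sum(1 for a, b in pairs if a == b)
    false_alarms = len(pairs) - hits
    return hits / (hits + false_alarms)

# pairwise_accuracy(["V", "V", "V", "ADV", "V", "V"])  # 10 / (10 + 5) ≈ 0.67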
Ten Categories for Accuracy
 Noun, pronoun
 Determiner
 Verb, Aux., Copula
 Wh-word
 Adjective
 Negation -- “not”
 Preposition
 Conjunction
 Adverb
 Interjection
Averaged accuracy across 6 corpora
Corpus    Accuracy
Eve       0.782019
Peter     0.803401
Anne      0.872820
Aran      0.860191
Nina      0.828753
Naomi     0.773230
Average   0.820069
The Development of Accuracy
 Accuracy is very high and stable throughout the entire process.
[Line chart: Accuracy of Eve Corpus; x-axis: Number of Frames Processed (1,000 to 32,900); y-axis: Accuracy (0 to 1).]
Compare to Frequent Frames
 After processing about half of the corpus, 70% of the frequent frames are among the 45 most activated frames in memory.
[Line chart: Ratio of Frequent Frames in Top 45 Most Activated Frames in Memory to Total Number of Frequent Frames (Eve Corpus); x-axis: Number of Frames Processed (100 to 33,100); y-axis: Ratio (0 to 1).]
Memory of Final Step of Eve Corpus
#    w2 type  w2 token  Activation   Frame
0    9        351       326.25225    what_X_you
1    20       230       205.151      you_X_to
2    70       203       178.16525    you_X_it
3    27       115       90.7135      you_X_a
4    44       115       90.379       you_X_the
5    3        110       85.2665      are_X_doing
6    5        110       85.10525     what_X_that
7    15       108       83.2965      you_X_me
8    38       90        65.2905      to_X_it
9    2        89        65.132       would_X_like
10   11       86        61.1075      why_X_you
Stability of Frames in Memory
 There are big changes in the frames in memory at the early stage, but they become stable after processing 10% of the corpus.
[Line chart: Stability of Frames in Memory (Eve Corpus); x-axis: Number of Frames Processed (100 to 33,100); y-axis: Overlap of Frames in Current and Last Step (0 to 1).]
Summary
 After processing the entire corpus, the learning algorithm has identified almost all of the frequent frames as its most highly activated frames.
 Consequently, high accuracy of word categorization is achieved.
 After processing fewer than half of the utterances, the 45 most activated frames already included approximately 70% of the frequent frames.
Summary
 Frames are a robust cue for categorizing words.
 With limited and imperfect memory, the learning algorithm can identify most of the frequent frames after processing a relatively small number of utterances, thus yielding high accuracy of word categorization.