Hao Wang, Toben Mintz
Department of Psychology, University of Southern California

The Problem of Learning Syntactic Categories
- Grammar involves manipulations of lexical items based on their syntactic categories.
- Learning syntactic categories is therefore fundamental to the acquisition of language.

The Problem of Learning Syntactic Categories
- Nativist approach: children are innately endowed with the possible syntactic categories. The question is how a lexical item is mapped to its syntactic category or categories.
- Empiricist approach: children have to discover the syntactic categories of their target language and assign categories to lexical items, with little or no help from innate syntactic constraints.

Approaches Based on Semantic Categories
- Grammatical categories correspond to semantic/conceptual categories (Macnamara, 1972; Bowerman, 1973; Bates & MacWhinney, 1979; Pinker, 1984): object -> noun, action -> verb.
- But what about nouns such as action, noise, and love, or verbs such as to think and to know? (Maratsos & Chalkley, 1980)

Grammatical Categories from Distributional Analyses
- Structural linguistics: grammatical categories are defined by similarities in word patterning (Bloomfield, 1933; Harris, 1951).
- Maratsos & Chalkley (1980): distributional learning theory based on lexical co-occurrence patterns (together with morphology and semantics), e.g., "the cat is on the mat" groups cat and mat.

Grammatical Categories from Distributional Analyses
- Patterns across whole utterances (Cartwright & Brent, 1997): "My cat meowed." / "Your dog slept." -> Det N X/Y.
- Bigram co-occurrence patterns (Mintz, Newport, & Bever, 1995, 2002; Redington, Chater, & Finch, 1998): "the cat is on the mat".

Frequent Frames (Mintz, 2003)
- Frames are defined as "two jointly occurring words with one word intervening".
- "would you put the cans back?", "you get the nuts.", "you take the chair back.", "you read the story to Mommy." -> Frame: you_X_the

Sensitivity to Frame-like Units
- Frames lead to categorization in adults (Mintz, 2002).
- Fifteen-month-olds are sensitive to frame-like sequences (Gómez & Maye, 2005).

Distributional Analyses Using Frequent Frames (Mintz, 2003)
- Six corpora from CHILDES (MacWhinney, 2000); analyzed utterances to children under 2;6.
- [Figure: mean token accuracy averaged over all corpora, actual vs. chance categorization, by categorization type.]

Limitation of the Frequent Frame Analyses
- Requires two passes through the corpus: Step 1, identify the frequent frames by tallying frame frequencies; Step 2, categorize words using those frames.
- Tracks the frequency of all frames: approximately 15,000 frame types in one of the corpora in Mintz (2003).

Goal of the Current Study
- Provide a psychologically plausible model of word categorization.
- Children possess limited memory and cognitive capacity, and human memory is imperfect.
- Children may not be able to track all the frames they have encountered.

Features of the Current Model
- It processes input and updates the categorization frames dynamically; a sketch of the frame unit it operates over is given below.
- Each frame is associated with, and ranked by, an activation value.
- It has a limited memory buffer for frames: only the 150 most activated frames are stored.
- It implements a forgetting function on the memory: after each new frame is processed, the activation of every frame in memory decreases by 0.0075.
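To make the frame unit concrete, the following is a minimal Python sketch (not code from the study); it assumes utterances are already whitespace-tokenized, and the helper name extract_frames is ours.

```python
def extract_frames(utterance):
    """Yield (frame, middle_word) pairs from a whitespace-tokenized utterance.

    A frame is "two jointly occurring words with one word intervening"
    (Mintz, 2003); the intervening word is the one the frame categorizes.
    """
    words = utterance.split()
    for i in range(len(words) - 2):
        # Frame = outer two words of a three-word window; X marks the middle slot.
        yield f"{words[i]}_X_{words[i + 2]}", words[i + 1]

# "you read the story to Mommy" yields the frames you_X_the, read_X_story,
# the_X_to, and story_X_Mommy (with read, the, story, and to in the X slot).
for frame, word in extract_frames("you read the story to Mommy"):
    print(frame, word)
```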
Child Input Corpora
- Six corpora from CHILDES (MacWhinney, 2000); analyzed utterances to children under 2;6.
- Peter (Bloom, Hood, & Lightbown, 1974; Bloom, Lightbown, & Hood, 1975), Eve (Brown, 1973), Nina (Suppes, 1974), Naomi (Sachs, 1983), Anne (Theakston, Lieven, Pine, & Rowland, 2001), Aran (Theakston et al., 2001).
- Mean utterances per child: ~17,200 (min: 6,950; max: 20,857).

Procedure
- The child-directed utterances from each corpus were processed individually.
- Utterances were presented to the model in their order of appearance in the corpus.
- Each utterance was segmented into three-word windows: "you read the story to Mommy" -> "you read the", "read the story", "the story to", "story to Mommy".

Procedure (continued)
- Each window yields a frame, which is entered into memory with activation 1.0000: you_X_the, read_X_story, the_X_to, story_X_Mommy.

Procedure (continued)
- The memory buffer stores only the 150 most activated frames, and it becomes full very quickly, after processing just a few utterances.
- Memory (activation, frame): 1.0000 you_X_the; 1.0000 read_X_story; 1.0000 the_X_to; 1.0000 story_X_Mommy; 1.0000 to_X_it; 1.0000 the_X_on; ...

Procedure (continued)
- "you put the" -> frame you_X_the. The model looks up you_X_the in memory, increases its activation by 1, and re-ranks the memory by activation.

Procedure (continued)
- "you have a" -> frame you_X_a, which is not in memory. The activation of story_X_Mommy has decayed below 1 (to 0.8175), so story_X_Mommy is removed, you_X_a is added with activation 1, and the memory is re-ranked by activation.

Procedure (continued)
- If a new frame is not in memory and the activations of all frames in memory are greater than 1, the memory is left unchanged: the new frame is not stored. (A code sketch of this update rule follows.)
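The memory update described in the procedure can be summarized in the sketch below (building on extract_frames above). This is an illustration under stated assumptions, not the study's implementation: it assumes a known frame is boosted by 1, a new frame enters with activation 1 and can displace a stored frame only when that frame's activation has decayed below 1, and the 0.0075 decay applies to every stored frame after each frame is processed.

```python
CAPACITY = 150   # the buffer holds at most the 150 most activated frames
DECAY = 0.0075   # activation lost by every stored frame per frame processed

def process_frame(memory, frame):
    """Update the frame memory (a dict mapping frame -> activation) for one frame."""
    if frame in memory:
        memory[frame] += 1.0                 # boost a frame that is already stored
    elif len(memory) < CAPACITY:
        memory[frame] = 1.0                  # room left: store the new frame
    else:
        weakest = min(memory, key=memory.get)
        if memory[weakest] < 1.0:            # a new frame (activation 1) outranks it
            del memory[weakest]
            memory[frame] = 1.0
        # otherwise the new frame is not stored and memory is unchanged
    for f in memory:                         # forgetting: uniform decay after each frame
        memory[f] -= DECAY

def process_corpus(utterances):
    """Process child-directed utterances in order and return the final memory."""
    memory = {}
    for utterance in utterances:
        for frame, _word in extract_frames(utterance):
            process_frame(memory, frame)
    return memory
```

Re-ranking the memory by activation then amounts to sorting the stored entries, e.g. sorted(memory.items(), key=lambda kv: kv[1], reverse=True).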
Evaluating Model Performance
- Accuracy = hits / (hits + false alarms).
- Hit: two words from the same linguistic category grouped together.
- False alarm: two words from different linguistic categories grouped together.
- Accuracy has an upper bound of 1.

Accuracy Example
- A category containing V V V ADV V V yields 10 hits and 5 false alarms, so accuracy = 10 / (10 + 5) = 0.67. (A runnable sketch of this computation appears after the summary.)

Ten Categories for Accuracy
- Noun/pronoun; determiner; verb/auxiliary/copula; wh-word; adjective; negation ("not"); preposition; conjunction; adverb; interjection.

Averaged Accuracy Across the Six Corpora

  Corpus    Accuracy
  Eve       0.782019
  Peter     0.803401
  Anne      0.872820
  Aran      0.860191
  Nina      0.828753
  Naomi     0.773230
  Average   0.820069

The Development of Accuracy
- [Figure: accuracy on the Eve corpus as a function of the number of frames processed.]
- Accuracy is very high and stable throughout the entire process.

Comparison to Frequent Frames
- [Figure: ratio of frequent frames appearing in the top 45 most activated frames in memory to the total number of frequent frames (Eve corpus), as a function of the number of frames processed.]
- After processing about half of the corpus, 70% of the frequent frames are among the 45 most activated frames in memory.

Memory at the Final Step of the Eve Corpus

  #    w2 types   w2 tokens   Activation   Frame
  0    9          351         326.25225    what_X_you
  1    20         230         205.151      you_X_to
  2    70         203         178.16525    you_X_it
  3    27         115         90.7135      you_X_a
  4    44         115         90.379       you_X_the
  5    3          110         85.2665      are_X_doing
  6    5          110         85.10525     what_X_that
  7    15         108         83.2965      you_X_me
  8    38         90          65.2905      to_X_it
  9    2          89          65.132       would_X_like
  10   11         86          61.1075      why_X_you

Stability of Frames in Memory
- [Figure: overlap between the frames in memory at the current and the previous step (Eve corpus), as a function of the number of frames processed.]
- There are big changes in the frames held in memory at the early stage, but the memory becomes stable after processing about 10% of the corpus.

Summary
- After processing the entire corpus, the learning algorithm has identified almost all of the frequent frames as its most highly activated frames; consequently, high word categorization accuracy is achieved.
- After processing fewer than half of the utterances, the 45 most activated frames already included approximately 70% of the frequent frames.

Summary
- Frames are a robust cue for categorizing words.
- Even with limited and imperfect memory, the learning algorithm can identify most frequent frames after processing a relatively small number of utterances, and thus yields high word categorization accuracy.
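For concreteness, here is a minimal sketch of the pairwise accuracy measure used in the evaluation; pairwise_accuracy is a hypothetical helper name, and the worked example reproduces the V V V ADV V V case from the accuracy example above.

```python
from itertools import combinations

def pairwise_accuracy(labels):
    """Accuracy = hits / (hits + false alarms) over all pairs of grouped words.

    `labels` are the linguistic category labels (from the ten evaluation
    categories) of the words that the model grouped into one frame-based
    category.
    """
    hits = false_alarms = 0
    for a, b in combinations(labels, 2):
        if a == b:
            hits += 1          # pair from the same linguistic category
        else:
            false_alarms += 1  # pair from different linguistic categories
    return hits / (hits + false_alarms)

# Five verbs and one adverb grouped together: 10 hits, 5 false alarms.
print(round(pairwise_accuracy(["V", "V", "V", "ADV", "V", "V"]), 2))  # 0.67
```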