Rules or Connections in Past Tense Inflections
Psychology 209
February 4, 2013

Is there a past tense rule?
• Early on, children often produce exceptional past tenses correctly (went, took, etc.).
• But at some point, they also produce ‘regularizations’.
• Also, children (and adults) produce ‘regular’ inflections for novel items when prompted, as in: this man is ricking… yesterday he ____.
• This was once taken as suggesting that young children discover ‘the past tense rule’.
• The fact that children learn exceptions was explained by ‘memorization’ or ‘lexical lookup’.

An Alternative to Assuming that Children ‘Acquire’ the Past-Tense Rule
• Rumelhart and McClelland proposed that the past tense reflects regularities captured in the connections among units in a connectionist system that learns from examples.
• We demonstrated this by implementing and running a simple model of how people perform the task.
• The resulting debate has been fierce… and there are many who still think our approach is misguided.
• But, as I’ll try to show, the evidence appears to be consistent with our perspective.
• The work illustrates how models influence empirical research; how easy it is to get misleading results in experiments; and how important it is for researchers on each side of an issue to follow up on each other’s findings.

Overview
• The RM model, introducing the connectionist alternative
• Early critiques and responses lead to…
• The Pinker symbolic, dual mechanism account
• PDP approach argues that a single-system connectionist approach has advantages over the dual mechanism approach
• The exchange that was published in TiCS (2002) is available in the readings – you can read about the debate and judge for yourself.
• The issues have not been pursued as intensively since then.

Training and Testing Procedure
• Training:
  – Present Wickelfeature (WF) pattern representing present tense of verb.
  – Compute WF pattern representing past tense of verb using stochastic sigmoid activation function.
  – Compare computed past-tense pattern to correct past-tense pattern.
  – Adjust connections using Perceptron Convergence Procedure:
    • Increase strength of connection from active input units to output units that should be active but are not.
    • Decrease strength of connection from active input units to output units that should not be active but are.
• Testing:
  – Present WF pattern of present tense of verb.
  – Compute WF pattern.
  – Compare to various alternatives on various measures.
  – OR: Generate output using fixed decoding net.

Training Regime
• First ten epochs use 10 most frequent words only
  – Feel, have, make, get, give, take, come, go, look, need
• Remainder of training uses 10 most frequent plus 400 words of ‘middle frequency’
• Each word is presented once per epoch
• An additional 84 lower-frequency words are saved for generalization testing

Recapitulation of U-shaped learning

Varieties of Regular Past-Tenses

Types of Irregular Verbs

Responses to t/d verbs and other verbs

Performance with Novel Irregulars

Quasi-Regularity
• The tendency for forms that are irregular to share features of the regular form
  – do/did; say/said; make/made; have/had
  – keep/kept; kneel/knelt; learn/learnt…
  – hide/hid; lead/led; beat/beat; cut/cut; bid/bid…
  – 59% of irregular past tense forms end in d or t
• In the PDP approach, such items participate in the regular pattern just as regular forms do.
• In other approaches, they are often relegated to a separate system, thought to operate without reference to the rules, and therefore must be treated as though they are out-and-out exceptions.

Novel Regulars
48/72 only activated correct responses; 6 activated no response; these are the remaining 18 items

Summary
• The RM model can learn regulars and exceptions.
• It correctly inflects most unfamiliar regular verbs, and makes over-regularization errors.
• It also captures children’s tendency to produce occasional ‘irregularization’ responses and other signs of sensitivity to sub-regularities.
• It produces a U-shaped developmental curve, like children.
• It does all this in a single system without explicit rules, rule-acquisition mechanisms, or a separate lexicon of exceptions.

Critique (Pinker and Prince, 1988)
• Training regime unrealistic
  – Child’s experience is relatively constant over time.
• Performance on regulars not good enough
  – Makes quite a few errors, some quite strange
• Model can’t produce different past tenses for homophones
  – ring the bell, ring the city, wring the clothes
• Wickelfeature representation has problems

Reply: Conceptualizations are not implementations (MacWhinney and Leinbach)
• Included semantic as well as phonological input
  – Allows different outputs for cases of same phonology but different meanings
• Used a different input representation that led to better performance on regulars
  – Right-justified slot-based representation
• Model learned the regular pattern quite well
  – Concept of ‘Condensing the Regularities’
• Did not address U-shaped curve

Plunkett and Marchman
• Trained with a corpus modeled after English:
  – ‘Stationary’ frequency-weighted training environment
  – Exceptions: a small number of high-frequency ‘arbitraries’; the others come in clusters and share some features with the regular pattern (most stem phonemes are preserved in the past tense)
• Found ‘micro-U’ shaped patterns during learning:
  – Performance on a given item can vacillate so that correct responses precede incorrect responses.
  – This is consistent with the actual pattern seen in most children, where overregularizations are infrequent.
• Suggested that properties of networks actually offer an explanatory basis for understanding U-shaped development and the distribution of word forms in the language:
  – Micro-U reflects competing changes in weights
  – E.g., arbitraries are very difficult to learn; they must be high frequency and few in number
  – No-change and vowel-change items are much easier and need not be of high frequency.
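
Sketch: two training environments
The contrast between the staged regime on the Training Regime slide and a ‘stationary’ frequency-weighted environment can be sketched roughly as follows. This is only an illustration, not code from either model: the function names, the placeholder list of 400 medium-frequency verbs, and any frequency counts passed in are assumptions.

```python
import random

# The ten high-frequency verbs are taken from the Training Regime slide.
HIGH_FREQ = ["feel", "have", "make", "get", "give",
             "take", "come", "go", "look", "need"]

# Hypothetical placeholder for the 400 'middle frequency' verbs;
# the slides do not list them.
MEDIUM_FREQ = [f"verb{i}" for i in range(400)]


def staged_epoch(epoch):
    """Staged regime: 10 verbs only for the first ten epochs (0-9),
    then all 410 verbs, each presented once per epoch."""
    if epoch < 10:
        return list(HIGH_FREQ)
    return HIGH_FREQ + MEDIUM_FREQ


def stationary_epoch(frequencies, n_tokens, rng):
    """Stationary regime: the same frequency-weighted distribution at
    every epoch. `frequencies` maps verb -> assumed token frequency."""
    verbs = list(frequencies)
    weights = [frequencies[v] for v in verbs]
    return rng.choices(verbs, weights=weights, k=n_tokens)
```

In the staged regime the vocabulary changes abruptly after epoch 9, which is the property the critique calls unrealistic; in the stationary regime the sampling distribution never changes, so any U-shaped pattern has to come from the learning dynamics rather than from the input schedule.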

Pinker (1991, and elsewhere)
• Noted that performance on exceptions does show some signs of exhibiting features like those seen in the RM model. E.g., there is some similarity-based generalization, so that forms occasionally ‘join’ irregular clusters (e.g., kneel-knelt, gling-glang).
• Proposed a dual mechanism account in which there is one system that uses categorical, ‘algebraic’ rules insensitive to item properties, and another that uses an ‘associative memory mechanism’ much like the RM model.
• With Marcus, developed the notion that the rule is completely insensitive to semantic and phonological factors, depending only on the form-class of the stem.
• Has suggested in many places that the past tense rule is acquired ‘suddenly’ in a ‘Eureka Moment’.
• Pinker (with Ullman) claimed that brain damage can produce a specific deficit in use of the regular inflection.
• He also argued that a familial genetic defect can lead to a deficit in use of the regular inflection.

Reply to Pinker (see McClelland and Patterson paper in TiCS exchange)
• Past tenses are acquired gradually, not suddenly. This is true of other inflectional forms as well and does not suggest the sudden acquisition of a rule in a Eureka moment.
• Initially children learn typical cases and gradually generalize to other cases.
• The tendency to regularize and the tendency to irregularize are affected by phonological and semantic similarity to known items of both types.
  – According to Pinker, this should happen for irregular items only.
• For both the effects of brain damage and the genetic anomaly, the evidence supports the view that the deficit is phonological, not a matter of rules.
  – The individuals with these disorders have difficulty with complex phonological forms, and their difficulty with regular past tenses disappears if you control for phonological complexity.
• Pinker’s approach fails to address the quasi-regularity in exceptions, and therefore misses much of what is systematic in language.
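
Sketch: the training step
The update described on the Training and Testing Procedure slide (stochastic sigmoid output units, with connections adjusted by the Perceptron Convergence Procedure) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the original RM implementation: the function names, the learning rate `lr`, the `temperature` parameter, and the use of plain unit indices in place of Wickelfeature patterns are all assumptions.

```python
import math
import random


def stochastic_output(net, temperature, rng):
    """Turn an output unit on with probability given by a sigmoid of its
    net input (the temperature scaling is an assumption)."""
    p = 1.0 / (1.0 + math.exp(-net / temperature))
    return 1 if rng.random() < p else 0


def train_pattern(weights, active_inputs, target, lr, temperature, rng):
    """One training step on one verb.

    weights[j][i] is the connection from input unit i to output unit j.
    active_inputs lists the indices of input units that are on.
    target[j] is 1 if output unit j should be on, else 0.
    """
    for j, t in enumerate(target):
        net = sum(weights[j][i] for i in active_inputs)
        out = stochastic_output(net, temperature, rng)
        error = t - out  # +1: should be on but is off; -1: the reverse
        if error != 0:
            # Only connections from active input units are changed.
            for i in active_inputs:
                weights[j][i] += lr * error
```

A unit that already gives the correct output leaves its connections untouched, so learning is driven entirely by errors; because outputs are sampled, even a well-learned item occasionally triggers a small adjustment.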