Rules or Connections in Past Tense
Psychology 209
February 4, 2013
Is there a past tense rule?
• Early on, children often produce exceptional
past tenses correctly (went, took, etc).
• But at some point, they also produce over-regularization errors.
• Also, children (and adults) produce ‘regular’
inflections for novel items when prompted, as in:
This man is ricking… Yesterday he ____.
• This was once taken as suggesting that young
children discover ‘the past tense rule’.
• The fact that children learn exceptions was
explained by ‘memorization’ or ‘lexical lookup’.
An Alternative to Assuming that Children
‘Acquire’ the Past-Tense Rule
• Rumelhart and McClelland proposed that the past tense
reflects regularities captured in the connections among units
in a connectionist system that learns from examples.
• We demonstrated this by implementing and running a simple
model of how people perform the task.
• The resulting debate has been fierce… and there are many
who still think our approach is misguided.
• But, as I’ll try to show, the evidence appears to be consistent
with our perspective.
• The work illustrates how models influence empirical research;
how easy it is to get misleading results in experiments; and
how important it is for researchers on each side of an issue to
follow up on each other’s findings.
• The RM model, introducing the connectionist approach
• Early critiques and responses lead to…
• The Pinker symbolic, dual mechanism account
• PDP approach argues that a single-system
connectionist approach has advantages over
the dual mechanism approach
• The exchange that was published in TiCS
(2002) is available in the readings – you can
read about the debate and judge for yourself.
• The issues have not been pursued as
intensively since then.
Training and Testing Procedure
• Training:
– Present Wickelfeature (WF) pattern representing present tense of verb.
– Compute WF pattern representing past tense of verb using
stochastic sigmoid activation function.
– Compare computed past-tense pattern to correct past-tense pattern.
– Adjust connections using Perceptron Convergence Procedure:
• Increase strength of connection from active input units to
output units that should be active but are not.
• Decrease strength of connection from active input units to
output units that should not be active but are.
• Testing:
– Present WF pattern of present tense of verb.
– Compute WF pattern.
– Compare to various alternatives on various measures.
– OR: Generate output using fixed decoding net.
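The training steps above can be sketched in a few lines. This is a minimal illustration, not the original RM code: the names `logistic` and `train_step`, the learning rate `lr`, and the `temperature` parameter are assumptions made for the example, patterns are plain lists of 0/1 values rather than real Wickelfeature vectors, and unit thresholds (biases) are left out to keep the sketch short.

```python
import math
import random


def logistic(x):
    """Numerically stable logistic (sigmoid) function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def train_step(weights, inp, target, lr=1.0, temperature=1.0, rng=random):
    """One presentation of a (present-tense, past-tense) pattern pair.

    weights[j][i] is the connection from input unit i to output unit j.
    Returns the number of output units that were wrong on this presentation.
    """
    errors = 0
    for j, t in enumerate(target):
        net = sum(w * x for w, x in zip(weights[j], inp))
        # Stochastic sigmoid unit: it turns on with probability logistic(net / T).
        out = 1 if rng.random() < logistic(net / temperature) else 0
        delta = t - out  # +1: should be on but is off; -1: should be off but is on
        if delta != 0:
            errors += 1
            for i, x in enumerate(inp):
                if x:  # only connections from active input units are adjusted
                    weights[j][i] += lr * delta
    return errors
```

Calling `train_step` repeatedly over all training pairs, epoch after epoch, gives the overall training loop; output units that are already correct leave their connections unchanged.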
Training Regime
• First ten epochs use 10 most frequent words
– Feel, have, make, get, give, take, come, go, look, need
• Remainder of training uses 10 most frequent
plus 400 words of ‘middle frequency’
• Each word is presented once per epoch
• An additional 84 lower-frequency words are
saved for generalization testing
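As a rough sketch of this schedule, the function below returns the verbs presented in a given epoch. The function name `words_for_epoch` and the 1-indexed epoch numbering are assumptions for illustration; the middle-frequency list is whatever 400 verbs the model was given, which the slide does not enumerate.

```python
def words_for_epoch(epoch, high_freq, mid_freq, n_initial_epochs=10):
    """Return the verbs presented (once each) in a 1-indexed training epoch."""
    if epoch <= n_initial_epochs:
        # Early phase: only the most frequent verbs.
        return list(high_freq)
    # Later phase: the frequent verbs plus the middle-frequency verbs.
    return list(high_freq) + list(mid_freq)


HIGH_FREQ = ["feel", "have", "make", "get", "give",
             "take", "come", "go", "look", "need"]
```

The abrupt change in vocabulary at epoch 11 is the feature of the regime that later critics singled out as unrealistic.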
Recapitulation of U-shaped learning
Varieties of Regular Past-Tenses
Types of Irregular
Responses to t/d verbs and other … with Novel …
• The tendency for irregular forms to
share features of the regular form:
do/did; say/said; make/made; have/had
keep/kept; kneel/knelt; learn/learnt…
hide/hid; lead/led; beat/beat; cut/cut; bid/bid…
59% of irregular past tense forms end in d or t
• In the PDP approach, such items participate in
the regular pattern just as regular forms do.
• In other approaches, they are often relegated
to a separate system, thought to operate
without reference to the rules, and therefore
must be treated as though they are out-and-out exceptions.
48/72 only activated correct responses; 6 activated no response; these are the remaining 18 items.
• The RM model can learn regulars and exceptions.
• It correctly inflects most unfamiliar regular
verbs, and makes over-regularization errors.
• It also captures children’s tendency to produce
occasional ‘irregularization’ responses and
other signs of sensitivity to sub-regularities.
• It produces a U-shaped developmental curve,
like children.
• It does all this in a single system without
explicit rules, rule-acquisition mechanisms, or
a separate lexicon of exceptions.
Critique (Pinker and Prince, 1988)
• Training regime unrealistic
– Child’s experience is relatively constant over time.
• Performance on regulars not good enough
– Makes quite a few errors, some quite strange
• Model can’t produce different past tenses for homophones
– ring the bell, ring the city, wring the clothes
• Wickelfeature representation has problems
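The homophone point can be made concrete with a simplified sketch. It uses Wickelphones (the context-sensitive phoneme triples from which Wickelfeatures are derived) and skips the feature level entirely; the function name and the ASCII phoneme symbols are assumptions for illustration only.

```python
def wickelphones(phonemes):
    """Return the set of (left, center, right) triples for a phoneme sequence.

    '#' marks the word boundary on either side.
    """
    padded = ["#"] + list(phonemes) + ["#"]
    return {tuple(padded[i - 1:i + 2]) for i in range(1, len(padded) - 1)}


# 'ring' and 'wring' sound the same, so they get the same phoneme sequence
# (ASCII stand-ins for the phonetic symbols: I = short i, N = ng).
ring = wickelphones(["r", "I", "N"])
wring = wickelphones(["r", "I", "N"])
```

Because the two inputs are identical, any function of the phonological input alone must give them the same past tense.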
Reply: Conceptualizations are not
implementations (MacWhinney and Leinbach)
• Included semantic as well as phonological information
– Allows different outputs for cases of same phonology but
different meanings
• Used a different input representation that led
to better performance on regulars
– Right-justified slot-based representation
• Model learned the regular pattern quite well
– Concept of ‘Condensing the Regularities’
• Did not address U-shaped curve
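The idea of a right-justified slot-based representation can be sketched as follows. The slide does not give the actual template, so the slot count, the padding symbol, and the use of letters in place of phonemes are all assumptions made for this illustration.

```python
def right_justify(segments, n_slots=6, pad="_"):
    """Place segments in fixed slots, aligned to the END of the word."""
    segments = list(segments)
    if len(segments) > n_slots:
        raise ValueError("word is longer than the number of slots")
    return [pad] * (n_slots - len(segments)) + segments


walk = right_justify("walk")    # ['_', '_', 'w', 'a', 'l', 'k']
jump = right_justify("jump")    # ['_', '_', 'j', 'u', 'm', 'p']
carry = right_justify("carry")  # ['_', 'c', 'a', 'r', 'r', 'y']
```

With this alignment the final segment of every stem falls in the same slot regardless of word length, which is where the information relevant to the regular ending sits.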
Plunkett and Marchman
• Trained with a corpus modeled after English:
– ‘Stationary’ frequency-weighted training environment
– Exceptions:
Small number of high-frequency ‘arbitraries’;
others come in clusters and share some features with the regular pattern
(most stem phonemes are preserved in the past tense)
• Found ‘micro-U’ shaped patterns during learning:
– Performance on a given item can vacillate so that correct responses
precede incorrect responses.
– This is consistent with the actual pattern seen in most children, where overregularizations are infrequent.
• Suggested that properties of networks actually offer an
explanatory basis for understanding U-shaped development
and the distribution of word forms in the language:
– Micro-U reflects competing changes in weights
– E.g., arbitraries are very difficult to learn; they must be high frequency and few in number
– No-change and vowel-change items are much easier and need not be of high frequency
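A ‘stationary’ frequency-weighted training environment can be sketched as sampling verbs with probability proportional to frequency, using the same distribution in every epoch. The function name and the frequency counts below are made-up placeholders, not values from Plunkett and Marchman's corpus.

```python
import random


def sample_epoch(vocab_freq, n_presentations, rng=random):
    """Draw training items with probability proportional to frequency.

    The same distribution is used in every epoch, so the environment does
    not change over the course of training.
    """
    verbs = list(vocab_freq)
    weights = [vocab_freq[v] for v in verbs]
    return rng.choices(verbs, weights=weights, k=n_presentations)


# Hypothetical frequencies: one very frequent 'arbitrary', two regulars.
TOY_FREQ = {"go": 80, "walk": 15, "jump": 5}
```

Under this kind of sampling a high-frequency exception is presented far more often than any single regular verb, in contrast to a regime where each word appears once per epoch.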
Pinker (1991, and elsewhere)
• Noted that performance on exceptions does show some
signs of exhibiting features like those seen in the RM
model. E.g., there is some similarity-based generalization,
so that forms occasionally ‘join’ irregular clusters (e.g.,
kneel-knelt, gling-glang).
• Proposed a dual mechanism account in which there is one
system that uses categorical, ‘algebraic’ rules insensitive to
item properties, and another that uses an ‘associative
memory mechanism’ much like the RM model.
• With Marcus, developed the notion that the rule is
completely insensitive to semantic and phonological
factors, depending only on the form-class of the stem.
• Has suggested in many places that the past tense rule is
acquired ‘suddenly’ in a ‘Eureka Moment’
• Pinker (with Ullman) claimed that brain damage can
produce a specific deficit in use of the regular inflection.
• He also argued that a familial genetic defect can lead to a
deficit in use of the regular inflection.
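The control flow of the dual mechanism account can be caricatured in a few lines. This is a toy sketch, not an implementation from Pinker or colleagues: the function name is invented, the rule operates on spelling rather than phonology, and an exact-match dictionary stands in for the associative memory, so the similarity-based generalization noted above is not captured.

```python
def dual_route_past(stem, irregular_memory):
    """Toy dual-route inflection: stored exception first, default rule otherwise."""
    stored = irregular_memory.get(stem)
    if stored is not None:
        return stored  # memory route: retrieved exception blocks the rule
    # Rule route: applies to any stem, whatever its sound or meaning.
    return stem + ("d" if stem.endswith("e") else "ed")


MEMORY = {"go": "went", "take": "took", "keep": "kept"}
```

For example, `dual_route_past("rick", MEMORY)` returns `"ricked"`, while `dual_route_past("go", MEMORY)` returns `"went"`.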
Reply to Pinker (see McClelland and
Patterson paper in TiCS exchange)
Past tenses are acquired gradually, not suddenly. This is true of
other inflectional forms as well and does not suggest the sudden
acquisition of a rule in a Eureka moment.
Initially children learn typical cases and gradually generalize to
other cases.
The tendency to regularize and the tendency to irregularize are
affected by phonological and semantic similarity to known items
of both types.
According to Pinker, this should happen for irregular items only.
For both the effects of brain damage and the genetic anomaly,
the evidence supports the view that the deficit is phonological, not
a matter of rules.
The individuals with these disorders have difficulty with complex phonological
forms, and their difficulty with regular past tenses disappears if you control for
phonological complexity.
Pinker’s approach fails to address the quasi-regularity in
exceptions, and therefore misses much of what is systematic in