A Cross Disciplinary Look at Statistics and Grounding in Human Lexical Learning Harlan D. Harris and James S. Magnuson Introduction

A Cross Disciplinary Look at Statistics and Grounding in Human Lexical
Learning
Harlan D. Harris and James S. Magnuson
Department of Psychology
Columbia University
New York, NY 10027
{harlan,magnuson}@psych.columbia.edu
Introduction
As the role of learning in complex intelligent systems
has become more prominent in Artificial Intelligence, researchers have become more interested in the topic of lexical learning. Their concerns have included: how to recognize new words in speech recognition tasks (Bazzi &
Glass 2000), how to extract lexical semantics from corpora (Boguraev & Pustejovsky 1996), how to link lexical
semantics to constrained semantic representations (Thompson & Mooney 2003; Siskind 1996) and how to ground
new lexical items in the agent’s environment (Steels 1996;
Roy 2003).
Psychologists have also been concerned with this issue,
from a number of different viewpoints. Developmental psycholinguists have asked questions about word segmentation (Cairns et al. 1997; Christiansen, Allen, & Seidenberg
1998), influences on and from (proto)syntax (Pinker 1987),
innate biases on semantics (Imai & Gentner 1997), the rapid
pace of word learning in children (Carey & Bartlett 1978),
and other topics. Psycholinguists studying adults have also
investigated word learning, particularly in the domain of
second-language learning (Mitchell & Myles 1998).
There are likely a number of ways in which wisdom developed in each of these fields could be beneficially transfered. Here, we will briefly review some recent important
findings in the literature of lexical acquisition, then outline
an experiment being conducted that will investigate the role
of statistical pattern extraction, attention, and reference on
lexical representations, and conclude by discussing the results that we may find and their relevance to natural language
in AI and NLP.
Psychological Issues in Word Learning
The process by which infants and young children, and indeed adults, learn language is complex and filled with controversy. Questions related to grammar acquisition are directly related to fundamental issues of human cognition such
as the extent of innateness and the unique properties that
make us human (Pinker 1994; Elman et al. 1996). The acquisition of words and their meanings has been a major topic
for many years, driven by a number of remarkable facts.
c 2004, American Association for Artificial IntelliCopyright gence (www.aaai.org). All rights reserved.
By the end of their first 12 months of life, a typical infant
can clumsily produce a very small number of words. These
first few words are primarily proper names, common nouns,
and social words such as “hi” (Bloom 2000). By about 3
years of age, children seem to be learning up to 10 words
per day (Carey 1978), of all grammatical categories. Notably, the production vocabulary lags behind comprehension
throughout acquisition (Dapretto & Bjork 2000). Also, the
use of newly-learned lexical items is often limited to particular structures. Tomasello and others have shown that while
a two-year-old child may produce many different verb constructions (intransitive, with instrument, etc.), for any particular verb only one or two constructions may be produced
(Tomasello 1992; Pine, Lieven, & Rowland 1998).
Current controversies include the role of syntax on semantic acquisition and vice-versa (Pinker 1987; Naigles &
Hoff-Ginsberg 1995; Tomasello 2000; Fisher 2002), and the
topics in question here, the role of general-purpose statistical processes in word learning, and the processes involved
in extracting lexical semantics from a scene.
In recent years, psychologists have become increasingly
interested in evaluating the power of statistical and connectionist techniques to account for data that previously
was viewed as reflecting algebraic processes (Christiansen
& Curtin 1999). Saffran and other researchers have performed a series of experiments showing that infants are
able to notice statistical patterns in series of auditorilypresented pseudo-words, and have suggested that this statistical knowledge may be helpful in segmenting sentences
into words (Saffran, Aslin, & Newport 1996; Christiansen,
Allen, & Seidenberg 1998) (but see Marcus et al. 1999).
After listening to several minutes of a synthesized voice
monotonously producing syllables, where the sequence is
comprised of a small number of randomly repeated threesyllable words (“budopa bikuti...”), pre-linguistic infants
were able to discriminate (as measured by proportions of
head-turns to test stimuli) between new sequences of these
words and a list consisting of the same syllables in random
order. In other words, infants are remarkably sensitive to
syllable transition probabilities in a process that later work
suggests may be being used to extract lexical candidates for
learning (Saffran 2001).
However, the precise role of this statistical process remains unclear. Naigles recently contrasted the conclusions
drawn by researchers working with pre-linguistic infants
(e.g., Saffran, Marcus) with those drawn by older children (e.g., Tomasello) (Naigles 2002). Whereas infants
can use statistical information to abstract relatively-complex
grammar-like patterns from syllable sequences (Marcus et
al. 1999), older children are unable to abstract verb usage to
multiple syntactic frames (Tomasello 1992). Naigles suggests that infants find it easy to extract the patterns from
the acoustic stimuli, but that linking (grounding) these new
forms with meaning is difficult, requiring cognitive and
other skills that do not develop until the second year.
The complexity of determining meaning of a novel word
given only context has been noted for at least 50 years. If a
speaker says “gavagai” while pointing at a rabbit, does the
word mean “rabbit,” “running,” “white,” “tasty,” or something else (Quine 1964)? How do we know what to represent
in our heads, and how do we know what in that representation should be linked to the word (Johnson-Laird 1987)? Researchers have demonstrated the importance of shared attention (Tomasello & Todd 1983), dialogue (Golinkoff 1986),
previous experience with word learning (Smith et al. 2002;
Regier 2003), and theory of mind (O’Neill 1996) toward resolving reference. Gilette et al. (1999), in a clever set of
experiments, examined the sorts of context required to identify the intended referent of an unknown word. Those results
showed that only simple concrete nouns are learnable without additional linguistic context.
Although it seems quite possible that the output of statistical processes are available for the creation of lexical items
(Saffran 2001), the nature of the processes that link these
items to semantics are by no means entirely clear, either in
children, where they are still developing, or even in adults.
The experiment described below examines lexical learning in adults, using different experimental conditions to look
at what sorts of reference and attention are required in order
for the output of the statistical process to affect the lexicon.
Work in Progress
One of the most ubiquitous findings in cognitive psychology
is that increased familiarity with a stimulus (due to increased
frequency, say) results in faster reactions to that stimulus.
For example, subjects name pictures of objects with high
frequency names more quickly than they name pictures of
objects with low frequency names (Oldfield & Wingfield
1965).
In a study currently in progress, we are using reaction time
to examine the relationship between the lexicon and statistical information available in the segmentation tasks used by
Saffran et al. 1996. Subjects are first explicitly taught a set of
new words describing novel objects, with some words presented more frequently and some less frequently (Magnuson
et al. 2003). Then, in the statistical segmentation phase, subjects listen to a continuous string of nonsense words, including some tokens of the newly learned low-frequency words,
while performing different tasks. The experimental conditions are distraction (not paying attention), attention (asked
to pay attention), segmented (explicitly given cues to word
boundaries), and referential (“grounded” – provided with visual objects associated with each lexical item). Then, sub-
jects are tested on the words they explicitly learned and their
reaction times are measured. To the extent that the items
present in the continuous string of nonsense words are linked
to the items in the lexicon, those items will have become
high-frequency. The reaction time measure is sensitive to
this frequency difference, allowing an evaluation of the importance of each of the experimental conditions to the representation of the lexical items. A final step explicitly asks
the subjects to evaluate syllable sequences as being present
or not present in the continuous stimuli, replicating earlier
studies (Saffran et al. 1999) and evaluating to what extent
segmentation by statistical pattern occurred in our task.
Discussion
Our hypothesis is that there will be significant effects of attention condition on the results in the eye-tracking phase. It
may be, for example, that subjects will be sensitive to the
distributional patterns, but that there is no interaction between learning in the statistical segmentation and explicit
lexical training components of the experiment unless subjects are directed to pay attention (look for words) in the statistical segmentation component. Furthermore, we predict
that the reference condition will show the strongest effect
of exposure, as some of the same mechanisms that underlie
word learning (e.g., linking of phonetic form to semantics)
will have been utilized in both the lexical training and segmentation portions of the experiment.
As researchers in AI and NLP expand upon previous work
in automatic lexical acquisition, using new techniques to
ground lexical items in a shared environment, an understanding of how humans link lexical items to semantics is
likely to be valuable. For example, results from experiments such as ours can help establish how statistical segmentation processes interact with representations in the lexicon. Other sorts of results will be able to provide, for example, biases useful for category formation for new lexical
items (Markman 1989) or information about how humans
act when teaching new words to other agents (be they children, adults, or computers) (Fisher & Tokura 1995). When
intelligent agents interact with people in shared environments, word learning will be one of many important skills
agents will need to perform. And as with other domains in
the cognitive sciences, there is little doubt that as practical
intelligent lexical-learning agents are developed, the lessons
learned will be informative to psychologists as well.
Acknowledgments
The work described here is funded by an NIH grant.
References
Bazzi, I., and Glass, J. 2000. Modeling out-of-vocabulary
words for robust speech recognition. In Proceedings of
the International Conference on Spoken Language Comprehension, 401–404.
Bloom, P. 2000. How Children Learn the Meanings of
Words. Cambridge, MA: MIT Press.
Boguraev, B. K., and Pustejovsky, J. 1996. Corpus Processing for Lexical Acquisition. Cambridge, MA: MIT
Press.
Cairns, P.; Shillcock, R.; Chater, N.; and Levy, J. 1997.
Bootstrapping word boundaries: A bottom-up corpusbased approach to speech segmentation. Cognitive Psychology 33:111–153.
Carey, S., and Bartlett, E. 1978. Acquiring a single new
word. Papers and Reports on Child Language Development 15:17–29.
Carey, S. 1978. The child as word learner. In Halle, M.;
Bresnan, J.; and Miller, J. A., eds., Linguistic theory and
psychological reality. Cambridge, MA: MIT Press.
Christiansen, M. H., and Curtin, S. L. 1999. The power
of statistical learning: No need for algebraic rules. In Proceedings of the 21st Annual Conference of the Cognitive
Science Society. Mahwah, NJ: Cognitive Science Society.
Christiansen, M. H.; Allen, J.; and Seidenberg, M. S.
1998. Learning to segment speech using multiple cues: A
connectionist model. Language and Cognitive Processes
13:221–268.
Dapretto, M., and Bjork, E. 2000. The development of
word retrieval abilities in the second year and its relation to
early vocabulary growth. Child Development 71:635–678.
Elman, J. L.; Gates, E. A.; Johnson, M. H.; KarmiloffSmith, A.; Parisi, D.; and Plunkett, K. 1996. Rethinking
Innateness: A Connectionist Perspective on Development.
Cambridge, MA: MIT Press.
Fisher, C., and Tokura, H. 1995. The given/new contract
in speech to infants. Journal of Memory and Language
34:287–310.
Fisher, C. 2002. The role of abstract syntactic knowledge
in language acquisition: A reply to Tomasello (2000). Cognition 82:259–278.
Gilette, J.; Gleitman, H.; Gleitman, L.; and Lederer, A.
1999. Human simulations of vocabulary learning. Cognition 73:135–176.
Golinkoff, R. 1986. ”I beg your pardon?” The preverbal
negotiation of failed messages. Journal of Child Language
13:455–476.
Imai, M., and Gentner, D. 1997. A cross-linguistic study
of early word meaning: Universal ontology and linguistic
influence. Cognition 62:169–200.
Johnson-Laird, P. N. 1987. The mental representation of
the meaning of words. Cognition 25:189–211.
Magnuson, J. S.; Tanenhaus, M. K.; Aslin, R. N.; and Dahan, D. 2003. The time course of spoken word learning
and recognition: Studies with artificial lexicons. Journal
of Experimental Psychology: General 132(2):202–227.
Marcus, G. F.; Vijayan, S.; Rao, S. B.; and Vishton, P. M.
1999. Rule learning in 7-month-old infants. Science
283:77–80.
Markman, E. 1989. The internal structure of categories. In
Markman, E., ed., Categorization and naming in children.
Cambridge, MA: MIT Press. 39–63.
Mitchell, R., and Myles, F. 1998. Second Language Learning Theories. Edward Arnold.
Naigles, L., and Hoff-Ginsberg, E. 1995. Input to verb
learning: Evidence for the plausibility of syntactic bootstrapping. Developmental Psychology 31:827–837.
Naigles, L. R. 2002. Form is easy, meaning is hard: resolving a paradox in early child language. Cognition 86:157–
199.
Oldfield, R. C., and Wingfield, A. 1965. Response latencies in naming objects. The Quarterly Journal of Experimental Psychology 17:273–281.
O’Neill, D. 1996. Two-year-old children’s sensitivity to
a parent’s knowledge state when making requests. Child
Development 67:659–677.
Pine, J.; Lieven, E.; and Rowland, C. 1998. Comparing
different models of the development of the English verb
vocabulary. Linguistics 36:807–830.
Pinker, S. 1987. The bootstrapping problem in language
acquisition. In MacWhinney, B., ed., Mechanisms of language acquisition. Hillsdale, NJ: Erlbaum. 399–432.
Pinker, S. 1994. The Language Instinct. New York:
HarperPerennial.
Quine, W. V. O. 1964. Word and Object. MIT Press.
Regier, T. 2003. Emergent contraints on word-learning:
a computational perspective. Trends in Cognitive Sciences
7(6):263–268.
Roy, D. 2003. Grounded spoken language acquisition:
Experiments in word learning. IEEE Transactions on Multimedia 5(2):197–209.
Saffran, J. R.; Aslin, R. N.; and Newport, E. L. 1996. Statistical learning by 8-month-old infants. Science 274:1926–
1928.
Saffran, J. R.; Johnson, E. K.; Aslin, R. N.; and Newport,
E. L. 1999. Statistical learning of tone sequences by human
infants and adults. Cognition 70:27–52.
Saffran, J. R. 2001. Words in a sea of sounds: The output
of infant statistical learning. Cognition 81:149–169.
Siskind, J. M. 1996. A computational study of crosssituational techniques for learning word-to-meaning mappings. Cognition 61:39–91.
Smith, L. B.; Jones, S. S.; Landau, B.; Gershkoff-Stowe,
L.; and Samuelson, L. 2002. Object name learning provides on-the-job training for attention. Psychological Science 13(1).
Steels, L. 1996. Perceptually grounded meaning creation.
In Tokoro, M., ed., Proceedings of the International Conference on Multi-Agent Systems, 338–344. Menlo Park,
CA: AAAI Press.
Thompson, C. A., and Mooney, R. J. 2003. Acquiring
word-meaning mappings for natural language interfaces.
Journal of Artificial Intelligence Research 18:1–44.
Tomasello, M., and Todd, J. 1983. Joint attention and
lexical acquisition style. First Language 4:197–212.
Tomasello, M. 1992. First verbs: a case study of early
grammatical development. Cambridge: Cambridge University Press.
Tomasello, M. 2000. Do young children have adult syntactic competence? Cognition 74:209–253.