Acquisition of speech sounds,
segmenting the phonetic string
The target
• Arbitrary relationship between sound and meaning
• Abstract concepts, displacement from here
and now
• Hierarchical organisation, compositionality:
• Astronomical (perhaps infinite) number of
form-meaning pairings
• Sketchiness: only some of the intended meaning is expressed
explicitly, the rest has to be inferred by the listener
– Done.
• Ambiguity: words, structures, whole sentences may be
ambiguous – relevant meaning needs to be recovered from context
– Drunk gets nine months in violin case (Pinker 1984)
• Cultural embeddedness: communication requires implicit
understanding of the intentions of conversation partners,
social habits
– She had her leg waxed vs. She had her leg broken
• Transmissibility: messages are transmitted rapidly and
relatively errorlessly – processing is done on-line using a
variety of sources of information
The input
Varies across cultures
No systematic instruction in most cultures
Teaching at best through correction of errors
Syntactic rules are never explained, most
people are not aware of them
Speech sounds
• A set of acoustic cues identifies each speech sound
• They are influenced by:
– Neighbouring sounds
– Overall utterance prosody
– Speech rate
– Speaker voice quality
– Environmental noise, etc.
Phonemes and allophones
• Phoneme: speech sounds that distinguish
words (sinner vs. singer)
• Allophones: phonetically different speech
sounds that do not distinguish words (enged)
• Every language has its own phoneme inventory
• Categorical perception: we cannot hear
“meaningless” differences in sounds. Sounds
are automatically categorised.
19th century
Head-turn preference paradigm
Light from left or right
Auditory stimulus from same area
Length of looking measured
Results of experiments: language discrimination
• Newborns can distinguish their native language from other
languages of a different rhythm (Mehler et al 1988)
• They can also discriminate between two unfamiliar
languages of different rhythms (Mehler & Christophe 1998,
Nazzi et al 1998)
• But at 2 months of age they stop discriminating between
unfamiliar languages (Mehler et al 1988, Christophe &
Morton 1998)
• In the first few months discrimination is based on rhythm
and intonation
• At 5 months, infants can also rely on individual speech
sounds (they can distinguish languages with similar
intonation patterns, such as Dutch and English) (Nazzi et al
• By 9 months, infants prefer to listen to
unknown words not violating the phonotactic
patterns of their language (e.g., pottle vs.
• They also prefer nonsense words with
frequent sound combinations
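The preference for frequent sound combinations can be illustrated with a small sketch. It assumes a hypothetical toy lexicon, uses letters as a stand-in for speech sounds, and scores a candidate word by the average log probability of its adjacent sound pairs. The lexicon, the test items (blick, bnick) and the function names are illustrative assumptions, not materials from the cited studies.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; letters stand in for speech sounds.
LEXICON = ["black", "blue", "brick", "click", "trick",
           "block", "bleak", "nick", "neck"]


def pairs(word):
    """Adjacent symbol pairs, with '#' marking the word edges."""
    padded = "#" + word + "#"
    return list(zip(padded, padded[1:]))


# How often each pair occurs, and how often each symbol starts a pair.
pair_counts = Counter(p for w in LEXICON for p in pairs(w))
first_counts = Counter(x for w in LEXICON for x, _ in pairs(w))
symbols = {s for w in LEXICON for s in w} | {"#"}


def phonotactic_score(word):
    """Mean log P(next | current), with add-one smoothing for unseen pairs."""
    logs = [
        math.log((pair_counts[(x, y)] + 1) / (first_counts[x] + len(symbols)))
        for x, y in pairs(word)
    ]
    return sum(logs) / len(logs)


# 'bl' is frequent in the toy lexicon, 'bn' never occurs.
print(phonotactic_score("blick") > phonotactic_score("bnick"))
```

A higher score means the nonsense word is built from combinations that are more frequent in the lexicon. Real models of phonotactic probability work on phonemic transcriptions and much larger corpora.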
Results of experiments: phoneme discrimination
• Categorical perception at 1 month of age (Eimas
• Confirmed by ERP studies: odd-ball detection
paradigm (electrophysiological response to novel
auditory stimulus) (Cheour-Luhtanen et al 1995)
• For the first few months, infants can distinguish
phonemes in any language, but by age 6-8
months, they are only sensitive to contrasts in
their native language (Kuhl et al 2006)
• At 9 months, infants can re-learn to
discriminate phonemic contrasts in foreign
languages (Kuhl et al 2003)
– after twelve 25-minute reading sessions
Kuhl: The linguistic genius of babies
Possible learning mechanisms
• Top-down from word meaning: phonemes distinguish words
– But 10-month-olds have a very small vocabulary
• Distributional learning:
– non-contrastive sound pairs (allophones) occur in
complementary distribution – e.g. English vowel
– systematic phonetic variation which is not contextually
predictable is contrastive in the target language
• Categories influence perception: items in the same
category are perceived to be more similar than items
across category boundaries
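The distributional idea can be sketched in a few lines: two sounds count as candidate allophones if the contexts in which they occur never overlap. The sketch below makes simplifying assumptions: context is reduced to the immediately following sound, and the transcriptions are hypothetical toy data loosely modelled on the enged example, not a real corpus.

```python
def following_contexts(words, phone):
    """Set of sounds (or '#' for word end) that directly follow `phone`."""
    found = set()
    for word in words:
        for i, sound in enumerate(word):
            if sound == phone:
                found.add(word[i + 1] if i + 1 < len(word) else "#")
    return found


def in_complementary_distribution(words, a, b):
    """True if both sounds occur and never share a following context."""
    ctx_a = following_contexts(words, a)
    ctx_b = following_contexts(words, b)
    return bool(ctx_a) and bool(ctx_b) and ctx_a.isdisjoint(ctx_b)


# Hypothetical toy transcriptions (one list of sounds per word).
TOY_WORDS = [
    ["e", "ŋ", "g", "e", "d"],
    ["h", "a", "ŋ", "g"],
    ["n", "e", "m"],
    ["v", "a", "n"],
    ["m", "e", "n", "t"],
]

# [ŋ] occurs only before [g], [n] never does: candidate allophones.
print(in_complementary_distribution(TOY_WORDS, "ŋ", "n"))
# [n] and [m] share contexts, so their difference is not predictable.
print(in_complementary_distribution(TOY_WORDS, "n", "m"))
```

With so little data almost any two sounds could look complementary by chance, so a learner would need many tokens before such a pattern becomes reliable.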
• There are no universal syllables
• Languages differ widely in their rhythmic
• Only subtle cues to word boundaries (segment
duration, allophonic variation, phonotactics), but
they are neither obvious nor universal
• Segmentation errors are common in adult speech
• Top-down processes seem to play a major role for
whereare the s
Saffran 2003
Distributional evidence
• Transitional probabilities
• pretty baby, pretty dress, pretty doll
-> pre+ty highly probable (within word), ty+ba less probable (across word boundary)
• Nonsense language learning experiments (Saffran et al 1996, Aslin,
Saffran & Newport, 1998):
In the training phase, 8-month-olds listen to an uninterrupted sequence of three-syllable words: bidakupadotigolabubidaku
In the test phase, they listen longer to nonwords (kupado) than to words
• Works with musical notes (Saffran, Johnson, Aslin & Newport 1999)
• Tamarin monkeys are as sensitive as human children (Hauser,
Newport & Aslin, 2001)
• Infants under 6 months cannot reliably identify these word
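A minimal sketch of how transitional probabilities could be computed over a syllable stream like the one above, taking TP(x -> y) = count(xy) / count(x). The three words come from the example sequence; the order in which they are concatenated, the threshold of 0.9 and the function names are assumptions made for illustration, not the design of the original experiments.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """TP(x -> y) = count(xy) / count(x), for every adjacent pair.

    count(x) only includes occurrences of x that have a successor.
    """
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


def segment(syllables, threshold):
    """Posit a word boundary wherever the TP drops below `threshold`."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for x, y in zip(syllables, syllables[1:]):
        if tp[(x, y)] < threshold:
            words.append("".join(current))
            current = []
        current.append(y)
    words.append("".join(current))
    return words


# The three nonsense words; their ordering below is a hypothetical choice.
WORDS = {"A": ["bi", "da", "ku"], "B": ["pa", "do", "ti"], "C": ["go", "la", "bu"]}
ORDER = "ABCACBABCBACBCA"
stream = [syllable for w in ORDER for syllable in WORDS[w]]

tp = transitional_probabilities(stream)
print(tp[("bi", "da")])   # within a word
print(tp[("ku", "pa")])   # across a word boundary
print(sorted(set(segment(stream, threshold=0.9))))
```

In this toy stream every within-word transition has probability 1.0, while transitions across word boundaries are at most 0.6, so a simple threshold recovers the three words. Natural speech gives far noisier statistics.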
Statistical learning
• Implicit, procedural learning of probabilistic
co-occurrence patterns
• based on invariance detection through
attention to relatively stable patterns or
regularities (Gogate & Hollich 2010)
Prosodic properties as cues
• Noise detection experiments
6-month-olds are slower to detect noise if it interrupts two-syllable
sequences with a typical stress pattern (Morgan 1996)
• 9-month-olds prefer words with a typical stress pattern
(Jusczyk 1997, 1999, 2002)
• they listened to isolated words with typical (kingdom, hamlet)
vs. atypical (device, align) stress pattern
• then they listened to passages with or without the test words
• they listened longer to passages with the typical words
• Frequency of sounds varies by environment: word initial,
word medial, across word boundaries. 9-month-olds are
sensitive to these patterns (Mattys & Jusczyk 2001, Brent &
Cartwright 1996)
