INSTITUTE FOR RESEARCH IN
COGNITIVE SCIENCE
UNIVERSITY OF PENNSYLVANIA
Pinkel Lecture
SPEAKER INTRODUCTION
DR. JOHN TRUESWELL: Good afternoon and welcome to the 12th Annual
Benjamin and Anne Pinkel Lecture. I'm Professor John
Trueswell. I'm the Director of the Institute for Research in
Cognitive Science here at Penn. And this is, as I've just
said, the 12th in the series of Pinkel lectures. The
Pinkel endowed lecture series was established 12 years ago
through a generous gift from Sheila Pinkel on behalf of the
estate of her parents Benjamin and Anne Pinkel.
The series serves as a memorial tribute to their lives.
Benjamin Pinkel received a Bachelor's degree in Electrical
Engineering here at Penn in 1930. And throughout his life he
was actively interested in the philosophy of mind and
published a monograph in 1992 on the subject entitled
Consciousness, Matter and Energy: The Emergence of Mind in
Nature. In fact we have a copy right here which will be a
gift to our speaker. The objective of the book was and I
quote "a reexamination of the mind-body problem in light of
new scientific information". The lecture series is intended
to advance the discussion and rigorous study of the deep
questions which engaged Dr. Pinkel's investigations.
And over the past 12 years the series has brought some of the
most interesting minds in the field of cognitive science as
it pertains to thought, learning and consciousness. These
include Daniel Dennett, Liz Spelke, Martin Nowak, Stan
Dehaene, Geoff Hinton, Ray Jackendoff, Colin Camerer, Elissa
Newport, Christof Koch, Alvaro Pascual-Leone and Alvaro was
last year.
It's a great pleasure to add to this list, this esteemed
list, Dr. Patricia Kuhl, who will be speaking about cracking
the speech code: language and the infant brain. Now
Dr. Kuhl is the Bezos Family Foundation endowed Chair for
Early Childhood Learning at the University of Washington.
She's co-director of the Institute for Learning and Brain
Sciences and director at the University of Washington's NSF
Science of Learning Center. And she's also a professor of
speech and hearing sciences.
She's known internationally for her work on early language
development and its neural underpinnings. She's perhaps best
known for her research demonstrating that early exposure to a
language greatly alters how infants perceive and process
speech. Dr. Kuhl's many years of research have aptly
demonstrated and I quote here from one of her papers "infants
are born citizens of the world with regard to language. They
can distinguish sounds from languages around the world even
if they've never heard them before. By the end of the first
year of life however they become language specialists and the
ability to attend to sounds from foreign languages greatly
diminishes as their native language abilities significantly
increase."
This is truly groundbreaking work that has shaped a
generation of speech perception researchers, cognitive
scientists and cognitive neuroscientists. And for this she
has been internationally recognized in many ways. She's a
member of the American Academy of Arts and Sciences, the
Rodin Academy, the Norwegian Academy of Sciences and Letters.
She was awarded the Silver Medal of the Acoustical Society of
America in 1997. And in 2005 the Kenneth Craik Research
Award from Cambridge University. She's a Fellow of the
American Association for the Advancement of Science, the
Acoustical Society of America, and the American Psychological
Society. And in 2008 in Paris, Dr. Kuhl was awarded the Gold
Medal for the Acoustics Branch of the American Institute of
Physics. It's truly an honor to have Dr. Kuhl here as the
12th Pinkel lecturer. So please give a warm welcome to Dr.
Kuhl.
[Applause]
LECTURE
DR. PATRICIA KUHL: I was very excited to be here today and very
interested to learn more about Professor Pinkel. I had no
idea that he was prescient in terms of telling us how
important the study of the mind was going to be. It's a great
pleasure to be here and see all of my old colleagues and talk
about something I really love and that is the science of
learning with regard to language and the human brain.
Humans' ability for language has stunned many great
scientists for centuries. And I think that the theories are
moving along. A lot has been discovered since we actually
started studying babies in the early 70s, which early
theorists had not. What we've uncovered in these studies is quite
stunning and leads us to sort of revise the model by which we
imagine children are going about their task every day as they
learn a language or two or three.
We've always been interested in the extent to which language
relies on special machinery and relies on special
psychological processes. And that too is under revision as
we look at new experiments that involve, from the very
beginning, infants' social skills and cognitive skills. So
we'll have a lot to say about that today. It's very
interesting that the infant data are playing a very
significant role in the debate about language.
So I'm going to start today by showing what I think is a very
interesting and mysterious process, what biologists will call
a critical period in development. And language is one of the
quintessential examples: when biologists talk to people and
want to give an example from human development rather than
animals, they point to language and say there is a critical
period with regard to learning. If you find your age on the
bottom of this curve and look at your skill, this cartoon
sort of illustrates what the phenomenon is all about.
As opposed to many other things in the psychological
literature, adults are not better at the acquisition of a
second language. They're actually worse and substantially
worse. And this curve, which is actually a cartoon
representing the data shown in a paper by Johnson and Newport
some time ago, summarizes a lot of literature having to do
with syntactic performance in a second language and
phonological performance in a second language.
And what it illustrates is that between 0 and 7 the kids are
brilliant: regardless of how many languages you put in front
of them, they will acquire those
languages with great skill. And beyond the age of about 7
there's a systematic decline. Decline can be seen at 8 to 10
years, at 11 to 15 years, and 17 to 39. I guess that's most
of us in this room, maybe all of us in this room, we kind of
drop off the map.
It isn't as though you can't learn a second language but you
do it differently and it is not as automatic and particularly
with regard to the subtleties of syntax and phonology, with
the production and perception of speech in particular, you
never become the expert that you would have become had you
been exposed in the first 7 years.
Now in addition to being a fairly depressing curve to
start a talk with, demonstrating your lack of
skill, it provides a really fundamental puzzle that a lot of
people have grappled with. No one really disputes this
curve. We know that the curve is made up of little curves.
There are curves in here that you could draw for phonological
learning, for word learning and for syntactic learning where
the infants and children seem to specialize in that aspect of
language learning. But the broad curve is not disputed.
What is under great debate is what causes it. What's it
about?
Lenneberg had a hypothesis early on that it had to do with
the brain and the development of the corpus callosum, and that
everything changed with regard to the brain once the corpus
callosum was in place, which happens at about the age of 5.
But that's no longer what we consider to be the most
interesting hypothesis. We're working on what's going on in
the infant brain at that period in development. We're
hypothesizing that it's learning itself that causes the
failure to learn later. That the neural architecture, the
composition of the neural networks that develop as you're
exposed to the signals coming from a particular language, its
auditory patterns, its statistical properties, builds
networks that are then resistant. Your networks for Japanese
simply don't fit French. So once that is established the
learning itself reduces the potential for you to learn later.
So that's one of the things that we're trying to entertain.
We're spending our time studying the very earliest processes.
Most of what I'll tell you about today has to do with what
babies are doing in the first year of life. So we're looking
very early on to see what the markers are. We think that we
can play out major theories with regard to language
development by looking at the phonetic level, the earliest
processing of sounds from different languages and how that
points the infant on a particular path.
The reason for doing it is, one, we hope to learn what the
magic is that kids are putting to work here that we can't do
here. And potentially, potentially, if we understood the
algorithms that they put to work we might imagine being able
to invent training programs that would help adults learn in
the more automatic and beautiful way that kids learn. And
secondly, for those of us who are interested in developmental
disabilities, this is where they happen. So kids who have
language disorders, whether stemming from Autism, Fragile X,
or specific language impairment, all of these disorders have
their origins in development, some of them obviously genetic
impairments. But the hope lies in early diagnostics; I'm
going to show you some examples of biomarkers that we
think will be effective in diagnosing Autism at about 6
months of age. Biomarkers, getting in early to examine the
precursors for language might allow us to treat children with
developmental disabilities at a much earlier stage when the
brain is so plastic, when you're in this rapid period of
learning, so that we could catch them up before it became too
late to do so.
So in order to develop the arguments, we're going to tell you
a little bit about speech, all right? So I'm going to be
talking not about grammar but about the processing of sounds
that are used in language and how that occurs. What are the
major problems? So we start with a little bit of physics of
sound. And what you see in this graph is the posture of the
mouth and the tongue when you produce two isolated vowels.
These are shown in phonetic transcription: Ah and A. At the
bottom you see the physics of the situation. When you listen
to a vowel or any kind of sound like Ah or A, what you're
doing is tracing the formant frequencies; this is a steady
state version: Aaah. AAA.
And you can see the distinction is in these resonant
frequencies that change very slightly between these two
vowels.
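As a rough illustration of the source-filter idea behind formants, here is a minimal Python sketch; the fundamental frequency, bandwidths and the (F1, F2) pairs are textbook-style values chosen for illustration, not figures from the talk:

    import numpy as np
    from scipy.signal import lfilter

    fs, f0, dur = 16000, 120.0, 0.5  # sample rate, pitch, duration (assumed)

    def resonator(x, freq, bw):
        # Second-order IIR resonator: one formant as a damped resonance.
        r = np.exp(-np.pi * bw / fs)
        theta = 2 * np.pi * freq / fs
        a = [1.0, -2 * r * np.cos(theta), r * r]
        return lfilter([sum(a)], a, x)  # numerator normalizes DC gain to 1

    # Glottal source: an impulse train at the fundamental frequency.
    source = np.zeros(int(fs * dur))
    source[::int(fs / f0)] = 1.0

    # Illustrative (F1, F2) pairs: the two vowels differ mainly in F2.
    for name, (f1, f2) in {"Ah": (730, 1090), "A": (660, 1720)}.items():
        wave = resonator(resonator(source, f1, 90), f2, 110)
        print(name, "peak amplitude:", round(float(np.max(np.abs(wave))), 3))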
Now this is a steady state situation which is not what I'm
doing right now, not what we walk around doing. We're
speaking rapidly so the steady state almost never occurs. So
you can imagine the millisecond changes that your auditory
system has to track in order for you to hear the difference
between Bah and Pah, or Ah, and Ee. You have to track these
formant frequencies in real time. They change constantly.
And we're talking about millisecond differences and small
differences in frequency and small differences in amplitude
and duration. So these are hard from the standpoint of
physics.
No computer in the world has solved the problem that human
minds solve very early in development, and that itself is a
tremendous puzzle given that there's a little software
company just across the lake from the University of
Washington in Seattle, one of whose main jobs is
to crack the speech code. So Bill Gates and his team of say
500 or so researchers have been watching us and
trying to mimic with computers what babies are doing, and
they have yet to solve the problem of speech recognition by
machines. So there's something that we're bringing to this
task that isn't just the raw statistics of the situation that
a computer could solve.
So I want you to get one flavor of the problem by just
listening to the variability. Listen to the variability as a
number of different speakers produce the vowel Ah. [Sound
sample] Okay. So that's not difficult, right? You hear all
of the variations. You can tell whether someone's young or
old, male or female. But you know it's an Ah. Here's the
same speakers producing the vowel AA. [Sound sample]. Okay.
So this is the problem that Bill Gates' computer can't solve.
You and I do it at the drop of a hat. But the machine just
simply can't do this yet. The only speech recognition by
machine that exists at the moment is programs in which you
have to isolate each word and speak very slowly. So, you,
have, to, talk like this, because it can't sort out, you
know, it can't count the words, can't find the words, can't
identify the phonemes because there's so much variability.
But the kids knock this problem off at, you know, 6 months of
age.
So if we look at the data, this is the old data, one of the
first pieces of data about phonetic perception and this color
is awfully hard to see, but what we have here are two
syllables, Bah and Pah. This is a cartoon illustrating that
if we take one of the critical acoustic differences and change
it in small steps along a continuum, what you'll see, in the
infant data shown by Eimas, et al., in Science 1971,
is that babies are really keen, right here at the boundary,
the acoustic boundary between two phonemes. All of a sudden
at the boundary they get better at hearing small
distinctions. And we demonstrated a few years later that
this is a deep phylogenetic ability. Right? So chinchillas,
and later we demonstrated that monkeys, have that same
tendency to break an auditory continuum right at the place
where language has put the boundary.
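One compact way to see why discrimination should peak at the boundary is the classic labeling account: two stimuli are discriminable roughly to the extent that they receive different category labels. A minimal sketch with an invented voicing continuum and boundary:

    import numpy as np

    vot = np.arange(0, 65, 5)       # voice onset time in ms (illustrative)
    boundary, slope = 25.0, 0.3     # assumed English-like /b/-/p/ boundary
    p_pa = 1 / (1 + np.exp(-slope * (vot - boundary)))  # P(labeled "Pah")

    # Probability that two adjacent steps receive different labels; 2AFC
    # discrimination is then chance (0.5) plus half of that probability,
    # so predicted discrimination peaks right at the category boundary.
    p_diff = p_pa[1:] * (1 - p_pa[:-1]) + p_pa[:-1] * (1 - p_pa[1:])
    for v, p in zip(vot[:-1], 0.5 + 0.5 * p_diff):
        print(f"{v:2d} vs {v + 5:2d} ms: predicted discrimination {p:.2f}")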
So there's something about our inheritance. There's
something about complex auditory signals. It seems to be
true for mammalian hearing that was capitalized on in the
invention of a sound structure for languages. And we can see
that across many different languages, babies are capable of
hearing these distinctions and animals are as well. But of
course that's not the only problem. Hearing this distinction
right here at the boundary is fantastic for the babies.
Animal data doesn't take anything away from a baby's ability
to do that. It's a great leg up on the problem when you can
hear these fine distinctions and you're particularly
sensitive here at the boundaries. But that's not all there
is.
So when Bill Gates looks at the problem with his computers,
this is what he sees. Now we're going to take formant one
and formant two. If you don't know anything about the
acoustics of speech sounds: a vowel like Ah is
like a chord on the piano. The notes compose a chord, a
unified chord. The formant frequencies, F1, F2, when put
together with three other formants produce a vowel. But the
problem is there's huge variability. So these are data taken
from a number of different speakers where each symbol in
here, these are phonetic symbols, each one is a different
talker.
You can see the huge variability and overlap in the vowels of
English. So you're going to [speaks the vowel progression]
in this graph. But the problem is if you're a computer, the
acoustic physical measurement itself doesn't tell you which
vowel it is because the overlap is so profound. Right? And
this is in isolated utterances; it's not in running speech
like I'm doing.
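To see the overlap problem in miniature, one can simulate two neighboring vowel categories as clouds of talkers in F1/F2 space and classify from the raw acoustics alone; the means and spreads below are invented for illustration, loosely in the spirit of classic vowel measurements:

    import numpy as np

    rng = np.random.default_rng(0)

    # Two neighboring vowel categories; each point is a different talker.
    means = {"eh": np.array([580.0, 1800.0]), "ae": np.array([660.0, 1700.0])}
    cov = np.diag([90.0**2, 180.0**2])  # large talker-to-talker variability

    samples = {v: rng.multivariate_normal(m, cov, 500) for v, m in means.items()}

    # Nearest-centroid classification straight from the physical measurement.
    errors = 0
    for true_vowel, pts in samples.items():
        dists = {v: np.linalg.norm(pts - m, axis=1) for v, m in means.items()}
        predicted = np.where(dists["eh"] < dists["ae"], "eh", "ae")
        errors += int(np.sum(predicted != true_vowel))
    print("error rate from raw F1/F2 alone:", errors / 1000)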
So you can see why we're not all typing into--and we're all
still typing into our computers instead of using a
microphone. If this problem had been solved by Microsoft or
any other software company we would have long ago dispensed
with the keystrokes. We would be talking into computers and
controlling devices with our voices. But that's not a
reality in spite of millions of dollars being thrown at that
problem.
Now the other thing that's interesting from the standpoint of
language development is this same vowel triangle is used
across all languages. So Swedish shoves 16 vowels in this
space. And it's really a minefield to look at the production
of Swedish speakers. Japanese does as well, only with 5 vowels,
and Spanish with 5 vowels. But they also show overlap; we're
as sloppy as we can be, so Japanese speakers will use the whole
space and show overlap even in their 5 vowels. They don't
make it easier for the kids to acquire by speaking a language
that uses fewer vowels.
So herein lies the problem. The babies have the ability to
hear fine distinctions when you isolate everything in a
carefully controlled experiment. But in the real world
they're hearing all this variability and have to, in order to
learn words, decide how many categories their language uses
and which ones they are.
So we've started by studying the babies right from the
beginning and saying in studies across many countries in the
world what sounds can they distinguish in the beginning and
how does that, you know, transition to a set of sounds that
are exclusive to a language? So one of the techniques that
we pioneered--and our lab is about 10 years old--looks at
event-related potentials, sort of old technology but very,
very improved now for studying human cognition.
Babies wear a cap with sensors, electrode sensors in it,
picking up the electrical activity as they listen to sound.
And in the standard procedure, babies are hearing a
background sound like Ah, Ah, Ah, and then on occasion they
will hear a different sound like Eeh. And we're looking at
brainwave changes between the repeated standard and the
deviant sound. And what you see plotted here is the mismatch
negativity. Mismatch negativity in adults and babies
indicates which sounds the listener is sensitive to.
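The arithmetic behind that measure is simple to sketch: average the epochs time-locked to the repeated standard, average the epochs for the deviant, and subtract. A toy Python version with synthetic data standing in for real infant EEG:

    import numpy as np

    rng = np.random.default_rng(1)
    fs, n_samples = 250, 175            # 700 ms epochs at 250 Hz (assumed)
    t = np.arange(n_samples) / fs

    def epochs(n_trials, amplitude):
        # Synthetic trials: noise plus, for deviants, a negative deflection
        # around 250 ms, roughly where the mismatch response sits.
        bump = -amplitude * np.exp(-((t - 0.25) ** 2) / (2 * 0.04**2))
        return bump + rng.normal(0.0, 2.0, (n_trials, n_samples))

    standard = epochs(800, 0.0)   # Ah, Ah, Ah, ... the repeated background
    deviant = epochs(120, 1.5)    # the occasional Eeh

    # The mismatch negativity is the difference wave.
    diff = deviant.mean(axis=0) - standard.mean(axis=0)
    print(f"MMN peak {diff.min():.2f} (a.u.) at {t[np.argmin(diff)] * 1000:.0f} ms")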
I'll come back to this technique, and to the sensitivity
of the brain measure relative to a behavioral measure, a little
bit later. But here is the behavioral measure. We've used this
one for 30 years. This is a technique that has mom or a dad
holding the baby. Baby's around 6 months of age. And
there's an assistant playing with toys to the baby's right.
She has a box of toys and she's bringing them up quietly.
These are silent toys and keeping the baby's attention while
the loudspeaker is repeating the background sound.
So Ah, Ah, Ah, coming out of this one. And what the baby has
to do is while watching the toys be attentive to the sound
change because when it changes to anything else the baby's
got three seconds to turn their head to the black box and if
they do so at the right time and not the wrong time the black
box is animated and something fun inside lights up. So we
run control trials and experimental trials to make sure: on
control trials you're not changing the sound, but you're still
monitoring for head turns, right? Let's look at a baby in
this task.
[Runs video of head turning task]
All right. She's brilliant and she knows it. Okay. So any
place on the planet, we're set up in 8 countries to do these
tests, with any contrast that's been tested across all those
languages, babies show the ability to perform in the head
turn task such that they demonstrate their ability to
discriminate the sounds. So they don't need experience in
listening to the sounds. And that's why, as John said, I've
called the babies citizens of the world. They simply come
into the world prepared. And that's no trivial task when
these acoustic cues are so minute. They come into the world
ready for any language. And that distinguishes them from us,
right? We're culture-bound listeners. And we are very, very
good at the sounds of the languages we've experienced in the
first 7 years, not very good at all with the languages that
we have not been exposed to.
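Scoring in the conditioned head-turn procedure reduces to comparing turns on change trials against turns on control trials. Here is a toy Python sketch with invented hit and false-alarm rates, just to make the logic of the task concrete:

    import numpy as np

    rng = np.random.default_rng(2)

    # Hypothetical per-trial turn probabilities for one baby (invented).
    p_turn_change, p_turn_control = 0.85, 0.15
    n_change = n_control = 30

    # A turn within the 3-second window counts for the baby on change
    # trials (hit) and against the baby on control trials (false alarm).
    hits = rng.random(n_change) < p_turn_change
    false_alarms = rng.random(n_control) < p_turn_control

    correct = int(hits.sum()) + (n_control - int(false_alarms.sum()))
    print("percent correct:", round(100 * correct / (n_change + n_control)))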
So the critical question is when do the babies change from
citizens of the world to the culture-bound listeners that
each of us are? And here the data are very, very
interesting. This is data from a study that we published in
2006 where we're looking at a contrast that's important for
English but not for Japanese. So the Rah, Lah distinction
in syllables that codes Rake and Lake and Road and Load and all
of the words that in English are distinguished by R and L but
are irrelevant to Japanese. So Japanese speakers produce and
perceive R and L as equivalent. We all do that with sounds
of other languages. Japanese people will often produce Rice
for Lice and so we notice it a lot--we're doing the same
thing when we listen and try to speak in other languages.
So what we see here is the Japanese and American babies are
equivalent at 6 to 8 months of age and then something
dramatic changes in the next 2 months. So this is head turn
data. The kids are all above chance, 65%, but then 2 months
later between 8 months and 10 months, something very dramatic
happens. So Janet Werker made this discovery that when
foreign languages were tested, babies started to fail
at about 10 months of age to discriminate
that foreign language contrast.
And we added this component in 2006, demonstrating that while
there is a change in performance on the nonnative, there's
also a change in performance on the native. Kids are mapping
that native contrast. We considered this a very important
finding. So we want to know obviously what are they doing
during that two months of time.
And the answer is going to be twofold. The answer is that
they're doing pretty fancy statistics. There's a
computational component to what it is that the kids are
doing. And there's also, interestingly, a social component.
And the argument I'm going to make in this talk is that the
social brain, the social component, is gating or guiding or
enabling the computational component. Okay. So I want to
unpack that a little bit.
Remember the graph where I showed you there were tons of
examples that babies are exposed to in English. The Bill
Gates problem is that the variability is huge. If you look at
all instances, the range of sounds we make when we produce
the vowel E in English is huge, and the categories overlap.
If it isn't just the raw occurrence of the
sounds, how do the infants know?
And the answer looks to be from studies of statistical
learning done by a variety of people, done in my lab, done in
Jessica Maye's lab, at the phonetic level and at the word
level by Jenny Saffran, babies are literally taking
statistics on the input. They're very, very sensitive to
statistical structure in the form of distributional
frequency with regard to sounds: how frequently,
comparatively, sounds are occurring in the
language they're hearing; and they end up discriminating the,
you know, modal values. And when it comes to words, which I
won't get back to today, they're paying attention to
transitional probabilities between syllables.
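Both cues are easy to sketch in code. The first half below asks whether one or two categories best explain tokens drawn from an acoustic continuum, mimicking distributional learning; the second half computes transitional probabilities over a made-up syllable stream, where they drop at word boundaries. All names and numbers are invented (this assumes scikit-learn and is not the stimuli from these studies):

    import numpy as np
    from collections import Counter
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(3)

    # Distributional frequency: a "two-category" language piles tokens at
    # two points on the continuum; a "one-category" language peaks in the
    # middle. Model selection (BIC) recovers the number of categories.
    bimodal = np.concatenate([rng.normal(2, 0.6, 500), rng.normal(7, 0.6, 500)])
    unimodal = rng.normal(4.5, 1.2, 1000)
    for name, data in (("bimodal", bimodal), ("unimodal", unimodal)):
        X = data.reshape(-1, 1)
        bic = [GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
               for k in (1, 2)]
        print(name, "-> BIC favors", 1 + int(np.argmin(bic)), "category(ies)")

    # Transitional probabilities: within made-up "words" TP is 1.0; across
    # word boundaries it drops, which is the word-segmentation cue.
    words = ["bidaku", "padoti", "golabu"]
    stream = [w for _ in range(200) for w in rng.permutation(words)]
    syls = [w[i:i + 2] for w in stream for i in range(0, 6, 2)]
    pairs, firsts = Counter(zip(syls, syls[1:])), Counter(syls[:-1])
    print("TP bi->da:", round(pairs[("bi", "da")] / firsts["bi"], 2))
    print("TP ku->pa:", round(pairs[("ku", "pa")] / firsts["ku"], 2))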
So this was a new discovery in the 90s. It was finally an
answer to the learning problem. We didn't have a good
learning model. We had Skinnerian learning, reinforcement
learning which is totally inadequate with regard to
explaining anything about language development. It simply
wasn't that we were patting the babies on the back or giving
them M and M's for learning. It had nothing to do with it.
So we lacked a learning theory.
The statistical learning model provided a way to crack the
code statistically. If babies were sensitive to
distributional properties and those distributional properties
are there in the input, that's a potential leg up. Okay? So
the kinds of experiments that were done, the ones I did, were
fairly complex. What I was trying to do in these experiments
is compare American babies and Swedish babies at 6 months of
age with a variety of vowels. The green are the English vowel
E, the yellow the Swedish front-rounded vowel OO. Okay?
And I was trying to mimic that set of
properties that says there's great variability. How are
babies contending with the variability?
What we did is use the head turn task and take the prototype,
the adult-defined best instances from the categories, and
play that as the background sound. One group of babies, in
Seattle and in Stockholm, heard the English E
prototype. The other half of the kids heard the OO
prototype. Once they listened to that background sound we
played all the alternative sounds as the test
trials, to see which ones the babies would turn
their heads to.
Are they able to organize a category presumably through
statistical learning, their sensitivity to distributional
properties? And what we demonstrated in 1992 was that indeed
by 6 months of age the babies who'd been just lying in a crib
or, you know, interacting with real people in either
Stockholm or Seattle had totally different perceptual
systems. By 6 months of age what we could see is that the
American infants' head-turn response to the E vowels, the
American English E as opposed to the Swedish OO, was totally
different.
The higher the line, the more generalizability from the
prototype to the neighboring vowels. So what the American
babies were doing was treating more of the E vowels as part of
a category, and not so the nonnative vowel. Swedish babies
were just the opposite. They were organizing their vowels into
more of a category when compared to the nonnative sound for
them.
So we had a benchmark finding that's now been followed up by
many findings illustrating kids' tendencies, either in short-term
laboratory experiments or over 6 months of listening to a
language: the brain responds to the distributional
properties of the language they hear. So it said to us that
this form of statistical learning is important. It's part of
what they come with.
Now what I want to develop next is the role that social
interaction plays in this statistical learning. So I'm going
to develop two examples of social phenomena demonstrated in
experiments. We are the input to the babies, right? We're
talking to them all the time. What are we doing to shape this
process? So let's start with the
phenomenon called mother-ese or father-ese or parent-ese or
caretaker-ese, whichever one you think is most politically
correct.
We're all aware of the phenomenon. If you or your spouse or
any person in your environment has ever talked to a baby in
your presence, you understand that what they do is something
kind of strange. They don't sound normal, typical, when they
speak to babies. And it looks, again, the physics of the
situation looks like this. You bring a mother or a father
into the laboratory and you record adult directed speech and
ID or infant directed speech and you plot the pitch of the
voice. You get these wildly different looking patterns.
The mother is talking to another adult and she said I had a
little bit and the doctor gave me Benedictine [phonetic] for
it. It's not boring. It doesn't sound any different than I
do. Average for a female, 300 hertz, it has quite a bit of
variation. But it's not what's going on, on the bottom
graph, right? So at the same time, if you're recording her
speaking to her baby, she turns to the 2-month-old on her lap
and she says hey, can you say ah? Say Ah, hey, you. Say Hi,
hi. It's 700 hertz and it's, you know, only 7 o'clock
in the morning on the West Coast. It's way too early to get
that high. It is really strange sounding.
It was discovered by the anthropologists in the 60s who were
trekking around to other countries and saying it sounds
really odd when adults talk to children and particularly to
babies. It has a very interesting syntax and a very
interesting semantics but acoustically it's really a
different signal.
So you begin to wonder given that this is a fairly universal
phenomenon, what value does this have to the kids? You want
to know whether it has value, also to know whether they like
it.
Do they seek it out? So in laboratory tests, if you give
babies in the laboratory at about 15 weeks a choice between
mother-ese and adult directed speech, they have to make
little head turns to turn one or the other on. There are a
number of different techniques that have been used, even
sucking has been used, and the kids will do whatever they
have to do to turn on mother-ese or father-ese, no matter
what language it's in. Okay? So you give them 20 trials.
They'll sample both sides and go for the mother-ese.
So we know that they like it. It attracts their attention.
It sustains their attention. Does it do anything for them?
Well we published a study in 1997 looking at mothers across 3
language groups, so English, Russian and Swedish. And in
each case we were measuring the formant frequencies of the
corner vowels, E, Ah and Oo. Those are the biggest, you
know, the most different and most universal vowels in the
world's languages. And again these are the phonetic symbols
for E, ah and Oo. The red triangles are how we speak to the
babies and the blue triangles are the way we speak to one
another.
So what you see across all three languages is that we
exaggerate, we stretch the acoustic cues when we speak to the
kids. We also exaggerate facial expression. Instead of
saying beed, bid, bood, we say Beeeed, beeed, and so we're
stretching the differences, I think the significant thing is
we're stretching the differences between the critical
elements in the language. And we appear to do that no matter
what those elements are.
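The stretch can be quantified by treating the three corner vowels as a triangle in F1/F2 space and comparing its area in adult-directed versus infant-directed speech. In the sketch below the adult-directed corners are textbook-style averages and the infant-directed stretch is invented for illustration:

    def triangle_area(pts):
        # Shoelace formula for the area of the vowel triangle in F1/F2 space.
        (x1, y1), (x2, y2), (x3, y3) = pts
        return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2

    # Corner vowels as (F1, F2) in Hz: /i/ "E", /a/ "Ah", /u/ "Oo".
    adult_directed = [(270, 2290), (730, 1090), (300, 870)]
    infant_directed = [(240, 2600), (800, 1100), (280, 750)]  # exaggerated

    ratio = triangle_area(infant_directed) / triangle_area(adult_directed)
    print(f"vowel space expansion in infant-directed speech: {ratio:.2f}x")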
So in Mandarin, mothers speaking to their infants will
stretch the tone differences. And we've measured other
sounds and consonants. And there's an acoustic exaggeration
going on that we think is helpful to the kids. We can,
for example, measure these vowel triangles in individual
mothers at 2 months, and later measure the 7-month-old's
ability in the laboratory to hear distinctions with carefully
synthesized utterances, certainly not
their own mother's. The more exaggerated and pronounced
the pattern of mother-ese was in early development, the
better the kids are at 6 and 7 months in discriminating
the sounds. So we think it makes a difference to them.
We also think it's important that they care about this
signal, so one illustration of that is testing children with
Autism. If you look at toddlers who are diagnosed with
Autism between about 24 months and 4 years of age, and give
them a preference test that allows them to choose either
speech or a non-speech analog that mimics the formant
frequencies but does not sound like speech and I'll play it
for you in a minute, we get a totally different pattern. The
children with Autism hands down prefer the non-speech whereas
the typical kids will show the opposite pattern.
Now here are those signals. Here's the mother-ese signal.
[Plays audio of mother-ese]. She says look what I have.
It's a pot. And here's the non-speech analog of that [Plays
audio of computer generated analog]. So that's a fairly
interesting, strange sounding signal. Right? So typical kids,
toddlers, will go back and forth; they'll listen to it a
couple of times but they certainly don't prefer it. Children
with Autism prefer it hands down. They'll turn it on over
and over and over again. There are 8 samples they're hearing
in random distribution. And it correlates very strongly with
the severity of their Autism symptoms, with the
degree of their language deficit and the degree of their
cognitive deficit.
So we're now using this in tests of siblings of children with
Autism, where the prediction is that 30% of
the children who are siblings of diagnosed children with
Autism will turn out to be diagnosed with Autism within the
next year and a half. So we're using this test at 6 months
to see whether we can predict which of the children will turn
out to be diagnosed with Autism. So we think of it as a very
exciting potential measure.
So socially, again, this is a social signal. And what you
see with children with Autism is really an almost pained look
on their faces when you play it. They do not like the
intoned speech: the more melodic it is, which is mother-ese,
the less they like it. And the more you put a face in front of
them, close by, the less they like the exaggeration typical
of mother-ese. They will hide from it by covering their
faces. They'll choose whatever else there is as an option,
something more mechanical. Something either non-speech like
this test but even mechanical sort of robotic speech is more
pleasant to a child with Autism than a typical intoned
mother-ese speech.
So let me tell you a second example where the social role for
early language learning is important. This will set it up.
This is another graph like the Japanese Rah, Lah, finding.
Here we're testing babies in Taiwan and babies in Seattle,
Taipei and Seattle babies. This is a Mandarin Chinese
contrast that I heard as Shi-shi. I can't distinguish it.
My graduate students and post-docs say Dr. Kuhl, can't you
try a little bit harder. Listen really hard, shi-shi. And
I'll say, you know, I can't hear that distinction, no matter
how hard I try. The babies at 6 months are brilliant at it.
Equally good in both countries and then 2 months later the
babies in Taipei are doing beautifully and getting better,
the babies in Seattle are getting worse.
We decided to do the following experiment. We wanted to
learn what happens when a baby is exposed to the statistics of
a brand new language for the first time, during what we think
of as a sensitive period for understanding and
mapping out the sounds. The babies heard natural
Mandarin for 12 sessions between 9 months and 10.5 months.
Yeah, it was like having Mandarin visitors. A family moves
into your house for 6 weeks and 3 times a week they sit on
the floor and play with the babies. Right? What's it going
to do to their brains at this critical moment when they've
now lost the capacity?
So they had 12 sessions, 25 minutes each. They heard 4
different talkers so it really was a pretty natural kind of
experiment as opposed to kind of an artificial language.
Here's what these sessions look like.
[Run video of Mandarin tutor]
Okay how's your Mandarin coming along?
[Audience chuckling]
Here's what the babies looked like in the session.
[Run video of babies responding to Mandarin]
Okay. So what did we want to know? What have we done to the
brains with this 12 session intervention? And what happens
to the mothers who are sitting behind them that have also
been privy to 12 sessions of Mandarin? And what's it about?
So obviously you had to run a control group of babies. Maybe
just coming into the laboratory and being stimulated in this
way is interesting to the babies, they love it, and maybe
that alone will improve their attention to sound contrasts.
So we brought another group of 32 babies into the laboratory
and they heard the American graduate students, same book,
same toy, same dosage but listening only to English. So
here's the control group, thank goodness we have an
experiment, listening to English didn't improve their
Mandarin skills. And as a scientist obviously we've got to
show that.
But look what happened to the babies who had been exposed to
12 sessions of the Mandarin. They are statistically
equivalent to the babies in Taiwan who have been listening
for an awfully long time, for 10.5 months. So the message
here is you give the right kind of stimulation to infants at
the right time of development and they're simply going to map
that structure. They can take the statistics on a brand new
language.
And by the way the mothers are right here with the control
group. It made absolutely no difference to them to have 12
sessions. They seemed attentive too, but their abilities
before and after exposure did not change. So this is not
something that simply rolls across the basilar membrane and
changes perception.
It was a striking demonstration of this ability to learn at
that point in time. And we couldn't resist, of course,
running the following condition. Given that there are tons
of DVDs out there on the market and audio tapes that claim to
teach your baby Gaelic and French and any number of other
languages, the question is what role does the human being
play in this.
And because we had all of the results with regard to
statistical learning experiments, wouldn't it be the case
that if the babies were interested and paid attention to the
screen that the mere, you know, presentation of the
information would do it.
So we brought 32 new babies in and filmed these beautiful
DVDs and the graduate students looking at them said wow those
are pretty incredible. I mean they really look convincing
and you can see the face and you can see the books and the
kids were staring at it. They looked glued to the machine,
to the TV screen. And another group of 32 babies got the
same dosage, audio only.
So we projected that the audio only group might not learn as
well and that the video group would learn perfectly well.
But aren't experiments invented for surprise value? Here's
the audio only group, absolutely no learning whatsoever. Not
so much of a surprise but the surprise came in this one.
While the kids had stared intently at the screen, nothing was
going on out there, absolutely nothing. Right?
So 12 sessions of attentive watching of a flat screen TV and
a beautiful DVD did nothing to alter their brains. So again
we had this completely contrasting result, perfect learning,
if you take as perfect the Taiwanese kids who had been
exposed for the long time and absolutely no learning, no
learning whatsoever and nothing in between. The audio and
video group looked identical and identical to the control
group. Nothing going on there.
So it made us wonder what is happening here. We've had lots
and lots of conjectures. You know, you can begin and end and
have 100 possibilities with regard to what's going on. We
raised two properties that we thought ought to be looked at
first. One is a kind of motivational explanation. Maybe in
the presence of other human beings our arousal is up, our
attention is up and we sort of turn on general learning in a
way that doesn't happen when you're not in the presence of
another human being. It seems plausible.
A second one is different. And that's a more informational
explanation. We noted and you can see in that short video
clip that the babies are tracking nonstop what the adult is
doing. They are staring at her and at each new toy that she
brings up, trying to follow each and every gaze. And we know
that gaze following from a social perspective develops at
about this time.
And perhaps it's the information gleaned from having a tutor
label an object while the tutor and the babies are jointly
attending to that object that provides a kind of information
that's totally missing when you're not in a social
situation and reduced over a television set. Because while
she was looking at her toys, following her gaze is not going
to be as easy on a television set, and it's not there at all
in the audio condition.
So we had the motivation and the informational aspects and
we're following up in experiments now. So here's something
that's underway right now but I don't have the answer to it.
What we're doing in experiments now is using a touch screen
technology and asking whether the added attention of babies
if you make them turn on the television themselves, you give
them control of it for a 10 second presentation, does that
increase learning? The sound isn't good, but watch this
baby. She's getting a 10 second exposure to Mandarin. And
then she's got to slap the screen to turn it on again.
[Runs video of trial]
It doesn't matter what she does during the exposure but when
she gets the checkerboard, she has to turn it on again.
Okay. Now the kids really like this. Now this baby tries to
kiss Lotus on the screen. And they often try to grab the
toys from the screen. So there is this sense in which some
of the kids don't know whether it's real or not and they're
kind of testing it. So, so far with the 20 babies that have
been through the routine, we have no effect on the group
level, either in the behavioral head turn tasks or in the ERP
brain tasks. They don't appear to be learning as a group.
However there are some babies, this one in particular, who
are highly social and interactive with regard to the
television set. Those babies are showing ERP responses. So
when they treat the stimulus as potentially social, one of
the most predictive variables is whether they talk to the
television set. If they vocalize to the TV, they show
learning. They can vocalize to their mother sitting behind
them, and it doesn't make any difference. Vocalizing to your
mother happens too: oftentimes the first time they get the hang
of turning it on, they'll turn around and go, did you just see
what I just did? It's the equivalent in babbling, right: uh,
uh, uh. That sort of thing. They're excited about the fact
that they made this work.
But it's the kids who are talking to the television and
trying to kiss the TV or interact with it socially that are
showing learning. So we're not done with this but it's a
very interesting thing because it definitely raises arousal
and attention on the part of the baby to activate the screen.
Here's the other one. Now this is looking more at the
informational side. We're working with Javier Movellan at
UCSD in San Diego, using his social robot. This is a robot,
a hunk of metal that he had sitting in a toddler classroom.
And it was just a hunk of metal to begin with. And he was
looking at, kind of an ethnographic approach, what do the
kids do with this hunk of metal.
When it didn't behave socially they would do nothing with it
or go hit it. Boys especially, you know, bang at it, do
something. But not interact with it socially. As soon as he
made the head rotate, so he's got a couple of dots that look
like eyes, as soon as the head rotates to follow the kids,
there's a camera in the middle of its face, and he added
giggling. So as soon as the kids touched it, it would
giggle. And when I came onto the experiment we added
Finnish.
[Audience laughing]
So this robot now speaks Finnish to the children; before it
hadn't spoken. And when they come by and have a toy in their
hand, the robot will ask for the toy in Finnish. And it has
a pincer. You can see the pincer arm that will take the toys
from the child and put it down a chute so it becomes a little
game where the kids will run over, give an object. And the
robot names the object in a Finnish frame, a couple of
different sentence frames and names the object.
So we're looking at learning over a two week intervention.
And the question is can you learn anything from a disembodied
source with regard to language. How social does it have to
be? And again we're seeing huge variation. For the kids who
come over and play for short periods of time, there doesn't
appear to be any learning. But for the kids who are in
sustained interaction with the robot and playing with it,
using a lot of toys and staying for a long time with lots of
touches and other kinds--smiles and things like that, they
appear to be learning.
Some of the kids will walk around the rest of the classroom
and label everything in sight that they were exposed to in
its Finnish equivalent. So these experiments are ongoing and
we're simply trying to crack what the social is about.
Trying to understand what's happening.
We've also started another set of experiments with Spanish.
And we finished our first round. It's very interesting:
because the child's learning in the social setting is so
potent, we're asking whether they are learning more
than phonemes. So we're now looking at Spanish word learning
in the same setting. Are they learning the phonemes? Are
they also learning words in Spanish they've been exposed to?
So let's take a look at these sessions. We've got 4 cameras
pointed at the children and the tutors, because the
design of the experiment was to ask whether the babies' social
behaviors are predictive of future learning. So we've got 4
cameras and we're coding micro-interaction between the tutor
and the baby, particularly with regard to eye gaze patterns
and trying to measure joint visual attention and its degree
of prediction for phoneme and word learning. So here are the
sessions.
[Runs video of social behavior during trial]
Little father-ese there.
[Video running]
Okay. So the same kind of dosage, 12 sessions between 9 and
10 months of age. We have been doing a variety, a great
number of tests, pre and post with 3 hypotheses that we're
trying to test. The computational piece says that infants
will show phonetic and word learning after natural language
exposure at the right time in development. So we expected to
see both phoneme learning and word learning. And we've
confirmed that hypothesis. They learn words just as well as
they learn the phonemes. And you can contrast the words they
were exposed to in Spanish from words they weren't exposed
to.
The social hypothesis says that for both phonemes and words
the more socially engaged the babies are, the more they will
learn. We've confirmed that hypothesis. I'll show you a
little data for both the phonemes and the words. The
cognitive hypothesis is, you know, there's a phenomenon out
there and I haven't got time to explain it all, but executive
control, particularly inhibitory control is advanced in
bilingual speakers when compared to monolingual speakers.
So Bialystok has done a lot of work on adult
bilinguals across many different languages, showing not that
general intelligence is increased but that executive control
of attention is, particularly in inhibitory control tasks:
if you're bilingual, you're better at inventing a new
solution to a problem and at inhibiting the old solution to a
problem when it's not useful.
And so in the baby tasks, which I'll explain in a minute, we
wondered whether there would be an association between the
kids' learning over these 12 sessions and their
cognitive executive control skills. It turns out there is.
So here's the social factors at work. When you look at joint
visual attention between the tutors and the babies and relate
their social engagement to their ERP scores, this is the
phonetic learning data.
You can do this a variety of ways. This is a median split.
The best learners are far in advance of the poor learners on
the social skills. Here's the scatter plot. You can
see that the higher the proportion of gaze shifts into
alignment with the tutor, the higher the peak
amplitude of the mismatch negativity. The data look
exactly the same for words and the ERP components for
words. So social engagement during the sessions
predicts the post hoc measure of the brain's response to the
phonemes and words of the foreign language.
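The analysis just described is essentially a median split plus a correlation. Here is a compact Python sketch with invented per-baby numbers, not the study's data:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)

    # Invented per-baby measures: proportion of gaze shifts into joint
    # attention with the tutor, and a noisily related MMN peak magnitude.
    n = 24
    gaze = rng.uniform(0.1, 0.9, n)
    mmn = 1.0 + 4.0 * gaze + rng.normal(0.0, 0.8, n)

    high = gaze >= np.median(gaze)    # the median split
    print("mean MMN, high-social babies:", round(float(mmn[high].mean()), 2))
    print("mean MMN, low-social babies: ", round(float(mmn[~high].mean()), 2))

    r, p = stats.pearsonr(gaze, mmn)  # the scatter-plot version
    print(f"correlation r = {r:.2f}, p = {p:.4f}")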
Here are the cognitive factor data. So the test we used with
the babies is a detour reaching task. So the babies are in a
high chair. And we test them before and after. There's a
Plexiglas box and there's a toy in the box and you put
different toys in the box and let the babies go through the
front door of the Plexiglas box to grab the toy. And they'll
do this 20 times.
Then in their sight we close the front door and lift the side
door, Plexiglas side door. And they've seen it and they see
you put the toy in through the side door. You'd think of it
as a pretty easy task to just adjust your hand movement to take
it from the side. Not so. Monolingual and bilingual babies
differ on this task.
The bilingual kids are faster. They're faster at grabbing
through the side door. And what we see in this graph
surprised us: pre-exposure, no difference between the
best and poor learners. Post-exposure, the best learners are
advanced on the cognitive control, inhibitory control tasks.
So there's a linkage. We don't know what direction it goes in,
but there's a linkage with just these 12 sessions of
experience. So I think that's pretty exciting.
So these tests, I think, show you some of the components,
the computational components and the social components,
involved in that 2-month window. I'm not saying it's
exclusively happening during those 2 months. It's part of our
leg up now on understanding what the kids are doing in those
2 months to alter perception from the universal citizen stage
to the more focused stage of listening. They're doing a
computational and a social thing. They can do that for new
languages when presented at that time in development. It
works for phonemes and for words.
One more thing to tout the power of early phonetic learning:
if we take brain measures of the kind I showed you before,
we're looking at this MMN right here, and you take them at 7.5
months, which is pretty early if we think that they really
start changing as a group after 8 months. If you measure
them at 7.5 months, we can show now that you can predict the
rate of language development to the age of 3 by using a
native contrast. And when we first published these data in
2004 using head turn, the critics said oh, they're just better
listeners, right?
This isn't to do with language or phonetic perception. And
we argued, no, the prediction is not the same for a nonnative
contrast. The argument is that native contrast, locking onto
those early, will propel you to advance in language very
strongly. Your ability to discriminate the nonnative
contrast at 7.5 months means you're still in stage 1; you're
still in phase 1 of development where all phonetic contrasts
are equally interesting. So if you're still good at 7.5
months with the nonnative, you're going to develop more
slowly. And the data bear this out.
So here's a baby wearing the cap. This baby's native
response at 7.5 months, huge negativity. Here's the
nonnative response at 7.5 months, this baby is already
showing the pattern that most kids will take until 10 months
to show. And here's the curve for the entire group of
babies. Word learning at 14, 18, 22 and 26 months out to 30
months actually.
The native predictor, the top half of the distribution, the
bottom half of the distribution. It really helps to be good
at native phoneme discrimination. For the nonnative predictor,
these 2 curves are reversed. You're better at word learning
if you're poorer at the nonnative, in a sense. And these are
both based on a growth curve model and both
statistically significant.
So it entails this sort of mapping, this neural commitment
which starts the process in my view. The kids begin to
commit to the sounds of the native language and in a sense
give up, not that they can't discriminate it any more, but
they're not attending to it any more. They can still, there
is still, at the group level, an ability to hear that
distinction but they're inhibiting it. They're attending to
the native language. They will zoom forward. The kids who
are still attending to them equally are going to be slower to
develop language.
Okay. We're running a little short here. I have another
brain finding but I think I won't say much about it. If you
do FMRI tasks at 5 years of age and you measure children's
language, IQ and social skills and the SES of their families,
the most powerful predictor after a false discovery
correction for multiple comparisons is the socioeconomic
status of the children's families. Broca's area, measured in
an FMRI machine, shows much less specialization, left versus
right hemisphere.
The poorer--and we're not actually measuring income but it
correlates with income. We're doing the Hollingshead which
is the standardized measure of the family's socioeconomic
status. What it's really measuring is both parents' education
and both parents' occupation. And here's what we're attributing
this to: you see this dramatic correlation of SES
with Broca's area. I mean what it means is that at 5 it's
not an equal playing field. These kids have equal IQs.
These kids have no deficits that we can measure. This is
just the middle of the scale. It runs from 0 to 66. We have
nobody below 30.
These kids all have parents who have high school degrees and
are employed. What it says is that the richness of that
environment affects the development of certain brain
structures. And in this case we were tapping the brain
structures during a rhyming task in Broca's area.
Follow-up tests using structural MRI say that the complexity
of the language these kids hear is systematically lower,
syntactic complexity, systematically lower, the lower the
SES. So in families with lower SES what they talk about is
different and the complexity of the language is different.
They don't use mental state verbs to talk about what are
people thinking. Why did they do that? You know, what was
going on in their minds? That kind of complexity is much
reduced. The language is more directive: do this, don't do
that. There are simpler structures. It makes a difference
what's going in. And so I think these results are very, very
exciting.
But here's really where we're going. We've been using ERP
for a decade. We'll use FMRI with kids over 5 but we really
want to measure, we can start with the newborns, go all the
way to the aging senior and understand what's happening with
the brain and social stimulation during language. To do that
we need a brain imaging technique that uses whole brain
processing and allows us to look at social systems, language
systems at the same time. So we have to get beyond molecular
neuroscience, which focuses at the individual-cell level on
the biochemistry. Very, very important, but it's not systems
neuroscience.
Systems neuroscience wants to look at whole systems: memory,
executive function, language; and understand it. So we are
buying this machine. We open May 24th, a couple of months
from now. This is a magnetoencephalography (MEG) machine. It
looks like a hair dryer. It is not invasive, totally safe,
noiseless. It measures with 306 sensors the activity, the
magnetic fields produced when millions of neurons work together.
And so you're picking up with these 306 sensors the magnetic
fields that occur at the neuronal level; this is directly
mapping neuronal signals. And of course it's an engineer's and
physicist's dream because it plots with millisecond and
millimeter accuracy where the brain activity is during
cognitive tasks.
So we're excited about this mostly because the babies can be
put in the machine. So we're the first in the world to
record babies doing a cognitive task. This is Emma in
Helsinki where we had to do the measurements before we got
our own machine. She's listening through insert earphones to
the sounds of many languages. And you can see she can move.
We're tracking her head movements with this little cap,
correcting now every 100 milliseconds for the location of her
head in the helmet. So we want to know where Broca's area and
other structures
are at all points in time.
So we're very excited about this. And our first finding, in 2006, illustrated how powerful it's going to be. This is basically a feasibility study, but we had newborns in the machine, measuring auditory areas, superior temporal and inferior frontal, Broca's area, in the newborn period, at 6 months and at 12 months, during speech and non-speech. So what we saw in newborns is that there was no activation in Broca's. Obviously superior temporal auditory areas respond to speech and non-speech equivalently.
But when you look just at the speech, newborns, by 6 months, you can see simultaneous activity in these 2 areas of the brain, post-syllables. And you don't see this for non-speech. And by 12 months it's even more active. It's as though Broca's area is recognizing or attempting to simulate. We don't know exactly what it is. You know, mirror neurons would predict this, but that work's been over-interpreted. In humans these shared neural systems exist for perception and production. We're trying to understand how they might develop.
When does Broca's area know that that's speech? What's it
doing to try to emulate that movement? So these are the
kinds of experiments we're going to be able to do across the
lifespan with MEG technology. So in conclusion we can say a
lot more about early language development than we could
before. The studies done all across the world are
demonstrating important things about learning skills that we didn't know kids had. Just 2 decades ago we discovered this computational ability of babies that's implicit and happening all the time.
And now we can say that that's sort of controlled a bit, we
believe, by the social brain. Social interaction has
something to do with turning on this computational skill. At
least we think so. And we think that could be helpful in
evolution, that we're not constantly gathering statistics on
things that don't matter. With language it matters that it's
a human talking to you as opposed to the sounds coming out of
something irrelevant.
Bilingualism may affect cognition. It certainly does in
adulthood. We're interested in the babies. We've got some
suggestive evidence. We believe that mother-ese assists learning; others have said it before, and we're trying to understand, in the early period, what it does at the phonetic level to assist. And maybe the more important thing to underscore here is that an interest in language helps children learn. And if they don't have an interest in language, as is true in children with Autism, that learning is impaired.
We think of phonetic learning now as a pathway to language.
The early language environment is extremely important, that the kids get talked to, the stretching that we do in mother-ese. But that locking on, that initial learning of which sounds am I supposed to pay attention to and which ones am I allowed to ignore, that propels you forward. That scaffolds you towards more complex levels.
The critical period phenomenon we think is affected by
experience not time. I didn't present you all the data that
we have. My hypothesis is that you're neurally committing to
the structure you hear early in this combination of social
and computational learning. It's mapping structure in the
brain. And that structure in the brain becomes your default
value for language. That's why you can process language at a 0 signal-to-noise ratio and you're so adept at adjusting to the variations that you hear. We think that it isn't going to be time, per se. It isn't the maturational hypothesis of Lenneberg; it's more about learning. And that
lends itself to very interesting experiments.
And then finally these new neuroscience tools, particularly MEG, we think are just going to, you know, change the world as
we know it as we uncover more surprises about how kids go
about this marvelous task of learning a language. So in
closing I just want to recognize, you know, it takes an army
to do research, right? I want to thank all the students and
the people across the world who have contributed to these
studies. There are tons of them.
The interdisciplinary and the kind of international team that
we pulled together is really powerful in doing this kind of
work. You simply can't do it in one laboratory by yourself.
You can't just use a single language. It also takes an
army's worth of money, right, to conduct these experiments.
And so we all recognize how hard it is to procure the grants.
I think language is of very strong interest to people. Cognitive development is a very hot topic. And I think we can all do well to mine society's and the grant agencies' interest in developmental cognitive neuroscience as we kind of move forward. I think it's going to be a very exciting decade. So thanks for your attention and participation.
[Applause]
QUESTIONS & ANSWERS
DR. TRUESWELL: So just a couple of announcements before we take
questions here. One is that if you have a question you
should use the mic. There will be people walking around with
microphones. I'll stand in front of this microphone here to
help you because this is being recorded. We'd like you to
use microphones. The other announcement is that we're doing
something different this year.
We're having--if you're an undergraduate and would like to
talk with Dr. Kuhl later on this afternoon, we're having a
discussion session over at the Institute for Research in
Cognitive Science. Anybody who is an undergraduate who is
interested in this, you're welcome to attend. It's at 3401 Walnut Street on the 4th Floor and it's at, I believe, 3:00 P.M. So please come if you can. And then finally there is
going to be a short reception outside here as well after the
question period. So I'll let you take the questions.
DR. KUHL: Yeah.
DR. TRUESWELL: Um-hum.
DR. KUHL: Lila. Do you have a microphone?
LILA: It's coming. I want to go back first to the first part of your talk.
DR. KUHL: Okay.
LILA: Where you introduced the problem that the discovery of the phonemic system is made difficult, or is made impossible in its own terms, by the enormous variability and overlap.
DR. KUHL: Right.
LILA: And yet the mystery is going to be that somehow or other children solve this problem.
DR. KUHL: Right.
LILA: Right. So the problem seems to be unsolvable in its own terms, as we know from Bill Gates. He is a very smart guy--
DR. KUHL: [Interposing] Very smart--
LILA: --and his 500 cohorts. And people have been working on this for 50 years--
DR. KUHL: [Interposing] Right.
LILA: --as we know. The problem seems unsolvable. And it seems unsolvable for the usual platonic reason: because of the overlap you don't know which items belong in the set.
DR. KUHL: Right.
LILA: Right? So you have to know which count as the E's [phonetic].
DR. KUHL: Right.
LILA: Right, which are in the system.
DR. KUHL: Yeah.
LILA: So I was interested in your first experiment, very early experiment, in which you gave infants a prototype.
DR. KUHL: Yeah.
LILA: So here's the central member. And given the central member, well then they can do statistical learning.
DR. KUHL: Right.
LILA: And as I saw in the, especially the Spanish guy--
DR. KUHL: [Interposing] Yeah, yeah.
LILA: --with the little kids.
DR. KUHL: Right.
LILA: Okay. So. Now turning to the second half of your talk, you turn now to the social question.
DR. KUHL: Um-hum.
LILA: Right?
DR. KUHL: Um-hum.
LILA: But we're faced with the problem here that in natural running speech the children aren't given the central member. Which are the E's [phonetic]?
DR. KUHL: Right.
LILA: Okay. It seems to me that you see something much more specific than saying well, here was the computational and here's the social. The causal condition is the following: you have to solve this problem top down to solve the Plato problem.
DR. KUHL: Um-hum.
LILA: It means that you have to solve it top down. Right? Or you can't solve the problem. Just too much variability. Right? But you can solve it if it's peat-able, peat-a-ble [phonetic]; it's always the same object.
DR. KUHL: [Interposing] It is.
LILA: So if you're socially interested in the problem--
DR. KUHL: Right.
LILA: --that enables--but that problem's computational too--
DR. KUHL: Right, right, right, right.
LILA: But more specifically, social at its margins, yeah, you have to get them interested. But that's just an enabling condition. So I want to try this on you--
DR. KUHL: Um-hum.
LILA: --so what do you think of that?
DR. KUHL: I think those are wonderful questions. So let's go to the first one first, about the prototype. So where do prototypes come from? Well, one answer was they're platonic ideals. They're built in. And we say no. You know, they're derived statistically, through a process that we don't quite understand, but they're derived from statistical distributions. Maybe it's--yeah, that's one possibility. But computers aren't solving it that way, right.
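To make the "prototypes derived from statistical distributions" idea concrete, here is a toy sketch: fit a mixture model to synthetic two-formant vowel tokens and read off the component means as prototypes. The data and model are illustrative assumptions, not the actual model used in this research:

```python
# Toy sketch of deriving vowel "prototypes" from distributional
# statistics alone: fit a two-component Gaussian mixture to synthetic
# F1/F2 samples and take the component means as category prototypes.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic tokens of two vowel categories (F1, F2 in Hz).
vowel_a = rng.normal([700, 1200], [80, 120], size=(200, 2))
vowel_i = rng.normal([300, 2300], [60, 150], size=(200, 2))
tokens = np.vstack([vowel_a, vowel_i])

gmm = GaussianMixture(n_components=2, random_state=0).fit(tokens)
print("prototypes (component means):")
print(gmm.means_.round())  # near [700, 1200] and [300, 2300]
```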
So the second one which comes from the social piece is to say
that it's actually mediated through words. That knowing that
something is referred to by different people in the same way
allows you to decide, okay, that everything that is referred
to in the same way must be part of that statistical
distribution. However, you know, it's hard to imagine that the kids at 6 months--the data show that by 6 months the babies in Stockholm and the United States are solving it
already. So either they're cracking the word code much
earlier than we think they are, possible, possible, or
there's something else going on that we don't quite
understand.
Oh yes, you may only need a few words. You may only need a
few words. And you know, the mother-ese phenomenon does do
this in spades by not only naming objects in the presence of
kids and providing more prototypical examples, right? Much
more prototypical in mother-ese as we saw from the stretched
triangles, but also varying--they do two things. They
develop a more prototypical average but they also have
greater extent of variation. It's almost as though mother-ese tries to be multiple talkers. So it's possible that's a leg up on the solution, which would explain why computers don't do it. Why isn't it just statistical? What's the piece of the social? You know, the word, maybe, is the vehicle in.
The Spanish experiment's interesting because we measured both
words and phoneme learning at the same time. Kids are
exposed to Spanish, 12 sessions, post session we measured
both phonemes and words. We asked the question: at the individual level, do all kids show 1 of the 2 first? So some kids may learn 1 and not the other, show learning of 1 and not the other. And some may show learning of both. The kids who learn both or neither are not as interesting as the kids who show learning of 1 and not the other.
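A minimal sketch of the individual-level tabulation described here; the data are invented, and the point is simply that the discordant cells (one skill but not the other) are the informative ones:

```python
# Toy sketch: cross-tabulate which children show phoneme learning vs.
# word learning. Data are invented; the discordant cells are the
# interesting ones for the question raised above.
from collections import Counter

kids = [  # (phoneme_learning, word_learning) per child, hypothetical
    (True, True), (True, False), (False, True), (True, False),
    (False, False), (True, True), (False, True), (True, False),
]
table = Counter(kids)
for (phon, word), count in sorted(table.items()):
    print(f"phonemes={phon!s:5} words={word!s:5} n={count}")
```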
Interestingly, we see it almost--we're not done with this
analysis. This is not--so don't hold me to this. Kids seem
to do it differently. Some kids are showing phoneme
learning, not word learning yet. Other kids are showing word
learning, not phoneme learning yet. Suggesting that
different kids may go at this differently which would be very
interesting. But we'd have to do more analyses; we'd
probably have to do more experiments to understand exactly
what they are. But you're absolutely right. They've got this problem. They've got a big problem, variation all over the place.
Social mediation to help steer them in the right direction.
Great computational skills but you still have to know over
what to compute. How do I compute? So categorical
perception will help them somewhat, right, because they are
hearing these boundaries. So they've got some gross
partitioning. That would say gather the statistics on this
acoustic part of the map, put them all together. So it could
be something sloppy like that until they hone it to objects.
We don't know. Keep us in business for a while, right? Go
ahead.
FEMALE VOICE 1: So you said that Autistic kids prefer robotic sounds?
DR. KUHL: Yeah.
FEMALE VOICE 1: Is it because it's just disturbing to them or can
robotic language actually help them learn language better?
DR. KUHL: Well, it may be both of those. So on the one hand what we see is behavior that looks like they're afraid of the face and the voice. They will never choose the face when they have a choice. They cover their eyes if we don't give them a choice. And also, if you give them a listening choice, they will steer clear: the more emotive the signal sounds, the less they like it.
So I think, you know, MEG data, again, you know, what's
happening to the amygdala when they hear faces--when they see
faces that are very animated as opposed to an object. They
always prefer objects. So it's possible that fear centers in
the brain will illustrate some oddity of their brain's
processing of signals. Somehow what typical babies find pleasurable (speech is the hands-down choice of children given a choice, and faces are the most interesting visual stimulus) is different in children with Autism. And I think that
could prevent their learning.
The second question is what if you presented speech to
children with Autism through the robot with a robotic voice?
We haven't tried it yet but I think it's extremely
interesting to think about that. It might help this initial
mapping process.
FEMALE VOICE 1: Thanks.
DR. KUHL: Yeah. And we'll come back over here.
MALE VOICE 1: Hi. I wanted to stick up for the machines a little bit.
[Laughter]
DR. KUHL: Okay, go for it.
MALE VOICE 1: The scatter plot that you showed of English vowels, which I think was published in your 2004 Nature Reviews Neuroscience--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 1: --article. Came originally from Hillenbrand, 1997.
DR. KUHL: Right. And earlier from, you know--
MALE VOICE 1: --well, they replicated Peterson and Barney--
DR. KUHL: [Interposing] Yeah. Peterson and Barney.
MALE VOICE 1: And they did--they did discriminant analysis on the measurements for those vowels. And they got 97% correct of--
DR. KUHL: [Interposing] That's--
MALE VOICE 1: --all identification--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 1: --92% to 97% depending on exactly how they set it up. So in fact that plot showed F1 and F2; they threw in F0 and duration.
DR. KUHL: Right, which was helpful.
MALE VOICE 1: Yeah, which is helpful because--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 1: --humans also get that--
DR. KUHL: [Interposing] Right.
MALE VOICE 1: --so I'm not trying to suggest--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 1: --that the speech recognition problem by machines has been solved, but I think you--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 1: --radically exaggerated--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 1: --how far away--
DR. KUHL: [Interposing] Okay.
MALE VOICE 1: --the machines are--
DR. KUHL: [Interposing] No, that's fair. Yeah, that's fair.
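For concreteness, here is a minimal sketch of the kind of discriminant analysis the questioner describes: classifying vowel tokens from acoustic measurements (F0, F1, F2, duration). Synthetic data stand in for the Hillenbrand / Peterson-Barney measurements, so the accuracy printed is illustrative, not a replication of the 92-97% figure:

```python
# Toy sketch of discriminant analysis on vowel acoustics. Synthetic
# tokens stand in for real measurements; values are rough textbook
# magnitudes, not data from the studies discussed.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
means = {  # (F0 Hz, F1 Hz, F2 Hz, duration ms) per vowel, assumed
    "i": [130, 300, 2300, 240],
    "ae": [120, 700, 1700, 280],
    "u": [125, 350, 900, 250],
}
X = np.vstack([rng.normal(m, [15, 60, 120, 30], size=(150, 4))
               for m in means.values()])
y = np.repeat(list(means), 150)

lda = LinearDiscriminantAnalysis()
print("CV accuracy:", cross_val_score(lda, X, y, cv=5).mean().round(3))
```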
MALE VOICE 1: --so you also said that speech recognition requires words to be separated, and that actually hasn't been true for quite a long time. It certainly doesn't get perfect transcription, and there's certainly plenty of research to be done in that area, but I think there might actually be an opportunity for greater confluence of research--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 1: --interests and results--
DR. KUHL: [Interposing] Right.
MALE VOICE 1: --for the engineers--results that the engineers--
DR. KUHL: [Interposing] Yeah, I'm excited about the--
MALE VOICE 1: --from your research as well.
DR. KUHL: --okay, thanks for making that clarification. I'm
excited about the engineering perspective and think that, you
know, greater experimental sharing between human learning
systems and machine learning systems is really important. So
a post-doc of mine, Debore [phonetic] and I actually looked
at machine learning for mother-ese and adult directed speech
and could see the value of mother-ese signals in that kind of
thing. Especially with regard to the robustness after
training to new data sets.
So I think there has been a lot of progress. I think the
progress may not be happening so much at the phonetic level
and I was only trying to illustrate with that diagram from
Peterson and Barney, I mean, my understanding is we have
not got the ability to play natural language into a machine
learning system and have it derive the categories of the
language. I mean that's the ultimate test of whether or not
the machine can process natural language in a way that allows
it to sort. So when speakers like in the Hillenbrand or
Peterson and Barney are producing he'd, hid, had, hood, you
know, highly constrained CVC, prototypical examples, that's
not really a fair test that compares them to the babies.
So but I do think that the mother-ese productions, not the kind of stylistic stuff we do when we are trying to make ourselves understood by a machine, so sometimes when the machine-learning algorithms make a person repeat, they do just the wrong stuff, right? They do something that's not what we do when we pronounce these clear instances of mother-ese. I think there's a lot to be learned by using input into machines that really constrains the syntax, constrains the vocabulary and enhances the acoustics in the way that mother-ese does.
So I think that it will be a very promising line of studies.
And if we ever get to the place where robots are, you know,
capable of teaching foreign languages to kids I think, again,
the engineering psychology links will be stronger still to
try to understand what kinds of learning we can do from machines, what kinds of learning machines can do and humans do that are comparable or not. That remains an interesting
question.
I used to tell Bill Gates that his computers need a life and
a brain. But I've modulated that a little bit over the
years.
FEMALE VOICE 2: I'd just like to comment on your work with Autism. Conventionally it's difficult to diagnose an Autistic child until they are about 18 months or so--
DR. KUHL: [Interposing] Right.
FEMALE VOICE 2: --and yet there's another test that works
completely in agreement with your findings. If you project
on a screen the picture of the mother, the Autistic kids will
look at the eyes.
DR. KUHL: Um-hum.
FEMALE VOICE 2: The normal ones will look at the mouth.
DR. KUHL: Yes.
FEMALE VOICE 2: And that seems to mesh very well--
DR. KUHL: [Interposing] Right.
FEMALE VOICE 2: --with your findings.
DR. KUHL: So on McClune's data.
FEMALE VOICE 2: Yeah.
DR. KUHL: Yeah. I think that's really interesting. And, you
know, from the standpoint of the clinicians as well as the
scientists, the earlier that you could diagnose these
children and the earlier you could attempt treatments that
are getting at what you think the fundamental cause is, the
better we would be both scientifically and from a clinical
standpoint. I think it's fascinating. I think these early
precursors, the eye gaze, the social work that McClune's doing, and this auditory piece, these early language precursors in the auditory domain, have real promise. Again
tapping the social and the linguistic preparation for
language.
Thank you.
Other comments?
Yeah.
MALE VOICE 2: Just a basic question about the children who were exposed to Mandarin Chinese between 8 to 10 months. So have you studied them after they became older--
DR. KUHL: [Interposing] Yeah.
MALE VOICE 2: --for--do they keep the ability or do they lose it?
DR. KUHL: Well we're really interested in the forgetting function
for adults and children, thinking that it's quite different.
That children, as we saw in these slides, learned so very
well and we think that it has some staying power but not
forever. So we brought the kids back in at 14 months and we
could see the ERP signatures of that learning are still
there. That's, you know, quite a few months after, without
any intervening Mandarin experience. But then by 18 months
it's gone. So we don't see the neural signature any more.
The hypothesis would be, but we couldn't do that with these
kids, if you brought them back at 3 or at 5, that they would
look different when exposed again to Mandarin, than the kids
would if they had never been exposed, like the control group.
We didn't have enough of these children to bring back in. But we'd love to do experiments where we demonstrate
that there is some staying power. This early experience that
you have really has a kind of potency with regard to the
neural structure and the vestiges of that would remain quite
a long time.
Whereas in adults we know that when we're exposed to foreign
languages we sometimes learn but it doesn't stay. So there's
something different about the short term memory system and
long term memory systems for language and speech that's going
to link to the critical period that protects us from being
written over, right, if you go to Japan for 2 months, it
wouldn't be good if your English skills were written over by
your newly forming Japanese skills. So there's something
about memory systems and the critical period that we'd like
to understand. And maybe the tractable experiments are these
forgetting functions after interventions to see, you know,
what's happening. And of course brain imaging with that
would tell us something about what's happening in the
auditory areas, in Broca's, and in executive function areas.
MALE VOICE 2: Thank you.
DR. KUHL: Yep.
MALE VOICE 3: Well there's a phenomenon that's been quite
important in my own work which is quite puzzling but all
linguists are familiar with it. Children, there's massive
evidence to indicate that children do not acquire the foreign
accent of their parents.
DR. KUHL:
Yes.
MALE VOICE 3: And this happens quite late, in a period that seems to be considerably later than the intensive learning program--
DR. KUHL: [Interposing] Than the really early stuff.
MALE VOICE 3: So I was wondering if that later lability can be modified by social factors of a sort that we don't yet understand.
DR. KUHL: I totally agree with that. I think that… I think that
we may see that the social factors are so potent that when
you take--we've been looking at toddlers. Toddlers who come into the country, say from Japan or from Taiwan; their families have spoken Mandarin or Japanese. The parents are not good English speakers, and so they are attempting a little bit of English at home, but basically the children are exposed to one language at home and a different language in preschool.
And what we see is that as soon as they develop--this is
mostly anecdotal, ethnographic kinds of research, that as
soon as they develop a social group, a group of pals, a group
of friends, the 3-year olds, 4-year olds, that they
completely lock onto the language style of their peers. So I
interpret that as a social effect. We haven't yet made it tractable.
I'm more interested in teenagers who are, you know, edging
towards the end of the critical period, and there, too, there's a potency with regard to social factors and a group of
emotionally close kids and a mimicry, this is in speech
production data, a mimicry of the kids in their social group.
So I think the social factors are really strong.
The questions are why aren't they strong for adults. Why
don't social factors--or do we avoid social factors? As
adults we protect ourselves socially by kind of--we don't
move to foreign countries and develop close friends to the
same degree as kids would do when they're kind of desperate
for that contact. What is it about the social that might
change over time if this is a real fundamental explanation to
what's going on with learning mechanisms? So I think that's
a puzzle. But it's absolutely true for the kids that this
peer group will, you know, win out over the parents hands
down, very potent. John.
DR. TRUESWELL: So this is somewhat related to that. I think really, I don't know the developmental literature well enough to say this for sure, but it seems like an understudied topic in this area is the adaptation to particular speakers. Speaker--
DR. KUHL: [Interposing] Um-hum. Um-hum.
DR. TRUESWELL: --speakers, adapting to speaker variability. It would seem, especially for these computational problems, that, you know, one of the things that the child has to learn is a model of an individual speaker. How quickly can you develop--
DR. KUHL: [Interposing] Um-hum.
DR. TRUESWELL: --a model of the priorities of this individual speaker--
DR. KUHL: [Interposing] Right.
DR. TRUESWELL: And how do these map onto the so-called invariance, or maybe that's what the invariants are, developing a model for each--
DR. KUHL: [Interposing] Um-hum.
DR. TRUESWELL: --speaker quickly.
DR. KUHL: Right. Um-hum.
DR. TRUESWELL: And so I guess part of the question is, are there studies of that sort going on? You know, how early can infants adapt to different speakers?
DR. KUHL: Yeah.
DR. TRUESWELL: And also, you know, would this be part of the puzzle that, you know, people--
DR. KUHL: [Interposing] Um-hum.
DR. TRUESWELL: --a crucial piece of the puzzle that's being overlooked right now.
DR. KUHL: Yeah. I don't think the right kinds of experiments are
going on and these are the ones where the engineering
approach would be very interesting to compare to infants.
So what you want to do is model learning with machines and
with infants if you could control the kinds of speaker
variability that they're exposed to. So on a completely statistical explanation, you'd imagine the kids are
attempting to develop the distribution of allowable instances
of a particular type. Now you have to decide what you're
averaging over, but let's say that you can limit the extent
to which there are speaker variances in the input of a child.
Maybe a child raised only by the mother for the first 6
months, just to do a thought experiment, right?
So you'd be over-representing her utterances in your statistical distributions; they would simply look like her. And so when other females, males and children come into the picture, that statistical distribution has to be modified, until the very next instance you hear doesn't change the distribution any more. So once you've heard, what, a million instances of a vowel, and you've heard many different talkers of different ages, is that the point at which you stabilize, where your distribution doesn't change any more? Because the next instance is not new. It's subsumed by the distribution.
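Here is a toy sketch of that stabilization question: track a running estimate of a vowel category's distribution over a stream of tokens, and ask when a new token no longer moves the estimate by more than a tolerance. The stream, the tolerance, and the stopping rule are all illustrative assumptions, not a model from the talk:

```python
# Toy sketch: online mean/variance of a vowel's F1 (Welford's method),
# flagging the first token that barely moves the running estimate,
# i.e., the point where a new instance is "subsumed by the distribution".
import random

random.seed(0)
mean, m2, n = 0.0, 0.0, 0
EPS = 0.05  # Hz of allowed drift in the mean per new token (assumed)

stable_at = None
for i in range(1, 100_001):
    token = random.gauss(500, 80)  # F1 of one vowel across many talkers
    n += 1
    delta = token - mean
    mean += delta / n              # Welford update of the running mean
    m2 += delta * (token - mean)
    if stable_at is None and n > 100 and abs(delta / n) < EPS:
        stable_at = n              # first token that barely moved it

print(f"mean = {mean:.1f} Hz, sd = {(m2 / n) ** 0.5:.1f} Hz, "
      f"first 'subsumed' token at n = {stable_at}")
```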
So how that happens, the stability and how many instances
does it take, you know, we have absolutely no idea how much
it takes. And then we also know that we drift, both in
speech production and in speech perception. If you go to
Britain you will drift your categories. You'll sound more
British with regard to your productions. You'll also
perceive things differently. So why is it that you don't overthrow your distributions when you're trying to learn a
foreign language and you're not completely, you know, mapping
a whole new structure when you go to Japan for 2 months?
Yet you can drift when you go to another country. You're
beginning to sound different and your perception is shifting
if you do micro perceptual tests. So how does the
neurobiology protect us from the overthrow problem while
still allowing us to adjust? Again, socially, why would we do it? Well, because socially it's sort of interesting to adjust to the behavior patterns, the speech patterns, the listening patterns of the social group you're in.
Comparing human learning experiments with machine modeling
experiments that look to see what does it take to stabilize a
distribution, think of it just statistically, that's an
interesting problem. And I don't--I've been thinking about
it and I write about it but only as a, you know, gedanken; it's not something that I'm solving in
experiments yet. It would be nice to design those and
conduct those. I think that's a really interesting question.
But I don't know how that works. We have a degree of
openness but we're not completely open. How is that?
DR. TRUESWELL: Well, thank you. That's all the questions. Thanks--
DR. KUHL: [Interposing] Yeah, thanks, that's great.
[Applause]
[END 164074.MP3]