>> Heather Mitchell: Good afternoon, and welcome. My name is Heather Mitchell, and I
am here to introduce and welcome Brian Christian, who is visiting us as part of the
Microsoft Research Visiting Speakers Series. Brian is here today to discuss his book,
"The Most Human Human: What Talking About Computers Teaches us About What it
Means to be Alive."
In 1950, Alan Turing predicted that computers would become so sophisticated that, quote, we would be able to speak of machines thinking without expecting to be contradicted. The Turing test has become the
holy grail for scientists studying artificial intelligence. Each year, computer programs
compete for the Most Human Computer award. Brian was awarded the Most Human
Human award in 2009.
Brian holds a dual degree from Brown University in computer science and philosophy,
and an MFA in poetry from the University of Washington. His writing has appeared in
such widely disparate publications as AGNI, Gulf Coast, the Seneca Review, Best
New Poets and Cognitive Science. Please join me in welcoming Brian to Microsoft.
[applause]
>> Brian Christian: Thanks so much. And thanks everyone for coming.
I am a humble cartographer of the advances of technology, but you guys actually do this
stuff. So, yes, it's a pleasure to be here.
So really the book tells two stories. The first story is a philosophical story. It's how
computers fit into this long-standing narrative of humans thinking about themselves. So
traditionally we've looked to compare ourselves against animals as an attempt to figure
out what it really means to be human.
And so the invention of the computer, I think, marks this really decisive shift, where we
now are more preoccupied with distinguishing ourselves from machines than from
animals.
So the question has really sort of pivoted. And the implications that come out of that are
extremely exciting, I think. The second story that the book tells is this personal story.
So it's about my own involvement in my own small way in the history of this competition
called the Turing test.
So in the fall of 2009, I found myself in an extremely strange position. I was 5,000 miles
away from my home in Seattle, in Brighton, England, and I was preparing to have a
series of 5-minute-long instant message conversations with a number of strangers.
And my goal in these conversations was probably one of the strangest things that I'd
ever been asked to do, which was that I had to convince them that I was human and
they were going to be slightly skeptical of this.
So it's this very curious position to be in. This test, I'm sure, many of you are familiar
with this. It's called the Turing test. It was first proposed by Alan Turing in 1950. As the modern computer as we know it was still in its infancy, people were already asking these really philosophical questions, like: can these machines actually think in the human sense?
If they could think, or if we could build a machine that could think, how would we know?
And Turing's answer was, let's just put theory and philosophy aside, I'm going to create a
practical test. We're going to convene this panel of scientists, and they're going to have
these 5-minute-long text chats and they're not going to know whether the text that
they're receiving is coming from a real person or from a computer program that's
designed to mimic human conversation.
And it's their job in five minutes of back and forth chatting to try and figure out, to try to
steer the conversation such that they can make that determination.
It was Turing's famous prediction that by the year 2000 computers would be fooling the
judging panel about 30 percent of the time, and that as a result we'd speak of them as
being intelligent, without expecting to be contradicted.
This prediction did not come to pass. The test has been held annually since the early '90s, and typically the best programs are only able to fool maybe one of the judges.
But 2008 was this very strange kind of watershed year, where the top computer program managed to fool three out of the 12 judges, or 25 percent.
So the computer was basically one vote shy of passing the Turing test: one more deceived judge would have made it four out of 12, past Turing's 30 percent mark.
So kind of a narrow squeeze for humanity. And my ears really perked up at this, and I thought, well, 2009 could in fact be the pivotal year. This could be the year that they finally cross that threshold, and this voice sort of rose up in me that was basically: not on my watch. I'm not going to let that happen.
So is there something that I can personally do to get involved, to be a part of this and defend the honor of my species?
So I got in touch with the organizers. The man who oversees this particular incarnation of the Turing test, which is known as the Loebner Prize, is a rather colorful and inscrutable figure: a disco dance floor baron, famous for selling portable roll-up lighted disco dance floors in the '80s and '90s. He made his fortune and decided that he wanted to immortalize himself in the annals of science, and said, I think technology is now at a point where we're ready to have these Turing tests. So there's this long-standing cash prize, which is outstanding for whichever program can break through first.
So I called up Hugh Loebner, and before I knew it, it was: all right, your name's on the list, you're coming to Brighton, England, to be part of this panel of four real people that are going to represent humanity in this contest.
And it's this very sort of strange position to be in: needing to persuade the judges that I am in fact a human, knowing that there's a computer that's trying to persuade them of the exact same thing.
So as soon as I knew that I would be participating in this test, the real question for me was: what do I do? How do I deliver a human performance at this event?
And the organizers' advice to me was basically what I had assumed that they would say,
which was, well, you are human, so just be yourself.
And that advice kind of haunted me, just be yourself. There was a level at which I
thought maybe that's a naive overconfidence in human instincts, or at worst, maybe
even fixing the fight. Because we know from experience that conversations are not
uniformly successful.
There's a huge demand in our society for conversation coaches, dating advice coaches,
conflict resolution coaches, and so there is a sense in which we're all sort of struggling to
figure out what it means to communicate better.
And as I looked back over the 2008 transcripts, I was slightly horrified at the
performance that the real people had given. So there were a number of cases of the
humans being in fact downright apologetic that they couldn't make better conversation.
One says: "I bet the confederates are really getting tired of talking about the weather. I'm sorry for being so banal." Meanwhile, the computer in the other window is basically charming the pants off the judge, who in no time is gushing LOLs and smiley faces.
I'm reading these logs and I'm thinking we can do better. And so of course I have to
figure out what that means. And so I really committed myself, basically, to being disobedient to the organizers' advice: not simply to show up in Brighton and be myself, but to prepare. I looked back over the history of the tests, at which conversations went well,
which ones didn't.
I looked into how these software programs are built, what they're good at, what sorts of
simplifications they have to make in order to be viable. I talked with psychologists,
linguists, computer scientists, philosophers, all trying to get at this question of: what is it that's really sort of [inaudible], nuanced and complex about human dialogue? What sorts
of corners do these programs have to cut and how can I emphasize precisely those
things when I go there?
Now, ordinarily there wouldn't be anything strange about this. We prepare to take
standardized tests. We prepare for tennis competitions.
But given that the Turing test is meant to evaluate how human I am, there's this weird
implication that perhaps being human or being myself is about more than simply
showing up.
And I really came to think that it is. And so the answers that I found along the way really
ended up being applicable to a lot more in life than just the Turing test.
So, first of all, I want to give you guys what I think is a strange and more than slightly
ironic cautionary tale. Dr. Robert Epstein, a UC San Diego psychologist, the editor of the scientific volume Parsing the Turing Test, and the co-founder, with Hugh Loebner, of the Loebner Prize competition, subscribed to an online dating service in the winter of 2007.
He began writing these long letters to a Russian woman named Ivana, who would
respond with these lengthy letters of her own describing her life and her family, and her
growing feelings for Dr. Epstein.
Eventually, though, something didn't feel quite right. To make a long story short, Epstein
comes to realize that he's been exchanging lengthy love letters for over four months
with, you guessed it, a computer program.
So, poor guy. It's not enough that these Web ruffians spam his in-box every day; now they have to spam his heart.
And part of me wants to laugh at the guy. He founded the Loebner Prize Turing test competition; what a chump to basically fall for the very thing it's designed to catch.
But I'm also at the same time a little bit sympathetic. Consider the unavoidable presence of e-mail spam in the 21st century: we're talking 97 percent of all e-mail messages sent across the net are spam. That's tens of billions of messages a day.
According to several calculations, you could literally power a small nation, Ireland, for example, with the electricity required to process the world's daily spam.
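To put rough numbers on that claim, here is a back-of-the-envelope sketch. The figures are assumptions on my part, loosely based on a widely cited 2008 McAfee estimate, not numbers from the talk:

```python
# Back-of-the-envelope check of the "spam could power Ireland" claim.
# All figures are assumptions: a widely cited 2008 McAfee/ICF study put
# global spam at roughly 62 trillion messages a year, at about half a
# watt-hour each (transmission, filtering, reading, deleting).

spam_per_year = 62e12        # messages/year (assumed)
wh_per_message = 0.5         # watt-hours per message (assumed)
ireland_twh = 27             # Ireland's annual electricity use, ~27 TWh (assumed)

spam_twh = spam_per_year * wh_per_message / 1e12   # convert Wh to TWh

print(f"Spam energy: ~{spam_twh:.0f} TWh/year; "
      f"Ireland uses ~{ireland_twh} TWh/year")
# -> Spam energy: ~31 TWh/year; Ireland uses ~27 TWh/year
```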
But all that spam I think does something arguably worse which is it erodes our sense of
trust. I sort of resent that when I get an e-mail from someone I have to spend a tiny bit
of energy for the first few lines at least deciding that it's actually them talking to me and
their account hasn't been hijacked.
And so in some sense we go through the 21st century now with our guards up; basically all communication, at least all digital communication, has become a kind of Turing test.
When I want to e-mail someone a link and I want them to trust it enough to click on it, I
not only have to send it to them, but I have to act like myself in some way.
I have to say some sort of very-Brian thing so that they'll trust it's really me sending it.
So we're starting to see that even though these programs are not actually passing the Turing test, they're already putting this pressure on us to actively be ourselves and assert our personality.
And it's not only a question of etiquette, which I suppose it's always been, it's now
actually a part of online security to act like yourself.
And that for me is extremely interesting. So that's part of it. And I think the other part of it is, I'd like to think in this particular example that Dr. Epstein learned his lesson; that something occurred to him about why it took him four months to realize that there was no actual human exchange occurring between him and Ivana.
And I'd like to think that in his next relationship, which is hopefully not only with a flesh-and-blood Homo sapiens but with someone closer than 11 time zones away, whatever lessons he learned will mean his now actual girlfriend has Ivana, in some strange way, to thank.
So spam, and the possibility of dastardly online programs trying to make people fall for them, indicates this way in which the Turing test, which used to be this purely abstract, academic, philosophical thing, is actually creeping into daily life.
Increasingly, we're seeing customer service being outsourced to chat bots. If you go to
IKEA, they have Anna, a chat bot who will direct you to the curtains department. Alaska Airlines has a chat bot, and so does Continental Airlines, and the Army has a chat bot called Sergeant Star.
And I enjoy asking him about a bunch of classified things and getting his like formulaic, I
can neither confirm nor deny the existence of Delta Force.
But what we're also seeing at the same time, and I think part of what the Turing test really makes us ask, is how much of a difference is there between customer service that's actually being handled by a chat bot and customer service that's being handled by real people basically reading from a script, reading a set of prepared solutions, reading off of an FAQ, or stepping someone through this very kind of rigid protocol.
They are not giving Turing-test-passing customer service in that case. So the lines between people and machines can sometimes blur when we're operating in this very rigid way.
I think partly, to back up and sort of widen the lens a little bit, partly to understand the
anxiety that people have about their relationship to computers, it's worth noting that
computers used to be human.
I don't know how familiar this is to most of the people in the room, but back before the word "computer" meant one of these digital devices in our pockets or on our laps, it meant something else entirely. It was a job description.
So engineers fell in love with computers all the time. It was totally commonplace. And in fact it's really fascinating, if you go back and look at these really early papers by Turing and others in the '40s and '50s, where they're trying to explain to their readers exactly what these contraptions are, they'll say things like: Well, you know, it's kind of like a computer.
And what they mean is it's sort of like a person whose job it is to do math. Part of
what I find really fascinating about the past few decades is that it is the digital computer
that has not only become the default term but the literal one. To call a human a computer is now the metaphor, whereas it used to be that calling the machine a computer was the metaphor. Now it's switched.
Now it's the human math whiz that's like a computer. But the whole point was the
computer was modelled on him. So we're in the strange position of actually imitating our
old imitators. We are like what used to be like us.
The Harvard psychologist Daniel Gilbert says that every psychologist must at some point in his or her career write a version of what he calls The Sentence.
And The Sentence reads: The human being is the only animal that blank. So the human sense of self, you could say, is the story of these failed, debunked versions of The Sentence.
We once thought humans were unique for being able to use syntax. That's being
chipped away. We thought humans were unique for using tools. That's being chipped
away. And perpetually these psychologists are trying to come up with new formulations.
But what I think is really interesting is again it's not animals that we're really comparing
ourselves to anymore.
So for Aristotle and Plato and Descartes, they were incredibly impressed with our ability
to remember facts and think in this really rigid abstract way and do things like long
division.
We are not impressed by that capacity, really, anymore. We're constantly talking about
how bad we are at math and how bad our memories are.
So this yardstick for what it meant to be human has now changed. We have this
different yardstick. So you can of course ask is it appropriate, first of all, for our sense of
self to be reactive to other things in the world and define ourselves purely in relationship
to these other things that we don't fully have control over?
Are we in this constant process of kind of backing off of our claims to what makes us
special? Part of what I think is interesting about the Turing test in particular is that it's
not only a benchmark for machine intelligence, but it cuts both ways. The Oxford
philosopher John Lucas says that when machines pass the Turing test, it's not going to
be because they're so intelligent, it's going to be because we're becoming so wooden.
And that's what I find really compelling about it. If we get to this position where these machines' conversations are indistinguishable from ours, to what extent is that an achievement on the part of the machines, and to what extent is it an indictment of us?
That's part of what I was really engaged by as I went through the process of trying to
prepare for the test, is what can it teach me at a practical level about just how to talk to
people.
And that's what I found really rewarding about it: I think the Turing test's ultimate questions are practical questions. What does it mean to communicate with someone as effectively as possible, given the parameters of time and language and medium?
And what is it that we're particularly good at and how do we emphasize that. And as you
look at the history of chat bots you start to get these really interesting answers.
So I'll give you, for example, one of my favorite transcripts. This is from the 1997
Loebner prize competition.
So I'll use the convention of calling the judge, judge, and calling the thing or person at
the other end remote, just so that I don't have to commit to a word that's either like
person or machine.
Remote: Hi, my name's Kathryn, what's yours?
Judge: My name is Elliott.
Remote: Well, it's nice to meet you, Elliott. Did you see that story on CNN last night about the lesbian couple who came out at a White House party on Sunday?
Judge: Yes, I did. I think it may all be a publicity stunt for Ellen.
Remote: Ellen DeGeneres was one of them; she was kissing her lover in front of Bill Clinton. Do you think he was embarrassed?
Judge: I think he wants the gay vote. He will do anything to get it and keep it. He does not have a conservative base.
Remote: Are you still a friend of Bill Clinton?
Judge: I'm not sure he has any real friends. He's achieved a lot, left a lot of people behind.
Remote: You really don't like the guy? Well, at least that's definite. I thought most people were just indifferent and numb by now.
The conversation continues. I have to say, having gone over the history of the Turing test, this is up there as probably one of the most trenchant, focused conversations that's ever been had in the test, which is why it's so startling that the remote in this case is a computer.
And that really threw me for a loop. Until I read the computer's conversations with the
other judges. What I discovered was basically it was this very simple gimmick. It had
this political conversation kind of mapped out, and basically whatever you said, it would
just say the next thing.
So there was this one judge that got very impatient, and the program asks him, what do you think of Bill Clinton, and he says, well, all I know is I really like pancakes. And the computer just continues on as though nothing has happened: well, I really think he needs the conservative vote, blah, blah, blah.
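To make the gimmick concrete, here is a minimal sketch in Python of a script-following bot of the kind described above. This is my own illustration, not the actual program's code; the script lines are paraphrased from the transcript:

```python
# A script-following bot: it walks its canned political conversation
# line by line and completely ignores what the judge actually types.

SCRIPT = [
    "Hi, my name's Kathryn, what's yours?",
    "Did you see that story on CNN last night about the lesbian couple "
    "who came out at a White House party on Sunday?",
    "Ellen DeGeneres was one of them. Do you think Bill Clinton was embarrassed?",
    "Are you still a friend of Bill Clinton?",
]

def scripted_bot():
    for line in SCRIPT:
        judge_says = yield line   # the judge's text arrives here...
        del judge_says            # ...and is thrown away unread

bot = scripted_bot()
print(next(bot))                                          # opening line
print(bot.send("All I know is I really like pancakes."))  # script marches on anyway
```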
And I think that really cuts to one of the key things about the Turing test: it's not merely a demonstration of the sophistication of the computer's behavior, and no static demo is going to be sufficient. It only really matters once you start to interact with it and see how it responds to you.
And to my mind, at least, there's a nice parallel with our sense of what human
intelligence is all about. That it's not -- we tend not to think of ourselves as having this
massive raw horsepower brain, but this really sort of nimble agile responsive, reactive
type of intelligence.
And to some extent I think you see the same thing in politics itself, where we're not that impressed with a candidate's prepared remarks. We sort of withhold judgment to see how they can field the Q&A.
If this talk were a sort of Turing test, you could ask yourself, you know, is this really the author, or is this just some handsome stage actor that's been hired to deliver the remarks? And you won't really get to know that answer until you ask me questions at the end. So ask me good stuff so I can prove that I'm really the author.
And I think there's an interesting parallel for me in the recent history of consumer
computing, where Moore's law is more or less continuing. We're creating smaller and
smaller processors, denser transistors, all of this is going on as it did in the '90s.
But consumers, generally speaking, are less excited about getting the absolute cutting edge in power than they are about getting something with a better interface.
So a lot of the big breakthroughs have been going from desktops to laptops, or
smartphones or tablet computing. We're really excited about those breakthroughs and
these are not the cutting edge in terms of speed, memory, horsepower.
But they present something that's more reactive, more sort of nimble, where if you have
to spend a lot of time using the interface, typing in commands, that sort of thing, the fact
that the processor speed is so great behind the scenes is of little importance compared
to if you can just make these really natural gestures and get it to do what you want.
And I think a real great recent example of that is the Kinect. That it represents, you
know, not this giant sort of hulking super-powered processing thing, but it, more
importantly, represents a new way of interacting with computers, of them being sensitive
to what you're doing and responding to you in a natural sort of fluid way.
I think that's been a paradigm shift that's also been happening in theoretical computer
science, where in the 1960s, the big paradigm was what's called computability theory.
The basic mantra of computability theory is: we don't really care how long the computation takes, we just want to know whether it's possible or not. The idea is to produce correct answers; timeliness would be great, but it's not that important.
Whereas real-life situations, if you think about being in a conversation with someone, are much more exactly the opposite: produce a timely answer, correct if possible. And the more correct the better. But timeliness is the crucial thing.
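As a loose illustration of that shift (my own sketch, not anything from the talk), an "anytime" computation embodies the timely-answer-correct-if-possible mindset: it refines its estimate until a deadline and then reports the best answer it has so far:

```python
# An "anytime" computation: always has an answer ready; more time
# just means a more correct answer.
import time

def estimate_pi(deadline_s=0.05):
    """Refine a Leibniz-series estimate of pi until time runs out."""
    start, total, k = time.monotonic(), 0.0, 0
    while time.monotonic() - start < deadline_s:
        total += (-1) ** k / (2 * k + 1)   # 1 - 1/3 + 1/5 - 1/7 + ...
        k += 1
    return 4 * total, k                    # best-so-far answer, terms used

pi_approx, terms = estimate_pi()
print(f"pi ~ {pi_approx:.6f} after {terms} terms")
```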
And that was one of these paradigm shifts, going from computability theory to complexity theory. You see the same thing in linguistics. Noam Chomsky's theories in the '50s and '60s are all about: I don't care about shifts in attention, I don't care about negotiating who is talking and who is listening, or people finishing each other's sentences, or struggling to find a word. I'm not interested in the performance of language; I'm interested in this sort of platonic ideal of sentence grammar.
And it's that same thing of as correct as possible, but let's set aside these kinds of performance concerns. And then there's this paradigm shift in linguistics in the '80s and '90s of becoming in fact very interested in just those sorts of things.
How in a conversation do we negotiate whose turn it is? How is it that we sort of give
these yeahs and uh-huhs and actively shape what the person is saying just through the
act of listening?
How is it we sometimes find ourselves in conversations where there are no actual turns being taken? There's just this steady stream of language that everyone's sort of chipping in on.
You see it most commonly with humor, where someone makes a joke, someone will put a twist on the joke, someone will then replace a word and add to that, and you're all sort of just building this thing together.
It's completely far off from the call-and-response, Q&A, walkie-talkie style of the theoretical model.
So as an aside, I think this is one of the great dangers of cell phones: in the landline era it was not always easy to reach someone, but when you reached them you were interacting in a real low-latency, low-lag kind of way. Landline lag is about a hundred milliseconds, a tenth of a second, and typical cell phone lag is about six times that. And we're really excited by the fact that we can reach each other all the time. But I think what we're slightly less aware of is what we have to sacrifice in order to do that.
Which is precisely this really delicate ballet of timing that makes some of these great
moments of humor and spontaneity possible. And I think looking at chat bots is one
example of these artificial approximations of conversation.
It gives us a window onto that and makes us more aware of this quality of natural interaction that we hadn't really been that impressed with or focused on previously.
And this was one of the things that I tried to do in the Turing test. I tried as much as
possible to break down the Q&A process. So a typical Turing test conversation looks a
little bit like a deposition where the judge is saying how many legs does an ant have,
enter and you type six, enter.
Okay, you know, who is Mozart? And so part of my goal was, I looked at that model and I said: You know, we're giving the computers way too easy a time, because part of the nuance of negotiating normal conversation is figuring out when to speak, when to try to get a word in edgewise, when to wrap up what you're saying because you've sensed that they're getting bored, or, when you sense they're about to interject, how to quickly wrap up so you can yield the floor.
There are incredibly complicated things going on when we're doing that. I didn't want
the computers to get off the hook easily. So that was my goal, it was to be sort of as
fluidly interchanging as possible.
The Loebner prize protocols make it slightly easier, because you can see someone
typing in real time. So you can see their characters appearing on the screen. You can
see them back spacing, changing their mind.
And I tried to do that as much as possible. So if someone starts to type, hey, how is it going, but then they make a typo and start backspacing, instead of just waiting patiently, I would try to spare them the awkwardness and just be like: I'm good. Yes, I know what you're typing, and the answer is I'm good.
And the same thing happens in communication. Someone falters, they're looking for a
word and we swoop in and we gracefully take the mic.
I think at a practical level, for those of us that use things like G chat and instant
message, one of my little points of advice that I've gleaned from participating in the
Turing test is hit enter way more than you think you would need to.
Because they can't see your sentence as you're building it. So instead of carefully manicuring and editing a sentence and then sending it, type the first three words, hit enter while you are typing the next several words, so that they can see the thought that you're trying to make.
If they know where you're going with it, they'll just answer. Or if they think you're missing
their point, they'll just interrupt you.
These are some of the great graces of natural conversation. So we can do things to try
to encourage that happening in these artificial mediums.
One of the things that we saw with this Kathryn program is that it's extremely impressive
in a very, very limited domain. If you have this one particular conversation, it seems
great. But other than that it's not going anywhere.
Typically, this is a trade-off that Bot programmers have to make between the range of
behavior that they can produce and the consistency or coherence of that behavior.
So recently a number of bots have been put on to the Internet, and basically what they
do is they have all these conversations with real users that log in.
And they're kind of scraping all the data. So they'll say, howdy, what's up? And you will
type: Oh, nothing much. And a computer kind of files it away in its giant Rolodex, okay,
I have noticed that humans like to say nothing much when you ask them what's up.
And over millions of these conversations, it's got this massive archive of human-like
things to say that can be actually extremely impressive.
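A toy sketch of that "giant Rolodex" idea might look like the following. Cleverbot's actual machinery is far more elaborate, so treat this purely as an illustration of the scrape-and-replay principle:

```python
# A retrieval chatbot: log what real humans said in reply to each
# prompt, then answer new users with the most common human reply.
from collections import Counter, defaultdict

class RolodexBot:
    def __init__(self):
        self.replies = defaultdict(Counter)   # prompt -> reply frequencies

    def observe(self, prompt, human_reply):
        """File away a human reply under the prompt that elicited it."""
        self.replies[prompt.lower().strip()][human_reply] += 1

    def respond(self, prompt):
        seen = self.replies.get(prompt.lower().strip())
        if not seen:
            return "Interesting. Tell me more."   # off-book fallback
        return seen.most_common(1)[0][0]          # most common human reply

bot = RolodexBot()
bot.observe("what's up?", "Oh, nothing much.")
bot.observe("what's up?", "Not a lot.")
bot.observe("what's up?", "Oh, nothing much.")
print(bot.respond("What's up?"))   # -> Oh, nothing much.
```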
So, for example, I went in and started typing in French to see if it would be able to react to me in French. I found that not only did it understand my French, it corrected my French.
It said, no, that's not how you would say that. We conjugate it this way.
Which implies that someone else had made that very same mistake.
You can start typing in Bohemian Rhapsody lyrics, and you'll find it sings along with you. You type in Scaramouche, Scaramouche, and it types back, will you do the Fandango. And it's really uncanny. The range of behavior it's able to access is extremely impressive.
But you start to find it out, not so much by realizing that you're not talking to something human, as by realizing that you're not talking to a single human.
And so if you ask it, where do you live, it will say central London. You'll say, have you ever been to Disney World? It will say, no, because I live just down the street. It doesn't make sense, because it's cobbling together different pieces of previous conversations.
As long as you ask questions with a right answer, it will generally give you the right answer. But if you ask questions that require a particular embodied life experience, you start to realize that it's just this giant kaleidoscope of all these different people.
And so that was something I tried, as much as possible, to be attuned to. If they would ask me a factual question like, what's the weather like today, instead of just saying, it's rainy, I would say, well, it's rainy, but you know, I'm from Seattle, so that's pretty much par for the course.
Whatever they would ask me I would try to embed it in this life narrative to give them a
sense of there was a reason I knew everything I knew. There was a reason I didn't know
everything I didn't know.
And it all traced back to this life history. Another big part of a program like Cleverbot, which is sort of scraping together all of these conversations, is a principle called The Book. And The Book comes from the game theory of things like checkers and chess. And
it's this idea that because every game of checkers or chess starts in exactly the same
position, there's a lot of theory generated over what are the appropriate moves in that
position.
And in the case of chess, there are only 20 starting moves, and only 20 replies.
It takes a while before the combinatorics explodes and you find yourself in a position
you've never played before.
So at the highest level of play, the game is not even thought to begin until what's called
the opening novelty, which is the first move that gets the game out of the book, out of the
realm of positions that have been played by other top players.
And the struggle to get out of book is actually one of the big struggles of high level play.
Computers have certainly made the problem worse, because players now have access to giant databases of every grand master game ever played.
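For illustration, an opening book is essentially a lookup table from positions already seen to stock replies, and the "novelty" is simply the first position that misses the table. A minimal sketch, with made-up move strings rather than real chess theory:

```python
# A toy opening book: known move histories map to stock replies.
book = {
    (): "e4",                  # stock first move
    ("e4",): "e5",             # stock reply
    ("e4", "e5"): "Nf3",
    ("e4", "e5", "Nf3"): "Nc6",
}

def next_move(history):
    """Return a book move, or None once the game has left the book."""
    return book.get(tuple(history))

print(next_move([]))               # "e4"  -- still in book
print(next_move(["e4", "e5"]))     # "Nf3" -- still in book
print(next_move(["e4", "d5"]))     # None  -- an out-of-book novelty

# With ~20 legal first moves and ~20 replies to each, there are already
# ~400 two-move openings; real books run out fast as positions explode.
```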
But it's a problem that plagues a lot of high-level games. So, for example, one of the anecdotes that I raise in the book is that checkers died in 1863. The World Checkers Championship was being held in Glasgow, Scotland, between James Wyllie and Robert Martins, and they were to play a 40-game series over the span of a month. And 21 of those 40 games were the exact same game, start to finish. There was such a body of theoretical knowledge about which were the best moves, and the stakes were too high for either player to play inferior moves merely to get the other person into a fresh position, that they just stuck to the script and kept playing the same game over and over.
And it basically threatened to kill checkers. You can imagine that the sponsors were not
thrilled at the outcome. And so part of the reaction on the part of the professional
checkers community was if you don't like how the players are opening their games, open
the games for them. And so they adopted this thing that became known as the three-move restriction, which is: you sit the two players down, draw moves out of a hat, make a bunch of random moves at the board, and let them play.
It's basically a way of strong-arming them out of the book. So for me there's this great conversational analog to the book: Cleverbot, for example, is comprised of about ten million previous conversations.
So as long as you're saying something that has been said in a corpus of ten million
conversations, you're on pretty solid turf. The computer is going to have a response
ready. Of course, one of the beautiful things about language is that it's really easy to
utter a sentence that has never been formulated before.
So the words "perlescent" and "dudes" have probably never been put into a sentence
before. But, of course, we don't do this. Etiquette and the ritual of society gives us a
kind of conversational book. Hi, how are you? I'm good. How are you? How was your
drive? It was not bad. A little bit of traffic. How was yours?
And we spend a lot of time in this kind of conversational book. These questions that we
know to ask and answers that we know to give.
And so when I was in the Turing test, a big challenge for me became how do you get out
of book in five minutes of IMing without seeming like you are trying so hard that it starts
to backfire.
So the judge would come in and say, hi, how's it going, and I would immediately get
nervous. It's like how's it going, the classic conversational opener. We need to get out
of here as quickly as possible. We need to find some way to have an argument about
The Rolling Stones or decide whether we'd rather be a whale or a squid, just something
that's never happened before in this giant database, how are we going to get there.
And for me there's this great parallel between these sorts of formal games and what happens when you meet an old friend over coffee: how do we plow through these formalities and patterns and get through to the place where you're sort of actively struggling to put words together, actively trying to figure out what you want to say?
And with chat bots, that's how you break them. And with people, that's how you come to
really know them. So for me, you know, the computer in this broad philosophical sense
is the most significant chapter in this really long philosophical history of who are we,
what are we all about? Can we look to the external environment for clues about what
makes us different from everything we're experiencing?
And you know the computer has come along and I think corrected some long-standing
errors in the history of philosophy. So if you go back to Aristotle and Descartes, they're extremely dismissive of animal cognition, because their thing is, you know, only humans have souls. And the argument goes:
Okay. So animals can navigate physical space. They can recognize objects. They can
recognize each other. They can form these social groups. They can achieve goals.
They can do certain rudimentary types of problem solving, all of that must be totally
unimportant and unimpressive.
Let's pay attention to this very narrowly defined set of abilities. We have the ability to
think abstractly and think in this very rigid step-by-step logical way where we're making
deductions, we can do mathematics, all the things that, of course, the computer came
along in the 20th century and not only proved itself capable of, but much more capable
than ourselves.
And I think that really sort of hit Aristotle and Descartes where it hurts. So it throws us back on the question of: what are the impressive things about ourselves? And I think AI is coming back to us with all these really surprising verdicts.
Who would have thought, for example, that computers would be landing planes before
they'd be riding bikes, whereas for us it's exactly the other way around.
That computers are translating the minutes of the United Nations, but they still have a
really difficult time being presented a Polaroid of a cup and saying that's a cup, which
any 5-year-old can do.
The progress, but also the obstacles, of AI has given us, I think, this very surprising and fascinating window onto what is impressive about ourselves, perhaps more so than we thought.
Some of what we thought was easy is actually hard. And some of what we thought was hard is actually easy.
And the big bottom line for me is that there's this tremendous unsung complexity to the
activities of daily life. That you're walking down the street, you're dodging pedestrians,
you see someone that's vaguely familiar and you say that's my buddy Bob from college.
Like, Bob, what's been going on in the past ten years? And he's in the strange position of having to compress ten years of experience into one sentence. We typically don't think of these things as major cognitive achievements, as compared to factoring giant numbers.
In fact, they are. In fact the mundane kind of daily activities are where a lot of the real
complexity of human experience is.
I think the Turing test is a real window into that in the verbal communicative sphere, that
it's giving us a sense of what really is going on when we talk, what are the things that are
truly interesting and subtle and impressive. And it's not just a pat on the back.
It is that to some extent, but it's also a call to action, that we don't always know what
these things are. We don't always manage to get out of book even despite our best
efforts.
And so it's an occasion not just to celebrate these qualities, but to actively pursue them.
So we build these things in our own image, but the approximation always leaves this
gap. And that gap then always has something new to teach us about who we are.
So I'll stop there and take questions if you guys have any. [applause].
>>: I can see why you have the Most Human Human prize.
>> Brian Christian: Thanks.
>>: One of the important points in your description of the Turing test as you went through it in England was the time allowed for this interaction. Five minutes. And you said obviously a shorter test makes it a lot easier for a computer to win than a longer one.
Can you just talk about the function of time, what is it that's going on that makes that
important?
>> Brian Christian: I mean, intuitively if the Turing test were two seconds long it would
be pretty easy for the computers to pass themselves off. They'd say hi. End of test, and
you're left with nothing to go on.
And I think that we can pretty much intuit that if the Turing test were eight hours long,
that there would be plenty of opportunities to try all sorts of different things that no
program would possibly hold up to that kind of test.
So you can make the argument that this 5-minute mark is this very arbitrary thing. That
instead of looking at what percentage of machines can fool us in five minutes, we should
go exactly the other direction and look at what duration of time is the appropriate window
in which computers can fool us, and watching that window get larger and larger.
So that was one of the things that I was aware of; that the test is to some extent a race
against the clock, and so amongst my sort of more nuanced strategies, there was also
the very simple strategy of just typing as fast as I possibly could.
>>: So, when you type, obviously humans make many spelling errors. And you said that the judges could actually see each character as it hit the screen. So wasn't that kind of a dead giveaway, or were the computers programmed to do that as well?
>> Brian Christian: In fact, the programs are designed to fake typos, which -- this is one
of the criticisms that people level against the Turing test. It's of no use to computer
science that we're spending man-hours faking typos.
So touché. But I, for example, was aware of my typos, partially originating from the
British keyboard layout. So the at sign and quotation mark sign were reversed. So
whenever I made that particular typo, I deliberately left it in rather than back spacing it
and would say something like argh, UK keyboard, such that my typo was not simply like
noise in the signal, but it was actually an artifact of muscle memory operating on a
different keyboard layout.
So my hope was that that was sort of a more nuanced type of typo. But I don't know if
that worked or not.
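For what it's worth, here is a hedged sketch of what faking that kind of layout-specific typo might look like in code. The swap pairs are my assumptions about UK versus US keyboard layouts, not details from any actual Loebner entry:

```python
# Fake "muscle memory" typos: instead of random noise, occasionally
# type the character a US-trained hand would hit on a UK keyboard.
import random

UK_US_SWAPS = {'@': '"', '"': '@', '#': '\u00a3', '\u00a3': '#'}

def fake_layout_typo(text, rate=0.3):
    """Sometimes leave in the wrong-layout character, uncorrected."""
    out = []
    for ch in text:
        if ch in UK_US_SWAPS and random.random() < rate:
            out.append(UK_US_SWAPS[ch])   # the layout slip, left as-is
        else:
            out.append(ch)
    return "".join(out)

random.seed(1)
print(fake_layout_typo('He said "hi" to user@example.com'))
```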
>>: I have a one-word question for you. Watson.
>> Brian Christian: Okay. I may need a few more words to answer you. So, the Watson supercomputer: you guys probably all know this, but IBM developed the supercomputer that beat the two big human Jeopardy champions.
I infer from your question that you are curious, is this like a big deal or not?
>>: Just wanted to know what your response to it is.
>> Brian Christian: So I think it's part of this great sort of counterintuitive narrative of the
development of AI, that when we look at someone like Ken Jennings or Brad Rutter
we're impressed by how many facts they know. We're not impressed or excited by how
many facts exist on Wikipedia or something like that.
And, in fact, the really amusing thing is that for IBM, the challenge was not encoding, tagging, and then accessing all world knowledge; it was, how are we going to deal with Alex Trebek's puns? How are we going to handle figurative language and these sorts of oblique clues that don't fully articulate what they're getting at, but that any human can understand?
That was basically the entire challenge for them; there was this giant pun team. And for me that's one of these great sort of life-affirming results, that these things that go totally underappreciated in fact turn out to be incredibly challenging.
Yes?
>>: Have you seen the new androids that have been built that basically look human? I know there's one in Denmark that we've seen on YouTube, where basically this professor built this robot to look like himself, with the mimicking of the facial expressions and everything else. Just wondering what your reaction is to that, putting that together with the communication and bots.
>> Brian Christian: Yes, there's definitely -- we're starting to come out of the uncanny valley in terms of these android systems, where they are no longer looking like creepy corpses, but they're actually looking like slightly animate people.
And in fact Hugh Loebner has set aside a special $100,000 gold award for the first
system that's able to pass the Turing test over video.
So if you're like Skype videoing, I don't think we're going to be seeing that in the near
future.
But it's interesting to think about. Also, you know, it's one thing to create like a really impressive wax statue, and it's another thing to create a really impressive robot where you hit the smile button and it smiles.
But you can imagine almost a purely gestural Turing test, where, like the way parents communicate with infants, you make a face and see what face it makes back.
That would be a whole other level of -- I don't know, it's interesting to me to speculate what the different thresholds will be.
>>: Do you think we are not being too tough on the computers? The way you look at the Turing test, the judges are really very smart people, scientists.
If we put the computers in front of an average representation of the population, they would have a far better chance of passing the test. Talking about Watson, I think it caught a lot of flak because it said that Toronto was in the States. But if you go out on the street and you ask someone about Toronto, is it in the States, I'm sure that you're going to get a significant percentage of the population that would say, yeah, yeah, it's in the States, of course.
So are we not in fact holding the computers to a bar which is higher than what we would expect for your average human?
>> Brian Christian: Yeah. I think that's a great question. So one of the sort of AI jokes
back in the '60s was a professor claiming that he had invented a program that passes
the Turing test by perfectly mimicking a catatonic person.
And it sits there and it doesn't type just like a catatonic person would sit there and not
type.
So part of what the Turing test hinges on is which people we are comparing the computers to. And speaking to the Toronto question, this is actually one of the really interesting things: the category is U.S. cities, it answers Toronto, and we all kind of guffaw. In fact, there are several U.S. cities called Toronto; they're just really small rural cities that we sort of automatically dismiss. And the smugness of being able to guffaw, because the category is obviously not referring to Toronto, Kansas, population 300, that's the disambiguation of language. It's a great example.
So for us to even make fun of it kind of hints at our ability to disambiguate language in
this kind of subtle way.
>>: I have kind of a bookend to that question. If I took a test and I only had to get a 30, I
would think, gee, this is easy. But where did that percentage come from?
>> Brian Christian: The percentage was, as far as I know, totally pulled out of thin air by Alan Turing. What Turing is saying in his original paper is not that as soon as a program hits this mark it's going to be magically endowed with intelligence; he's really talking about popular opinion.
So he's saying, I think by the year 2000 computers will be fooling about 30 percent of the people, and as a result there will be this kind of [inaudible] of talking about them like they're intelligent. And since he's Alan Turing and the grandfather of a lot of computer science, those particular numbers have been really carved into stone as these magic numbers. But he's really just talking about sort of a paradigm shift.
But, actually, on your comment of only needing to get a 30 percent on a test: one of the best tests that I ever took when I was in school is this test called the American Invitational Mathematics Examination, the AIME. And to give you an example of how difficult this test is: it's pen and pencil only, it requires nothing more than like sophomore-level geometry and prealgebra, it's three hours long, and the average score is one. One question correct.
What I found really exhilarating was thinking about the difference between a test like that and a test like the SAT for American history, where there's a ton of questions, John Adams, X, Y, Z, this year, and everyone goes down the list and checks off all these boxes.
Part of the skill in the AIME is that you'll be happy to solve just one problem. So you look at this giant list and you think, okay, where am I going to attack? What seems doable? What seems easy? What seems really fun but maybe actually too hard?
Here's a problem where I know that there's a brute force solution, if I sit here and
calculate a bunch of multiplication on this piece of paper.
But I'm pretty sure there's a more elegant solution, but will thinking up the elegant
solution take more time than doing all the arithmetic?
You have to make all of these decisions about priorities. And in that sense I think it
represents this much more sort of human version of a math test rather than here's a ton
of problems, do them all and I'll expect a 100, whereas like here's a bunch of extremely
tough problems, find one that seems maybe doable.
I think that's much closer to what it's like to work in the real world. So I get excited by
these deliberately impossible tests.
>>: I have like two questions. The first one was, you were talking about how like a
[inaudible] is what makes the human unique compared to the computer. I'm just curious,
going from like a standpoint where [inaudible] user tries to do things where we create
personas to design by. And I'm curious what's to say you just can't give a computer a
persona to live by, to almost imitate a life, like you were saying?
>> Brian Christian: Yeah. I don't think that there's a reason you can't, that it can't be
done. That I think if we develop systems that are truly Turing test passing like in this
totally indistinguishable way, part of it will be that you have to -- instead of training them
on a giant dataset you've trained them by just like putting them in a carrier and walking
around the world for like ten years.
That, you know, everything we know is a product of this sort of continuous experience.
So, yeah, I think it's totally possible. The question is, can you marry these two
approaches of these sort of more authored bots and the more sort of like wiki bots? Is
there a way to get the best of both worlds where you're sort of crowd sourcing,
collaborating, but there's also a coherence that I think that becomes one of the big
challenges, and I think it's an exciting one.
>>: So my other question was there's been this interesting debate going on in the past
couple of days about Bill Gates gave a talk and he was saying we should kind of reduce
our investment in liberal arts degrees because they don't lead to job creation.
And then a couple of days after, at the iPad 2 announcement, Steve Jobs came out and said that their products are the marriage of technology and the liberal arts. I noticed in your bio you have a
dual degree in CS, philosophy and poetry. I'm curious, there's some connection here
between the human-human and how it all interrelates. I was curious, since you do seem
to have both degrees from both sides of the fence, where do you sit in this debate? Like
who do you agree with or disagree with and how does that affect like the
human-humankind of idea?
>> Brian Christian: I don't necessarily want to take sides in the Gates Jobs face-off. But
I do think -- I mean, it strikes me if you look at something like poetry, it seems like it
would have no connection whatsoever to something like cyber security.
But, in fact, as we were talking at the beginning of the talk about e-mail spam, that part
of how we authenticate our e-mails is not in this content-based way of like here's my
passcode, here's like my secret pin so you know it's a real e-mail from me, but instead I
act like myself or try to say something that seems very me.
And it becomes a question of aesthetics. That I think there's really a space where
aesthetics and something like IT come together.
And so without knowing the full argument that Gates had in mind, I would like to argue
that there's still a place for the arts and technology to come together.
>>: Two more.
>> Brian Christian: Two more.
>>: From your experience, if you were now made a judge, do you have a strategy that
you would use based on what you see as sort of the current set of weaknesses?
>> Brian Christian: Typically -- typically there are two kinds of judges. There are the
interrogators, who just sort of barrel tough questions at you.
And the small talkers, that just kind of say, hey, what's up, and see where you go with
that. It's also interesting, if you go back over the history of the test, there were really two eras of the Loebner Prize. For the first five years it was done with specific topics. You'd sit down at a terminal and it would give you a topic: talk about the NHL. They had to throw it out partially because there was no rigorous way to define topic.
If I start talking about, like, the Cold War context of the 1980 Olympic hockey match between the U.S. and the USSR, are we still talking about hockey or are we now talking about politics? They had these judges who would declare certain remarks out of bounds.
But there was no rigorous way to do it. So they scrapped it in favor of this totally wide
open test. And in theory that makes the test infinitely harder for the programmers,
because they have to prepare for everything.
In practice, it actually made it easier, because they mostly had to prepare for small talk.
So I think that my strategy as a judge would be something where I would go in with a
totally random topic, but at the same time allow someone to sort of figure out what they
wanted to say about it, where I was getting them out of book but also giving them
enough breathing room not to merely answer questions with facts but to see how their
mind kind of wandered through a specific domain. I think there's a place for it.
>>: You mentioned the challenge that the puns caused for the IBM scientists. Have any of the Turing test judges ever focused on humor as a way to distinguish?
>> Brian Christian: Yeah. Humor has been one of those things like typos where very
early programs were kind of preloaded with these jokes that they could crack.
And it made a really big impression on a lot of the judges early on. Because we think of
humor as one of these characteristically human things.
But to some extent, there's a great quote by Salvador Dali that I love, that the first person to compare a woman's cheeks to a rose was obviously a poet; the second person to make that comparison may well have been an idiot.
That to coin a joke requires a sense of humor, but to tell a joke doesn't necessarily.
And so for that reason, people who can sort of draw on this reservoir of one-liners and cracks, in real life I now find slightly less impressive than the people who operate in a more sort of witty, wordplay style, where they'll take exactly what you said and torque it slightly, and you know you're getting an original joke. So I think that's the trick: it's not just a joke, but a completely context-sensitive, one-of-a-kind joke.
>> Heather Mitchell: Thank you.
>> Brian Christian: Thanks so much.
[applause]