>> Heather Mitchell: Good afternoon, and welcome. My name is Heather Mitchell, and I am here to introduce and welcome Brian Christian, who is visiting us as part of the Microsoft Research Visiting Speakers Series. Brian is here today to discuss his book, "The Most Human Human: What Talking with Computers Teaches Us About What It Means to Be Alive." In 1950, Alan Turing predicted that computers would become so sophisticated that, quote, we would be able to speak of machines thinking without expecting to be contradicted. The Turing test has become the holy grail for scientists studying artificial intelligence. Each year, computer programs compete for the Most Human Computer award. Brian was awarded the Most Human Human award in 2009. Brian holds a dual degree from Brown University in computer science and philosophy, and an MFA in poetry from the University of Washington. His writing has appeared in such widely disparate publications as AGNI, Gulf Coast, the Seneca Review, Best New Poets, and Cognitive Science. Please join me in welcoming Brian to Microsoft. [applause]

>> Brian Christian: Thanks so much. And thanks, everyone, for coming. I am a humble cartographer of the advances of technology, but you guys actually do this stuff. So, yes, it's a pleasure to be here.

So really the book tells two stories. The first story is a philosophical story: how computers fit into this long-standing narrative of humans thinking about themselves. Traditionally we've compared ourselves against animals in an attempt to figure out what it really means to be human. And so the invention of the computer, I think, marks this really decisive shift, where we are now more preoccupied with distinguishing ourselves from machines than from animals. The question has really pivoted. And the implications that come out of that are extremely exciting, I think.

The second story that the book tells is a personal story. It's about my own involvement, in my own small way, in the history of this competition called the Turing test. In the fall of 2009, I found myself in an extremely strange position. I was 5,000 miles away from my home in Seattle, in Brighton, England, and I was preparing to have a series of five-minute-long instant message conversations with a number of strangers. And my goal in these conversations was probably one of the strangest things that I'd ever been asked to do: I had to convince them that I was human, and they were going to be slightly skeptical of this. So it's this very curious position to be in.

This test, I'm sure many of you are familiar with it. It's called the Turing test. It was first proposed by Alan Turing in 1950, when the modern computer as we know it was still in its infancy and people were already asking these really philosophical questions, like: can these machines actually think in the human sense? If they could think, or if we could build a machine that could think, how would we know? And Turing's answer was: let's just put theory and philosophy aside; I'm going to create a practical test. We're going to convene this panel of scientists, and they're going to have these five-minute-long text chats, and they're not going to know whether the text that they're receiving is coming from a real person or from a computer program that's designed to mimic human conversation. And it's their job, in five minutes of back-and-forth chatting, to steer the conversation such that they can make that determination.
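To make the shape of the contest concrete, here is a minimal sketch, in Python, of one five-minute judging round as described above. Everything in it, the Judge and Remote interfaces and the method names, is an illustrative assumption for exposition, not the Loebner Prize's actual software.

```python
# A minimal sketch of one round of the Loebner-style Turing test described
# above. All names here (judge, remote, ask/reply/verdict) are illustrative
# assumptions, not the contest's real implementation.
import time

ROUND_SECONDS = 5 * 60  # five minutes of back-and-forth chat


def run_round(judge, remote) -> str:
    """One judge chats with one remote, which is either a human confederate
    or a program; the judge does not know which, and must decide."""
    deadline = time.time() + ROUND_SECONDS
    transcript = []
    while time.time() < deadline:
        question = judge.ask(transcript)      # judge steers the conversation
        transcript.append(("judge", question))
        answer = remote.reply(question)       # human or machine responds
        transcript.append(("remote", answer))
    return judge.verdict(transcript)          # "human" or "machine"
```

Across all rounds, the program drawing the most "human" verdicts wins Most Human Computer, and the confederate drawing the most wins Most Human Human, the two awards mentioned in the introduction.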
It was Turing's famous prediction that by the year 2000, computers would be fooling the judging panel about 30 percent of the time, and that as a result we'd speak of them as being intelligent without expecting to be contradicted. This prediction did not come to pass. The test has been held annually since the early '90s, and typically the best programs are only able to fool maybe one of the judges. But 2008 was this very strange kind of watershed year, where the top computer program managed to fool three out of the 12 judges, or 25 percent. So the computer was basically one vote shy of passing the Turing test. A kind of narrow squeeze for humanity.

And my ears really perked up at this, and I thought, well, 2009 could in fact be the pivotal year. This could be the year that they finally cross that threshold. And this voice sort of rose up in me that was basically: not on my watch. I'm not going to let that happen. Is there something that I can personally do to get involved and be a part of this, to defend the honor of my species?

So I got in touch with the organizers. The man who oversees this particular incarnation of the Turing test, which is known as the Loebner Prize, is a rather colorful and inscrutable figure, this disco dance floor baron, famous for selling portable roll-up lighted disco dance floors in the '80s and '90s. He made his fortune, decided that he wanted to immortalize himself in the annals of science, and concluded that technology was now at a point where we were ready to have these Turing tests. So there's this long-standing cash prize, which is outstanding for whichever program can break through first. So I called up Hugh Loebner, and before I knew it I was in touch with the organizers: all right, your name's on the list; you're coming to Brighton, England, to be part of this panel of four real people that are going to represent humanity in this contest. And it's this very strange position of needing to persuade the judges that I am in fact a human, knowing that there's a computer that's trying to persuade them of the exact same thing.

So as soon as I knew that I would be participating in this test, the real question for me was: what do I do? How do I deliver a human performance at this event? And the organizers' advice to me was basically what I had assumed it would be, which was: well, you are human, so just be yourself. And that advice kind of haunted me, just be yourself. There was a level at which I thought maybe that's a naive overconfidence in human instincts, or at worst, maybe even fixing the fight. Because we know from experience that conversations are not uniformly successful. There's a huge demand in our society for conversation coaches, dating advice coaches, conflict resolution coaches, and so there is a sense in which we're all struggling to figure out what it means to communicate better. And as I looked back over the 2008 transcripts, I was slightly horrified at the performance that the real people had given. There were a number of cases of the humans being, in fact, downright apologetic that they couldn't make better conversation. One says: I bet the confederates are really getting tired of talking about the weather. I'm sorry for being so banal. Meanwhile, the computer in the other window is basically charming the pants off the judge, who in no time is gushing LOLs and smiley faces. I'm reading these logs and I'm thinking: we can do better.
And so of course I have to figure out what that means. And so I really committed myself to being as disobedient as possible to the organizers' advice, to doing more than just showing up in Brighton and being myself. I looked back over the history of the tests, at which conversations went well and which ones didn't. I looked into how these software programs are built, what they're good at, what sorts of simplifications they have to make in order to be viable. I talked with psychologists, linguists, computer scientists, philosophers, all trying to get at this question of: what is it that's really [inaudible], nuanced, and complex about human dialogue? What sorts of corners do these programs have to cut, and how can I emphasize precisely those things when I go there? Now, ordinarily there wouldn't be anything strange about this. We prepare to take standardized tests. We prepare for tennis competitions. But given that the Turing test is meant to evaluate how human I am, there's this weird implication that perhaps being human, or being myself, is about more than simply showing up. And I really came to think that it is. And so the answers that I found along the way really ended up being applicable to a lot more in life than just the Turing test.

So, first of all, I want to give you guys what I think is a strange and more than slightly ironic cautionary tale. Dr. Robert Epstein, UC San Diego psychologist, the editor of the scientific volume Parsing the Turing Test and the co-founder, with Hugh Loebner, of the Loebner Prize competition, subscribed to an online dating service in the winter of 2007. He began writing these long letters to a Russian woman named Ivana, who would respond with lengthy letters of her own describing her life and her family, and her growing feelings for Dr. Epstein. Eventually, though, something didn't feel quite right. To make a long story short, Epstein comes to realize that he's been exchanging lengthy love letters for over four months with, you guessed it, a computer program. So, poor guy. It's not enough that these Web ruffians spam his inbox every day; now they have to spam his heart. And part of me wants to laugh at the guy. He founded the Loebner Prize Turing test competition; what a chump to basically fall for the very thing his own contest exists to catch. But I'm also, at the same time, a little bit sympathetic.

You know, there's the unavoidable presence of e-mail spam in the 21st century; we're talking 97 percent of all e-mail messages sent across the net are spam. That's tens of billions a day. According to several calculations, you could literally power a small nation, Ireland for example, with the electricity required to process the world's daily spam. But all that spam, I think, does something arguably worse, which is that it erodes our sense of trust. I sort of resent that when I get an e-mail from someone, I have to spend a tiny bit of energy, for the first few lines at least, deciding that it's actually them talking to me and their account hasn't been hijacked. And so in some sense we go through the 21st century now with our guards up; basically all communication, at least all digital communication, has become a kind of Turing test. When I want to e-mail someone a link and I want them to trust it enough to click on it, I not only have to send it to them, but I have to act like myself in some way. I have to say some sort of very-Brian thing so that they'll trust it's really me sending it. So we're starting to see that even though these programs are not actually passing the Turing test,
they're actually putting this pressure on us to actively be ourselves and assert our personality. And it's not only a question of etiquette, which I suppose it's always been; it's now actually a part of online security to act like yourself. And that, for me, is extremely interesting. So that's part of it. And I think the other part of it is, I'd like to think in this particular example that Dr. Epstein learned his lesson, that something occurred to him about why it took him four months to realize that there was no actual human exchange occurring between him and Ivana. And I'd like to think that in his next relationship, which is hopefully not only with a flesh-and-blood Homo sapiens, but someone closer than 11 time zones away, whatever lessons he learned will mean that his now-actual girlfriend has Ivana, in some strange way, to thank.

So spam, and the proliferation of these kinds of dastardly online programs trying to make people fall for them, indicates this way in which the Turing test, which used to be this purely abstract, academic, philosophical thing, is actually creeping into daily life. Increasingly, we're seeing customer service being outsourced to chat bots. If you go to IKEA, they have Anna, a chat bot who will direct you to the curtains department. Alaska Airlines has a chat bot, and Continental Airlines, and the Army has a chat bot called Sergeant Star. And I enjoy asking him about a bunch of classified things and getting his formulaic: I can neither confirm nor deny the existence of Delta Force. But what we're also seeing at the same time, and I think part of what the Turing test really makes us ask, is: how much of a difference is there between customer service that's actually being handled by a chat bot and customer service that's being handled by real people basically reading from a script, reading a set of prepared solutions, or reading off of an FAQ, or stepping someone through this very rigid protocol? They are not giving Turing-test-passing customer service in that case. So the lines between people and machines can sometimes blur when we're operating in this very rigid way.

I think partly, to back up and widen the lens a little bit, partly to understand the anxiety that people have about their relationship to computers, it's worth noting that computers used to be human. I don't know how familiar this is to most of the people in the room, but back before the word "computer" meant one of these digital devices in our pockets or on our laps, it meant something else entirely. It was a job description. So engineers fell in love with computers all the time. It was totally commonplace. And in fact it's really fascinating, if you go back and look at these really early papers by Turing and others in the '40s and '50s, where they're trying to explain to their readers exactly what these contraptions are, they'll say things like: well, you know, it's kind of like a computer. And what they mean is, it's sort of like a person whose job it is to do math. Part of what I find really fascinating about the past few decades is that it is the digital computer that has become not only the default sense of the word, but the literal one. To call a human a computer is now the metaphor, whereas to call a machine a computer used to be the metaphor. Now it's switched. Now it's the human math whiz that's like a computer. But the whole point was that the computer was modeled on him. So we're in the strange position of actually imitating our old imitators. We are like what used to be like us.
The Harvard psychologist Daniel Gilbert says that every psychologist must, at some point in his or her career, write a version of what he calls The Sentence. And The Sentence reads: the human being is the only animal that blank. So humans' sense of self, you could say, is the story of these failed, debunked versions of The Sentence. We once thought humans were unique for being able to use syntax. That's being chipped away. We thought humans were unique for using tools. That's being chipped away. And perpetually these psychologists are trying to come up with new formulations. But what I think is really interesting is, again, it's not animals that we're really comparing ourselves to anymore. Aristotle and Plato and Descartes were incredibly impressed with our ability to remember facts and think in this really rigid, abstract way and do things like long division. We are not really impressed by that capacity anymore. We're constantly talking about how bad we are at math and how bad our memories are. So this yardstick for what it meant to be human has now changed. We have this different yardstick. So you can of course ask: is it appropriate, first of all, for our sense of self to be reactive to other things in the world, to define ourselves purely in relationship to these other things that we don't fully have control over? Are we in this constant process of backing off of our claims to what makes us special?

Part of what I think is interesting about the Turing test in particular is that it's not only a benchmark for machine intelligence; it cuts both ways. The Oxford philosopher John Lucas says that when machines pass the Turing test, it's not going to be because they're so intelligent, it's going to be because we're becoming so wooden. And that's what I find really compelling about it. If we get to this position where these machines' conversations are indistinguishable from ours, to what extent is that an achievement on the part of the machines, and to what extent is it an indictment of us? That's part of what I was really engaged by as I went through the process of trying to prepare for the test: what can it teach me, at a practical level, about just how to talk to people? And that's what I found really rewarding about it. I think the Turing test's ultimate questions are practical questions: what does it mean to communicate with someone as effectively as possible, given the parameters of time and language and medium? And what is it that we're particularly good at, and how do we emphasize that?

And as you look at the history of chat bots, you start to get these really interesting answers. So I'll give you, for example, one of my favorite transcripts. This is from the 1997 Loebner Prize competition. I'll use the convention of calling the judge "judge" and calling the thing or person at the other end "remote," just so that I don't have to commit to a word that's either like person or machine.

Remote: Hi, my name's Catherine, what's yours?
Judge: My name is Elliott.
Remote: Well, it's nice to meet you, Elliott. Did you see that story on CNN last night about the lesbian couple who came out at a White House party on Sunday?
Judge: Yes, I did. I think it may all be a publicity stunt for Ellen.
Remote: Ellen DeGeneres was one of them; she was kissing her lover in front of Bill Clinton. Do you think he was embarrassed?
Judge: I think he wants the gay vote. He will do anything to get it and keep it. Does not have a conservative base.
Remote: Are you still a friend of Bill Clinton?
Judge: I'm not sure he has any real friends. He's achieved a lot, left a lot of people behind.
Remote: You really don't like the guy? Well, at least that's definite. I thought most people were just indifferent and numb by now.

And the conversation continues. I have to say, having gone over the history of the Turing test, this is up there as probably one of the most trenchant, focused conversations that's ever been had in the test, which is why it's so startling that the remote in this case is a computer. And that really threw me for a loop, until I read the computer's conversations with the other judges. What I discovered was that it was basically this very simple gimmick. It had this political conversation kind of mapped out, and basically whatever you said, it would just say the next thing. So there was this one judge that got very impatient, and the program asks him, what do you think of Bill Clinton, and he says, well, all I know is I really like pancakes. And the computer just continues on as though nothing has happened: well, I really think he needs the conservative vote, blah, blah, blah.

And I think that really cuts to one of the key things about the Turing test. It's not merely a demonstration of the sophistication of the computer's behavior; no static demo is going to be sufficient. It only really matters once you start to interact with it and see how it responds to you. And to my mind, at least, there's a nice parallel with our sense of what human intelligence is all about. We tend not to think of ourselves as having this massive raw-horsepower brain, but this really nimble, agile, responsive, reactive type of intelligence. And to some extent I think you see the same thing in politics itself, where we're not that impressed with a candidate's prepared remarks. We sort of withhold judgment to see how they can field the Q&A. If this talk were a sort of Turing test, you could ask yourself, you know, is this really the author, or is this just some handsome stage actor that's been hired to deliver the remarks? And you won't really know that answer until you ask me questions at the end. So ask me good stuff so I can prove that I'm really the author.

And I think there's an interesting parallel, for me, in the recent history of consumer computing, where Moore's law is more or less continuing. We're creating smaller and smaller processors, denser transistors; all of this is going on as it did in the '90s. But consumers, generally speaking, are less excited about getting the absolute cutting edge in power than they are about getting something with a better interface. So a lot of the big breakthroughs have been going from desktops to laptops, or smartphones, or tablet computing. We're really excited about those breakthroughs, and these are not the cutting edge in terms of speed, memory, horsepower. But they present something that's more reactive, more nimble, where if you have to spend a lot of time using the interface, typing in commands, that sort of thing, the fact that the processor speed is so great behind the scenes is of little importance compared to if you can just make these really natural gestures and get it to do what you want. And I think a really great recent example of that is the Kinect.
It represents, you know, not this giant, hulking, super-powered processing thing; more importantly, it represents a new way of interacting with computers, of them being sensitive to what you're doing and responding to you in a natural, fluid way.

I think that's a paradigm shift that's also been happening in theoretical computer science, where in the 1960s the big paradigm was what's called computability theory. The basic mantra of computability theory is: we don't really care how long it takes to do the computation; we just want to know if it's possible or not. The idea is to produce correct answers; producing them in a timely fashion would be great, but it's not that important. Whereas real-life situations, if you think about being in a conversation with someone, are much more exactly the opposite: produce a timely answer, correct if possible. And the more correct the better, but timeliness is the crucial thing. And that was one of these paradigm shifts, going from computability theory to complexity theory.

You see the same thing in linguistics. Noam Chomsky's theories in the '50s and '60s are all about: I don't care about shifts in attention, I don't care about negotiating who is talking and who is listening, or people finishing each other's sentences, or struggling to find a word. I'm not interested in the performance of language; I'm interested in this sort of Platonic ideal of sentence grammar. And it's that same thing of: as correct as possible, but let's set aside these performance concerns. And there's this paradigm shift in linguistics in the '80s and '90s of becoming, in fact, very interested in just those sorts of things. How, in a conversation, do we negotiate whose turn it is? How is it that we give these yeahs and uh-huhs and actively shape what the person is saying just through the act of listening? How is it we sometimes find ourselves in conversations where there are no actual turns being taken? There's just this steady stream of language that everyone's sort of chipping in on. You see it most commonly with humor, where someone makes a joke, someone will put a twist on the joke, someone will then replace a word and add to that, and you're all just building this thing together. It's completely far off from the more call-and-response, Q&A, walkie-talkie style of the theoretical model.

So, as an aside, I think this is one of the great dangers of cell phones: in the landline era it was not always easy to reach someone, but when you reached them, you were interacting in a real low-latency, low-lag kind of way. Landline lag is about a hundred milliseconds, a tenth of a second. And typical cell phone lag is about six times that. And we're really excited by the fact that we can reach each other all the time. But I think what we're slightly less aware of is what we have to sacrifice in order to do that, which is precisely this really delicate ballet of timing that makes some of these great moments of humor and spontaneity possible.

And I think looking at chat bots, as one example of these artificial approximations of conversation, gives us a window onto that and makes us more aware of this quality of natural interaction that we hadn't really been that impressed with or focused on previously. And this was one of the things that I tried to do in the Turing test: I tried as much as possible to break down the Q&A process.
So a typical Turing test conversation looks a little bit like a deposition, where the judge is saying: how many legs does an ant have, enter; and you type six, enter. Okay, you know, who is Mozart? And part of my goal was, I looked at that model and I said: you know, we're giving the computers way too easy a time. Because part of the nuance of negotiating normal conversation is figuring out when to speak, when to try to get a word in edgewise, when to wrap up what you're saying because you've sensed that they're getting bored, or you sense they're about to interject so you quickly try to wrap up so you can yield the floor. There are incredibly complicated things going on when we're doing that. I didn't want the computers to get off the hook easily. So that was my goal: to be as fluidly interchanging as possible. The Loebner Prize protocols make it slightly easier, because you can see someone typing in real time. You can see their characters appearing on the screen. You can see them backspacing, changing their mind. And I tried to use that as much as possible. So if someone started to type, hey, how is it going, but made a typo and started backspacing, instead of just waiting patiently, I would try to save them the awkwardness and just be like: I'm good; yes, I know what you're typing, and the answer is I'm good. And the same thing happens in spoken communication. Someone falters, they're looking for a word, and we swoop in and gracefully take the mic.

I think at a practical level, for those of us that use things like Gchat and instant messaging, one of the little points of advice that I've gleaned from participating in the Turing test is: hit enter way more than you think you would need to. Because they can't see your sentence as you're building it. So instead of carefully manicuring and editing a sentence and then sending it, type the first three words, hit enter while you are typing the next several words, so that they can see the thought that you're trying to make. If they know where you're going with it, they'll just answer. Or if they think you're missing their point, they'll just interrupt you. These are some of the great graces of natural conversation. So we can do things to try to encourage that happening in these artificial mediums.

One of the things that we saw with this Catherine program is that it's extremely impressive in a very, very limited domain. If you have this one particular conversation, it seems great. But other than that, it's not going anywhere. Typically, this is a trade-off that bot programmers have to make, between the range of behavior that they can produce and the consistency or coherence of that behavior. So recently a number of bots have been put onto the Internet, and basically what they do is they have all these conversations with real users that log in, and they're kind of scraping all the data. So they'll say: howdy, what's up? And you will type: oh, nothing much. And the computer kind of files it away in its giant Rolodex: okay, I have noticed that humans like to say "nothing much" when you ask them what's up. And over millions of these conversations, it's got this massive archive of human-like things to say, which can be actually extremely impressive. So, for example, I started typing in French to see if it would be able to react to me in French. I found not only did it understand my French, it corrected my French. It said: no, that's not how you would say that; we conjugate it this way.
Which implies that someone else had made that very same mistake. You can start typing in Bohemian Rhapsody lyrics, and you'll find it sings along with you. You type in Scaramouche, Scaramouche, and it types back: will you do the fandango. And it's really uncanny. The range of behavior it's able to access is extremely uncanny. But you start to find it out, not so much by realizing that you're not talking to something human, as by realizing that you're not talking to a human, one single person. So if you ask it where it lives, it will say central London. You'll say: have you ever been to Disney World? It will say: no, because I live just down the street. It doesn't make sense, because it's cobbling together different pieces of previous conversations. As long as you ask questions with a right answer, it will generally give you the right answer. But if you ask questions that require a particular embodied life experience, you start to realize that it's just this giant kaleidoscope of all these different people. And so that was something I tried, as much as possible, to be attuned to. If they would ask me a factual question like, what's the weather like today, instead of just saying it's rainy, I would say: well, it's rainy, but you know, I'm from Seattle, so that's pretty much par for the course. Whatever they would ask me, I would try to embed it in this life narrative, to give them a sense that there was a reason I knew everything I knew, there was a reason I didn't know everything I didn't know, and it all traced back to this life history.

Another big part of a program like Cleverbot, which is scraping together all of these conversations, is a principle called The Book. And The Book comes from the theory of games like checkers and chess. It's this idea that because every game of checkers or chess starts in exactly the same position, there's a lot of theory generated over what are the appropriate moves in that position. And because, in the case of chess, there are only 20 starting moves and only 20 replies, it takes a while before the combinatorics explode and you find yourself in a position you've never played before. So at the highest level of play, the game is not even thought to begin until what's called the opening novelty, which is the first move that gets the game out of the book, out of the realm of positions that have been played by other top players. And the struggle to get out of book is actually one of the big struggles of high-level play. The computer has certainly made the problem worse, because players now have access to giant databases of every grandmaster game ever. But it's a problem that plagues a lot of high-level games.

So, for example, one of the anecdotes that I raise in the book is that checkers died in 1863. The World Checkers Championship was being held in Glasgow, Scotland, between Robert Martins and James Wyllie, and they were to play a 40-game series over a span of a month. And 21 of those 40 games were the exact same game, start to finish. There was such a body of theoretical knowledge about which were the best moves, and the stakes were too high for either player to play inferior moves merely to get the other person into a fresh position, that they just stuck to the script and kept playing the same game over and over. And it basically threatened to kill checkers. You can imagine that the sponsors were not thrilled at the outcome.
And so part of the reaction on the part of the professional checkers community was: if you don't like how the players are opening their games, open the games for them. And so they adopted this thing that became known as the three-move restriction, which is: you sit the two players down, you draw moves out of a hat, make a bunch of random moves at the board, and let them play. It's basically a way of strong-arming them out of the book.

So for me there's this great conversational analog to the book. Cleverbot, for example, is comprised of about ten million previous conversations. So as long as you're saying something that has been said in a corpus of ten million conversations, you're on pretty solid turf. The computer is going to have a response ready. Of course, one of the beautiful things about language is that it's really easy to utter a sentence that has never been formulated before. The words "pearlescent" and "dudes" have probably never been put into a sentence before. But, of course, we don't do this. Etiquette and the ritual of society give us a kind of conversational book. Hi, how are you? I'm good. How are you? How was your drive? It was not bad. A little bit of traffic. How was yours? And we spend a lot of time in this kind of conversational book, these questions that we know to ask and answers that we know to give. And so when I was in the Turing test, a big challenge for me became: how do you get out of book in five minutes of IMing without seeming like you are trying so hard that it starts to backfire? So the judge would come in and say, hi, how's it going, and I would immediately get nervous. It's like: how's it going, the classic conversational opener; we need to get out of here as quickly as possible. We need to find some way to have an argument about The Rolling Stones, or decide whether we'd rather be a whale or a squid, just something that's never happened before in this giant database. How are we going to get there? And for me, it's one of the great parallels between these sorts of formal games and the thing that happens when you meet an old friend over coffee: how do we plow through these formalities and patterns and get through to the place where you're actively struggling to put words together, actively trying to figure out what you want to say? And with chat bots, that's how you break them. And with people, that's how you come to really know them.

So for me, you know, the computer, in this broad philosophical sense, is the most significant chapter in this really long philosophical history of: who are we? What are we all about? Can we look to the external environment for clues about what makes us different from everything we're experiencing? And you know, the computer has come along and, I think, corrected some long-standing errors in the history of philosophy. If you go back to Aristotle and Descartes, they're extremely dismissive of animal cognition, because their position is, you know, only humans have souls. So the argument goes: okay, animals can navigate physical space, they can recognize objects, they can recognize each other, they can form these social groups, they can achieve goals, they can do certain rudimentary types of problem solving; all of that must be totally unimportant and unimpressive. Let's pay attention to this very narrowly defined set of abilities.
We have the ability to think abstractly and in this very rigid, step-by-step, logical way where we're making deductions; we can do mathematics: all the things that, of course, the computer came along in the 20th century and proved itself not only capable of, but much more capable of than ourselves. And I think that really hit Aristotle and Descartes where it hurts. So it throws us back on the question of: what are the impressive things about ourselves? And I think AI is coming back to us with all these really surprising verdicts. Who would have thought, for example, that computers would be landing planes before they'd be riding bikes, whereas for us it's exactly the other way around? That computers are translating the minutes of the United Nations, but they still have a really difficult time being presented with a Polaroid of a cup and saying, that's a cup, which any 5-year-old can do? I think the progress, but also the obstacles, of AI have given us this very surprising and fascinating window onto what is impressive about ourselves, perhaps more so than we thought. Some of what we thought was easy is actually hard, and some of what we thought was hard is actually easy.

And the big bottom line for me is that there's this tremendous unsung complexity to the activities of daily life. You're walking down the street, you're dodging pedestrians, you see someone that's vaguely familiar and you say, that's my buddy Bob from college: like, Bob, what's been going on in the past ten years? And he's in the strange position of having to compress ten years of experience into one sentence. We typically don't think of these things as major cognitive achievements as compared to factoring giant numbers. In fact, they are. In fact, the mundane kinds of daily activities are where a lot of the real complexity of human experience is. I think the Turing test is a real window into that in the verbal, communicative sphere: it's giving us a sense of what really is going on when we talk, what are the things that are truly interesting and subtle and impressive. And it's not just a pat on the back. It is that to some extent, but it's also a call to action, because we don't always know what these things are. We don't always manage to get out of book, even despite our best efforts. And so it's an occasion not just to celebrate these qualities, but to actively pursue them. So we build these things in our own image, but the approximation always leaves this gap. And that gap then always has something new to teach us about who we are. So I'll stop there and take questions if you guys have any. [applause]

>>: I can see why you won the Most Human Human prize.

>> Brian Christian: Thanks.

>>: One of the important points in your description of the Turing test as you went through it in England was the time allowed for this interaction: five minutes. And you said, obviously, a shorter test makes it a lot easier for a computer to win than a longer one. Can you just talk about the function of time? What is it that's going on that makes that important?

>> Brian Christian: I mean, intuitively, if the Turing test were two seconds long, it would be pretty easy for the computers to pass themselves off. They'd say hi, end of test, and you're left with nothing to go on. And I think that we can pretty much intuit that if the Turing test were eight hours long, there would be plenty of opportunities to try all sorts of different things, and no program would possibly hold up to that kind of test.
So you can make the argument that this five-minute mark is this very arbitrary thing; that instead of looking at what percentage of judges machines can fool in five minutes, we should go exactly the other direction and look at what duration of time is the appropriate window in which computers can fool us, and watch that window get larger and larger. So that was one of the things that I was aware of: the test is to some extent a race against the clock, and so amongst my more nuanced strategies there was also the very simple strategy of just typing as fast as I possibly could.

>>: So, on typing: obviously humans make many spelling errors, and you said that the judges could actually see each character as it hit the screen. So wasn't that kind of a dead giveaway, or were the computers programmed to do that as well?

>> Brian Christian: In fact, the programs are designed to fake typos, which -- this is one of the criticisms that people level against the Turing test: it's of no use to computer science that we're spending man-hours faking typos. So, touché. But I, for example, was aware of my typos partially originating from the British keyboard layout. So the at sign and quotation mark were reversed. So whenever I made that particular typo, I deliberately left it in rather than backspacing it, and would say something like: argh, UK keyboard. Such that my typo was not simply noise in the signal, but was actually an artifact of muscle memory operating on a different keyboard layout. So my hope was that that was a more nuanced type of typo. But I don't know if that worked or not.

>>: I have a one-word question for you: Watson.

>> Brian Christian: Okay. I may need a few more words to answer you. So, the Watson supercomputer -- you guys probably all know this -- IBM developed the supercomputer that beat the two big human Jeopardy champions. I infer from your question that you are curious: is this like a big deal or not?

>>: Just wanted to know what your response to it is.

>> Brian Christian: So I think it's part of this great, counterintuitive narrative of the development of AI: when we look at someone like Ken Jennings or Brad Rutter, we're impressed by how many facts they know. We're not impressed or excited by how many facts exist on Wikipedia or something like that. And, in fact, the really amusing thing is that for IBM, the challenge was not encoding, tagging, and then accessing all world knowledge; it was: how are we going to deal with Alex Trebek's puns? How are we going to handle figurative language and these oblique clues that don't fully articulate what they're getting at, but any human can understand? That was basically the entire challenge for them; there was this giant pun team. And for me that's one of these great, life-affirming results: that these things that go totally underappreciated in fact turn out to be completely challenging. Yes?

>>: Have you seen the new androids that have been built that basically look human? I know there's one in Denmark that we've seen on YouTube, where basically this professor built this robot to look like himself, with the mimicking of the facial expressions and everything else.
Just wondering what your reaction is to that, putting that together with the communication and bots.

>> Brian Christian: Yes, there's definitely -- we're starting to come out of the uncanny valley in terms of these android systems, where they are no longer looking like creepy corpses but are actually looking like slightly animate people. And in fact Hugh Loebner has set aside a special $100,000 gold-medal award for the first system that's able to pass the Turing test over video. So if you're, like, Skype videoing. I don't think we're going to be seeing that in the near future. But it's interesting to think about. Also, you know, it's one thing to create a really impressive wax statue, and it's another thing to create a really impressive robot where you hit the smile button and it smiles. But you can imagine almost a purely gestural Turing test, where, like with an infant, the way parents communicate with infants: you make a face; what face does it make back? That would be a whole other level of -- I don't know, it's interesting to me to speculate what the different thresholds will be.

>>: Do you think we are not being too tough on the computers? If you look at the Turing test, the judges are really very smart people, scientists. If we put the computers in front of an average representation of the population, they would have a far better chance of passing the test. Talking about Watson, I think it caught a lot of flak because it said that Toronto was in the States -- but if you go out on the street and you ask someone about Toronto, is it in the States, I'm sure that you're going to get a significant percentage of the population that would say: yeah, yeah, it's in the States, of course. So are we not, in fact, holding the computers to a bar which is higher than what we would expect for your average human?

>> Brian Christian: Yeah, I think that's a great question. So one of the AI jokes back in the '60s was a professor claiming that he had invented a program that passes the Turing test by perfectly mimicking a catatonic person. It sits there and it doesn't type, just like a catatonic person would sit there and not type. So part of what the Turing test hinges on is: which people are we comparing computers to? I think, speaking to the Toronto question, this is actually one of the really interesting things: the category is U.S. Cities, it answers Toronto, and we all kind of guffaw. In fact, there are several U.S. cities called Toronto; they're just really small rural cities that we automatically dismiss. The smugness of being able to guffaw, because Watson is obviously not referring to Toronto, Kansas, population 300, is itself the disambiguation of language at work. It's a great example. So for us to even make fun of it kind of hints at our ability to disambiguate language in this subtle way.

>>: I have kind of a bookend to that question. If I took a test and I only had to get a 30, I would think, gee, this is easy. But where did that percentage come from?

>> Brian Christian: The percentage was, as far as I know, totally pulled out of thin air by Alan Turing. What Alan Turing is saying in his original paper is not that as soon as a program hits this mark it's going to be magically endowed with intelligence; he's really talking about popular opinion.
So he's saying: I think by the year 2000, computers will be fooling about 30 percent of the people, and as a result there will be this kind of [inaudible] of talking about them like they're intelligent. And since he's Alan Turing, the grandfather of a lot of computer science, those particular numbers have been really carved into stone as these magic numbers. But he's really just talking about a paradigm shift.

But, actually, on your comment of only needing to get a 30 percent on a test: one of the best tests that I ever took when I was in school is this test called the American Invitational Mathematics Examination, the AIME. And to give you an example of how difficult this test is: it's pencil and paper only, it requires nothing more than like sophomore-level geometry and pre-algebra, it's three hours long, and the average score is one. One question correct. What I found really exhilarating is thinking about the difference between a test like that and a test like the SAT for American history, where there's a ton of questions -- John Adams, the XYZ Affair, such-and-such year -- and everyone goes down the list and checks off all these boxes. Part of the skill in the AIME is that you'll be happy to solve just one problem. So you look at this giant list and you think: okay, where am I going to attack? What seems doable? What seems easy? What seems really fun but maybe actually too hard? Here's a problem where I know that there's a brute-force solution if I sit here and calculate a bunch of multiplication on this piece of paper, but I'm pretty sure there's a more elegant solution -- will thinking up the elegant solution take more time than doing all the arithmetic? You have to make all of these decisions about priorities. And in that sense I think it represents a much more human version of a math test: rather than, here's a ton of problems, do them all and I'll expect 100 percent, it's, here's a bunch of extremely tough problems, find one that seems maybe doable. I think that's much closer to what it's like to work in the real world. So I get excited by these deliberately impossible tests.

>>: I have, like, two questions. The first one was: you were talking about how like a [inaudible] is what makes the human unique compared to the computer. I'm just curious, coming from a standpoint where [inaudible] user tries to do things where we create personas to design by: what's to say you just can't give a computer a persona to live by, to almost imitate a life, like you were saying?

>> Brian Christian: Yeah, I don't think that there's a reason it can't be done. I think if we develop systems that are truly Turing-test-passing, in this totally indistinguishable way, part of it will be that you have to -- instead of training them on a giant dataset, you've trained them by just, like, putting them in a carrier and walking around the world for ten years. You know, everything we know is a product of this continuous experience. So, yeah, I think it's totally possible. The question is: can you marry these two approaches, the more authored bots and the more, sort of, wiki bots? Is there a way to get the best of both worlds, where you're crowdsourcing, collaborating, but there's also a coherence? I think that becomes one of the big challenges, and I think it's an exciting one.
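As a concrete illustration of the two approaches being contrasted here -- the authored, Catherine-style script that says its next line no matter what, and the Cleverbot-style corpus of replies scraped from prior conversations -- here is a minimal sketch in Python. It is a toy under stated assumptions: the SCRIPT and CORPUS data are invented stand-ins, and the "marriage" is just the obvious fallback from script to corpus, not any contestant's real code.

```python
# A toy sketch of the two bot-building strategies discussed in the talk,
# plus a naive "best of both worlds" fallback. SCRIPT and CORPUS are
# invented stand-ins, not data from any real system.
import difflib

# Authored approach: a conversation "mapped out" in advance. Whatever the
# judge types, the bot just says the next line (the Catherine gimmick).
SCRIPT = [
    "Hi, my name's Catherine, what's yours?",
    "Did you see that story on CNN last night?",
    "Do you think the President was embarrassed?",
]

# Wiki approach: replies harvested from millions of prior conversations,
# keyed by what a human once said (the Cleverbot "giant Rolodex").
CORPUS = {
    "what's up?": "oh, nothing much.",
    "have you ever been to disney world?": "no, I live just down the street.",
    "scaramouche, scaramouche": "will you do the fandango",
}


def scripted_reply(turn: int):
    """Return the next scripted line, ignoring the input entirely."""
    return SCRIPT[turn] if turn < len(SCRIPT) else None


def retrieved_reply(text: str):
    """Return whatever a human once said after the closest-matching prompt."""
    match = difflib.get_close_matches(text.lower(), list(CORPUS), n=1, cutoff=0.5)
    return CORPUS[match[0]] if match else None


def hybrid_reply(turn: int, text: str) -> str:
    """Authored persona first (coherence), scraped corpus second (range)."""
    return scripted_reply(turn) or retrieved_reply(text) or "Hmm, tell me more."
```

The trade-off described in the talk falls straight out of this structure: the script is coherent but exhausts itself in three turns, while the corpus has enormous range but no single embodied life behind it, which is exactly why a question like "have you ever been to Disney World?" breaks it.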
>>: So my other question was: there's been this interesting debate going on in the past couple of days. Bill Gates gave a talk saying we should reduce our investment in liberal arts degrees because they don't lead to job creation. And then a couple of days after, at the iPad 2 announcement, Steve Jobs came out and said that their products are the marriage of technology and the liberal arts. I noticed in your bio you have a dual degree in CS and philosophy, and a degree in poetry. Since you do seem to have degrees from both sides of the fence, where do you sit in this debate? Who do you agree or disagree with, and how does that affect the Most Human Human kind of idea?

>> Brian Christian: I don't necessarily want to take sides in the Gates-Jobs face-off. But I do think -- I mean, it strikes me that if you look at something like poetry, it seems like it would have no connection whatsoever to something like cybersecurity. But, in fact, as we were talking about at the beginning of the talk with e-mail spam, part of how we authenticate our e-mails is not in this content-based way of, here's my passcode, here's my secret PIN so you know it's a real e-mail from me; instead, I act like myself, or try to say something that seems very me. And it becomes a question of aesthetics. I think there's really a space where aesthetics and something like IT come together. And so, without knowing the full argument that Gates had in mind, I would like to argue that there's still a place for the arts and technology to come together.

>>: Two more.

>> Brian Christian: Two more.

>>: From your experience, if you were now made a judge, do you have a strategy that you would use, based on what you see as the current set of weaknesses?

>> Brian Christian: Typically there are two kinds of judges. There are the interrogators, who just barrel tough questions at you, and the small-talkers, who just kind of say, hey, what's up, and see where you go with that. It's also interesting, if you go back over the history of the test, there were really two eras of the Loebner Prize. The first five years, it was done with specific topics. You'd sit down at a terminal and it would give you a topic: talk about the NHL. They had to throw that out, partially because there was no rigorous way to define topic. If I start talking about, like, the Cold War context of the 1980 Olympic hockey match between the U.S. and the USSR, are we still talking about hockey, or are we now talking about politics? They had these judges who would declare certain remarks out of bounds, but there was no rigorous way to do it. So they scrapped it in favor of this totally wide-open test. And in theory that makes the test infinitely harder for the programmers, because they have to prepare for everything. In practice, it actually made it easier, because they mostly had to prepare for small talk. So I think that my strategy as a judge would be something where I would go in with a totally random topic, but at the same time allow someone to figure out what they wanted to say about it -- where I was getting them out of book, but also giving them enough breathing room not to merely answer questions with facts, but to see how their mind kind of wandered through a specific domain. I think there's a place for that.
>>: You mentioned the challenge that Trebek's puns caused for the IBM scientists. Have any of the Turing judges ever focused on humor as a way to distinguish?

>> Brian Christian: Yeah. Humor has been one of those things, like typos, where very early programs were kind of preloaded with these jokes that they could crack. And it made a really big impression on a lot of the judges early on, because we think of humor as one of these characteristically human things. But to some extent -- there's a great quote by Salvador Dali that I love that says: the first person to compare a woman's cheeks to a rose was obviously a poet; the second person to make that comparison may well have been an idiot. To coin a joke requires a sense of humor, but to tell a joke doesn't necessarily. And so for that reason, I think, people who can draw on this reservoir of one-liners and cracks -- in real life I now find that slightly less impressive than the people who are operating in a more witty, wordplay style, where they'll take exactly what you said and torque it slightly, and you know you're getting an original joke. So I think that's the trick: it's not just a joke, but a completely context-sensitive, one-of-a-kind joke.

>> Heather Mitchell: Thank you.

>> Brian Christian: Thanks so much. [applause]