729G43 Artificial Intelligence
IDA, Department of Computer Science
Jasmina Jahic jasja310
2016-01-08

“I can’t believe it’s not human!”
Questioning if chatterbots are intelligent for managing to trick humans, with ELIZA and CleverBot as examples

Contents

Introduction
Background
  The Turing test
  Natural Language Processing
Overview
  ELIZA
  CleverBot
Behind the chatterbots
  ELIZA
  CleverBot
Reactions to ELIZA
Discussion
Conclusion
References

Introduction

We have always been fascinated by the idea of creating something that is just as intelligent as us humans, or maybe even more so. Perhaps we are playing God, perhaps we want to make life easier for ourselves, or perhaps we simply want to see how far we can go. Whatever the reason, we have indeed managed to create machines that, by communicating in natural language, have tricked humans into believing that they are conversing with another human. How did the machines manage to do that?

In this essay I will briefly introduce Natural Language Processing (NLP) and then compare the two chatterbots ELIZA and CleverBot, whose very essence comes from NLP and from our desire to create seemingly intelligent beings. I will cover their history and explain how these two chatterbots manage to trick a fair number of humans. Then I will discuss their differences and similarities, and conclude the essay with what the future might bring.

Background

The Turing test

Alan Turing was a mathematician and codebreaker. In 1950 he published the paper “Computing Machinery and Intelligence”, in which he publicly presented for the first time what is now known as the Turing test, and which he called “the imitation game”.
In this test a human judge holds one conversation with a machine and one with a human, preferably via typewriter. (Turing specifies that “machine” in this case means a “digital computer”; I will use the same terminology.) If the machine manages to “imitate” a human so well that the judge cannot tell which one is the machine and which one is the human, then the machine has passed the test. According to Turing, passing this test would deem a machine “intelligent” (Turing, 1950).

These days the test can be done with one human, acting as the judge, and one conversationalist. The judge does not know whether they are conversing with a machine or not. If the judge believes that their conversation partner is another human when it is in fact a machine, then the machine has passed the test. In general the machine needs to convince at least 50% of the humans it converses with to pass the Turing test (Moore, 1976). In his paper, Turing (1950) himself refers to this kind of test as a viva voce.

Natural Language Processing

Natural language processing covers everything that has to do with machines processing natural languages. It ranges from analysing and understanding written text syntactically, semantically or pragmatically, to generating comprehensible text of its own, and everything in between. Most NLP scenarios involve human-machine interaction, but since the field is so wide it also includes, for example, analyses of corpora (Lehnert & Ringle, 2014). The field has its roots in the 1950s, when the first machines were beginning to emerge.
Alan Turing had a part in this, not only through his contributions to early computer science but also through his “Turing test”, since the test essentially measures a machine’s ability to communicate properly and fluidly with a human being in a natural language. In many of its applications NLP is seen as a branch of artificial intelligence (AI). The field has evolved greatly with the help of other fields in AI, resulting in models and techniques such as machine translation, question answering, statistical natural language processing and variations of Markov models (Lehnert & Ringle, 2014).

Overview

ELIZA

Created by Joseph Weizenbaum in 1964-1966 at the Massachusetts Institute of Technology (MIT), ELIZA is one of the earliest chatterbots in existence. Originally written in MAD-Slip, the program has since been rewritten in several programming languages, including Lisp, BASIC and Python. The code that has been used and edited the most is in fact not Weizenbaum’s, but Jeff Shrager’s BASIC version from 1973. Shrager based it on the 1966 Lisp version by Weizenbaum’s colleague Bernie Cosell, since the original code was never released to the public. The conversion to BASIC was motivated by the arrival of the first personal computers during the 1970s, at a time when more people knew about and dabbled in BASIC. It simply made ELIZA more accessible (Shrager, The Genealogy of ELIZA, 2015).

ELIZA is the actual program, and the original script used with it is called DOCTOR. The script contains all the keywords, conjugations and replies that the program uses when transforming the user’s input into responses (Shrager, ELIZA, 1984).

CleverBot

Rollo Carpenter is the father of CleverBot, having launched it on the World Wide Web in 1997. It has been active ever since, and in February 2014 it had over 170 million lines of conversation stored in its database.
This means that the CleverBot algorithm works with Big Data. Since CleverBot is a web application, people are constantly interacting with it and contributing to its gigantic database, and the algorithm is constantly checking user inputs against that database using fuzzy string similarity (Casten, 2012). Carpenter and his team, Existor, are continuously working on both CleverBot and their other chatterbots, such as Evie and Jabberwacky. To handle the massive amount of data in a time- and cost-effective way, they had to get creative (Existor, 2014). More on this under Behind the chatterbots, CleverBot.

Behind the chatterbots

ELIZA

ELIZA is a rule-based program, coded as an “if-then” model, which can be visualised as a binary tree. Its database (that is, the DOCTOR script) is made up of 36 keywords, 12 conjugations and 112 replies. When ELIZA receives an input, it scans the string and first removes any apostrophes. Then it checks for keywords. After it has found a keyword, it keeps scanning the input to see if other keywords are present. If there are, ELIZA chooses the keyword with the highest priority. This is possible since every keyword is “ranked”, that is, has a set weight in the script. ELIZA also keeps track of delimiters, namely commas and periods. If a keyword is found before a delimiter, that keyword is chosen and all text after the delimiter is deleted. If no keyword is found before the delimiter, ELIZA deletes that text together with the delimiter and continues to scan the rest of the input.

Each keyword has a set number of replies connected to it, and ELIZA follows a set of rules to choose among them. Some replies have an asterisk at the end of the string; in that case ELIZA appends part of the user’s input to make the language more natural and to make it seem as if it is listening to the user (see images 1a and 1b). There is also a keyword labelled “NOKEYDETECTED”.
This keyword exists so that ELIZA can reply with something non-committal if no keywords are detected in the input (Weizenbaum, 1966; Shrager, ELIZA, 1984).

Image 1a. Note the asterisk at the end of line 1660. Compare the string with image 1b.

Image 1b. On the last line you can see how ELIZA combines a reply with something that the user has written.

ELIZA shuts down either when the user writes “goodbye” or when ELIZA detects the word “shut” in the input, which it interprets as “shut up” (this, however, differs from version to version). It also checks whether the user is repeating him/herself, in which case ELIZA responds with ‘PLEASE DO NOT REPEAT YOURSELF’. To make the language more natural, ELIZA also conjugates a set of 12 words, such as ‘YOU’ to ‘I’ and ‘ARE’ to ‘AM’ (Shrager, ELIZA, 1984).

The rules that the program uses to transform user inputs are decomposition and reassembly rules. It finds the right reply by using a “find the right reply” data table (see image 2). Each keyword has a number of replies that ELIZA can use, and ELIZA keeps track of them internally so as not to use the same phrase over and over again when encountering the same keyword. ELIZA also has a MEMORY function, which it uses to create an output containing part of a phrase that the user used earlier in the conversation. This makes it seem as if ELIZA is listening to the user and connecting the conversation to something that they talked about earlier (Weizenbaum, 1966).

Image 2. Every keyword is connected to a set number of possible replies.

A script for ELIZA can contain whatever keywords, conjugations and replies one sees fit, as long as they are properly connected to each other. This is because the scripts are not part of the actual program per se, so their actual contents do not matter.
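
As a rough illustration of the mechanism described in this section, the following Python sketch picks the highest-ranked keyword before the first useful delimiter, cycles through that keyword's replies, and completes asterisk replies with the user's own conjugated words. It is only a sketch under stated assumptions: the tiny SCRIPT and CONJUGATIONS tables, the priorities, and the names pick_keyword and MiniEliza are made up for this example and are not taken from the DOCTOR script or from any of the historical ELIZA versions. The MEMORY function and the repetition check are left out.

```python
import re

# Hypothetical miniature script, NOT the real DOCTOR script.
# keyword -> (priority, replies); a higher priority wins.
# A reply ending in "*" is completed with the user's own words.
SCRIPT = {
    "MOTHER": (3, ["TELL ME MORE ABOUT YOUR FAMILY."]),
    "I AM": (2, ["HOW LONG HAVE YOU BEEN*", "WHY DO YOU SAY YOU ARE*"]),
    "YOU": (1, ["WE WERE DISCUSSING YOU, NOT ME."]),
    "NOKEYDETECTED": (0, ["PLEASE GO ON.", "I SEE."]),
}

# Hypothetical conjugation pairs applied to the echoed part of the input.
CONJUGATIONS = {"I": "YOU", "YOU": "I", "AM": "ARE", "ARE": "AM",
                "MY": "YOUR", "YOUR": "MY"}


def pick_keyword(text):
    """Return (keyword, text after the keyword) for the first segment that has one."""
    # Upper-case the input and drop apostrophes; as a simplification,
    # every other character except letters, spaces, commas and periods goes too.
    cleaned = re.sub(r"[^A-Z ,.]", "", text.upper())
    for segment in re.split(r"[.,]", cleaned):
        padded = " " + " ".join(segment.split()) + " "
        found = [k for k in SCRIPT
                 if k != "NOKEYDETECTED" and f" {k} " in padded]
        if found:
            # Several keywords before the delimiter: the highest rank wins,
            # and everything after the delimiter is ignored.
            best = max(found, key=lambda k: SCRIPT[k][0])
            return best, padded.split(f" {best} ", 1)[1].strip()
        # No keyword in this segment: discard it and keep scanning.
    return "NOKEYDETECTED", ""


def conjugate(words):
    """Swap first and second person in the echoed words."""
    return " ".join(CONJUGATIONS.get(w, w) for w in words.split())


class MiniEliza:
    def __init__(self):
        # Remember which reply comes next for each keyword, to avoid repeats.
        self.next_reply = {k: 0 for k in SCRIPT}

    def respond(self, text):
        keyword, remainder = pick_keyword(text)
        replies = SCRIPT[keyword][1]
        reply = replies[self.next_reply[keyword] % len(replies)]
        self.next_reply[keyword] += 1
        if reply.endswith("*"):
            return (reply[:-1] + " " + conjugate(remainder)).strip() + "?"
        return reply


if __name__ == "__main__":
    bot = MiniEliza()
    for line in ["I am sad about my job", "I am tired, you know",
                 "Well, my mother and you argue.", "Nice weather today"]:
        print(">", line)
        print(bot.respond(line))
```

With this toy script, “I am sad about my job” is answered with “HOW LONG HAVE YOU BEEN SAD ABOUT YOUR JOB?”, and a second sentence containing “I am” gets the next reply in the list rather than the same one. Nothing in the code knows what a job or a mother is, which is exactly the point made about ELIZA.
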
However, problems can arise, since in most subjects one needs external knowledge and facts to be able to hold a proper conversation. It is thanks to the very nature of psychiatric interviews that there is no need to store information about the world in ELIZA, since the conversations are purely focused on the individual. It is not strange to be asked to elaborate on seemingly obvious things, since psychotherapists want their patients to be introspective and want to hear their perception of the world (Weizenbaum, 1966). Basically, ELIZA uses your own words against you.

What makes ELIZA capable of holding a convincing conversation is not actually its programming on its own, but rather its structure together with how human thought processes work. Humans tend to assume many things. If someone is given more or less appropriate answers in a conversation, they assume that whatever they are talking with is indeed human, or at least as intelligent as one. However, if the responses keep being irrational and self-contradictory, people become suspicious and the chatterbot loses its credibility as a human being (Weizenbaum, 1966).

CleverBot

CleverBot is a web application that can interact with 100,000 users at once, and it keeps collecting the inputs from users to build its database. It then uses the phrases in the database to generate responses: it first takes the user’s phrase and searches its database for a perfect or similar match using string similarity. It then uses Markov models to find keywords and to predict which words could and should come before, between and after the keywords. Lastly, it responds to the user with the generated sentence (Casten, 2012). This means that CleverBot does not actually parse the input semantically. It handles the Big Data by using parallel processing on three graphics processing units (GPUs).
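
To make the matching and prediction steps described above more concrete, here is a minimal Python sketch of the two ideas: looking up the stored line that is most similar to the input, and counting which word tends to follow which (a first-order Markov model). Existor's actual algorithm and data are not public, so everything here is an assumption made for illustration: the four-line LOG stands in for the real database, Python's standard difflib module stands in for whatever similarity measure CleverBot uses, and the function names are hypothetical.

```python
import difflib
from collections import Counter, defaultdict

# Hypothetical toy log of (user line, reply that followed it).
# The real CleverBot database is far larger and is not public.
LOG = [
    ("hello there", "hi, how are you?"),
    ("how are you", "i am fine, thanks."),
    ("what is your name", "my name is cleverbot."),
    ("do you like music", "i like music a lot."),
]


def closest_reply(user_input, cutoff=0.6):
    """Return the reply stored for the most similar prompt, or None."""
    prompts = [prompt for prompt, _ in LOG]
    # get_close_matches ranks candidates by SequenceMatcher similarity
    # and drops everything scoring below the cutoff.
    match = difflib.get_close_matches(user_input.lower(), prompts,
                                      n=1, cutoff=cutoff)
    if not match:
        return None
    return dict(LOG)[match[0]]


def build_bigrams(lines):
    """Count, for every word, which words follow it in the stored lines."""
    table = defaultdict(Counter)
    for line in lines:
        words = line.split()
        for current, following in zip(words, words[1:]):
            table[current][following] += 1
    return table


def most_likely_next(table, word):
    """Return the most frequent follower of word, or None if unseen."""
    followers = table.get(word)
    return followers.most_common(1)[0][0] if followers else None


if __name__ == "__main__":
    print(closest_reply("How are you?"))
    table = build_bigrams(reply for _, reply in LOG)
    print(most_likely_next(table, "like"))
```

Even in this toy form, the input “helo there” still finds the reply stored for “hello there”, because the match is based on how similar the character strings are and not on what they mean. That is the sense in which such a system can answer plausibly without parsing the input semantically.
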
The GPUs that Existor decided to use are of the Nvidia GeForce GTX Titan model, with a memory capacity of 6 GB VRAM. To be able to use the full memory capacity, they had to use CUDA, Nvidia’s GPU programming platform, to create two separate applications that could each handle 3 GB and run simultaneously. With this technique CleverBot can make use of its entire database when generating replies, whereas the old, sequential model had to filter millions of lines to be able to work without crashing the server (Existor, 2014).

Recently Existor decided to upgrade CleverBot further by starting to implement machine learning techniques. This new version of CleverBot is not yet out, so its actual performance cannot be compared at this moment in time. However, they did publish an article about some things that they had learned about machine learning, namely recurrent neural networks (Tero, 2015). They do not mention whether this is the kind of neural network that they are going to implement, but whatever they decide to go with, it is bound to make CleverBot cleverer. It is possible that they are implementing machine learning in the hope of making CleverBot understand semantics as well, and/or of giving it the ability for deep learning (Existor, 2014).

Reactions to ELIZA

To pass a Turing test, a machine needs to trick, or convince, people that it is a human through the use of natural language. That much we have established. The interesting part, however, is that some of these people are adamant in their belief that they had communicated with a human being and not a machine (Weizenbaum, 1966). This happened with ELIZA, which is fascinating since it is built on simple grammatical pattern-matching rules. Would their belief be as strong if they had been told from the beginning that they were interacting with a machine? The actual conversations would be the same.
Weizenbaum (1966) does mention that, in his view, a big part of why people fall for ELIZA’s tricks is that humans tend to assume things. If they are not explicitly told that they are speaking to something non-human, they will assume that they are indeed communicating with another human being. After all, what else could they possibly be talking to?

Discussion

According to the online dictionary Merriam-Webster, artificial intelligence is defined as “a branch of computer science dealing with the simulation of intelligent behavior in computers”, while intelligence is defined as “the ability to learn or understand things or to deal with new or difficult situations” (both definitions retrieved 2016-01-08). Is it then safe to say that chatterbots that manage to pass the Turing test are intelligent? They do, after all, “manage to deal with trying situations”, meaning that they are able to communicate convincingly with humans, as if they were human themselves.

CleverBot, with its use of Markov models and its recent move towards machine learning, can arguably be seen as a powerful AI simply because of the techniques that are implemented. However, CleverBot, with its superior techniques and models, still does more or less the same thing that ELIZA does. CleverBot analyses the input from the user, searches its vast database for the exact same phrase or something similar to it, uses the Markov model to predict which words seem appropriate for building a sentence, and then responds with that sentence. Can it still be considered intelligent? Technically it is “dealing with a new or trying situation”, with “new” being every new conversation. However, CleverBot does not actually understand what it is doing, nor does it semantically or pragmatically understand the phrases in the user input.
The same goes for ELIZA, which is even simpler in design, with its constrained database made up of scripts. ELIZA more or less does what CleverBot does, but in another way. CleverBot scans an entire phrase and replies according to its database, whereas ELIZA looks for keywords and either chooses a reply from its script or, with the help of some grammatical rules, generates a response built from one of its script replies together with words that the user has used. ELIZA does not really understand what it is doing either; it is simply following its rules to complete its objective (which is partly to reply coherently, and partly to convince humans that they are talking to another human). Since ELIZA also reaches its objective and has also passed the Turing test, it would also be deemed “intelligent”.

I wonder if ELIZA and CleverBot can actually even be classified as AI. They do indeed meet the general criteria for an AI, but something that (at least in modern times) has been a defining feature of an AI is an internal representation of the world. Neither chatterbot has this, and while ELIZA technically does not need one to reach its goal, it is debatable whether CleverBot deserves the title of AI. Its ability to speedily match and generate phrases from its Big Data is commendable, but if it had an internal representation that it could somehow refer to when matching phrases, then perhaps it could give more context-related responses.

ELIZA would also benefit, in certain ways, if it could somehow remember the actual conversation that it is currently engaged in, to prevent repetition and to be able to refer to earlier phrases. ELIZA does in a way “remember” key phrases and uses this to heighten its own credibility, but this memory function is also merely rule-based. ELIZA would also need a much larger database if it remembered the conversation; otherwise it would still need to loop through the possible replies in its script.
CleverBot already has some sort of sensor for context built in, but I believe that it can be further improved, since from personal experience CleverBot still tends to change topics fairly often.

Conclusion

The Turing test might not be a sure-fire way to decide whether a machine is intelligent or not, but both ELIZA and CleverBot are seen as AI, meaning that they, in one way or another, manage to imitate human intelligence. However, the definition and the criteria of what makes an AI might well change in the near future, since it is a field that keeps growing and evolving, and we learn new things each day. One day, an agent that does not have an internal representation might simply be too dumb to be considered an AI.

These chatterbots are deemed intelligent, but whether they actually are is another question. At this point in time it is impossible for me to give a definite answer, since the very definition of intelligence is not set in stone. However, going by the general consensus on the definitions of AI and intelligence, you could say that chatterbots have the property of intelligence.

References

Casten, J. (2012). Cybernetic Revelation: Deconstructing Artificial Intelligence. Eugene, Oregon: Post Egoism Media.
Existor, L. (5 February 2014). Deep Context through Parallel Processing. Retrieved from Existor: http://www.existor.com/en/news-parallel.html
Lehnert, W. G., & Ringle, M. H. (2014). Strategies for Natural Language Processing. New York, New York: Psychology Press.
Moore, J. H. (1976). An analysis of the Turing test. Philosophical Studies, 249-257.
Shrager, J. (1984). ELIZA. In D. H. Ahl, Big Computer Games (pp. 20-24). Morris Plains, New Jersey: Creative Computing Press.
Shrager, J. (22 November 2015). The Genealogy of ELIZA. Retrieved from The Genealogy of ELIZA: elizagen.org
Tero, P. (20 August 2015). Machine Learning - Neural Networks Tutorial. Retrieved from Existor: http://www.existor.com/en/news-neural-networks.html
Turing, A. M. (1950). Computing Machinery and Intelligence. Mind, 433-460.
Weizenbaum, J. (November 1966). ELIZA - A computer program for the study of natural language communication between man and machine. Communications of the ACM, pp. 36-45.