Amnon H. Eden, James H. Moor, Johnny H. Søraker, & Eric Steinhart (Eds.), Singularity Hypotheses: A Scientific and Philosophical Assessment. Heidelberg: Springer-Verlag, 2012. ISBN 978-3-642-32559-5; ISBN 978-3-642-32560-1 (eBook); DOI 10.1007/978-3-642-32560-1.

Chapter 3A
William J. Rapaport on Arel's "The Threat of a Reward-Driven Adversarial Artificial General Intelligence"

Can't we just talk?

Itamar Arel (2013) argues that:

1. artificial general intelligence (AGI) "is inevitable" (p. 44),
2. techniques including a "fusion between deep learning, … a scalable situation inference engine, and reinforcement learning [RL] as a decision-making system may hold the key to place us on the path to AGI" (p. 45), and
3. "a potentially devastating conflict between a reward-driven AGI system and the human race… is inescapable, given the assumption that an RL-based AGI will be allowed to evolve" (p. 45).

Why "inescapable"? If I understand Arel correctly, it is a mathematical certainty:

[F]rom Eqs. (3.2) and (3.4) [Arel 2013, pp. 50, 55, the details of which are irrelevant to my argument], it follows that the agent continuously attempts to maximize its "positive" surprises [i.e. "its wellbeing"]… while minimizing "negative" surprises. This process… is unbounded…. [O]nce such a bonus is received on a regular basis, it becomes the new norm and no longer yields the same level of satisfaction. This is the core danger in designing systems that are driven by rewards and have large cognitive capacity; by continuously striving to gain positive (relative) reinforcement, they will inevitably pose a danger to humanity (pp. 55–56).
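Arel's Eqs. (3.2) and (3.4) are not reproduced here, but the habituation dynamic the quotation describes can be illustrated with a minimal sketch. This is a toy construction, not Arel's model: the running-average baseline, its adaptation rate, and the constant "bonus" below are assumptions made purely for illustration.

    # Toy sketch (not Arel's model): reward measured against an adaptive baseline.
    # A bonus received on a regular basis becomes the new norm, so the positive
    # surprise it yields decays toward zero.

    def positive_surprises(rewards, adaptation_rate=0.2):
        """Surprise at each step: reward minus a running-average baseline."""
        baseline = 0.0
        surprises = []
        for r in rewards:
            surprises.append(r - baseline)                 # satisfaction felt now
            baseline += adaptation_rate * (r - baseline)   # the reward becomes the norm
        return surprises

    if __name__ == "__main__":
        constant_bonus = [10.0] * 15                       # the same bonus, every step
        for step, s in enumerate(positive_surprises(constant_bonus)):
            print(f"step {step:2d}: surprise = {s:5.2f}")
        # The printed surprise falls from 10.00 toward 0.00: to stay "positively
        # surprised", the agent must seek ever larger rewards.

The sketch makes only the point the quotation makes: when reward is measured against a norm that adapts, a constant level of reward eventually yields no satisfaction, so the drive for positive surprise has no natural stopping point.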
Let's suppose so. But why should it be "inevitable"? Despite Arel's faith in the inevitability of AGI (which I share), he seems to be committing the fallacy of thinking that AGIs must differ in crucial respects from humans. This is the fallacy that John Searle commits when claiming that the inhabitant of his Chinese Room (Searle 1980) doesn't "understand a word of Chinese and neither does any other digital computer because all the computer has is what [the inhabitant] ha[s]: a formal program that attaches no meaning, interpretation, or content to any of the symbols" (Searle 1982, p. 5). As I have pointed out elsewhere, this assumes "that external links are needed for the program to 'attach' meaning to its symbols" (Rapaport 2000, §3.2.2). The fallacy can be seen by realizing that "if external links are needed, then surely a computer could have them as well as—and presumably in the same way that—humans have them" (Rapaport 2000, §3.2.2).

Why do I think that Arel is committing this fallacy? Because, presumably, humans also "attempt to maximize [their] wellbeing". Now, I can agree that humans themselves have been known, from time to time, to "pose a danger to humanity" (for a discussion of this, see Dietrich 2001, 2007). But we have also devised methods for alleviating such dangers. Clearly, then, rather than wringing our hands over the "inevitability" of AGIs wreaking havoc on their creators, we should give them some of those methods.

And, indeed, Arel sketches out some possibilities along these lines: education and "limit[ing] such [a] system's mental capacity" (p. 56). But he seems to neglect one obvious possibility, one that is, in fact, a necessity for any AGI. For an AGI to really have GI—general intelligence—it must have cognition: (1) it must be able to use and understand language—and, presumably, our language, so that we can communicate with it, and vice versa (see Winston 1975 and my discussion of "Winston's problem" in Rapaport 2003)—and (2) it must be able to reason consciously (e.g., via an explicit knowledge-representation-and-reasoning system, as opposed to tacit reasoning by, say, an artificial neural network). If we can reason with it in natural language, then we can hope to be able to collaborate and negotiate with it, rather than compete with it. Such natural-language and reasoning competence is, in any case, a prerequisite (or at least a product) of education, but it requires no limitation on the AGI's mental capacity.

References

Arel, I., Rose, D., & Coop, R. (2009). DeSTIN: A scalable deep learning architecture with application to high-dimensional robust pattern recognition. In Proceedings of the AAAI 2009 Fall Symposium on Biologically Inspired Cognitive Architectures.

Arel, I. (2013). The threat of a reward-driven adversarial artificial general intelligence. In A. H. Eden, J. H. Moor, J. H. Søraker, & E. Steinhart (Eds.), Singularity hypotheses: A scientific and philosophical assessment (pp. 43–58). Heidelberg: Springer.
Dietrich, E. (2001). Homo sapiens 2.0: Why we should build the better robots of our nature. Journal of Experimental and Theoretical Artificial Intelligence, 13(4), 323–328.

Dietrich, E. (2007). After the humans are gone. Journal of Experimental and Theoretical Artificial Intelligence, 19(1), 55–67.

Rapaport, W. J. (2000). How to pass a Turing test: Syntactic semantics, natural-language understanding, and first-person cognition. Journal of Logic, Language, and Information, 9(4), 467–490. (Reprinted in The Turing test: The elusive standard of artificial intelligence, pp. 161–184, by J. H. Moor, Ed., 2003, Dordrecht: Kluwer).

Rapaport, W. J. (2003). What did you mean by that? Misunderstanding, negotiation, and syntactic semantics. Minds and Machines, 13(3), 397–427.

Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3, 417–457.

Searle, J. R. (1982). The myth of the computer. New York Review of Books (29 April 1982), 3–6; cf. correspondence, same journal (24 June 1982), 56–57.

Winston, P. H. (1975). Learning structural descriptions from examples. In P. H. Winston (Ed.), The psychology of computer vision (pp. 157–209). New York: McGraw-Hill. (Reprinted in Readings in knowledge representation, pp. 141–168, by R. J. Brachman & H. J. Levesque, Eds., 1985, Los Altos, CA: Morgan Kaufmann).