Chapter 3A
William J. Rapaport on Arel’s “The Threat of a Reward-Driven Adversarial Artificial General Intelligence”
Can’t we just talk?
Itamar Arel (2013) argues that:
1. artificial general intelligence (AGI) “is inevitable” (p. 44),
2. techniques including a “fusion between deep learning, … a scalable situation inference engine, and reinforcement learning [RL] as a decision-making system may hold the key to place us on the path to AGI” (p. 45), and
3. “a potentially devastating conflict between a reward-driven AGI system and the human race… is inescapable, given the assumption that an RL-based AGI will be allowed to evolve” (p. 45).
Why “inescapable”? If I understand Arel correctly, it is a mathematical certainty:

[F]rom Eqs. (3.2) and (3.4) [Arel 2013, pp. 50, 55, the details of which are irrelevant to my argument], it follows that the agent continuously attempts to maximize its “positive” surprises [i.e. “its wellbeing”]… while minimizing “negative” surprises. This process… is unbounded…. [O]nce such a bonus is received on a regular basis, it becomes the new norm and no longer yields the same level of satisfaction. This is the core danger in designing systems that are driven by rewards and have large cognitive capacity; by continuously striving to gain positive (relative) reinforcement, they will inevitably pose a danger to humanity (pp. 55–56).
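To make the quoted dynamic concrete, here is a minimal toy sketch in Python. It is not Arel’s Eqs. (3.2) and (3.4), whose details are omitted above; it merely assumes, for illustration, that “satisfaction” is reward measured against an adaptive expectation, so that a bonus delivered regularly soon stops registering as a positive surprise.

    # Toy illustration (not Arel's model): "satisfaction" is the gap between the
    # reward received and an adaptive expectation of reward. Once a bonus becomes
    # routine, the expectation catches up and the surprise shrinks toward zero,
    # so an agent chasing positive surprise must keep escalating.

    def run(rewards, learning_rate=0.2):
        expectation = 0.0                            # running estimate of "normal" reward
        for step, reward in enumerate(rewards):
            surprise = reward - expectation          # positive surprise = satisfaction
            expectation += learning_rate * surprise  # the new norm drifts toward the bonus
            print(f"step {step}: reward={reward:.1f} "
                  f"expectation={expectation:.2f} surprise={surprise:+.2f}")

    # A constant bonus of 1.0: the printed surprise decays as it becomes routine.
    run([1.0] * 10)

On this toy run the surprise decays toward zero, which is one way of reading the claim that a regular bonus “becomes the new norm and no longer yields the same level of satisfaction.”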
Let’s suppose so. But why should it be “inevitable”? Despite Arel’s faith in the inevitability of AGI (which I share), he seems to be committing the fallacy of thinking that AGIs must differ in crucial respects from humans.
This is the fallacy that John Searle commits when claiming that the inhabitant of his Chinese Room (Searle 1980) doesn’t “understand a word of Chinese and neither does any other digital computer because all the computer has is what [the inhabitant] ha[s]: a formal program that attaches no meaning, interpretation, or content to any of the symbols” (Searle 1982, p. 5). As I have pointed out elsewhere, this assumes “that external links are needed for the program to ‘attach’ meaning to its symbols” (Rapaport 2000, §3.2.2). The fallacy can be seen by realizing that “if external links are needed, then surely a computer could have them as well as—and presumably in the same way that—humans have them” (Rapaport 2000, §3.2.2).
Why do I think that Arel is committing this fallacy? Because, presumably, humans also “attempt to maximize [their] wellbeing”. Now, I can agree that humans themselves have been known, from time to time, to “pose a danger to humanity” (for a discussion of this, see Dietrich 2001, 2007). But we have also devised methods for alleviating such dangers. Clearly, then, rather than wringing our hands over the “inevitability” of AGIs wreaking havoc on their creators, we should give them some of those methods.
And, indeed, Arel sketches out some possibilities along these lines: education and “limit[ing] such [a] system’s mental capacity” (p. 56). But he seems to neglect one obvious possibility, one that is, in fact, a necessity for any AGI: For an AGI to really have GI—general intelligence—it must have cognition: (1) It must be able to use and understand language—and, presumably, our language, so that we can communicate with it, and vice versa (see Winston 1975 and my discussion of “Winston’s problem” in Rapaport 2003)—and (2) it must be able to reason consciously (e.g. via an explicit knowledge-representation-and-reasoning system, as opposed to tacit reasoning by, say, an artificial neural network). If we can reason with it in natural language, then we can hope to be able to collaborate and negotiate with it, rather than compete with it. Such natural-language and reasoning competence is, in any case, a prerequisite (or at least a product) of education, but it requires no limitation on the AGI’s mental capacity.
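To illustrate what the “explicit” in explicit knowledge representation and reasoning buys us, here is a small, hypothetical Python sketch (not any particular published system, and not Arel’s architecture): a toy forward-chaining engine in which every derived conclusion records the rule and premises that produced it, so the system can, in principle, report why it reached a conclusion. The particular rules and fact names below are purely illustrative.

    # Hypothetical sketch of an explicit knowledge-representation-and-reasoning
    # system: unlike the tacit weights of a neural network, every derived fact
    # carries the rule and premises that justify it, so conclusions can be
    # explained, questioned, and negotiated about.

    # Rules are (set of premises, conclusion); facts and conclusions are strings.
    RULES = [
        ({"action_harms_humans"}, "action_is_forbidden"),
        ({"reward_is_high", "action_harms_humans"}, "flag_for_human_review"),
    ]

    def forward_chain(facts, rules):
        """Derive every reachable conclusion, keeping a justification for each."""
        derived = {}                     # conclusion -> (premises, rule index)
        changed = True
        while changed:
            changed = False
            known = facts | set(derived)
            for i, (premises, conclusion) in enumerate(rules):
                if premises <= known and conclusion not in known:
                    derived[conclusion] = (premises, i)
                    changed = True
        return derived

    facts = {"reward_is_high", "action_harms_humans"}
    for conclusion, (premises, rule) in forward_chain(facts, RULES).items():
        print(f"{conclusion}  (rule {rule}, because {sorted(premises)})")

A trained neural network, by contrast, offers no such record of premises and rules; that contrast, rather than the toy rules themselves, is the point: an AGI whose reasoning is inspectable is one we can hope to negotiate with.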
References
Arel, I., Rose, D., & Coop, R. (2009, November). DeSTIN: A scalable deep learning architecture with application to high-dimensional robust pattern recognition. In Proceedings of the AAAI 2009 Fall Symposium on Biologically Inspired Cognitive Architectures.
Arel, I. (2013). The threat of a reward-driven adversarial artificial general intelligence. In A. H. Eden, J. H. Moor, J. H. Søraker, & E. Steinhart (Eds.), Singularity hypotheses: A scientific and philosophical assessment (pp. 43–58). Heidelberg: Springer.
Dietrich, E. (2001, October). Homo sapiens 2.0: why we should build the better robots of our
nature. Journal of Experimental and Theoretical Artificial Intelligence, 13(4), 323–328.
Dietrich, E. (2007). After the humans are gone. Journal of Experimental and Theoretical
Artificial Intelligence, 19(1), 55–67.
Rapaport, W. J. (2000). How to pass a Turing test: Syntactic semantics, natural-language understanding, and first-person cognition. Journal of Logic, Language, and Information, 9(4), 467–490. (Reprinted in The Turing test: The elusive standard of artificial intelligence, pp. 161–184, by J. H. Moor, Ed., 2003, Dordrecht: Kluwer).
Rapaport, W. J. (2003). What did you mean by that? Misunderstanding, negotiation, and syntactic
semantics. Minds and Machines, 13(3), 397–427.
Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3, 417–457.
Searle, J. R. (1982). The myth of the computer. New York Review of Books (29 April 1982): 3–6;
cf. correspondence, same journal (24 June 1982): 56–57.
Winston, P. H. (1975). Learning structural descriptions from examples. In P. H. Winston (Ed.), The psychology of computer vision (pp. 157–209). New York: McGraw-Hill. (Reprinted in Readings in knowledge representation, pp. 141–168, by R. J. Brachman & H. J. Levesque, Eds., 1985, Los Altos, CA: Morgan Kaufmann).