Implementing Scheme in a Virtual World

Lance R. Williams
Dept. of Computer Science
University of New Mexico

Outline
• Introduction to Second Life
• Motivation (Scientific, Educational, and Artistic)
• Overview of Scheme
• Overview of Linden Scripting Language
• A Heap Consisting of Many Objects
• A Heap Implemented as One Object with Many Parts
• A Bytecode Compiler for Scheme
• Future Work

Second Life
• Massively multiplayer online game (MMOG)
• At any given time, about 50,000 players are online
• Immersive 3D environment where players are represented by avatars

Second Life (contd.)
• Objects in the virtual world are subject to a rudimentary physics which includes gravity, transfer of kinetic energy, friction, and inertia
• The representation of the virtual landscape and its contents and the simulation of their physics are performed by a network of servers
• Views of the virtual world from the vantages of individual avatars are rendered on users’ clients

Second Life (contd.)
• Not a game in the conventional sense because it has no explicit goals.
• Players interact with each other within a virtual world of their own creation.
• Players create objects, e.g., clothing, buildings, vehicles, guns, which they sell to each other for virtual money.

Scientific Motivation
• Second Life supports a programming model where a large number of scripts execute in parallel and asynchronously.
• Communication between scripts is by means of message passing.
• Large distributed computations can be implemented and visualized.

Scientific Motivation (contd.)
Second Life is an ideal testbed for exploring:
• Parallel algorithms
• Distributed sensor networks
• Swarm robotics
• Emergent behavior and self-organization
• Self-replicating machines
• Artificial intelligence
• Artificial life

Educational Motivation
• The use of virtual environments in teaching is most useful when immersion in an actual environment is impractical.
• Teaching in a virtual environment for its own sake makes little sense.

Educational Motivation (contd.)
• Second Life is unique among MMOGs because the players themselves create the content of the virtual world they inhabit (at least partly) by means of computer programming.
• Second Life is especially well suited to teaching computer science because inside the virtual world, algorithms and data structures are reified.

Artistic Motivation
• In almost all interdisciplinary work, methods from computer science are applied to help solve a problem important in the other discipline.
• Methods from other disciplines are rarely applied to solve problems in computer science.

Artistic Motivation (contd.)
• The use of computers in the creation of art is widespread.
• However, there is a lack of art which explores ideas from computer science itself.

Scheme
• A small and elegant dialect of Lisp developed by Sussman and Steele in 1975.
• Programs are represented as lists
• Lexically scoped
• Dynamically typed
• Functions are first-class
• Optimization of tail recursion

Functional Programming
• Programming without side effects or mutation
• No assignment or variables
• Programs are definitions of the solutions to problems, not sequences of statements which transform state
• Scheme is good for functional programming because functions are first-class and tail recursion is optimized

Fibonacci Series in Scheme
[Figure: Fibonacci numbers]

    (define fib
      (lambda (n)
        (if (< n 2)
            n
            (+ (fib (- n 1)) (fib (- n 2))))))

Building in Second Life
• Objects are constructed from elemental shapes called prims
• Basic prims include cubes, spheres, cones, cylinders, prisms, pyramids, and tori
• A much richer set of elemental shapes can be derived by transforming basic prims via scaling, clipping, twisting, drilling, etc.
• Composite objects can be assembled by grouping multiple prims into linksets

Scripting in Second Life
• Behaviors beyond those due to simple physics are achieved by adding scripts to objects
• Scripts are small programs written in Linden Scripting Language
• Scripts are compiled by the client
• Scripts are executed asynchronously and in parallel on the server
• Scripted objects communicate with each other and with avatars via message passing

Linden Scripting Language (LSL)
• An object is composed of one or more prims
• A prim can contain one or more scripts
• A script is composed of one or more states
• A state is composed of one or more event handlers
• Event handlers are code fragments which specify how a script will respond to an event in a given state

LSL (contd.)
• Examples of events include state_entry(), touch_start(), listen(), timer(), on_rez(), and money()
• Event handlers can contain calls to both user-defined functions and functions from a standard library
• Examples of library functions include llGetPos(), llSay(), llRezObject(), llGiveInventory(), and llGiveMoney()

LSL (contd.)
• In most other ways, LSL is pretty ordinary
• Its syntax, operators, and control structures are similar to C
• Standard datatypes include integer, float, and string
• Specialized datatypes include vector, for representing positions and velocities in 3D, and rotation

Limitations of LSL
• No pointers, mutable arrays, or structures
• Scripts must run within 16K of space including stack, heap, and bytecode
• A non-homogeneous list datatype is provided, but lists cannot contain lists
• No function pointers

Limitations of LSL (contd.)
• Calls to some library functions, e.g., llSetPrimitiveParams(), deliberately delay the calling script, e.g., by 0.2 seconds
• Excessive use of other library functions, e.g., llRezObject(), can result in silent failure
• These limitations combat lag and act as a grey goo fence

Elevator
This script moves the prim which contains it (and any avatar which happens to be sitting on it) up one meter each time the prim is touched:

    default {
        touch_start(integer count) {
            // Move one meter along the z axis.
            llSetPos(llGetPos( ) + <0,0,1>);
        }
    }

Red-Green
This script toggles between two states. In the default state, the color of the prim containing the script is set to red.

    default {
        state_entry( ) {
            // ALL_SIDES applies the color to every face of the prim.
            llSetColor(<1,0,0>, ALL_SIDES);
        }
        touch_start(integer count) {
            state green;
        }
    }

Red-Green (contd.)
In the green state, the color of the prim containing the script is set to green.

    state green {
        state_entry( ) {
            llSetColor(<0,1,0>, ALL_SIDES);
        }
        touch_start(integer count) {
            state default;
        }
    }

Interprocess Communication
• Between scripts in different objects on an open channel, using the llSay() function to send and listen() to receive
• Between scripts in prims within the same object on a private channel, using the llMessageLinked() function to send and link_message() to receive

Actor model
• A model of concurrent computation developed by Hewitt in ‘73
• An actor is a process
• Actors execute asynchronously and in parallel
• An actor can send/receive messages to/from other actors
• An actor can create new actors
• LSL is an implementation of the Actor model

Scheme Interpreter
The interpreter consists of the parser, evaluator, printer, and garbage collector:
[Diagram: parser, evaluator, printer, garbage collector]

Scheme Interpreter (contd.)
• Each component is a recursive procedure which accesses the heap in dozens of places.
• There is no way that a program this complex will run in 16K of memory:
[Diagram: heap, stack, and bytecode colliding within 16K]

Big Problem
• The parser, evaluator, printer, and garbage collector are recursive processes which access the heap in dozens of places.
• There is no way‡ to suspend execution after the llMessageLinked( ) function call and then resume it from inside the link_message( ) event handler.
‡ Ironically, this would be easy in Scheme since it has first-class continuations.

A Back Channel
• There are many functions in LSL which allow a script to change the attributes of the prim which contains it, e.g., llSetPrimitiveParams( ), llSetColor( ), llSetPos( ), llSetName( )…
• However, there is only one function which returns an attribute of any prim in a linkset, llGetLinkName( ).

A Back Channel (contd.)
• llSetName( ) and llGetLinkName( ) can be used to construct a communication channel between a process which cannot be suspended and resumed and objects in the world
• This communication channel can be used as the interface for a distributed heap

A Heap Consisting of One Object with Many Parts
[Diagram: prim 1 (main process) with llMessageLinked( ) and llGetLinkName( ); prims 2…N with link_message( )]

S-expression Representation
[Diagram: a 32 bit integer divided into 14 bits (car), 14 bits (cdr), and 4 bits (type)]

Bytecode Compiler
The evaluator can be replaced by a bytecode compiler and virtual machine:
[Diagram: parser, bytecode compiler, virtual machine, printer]

Virtual Machine
Based on compilation model described in Dybvig ‘91
• accumulator (acc)
• environment pointer (env)
• program counter (pc)
• evaluation frame stack (frame)
• argument stack (args)
bytecodes: apply, argument, assign, close, constant, frame, halt, refer, return, test

Bytecode Compiler (contd.)

    (define sqr (lambda (x) (* x x)))

COMPILATION

    {close ((x) . {refer x {argument {refer x {argument {refer * {apply}}}}}})
      {assign sqr {halt}}}

Bytecode Compiler (contd.)
• Unlike the previous evaluator, the virtual machine is not recursive…so the system stack does not grow
• The virtual machine uses its own stack and this can be stored in the distributed heap
• This permits much larger programs to be run

A Heap Consisting of Many Objects
• The purpose of this entire exercise was to write a distributed implementation of Scheme.
• Why not start by distributing the heap among different objects in the world?
• One object per s-expression?
• Communication via llSay( ) and listen( )?

A Heap Consisting of Many Objects
[Diagram: a ring containing prim 1 (main process, llMessageLinked( ), llGetLinkName( )) and prim 2 (link_message( ), listen( ), llSay( ) [ llRezObject( ) ]); a fish (one of many) with listen( ) [ on_rez( ) ] and llSay( )]

S-expression Representation
[Diagram: 32 bit integer with fields labeled 10 bits (car), 10 bits (cdr), 10 bits (self), and 4 bits (type); the car and cdr labels each appear twice]

Distributed Virtual Machine
• A strategy for distributing the stack and heap among multiple objects has been described
• However, the evaluation process itself runs inside of a single object…it is not distributed at all
• How can the evaluation process also be distributed?

Continuation Passing Style
• Represent the bytecodes of the compiled program as objects
• Add a listen( ) event handler to each bytecode object which performs the appropriate transformation of the virtual machine state
• Encapsulate the virtual machine state in a message, i.e., a continuation, which is passed from bytecode to bytecode using llSay( )

Distributed Virtual Machine

    (define sqr (lambda (x) (* x x)))

COMPILATION
[Diagram: the compiled program as a network of objects; labels include the bytecodes close, refer, argument, apply, assign, and halt; the datatypes pair, symbol, number, closure, and primitive; and get, put, def, and ref]

A Fishy Scheme
• The parser, compiler, printer, and garbage collector are implemented by a single script in the band
• The interface to the heap is implemented by a script in the stone‡
‡ The logo is from the PLT Group at Rice University, the developers of MZ Scheme (apologies to The Grateful Dead).

Degree of Distributedness

                                     naive   cube   sphere   fish
    data structures  heap              N       D      D        D
                     stack             N       N      D        D
    algorithms       parser            N       N      N        N
                     interpreter       N       N      -        -
                     compiler          -       -      N        N
                     virtual machine   -       -      N        D
                     printer           N       N      N        N

Church-Turing Thesis
• All models of effective computation are equivalently powerful.
• However, certain models offer advantages with respect to particular modes of physical implementation.
• The von Neumann machine is well-suited to implementation using electronic components.

Signaling Networks
• The von Neumann machine is less useful as a model for computations implemented using signaling networks of neurons, molecules, nanomachines, or bacteria.
• The ACTOR model (Hewitt ‘73) is potentially better suited to this mode of implementation.

Distributed Virtual Machines
• Compile programs written in a simple and expressive programming language into a network of actors.
• Actors represent virtual machine bytecodes and basic datatypes.
• The network of actors constitutes the code, stack, and heap of a distributed virtual machine.

The Cell
• It is a popular misconception that DNA is a self-replicating molecule.
• DNA is a molecular representation.
• DNA can be used to represent a description of a self-replicating machine called the cell.
• The cell translates the DNA into a copy of the cell and a copy of the DNA.

Quines
• A quine is a program which prints its own source code.
• A quine consists of program and data.
• The program uses the data to construct copies of the program and the data.

Self-Replicating Distributed Virtual Machines
• A compiler can be implemented as a distributed virtual machine.
• The distributed virtual machine can copy itself by compiling its own source code.
• A distributed virtual machine which copies itself and then copies its own source code is (in effect) a non-biological cell.

Bacterial Implementation
[Diagram: two bacteria, bytecode n and bytecode n+1, with a continuation (plasmid) between them; labels include molecular key and molecular lock]

Conclusion
• Second Life is unique among MMOGs because computer programming is a major component of the game
• Abstractions like algorithms and data structures are manifest in the virtual world
• Second Life can therefore be of great use to computer science educators

Conclusion (contd.)
• Second Life’s programming model makes it an ideal testbed for research in distributed computing
• To explore these ideas, I have constructed a series of distributed evaluators for the Scheme programming language inside the virtual world

Conclusion (contd.)
• The ACTOR model is better suited than the von Neumann machine to describing computations embodied in signaling networks.
• A compiler implemented as a distributed virtual machine can copy itself simply by compiling its own source code.
• A self-replicating distributed virtual machine embodied using bacteria would be a meta-cell.
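
Supplementary sketch: S-expression cells

The S-expression Representation slide for the one-object heap packs each cell into a single 32-bit integer: 14 bits for the car, 14 bits for the cdr, and 4 bits for the type. The following is a minimal sketch of that packing in Python rather than LSL, so that it can be run outside Second Life. The field order (car in the high bits, type in the low bits), the tag value TYPE_PAIR, and the names pack_cell, unpack_cell, and cons are assumptions made for illustration; the slides give only the field widths.

```python
# Sketch of a 14/14/4-bit cell packed into one 32-bit integer.
# Only the field widths come from the slides; the bit order is assumed.
CAR_BITS, CDR_BITS, TYPE_BITS = 14, 14, 4

CDR_SHIFT = TYPE_BITS               # cdr sits just above the type tag
CAR_SHIFT = TYPE_BITS + CDR_BITS    # car occupies the top 14 bits

CAR_MASK = (1 << CAR_BITS) - 1
CDR_MASK = (1 << CDR_BITS) - 1
TYPE_MASK = (1 << TYPE_BITS) - 1

TYPE_PAIR = 1  # hypothetical tag value; the slides do not list the tags


def pack_cell(car, cdr, tag):
    """Combine two cell addresses and a type tag into one integer."""
    if not (0 <= car <= CAR_MASK and 0 <= cdr <= CDR_MASK
            and 0 <= tag <= TYPE_MASK):
        raise ValueError("field does not fit in its bit width")
    return (car << CAR_SHIFT) | (cdr << CDR_SHIFT) | tag


def unpack_cell(cell):
    """Split a packed cell back into (car, cdr, tag)."""
    return ((cell >> CAR_SHIFT) & CAR_MASK,
            (cell >> CDR_SHIFT) & CDR_MASK,
            cell & TYPE_MASK)


heap = []  # stand-in for the distributed heap: list index = cell address


def cons(car_addr, cdr_addr):
    """Allocate a pair cell and return its address."""
    heap.append(pack_cell(car_addr, cdr_addr, TYPE_PAIR))
    return len(heap) - 1


if __name__ == "__main__":
    addr = cons(5, 9)
    print(unpack_cell(heap[addr]))
```

With 14-bit addresses a heap of this kind can name at most 2^14 = 16384 cells, which is one consequence of fitting both pointers and the tag into a single integer.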