Gregory Pavlov
Computer Science
Advisor: Prof. Robert Signorile

Intelligent Entities in a Distributed Simulation Environment

Abstract

In addressing the growing model sizes used in simulation environments, I examine adding machine learning techniques to the entities in a model in an effort to produce such side effects as emergent behavior. Distributing the environment in order to increase the efficiency of the simulation also plays an important role in this thesis. The added intelligence of entities may have some effect upon the speed at which a model may be executed, as additional computation will be required. Taking a model-based approach, I attempt to solve some of the problems of interaction between components in a distributed simulation.

Introduction

Simulation is the product of science attempting to model the real world as accurately as possible in a virtual world. The goal is to make predictions regarding the result of a given decision or choice and then see the effects upon the created realm; the first computers were used in this way to compute artillery shell trajectories (Fujimoto 7). As the workload given to a program operating on a single machine grew such that a simulation could no longer run in real time, breaking up the model and sharing the work among multiple machines became much more common. In these designs, the modules directed entities to different locations in the model, rather than the entities making choices of their own when required.

Discrete-event simulation uses modules to perform some task or function on an entity in the model upon arrival at the given module. The model for such a simulation appears similar to a workflow diagram, where some modules create entities, others delay them, others decide the path a given entity should take, while still others remove entities from the model. A single event calendar exists such that each event is placed into its appropriate spot in a queue.
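Such a time-ordered event calendar can be sketched as a priority queue. The following is an illustrative Python sketch, not the thesis's actual implementation; the function and variable names are my own:

```python
import heapq

# Minimal event-calendar sketch: events are (time, sequence, description)
# tuples kept in a min-heap so the earliest simulation time is always
# processed first. The sequence number breaks ties between equal times.
calendar = []
seq = 0

def schedule(time, description):
    """Place an event into its time-ordered spot in the queue."""
    global seq
    heapq.heappush(calendar, (time, seq, description))
    seq += 1

# The arrival/work/exit example, deliberately scheduled out of order:
schedule(2, "entity exits module")
schedule(0, "entity arrives at module")
schedule(1, "module works on entity")

# Popping the heap yields the events in simulation-time order.
order = [heapq.heappop(calendar)[2] for _ in range(len(calendar))]
```

Regardless of insertion order, the events are retrieved as arrival, work, exit.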
The queue order is determined by the time an event should occur with respect to the simulation clock. An example of the queue would be the arrival of an entity at a given module at time t, the work this module should perform on this entity at time t+1, and the exit of this entity at time t+2.

In a continuous simulation, cellular automata are used to represent the real world in some fashion. Agents move from cell to cell at specific, defined time intervals and according to each agent's designed behavior. Each cell may perform functions that use agents, operate only when an agent is absent, or operate when a specific number of agents are present on the cell, similar to a module in a discrete-event simulation. Agents may also use the cell to perform some function, and may be designed to 'learn' to better perform the desired task. The 'learning' performed by agents generally occurs due to the specific way in which an agent's behavior is defined. In such a definition, an agent will use some characteristic of the cell it is currently on as input into a function that exists as part of the agent, then based upon the result of the function move in some direction or perhaps perform another function. Agents then update the function based upon the effect of the determined result. The typical model for a continuous simulation is a flocking demonstration. Here, each agent is instructed to turn to the right or left a certain amount when another agent enters within a predefined radius, say two cells. Using a simple model such as this with approximately ten agents, flocking behavior will emerge.

Distributed processing as used in simulation resulted from the ever-growing representations that models provided. By breaking the model into smaller pieces and spreading the workload out among multiple machines, the overall runtime of a given model can be decreased significantly.
An example of such a system would be the NEC Earth Simulator, where multiple processors and memories, the core of any computer, are used to simulate the Earth's climate. Simulation's constant goal remains the decrease in runtime of models with the highest possible granularity. According to Ferscha, regardless of the granularity, the runtime of the model may be affected by the objective or nature of the simulation (4). A model may not be suited for the simulation environment in which it has been created, and thus may experience a longer runtime.

One of the largest problems facing simulation is granularity, or how specific and detailed to make a modeling environment. If an environment consists of small granules, then models created for this environment can be extremely specific, producing accurate results. However, a small granularity also means that the runtime of the simulation increases, as more processing is required. A large granule yields the opposite: a quick runtime, but vaguer results. Thus the granularity of a simulation environment remains a continuous issue.

The increasing size of models also remains an issue at the forefront. Distributed simulation attempts to solve this problem by breaking up the model and using the processing potential of multiple machines over a network, possibly the Internet. The overhead associated with this distribution may outweigh the possible gain, as the delay associated with sending information between machines may increase. Minimizing the amount of information sent between computers thus helps to decrease the runtime.

In the past, running a simulation on computers from different manufacturers could lead to different results. With the goal of distributed simulation being to reduce the runtime of a system, using different machines becomes an integral part. An object-oriented approach reduces the reliance upon a single type of machine from one manufacturer, thus increasing the flexibility of the system.
Fault tolerance represents "another potential benefit of utilizing multiple processors" (Fujimoto 5). A failure in one process may not represent the end of the simulation, as would be the case in a single-processor simulation system. If a distributed simulation environment is correctly designed and set up, the tolerance of the system can increase the overall effectiveness and efficiency of the system.

The first integral component of any simulation is the simulation clock. This represents how time will change when the simulation executes. Users generally define a base unit of time for the simulation, anywhere from seconds to years to perhaps centuries or even larger units depending upon the type of simulation (Kelton 7). When a user creates a model, he or she must keep the model parameters within the scope of the model; using hours to simulate a system that is only valid in real life for seconds, say the flight of a baseball, would not be recommended. The distribution of this clock represents a large problem for distributed simulation.

The model represents the next component of any simulation. A model is composed of modules, each of which represents some function of the model, such as delaying an entity. If a model represented an assembly line, the modules would represent the pieces of the assembly line where work occurs, such as a machine that welds two pieces of metal. Entities would then represent the pieces of metal which travel through the assembly line. The more precise a model is, the smaller the granularity of the system, resulting in an increased runtime of the system.

The state of a simulation is preserved via the variables assigned to the modules and entities in a model. Each module stores its specific function, delay time of entities, current entities in the module, destination options, and unique name. Entities store their creation time, delayed time, and current waiting time.
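The per-component state just described might be represented as follows; the field names are my own illustrative choices, not the thesis's actual variables:

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    name: str                                         # unique name
    function: str                                     # e.g. "delay", "decide"
    delay_time: int = 0                               # delay applied to entities
    entities: list = field(default_factory=list)      # entities currently here
    destinations: list = field(default_factory=list)  # destination options

@dataclass
class Entity:
    creation_time: int
    delayed_time: int = 0    # total time spent delayed in modules
    waiting_time: int = 0    # current time spent waiting in a queue

# Assembly-line example from the text: a welding station holding one piece.
weld = Module(name="weld", function="delay", delay_time=4)
piece = Entity(creation_time=0)
weld.entities.append(piece)
```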
With each clock event, the state of the simulation is updated as entities delay or move from one module to another.

Distributed simulations usually take one of two approaches to dealing with models: breaking the model up and running the entire model in parallel, or replicating the model on each machine in the system, executing the model, and comparing the results. Breaking the model up has become the preferred method because it results in a decrease in the runtime of the overall simulation.

Using the foundations of machine learning, distributed processing, and discrete-event and continuous simulation, my thesis integrates some of the most attractive features of all four: entities, as in a continuous simulation, working at predefined time intervals with the ability to 'learn', within the structure set by models in a discrete-event simulation, all while using parallel processing. The merging of these key concepts provides the basis of my thesis.

Previous Work

The distribution of the simulation clock represents a key problem for simulation using multiple machines. The simulation time must be accurately and almost simultaneously relayed to every machine in the system. If each machine does not receive the time correctly, it may miss an event and fail to update its state correctly. Fujimoto suggests that it may be more valuable and "efficient to only update the variables when 'something interesting' occurs" (32). This would result in machines updating the states of the modules currently belonging to them only when required, as opposed to at every event. This updating is known as a logical process, and distribution of these allows for asynchronously running a synchronous simulation. This streamlining can improve the efficiency of the simulation and thus decrease the runtime.

The 'grain-size' of the simulation significantly affects both the runtime and the relevancy of the produced results.
The appropriate grain size depends upon the application and practical considerations. Grain size is also directly impacted by the availability and use of existing simulation code (Logan 3). Reusing a portion of code may result in an inappropriate grain size for a model, thus increasing the difficulty in distributing a model.

The distribution of the clock also leads to the problem of synchronizing an entire distributed simulation. If the difference in time between any two given machines in the simulation grows too large, the simulation may begin to experience errors as a result. On a single machine, this problem remains irrelevant, as the events of the simulation occur sequentially. The simplest solution to this problem, as explained by Fujimoto, is to use a local causality constraint, where each message sent between the various machines is time-stamped to help maintain synchronization (52). This system is also prone to deadlock, because if a cycle of messages occurs with differences in the timestamps smaller than the unit of time in the simulation, each process sending a message may wait for the other processes to complete (Fujimoto 3).

Another issue being faced is the interaction of learning entities with their environments, especially in environments which change. As an entity learns, based on some goal-directed task, the entity further specifies how it will interact with the environment, which is ideal when the environment is static. However, entities must adapt to a dynamic environment, as previous knowledge or results may not be relevant to the current environment. Voinea takes the approach called 'learning_while_interacting', which is based on "the assumption that in a multiagent system composed by many embodied individual agent there are simultaneous active behaviors" (66). These entities are competing or cooperating for shared resources, and thus must adapt to a changing environment.
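The local causality constraint described above can be sketched as a logical process that holds each time-stamped message until no neighbor can still send an earlier timestamp; the class and method names here are illustrative assumptions, not code from any of the cited systems:

```python
import heapq

class LogicalProcess:
    """Conservatively processes time-stamped messages in timestamp order."""

    def __init__(self, neighbors):
        self.pending = []                        # (timestamp, message) min-heap
        self.latest = {n: 0 for n in neighbors}  # last timestamp seen per link
        self.processed = []

    def receive(self, neighbor, timestamp, message):
        self.latest[neighbor] = timestamp
        heapq.heappush(self.pending, (timestamp, message))
        self._drain()

    def _drain(self):
        # Safe to process anything no future arrival can precede: every
        # neighbor has already sent a timestamp at least this large.
        horizon = min(self.latest.values())
        while self.pending and self.pending[0][0] <= horizon:
            self.processed.append(heapq.heappop(self.pending)[1])

lp = LogicalProcess(neighbors=["A", "B"])
lp.receive("A", 3, "a3")   # held: B could still send a timestamp below 3
lp.receive("B", 5, "b5")   # horizon rises to 3, so "a3" is processed
```

Note how a silent neighbor stalls the process entirely, which is exactly the deadlock risk the text describes.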
Methodology

At the outset of my thesis, I chose a layered and model-based approach to the development, using a centralized machine, known as a Baron, for developing and distributing the components of a model, and a network of machines, known as Peons, for the execution of the simulation.

The distribution of the clock represents the first major hurdle that must be overcome when dealing with distributed systems. Regardless of how the system operates, a simulation clock remains a vital component. Since a layered approach was used, where the model of the simulation would remain locally on the Baron with its components split between Peons, a centralized clock would be used to maintain synchronization. As a simulation starts, the Baron sends out a UDP packet via a multicast broadcast to all Peons, signaling the start of the simulation and the given runtime. As the simulation runs, a message is again sent out via UDP multicast signaling each Peon to work, followed eventually by a stop message. This maintains synchronization across the machines throughout the runtime of the simulation.

The real world is extremely complex; so complex that we can only begin to grasp its scale and chaos. A model represents the simplification of the real world into something that we as humans can understand. The model of the atom with a nucleus and electrons in finite states, as proposed by Bohr, helps remove some of the layers of complexity, which even today are not fully understood, in order to help us understand the effects. Using a model to represent some simplification of the real world assists simulation as well. Since we create the models used in simulations, the results of a given simulation contain greater relevance. The model-based approach thus contains many benefits relating to the components of a simulation, with a few drawbacks.

Breaking down a model into modules, resources, and entities helps to maintain the state of the system.
Each component can easily keep track of its own local variables without concern for the state of others. This allows each Peon in the simulation to be concerned only with its local modules and entities. However, Peons must be responsible for the resources of other Peons, and thus must support on distributed resources the same functions that a Peon could perform on local resources.

Modules represent the interaction of the simulation environment with the entities. Entities travel through the model by moving from one module to the next. At each module, some delay may occur to represent some portion of the model. Entities are queued in the model in a first-come-first-served manner. If we choose to model an assembly line, the entities would represent the initial pieces on the line. As they travel down the line, they visit stations, which would be represented as modules, such as a welding station. The entity would then delay for some function of the simulation time in the module to represent the two pieces of metal being welded together. The module may have to 'seize' some shared resource in order to perform the delay; otherwise the entity is forced to wait in a queue.

Typically, models use a specific type of module, called a process module, to represent the delay that occurs with a resource. The process module has access to a delay function and a resource, whether or not the resource is stored locally to the Peon the process module is assigned to. Since resources may be shared, used by multiple modules at the same time, the Peon must support the ability to access distributed resources. This is performed by assigning the resource to a specific Peon in the same manner that a module is.

When a choice between options in a model must be represented, a decide module is used. This module does not possess a delay function, but instead an expression representing how the module should route the current entity to the possible destinations.
The representations I use are simply by chance and by alternating. Deciding by chance means that all possible output options are assigned a probability of being picked, and when an entity enters the module it receives a random value. The range of probabilities this value falls in determines the destination of the entity. If instead alternating is used, the decide module simply assigns the entity to the next destination in the list of output options, returning to the first item in the list after the last destination option has been selected, thus creating a circular list.

Entities, as discussed earlier, travel from module to module based upon the assigned destination of the current module. My implementation of the entity allows it either to choose its output option when entering a decide module or to have its destination assigned, depending upon the type of entity: learning or typical. Learning entities have the goal of traveling through the model in the shortest time possible, while typical entities perform as normal, obeying the modules. Entities perform this decision-making process using a perceptron (Mitchell 86). A perceptron takes as input some information about the output option, such as previous queue length, and has a single output. Each input has a given weight associated with it, and when asked to make a decision, the perceptron multiplies each weight by its input value, sums the results, and outputs either a one or a negative one. A one represents a positive decision and a negative one a negative decision. After all of the output options have been decided upon, the first option with a positive decision is the destination. If no destination receives a positive result, then the entity is assigned to an option as a typical entity would be for that decide module. When an entity reaches an end module, which records the statistics of all the entities of the model, it is then told to learn.
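The perceptron decision, together with the learning step that follows it, might look like the sketch below; the learning rate, initial weights, and single-input feature are my assumptions, not the thesis's actual values:

```python
class Perceptron:
    """Per-option decision unit carried by a learning entity."""

    def __init__(self, n_inputs, rate=0.1):
        self.weights = [0.5] * n_inputs   # assumed starting weights
        self.rate = rate                  # assumed learning rate
        self.last_inputs = None

    def decide(self, inputs):
        """Return +1 (take this option) or -1 based on the weighted sum."""
        self.last_inputs = inputs
        total = sum(w * x for w, x in zip(self.weights, inputs))
        return 1 if total > 0 else -1

    def learn(self, positive):
        """Reinforce the weights after a faster pass, weaken them otherwise."""
        sign = 1 if positive else -1
        self.weights = [w + sign * self.rate * x
                        for w, x in zip(self.weights, self.last_inputs)]

p = Perceptron(n_inputs=1)      # one input, e.g. previous queue length
decision = p.decide([2.0])      # 0.5 * 2.0 > 0, so the option is chosen
p.learn(positive=False)         # slower pass: the weight is reduced
```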
This means that it compares its current time through the model to its previous time through the model. If the current time is less than the previous time, then the entity learns with a positive result; otherwise, it learns with a negative result. Learning with a positive result refines the weights associated with the given output options appropriately; a negative result means that weights are reduced in the hope of an acceptable option being selected on the next pass. The entity is then sent back to the create module, which handles the arrival of entities to the model, in order to run through the model again.

Since learning entities run through the model multiple times, and typical entities may also exist in the simulation, entities may 'flow' through the model differently each time through, even though the locations of given modules and resources do not change. This results in a dynamic situation where the previous quickest path through a model may no longer be the quickest because of an increase in queue lengths from entities selecting specific output options. Entities may no longer have optimal information for making decisions.

Each Peon keeps the simulation time locally based upon the messages received from the Baron. When the simulation time reaches the appropriate runtime of the simulation, the Peon stops executing the simulation and begins to finalize statistics. All destroy modules then update the statistics that are gathered and send a multicast message to the Baron, who combines the statistics incrementally. This allows for statistics to be computed without the need to buffer all of the previous data sent by Peons. The Baron can thus update the statistics one Peon at a time.

Results

The result of using a multicast communication system for the simulation clock is that there is no guarantee that a given Peon receives a message from the Baron. This implies that synchronization cannot truly occur, and will only hold on a stable network.
This problem became apparent when also dealing with the clock speed of the simulation environment. If the cycle time was reduced below 250 milliseconds per cycle, the simulation became asynchronous, as some Peons processed more information than others depending upon the components of the model currently residing there. With this system running on two machines, meaning a Baron and a single Peon, the cycle time had to be increased to 300 milliseconds to maintain reliability.

Since entities must maintain the necessary information for learning and deciding, this information must be passed with the entity as it moves from module to module. The current implementation uses multicast with UDP packets to send entities from one Peon to another. As the number of decide modules in a model increases, the amount of information pertaining to the weights for the perceptrons that must be stored and transmitted with each entity grows. As such, an entity may not fit into the conventional size of a multicast packet, producing errors for all Peons and the Baron in the system, and thus ending the current simulation unexpectedly.

Unfortunately, no emergent behavior could be observed in any of the models used, due to the size restrictions on the number of decide modules. Entities would possess too much information if more than two decide modules exist, or if each decide module has more than four output options. Also, due to the lack of a graphical user interface, viewing the simulation as it runs becomes increasingly difficult. Thus, identifying emergent behavior cannot be easily performed.

An increase in the cycle time was found when large models were used with a number of Peons similar to the number of modules. This means that a large amount of information would be sent during each cycle, requiring the processing time of each machine for data transfer and not necessarily execution of the simulation.
A decrease in the amount of overhead needed would help to alleviate this problem, although the necessary clock increase was only 50 milliseconds from the previous value (250 to 300).

Using a model consisting of a create module which produced two intelligent entities, a create module which produced a typical entity every four simulation clock cycles, two decide modules (one alternating and one by chance), four process modules, and a single destroy module, I was able to produce results showing that the intelligent agents proceed through the model faster than the typical agents. Intelligent agents are recycled by the destroy module, thus allowing them to run through the model more than once. Entities, both typical and intelligent, took approximately fifty-five simulation time units to travel through the model, which was executed for approximately three hundred simulation time units with fifty-six entities on average. Intelligent entities would travel through the model ten simulation time units quicker on average. The number of entities includes the number of typical entities along with the number of passes by intelligent entities, as statistics are gathered from an intelligent agent on each pass; the simulation thus perceives each pass as a new entity. The intelligent entities averaged nine and a half runs through the model during the course of the simulation. These results were found to be consistent using anywhere from one to nine Peons. Using more than nine Peons produced the same results as nine Peons, as any Peons beyond the ninth did not have any modules assigned to them and thus would not compute any part of the simulation.

Entities would leave their respective create modules and immediately enter a decide module. This module alternated between two process modules, one with a two simulation time unit delay, and the other with a four simulation time unit delay. This results in approximately equal queue lengths for each module.
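The alternating and by-chance routing rules used by these decide modules can be sketched as follows; the class is an illustrative reconstruction, not the system's actual code:

```python
import random

class DecideModule:
    """Routes entities by alternating or, if probabilities are given, by chance."""

    def __init__(self, destinations, probabilities=None):
        self.destinations = destinations
        self.probabilities = probabilities  # None means alternate
        self.next_index = 0

    def route(self, rng=random):
        if self.probabilities is None:
            # Alternating: circular walk over the destination list.
            dest = self.destinations[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.destinations)
            return dest
        # By chance: find which range of probabilities the random value falls in.
        value, cumulative = rng.random(), 0.0
        for dest, p in zip(self.destinations, self.probabilities):
            cumulative += p
            if value < cumulative:
                return dest
        return self.destinations[-1]

alternating = DecideModule(["fast", "slow"])
results = [alternating.route() for _ in range(3)]  # fast, slow, fast
```

With probabilities of 0.8 and 0.2 on the slow and fast process modules, this reproduces the skewed queues described in the results.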
After leaving either process module, entities enter a second decide module. In this model, the second decide module decides by chance and has a much higher probability, eighty percent, of picking a process module with a large delay, six simulation time units, and a twenty percent probability of picking a process module with a small delay, two simulation time units. The result is that large queues would occur on the process module with the large delay. Intelligent entities should therefore, and did, learn to choose modules with smaller delays and wait times.

By selecting the process modules with smaller delay times and shorter queues, the intelligent entities could then minimize the total time through the model, the goal given to each. These entities saved an average of twelve simulation time units in both total time in queues and total time waiting, defined as time delayed and time queued, when compared to the times of both intelligent and typical entities. This shows that entities can make their own decisions in a dynamic simulation environment.

Future Work

If work on this project continued in the future, I would like to see the following items addressed for improvement:

1. A reliable means of synchronization. Perhaps IPv6 could be used, where a reliable multicast protocol currently exists, as opposed to the current IPv4 protocol, whose multicast only operates on a local area network.

2. Decide modules having more sophisticated means of selecting between output options. One such method could be based upon the queue lengths of possible output options. This would require an increase in overhead, as an increase in messages between Peons would result since the decide module now requires information that may or may not be stored locally. This would also require that expressions be stored and evaluated in some manner, most easily in a stack-based method.

3.
Increased support for expressions, allowing create modules to produce entities based upon sophisticated functions, such as exponential or logarithmic, or process modules to delay based upon these same options.

4. Entities to be sent in a more efficient manner, specifically via TCP rather than multicast. This would allow model sizes to grow, specifically with an increase in the number of decide modules, as each entity must keep the local information pertaining to the learning functionality, and this information overflows the current size of a UDP packet.

5. Dynamic clock speeds. This would allow the clock speed to be adjusted based upon the number of Peons, modules (specifically decide modules), shared resources, and the entity creation rate.

6. A graphical user interface. The current implementation uses a simple text-based user interface, which can be extremely clumsy and frustrating, especially when dealing with large models.

7. Support for submodels. This would allow for the creation of larger models from small, already existing models, thus decreasing the development time for the user (Ryde 1). This also allows for inter-enterprise simulation; businesses may then simulate the results of possible partnerships upon the global market and how each business would be affected.

8. Porting the simulation application to run on J2ME, thus allowing for the use of the computational power of embedded processors along with easier networking and setup. The growth in the number of traditional computers manufactured each year pales in comparison to the number of embedded systems produced, resulting in a lower cost for an embedded system, and thus the ability to produce a distributed simulation environment at a lower cost (Bruzzone 690).
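Item 4 above, sending entities over TCP, could rely on simple length-prefixed framing so that an entity of any size survives transport intact. This is a hypothetical sketch of that idea, not part of the implemented system; the entity here is a stand-in dictionary:

```python
import pickle
import struct

def frame(entity):
    """Serialize an entity and prepend its payload length as 4 big-endian bytes."""
    payload = pickle.dumps(entity)
    return struct.pack("!I", len(payload)) + payload

def unframe(data):
    """Recover the entity from a framed byte string."""
    (length,) = struct.unpack("!I", data[:4])
    return pickle.loads(data[4:4 + length])

# A payload with many perceptron weights, larger than a small fixed datagram
# could comfortably carry, round-trips without loss over a byte stream.
entity = {"creation_time": 0, "weights": [0.5] * 200}
roundtrip = unframe(frame(entity))
```

The receiver reads the 4-byte length first, then reads exactly that many bytes from the stream, so the UDP datagram-size limit no longer applies.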
Conclusion

Using distributed simulation and machine learning techniques, I was able to show an increased efficiency in the simulation of small models, along with the ability of intelligent entities to reduce their total time through a model based upon their own decisions. This was only functional with small models, having two or fewer decide modules, as it was not possible to simulate models with a larger number of decide modules due to the constraints of the system. Emergent behavior can also be extremely difficult to identify, especially without a graphical user interface. Taking into account my suggestions for future work, this simulation system could be improved and prove extremely viable for large-scale models.

Sources

Blair, Eric; Wieland, Frederick; Zukas, Tony. (1995) Parallel Discrete-Event Simulation (PDES): A Case Study in Design, Development, and Performance using SPEEDES. McLean, VA. IEEE

Bruzzone, Agostino; Fujimoto, Richard; Gan, Boon Ping; Paul, Ray J.; Straßburger, Steffen; Taylor, Simon J.E. (2002) Distributed Simulation and Industry: Potentials and Pitfalls. Uxbridge, UK

Ferscha, A. (1995) Probabilistic Adaptive Direct Optimism Control in Time Warp. Vienna, Austria. IEEE

Ferscha, A. (1995) Parallel and Distributed Simulation of Discrete Event Systems. Vienna, Austria. McGraw-Hill

Fujimoto, Richard M. (2000) Parallel and Distributed Simulation Systems. New York, New York. John Wiley & Sons

Law, Averill M.; Kelton, W. David (1982) Simulation Modeling and Analysis. Boston, MA. McGraw-Hill

Logan, Brian; Theodoropoulos, Georgios (1999) A Framework for the Distributed Simulation of Agent-based Systems. Birmingham, UK. European Simulation Multiconference

Mitchell, Tom M. (1997) Machine Learning. Boston, MA. McGraw-Hill

Ryde, Michael D.; Taylor, Simon J.E. (2002) Issues Using COTS Simulation Software Packages for the Interoperation of Models. Uxbridge, UK.

Voinea, Camelia F. (2001) Attitude Learning in Autonomous Agents. Bucharest, Romania.
2nd Workshop on Agent-based Simulation; Passau, Germany