Demo for AAMAS-2012

GaTAC: A Scalable and Realistic Testbed for Multiagent Decision Making

Ekhlas Sonu, Prashant Doshi
Dept. of Computer Science, University of Georgia
Athens, GA, 30602, USA
[sonu,pdoshi]@cs.uga.edu

Objective

To design and implement a realistic testbed for evaluating the performance of decision-making algorithms in a problem domain that is:
- Relevant in cooperative, competitive and mixed settings, i.e. across different frameworks such as Dec-POMDP, I-POMDP, etc.
- Scalable in problem size
  - Number of physical states
- Flexible in agent capabilities
  - Number and type of actions and observations
- Extensible in number of agents and adaptable to agent types

Motivation

- Recently, there has been substantial development in multi-agent decision-making algorithms, which has driven researchers to go beyond traditional toy problem domains such as the Tiger Problem, the Machine Maintenance Problem, Grid Meeting, etc.
- Some larger problem domains include Cooperative Box-Pushing, Mars Rover, etc.
  - Applied in cooperative settings

A Desirable Problem Domain

A desirable problem domain for multi-agent decision making must be:
- Scalable in physical states
- Flexible in agent capabilities (actions and observations)
- Extensible in number of agents
- Relevant to cooperative, competitive and mixed settings
- Able to produce solutions rich in structure
- Realistic, with popular appeal

Proposed Scenario: Autonomous Unmanned Aerial Vehicles

Applications:
- Law enforcement [Murphy, Cycon; 1998]
- Fighting forest fires [Casbeer et al.; 2005]
- Border surveillance [Haddal, Gertler; 2010]
- Wartime reconnaissance
Uncertainty in AUAVs is due to:
- Uncertainty about the physical state
- Noisy actuators and sensors
Added complexity:
- Presence of other agents
  - May be cooperative or competitive
Related research:
- Focuses on formulating flight trajectories [R. Bernard et al., 2002, 2003; S.M. Li et al., 2002]

An example decision making scenario with AUAVs

- We propose a problem domain involving Autonomous Uninhabited Aerial Vehicles
- The operating theatre may be divided into sectors (as is common practice) and represented as a grid of a predetermined size
- An example UAV recon problem may involve a UAV (I) (or a team of UAVs) trying to apprehend a target (T) (or a team of moving targets) while another team of UAVs (J) tries to help the target(s) escape to a safe house (S.H.)
- Of course, the exact problem description is flexible

GaTAC: Overview

Georgia Testbed for Autonomous Control of vehicles (GaTAC): a computer simulation framework for evaluating solutions to a UAV reconnaissance problem. It provides:
- Hyper-realistic 3D rendering of AUAVs acting in a real-world scenario
- Scalability in problem size and number of agents
- Flexibility in designing the actions and observations of each agent
Input: Agent control functions (policies) for all agents, generated by any (multi-agent) decision-making algorithm
Output: Simulation of the policies on a flight simulator
- Results of simulations may be compared for policies generated by different algorithms using metrics such as number of captures, cumulative reward, etc.

GaTAC: Workflow

- We begin with a formal description of any UAV decision-making problem
  - Formulate the problem as a .dpomdp/.ipomdp file
  - Configure GaTAC for simulation (i.e. set up the environment)
- Solve the .dpomdp/.ipomdp file using the algorithm of choice
  - Obtain policies
  - Policies for each agent are fed to GaTAC to be simulated and evaluated
- Simulate the policies and evaluate the results using metrics such as number of successes, cumulative rewards, etc.
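The evaluation step at the end of the workflow can be illustrated with a small sketch. It assumes each simulated run is logged as a capture flag plus a list of per-step rewards; the record type `EpisodeResult`, the functions `summarize` and `compare`, and all numbers below are hypothetical and are not part of GaTAC.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EpisodeResult:
    """Outcome of one simulated run (hypothetical record, not a GaTAC type)."""
    captured: bool          # True if the target was apprehended in this run
    rewards: list[float]    # per-step rewards collected during the run


def summarize(results: list[EpisodeResult]) -> dict[str, float]:
    """Aggregate the two metrics named in the slides over a batch of runs."""
    return {
        "captures": sum(r.captured for r in results),
        "mean_cumulative_reward": mean(sum(r.rewards) for r in results),
    }


def compare(runs_by_algorithm: dict[str, list[EpisodeResult]]) -> str:
    """Return the algorithm whose policies earn the highest mean cumulative reward."""
    summaries = {name: summarize(runs) for name, runs in runs_by_algorithm.items()}
    return max(summaries, key=lambda name: summaries[name]["mean_cumulative_reward"])


# Made-up runs for two unnamed algorithms, purely for illustration.
runs = {
    "A": [EpisodeResult(True, [-1, -1, 10]), EpisodeResult(False, [-1, -1, -1])],
    "B": [EpisodeResult(True, [-1, 10]), EpisodeResult(True, [-1, -1, -1, 10])],
}
summary_a = summarize(runs["A"])
best = compare(runs)
```

Ranking by mean cumulative reward is only one possible choice; the capture count could be used instead, or reported alongside it.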
GaTAC Components

Each instance of GaTAC has three components:
- Flight Simulator
  - Off-the-shelf open-source flight simulator on which policies are simulated
  - One instance of the flight simulator for each agent
- Autonomous Control Module (ACM)
  - Controls each aircraft and makes it behave according to the policy on the flight simulator
- Communication Module
  - Sends aircraft behavior from the ACM to the flight simulator
  - Communicates with other agents (if required)
GaTAC instances may run on different machines
- Connected using the communication module

Architecture

[Figure: architecture diagram showing the Flight Simulator, Communication Module and Autonomous Control Module, with communication between agents]

Flight Simulator

FlightGear:
- Open-source (written in C++)
- Multi-platform
- Hyper-realistic 3D graphics
- 3D virtual map
- Flexible, with choices of
  - Multiple aircraft models
  - Locations to act as the operating environment
  - Weather conditions, time of day, etc.
- 6 DoF flight dynamics model
  - Simulates the effects of airflow on different parts of the aircraft

FlightGear in Operating Scenario

- FG uses realistic 3D scenery available from TerraGear
- Provides multiple views of the flying aircraft
  - Cockpit view, tail view, etc.
- Multiple instances of FG may be linked together through external servers: ideal for multi-agent settings

Autonomous Control Module

Used to algorithmically control the aircraft and make it behave according to the policy, with 3 levels of hierarchy:
- Agent Actions on Grid
  - Actions constructed using high-level actions to represent the actions of agents in the problem at hand
- High Level Actions: Takeoff, Fly-Straight, Turn, Change Altitude
  - Perform simple tasks that represent simple aircraft behaviors
- Low Level Actions: Control Rudder, Throttle, Aileron, Roll, Pitch, etc.
  - Perform low-level actions to control the aircraft by adjusting parameters along the 6 DoF

Communication Module

Establishes a communication channel between:
- The Autonomous Control Module and FlightGear
- Agents (if required, e.g. in team settings)
Communication channels use UDP, httpd and XML
- Communicate low-level flight control data from an instance of the Autonomous Control Module to the respective instance of FlightGear
- Communicate aircraft position to all other instances of GaTAC in real time (used to formulate observations)

Communication Module Functions

- Send control data from the ACM to FG
  - May adjust flight parameters such as thrust, rudder, aileron, altitude, etc.
- Receive the aircraft's flight dynamics in real time from FG and send them to the ACM for path correction
  - Position, aircraft orientation along the 6 DoF, flight speed, altitude, etc.
- May be used to pass messages between GaTAC instances (when communication between agents is required)

GaTAC Control Algorithm

- Start FlightGear
- Read policy from file
- Get observations / next action
  - Obtain the action to perform from the policy
  - The next action may be obtained from the policy using the observation
- Fly according to policy
  - The agent action is systematically broken down into high-level and then low-level actions to control the aircraft algorithmically
- Observation = Successful?
  - No: repeat until the termination condition is reached
  - Yes: Mission Accomplished

Conclusions

GaTAC:
- Can act as an open-source testbed for decision-theoretic agents
- May be used to compare different algorithms irrespective of the decision-making framework (Dec-POMDP, I-POMDP, MTDP, etc.)
- Is extensible: no upper bound on the size of the problem
  - Number of physical states, number of agents, number and types of actions and observations
- Facilitates deployment of decision-theoretic agents in hyper-realistic real-world settings (cooperative, competitive, or mixed)
- Is easily configurable for simulating any UAV problem
- Provides for communication between agents
- May be extended to include a choice of locations and aircraft

Demo
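The loop on the GaTAC Control Algorithm slide can be sketched as follows. This is a minimal illustration under stated assumptions, not GaTAC's implementation: the policy is assumed to be a small finite-state controller held in a dictionary, the `fly` callback stands in for the flight simulator, and every name here (`Policy`, `run_policy`, the action and observation strings) is hypothetical.

```python
from typing import Callable

# Assumed policy representation (not GaTAC's file format):
# node -> (action to perform, {observation: next node}).
Policy = dict[str, tuple[str, dict[str, str]]]


def run_policy(
    policy: Policy,
    start_node: str,
    fly: Callable[[str], str],
    max_steps: int = 100,
) -> tuple[bool, list[str]]:
    """Follow the policy until the observation is "success" or steps run out.

    `fly` stands in for the flight simulator: it executes one grid-level
    action and returns the resulting observation.
    """
    node = start_node
    actions_taken: list[str] = []
    for _ in range(max_steps):
        action, transitions = policy[node]   # obtain the action from the policy
        actions_taken.append(action)
        observation = fly(action)            # fly according to the policy
        if observation == "success":         # termination condition reached
            return True, actions_taken
        node = transitions[observation]      # next action depends on the observation
    return False, actions_taken


# Toy two-node policy and a scripted observation sequence, for illustration only.
toy_policy: Policy = {
    "n0": ("move_north", {"target_far": "n0", "target_near": "n1"}),
    "n1": ("move_east", {"target_far": "n0", "target_near": "n1"}),
}
scripted = iter(["target_far", "target_near", "success"])
accomplished, actions = run_policy(toy_policy, "n0", lambda action: next(scripted))
```

In a real run, `fly` would be where a grid-level action is broken down into high-level and then low-level aircraft controls; the sketch deliberately leaves that part out.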