Model-Based Rescue Robot Control with ECLiPSe Framework

Andrea Orlandini
Dipartimento di Informatica e Automazione, Università degli Studi Roma Tre
Via della Vasca Navale, 79 - 00144 - Roma, Italy
orlandin@dia.uniroma3.it

Abstract. In this work we describe a model-based approach to the execution and control of an autonomous rescue rover. We show how this control architecture naturally supports human-robot interaction in the diverse activities needed in search and rescue. We deploy high-level agent programming in Temporal Concurrent Golog (TCGolog), which provides both a declarative language (i.e. the Temporal Concurrent Situation Calculus) to represent the system properties and a planning engine to generate the control sequences. We appeal to the ECRC Common Logic Programming System ECLiPSe 5.7 to perform the reasoning in the temporal domain.

1 Introduction

Rescue competitions have been held since 2000 to foster robot autonomy in completely unknown and unsettled environments, and to promote the use of robots in high-risk areas, helping human rescue teams in the aftermath of disastrous events (early reports on rescue purposes are [1-3]). Indeed, in rescue competitions the task is not to prevent calamitous events but to support operators in rescuing people where human accessibility is limited or, most probably, interdicted. This kind of security task is crucial when the environment cannot be accessed by rescue operators and the aid of robots endowed with good perceptual abilities can help to save human lives. Autonomous robots have to accomplish such tasks in complete autonomy, that is, exploring and producing a map of the environment, recognizing victims via different perceptual skills, and correctly labelling the map with each victim's position and, possibly, status and conditions.
In this paper, we describe the DORO cognitive architecture, purposely designed for the autonomous exploration and victim-finding tasks required in rescue competitions, and we focus on how we exploit the ECLiPSe framework to implement its model-based executive controller. The role of a model-based monitoring system is to enhance the system's safety, flexibility, and pro-activity. In this approach, the monitoring system is endowed with a declarative representation of the causal and temporal properties of the controlled processes. Given this explicit model, executive control is provided by a reactive planning engine which harmonizes the mission goals, the reactive activity of the functional modules, and the operator's interventions. In this way, the execution state of the robot can be continuously compared with a declarative model of the system behaviour: the executive controller can track relevant parallel activities (at different levels of granularity), integrating them into a global view, and subtle resource and time constraint violations can be detected. The planning system compensates for these failures/misalignments by generating on-the-fly recovery sequences. Such features are designed and implemented by deploying a high-level agent programming paradigm. Following this approach, emerging behaviours are allowed only if they are consistent with the declarative model (e.g. they do not violate integrity constraints) and can be encapsulated into flexible behaviours (i.e. high-level programs) specifying the robot control knowledge. Basing our cognitive architecture on the Temporal Concurrent Situation Calculus-Golog approach, we exploit the constraint programming framework ECLiPSe to implement the executive controller and planning engine, and we use it to perform the reasoning in the temporal domain.

2 Rescue Scenario

The National Institute of Standards and Technology (NIST) has developed physical test scenarios for rescue competitions.
There are three NIST arenas, denoted yellow, orange, and red, of increasing difficulty. The yellow arena represents an indoor flat environment with minor structural damage (e.g. overturned furniture); the orange arena is multilevel and has more rubble (e.g. bricks); the red one represents a very damaged, unstructured environment: multilevel, with large holes, rubber tubing, etc. The arenas are accessible only by mobile robots controlled by one or more operators from a separate place. The main task is to locate as many victims as possible in the whole arena. The urban search and rescue arena competition is a very hard test bed for robots and their architectures. In fact, the operator-robot team has to coordinate several activities: explore and map the environment, avoid obstacles (bumping is severely penalized), localize itself, search for victims, correctly locate them on the map, identify them through a numbered tag, and finally describe their status and conditions. For each mission there is a time limit, to simulate the time pressure of a real rescue environment. In this contest, human-robot interaction has a direct impact on the effectiveness of the rescue team's performance. We implemented our architecture on our robotic platform (DORO) and participated in three RoboCup competitions: 2004 in Lisbon (Portugal), 2005 in Osaka (Japan), and 2006 in Bremen (Germany). In particular, in 2004 DORO was the third award winner. The main modules involved are: Map, managing the map construction and localization algorithm; Navigation, guiding the robot through the arena with exploration behaviour and obstacle avoidance procedures; Vision, used to automatically locate victims around the arena. In this context, [4] proposes a high-level task-sequence cycle as a reference for rescue system behaviour: Localize, Observe general surroundings, look specially for Victims, Report (LOVR).
Our interpretation of this cycle corresponds to the following task sequence: map construction, visual observation, vision process execution, and victim presence report.

3 Control Architecture

Several activities need to be coordinated and controlled during a mission. A model of execution is thus a formal framework allowing for a consistent description of the correct timing of any kind of behaviour the system has to perform to successfully conclude a mission. However, as the domain is uncertain, the result of any action can be unexpected, and the time and resources needed cannot be rigidly scheduled. Thus, it is necessary to account for flexible behaviours, which means managing dynamic changes of time and resource allocation. In this section we describe the model underlying the flexible behaviours approach, mentioning the coordination and control of processes that are described in more detail in the next sections, to give implementation concreteness to our formal framework. A model-based executive control system [5, 6] supervises and integrates both the robot modules' activities and the operator's interventions. Following this approach, the main robot and operator processes (e.g. mapping, laser scanning, navigation, etc.) are explicitly represented by a declarative temporal model (see Section 4) which permits a global interpretation of the execution context. Given this model, a reactive planner can monitor the system status and generate the control on the fly, continuously performing sense-plan-act cycles. At each cycle the reactive planner is to generate the robot activities up to a planning horizon and monitor the consistency of the running activities (w.r.t. the declarative model), managing failures. The short-range planning activity can balance reactivity and goal-oriented behaviour: short-term goals/tasks and external/internal events can be combined while the reactive planner tries to solve conflicts.
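As an intuition for the short-horizon sense-plan-act loop just described, the following toy sketch replans a few steps ahead at every cycle, executes only the first action, and then re-senses. All names (`plan`, `run`, `HORIZON`, the three-task model) are illustrative assumptions, not DORO's actual controller.

```python
# Toy sense-plan-act loop: plan up to a short horizon, execute the
# first action only, then replan from the newly observed state.
HORIZON = 3  # keep the horizon short, to stay reactive

def plan(state, horizon):
    """Toy planner over a fixed task order: map, then wander, then report."""
    order = ["map", "wander", "report"]
    start = order.index(state) + 1 if state in order[:-1] else 0
    return order[start:start + horizon]

def run(cycles):
    state, trace = "idle", []
    for _ in range(cycles):
        actions = plan(state, HORIZON)  # planning phase
        if not actions:
            break
        state = actions[0]              # execute first action only,
        trace.append(state)             # then re-sense on the next cycle
    return trace
```

A real instance would also check, at each re-sensing step, that the previous action had the modeled effect, aborting the plan otherwise.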
In this setting, the human operator can also interact with the control system, influencing the planning activity in a mixed-initiative manner (analogously to [7]). Figure 1 illustrates the overall control architecture designed for DORO. The physical layer is composed of all the robot devices, i.e. motors, sonars, and payload. In particular, the DORO payload consists of two stereo cameras, one laser telemeter, two microphones, and a pan-tilt unit. The robot's embedded components are accessible through the software provided by the vendor (the ActivMedia Robotics Interface Application (ARIA) libraries), while the payload software is custom. The functional level collects all the robot's basic capabilities. The functional modules provide services and information to the upper levels. In this sense, a functional module performs some actions in order to accomplish basic tasks, e.g. collecting sensor measures, acquiring images from the arena, and similar activities. In particular, the DORO system is endowed with the following modules: Navigation, Acquisitions, Joypad, Laser, and PTU. The Navigation module controls the robot movements. Elementary reactive behaviours are provided by this component, e.g. obstacle avoidance, wandering, and moving toward a target location. The Acquisitions module manages the procedures needed to collect data from sonars, laser, and cameras and presents them to the other modules. The Joypad is the functional interface between the robotic system and the human operator, e.g. it allows device teleoperation and direct motion commands. The Laser and PTU modules control the corresponding physical devices (telemeter and pan-tilt unit) by means of atomic actions. Decision daemons are responsible for the overall behaviour; they are special modules collecting data from the functional layer and providing interpretations and some (mostly low-level) decisions.
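The split between purely reactive functional modules and pro-active decision daemons can be rendered schematically as follows. This is a hedged sketch with hypothetical class names (`FunctionalModule`, `DecisionDaemon`, `SLAMDaemon`), not the actual DORO code.

```python
# Sketch of the two module kinds: functional modules only answer
# requests from upper layers, while decision daemons may also
# volunteer suggestions to the executive layer.

class FunctionalModule:
    """Purely reactive: acts only when commanded by an upper layer."""
    def __init__(self, name):
        self.name, self.status = name, "idle"

    def execute(self, command):
        self.status = command
        return self.status

class DecisionDaemon(FunctionalModule):
    """Pro-active: can additionally suggest actions/initiatives."""
    def suggest(self):
        return f"{self.name}: no suggestion"

class SLAMDaemon(DecisionDaemon):
    def suggest(self):
        # e.g. propose an interesting observation point on the map
        return f"{self.name}: observation point at (2, 3)"
```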
Differently from the functional modules, which are reactive components totally controlled by the upper layers, the decision daemons provide not only passive services but also pro-active behaviour, taking decisions/initiatives, suggesting actions, and polarizing the activity of the system. Moreover, the modules at this stage can directly interact among themselves, bypassing the executive control and providing a behaviour-based control layer. In particular, the decision daemons illustrated in Figure 1 are responsible for the following activities. The SLAM module builds a map of the explored area, collecting the sensor readings and continuously localizing the robot position. Furthermore, some suggestions are provided to support the exploration decisions, e.g. interesting observation points, topologically salient regions of the map, and interesting and practicable itineraries. The Exploration module is responsible for defining the exploration strategy, taking into account the information passed by the other modules. Visual processing is performed by the Stereopsis and Pattern Classification modules. Stereopsis provides the processes needed for stereo vision (e.g. image stereo-correlation and 3D maps) and cooperates with the Pattern Classification task to detect salient regions of the visual images. Pattern Classification deploys several methods to perform victim detection and identification. The state manager gets from each single daemon its current status, so that it is possible to query the state manager about the status of any daemon. The state manager updates its information every 200 msec. The task dispatcher sends task activation signals to the daemons (e.g. start map) upon receiving requests from the planner. The overall computational cycle works as follows: the planner gets the daemons' status by querying the state manager. Once the state manager provides the execution context, the planner is to produce a plan of actions (planning phase, about 0.5 sec.)
and yields the first set of commands to the task dispatcher. In the execution phase (about 0.5 sec.), each daemon reads the signals and starts its task, modifying its state and using the functional modules' data. When the next cycle starts, the planner reads the updated status through the state manager and checks whether the tasks were correctly delivered. If the status is not updated as expected, a failure is detected, the current plan is aborted, and a suitable recovery procedure is called.

Fig. 1. Control architecture (executive layer: task library, system model, reactive planner, state manager, task dispatcher; decision daemons: SLAM, Exploration, Stereopsis, Pattern Classification; functional layer: Navigation, Acquisitions, JoyPad, Laser, PTU, ARIA; physical layer: motors, sonars, encoders, laser, pan-tilt, CCD cameras, microphones)

4 Model-Based Monitoring with the ECLiPSe Framework

A model-based monitoring system is to enhance both the system's safety and the operator's situation awareness. Given a declarative representation of the system's causal and temporal properties, flexible executive control is provided by a reactive planning engine which is to harmonize the operator's activity (commands, tasks, etc.) with the mission goals and the reactive activity of the functional modules. Since the execution state of the robot is continuously compared with a declarative model of the system, all the main parallel activities are integrated into a global view and subtle resource and time constraint violations can be detected. In this case the planner can also start or suggest recovery procedures that the operator can modify, neglect, or follow. In order to obtain such features we deploy high-level agent programming in Temporal Concurrent Golog (TCGolog) [8], which provides both a declarative language (i.e. the Temporal Concurrent Situation Calculus [9-11]) to represent the system properties and a planning engine to generate the control sequences.
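The core monitoring idea, comparing the observed execution status against the effects the declarative model predicts for each dispatched task, can be sketched minimally as follows. The task and status names are illustrative assumptions.

```python
# Toy model-based monitor: each dispatched task has an expected effect
# on the corresponding process status; any divergence is flagged as a
# failure, which would trigger plan abortion and a recovery task.

EXPECTED_EFFECTS = {
    "start_map": "mapping",
    "start_wander": "wandering",
}

def detect_failures(dispatched, observed):
    """Return the dispatched tasks whose observed status diverged
    from the status predicted by the model."""
    return [task for task in dispatched
            if observed.get(task) != EXPECTED_EFFECTS.get(task)]
```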
Temporal Concurrent Situation Calculus. The Situation Calculus (SC) [12] is a sorted first-order language representing dynamic domains by means of actions, situations (i.e. sequences of actions), and fluents (i.e. situation-dependent properties). The Temporal Concurrent Situation Calculus (TCSC) extends the SC with time and concurrent actions. In this framework, concurrent durative processes [9-11] can be represented by fluent properties started and ended by durationless actions. For example, the process going(p1, p2) is started by the action startGo(p1, t) and ended by endGo(p2, t′).

Declarative Model in TCSC. The main DORO processes and states are explicitly represented by a declarative dynamic-temporal model specified in the TCSC. This model represents the cause-effect relations and the temporal constraints among the activities: the system is modeled as a set of components whose state changes over time. Each component (including the operator's operations) is a concurrent thread, describing its history over time as a sequence of states and activities. For example, in the rescue domain some components are: pan-tilt, slam, navigation, visualPerception, etc. Each of these is associated with a set of processes, e.g. navigation can be nav_Wand (wandering the arena), nav_GoTo (navigating to reach a given position), or nav_Stop (robot stopped); pan-tilt can be ptIdle(x) (idling in position x), ptPoint(x) (moving toward x), or ptScan(x) (scanning x). The history of states for a component over a period of time is a timeline. E.g., Figure 2 illustrates the evolution of navigation, camera, and pan-tilt up to a planning horizon.

Fig. 2. Timelines evolution

Hard time constraints among the activities can be defined by a temporal model using Allen-like temporal relations, e.g.: ptPoint(x) precedes ptScan(x), ptScan(x) during nav_Stop, etc.

Temporal Concurrent Golog.
Golog is a situation calculus-based programming language which allows one to specify procedural scripts composed of the primitive actions explicitly represented in a SC action theory. This hybrid framework integrates both procedural programming and reasoning about the domain properties. Golog programs are defined by means of standard (and not-so-standard) Algol-like control constructs: i. action sequence: p1; p2, ii. test: φ?, iii. nondeterministic action choice: p1|p2, iv. conditionals, while loops, and procedure calls. Temporal Concurrent Golog (TCGolog) is the Golog version suitable for durative and parallel actions; it is based on the TCSC and allows parallel action execution: a‖b. An example of a TCGolog procedure is:

proc(observe(x),
  while (nvStop ∧ ¬obs(x)) do
    π(t1, ?start(t1) :
      [ if (ptIdle(0)) do π(t2, startPoint(x, t2) : ?(t2 − t1 < 3))
      | if (ptIdle(x)) do π(t3, startScan(x, t3) : ?(t3 − t1 < 5)) ])).

Here the nondeterministic choice between startPoint and startScan is left to the Golog interpreter, which has to decide depending on the execution context. Note also that time constraints can be encoded within the procedure itself. In this case the procedure definition leaves few nondeterministic choices to the interpreter. More generally, a Golog script can range from a completely defined procedural program to an abstract general-purpose planning algorithm like the following:

proc(plan(n), true? | π(a, (primitive_action(a))? : a : plan(n − 1)))

The semantics of a Golog program δ is a situation calculus formula Do(δ, s, s′) meaning that s′ is a possible situation reached once δ is executed from the situation s. For example, the meaning of executing a|b is captured by the logical definition Do(a|b, s, s′) ≡ Do(a, s, s′) ∨ Do(b, s, s′).

Flexible behaviours. Our monitoring system is based on a library of Temporal Concurrent Golog scripts representing a set of flexible behaviour fragments.
Each of these is associated with a task and can be selected if it is compatible with the execution context. For example, a possible behaviour fragment can be written as follows:

proc(explore(d),
  [ π(t1, startMap(t1)) ‖
    π(t2, startWand(t2) :
      π(t3, endWand(t3) :
        π(x, startGoto(x, t3)) : ?(t3 − t2 < d))) ]).

This TCGolog script is associated with the exploration task: it starts both the mapping and wandering activities; the wandering phase has a timeout d, after which the rover has to go somewhere. This timeout d is provided by the calling process, which can be either another Golog procedure or an operator decision.

Reactive Planner/Interpreter. As illustrated before, at each execution cycle, once the status is updated (sensing phase), the Golog interpreter (planning phase) is called to extend the current control sequence up to the planning horizon. When some task ends or fails, new tasks are selected from the task library and compiled into flexible temporal plans filling the timelines. Under nominal control, the robot's activities are scheduled according to a closed loop similar to the LOVR (Localize, Observe general surroundings, look specially for Victims, Report) sequence in [4]. Some of these activities can require the operator's initiative, which is always allowed.

Failure detection and management. Any system malfunctioning or bad behaviour can be detected by the reactive planner (i.e. the Golog interpreter) when world inconsistencies have to be handled. In this case, after an idle cycle, a recovery task has to be selected and compiled w.r.t. the new execution status. For each component we have classified a set of relevant failures and appropriate flexible (high-level) recovery behaviours. For example, in the visual model, if the scanning process fails because of a timeout, in the recovery task the pan-tilt unit must be reset taking into account the constraints imposed by the current system status. This can be defined by a very abstract Golog procedure, e.g.
proc(planToPtuInit,
  π(t, ?time(t) : plan(2) :
    π(t1, ptIdle(0) : ?time(t1) : ?(t1 − t < 3)))).

In this case, the Golog interpreter is to find a way to compile this procedure, getting the pan-tilt idle within two steps and less than three seconds. The planner/Golog interpreter can itself fail in its plan generation task, in which case we have a planner timeout. Since the reactive planner is the engine of our control architecture, this failure is critical. We identified three classes of recoveries depending on the priority level of the execution. If the priority is high, a safe mode has to be immediately reached by means of fast reactive procedures (e.g. goToStandBy). At medium priority, some extra time for planning can be obtained by interleaving planning and execution: a greedy action is executed so that the interpreter can use the next time slot to end its work. In the case of low priority, the failure is handled by replanning: a new task is selected and compiled. At medium and low priority the operator can be explicitly involved in the decision process in a synchronous way. During a high-priority recovery there is no mixed initiative: if the operator wants to take care of it, the monitoring system is bypassed.

ECLiPSe Implementation. We provide a constraint logic programming (CLP) [13] implementation of the TCGolog-based control system for the rescue domain. Since in this setting the TCGolog interpreter is to generate flexible temporal plans, it must be endowed with a constraint problem solver. Analogously to [11], we rely on a logic programming language with a built-in solver for linear constraints over the reals (CLP(R)). In this setting, the logical formulas allowed for the definition of predicates are restricted to Horn clauses of the form A ← c1, ..., cm | A1, ..., An, where the ci are constraints and the Aj are atoms. Namely, we appeal to the ECRC Common Logic Programming System ECLiPSe 5.7.
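As a rough intuition for what the CLP(R) solver contributes here, the following toy sketch brute-forces integer time points satisfying linear temporal constraints of the kind used in our procedures (e.g. T1 ≤ T2 and T2 − T1 < 3). The real solver handles linear constraints over the reals symbolically; this enumeration, and all names in it, are only an illustration.

```python
from itertools import product

# Brute-force stand-in for a temporal constraint solver: enumerate all
# integer assignments to (T1, ..., Tn) within a horizon that satisfy
# every given linear constraint.

def solve(constraints, horizon, nvars):
    return [ts for ts in product(range(horizon), repeat=nvars)
            if all(c(ts) for c in constraints)]

# e.g. T1 <= T2 and T2 - T1 < 3, as in the pan-tilt procedures:
cs = [lambda ts: ts[0] <= ts[1],
      lambda ts: ts[1] - ts[0] < 3]
```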
In this way, our planner and domain axioms make use of linear temporal relations like 2*T1 + T2 = 5 and 3*T2 − 5 ≤ 2*T3, and we rely on ECLiPSe (http://eclipse.crosscoreop.com/) to perform the reasoning in the temporal domain. The relations managed by the ECLiPSe built-in constraint solver are prefixed by #; for example, a temporal constraint represented in the Golog interpreter is:

do(C : A, S, S1) :-
    concurrent_action(C),
    poss(C, S),
    start(S, T1), time(C, T2),
    T1 #=< T2,
    do(A, do(C, S), S1).

Other temporal constraints are expressed in the action preconditions; for example, considering the pan-tilt processes:

poss(pt_pos_start(X, T), S) :-
    pt_idle(X, T1, S), T1 #< T,
    start(S, T2), T2 #>= T,
    nv_stop(T11, S), T11 #< T.

An example of the successor state axioms is the following:

pt_idle(X, T1, do(C, S)) :-
    pt_idle(X, T1, S),
    not member(pt_pos_start(_, T2), S);
    member(pt_pos_end(X, T1), S).

Given the model specification introduced above, for each timeline it is possible to specify a control procedure:

proc(pt_go(X),
    pi(t1, [pt_pos_start(X, t1)] :
    pi(t2, [pt_pos_end(X, t2)]))).

Once the flexible temporal plan is compiled, it can be executed. We assume an execution monitor cycleExec which sends and receives commands at each time tick, so that constraints can be solved and/or propagated step by step. A dummy implementation of cycleExec is shown below:

planExec :-
    do(c-plan, s0, S), !,
    cycleExec(1, s0, S, S1).

cycleExec(T, S0, S0, S0) :- !.
cycleExec(T, S0, S, S1) :-
    checkMsg(T),
    exec(T, S0, S, S1),
    checkMsg(T),
    T1 is T + 1, !,
    cycleExec(T1, S1, S, S2).

5 Conclusion

We presented a model-based approach to the execution and control of a rescue robotic system. We also showed how this model supports human-robot interaction during rescue competitions. In fact, the system improves the operator's situation awareness, providing a better perception of the mission status.
We briefly detailed the system's model in order to highlight the support given to the rescue operator, and we reported some positive results obtained in several RoboCup real rescue competitions since 2004. We implemented our architecture on our robotic platform (DORO) and showed how we exploited the ECRC Common Logic Programming System ECLiPSe 5.7 to perform the reasoning in the temporal domain.

References

1. Tadokoro, S., Kitano, H., Takahashi, T., Noda, I., Matsubara, H., Shinjoh, A., Koto, T., Takeuchi, I., Takahashi, H., Matsuno, F., Hatayama, M., Nobe, J., Shimada, S.: The RoboCup-Rescue project: A robotic approach to the disaster mitigation problem. In: ICRA-2000. (2000) 4089-4095
2. Jacoff, A., Messina, E., Evans, J.: A reference test course for urban search and rescue robots. In: FLAIRS Conference 2001. (2001) 499-503
3. Maxwell, B.A., Smart, W.D., Jacoff, A., Casper, J., Weiss, B., Scholtz, J., Yanco, H.A., Micire, M., Stroupe, A.W., Stormont, D.P., Lauwers, T.: 2003 AAAI robot competition and exhibition. AI Magazine 25(2) (2004) 68-80
4. Murphy, R.: Human-robot interaction in rescue robotics. IEEE Transactions on Systems, Man and Cybernetics, Part C 34(2) (2004) 138-153
5. Muscettola, N., Dorais, G.A., Fry, C., Levinson, R., Plaunt, C.: IDEA: Planning at the core of autonomous reactive agents. In: Proc. of NASA Workshop on Planning and Scheduling for Space. (2002)
6. Williams, B., Ingham, M., Chung, S., Elliott, P., Hofbaur, M., Sullivan, G.: Model-based programming of fault-aware systems. AI Magazine (2003)
7. Ai-Chang, M., Bresina, J., Charest, L., Chase, A., Hsu, J.J., Jonsson, A., Kanefsky, B., Morris, P., Rajan, K., Yglesias, J., Chafin, B., Dias, W., Maldague, P.: MAPGEN: Mixed-initiative planning and scheduling for the Mars Exploration Rover mission. IEEE Intelligent Systems 19(1) (2004) 8-12
8. Reiter, R.: Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. MIT Press (2001)
9. Pinto, J., Reiter, R.: Reasoning about time in the situation calculus. Annals of Mathematics and Artificial Intelligence 14(2-4) (1995) 251-268
10. Reiter, R.: Natural actions, concurrency and continuous time in the situation calculus. In: Proceedings of KR'96. (1996) 2-13
11. Pirri, F., Reiter, R.: Planning with natural actions in the situation calculus. Logic-based Artificial Intelligence (2000) 213-231
12. McCarthy, J.: Situations, actions and causal laws. Technical report, Stanford University (1963). Reprinted in Semantic Information Processing (M. Minsky, ed.), MIT Press, Cambridge, Mass., 1968, pp. 410-417
13. Jaffar, J., Maher, M.J.: Constraint logic programming: A survey. Journal of Logic Programming 19/20 (1994) 503-581