Massachusetts Institute of Technology
16.412J/6.834J Cognitive Robotics
Project Proposal

Continuously Planning for Autonomous Navigation Using Conflict-Directed A* to Generate Temporally Flexible State Plans

Lawrence Bush, Brian Bairstow and Tony Jimenez
April 11, 2005

Introduction:

Problem Statement:

Autonomous control of systems is an important topic, as autonomous systems can perform tasks that are dangerous, monotonous, or even impossible for humans. For example, Unmanned Aerial Vehicles (UAVs) can perform tasks such as reconnaissance, fire-fighting, and Mars exploration. The issues include the planning problem of where to go and the control problem of how to actuate that movement.

Bradley Hasegawa [1] posed the planning problem as a Selective Traveling Salesman Problem (S-TSP). Each point of interest (for example, a science site) is assigned a value, and each edge connecting points is assigned a cost. Then the most valuable feasible ordered set of waypoints is chosen. Thomas Leaute [2] worked on the other end of the problem, starting from a plan and running the control system in a simulator. We hope to integrate the two techniques to obtain an autonomous planner and controller that can take a list of sites, decide on a plan, and then control the system to execute that plan. This will be a complete system that handles high- and low-level planning to achieve a goal.

Our project addresses the design of an autonomous exploratory planner for a UAV. This involves extending a continuous observation planning system [1] with an improved algorithm and merging this system with a kinodynamic path planner [2].

Previous Work:

Our project is premised upon three main bodies of work, namely [1], [2] and [3]. The first step of our project involves integrating [1] and [3].

Continuous Observation Planning for Autonomous Exploration [1]

The thesis [1] presents a new approach for solving a robotic navigation path-planning problem. The approach first formulates the problem as a Selective Traveling Salesman Problem (S-TSP), then converts it to an optimal constraint satisfaction problem and solves it using the Constraint-Based A* algorithm. The Solver, shown in the system architecture diagram in Figure 1, performs this key function.

Figure 1: System architecture of [1]. The navigation architecture starts with a partially complete map. Candidates and obstacles are extracted from the map and used to construct a visibility graph. A D* search is used to update the candidates. The candidates are passed to the solver, which creates a plan (an ordered candidate subset).

The solver is a continuous observation planner, which updates the plan when new observations affect the candidate set (the possible places to visit). The objective of the robot is to map its environment, so the robot navigates to the observation locations that maximize information gain. Each observation may affect the utility and cost of unvisited observation locations (candidates), which necessitates re-planning. There is an implicit trade-off between the planning horizon and how often the candidates are updated: the planning horizon should mirror the expected time period between re-plans. In other words, if we look ahead 5 tasks, we want to be able to execute those 5 tasks before we have to re-plan. If this does not occur, then our plan is optimized for a different planning period than the one it is executed over, which results in sub-optimal planning. Ultimately, the system is making an exploration-exploitation trade-off, which can be generalized to other tasks, provided those tasks involve observations that update the candidates' utilities and costs.
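To make this horizon/re-planning interaction concrete, the sketch below shows a minimal continuous finite-horizon planning loop. Everything in it is an illustrative stand-in: the Candidate fields, the utility-to-cost ranking in planHorizon, and the observationsInvalidatePlan trigger are hypothetical placeholders for the S-TSP/OCSP solver and the map-observation machinery of [1].

```cpp
// Minimal sketch of a continuous finite-horizon planning loop.
// All names and the ranking heuristic are hypothetical stand-ins for the solver of [1].
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Candidate {
    std::string id;
    double utility;  // estimated information gain at this observation location
    double cost;     // estimated path cost to reach it from the current pose
};

// Stand-in for the S-TSP/OCSP solver: keep the `horizon` best candidates by utility/cost.
std::vector<Candidate> planHorizon(std::vector<Candidate> candidates, std::size_t horizon) {
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate& a, const Candidate& b) {
                  return a.utility / a.cost > b.utility / b.cost;
              });
    if (candidates.size() > horizon) candidates.resize(horizon);
    return candidates;
}

// Stub: in the real system this would compare re-estimated candidate utilities/costs
// (after each new observation) against those the current plan was computed with.
bool observationsInvalidatePlan(const std::vector<Candidate>&) { return false; }

int main() {
    std::vector<Candidate> candidates = {
        {"site_A", 8.0, 2.0}, {"site_B", 5.0, 1.0}, {"site_C", 9.0, 6.0},
        {"site_D", 3.0, 1.5}, {"site_E", 7.0, 4.0}, {"site_F", 2.0, 0.5}};

    const std::size_t horizon = 3;  // look ahead 3 tasks between re-plans
    while (!candidates.empty()) {
        const std::vector<Candidate> plan = planHorizon(candidates, horizon);
        for (const Candidate& task : plan) {
            std::cout << "executing observation at " << task.id << "\n";
            // Remove the executed candidate from the open candidate list.
            candidates.erase(std::remove_if(candidates.begin(), candidates.end(),
                                            [&](const Candidate& c) { return c.id == task.id; }),
                             candidates.end());
            // New observations may change the remaining candidates; if so, abandon the
            // rest of the horizon and re-plan immediately.
            if (observationsInvalidatePlan(candidates)) break;
        }
    }
    return 0;
}
```

The point of the sketch is the outer loop: the horizon length should match how long a plan typically survives before observationsInvalidatePlan fires, which is exactly the trade-off described above.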
This method is likely to be effective when we have (at a minimum) a large-scale prior map of the exploration region. The thesis [1] addressed a mapping application in which the candidates frequently changed due to new observations. The finite-horizon technique is more effective when the candidates do not change frequently, yet the mapping application actually favors the observation candidates that increase its situational knowledge the most. For these reasons, the finite-horizon method is more effective when a high-level map is already known. The attributes of continuous finite-horizon planning therefore lend themselves to exploratory missions with a specific objective (e.g., a science exploration application) where a prior map is known. Refining the map will affect the cost estimates for the science tasks, and the utilities of the science tasks may change as prior successes affect the probability of future successes. This necessitates continuous planning. However, the changes should be sufficiently infrequent that a finite horizon is more effective than a purely greedy candidate selection strategy.

Key elements of the framework presented in [1] are shown in Table 1.

Table 1: Key attributes of the continuous observation-planning framework.
  Exploration Problem:   Explore and construct a map of an environment
  Exploration Method:    Feature based (Newman, Boss, and Leonard)
  Assumption:            The robot knows the large-scale environment structure
  Path Cost:             Path length (physical distance)
  Path Planner:          Visibility path planner: F(map, candidates, pose)
  Map Type:              Feature-based SLAM map
  Pose:                  Robot position and heading
  Candidate:             An observation point bordering an unexplored area
  Candidate Utility:     An estimate of the observable unexplored area
  Candidate Dynamics:    How candidates change as the robot explores; an open area of statistical learning research

Coordinating Agile Systems Through the Model-based Execution of Temporal Plans [2]

This work provides a novel model-based execution approach for temporally flexible state plans, for the purpose of UAV navigation. Its kinodynamic controller operates as a continuous planning framework; the high-level planner, however, does not. Our integration would enable this work to perform continuous high-level planning, and the scope of our project includes enabling the simulation framework to accept TFSP updates. In particular, new observations would be taken into account when creating the temporally flexible state plan. Continuous high-level planning would allow the UAV to adapt its high-level goals to the observed environment; specifically, the UAV adapts to the additional information it learns about its environment (e.g., a more accurate map, or the effect that completing one task has on the utility of future tasks). This integration would therefore provide a system that observes, learns, and updates its higher-level planning goals.

Conflict-directed A* and Its Role in Model-based Embedded Systems [3]

This paper [3] introduces a method for solving optimal constraint satisfaction problems. An Optimal Constraint Satisfaction Problem (OCSP) is the problem of finding an assignment of values to variables that is both consistent and optimal in value. More rigorously, such a problem is a 5-tuple ⟨x, y, Dx, Cx, g(y)⟩ where

  • x is a set of variables, each with domain Dx;
  • y ⊆ x is the set of decision variables over which the objective is defined;
  • Cx is a set of constraints on the variables that defines what constitutes a consistent assignment;
  • g(y) is a cost function mapping each assignment to y to a real value in ℝ.

Many problems can be formulated as OCSPs; in particular, an S-TSP can be represented as an OCSP. Conflict-directed A* (CDA*) has been shown to be an efficient algorithm for solving OCSPs. For this reason, we propose to replace the Constraint-Based A* solver in [1] with a CDA* implementation.
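To ground this, the following sketch poses a toy S-TSP instance as an OCSP. This is our own illustrative encoding, not the one used in [1] or [3]: the decision variables y are one visit/skip choice per candidate site, g(y) is the summed utility of the visited sites, and the consistency constraint requires that some visiting order over the selected sites stays within a travel budget. A brute-force enumeration stands in for the solver; CDA* would instead expand assignments to y in best-first order and use discovered conflicts (here, budget violations) to prune whole families of assignments.

```cpp
// Toy S-TSP posed as an OCSP and solved by exhaustive enumeration (solver stand-in).
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    // Candidate utilities and a symmetric travel-cost matrix; node 0 is the start pose.
    const std::vector<double> utility = {0.0, 6.0, 4.0, 7.0, 3.0};
    const std::vector<std::vector<double>> cost = {
        {0, 4, 7, 6, 3},
        {4, 0, 5, 8, 6},
        {7, 5, 0, 4, 9},
        {6, 8, 4, 0, 5},
        {3, 6, 9, 5, 0}};
    const double budget = 18.0;  // maximum allowed path length (the constraint)

    double bestValue = 0.0;
    std::vector<int> bestSubset;

    // Enumerate every assignment to the decision variables y: visit (1) or skip (0)
    // each of the four candidate sites 1..4.
    const int n = 4;
    for (int mask = 0; mask < (1 << n); ++mask) {
        std::vector<int> subset;
        double value = 0.0;  // g(y): summed utility of the visited sites
        for (int i = 0; i < n; ++i) {
            if (mask & (1 << i)) {
                subset.push_back(i + 1);
                value += utility[i + 1];
            }
        }

        // Consistency check: does some visiting order, starting from node 0,
        // stay within the travel budget?
        std::sort(subset.begin(), subset.end());
        std::vector<int> order = subset;
        bool consistent = false;
        do {
            double length = 0.0;
            int previous = 0;
            for (int node : order) {
                length += cost[previous][node];
                previous = node;
            }
            if (length <= budget) { consistent = true; break; }
        } while (std::next_permutation(order.begin(), order.end()));

        if (consistent && value > bestValue) {
            bestValue = value;
            bestSubset = subset;
        }
    }

    std::cout << "best value " << bestValue << ", visiting sites:";
    for (int node : bestSubset) std::cout << " " << node;
    std::cout << "\n";
    return 0;
}
```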
Technical Approach:

Our first step will be to adapt Bradley Hasegawa's work on Continuous Observation Planning for Autonomous Exploration [1], which involves an S-TSP with value updates. We have Conflict-Directed A* (CDA*) code (in C++) from Tony Jimenez's work on the second assignment. We hope to gain access to the code for [1] and change it so that, when it frames the S-TSP as an Optimal Constraint Satisfaction Problem (OCSP), it uses our CDA* algorithm to solve the OCSP. This may involve a substantial amount of work in merging two separately developed algorithms.

Our next step will be to convert the S-TSP solution into a Temporally Flexible State Plan (TFSP). This will involve adding time constraints to the ordered list of points to visit. This can be done trivially by generating the constraints at random, or in a more principled manner by estimating times using a probabilistic motion model. We will probably use a simple C++ program to read in the S-TSP solution and output a TFSP (a sketch of such a conversion appears at the end of this section).

Our final goal is to link the work for [1] with the work by Thomas Leaute [2] on Coordinating Agile Systems Through the Model-based Execution of Temporal Plans. The work for [2] takes a TFSP and tests the controller in simulation. Conceivably, we could start with an S-TSP, solve it using the work for [1] with CDA*, turn the solution into a TFSP, and run the TFSP using Thomas Leaute's code in a simulator. However, [1] uses an S-TSP whose values change with observations, so it will keep producing different solutions over time. In discussion with Thomas Leaute, we learned that he considers it feasible to change the code for [2] to handle updates to the TFSP. Our fallback position would be to restart the simulation whenever the TFSP is updated.

Finally, we note that our motivating application is the Mars Airplane, where the expected value of future scientific tasks is affected by ongoing observations. The above integration of the continuous observation planner is intended to allow plan updates due to new observations, and these observations may affect the value of future tasks. A science exploration mission lends itself to continuous (finite-horizon) observation planning because it embodies an implicit exploration-exploitation trade-off. Our integration will therefore allow the simulator to extend the feedback-control loop to high-level planning. This feedback-control-loop integration, however, is left as future work.
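As a concrete illustration of the conversion described in the second step above, the sketch below is roughly what that simple C++ program might look like: it takes the ordered waypoints produced by the S-TSP solver and emits one activity per leg with a flexible [lower, upper] duration bound. The constant-speed motion model, the margin factors, and the output syntax are all assumptions made for illustration; the actual TFSP representation consumed by [2] may differ.

```cpp
// Sketch: convert an ordered S-TSP solution into a temporally flexible state plan.
// Waypoint coordinates, the nominal speed, and the duration margins are hypothetical.
#include <cmath>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Waypoint {
    std::string name;
    double x, y;
};

int main() {
    // Ordered output of the S-TSP solver (hypothetical science sites).
    const std::vector<Waypoint> route = {
        {"start", 0.0, 0.0},
        {"site_A", 3.0, 4.0},
        {"site_B", 9.0, 12.0},
        {"site_C", 9.0, 0.0}};

    const double nominalSpeed = 2.0;  // map units per second (assumed constant-speed model)
    const double lowerFactor = 0.8;   // best-case speed-up margin
    const double upperFactor = 1.3;   // worst-case slow-down margin

    std::cout << "; temporally flexible state plan\n";
    for (std::size_t i = 1; i < route.size(); ++i) {
        const double dx = route[i].x - route[i - 1].x;
        const double dy = route[i].y - route[i - 1].y;
        const double nominal = std::hypot(dx, dy) / nominalSpeed;

        // One activity per leg: reach route[i] within a flexible duration window.
        std::cout << "(activity go-to-" << route[i].name
                  << " :from " << route[i - 1].name
                  << " :duration [" << lowerFactor * nominal
                  << ", " << upperFactor * nominal << "])\n";
    }
    return 0;
}
```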
Plan:

Our minimal plan is an extension to [1]. To accomplish our learning goals, our objective will be to master the concepts presented in Bradley Hasegawa's thesis and develop an extension to his work by replacing Constraint-Based A* with Conflict-Directed A*.

Our baseline plan is a further extension of that algorithm: we will take the algorithm developed by Mr. Hasegawa and convert the solution to the Selective Traveling Salesman Problem (S-TSP) into a Temporally Flexible State Plan (TFSP).

Our enhanced plan is a novel cognitive robot application. This plan is to take the extension of the previous work from [1] and merge it with [2] by modifying [2] to accept a continuously updated TFSP. The resulting application would be able to take as input a set of waypoints and continuously plan a kinodynamic path that is optimized according to the utility of the waypoints. This cognitive robot application could then be executed on a hardware-in-the-loop simulation, as described in [2], if that equipment is available from the MERS lab.

Schedule:

4/11  Turn in proposal
4/17  Integrate [1] with CDA*
4/19  Create translation from S-TSP solution to TFSP
4/24  Reassess situation and schedule
4/27  Integrate [1] with [2]; modify [2] to handle updates to the TFSP; coding complete
5/4   Simulations complete
5/7   Paper complete; presentation complete

Division of Labor:

• Adapt Bradley Hasegawa's work [1]
  o Solve the OCSP using CDA*
  o Convert the OCSP representation of [1] to the CDA* representation of [3]
• Convert the S-TSP solution into a TFSP
  o Add time constraints
• Link Bradley Hasegawa's work [1] with the work by Thomas Leaute [2]
  o Adapt his code so that the TFSP can be continuously replaced with a new one
• Overarching activities: debugging, interfacing, and write-up
  o Interfacing can often result in unforeseen implementation delays. Therefore, we intend to address most interfacing issues up front, which will allow us to properly adapt the scope of our project.
  o We intend to do this by first implementing the interface between the two algorithms ([1] and [2]) without observations. Our second step will involve properly formatting the TFSP for the simulation interface of [2]. The simulation framework of [2] will initially require no fundamental changes, since the TFSP will be static. The second spiral of our project will then include observations and TFSP updates; this will require a fundamental change to the simulation framework and interface of [2].

References:

[1] B. Hasegawa, "Continuous Observation Planning for Autonomous Exploration," M.Eng. thesis, Massachusetts Institute of Technology, 2004. http://mers.csail.mit.edu/theses/BradHasegawaThesis.pdf

[2] T. Leaute and B. Williams, "Coordinating Agile Systems Through the Model-based Execution of Temporal Plans," accepted to the International Workshop on Planning under Uncertainty for Autonomous Systems, 2005.

[3] B. Williams and R. Ragno, "Conflict-directed A* and Its Role in Model-based Embedded Systems," to appear in the Journal of Discrete Applied Mathematics, Special Issue on Theory and Applications of Satisfiability Testing (accepted January 2003). http://mers.csail.mit.edu/papers/jdam.pdf