A VIEW OF C PERATIVE CONTROL FOR AUTONOMOUS VEHICLES AND SENSOR NETWORKS C. G. Cassandras Center for Information and Systems Engineering Boston University Christos G. Cassandras CODES Lab. - Boston University OUTLINE ¾ COOPERATIVE CONTROL SETTING ¾ COOPERATIVE “MISSION” TYPES ¾ REWARD MAXIMIZATION MISSIONS ¾ COOPERATIVE RECEDING HORIZON (CRH) CONTROL ¾ SENSOR NETWORKS, COVERAGE CONTROL MISSIONS ¾ DEMOS: Applets and Movies Christos G. Cassandras CODES Lab. - Boston University COOPERATIVE MISSION SETTING TARGETS VEHICLES (MOBILE NODES) Christos G. Cassandras CODES Lab. - Boston University DIFFERENT COOPERATIVE MISSION TYPES ¾ RENDEZ-VOUS AT SOME TARGET POINT ¾ FORMATION MAINTENANCE ¾ REWARD MAXIMIZATION ¾ COVERAGE CONTROL Christos G. Cassandras CODES Lab. - Boston University RENDEZ-VOUS MISSION RENDEZ-VOUS AT THIS TARGET Christos G. Cassandras CODES Lab. - Boston University FORMATION MAINTAINING MISSION Christos G. Cassandras CODES Lab. - Boston University REWARD MAXIMIZATION MISSION TARGETS WITH DIFFERENT REWARDS AND DEADLINES MISSION OBJECTIVE: MAXIMIZE TOTAL REWARD COLLECTED BY VISITING TARGETS BEFORE THEIR “DEADLINES” EXPIRE Christos G. Cassandras CODES Lab. - Boston University REWARD MAXIMIZATION MISSION X X X X X X CONTINUED ? UNKNOWN TARGETS ? ? Vehicle 4 “repelled” by Vehicle 3 ⇒ Search task performed Christos G. Cassandras CODES Lab. - Boston University COVERAGE CONTROL MISSION ALL TARGETS UNKNOWN ONLY DENSITY FUNCTION ASSUMED Christos G. Cassandras CODES Lab. - Boston University COOPERATIVE REWARD MAXIMIZATION MISSION R1φ1(t) r6i, j = 1,…,3 R6φ6(t) R3φ3(t) r3i, j = 1,…,3 r1i, j = 1,…,3 R4φ4(t) R4 r4i, j = 1,…,3 R5φ5(t) R5 r5i, j = 1,…,3 C1 p1i, i = 1,…,6 C3 p3i, i = 1,…,6 Christos G. Cassandras C2 r2i, j = 1,…,3 p2i, i = 1,…,6 CODES Lab. - Boston University R R2φ22(t) COOP. REWARD MAXIMIZATION MISSION CONTINUED This is like the notorious TRAVELING SALESMAN problem, except that… ¾ … there are multiple (cooperating) salesmen ¾ … there are deadlines + time-varying costs ¾ … environment is stochastic (vehicles may fail, threats damage vehicles, etc.) Christos G. Cassandras CODES Lab. - Boston University SOLUTION APPROACHES ¾ Stochastic Dynamic Programming – Wohletz et al, 2001 Extremely complex… ¾ Functional Decomposition Dynamic Resource Allocation – Castanon and Wohletz, 2002 Assignment Problems through Mixed Integer Linear Programming – Bellingham et al, 2002 Combinatorially complex… Path Planning – Hu and Sastry, 2001, Lian and Murray 2002, Gazi and Passino, 2002, Bachmayer and Leonard, 2002 Christos G. Cassandras CODES Lab. - Boston University COMBINATORIAL + STOCHASTIC COMPLEXITY R2 R3 R4 R1 R5 1. Target Assignment Christos G. Cassandras 2. Routing 3. Path Control CODES Lab. - Boston University RECEDING HORIZON (RH) CONTROL: MAIN IDEA • Do not attempt to assign vehicles to targets • Cooperatively steer vehicles towards “high expected reward” regions PLANNING HORIZON, H periodically/on-event • Repeat process • Worry about final vehicle-target assignmentTurns out vehicles at the last possible instant converge to targets ACTION on their own! HORIZON, h Solve optimization problem by selecting all ui to maximize total expected rewards over H u1 u2 u3 Seealso alsoFranco, Franco,Parisini, Parisini,Polycarpou Polycarpou04; 04;Dunbar, Dunbar,Murray, Murray,04; 04;Richards, Richards,How, How,04 04 See Christos G. Cassandras CODES Lab. - Boston University CRH CONTROL PROBLEM FORMULATION Target positions (i = 1,…,N): yi ∈R² Vehicle dynamics (j = 1,…,M): • State: xj(t) ∈R² position of jth vehicle at time t • Control: uj(t) Vehicle heading at time t ⎡cos u j (t )⎤ 0 x& j (t ) = V j ⎢ , x ( 0 ) = x ⎥ j j ⎢⎣sin u j (t ) ⎥⎦ At kth iteration, time tk (k=1,2,…): • Planning Horizon: Hk • Vehicle position at time tk+Hk: x j (t k + H k ) = x j (t k ) + x& j (t k ) H k Christos G. Cassandras CODES Lab. - Boston University RH PROBLEM FORMULATION CONTINUED At kth iteration (k=1,2,…): Earliest time vehicle j can reach target i under control uj(tk): τ ij (u j (t k ), t k ) = (t k + H k ) + || x j (t k + H k ) − yi || V j τ ij VjHk xj(tk) yi qij uj(tk) xj (tk +Hk) qkj τ kj Christos G. Cassandras yk Probability vehicle j assigned to target CODES Lab. - Boston University CRH PROBLEM FORMULATION CONTINUED Objective at kth iteration: Maximize EXPECTED REWARD over horizon Hk Target i value attainable by veh. j [depends on uj(t)] M N veh. j value M N max ∑∑ Riφi (τ ij ) pij (τ ij ) qij (t k + H k ) − ∑∑ C j rij (t k + H k ) u Control vehicle headings i =1 j =1 Prob. vehicle j Prob. vehicle j Earliest time when collects i reward assigned to target i vehicle jtarget can collect [depends on uj(t)] [depends on u (t)] j reward from target i i =1 j =1 Prob. vehicle j destroyed by target i [depends on uj(t)] [depends on uj(t)] Christos G. Cassandras CODES Lab. - Boston University THE FUNCTION φi(t) [REWARD DISCOUNTING FUNCTION] • Targets with deadlines: 1 • Targets with time windows: Christos G. Cassandras 1 CODES Lab. - Boston University THE FUNCTION φi(t) CONTINUED • Sequencing targets: 1 φ1(t) • A general purpose φ−function: ln(1−α ) t Di ⎧ ⎪e φi (t ) = ⎨ ln(1−α ) ⎪e Di t e − β (t − Di ) ⎩ α 1 0.8 β 0.6 if t ≤ Di if t > Di 0.4 0.2 0 Christos G. Cassandras φ2(t) 20 40 t 60 80 CODES Lab. - Boston University 100 THE FUNCTION qij [TARGET ASSIGNMENT FUNCTION] • Vehicle-to-target distance: dij = x j − yi • Relative distance: δ ij = d ij ∑ M m =1 d im • Target assignment function qij(δij): Monotonically non-increasing and s.t. qij (0) = 1, qij (1) = 0 Christos G. Cassandras CODES Lab. - Boston University THE FUNCTION qij CONTINUED • A example of qij function (M=2): ⎧1 ⎪⎪ 1 qij (δij ) = ⎨ (1− Δ) −δij ⎪1− 2Δ ⎪⎩0 [ 1 ] if δij ≤ Δ if Δ < δij ≤1− Δ otherwise Capture Radius (Δ) 0.8 0.6 0.4 0.2 0 Christos G. Cassandras 0.2 0.4 0.6 0.8 1 1.2 1.4 CODES Lab. - Boston University THE FUNCTION qij CONTINUED qij(t) defines DYNAMIC RESPONSIBILITY REGIONS for vehicle j • Sj – Full Responsibility Region (FR) δij≤Δ • Cj – Cooperative Region (CR) Δ <δij≤1-Δ • Ij – Invisibility Region (IR) δij> 1-Δ Vehicle Targets in FR committed to vehicle Christos G. Cassandras FR x j CR CR IR Targets in IR ignored by vehicle CODES Lab. - Boston University THE FUNCTION qij CONTINUED 10 8 6 4 2 1 1 0 2 2 -2 3 3 -4 -6 4 4 -8 -10 -10 Δ=0.1 Δ=0.2 What happens as parameter Δ increases ? Christos G. Cassandras -5 0 5 10 Voronoi partition Partition of a plane with into n convex polygons such that each polygon contains exactly one point and every point in a given polygon is closer to its central point than to any other. CODES Lab. - Boston University 2-VEHICLE CASE – DYNAMIC PARTITIONING Possible Target Location Vehicle Locations II: Only vehicle 1 goes to target III: Both vehicles go to target IV: Only vehicle 2 goes to target (1 is repelled !) Christos G. Cassandras CODES Lab. - Boston University PLANNING AND ACTION HORIZONS PLANNING Horizon H(t): H (t ) = d min (t ) ≡ min d ij (t ) i, j ACTION Horizon h(t): h(t ) = α H + β H H (t ), α H ≥ 0, 0 ≤ β H ≤ 1 OR: Whenever next EVENT occurs Christos G. Cassandras CODES Lab. - Boston University FOUR veh’s: Green Purple BlacK BLue Christos G. Cassandras CODES Lab. - Boston University TARGET ASSIGNMENT MAIN IDEA IN CRH APPROACH: Replace complex Discrete Stochastic Optimization problem by a sequence of simpler Continuous Optimization problems But how do we guarantee that vehicles ultimately head for the desired DISCRETE TARGET POINTS? Christos G. Cassandras CODES Lab. - Boston University STABILITY ANALYSIS • TARGETS: yi • UAVs: xj DEFINITION: Vehicle trajectory x(t ) = [x1 (t ),K, xM (t )] generated by a controller is stationary, if there exists some tV < ∞, such that x j (tv ) − yi ≤ si for some i = 1, K , N , j = 1, K , M . QUESTION: Under what conditions is a CRH-generated trajectory stationary ? Wei Li Target Size CISE - Boston University STABILITY ANALYSIS CONTINUED Recall objective function: M M N N max ∑∑ Riφi (τ ij ) pij (τ ij )qij (t k + H k ) − ∑∑ C j rij (t k + H k ) u i =1 j =1 φi (t ) = 1 - α T i =1 j =1 t t ∈ [0, T ] rij=0 ⎧1 ⎪⎪ 1 (1− Δ) −δij qij (δij ) = ⎨ ⎪1− 2Δ ⎪⎩0 1 [ α 0.8 pij=1 0.6 0.4 0.2 ] if j ∈Bi ,δij ≤ Δ if j ∈Bi , Δ < δij ≤1− Δ otherwise 1 0 20 40 t 60 80 100 0.8 0.6 0.4 0.2 0 Wei Li 0.2 0.4 0.6 0.8 1 1.2 1.4 CISE - Boston University STABILITY ANALYSIS CONTINUED N M J (x) = ∑∑ Ri x j − yi qij Objective function reduces to: i =1 j =1 CRH controller solves optimization problem: ⎧⎪min J (x) x∈Fk ⎨ ⎪⎩ Fk = w : w j − x j (t k ) = VH k { } i.e., minimize the potential function J(x) over a set of M circles: min J(x) … 1 k F Fk2 Wei Li FkM CISE - Boston University MAIN STABILITY RESULT Local minima of J(x): x l = ( x1l ,..., xMl ) ∈ R 2 M , l = 1, K , L Vector of vehicle positions at kth iteration of CRH controller: xk Theorem: Suppose H k = min d ij (t k ) . i, j l If, for all l = 1,…,L, x j = yi for some i = 1,…,N, j = 1,…,M, then J ( x k ) − J ( x k +1 ) > b (b > 0 is a constant). If all local minima coincide with targets, the CRH-generated trajectory is stationary Wei Li CISE - Boston University MAIN STABILITY RESULT QUESTION: When do all local minima coincide with target points? 1 Vehicle, N targets If there exists a yi s.t. Ri − N ∑ j =1, j ≠ i Rj yi − y j yi − y j >0 2 Vehicles, 1 target 2 Vehicles, 2 targets Wei Li CISE - Boston University TO RECAP… • Limited look-ahead – control optimizes expectation over “planning horizon” • Control updates – event-driven (events are deterministic or random) or time-driven (for a given “action horizon”) • Target assignment – done implicitly, not explicitly: No combinatorial problem involved • Assignment + Routing + Path Control – all done together Christos G. Cassandras CODES Lab. - Boston University RH CONTROLLER FEATURES CONTINUED • Target values change – deadlines, target sequencing, return to base • Vehicle capabilities change - resource depletion, failures, damage • Threat capabilities change - radar on/off, threat damage • Target locations change - new targets, moving targets • Obstacle avoidance - targets with negative values • Randomness - new control actions in response to random events • Constraints – heading change, heading-dependent costs, sensing tasks Christos G. Cassandras CODES Lab. - Boston University ONLY KNOWN TARGET DETECTION RADIUS: 20 DISTRIBUTED COOPERATIVE CONTROL Construct GRADIENT FIELD instead of artificial potential field ∂J i Fij = = ci ( x j ) ⋅ f i 0 ( x j ) ∂x j if yi ∈ S j ⎧1 ⎪⎪ (1 − δ ij )(2δ ij − 1) ci ( x j ) = ⎨qij − if yi ∈ C j (1 − 2Δ) ⎪ if yi ∈ I j ⎪⎩0 Cooperation coefficient Christos G. Cassandras ⎧ Ri x j − yi if x j ≠ yi ⎪− f i 0 ( x j ) = ⎨ Di || x j − yi || ⎪0 otherwise ⎩ Force exerted by target i on vehicle j given that it is the only vehicle in the mission space CODES Lab. - Boston University DISTRIBUTED COOPERATIVE CONTROL • 2 examples (M=2, N=10) Christos G. Cassandras CODES Lab. - Boston University OTHER ISSUES ¾ Local optima in the CRH optimization problem ¾ Oscillatory vehicle behavior (instabilities) ¾ Additional path constraints ¾ Does CRH control generate optimal assignments? Christos G. Cassandras CODES Lab. - Boston University REWARD MAXIMIZATION MISSION DEMO MOVIES OF SUCH MISSIONS WITH SMALL ROBOTS: 3 Khepera robots http://frontera.bu.edu/CoopCtrl.html executing mission: visiting 8 targets with different rewards and deadlines. Robots communicate wirelessly and dynamically update headings. Overhead vision system provides location data. Inner ring - “reward": R - 200, G - 300, B - 600 Outer ring - "deadline": R - 80s, G - 50s, B - 20s Christos G. Cassandras CODES Lab. - Boston University COVERAGE CONTROL MISSION SENSOR FIELD WITH UNKNOWN DATA SOURCES - ONLY DENSITY FUNCTION ASSUMED --Meguerdichian Meguerdichianetetal, al,INFOCOM, INFOCOM,2001, 2001, --Cortes Cortesetetal, al,IEEE IEEETrans. Trans.on onRobotics Roboticsand andAuto., Auto.,2004 2004 --Li Liand andCassandras, Cassandras,CDC, CDC,2005 2005 Christos G. Cassandras CODES Lab. - Boston University WHAT’S A SENSOR NETWORK (SNET)? A NETWORK consisting of devices (sensors) that: … communicate wirelessly … are battery-powered … may have different characteristics … have limited processing capabilities … have limited life … often operate in noisy/adversarial environments … monitor/control physical processes Christos G. Cassandras CODES Lab. - Boston University WHAT’S A SENSOR NETWORK (SNET)? CONTINUED SNETs will consist of thousands of interacting devices! Christos G. Cassandras CODES Lab. - Boston University WHY ARE SNETs EXCITING? They interact with the physical world They promise fascinating applications: - Smart Buildings (locate persons/objects, find closest resource, adjust environment, detect emergency conditions) - Health monitoring - Security and military applications - Environmental monitoring - Inventory monitoring/replenishment (smart shelves) - Equipment condition monitoring and active maintenance (smart appliances) They realize a convergence of the “3 Cs”: Communications + Computing + Control Christos G. Cassandras CODES Lab. - Boston University COVERAGE CONTROL MISSION GOAL: Deploy mobile nodes to maximize data source detection probability – unknown data sources – data sources may be mobile R(x) (Hz/m2) 50 40 30 20 10 0 10 8 6 ? ? ??Ω??? ? ? 4 2 0 10 5 0 Perceived data source density over mission space Christos G. Cassandras CODES Lab. - Boston University PROBLEM FORMULATION N mobile sensors, each located at si∈R2 Data source at x emits signal with energy E Signal observed by sensor node i (at si ) Sensing model: pi ( x) ≡ p (Detected by i | A( x), si ) ( A(x) = data source emits at x ) Sensing attenuation: pi(x) is a decreasing function of di(x) ≡ ||x - si|| (distance between x and si ) Wei Li CISE - Boston University PROBLEM FORMULATION CONTINUED Joint detection prob. assuming sensor independence: N P( x) = 1 − ∏ [1 − pi ( x)] i =1 OBJECTIVE: Determine locations si (i=1,…,N) to maximize total detection probability: N ⎫ ⎧ max ∫ R( x)⎨1 − ∏ [1 − pi ( x)]⎬dx si ∈Ω ⎭ ⎩ i =1 Ω Perceived data source density Wei Li CISE - Boston University DISTRIBUTED COOPERATIVE SCHEME CONTINUED Denote N ⎧ ⎫ F ( s1 , K, s N ) = ∫ R( x)⎨1 − ∏ [1 − pi ( x)]⎬dx ⎩ i =1 ⎭ Ω Maximize F(s1,…,sN) by forcing nodes to move using gradient information: N ∂pk ( x) sk − x ∂F dx = ∫ R ( x) ∏ [1 − pi ( x)] ∂sk Ω ∂d k ( x) d k ( x) i =1,i ≠ k Wei Li CISE - Boston University DISTRIBUTED COOPERATIVE SCHEME CONTINUED N ∂pk ( x) sk − x ∂F dx = ∫ R ( x) ∏ [1 − pi ( x)] ∂sk Ω ∂d k ( x) d k ( x) i =1,i ≠ k This has to be evaluated numerically. Not doable for a mobile sensor with limited computation capacity. ¾ Approximate pi(x) by truncating sensing attenuation ¾ Discretize pi(x) using a grid Wei Li CISE - Boston University COVERAGE CONTROL MISSION DEMO SOFTWARE DEMO OF COVERAGE CONTROL ALGORITHM: http://frontera.bu.edu/Applets/CoverageContr/index.html No communication cost With communication cost Sensing Range Christos G. Cassandras CODES Lab. - Boston University SOME FINAL THOUGHTS… ¾ Small, cheap cooperating devices cannot handle complexity ⇒ we need DISTRIBUTED control ! ¾ Cooperating agents operate asynchronously ⇒ we need ASYNCHRONOUS control/optimization schemes ¾ Wireless communication is alarmingly vulnerable to security threats ¾ Different views/aspects of “Cooperative Control” abound… ACKNOWLEDGEMENTS: PhD Students: Wei Li, Ning Xu Sponsors: NSF, AFOSR, ARO, Honeywell Members of Boston U. Sensor Networks Consortium Christos G. Cassandras CODES Lab. - Boston University