DISTRIBUTED OPTIMIZATION FOR COOPERATIVE MISSIONS IN

DISTRIBUTED OPTIMIZATION FOR COOPERATIVE MISSIONS IN UNCERTAIN ENVIRONMENTS C. G. Cassandras Center for Information and Systems Engineering Boston University Christos G. Cassandras CODES Lab. - Boston University OUTLINE ¾ COOPERATIVE “MISSION” SETTING ¾ REWARD MAXIMIZATION MISSIONS ¾ COOPERATIVE RECEDING HORIZON (CRH) CONTROL ¾ COVERAGE CONTROL MISSIONS ¾ DEMOS: Applets and Movies ACKNOWLEDGEMENTS: PhD Students: Wei Li (PhD 2006), Ning Xu, Minyi Zhong Sponsors: NSF, AFOSR, ARO, DOE, Honeywell Members of Boston U. Sensor Networks Consortium Christos G. Cassandras CODES Lab. - Boston University COOPERATIVE MISSION SETTING TARGETS MOBILE NODES Christos G. Cassandras CODES Lab. - Boston University DIFFERENT COOPERATIVE MISSION TYPES ¾ RENDEZ-VOUS AT SOME TARGET POINT ¾ FORMATION MAINTENANCE ¾ REWARD MAXIMIZATION ¾ COVERAGE CONTROL Christos G. Cassandras CODES Lab. - Boston University RENDEZ-VOUS MISSION RENDEZ-VOUS AT THIS TARGET Christos G. Cassandras CODES Lab. - Boston University FORMATION MAINTAINANCE MISSION Christos G. Cassandras CODES Lab. - Boston University REWARD MAXIMIZATION MISSION TARGETS WITH DIFFERENT REWARDS AND DEADLINES MISSION OBJECTIVE: MAXIMIZE TOTAL REWARD COLLECTED BY VISITING TARGETS BEFORE THEIR “DEADLINES” EXPIRE Christos G. Cassandras CODES Lab. - Boston University REWARD MAXIMIZATION MISSION X X X X X X CONTINUED ? UNKNOWN TARGETS ? ? Node 4 “repelled” by Node 3 ⇒ Search task performed Christos G. Cassandras CODES Lab. - Boston University COVERAGE CONTROL MISSION SENSOR FIELD WITH UNKNOWN DATA SOURCES - ONLY DENSITY FUNCTION ASSUMED --Meguerdichian Meguerdichianetetal, al,INFOCOM, INFOCOM,2001, 2001, --Cortes Cortesetetal, al,IEEE IEEETrans. Trans.on onRobotics Roboticsand andAuto., Auto.,2004 2004 --Cassandras Cassandrasand andLi, Li,Euro. Euro.J.J.ofofControl, Control,2005 2005 Christos G. Cassandras CODES Lab. - Boston University COOPERATIVE REWARD MAXIMIZATION MISSION R1φ1(t) r6i, j = 1,…,3 R6φ6(t) R3φ3(t) r3i, j = 1,…,3 r1i, j = 1,…,3 R4φ4(t) R4 r4i, j = 1,…,3 R5φ5(t) R5 r5i, j = 1,…,3 C1 p1i, i = 1,…,6 C3 p3i, i = 1,…,6 Christos G. Cassandras C2 r2i, j = 1,…,3 p2i, i = 1,…,6 CODES Lab. - Boston University R R2φ22(t) COOP. REWARD MAXIMIZATION MISSION CONTINUED This is like the notorious TRAVELING SALESMAN problem, except that… ¾ … there are multiple (cooperating) salesmen ¾ … there are deadlines + time-varying costs ¾ … environment is stochastic (vehicles may fail, threats damage vehicles, etc.) Christos G. Cassandras CODES Lab. - Boston University SOLUTION APPROACHES ¾ Stochastic Dynamic Programming – Wohletz et al, 2001 Extremely complex… ¾ Functional Decomposition: Dynamic Resource Allocation – Castanon and Wohletz, 2002 Assignment Problems through Mixed Integer Linear Programming – Bellingham et al, 2002 Combinatorially complex… Path Planning – Hu and Sastry, 2001, Lian and Murray 2002, Gazi and Passino, 2002, Bachmayer and Leonard, 2002 Christos G. Cassandras CODES Lab. - Boston University COMBINATORIAL + STOCHASTIC COMPLEXITY R2 R3 R4 R1 R5 1. Target Assignment Christos G. Cassandras 2. Routing 3. Path Control CODES Lab. - Boston University RECEDING HORIZON (RH) CONTROL: MAIN IDEA • Do not attempt to assign nodes to targets • Cooperatively steer nodes towards “high expected reward” regions PLANNING HORIZON, H periodically/on-event • Repeat process • Worry about final node-target assignment Turns out nodes at the last possible instant converge to targets ACTION on their own! HORIZON, h Solve optimization problem by selecting all ui to maximize total expected rewards over H u1 u2 Christos G. Cassandras u3 CODES Lab. - Boston University CONTRAST APPROACHES HEDGE-AND-REACT HEDGE-AND-REACT ¾ Delay decisions until last possible instant ¾ No stochastic model ¾ Simpler opt. problems VS ESTIMATE-AND-PLAN ESTIMATE-AND-PLAN ¾ Need accurate stochastic models ¾ Curse of dimensionality Compare to Model Predictive Control (MPC) Christos G. Cassandras CODES Lab. - Boston University CRH CONTROL PROBLEM FORMULATION Target positions (i = 1,…,N): yi ∈R² Node dynamics (j = 1,…,M): • State: xj(t) ∈R² position of jth node at time t • Control: uj(t) Node heading at time t ⎡cos u j (t )⎤ 0 x& j (t ) = V j ⎢ , x ( 0 ) = x ⎥ j j ⎢⎣sin u j (t ) ⎥⎦ At kth iteration, time tk (k=1,2,…): • Planning Horizon: Hk • Node position at time tk+Hk: x j (t k + H k ) = x j (t k ) + x& j (t k ) H k Christos G. Cassandras CODES Lab. - Boston University RH PROBLEM FORMULATION CONTINUED At kth iteration (k=1,2,…): Earliest time node j can reach target i under control uj(tk): τ ij (u j (t k ), t k ) = (t k + H k ) + || x j (t k + H k ) − yi || V j τ ij VjHk xj(tk) yi qij uj(tk) xj (tk +Hk) qkj τ kj Christos G. Cassandras yk Probability vehicle j assigned to target CODES Lab. - Boston University CRH PROBLEM FORMULATION CONTINUED Objective at kth iteration: Maximize EXPECTED REWARD over horizon Hk Target i value attainable by node j [depends on uj(t)] M N node j value M N max ∑∑ Riφi (τ ij ) pij (τ ij ) qij (t k + H k ) − ∑∑ C j rij (t k + H k ) u Control node headings i =1 j =1 Prob. time nodewhen j node Prob. node j Earliest collects targetreward i reward assigned to target i j can collect [depends on uj(t)] [depends on u (t)] j from target i [depends on uj(t)] Christos G. Cassandras i =1 j =1 Prob. node j destroyed by target i [depends on uj(t)] CODES Lab. - Boston University THE FUNCTION φi(t) [REWARD DISCOUNTING FUNCTION] • Targets with deadlines: 1 • Targets with time windows: Christos G. Cassandras 1 CODES Lab. - Boston University THE FUNCTION φi(t) CONTINUED • Sequencing targets: 1 φ1(t) • A general purpose φ−function: ln(1−α ) t Di ⎧ ⎪e φi (t ) = ⎨ ln(1−α ) ⎪e Di t e − β (t − Di ) ⎩ α 1 0.8 β 0.6 if t ≤ Di if t > Di 0.4 0.2 0 Christos G. Cassandras φ2(t) 20 40 t 60 80 CODES Lab. - Boston University 100 THE FUNCTION qij [TARGET ASSIGNMENT FUNCTION] • Node-to-target distance: dij • Relative distance: δ ij = = x j − yi d ij ∑ M m =1 or: b closest nodes to j only d im • Target assignment function qij(δij): Monotonically non-increasing and s.t. qij (0) = 1, qij (1) = 0 Christos G. Cassandras CODES Lab. - Boston University THE FUNCTION qij CONTINUED • A example of qij function (M=2): ⎧1 ⎪⎪ 1 qij (δij ) = ⎨ (1− Δ) −δij ⎪1− 2Δ ⎪⎩0 [ 1 ] if δij ≤ Δ if Δ < δij ≤1− Δ otherwise Capture Radius (Δ) 0.8 0.6 0.4 0.2 0 Christos G. Cassandras 0.2 0.4 0.6 0.8 1 1.2 1.4 CODES Lab. - Boston University THE FUNCTION qij CONTINUED qij(t) defines DYNAMIC RESPONSIBILITY REGIONS for vehicle j • Sj – Full Responsibility Region (FR) δij≤Δ • Cj – Cooperative Region (CR) Δ <δij≤1-Δ • Ij – Invisibility Region (IR) δij> 1-Δ Vehicle Targets in FR committed to node Christos G. Cassandras FR x j CR CR IR Targets in IR ignored by node CODES Lab. - Boston University THE FUNCTION qij CONTINUED 10 8 6 4 2 1 1 0 2 2 -2 3 3 -4 -6 4 4 -8 -10 -10 Δ=0.1 Δ=0.2 What happens as parameter Δ increases ? Christos G. Cassandras -5 0 5 10 Voronoi partition Partition of a plane with into n convex polygons such that each polygon contains exactly one point and every point in a given polygon is closer to its central point than to any other. CODES Lab. - Boston University 2-VEHICLE CASE – DYNAMIC PARTITIONING Possible Target Location Vehicle Locations II: Only vehicle 1 goes to target III: Both vehicles go to target IV: Only vehicle 2 goes to target (1 is repelled !) Christos G. Cassandras CODES Lab. - Boston University PLANNING AND ACTION HORIZONS PLANNING Horizon H(t): H (t ) = d min (t ) ≡ min d ij (t ) i, j ACTION Horizon h(t): h(t ) = α H + β H H (t ), α H ≥ 0, 0 ≤ β H ≤ 1 OR: Whenever next EVENT occurs Christos G. Cassandras CODES Lab. - Boston University TARGET ASSIGNMENT MAIN IDEA IN CRH APPROACH: Replace complex Discrete Stochastic Optimization problem by a sequence of simpler Continuous Optimization problems But how do we guarantee that vehicles ultimately head for the desired DISCRETE TARGET POINTS? Christos G. Cassandras CODES Lab. - Boston University STABILITY ANALYSIS • TARGETS: yi • UAVs: xj DEFINITION: Node trajectory x(t ) = [x1 (t ),K, xM (t )] generated by a controller is stationary, if there exists some tV < ∞, such that x j (tv ) − yi ≤ si for some i = 1, K , N , j = 1, K , M . QUESTION: Under what conditions is a CRH-generated trajectory stationary ? Wei Li Target Size CISE - Boston University STABILITY ANALYSIS CONTINUED Recall objective function: M M N N max ∑∑ Riφi (τ ij ) pij (τ ij )qij (t k + H k ) − ∑∑ C j rij (t k + H k ) u i =1 j =1 i =1 j =1 α T φi (t ) = 1- t t ∈ [ 0 ,T] rij=0 ⎧1 ⎪⎪ 1 (1− Δ) −δij qij (δij ) = ⎨ ⎪1− 2Δ ⎪⎩0 1 [ α 0.8 pij=1 0.6 0.4 0.2 ] if j ∈Bi ,δij ≤ Δ if j ∈Bi , Δ < δij ≤1− Δ otherwise 1 0 20 40 t 60 80 100 0.8 0.6 0.4 0.2 0 Wei Li 0.2 0.4 0.6 0.8 1 1.2 1.4 CISE - Boston University STABILITY ANALYSIS CONTINUED N M J (x) = ∑∑ Ri x j − yi qij Objective function reduces to: i =1 j =1 CRH controller solves optimization problem: ⎧⎪min J (x) x∈Fk ⎨ ⎪⎩ Fk = w : w j − x j (t k ) = VH k { } i.e., minimize the potential function J(x) over a set of M circles: min J(x) … F 1 k Fk2 Wei Li FkM CISE - Boston University MAIN STABILITY RESULT Local minima of J(x): x l = ( x1l ,..., xMl ) ∈ R 2 M , l = 1, K , L Vector of node positions at kth iteration of CRH controller: xk Theorem: Suppose H k = min d ij (t k ) . i, j l If, for all l = 1,…,L, x j = yi for some i = 1,…,N, j = 1,…,M, then J ( x k ) − J ( x k +1 ) > b (b > 0 is a constant). If all local minima coincide with targets, the CRH-generated trajectory is stationary Wei Li CISE - Boston University MAIN STABILITY RESULT QUESTION: When do all local minima coincide with target points? 1 Vehicle, N targets If there exists a yi s.t. Ri − N ∑ j =1, j ≠ i Rj yi − y j yi − y j >0 2 Vehicles, 1 target 2 Vehicles, 2 targets Wei Li CISE - Boston University TO RECAP… • Limited look-ahead – control optimizes expectation over “planning horizon” • Control updates – event-driven (events are deterministic or random) or time-driven (for a given “action horizon”) • Target assignment – done implicitly, not explicitly: No combinatorial problem involved • Assignment + Routing + Path Control – all done together Christos G. Cassandras CODES Lab. - Boston University RH CONTROLLER FEATURES CONTINUED • Target values change – deadlines, target sequencing, return to base • Node capabilities change - resource depletion, failures, damage • Threat capabilities change - radar on/off, threat damage • Target locations change - new targets, moving targets • Obstacle avoidance - targets with negative values • Randomness - new control actions in response to random events • Constraints – heading change, heading-dependent costs, sensing tasks Christos G. Cassandras CODES Lab. - Boston University DISTRIBUTED COOPERATIVE CONTROL Construct GRADIENT FIELD instead of artificial potential field ∂J i Fij = = ci ( x j ) ⋅ f i 0 ( x j ) ∂x j if yi ∈ S j ⎧1 ⎪⎪ (1 − δ ij )(2δ ij − 1) ci ( x j ) = ⎨qij − if yi ∈ C j (1 − 2Δ) ⎪ if yi ∈ I j ⎪⎩0 Cooperation coefficient Christos G. Cassandras ⎧ Ri x j − yi if x j ≠ yi ⎪− f i 0 ( x j ) = ⎨ Di || x j − yi || ⎪0 otherwise ⎩ Force exerted by target i on node j given that it is the only node in the mission space CODES Lab. - Boston University DISTRIBUTED COOPERATIVE CONTROL • 2 examples (M=2, N=10) Christos G. Cassandras CODES Lab. - Boston University OTHER ISSUES ¾ Local optima in the CRH optimization problem ¾ Oscillatory vehicle behavior (instabilities) ¾ Additional path constraints, e.g., rendez-vous at targets ¾ Does CRH control generate optimal assignments? Christos G. Cassandras CODES Lab. - Boston University REWARD MAXIMIZATION MISSION DEMO MOVIES OF SUCH MISSIONS WITH SMALL ROBOTS: 3 Khepera robots http://frontera.bu.edu/CoopCtrl.html executing mission: visiting 8 targets with different rewards and deadlines. Robots communicate wirelessly and dynamically update headings. Overhead vision system provides location data. Inner ring - “reward": R - 200, G - 300, B - 600 Outer ring - "deadline": R - 80s, G - 50s, B - 20s Christos G. Cassandras CODES Lab. - Boston University COVERAGE CONTROL MISSION GOAL: Deploy mobile nodes to maximize data source detection probability – unknown data sources – data sources may be mobile R(x) (Hz/m2) 50 40 30 20 10 0 10 8 6 ? ? ??Ω??? ? ? 4 2 0 10 5 0 Perceived data source density over mission space Christos G. Cassandras CODES Lab. - Boston University PROBLEM FORMULATION N mobile sensors, each located at si∈R2 Data source at x emits signal with energy E Signal observed by sensor node i (at si ) Sensing model: pi ( x) ≡ p (Detected by i | A( x), si ) ( A(x) = data source emits at x ) Sensing attenuation: pi(x) is a decreasing function of di(x) ≡ ||x - si|| (distance between x and si ) Christos G. Cassandras CODES Lab. - Boston University PROBLEM FORMULATION CONTINUED Joint detection prob. assuming sensor independence: N P( x) = 1 − ∏ [1 − pi ( x)] i =1 OBJECTIVE: Determine locations si (i=1,…,N) to maximize total detection probability: N ⎫ ⎧ max ∫ R( x)⎨1 − ∏ [1 − pi ( x)]⎬dx si ∈Ω ⎭ ⎩ i =1 Ω Perceived data source density Christos G. Cassandras CODES Lab. - Boston University DISTRIBUTED COOPERATIVE SCHEME CONTINUED Denote N ⎧ ⎫ F ( s1 , K, s N ) = ∫ R( x)⎨1 − ∏ [1 − pi ( x)]⎬dx ⎩ i =1 ⎭ Ω Maximize F(s1,…,sN) by forcing nodes to move using gradient information: N ∂pk ( x) sk − x ∂F dx = ∫ R ( x) ∏ [1 − pi ( x)] ∂sk Ω ∂d k ( x) d k ( x) i =1,i ≠ k Christos G. Cassandras CODES Lab. - Boston University DISTRIBUTED COOPERATIVE SCHEME CONTINUED N ∂pk ( x) sk − x ∂F dx = ∫ R ( x) ∏ [1 − pi ( x)] ∂sk Ω ∂d k ( x) d k ( x) i =1,i ≠ k This has to be evaluated numerically. Not doable for a mobile sensor with limited computation capacity. ¾ Approximate pi(x) by truncating sensing attenuation ¾ Discretize pi(x) using a grid Details in Christos G. Cassandras --Cassandras Cassandrasand andLi, Li,Euro. Euro.J.J.ofofControl, Control,2005 2005 CODES Lab. - Boston University COVERAGE CONTROL MISSION DEMO SOFTWARE DEMO OF COVERAGE CONTROL ALGORITHM: http://frontera.bu.edu/Applets/CoverageContr/index.html No communication cost With communication cost Sensing Range Christos G. Cassandras CODES Lab. - Boston University POLYGONAL OBSTACLES… CONTINUED • Constrain the navigation of mobile nodes • Interfere with the sensing ⎧ pi ( x, si ) if x is visible from si pˆ i ( x, si ) = ⎨ otherwise ⎩ 0 Visibility Region Christos G. Cassandras CODES Lab. - Boston University GRADIENT CALCULATION WITH OBSTACLES CONTINUED N ∂pˆ i ( x, si ) si − x ∂H = ∫ R( x) ∏ [1 − pˆ k ( x, sk )] dx + ∂si V ( si ) ∂di ( x) di ( x) k =1, k ≠ i ⎧ pi pˆ i = ⎨ ⎩0 visible invisible Q ( si ) ∑A j =1 j Q(si): # of occluding corner points New term captures change in visibility region of si Mathematically: use an extension of the Leibnitz rule for differentiating an integral where both the integrand and the integration domain are functions of the control variable Christos G. Cassandras CODES Lab. - Boston University DEPLOYMENT DEMO – WITH OBSTACLES http://codescolor.bu.edu/coverage Christos G. Cassandras CODES Lab. - Boston University A “FAIRNESS” ISSUE… CONTINUED Some areas covered extremely well, while others not covered at all SOLUTION: Assign higher reward to the same amount of marginal gain in P(x,s) in low coverage region H (s) = ∫ R( x) P( x, s)dx F H M (s) = ∫ R( x) M ( P( x, s) )dx F M(·) : [0,1] → R concave non-decreasing function Christos G. Cassandras CODES Lab. - Boston University DEPLOYMENT DEMO – REACTION TO EVENTS Event detected Christos G. Cassandras CODES Lab. - Boston University ONGOING WORK: SCALABLE, ASYNCHRONOUS, DISTRIBUTED OPTIMIZATION ¾ Small, cheap cooperating devices cannot handle complexity ⇒ we need DISTRIBUTED control and optim. algorithms ¾ Cooperating agents operate asynchronously ⇒ we need ASYNCHRONOUS control/optimization schemes ¾ Too much communication kills node energy sources ⇒ communicate ONLY when necessary ⇒ we need ASYNCHRONOUS control/optimization schemes ¾ Networks grow large, sensing tasks grow large ⇒ we need SCALABLE control and optim. algorithms Christos G. Cassandras CODES Lab. - Boston University

DISTRIBUTED OPTIMIZATION FOR COOPERATIVE MISSIONS IN

Related documents

Products

Support

DISTRIBUTED OPTIMIZATION FOR COOPERATIVE MISSIONS IN

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib