A VIEW OF FOR AUTONOMOUS VEHICLES

advertisement
A VIEW OF
C
PERATIVE CONTROL
FOR
AUTONOMOUS VEHICLES
AND SENSOR NETWORKS
C. G. Cassandras
Center for Information and Systems Engineering
Boston University
Christos G. Cassandras
CODES Lab. - Boston University
OUTLINE
¾ COOPERATIVE CONTROL SETTING
¾ COOPERATIVE “MISSION” TYPES
¾ REWARD MAXIMIZATION MISSIONS
¾ COOPERATIVE RECEDING HORIZON (CRH) CONTROL
¾ SENSOR NETWORKS, COVERAGE CONTROL MISSIONS
¾ DEMOS: Applets and Movies
Christos G. Cassandras
CODES Lab. - Boston University
COOPERATIVE MISSION SETTING
TARGETS
VEHICLES
(MOBILE NODES)
Christos G. Cassandras
CODES Lab. - Boston University
DIFFERENT COOPERATIVE MISSION TYPES
¾ RENDEZ-VOUS AT SOME TARGET POINT
¾ FORMATION MAINTENANCE
¾ REWARD MAXIMIZATION
¾ COVERAGE CONTROL
Christos G. Cassandras
CODES Lab. - Boston University
RENDEZ-VOUS MISSION
RENDEZ-VOUS
AT THIS TARGET
Christos G. Cassandras
CODES Lab. - Boston University
FORMATION MAINTAINING MISSION
Christos G. Cassandras
CODES Lab. - Boston University
REWARD MAXIMIZATION MISSION
TARGETS
WITH DIFFERENT
REWARDS AND DEADLINES
MISSION OBJECTIVE: MAXIMIZE TOTAL REWARD COLLECTED
BY VISITING TARGETS BEFORE THEIR “DEADLINES” EXPIRE
Christos G. Cassandras
CODES Lab. - Boston University
REWARD MAXIMIZATION MISSION
X
X
X
X
X
X
CONTINUED
?
UNKNOWN
TARGETS
?
?
Vehicle 4 “repelled” by Vehicle 3
⇒ Search task performed
Christos G. Cassandras
CODES Lab. - Boston University
COVERAGE CONTROL MISSION
ALL TARGETS
UNKNOWN
ONLY DENSITY
FUNCTION ASSUMED
Christos G. Cassandras
CODES Lab. - Boston University
COOPERATIVE REWARD MAXIMIZATION MISSION
R1φ1(t)
r6i, j = 1,…,3
R6φ6(t)
R3φ3(t)
r3i, j = 1,…,3
r1i, j = 1,…,3
R4φ4(t)
R4
r4i, j = 1,…,3
R5φ5(t)
R5
r5i, j = 1,…,3
C1
p1i, i = 1,…,6
C3
p3i, i = 1,…,6
Christos G. Cassandras
C2
r2i, j = 1,…,3
p2i, i = 1,…,6
CODES Lab. - Boston University
R
R2φ22(t)
COOP. REWARD MAXIMIZATION MISSION
CONTINUED
This is like the notorious TRAVELING SALESMAN
problem, except that…
¾ … there are multiple (cooperating) salesmen
¾ … there are deadlines + time-varying costs
¾ … environment is stochastic
(vehicles may fail, threats damage vehicles, etc.)
Christos G. Cassandras
CODES Lab. - Boston University
SOLUTION APPROACHES
¾ Stochastic Dynamic Programming – Wohletz et al, 2001
Extremely complex…
¾ Functional Decomposition
ƒ Dynamic Resource Allocation – Castanon and Wohletz, 2002
ƒ Assignment Problems through Mixed Integer Linear
Programming – Bellingham et al, 2002
Combinatorially complex…
ƒ Path Planning – Hu and Sastry, 2001, Lian and Murray 2002, Gazi and
Passino, 2002, Bachmayer and Leonard, 2002
Christos G. Cassandras
CODES Lab. - Boston University
COMBINATORIAL + STOCHASTIC COMPLEXITY
R2
R3
R4
R1
R5
1. Target Assignment
Christos G. Cassandras
2. Routing
3. Path Control
CODES Lab. - Boston University
RECEDING HORIZON (RH) CONTROL: MAIN IDEA
• Do not attempt to assign vehicles to targets
• Cooperatively steer vehicles
towards
“high expected reward” regions
PLANNING
HORIZON,
H periodically/on-event
• Repeat
process
• Worry about final vehicle-target assignmentTurns out vehicles
at the last possible instant
converge to targets
ACTION
on their own!
HORIZON, h
Solve optimization problem
by selecting all ui to maximize
total expected rewards over H
u1
u2
u3
Seealso
alsoFranco,
Franco,Parisini,
Parisini,Polycarpou
Polycarpou04;
04;Dunbar,
Dunbar,Murray,
Murray,04;
04;Richards,
Richards,How,
How,04
04
See
Christos G. Cassandras
CODES Lab. - Boston University
CRH CONTROL PROBLEM FORMULATION
ƒ Target positions (i = 1,…,N): yi ∈R²
ƒ Vehicle dynamics (j = 1,…,M):
• State:
xj(t) ∈R²
position of jth vehicle at time t
• Control:
uj(t)
Vehicle heading at time t
⎡cos u j (t )⎤
0
x& j (t ) = V j ⎢
,
x
(
0
)
=
x
⎥
j
j
⎢⎣sin u j (t ) ⎥⎦
ƒ At kth iteration, time tk (k=1,2,…):
• Planning Horizon:
Hk
• Vehicle position at time tk+Hk:
x j (t k + H k ) = x j (t k ) + x& j (t k ) H k
Christos G. Cassandras
CODES Lab. - Boston University
RH PROBLEM FORMULATION
CONTINUED
ƒ At kth iteration (k=1,2,…):
Earliest time vehicle j can reach target i under control uj(tk):
τ ij (u j (t k ), t k ) = (t k + H k ) + || x j (t k + H k ) − yi || V j
τ ij
VjHk
xj(tk)
yi
qij
uj(tk)
xj (tk +Hk)
qkj
τ kj
Christos G. Cassandras
yk
Probability vehicle j
assigned to target
CODES Lab. - Boston University
CRH PROBLEM FORMULATION
CONTINUED
ƒ Objective at kth iteration:
Maximize EXPECTED REWARD over horizon Hk
Target i value attainable by veh. j
[depends on uj(t)]
M
N
veh. j value
M
N
max ∑∑ Riφi (τ ij ) pij (τ ij ) qij (t k + H k ) − ∑∑ C j rij (t k + H k )
u
Control
vehicle
headings
i =1 j =1
Prob. vehicle j
Prob.
vehicle
j
Earliest
time when
collects
i reward assigned to target i
vehicle jtarget
can collect
[depends on uj(t)]
[depends
on
u
(t)]
j
reward from target i
i =1 j =1
Prob. vehicle j
destroyed by target i
[depends on uj(t)]
[depends on uj(t)]
Christos G. Cassandras
CODES Lab. - Boston University
THE FUNCTION φi(t) [REWARD DISCOUNTING FUNCTION]
• Targets with deadlines:
1
• Targets with time
windows:
Christos G. Cassandras
1
CODES Lab. - Boston University
THE FUNCTION φi(t)
CONTINUED
• Sequencing targets:
1
φ1(t)
• A general purpose
φ−function:
ln(1−α )
t
Di
⎧
⎪e
φi (t ) = ⎨ ln(1−α )
⎪e Di t e − β (t − Di )
⎩
α
1
0.8
β
0.6
if t ≤ Di
if t > Di
0.4
0.2
0
Christos G. Cassandras
φ2(t)
20
40
t 60
80
CODES Lab. - Boston University
100
THE FUNCTION qij
[TARGET ASSIGNMENT FUNCTION]
• Vehicle-to-target distance: dij = x j − yi
• Relative distance:
δ ij =
d ij
∑
M
m =1
d im
• Target assignment function qij(δij):
Monotonically non-increasing and s.t.
qij (0) = 1, qij (1) = 0
Christos G. Cassandras
CODES Lab. - Boston University
THE FUNCTION qij
CONTINUED
• A example of qij function (M=2):
⎧1
⎪⎪ 1
qij (δij ) = ⎨
(1− Δ) −δij
⎪1− 2Δ
⎪⎩0
[
1
]
if δij ≤ Δ
if Δ < δij ≤1− Δ
otherwise
Capture Radius (Δ)
0.8
0.6
0.4
0.2
0
Christos G. Cassandras
0.2
0.4
0.6
0.8
1
1.2
1.4
CODES Lab. - Boston University
THE FUNCTION qij
CONTINUED
qij(t) defines DYNAMIC RESPONSIBILITY REGIONS for vehicle j
• Sj – Full Responsibility Region (FR) δij≤Δ
• Cj – Cooperative Region (CR)
Δ <δij≤1-Δ
• Ij – Invisibility Region (IR)
δij> 1-Δ
Vehicle
Targets in FR
committed to
vehicle
Christos G. Cassandras
FR x
j
CR
CR
IR
Targets in IR
ignored by
vehicle
CODES Lab. - Boston University
THE FUNCTION qij
CONTINUED
10
8
6
4
2
1
1
0
2
2
-2
3
3
-4
-6
4
4
-8
-10
-10
Δ=0.1
Δ=0.2
What happens as
parameter Δ increases ?
Christos G. Cassandras
-5
0
5
10
Voronoi partition
Partition of a plane with into n convex polygons such that
each polygon contains exactly one point and every point
in a given polygon is closer to its central point than to any other.
CODES Lab. - Boston University
2-VEHICLE CASE – DYNAMIC PARTITIONING
Possible
Target
Location
Vehicle
Locations
II: Only vehicle 1 goes to target
III: Both vehicles go to target
IV: Only vehicle 2 goes to target (1 is repelled !)
Christos G. Cassandras
CODES Lab. - Boston University
PLANNING AND ACTION HORIZONS
PLANNING Horizon H(t):
H (t ) = d min (t ) ≡ min d ij (t )
i, j
ACTION Horizon h(t):
h(t ) = α H + β H H (t ), α H ≥ 0, 0 ≤ β H ≤ 1
OR: Whenever next EVENT occurs
Christos G. Cassandras
CODES Lab. - Boston University
FOUR veh’s:
Green
Purple
BlacK
BLue
Christos G. Cassandras
CODES Lab. - Boston University
TARGET ASSIGNMENT
MAIN IDEA IN CRH APPROACH:
Replace complex Discrete Stochastic Optimization problem
by a sequence of simpler Continuous Optimization problems
But how do we guarantee that vehicles ultimately
head for the desired DISCRETE TARGET POINTS?
Christos G. Cassandras
CODES Lab. - Boston University
STABILITY ANALYSIS
• TARGETS: yi
• UAVs: xj
DEFINITION: Vehicle trajectory x(t ) = [x1 (t ),K, xM (t )]
generated by a controller is stationary, if there
exists some tV < ∞, such that x j (tv ) − yi ≤ si for
some i = 1, K , N , j = 1, K , M .
QUESTION:
Under what conditions is a CRH-generated
trajectory stationary ?
Wei Li
Target Size
CISE - Boston University
STABILITY ANALYSIS
CONTINUED
Recall objective function:
M
M
N
N
max ∑∑ Riφi (τ ij ) pij (τ ij )qij (t k + H k ) − ∑∑ C j rij (t k + H k )
u
i =1 j =1
φi (t ) = 1 -
α
T
i =1 j =1
t t ∈ [0, T ]
rij=0
⎧1
⎪⎪ 1
(1− Δ) −δij
qij (δij ) = ⎨
⎪1− 2Δ
⎪⎩0
1
[
α
0.8
pij=1
0.6
0.4
0.2
]
if j ∈Bi ,δij ≤ Δ
if j ∈Bi , Δ < δij ≤1− Δ
otherwise
1
0
20
40
t 60
80
100
0.8
0.6
0.4
0.2
0
Wei Li
0.2
0.4
0.6
0.8
1
1.2
1.4
CISE - Boston University
STABILITY ANALYSIS
CONTINUED
N
M
J (x) = ∑∑ Ri x j − yi qij
Objective function reduces to:
i =1 j =1
CRH controller solves
optimization problem:
⎧⎪min J (x)
x∈Fk
⎨
⎪⎩ Fk = w : w j − x j (t k ) = VH k
{
}
i.e., minimize the potential function J(x) over a set of M circles:
min J(x)
…
1
k
F
Fk2
Wei Li
FkM
CISE - Boston University
MAIN STABILITY RESULT
Local minima of J(x): x l = ( x1l ,..., xMl ) ∈ R 2 M , l = 1, K , L
Vector of vehicle positions
at kth iteration of CRH controller:
xk
Theorem: Suppose H k = min d ij (t k ) .
i, j
l
If, for all l = 1,…,L, x j = yi for some i = 1,…,N, j = 1,…,M,
then J ( x k ) − J ( x k +1 ) > b (b > 0 is a constant).
If all local minima coincide with targets,
the CRH-generated trajectory is stationary
Wei Li
CISE - Boston University
MAIN STABILITY RESULT
QUESTION:
When do all local minima coincide with target points?
1 Vehicle, N targets
If there exists a yi s.t. Ri −
N
∑
j =1, j ≠ i
Rj
yi − y j
yi − y j
>0
2 Vehicles, 1 target
2 Vehicles, 2 targets
Wei Li
CISE - Boston University
TO RECAP…
• Limited look-ahead – control optimizes expectation
over “planning horizon”
• Control updates – event-driven (events are deterministic or random)
or time-driven (for a given “action horizon”)
• Target assignment – done implicitly, not explicitly:
No combinatorial problem involved
• Assignment + Routing + Path Control – all done together
Christos G. Cassandras
CODES Lab. - Boston University
RH CONTROLLER FEATURES
CONTINUED
• Target values change – deadlines, target sequencing, return to base
• Vehicle capabilities change - resource depletion, failures, damage
• Threat capabilities change - radar on/off, threat damage
• Target locations change - new targets, moving targets
• Obstacle avoidance - targets with negative values
• Randomness - new control actions in response to random events
• Constraints – heading change, heading-dependent costs,
sensing tasks
Christos G. Cassandras
CODES Lab. - Boston University
ONLY KNOWN TARGET
DETECTION RADIUS: 20
DISTRIBUTED COOPERATIVE CONTROL
Construct GRADIENT FIELD instead of artificial potential field
∂J i
Fij =
= ci ( x j ) ⋅ f i 0 ( x j )
∂x j
if yi ∈ S j
⎧1
⎪⎪
(1 − δ ij )(2δ ij − 1)
ci ( x j ) = ⎨qij −
if yi ∈ C j
(1 − 2Δ)
⎪
if yi ∈ I j
⎪⎩0
Cooperation coefficient
Christos G. Cassandras
⎧ Ri x j − yi
if x j ≠ yi
⎪−
f i 0 ( x j ) = ⎨ Di || x j − yi ||
⎪0
otherwise
⎩
Force exerted by target i
on vehicle j given that it is the
only vehicle in the mission space
CODES Lab. - Boston University
DISTRIBUTED COOPERATIVE CONTROL
• 2 examples (M=2, N=10)
Christos G. Cassandras
CODES Lab. - Boston University
OTHER ISSUES
¾ Local optima in the CRH optimization problem
¾ Oscillatory vehicle behavior (instabilities)
¾ Additional path constraints
¾ Does CRH control generate optimal
assignments?
Christos G. Cassandras
CODES Lab. - Boston University
REWARD MAXIMIZATION MISSION DEMO
MOVIES OF SUCH MISSIONS WITH SMALL ROBOTS:
3 Khepera robots
http://frontera.bu.edu/CoopCtrl.html
executing mission:
visiting 8 targets with
different rewards and
deadlines. Robots
communicate wirelessly
and dynamically update
headings. Overhead
vision system provides
location data.
Inner ring - “reward":
R - 200, G - 300, B - 600
Outer ring - "deadline":
R - 80s, G - 50s, B - 20s
Christos G. Cassandras
CODES Lab. - Boston University
COVERAGE CONTROL MISSION
SENSOR FIELD WITH
UNKNOWN DATA SOURCES
- ONLY DENSITY
FUNCTION ASSUMED
--Meguerdichian
Meguerdichianetetal,
al,INFOCOM,
INFOCOM,2001,
2001,
--Cortes
Cortesetetal,
al,IEEE
IEEETrans.
Trans.on
onRobotics
Roboticsand
andAuto.,
Auto.,2004
2004
--Li
Liand
andCassandras,
Cassandras,CDC,
CDC,2005
2005
Christos G. Cassandras
CODES Lab. - Boston University
WHAT’S A SENSOR NETWORK (SNET)?
A NETWORK consisting
of devices (sensors) that:
ƒ … communicate wirelessly
ƒ … are battery-powered
ƒ … may have different
characteristics
ƒ … have limited processing
capabilities
ƒ … have limited life
ƒ … often operate in
noisy/adversarial environments
ƒ … monitor/control physical
processes
Christos G. Cassandras
CODES Lab. - Boston University
WHAT’S A SENSOR NETWORK (SNET)?
CONTINUED
SNETs will consist of thousands of interacting devices!
Christos G. Cassandras
CODES Lab. - Boston University
WHY ARE SNETs EXCITING?
ƒ They interact with the physical world
ƒ They promise fascinating applications:
- Smart Buildings (locate persons/objects, find closest
resource, adjust environment, detect emergency conditions)
- Health monitoring
- Security and military applications
- Environmental monitoring
- Inventory monitoring/replenishment (smart shelves)
- Equipment condition monitoring and active maintenance
(smart appliances)
ƒ They realize a convergence of the “3 Cs”:
Communications + Computing + Control
Christos G. Cassandras
CODES Lab. - Boston University
COVERAGE CONTROL MISSION
GOAL: Deploy mobile nodes to maximize data source detection
probability
– unknown data sources
– data sources may be mobile
R(x)
(Hz/m2)
50
40
30
20
10
0
10
8
6
?
? ??Ω??? ?
?
4
2
0
10
5
0
Perceived data source density
over mission space
Christos G. Cassandras
CODES Lab. - Boston University
PROBLEM FORMULATION
ƒ N mobile sensors, each located at si∈R2
ƒ Data source at x emits signal with energy E
ƒ Signal observed by sensor node i (at si )
ƒ Sensing model:
pi ( x) ≡ p (Detected by i | A( x), si )
( A(x) = data source emits at x )
ƒ Sensing attenuation:
pi(x) is a decreasing function of di(x) ≡ ||x - si||
(distance between x and si )
Wei Li
CISE - Boston University
PROBLEM FORMULATION
CONTINUED
ƒ Joint detection prob. assuming sensor independence:
N
P( x) = 1 − ∏ [1 − pi ( x)]
i =1
ƒ OBJECTIVE:
Determine locations si (i=1,…,N)
to maximize total detection probability:
N
⎫
⎧
max ∫ R( x)⎨1 − ∏ [1 − pi ( x)]⎬dx
si ∈Ω
⎭
⎩ i =1
Ω
Perceived data source density
Wei Li
CISE - Boston University
DISTRIBUTED COOPERATIVE SCHEME
CONTINUED
ƒ Denote
N
⎧
⎫
F ( s1 , K, s N ) = ∫ R( x)⎨1 − ∏ [1 − pi ( x)]⎬dx
⎩ i =1
⎭
Ω
ƒ Maximize F(s1,…,sN) by forcing nodes to move using
gradient information:
N
∂pk ( x) sk − x
∂F
dx
= ∫ R ( x) ∏ [1 − pi ( x)]
∂sk Ω
∂d k ( x) d k ( x)
i =1,i ≠ k
Wei Li
CISE - Boston University
DISTRIBUTED COOPERATIVE SCHEME
CONTINUED
N
∂pk ( x) sk − x
∂F
dx
= ∫ R ( x) ∏ [1 − pi ( x)]
∂sk Ω
∂d k ( x) d k ( x)
i =1,i ≠ k
This has to be evaluated numerically.
Not doable for a mobile sensor with limited
computation capacity.
¾ Approximate pi(x) by truncating sensing attenuation
¾ Discretize pi(x) using a grid
Wei Li
CISE - Boston University
COVERAGE CONTROL MISSION DEMO
SOFTWARE DEMO OF COVERAGE CONTROL ALGORITHM:
http://frontera.bu.edu/Applets/CoverageContr/index.html
No communication cost
With communication cost
Sensing Range
Christos G. Cassandras
CODES Lab. - Boston University
SOME FINAL THOUGHTS…
¾ Small, cheap cooperating devices cannot handle complexity
⇒ we need DISTRIBUTED control !
¾ Cooperating agents operate asynchronously
⇒ we need ASYNCHRONOUS control/optimization schemes
¾ Wireless communication is alarmingly vulnerable to
security threats
¾ Different views/aspects of “Cooperative Control” abound…
ACKNOWLEDGEMENTS:
PhD Students: Wei Li, Ning Xu
Sponsors: NSF, AFOSR, ARO, Honeywell
Members of Boston U. Sensor Networks Consortium
Christos G. Cassandras
CODES Lab. - Boston University
Download