Multi-Robot Systems Collision Avoidance and Coordination

Multi Robot Systems Course
Bar Ilan University
Mor Vered
Motivation
 In a multi-robot environment, path planning with
collision avoidance is an important problem.
 Multi-robot systems researchers have been
investigating distributed coordination methods for
improving spatial coordination in teams, a.k.a.
collision avoidance.
 Such methods adapt the coordination method to the
dynamic changes in density of the robots.
 Basically, the goal is to avoid collisions with
static obstacles and with dynamic objects, such as
moving robots.
Motivation
 Spatial conflicts can cause the team’s productivity to
drop with the addition of robots.
 We will see that this phenomenon is impacted by the
coordination methods used by the team-members, as
different coordination methods yield radically
different productivity results.
Heuristic Coordination Approaches
 Noise method - If I am on a trajectory that is in
danger of colliding with another agent's, add random
noise to my direction vector.
”Behavior Based Formation Control for Multi-robot Teams”. Balch, Tucker and Arkin,
Ronald C
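As a rough illustration of the Noise method, the sketch below perturbs a robot's heading with uniform random noise. Representing the direction vector as a heading angle in radians, the function name add_direction_noise and the max_noise parameter are all assumptions made for this example, not details from the cited paper.

```python
import math
import random

def add_direction_noise(heading, max_noise, rng):
    """Return the heading (radians) perturbed by uniform noise in [-max_noise, max_noise]."""
    noisy = heading + rng.uniform(-max_noise, max_noise)
    # Wrap into [-pi, pi) so the result is still a valid heading.
    return (noisy + math.pi) % (2 * math.pi) - math.pi

rng = random.Random(0)  # seeded only to make the example repeatable
new_heading = add_direction_noise(heading=0.0, max_noise=math.pi / 4, rng=rng)
```

A controller would call this only when it predicts a collision on the current trajectory; how that prediction is made is outside the sketch.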
Heuristic Coordination Approaches
 Aggression method - Describes a controller which breaks
deadlocks in favour of the most ‘aggressive’ robot.
 The robots compete and only one gains access to the resource.
When robots come too close to each other, each of the robots
chooses an aggression level (randomly); the robot with the lower
level concedes its position, preventing a collision. Later work
showed that it might be best to choose an aggression level
proportional to the robot's task.
 For every cycle a robot found itself within 2 radii of a teammate,
it selected either an aggressive or timid behavior, with
probability of 0.5. If the robot selected to become timid, it
backed away for 100 cycles (10 simulated seconds). Otherwise it
proceeded forward, executing the aggressive behavior. As robots
chose to continue being “aggressive” or to become “timid” every
cycle, the probability that two robots would collide in this
implementation was near zero.
”Go ahead, Make my day: Robot Conflict Resolution by Aggressive Competition”. Vaughan, Richard and Støy, Kasper
and Sukhatme, Gaurav and Mataric, Maja
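The per-cycle choice between aggressive and timid behavior can be sketched as a small step function. The 0.5 probability and the 100-cycle back-off come from the text above; the function name, the action labels and the decision to finish a back-off before choosing again are assumptions for illustration.

```python
import random

TIMID_CYCLES = 100   # back-off length from the text (10 simulated seconds)
P_AGGRESSIVE = 0.5   # probability of choosing the aggressive behavior

def aggression_step(backoff_left, teammate_close, rng):
    """Return (action, new_backoff_left) for one control cycle."""
    if backoff_left > 0:
        # Still executing an earlier timid back-off.
        return "backward", backoff_left - 1
    if teammate_close and rng.random() >= P_AGGRESSIVE:
        # Chose timid this cycle: this cycle counts as the first back-off cycle.
        return "backward", TIMID_CYCLES - 1
    # Aggressive choice, or no teammate within range: keep going.
    return "forward", 0

rng = random.Random(1)
action, left = aggression_step(0, teammate_close=False, rng=rng)
```

Here teammate_close stands for the "within 2 radii of a teammate" test, which the caller is assumed to compute from its sensors.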
Heuristic Coordination Approaches
 Repel method - When sensing a possible collision on
their course, robots in the Repel group backtracked for
500 cycles (50 seconds), moving in a direction 180
degrees away from the closest robot.
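The repel direction can be computed from positions alone. This is a minimal sketch assuming 2D positions given as (x, y) tuples; the function name repel_heading is hypothetical.

```python
import math

def repel_heading(own_pos, others):
    """Heading (radians) pointing 180 degrees away from the closest robot."""
    closest = min(others, key=lambda p: math.dist(own_pos, p))
    toward = math.atan2(closest[1] - own_pos[1], closest[0] - own_pos[0])
    away = toward + math.pi
    # Normalise the angle so it stays within [-pi, pi].
    return math.atan2(math.sin(away), math.cos(away))

# The closest robot is directly to the right, so the repel heading points left.
heading = repel_heading((0.0, 0.0), [(1.0, 0.0), (0.0, 5.0)])
```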
Heuristic Coordination Approaches
 TimeRand method - This method contained no
repulsion behavior to directly avoid collisions, but
after a collision it could free a robot that had
become immobile.
 When robots sensed that they did not significantly
move for 100 cycles (10 seconds), they proceeded to
move with a random walk for 150 cycles (15 seconds).
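The TimeRand timing rule can be sketched as a small state machine. The 100-cycle and 150-cycle constants come from the text; the class name, the moved flag (standing in for "moved significantly this cycle") and the mode labels are assumptions.

```python
STALL_CYCLES = 100        # cycles without significant movement (from the text)
RANDOM_WALK_CYCLES = 150  # random-walk duration (from the text)

class TimeRandMonitor:
    def __init__(self):
        self.stalled = 0      # consecutive cycles without movement
        self.walk_left = 0    # remaining random-walk cycles

    def update(self, moved):
        """Advance one cycle and return the mode to execute: 'random_walk' or 'task'."""
        if self.walk_left > 0:
            self.walk_left -= 1
            return "random_walk"
        self.stalled = 0 if moved else self.stalled + 1
        if self.stalled >= STALL_CYCLES:
            self.stalled = 0
            self.walk_left = RANDOM_WALK_CYCLES - 1  # this cycle is the first walk cycle
            return "random_walk"
        return "task"

monitor = TimeRandMonitor()
modes = [monitor.update(moved=False) for _ in range(300)]
```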
Heuristic Coordination Approaches
 TimeRepel method - This method also contained no
repulsion methods to directly avoid collision but only
reacted after the fact to collisions.
 Once these robots did not move for 150 cycles (15
seconds), they then moved backwards for 50 cycles (5
seconds).
 All methods taken from “A Study of Mechanisms for Improving Robotic Group
Performance”, Rosenfeld, Avi and Kaminka, Gal A and Kraus, Sarit and Shehory,
Onn
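The TimeRepel rule (150 cycles without movement, then 50 cycles backwards) can be sketched as a function over a recorded sequence of per-cycle movement flags. The two constants come from the text; the function name, the flag representation and the action labels are assumptions.

```python
def time_repel_actions(moved_flags, stall_cycles=150, backup_cycles=50):
    """Map per-cycle 'did the robot move?' flags to 'task' or 'backward' actions."""
    actions, stalled, backup_left = [], 0, 0
    for moved in moved_flags:
        if backup_left > 0:
            # Still reversing after an earlier stall.
            backup_left -= 1
            actions.append("backward")
            continue
        stalled = 0 if moved else stalled + 1
        if stalled >= stall_cycles:
            # Stall detected: this cycle is the first of the backup cycles.
            stalled, backup_left = 0, backup_cycles - 1
            actions.append("backward")
        else:
            actions.append("task")
    return actions

actions = time_repel_actions([False] * 200)
```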
Heuristic Coordination Approaches
Homework
 Come up with your own heuristic coordination
approach by next week’s lesson and e-mail it to me.
Selecting the Best Approach
 Assuming we have come up with several solutions, a new
problem arises: how to select the best
coordination method.
 As we stated before spatial conflicts can cause the team’s
productivity to drop with the addition of robots. This
phenomenon is impacted by the coordination methods
used by the team-members, as different coordination
methods yield radically different productivity results.
 No single collision avoidance method is best in all
domains and group sizes.
 The effectiveness of coordination methods in a given
context is not known in advance.
Selecting the Best Approach
CCC - Combined Coordination Cost
Method
 How to select the best coordination method
 CCC - quantifies the production resources spent on
coordination conflicts, i.e. the cost of group
interactions.
 Multi-attribute cost measure to quantify resources
such as time and fuel each group member spends in
coordination behaviors during task execution.
 Facilitates comparison between different group
methods.
 “A Study of Mechanisms for Improving Robotic Group Performance”, Rosenfeld, Avi
and Kaminka, Gal A and Kraus, Sarit and Shehory, Onn
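One simple way to picture a multi-attribute cost measure is a weighted sum over the resources spent on coordination. The slides do not give the CCC formula, so the weighted-sum form, the weights and all numbers below are assumptions for illustration only.

```python
def combined_coordination_cost(costs, weights):
    """Weighted sum of per-attribute coordination costs (e.g. time, fuel)."""
    return sum(weights[name] * costs[name] for name in weights)

# Hypothetical measurements: share of time and fuel spent in coordination behaviors.
costs_by_method = {
    "noise": {"time": 0.30, "fuel": 0.10},
    "repel": {"time": 0.20, "fuel": 0.25},
}
weights = {"time": 0.7, "fuel": 0.3}  # assumed relative importance of each resource

ccc = {m: combined_coordination_cost(c, weights) for m, c in costs_by_method.items()}
best = min(ccc, key=ccc.get)  # method with the lowest combined cost
```

With a single scalar per method, different coordination methods can be compared directly, which is the purpose the slide attributes to CCC.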
CCC - Combined Coordination Cost
Method
 Contended that if robots dynamically reduce their
CCC, group productivity will be improved.
 To demonstrate this, they created robotic groups
which dynamically adapt their coordination
techniques based on each robot’s CCC estimate.
 Problem with this method – it ignores the gains
accumulated from long periods in which no coordination
is needed. The next method tries to fix that.
 “A Study of Mechanisms for Improving Robotic Group Performance”, Rosenfeld, Avi
and Kaminka, Gal A and Kraus, Sarit and Shehory, Onn
Adaptive Multi-Robot Coordination: A
Game-Theoretic Perspective
 Used a reinforcement-learning approach to
coordination algorithm selection.
 Supported it both empirically, in experiments (foraging),
and analytically, by mathematical equations.
 “Adaptive Multi-Robot Coordination: A Game-Theoretic Perspective”, Gal A.
Kaminka, Dan Erusalimchik and Sarit Kraus
Problem Definition
 The normal routine of a robot's operation is to
carry out its primary task until interrupted by a
conflict with another robot which must be
resolved by a coordination algorithm. This is called
a conflict event.
 The event triggers a coordination algorithm to
handle the conflict.
 Once it successfully finishes, the robots involved
go back to their primary task.
Problem Definition
 Defined several kinds of tasks:
1) Loosely coordinated tasks – the robots have only an
occasional need for spatial or temporal
coordination. For example: multi-robot
foraging.
2) Cooperative tasks – the robots seek to maximize
group utility. For example: exploration.
3) Timed tasks – the task is bound in time. For
example: completely exploring a new area as
quickly as possible, or patrolling.
Problem Definition
 Divided time into:
1) Active interval – in which the robot is actively
investing resources in coordination.
2) Passive interval – in which the robot no longer
needs to invest in coordination.
3) The robot has a nonempty set of coordination
algorithms to select from. The choice of
coordination algorithm affects the
duration of the active and passive intervals.
Reinforcement Learning
 Computes how an agent ought to take actions in an
environment so as to maximize some definition of
cumulative reward.
 Basic reinforcement model consists of:
1) A set of environment states - S.
2) A set of actions - A.
3) Rules of transitioning between states.
4) Rules that determine the immediate reward of a
transition.
5) Rules that describe what the agent observes.
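The five components of the basic reinforcement model can be mapped onto a toy environment. Everything here (state and action names, probabilities, reward values, the random placeholder policy) is hypothetical and chosen only to make each component visible.

```python
import random

STATES = ("working", "conflict")   # 1) set of environment states S
ACTIONS = ("wait", "back_off")     # 2) set of actions A

def transition(state, action, rng):
    """3) Transition rule, with made-up probabilities."""
    if state == "working":
        return "conflict" if rng.random() < 0.2 else "working"
    # In a conflict, backing off is assumed to resolve it more often than waiting.
    p_resolve = 0.8 if action == "back_off" else 0.3
    return "working" if rng.random() < p_resolve else "conflict"

def reward(state, action, next_state):
    """4) Immediate reward of a transition."""
    return 1.0 if next_state == "working" else -1.0

def observe(state):
    """5) Observation rule: in this toy model the agent sees the state directly."""
    return state

rng = random.Random(0)
state, total = "working", 0.0
for _ in range(50):
    action = rng.choice(ACTIONS)   # placeholder policy: act at random
    next_state = transition(observe(state), action, rng)
    total += reward(state, action, next_state)  # accumulate the reward
    state = next_state
```

A learning agent would replace the random choice with a policy that it improves from the accumulated rewards.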
Reinforcement Learning
 The trick is - how to define the reward.
 They introduced a reward function, the effectiveness
index (EI), that favors less time and fewer resources spent
coordinating and more time between
conflicts that require coordination.
 Took into consideration time spent on
coordination and time not spent on coordination.
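One form consistent with this description is a ratio of coordination cost to the total length of an active plus passive interval, where lower is better. The slides do not state the formula, so the expression, the function name and the numbers below are assumptions rather than the authors' definition.

```python
def effectiveness_index(active_time, passive_time, resource_cost=0.0):
    """Hypothetical EI: coordination cost per unit of total interval time (lower is better)."""
    total = active_time + passive_time
    if total <= 0:
        raise ValueError("interval durations must sum to a positive value")
    return (active_time + resource_cost) / total

# Method A resolves conflicts quickly, but the next conflict arrives soon.
ei_a = effectiveness_index(active_time=5, passive_time=20)   # 5 / 25
# Method B takes longer to resolve, but buys a long conflict-free period.
ei_b = effectiveness_index(active_time=10, passive_time=90)  # 10 / 100
```

Under this assumed form, method B scores better even though it spends more time coordinating, because it accounts for the time not spent on coordination.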
Reinforcement Learning
 Instead of reward functions, they used Q-learning (a
variant of reinforcement learning).
 A technique that works by learning an action-value
function that gives the expected utility of taking a
given action in a given state and following a fixed
policy thereafter.
 Takes into account a cumulative value of all
results obtained up until now.
 “Q-learning”, Watkins, Christopher JCH and Dayan, Peter
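The standard tabular Q-learning update is Q(s, a) ← Q(s, a) + α (r + γ max over a' of Q(s', a') − Q(s, a)). Below is a minimal sketch of that update; the state and action names and the values of alpha and gamma are illustrative assumptions, not settings from the cited papers.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update; q maps (state, action) to a value."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

q = defaultdict(float)        # unseen (state, action) pairs start at 0.0
actions = ["noise", "repel"]  # hypothetical coordination methods to choose from
v1 = q_update(q, "conflict", "repel", reward=1.0, next_state="working", actions=actions)
v2 = q_update(q, "conflict", "repel", reward=1.0, next_state="working", actions=actions)
```

Each update moves the stored value a fraction alpha toward the new estimate, which is how the table accumulates the results seen so far.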
General reward
 The general reward comprises:
1) The total cost of coordination – cost of internal
resources such as battery life and fuel.
2) Time spent on coordinating.
3) Frequency of coordinating.