ALLOCATING ROLES IN LARGE SCALE TEAMS:
AN EMPIRICAL EVALUATION
by
Steven Okamoto
A Thesis Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
(COMPUTER SCIENCE)
December 2003
Copyright 2003
Steven Okamoto
ACKNOWLEDGEMENTS
Deep, heartfelt thanks to Milind Tambe for valuable discussions, unflagging
support, and much-needed guidance; to Paul Scerri for wonderful work on the
simulator, helpful suggestions on experiments, tireless discussions, and generally
stimulating banter; to both of them for getting me into this whole wonderful
business; and to Sven Koenig for his understanding assistance, keen insight, and very
pointy questions.
TABLE OF CONTENTS
Acknowledgements
List of Tables and Figures
Abstract
Introduction
Chapter One – Extended Generalized Assignment Problem
Chapter Two – Low Cost, Approximate DCOP
Chapter Three – Experimental Results
Chapter Four – Conclusion and Future Work
Bibliography
LIST OF TABLES AND FIGURES
Table 1: Summary of experiments performed.
Figure 1: Algorithm 1: Agent algorithm. The algorithm run by each agent.
Figure 2: Algorithm 2: AND-constraint owner algorithm. The algorithm run by the owner of each AND-constrained set.
Figure 3: Comparison of output for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles.
Figure 4: Comparison of total communication for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role.
Figure 5: Communication per role for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles.
Figure 6: Comparison of output per role for a centralized greedy algorithm and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform all tasks.
Figure 7: Effect of percentage of capable agents and threshold on output. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 8: Effect of percentage of capable agents and threshold on output. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 9: Effect of percentage of capable agents and threshold on output of each performed role. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 10: Effect of thresholds and percentage of capable agents on the percentage of roles that are performed in each timestep. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 11: 2-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 12: 2-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 13: 2-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 14: 5-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 15: 5-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 16: 5-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 17: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic output, GAP, no AND-constraints, static environment.
Figure 18: Effect of threshold and percentage of capable agents on output. 1500 agents, 3000 roles, economic output, GAP, no AND-constraints, static environment.
Figure 19: Effect of threshold and percentage of capable agents on the percentage of roles performed per timestep. 1500 agents, 3000 roles, economic output, GAP, no AND-constraints, static environment.
Figure 20: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic output, GAP, no AND-constraints, 1% dynamism.
Figure 21: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic output, GAP, no AND-constraints, 10% dynamism.
Figure 22: Effect of threshold and percentage of capable agents on output. 500 agents, 500 roles, non-economic output, non-GAP, no AND-constraints, 10% dynamism.
ABSTRACT
Role allocation, the act of assigning tasks to agents, is an important
coordination problem for multiagent teams. It is also an intractable problem that
frequently needs to be solved quickly if the results are to be of any real-world use.
Low communication, approximate DCOP (LA-DCOP) is a distributed,
asynchronous, anytime, approximate role allocation algorithm designed to quickly
find good allocations for large, cooperative teams in complex domains. Four key
innovations – token-based access to roles, probabilistically-calculated thresholds to
guide allocation changes, potential tokens to facilitate efficient coalition formation to
perform constrained tasks, and exploitation of agents’ local knowledge – distinguish
LA-DCOP and allow it to manage complexity and quickly find good solutions. I
present empirical results demonstrating that LA-DCOP outperforms a prominent
competing algorithm while using several orders of magnitude fewer messages.
INTRODUCTION
Multiagent systems of the future will involve large numbers of heterogeneous
agents performing many diverse, interacting tasks in dynamic environments.
Success in such domains will require coordination algorithms with unprecedented
flexibility, efficiency, and robustness. In addition, many applications will demand
distributed and/or asynchronous algorithms that can be executed by individual
autonomous agents with only limited communication. One coordination algorithm
that will have to cope with these demands is the algorithm to perform role allocation,
the assignment of tasks to agents. Current techniques are insufficient for dealing
with these challenges.
Existing role allocation techniques are unable to cope with the rigors future
applications will place on them. Firstly, existing techniques do not scale well
enough to coordinate the hundreds or thousands of heterogeneous agents that will
make up future teams, taking far too long to select an allocation to be of practical use
in general cases [4] [6]. Secondly, existing algorithms cannot deal with the rapid
dynamism inherent in many domains [6] [7]. This compounds the running-time problem, because dynamism shortens the amount of time that can be devoted to computation before the world changes and the allocation is potentially invalidated.
Many current algorithms must be rerun whenever the world changes, an
unacceptable solution given the time constraints dynamism imposes, especially since
even a small change in the situation may result in arbitrarily many role allocation
changes. Thirdly, existing algorithms simplify problems by assuming that agents can
only execute a single task at a time [5], which is an unnecessarily limiting restriction.
Fourthly, existing algorithms do not take into account local knowledge that agents
may have about the capabilities of other agents. Finally, many existing techniques
rely on high communication [1] or centralized computation between agents to find
high-quality allocations, neither of which may be possible in some of the future
domains where communication costs are high and robustness is needed.
LA-DCOP is a new algorithm designed to overcome these limitations. Role
allocation for cooperative teams can be cast as a distributed constraint optimization
problem (DCOP). LA-DCOP is a distributed, asynchronous, token-based algorithm
for finding approximate solutions to the kind of DCOPs that correspond to role
allocation problems for cooperative teams. These types of DCOPS are difficult for
conventional DCOP algorithms to solve exactly.
To go from a role allocation problem to a DCOP requires a few
transformations. Role allocation corresponds naturally to the Generalized
Assignment Problem (GAP), which assigns roles to agents while respecting local resource constraints and maximizing overall utility. GAP is known to be NP-complete. GAP can be extended to incorporate features such as dynamism and
constraints between roles. This Extended GAP (E-GAP) can then be converted to a
DCOP.
CHAPTER ONE – Extended Generalized Assignment Problem
A generalized assignment problem assigns roles from a set R to agents from a
set E, maximizing the value of the assignment while respecting local resource
constraints. The value of an assignment is determined by the capabilities agents
have for performing roles. The capability of each agent ei ∈ E to perform each role rj ∈ R is given by

Cap(ei, rj) ∈ [0, 1]
This capability represents the agent’s competence at performing that role, its chance
of success, and all other factors that may affect the success of its performing the role.
These factors may differ greatly from task to task, but it is reasonable to assume that
there may be classes of roles sharing similar factors. An agent will then have equal
capability to perform any role in this class. For example, an agent’s capability to
fight a fire may be vastly different from its ability to render medical aid, but the
agent’s capability to perform any of several firefighting roles corresponding to
different fires will not differ significantly. The capability of an agent ei ∈ E to
perform any role rj ∈ Class can then be represented by
Cap(ei, Class) ∈ [0, 1]
where
∀ rj, rk ∈ Class, Cap(ei, rj) = Cap(ei, rk)
In addition to capabilities, each agent also has resources which are used to
perform roles. While there can be different kinds of resources, I will restrict focus
here to the case where there is a single type of resource, which can usually be
thought of as time. Furthermore, all agents will have an identical amount of
resources, normalized to 1.0. The amount of resources required for an agent ei ∈ E
to perform role rj ∈ R is given by Resources(ei, rj).
Let A = (ai,j) be an allocation matrix such that ai,j = 1 if ei is performing rj, and ai,j = 0 otherwise. Then the goal of GAP is to find the matrix A that maximizes the value f(A), where f(A) is given by

f(A) = Σei∈E Σrj∈R Cap(ei, rj) · ai,j

such that

∀ ei ∈ E, Σrj∈R Resources(ei, rj) · ai,j ≤ 1.0

and

∀ rj ∈ R, Σei∈E ai,j ≤ 1
That is, GAP maximizes the sum of the capabilities with which each role is being
performed, while ensuring that no agent exceeds its resources and no role is
performed by more than one agent.
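To make the formulation concrete, the following sketch (in Python; the data structures and function names are mine, purely for illustration) evaluates f(A) and checks both GAP constraints for a candidate allocation:

    def gap_value(A, Cap):
        # f(A): sum of Cap(e, r) * a_{e,r} over all agent-role pairs.
        return sum(Cap[e, r] * a for (e, r), a in A.items())

    def gap_feasible(A, Resources, agents, roles):
        # Local resource constraint: each agent's assigned roles must fit
        # within its normalized resource budget of 1.0.
        for e in agents:
            if sum(Resources[e, r] * A.get((e, r), 0) for r in roles) > 1.0:
                return False
        # Assignment constraint: no role is performed by more than one agent.
        return all(sum(A.get((e, r), 0) for e in agents) <= 1 for r in roles)

    # Example: two agents, two roles, one assignment each.
    agents, roles = ["e1", "e2"], ["r1", "r2"]
    Cap = {("e1", "r1"): 0.9, ("e1", "r2"): 0.4,
           ("e2", "r1"): 0.2, ("e2", "r2"): 0.7}
    Resources = {(e, r): 0.5 for e in agents for r in roles}
    A = {("e1", "r1"): 1, ("e2", "r2"): 1}
    assert gap_feasible(A, Resources, agents, roles)
    assert gap_value(A, Cap) == 1.6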
With a GAP formulation of role allocation, agents are able to perform
multiple roles at once. However, in order to incorporate additional aspects such as
dynamics and constraints between tasks, GAP must be explicitly extended.
5
Extended GAP (E-GAP) accommodates dynamics by allowing E, R, Cap,
and Resources to all vary by time. This means that a single allocation A is no longer sufficient; instead, the solution to E-GAP is a sequence of allocations, A⃗, indexed by time. Each item in the sequence is an allocation of roles to agents for a discrete
time step.
Coordination constraints ≍ exist on sets of roles. The only constraint that I
will be considering here is a simultaneous-execution constraint, although many
others are possible. The constraint I will be considering is an AND-constraint, which
specifies that the team receives a benefit for an agent performing a role in the
constrained set only if all roles in that set are simultaneously performed. Let
ANDk ∈ ≍ be an AND-constraint on the set RANDk. Then we can quantify the value
of performing a task in the AND-constrained set RANDk by

∀ ei ∈ E, ∀ rj ∈ RANDk:
Val(ei, rj, ≍) = Cap(ei, rj) if Σrm∈RANDk Σen∈E an,m = |RANDk|, and Val(ei, rj, ≍) = 0 otherwise.

This just says that the value for an agent performing an AND-constrained role is the agent’s capability for performing that role if all the roles in the constrained set are being executed, and 0 otherwise. For roles that do not take part in an AND-constraint, the value of performing that role is just the capability of the agent performing the role, as in GAP.
The goal of E-GAP is to maximize

f(A⃗) = Σt Σei∈E Σrj∈R Val(ei, rj, ≍, t) · ai,j,t

such that

∀ t, ∀ ei ∈ E, Σrj∈R Resources(ei, rj, t) · ai,j,t ≤ 1.0

and

∀ t, ∀ rj ∈ R, Σei∈E ai,j,t ≤ 1

That is, maximize the sum of all rewards over all time steps (up to a finite horizon), respecting local resource constraints at all time steps and ensuring that no task is ever executed by more than one agent at a time.
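The Val term for a single time step can be sketched the same way, assuming (again purely for illustration) a mapping from each AND-constrained role to its constrained set:

    def val(e, r, A_t, Cap, and_sets):
        # Roles outside any AND-constraint are rewarded as in plain GAP.
        if r not in and_sets:
            return Cap[e, r]
        # AND-constrained: reward only if every role in the set is currently
        # performed by some agent, i.e., the assignment count equals |RANDk|.
        group = and_sets[r]
        performed = sum(a for (_, rm), a in A_t.items() if rm in group)
        return Cap[e, r] if performed == len(group) else 0.0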
CHAPTER TWO – Low Cost, Approximate DCOP
LA-DCOP provides an approximate solution to E-GAP by straightforwardly
reformulating the problem as a DCOP. Agents are mapped to DCOP variables and
roles are mapped to DCOP values. Because an agent in E-GAP can perform multiple
roles at the same time, the DCOP variables must be able to take on multiple values at
the same time. Capabilities are mapped to rewards associated with a variable taking
on a value. Resource constraints are mapped to constraints on the values a variable
can simultaneously have. The important constraint in E-GAP that a role cannot be
performed by more than one agent at a time leads to a complete graph of not-equals
constraints in DCOP.
Complete graphs are difficult for conventional DCOP algorithms [3]. LA-DCOP surmounts this difficulty through the use of tokens. A token is created for
each value (role) and distributed to the team. An agent can assign a value to its
variable only if it holds the corresponding token for that value. If an agent chooses
not to assign to its variable the value of a token, it passes that token on to another
agent. This ensures that no value is assigned to more than one agent at a time, and
limits the amount of communication needed for ensuring this constraint to the
passing of the tokens.
The central decision that an agent must now make is which tokens to keep
and which tokens to pass on. For the tokens that it keeps, the agent assigns the
corresponding values to its variable. For the tokens that it chooses to pass on, the
agent must decide to whom it will pass each token. An agent’s decision is
constrained by the constraints imposed by E-GAP. In particular, it must ensure that
it has sufficient resources to assign the values of all tokens it chooses to keep. This
is straightforward to check.
A more complicated question is whether the best interests of the team are served by the agent keeping tokens that it can assign while respecting local resource constraints. In some cases, the team may be better served by the agent passing along a token that it could have kept, thereby allowing the token to be acquired by a more capable agent who will better perform the associated role. This is a decision-theoretic choice that can be made based on the expected value to the team of passing on the token and the expected value to the team of keeping the token.
To simplify decision-making for agents (who may not have sufficient
information to make these calculations), a threshold can be calculated and attached to
the token. The threshold represents the minimum capability that an agent must have
for performing the role associated with the token in order to keep the token. An
agent need only check its capability to perform the role against the threshold; if it has
lower capability than the threshold, it passes on the token, otherwise, it keeps the
token. In this way, distributed constraint optimization is converted to distributed
constraint satisfaction, which is a much easier task. If the thresholds are chosen
appropriately, this can be done in a way that maintains satisfactory solutions.
Ideally, the threshold would be computed using a probabilistic model with
sufficient information about the team and the world-state to ensure that agents pass
the threshold test if and only if the expected value to the team of their keeping the
token is higher than the expected value to the team of passing the token on. This
would take into account all agents’ capabilities, the time it would take to reach an
agent with sufficiently high capability who will be able to execute the role (given
resource constraints), and the opportunity cost of not performing the role
immediately.
While it may be possible to identify such thresholds using exact information
on the distribution of tasks and agents in certain limited situations, this is impractical
for large systems because of the complexity of the calculation. In addition,
dynamism will quickly render a calculated threshold obsolete, and recalculating the
threshold every time the world-state changes will return LA-DCOP to one of the
problems of existing role allocation algorithms. Instead, the threshold can be
calculated using knowledge only of the distributions of capabilities and roles, which
is information that may be much more widely available. In addition, it allows the
same calculated threshold to hold up for different problem instances drawn from the
same distribution and provides a natural way to factor dynamism into the calculation
by its impact on the distributions.
Another way to find the appropriate threshold values is by using dynamic
thresholds, which essentially are a very crude learning mechanism whereby tokens
approximate the correct threshold as they pass through the system. The threshold
value attached to a token is lowered each time the token is passed.
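The thesis treats the exact update function and rate as experimental parameters; one plausible choice (an assumption, not the prescribed rule) is a simple multiplicative decay applied on every pass:

    def lower_threshold(threshold, rate=0.95):
        # Lower the token's threshold each time it is passed, so a token
        # that keeps circulating eventually becomes acceptable to some agent.
        return threshold * rate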
Once the tokens to be passed are identified, the agent must choose to whom
they will be passed. It is at this point that LA-DCOP allows agents to use
information they may have on the structure of the team and which agents may have
high capabilities. At the very least, agents should not pass the token back to agents
that have rejected it because of the threshold, since this could cause thrashing. Thus,
a history is attached to each token recording which agents have rejected it.
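A sketch of this recipient choice, assuming each token carries a history set of the agents that have rejected it (attribute and parameter names here are illustrative):

    import random

    def choose_recipient(token, teammates, believed_capable=()):
        # Never pass the token back to an agent that has already rejected it.
        candidates = [a for a in teammates if a not in token.history]
        if not candidates:
            candidates = list(teammates)  # everyone has rejected it; retry
        # Exploit local knowledge: prefer agents believed to be capable.
        preferred = [a for a in candidates if a in believed_capable]
        return random.choice(preferred or candidates)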
AND-constraints pose special problems because they require the
simultaneous execution of tasks. This introduces the possibility of deadlock and
starvation. To see how deadlock can occur, consider a simple example with two
agents, Agent1 and Agent2, and four roles, A, B, C, and D, each of which requires an
agent’s full resources to perform, thus limiting agents to performing only one role at
a time. Suppose that A and B are AND-constrained and C and D are AND-constrained. It may be that Agent1 has higher capability to perform A than any other
role. Similarly, suppose that Agent2 has higher capability to perform C than any
other role. Then if Agent1 gets the token for A, it will prefer to perform A above all
other tasks, and likewise for Agent2 with C. Agent1 now waits for Agent2 to
perform B, while Agent2 waits for Agent1 to perform D. Deadlock has occurred,
and no roles are performed. Starvation, while less serious, would happen if C and D
were not AND-constrained. Then Agent1 would choose to perform A and wait
forever for Agent2 to perform B, while Agent2 happily performs C. Deadlock and
starvation are just extreme cases of the inefficiencies that could emerge if agents
naively ignore AND-constraints in choosing which role to perform and simply wait
for other agents to perform roles that are constrained to their own.
To improve performance and avoid, or at least reduce, deadlock and
starvation, LA-DCOP uses potential tokens for AND-constrained roles. The idea is
that the tokens for each set of AND-constrained roles are given to a single agent,
which will be the “owner” for that set of roles. For each of the AND-constrained
roles, the owner creates a small number of potential tokens, which are circulated
among the agents instead of the normal token representing that role. Potential tokens
can be passed or accepted just like normal tokens, but whereas accepting a normal
token means that the agent will perform the corresponding role, accepting a potential
token just commits the agent to performing the role when a coalition to perform all
of the constrained tasks is found. An agent that has accepted a potential token is free
to perform other roles while waiting for the coalition to be fully formed. An agent
that accepts a potential token is called a retainer for the corresponding role, and is
said to be retained for the role, in the same sense that attorneys or other professionals
in the real world are retained for their services. The owner of the set of roles (who
need not be a retainer itself) is informed whenever an agent is retained or an agent
previously retained opts out of the retainer agreement. When the owner detects that
an agent has been retained for each role, it chooses one of the retainers for each of
the constrained roles and issues them the corresponding real tokens. This is called
locking the coalition. All potential tokens for the constrained roles are then revoked.
Because of the asynchrony of the algorithm, there is still a small chance that a lock
will fail (i.e., the owner will think that there is a retainer for each role when in fact one of the agents has since left the retainer agreement).
Figure 1 shows the basic algorithm that is run by each agent. It is event-driven, handling each message as it arrives. If the message is a token, the agent first checks to see if its capability to perform the role passes the threshold test. If not, the token is summarily passed on. If it does, then the agent temporarily accepts the token while deciding what to do next. The agent then calculates what subset of its current real tokens (including the newly accepted token, if it was a real token) maximizes its output while satisfying its resource constraint (this calculation is carried out by the MAXCAP function); all other real tokens are kicked out. The agent then goes through its list of potential tokens (including the newly accepted token, if it was a potential token) and determines for each potential token if its role would be executed were it actually a real token. If so, the token is kept (and the agent stays retained); if not, the token is passed on (and the agent ends the retainer). This ensures that if a coalition lock is sent out and the agent is called upon to fulfill its retainer agreement, the agent will perform the role based on its greedy role selection mechanism, thereby preventing coalition failure. Tokens are passed on using the PASSON function, which also updates the threshold for the token being passed, if dynamic thresholds are being used.

Figure 1: Algorithm 1: Agent algorithm. The algorithm run by each agent.

Algorithm 1: Agent Algorithm
VARMONITOR(Cap, Resources)
    V ← Ø, PV ← Ø
    while true
        msg ← GETMSG()
        if msg is token
            token ← msg
            if token.threshold < Cap(token.value)
                if token.potential
                    PV ← PV ∪ token.value
                    SENDMSG(token.owner, “retained”)
                else
                    V ← V ∪ token.value
                    if Σv∈V Resources(v) > 1.0
                        out ← V − MAXCAP(V)
                        foreach v in out PASSON(new token(v))
                        V ← V − out
                    foreach pv in PV
                        if Σv∈V∪pv Resources(v) > 1.0
                            if pv is not in MAXCAP(V ∪ pv)
                                PV ← PV − pv
                                SENDMSG(pv.owner, “released”)
                                PASSON(new token(pv, potential))
            else
                PASSON(token)
        else if msg is “lock v”
            PV ← PV − v
            V ← V ∪ v
        else if msg is “release v”
            PV ← PV − v
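The pseudocode translates fairly directly into an event loop. The following Python sketch is one possible rendering under simplifying assumptions: MAXCAP is implemented as a brute-force subset search (the thesis does not prescribe how it is computed), messages are plain objects with kind, value, threshold, potential, owner, and sender fields, and transport is abstracted behind the send_msg and pass_on callables.

    from itertools import combinations

    def maxcap(values, Cap, Resources):
        # Brute-force stand-in for MAXCAP: among the held values, find the
        # resource-feasible subset with the highest total capability.
        best, best_cap = set(), -1.0
        for k in range(len(values) + 1):
            for subset in combinations(values, k):
                if sum(Resources[v] for v in subset) <= 1.0:
                    cap = sum(Cap[v] for v in subset)
                    if cap > best_cap:
                        best, best_cap = set(subset), cap
        return best

    def var_monitor(inbox, Cap, Resources, send_msg, pass_on):
        V = set()    # values (roles) this agent is currently performing
        PV = {}      # potential values held, mapped to their owners
        for msg in inbox:                     # while true: msg <- GETMSG()
            if msg.kind == "token":
                if msg.threshold < Cap[msg.value]:
                    if msg.potential:
                        PV[msg.value] = msg.owner
                        send_msg(msg.owner, ("retained", msg.value))
                    else:
                        V.add(msg.value)
                        # Kick out weakest values if resources are exceeded.
                        if sum(Resources[v] for v in V) > 1.0:
                            out = V - maxcap(V, Cap, Resources)
                            for v in out:
                                pass_on(v, potential=False)
                            V -= out
                        # End any retainer that would no longer be honored.
                        for pv, owner in list(PV.items()):
                            if (sum(Resources[v] for v in V | {pv}) > 1.0
                                    and pv not in maxcap(V | {pv}, Cap, Resources)):
                                del PV[pv]
                                send_msg(owner, ("released", pv))
                                pass_on(pv, potential=True)
                else:
                    # Below threshold: forward the token (with its history).
                    pass_on(msg.value, potential=msg.potential)
            elif msg.kind == "lock":
                PV.pop(msg.value, None)
                V.add(msg.value)
            elif msg.kind == "release":
                PV.pop(msg.value, None)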
Figure 2 shows the algorithm executed by the owner of each set of AND-constrained tasks. The owner first distributes the potential tokens, then waits for at least one potential token to be accepted for each role. Once the owner determines that an agent has been retained for each role, it locks the coalition and releases the other retainers.

Figure 2: Algorithm 2: AND-constraint owner algorithm. The algorithm run by the owner of each AND-constrained set.

Algorithm 2: Monitoring AND-constrained roles
ANDMONITOR(V)
    foreach v ∈ V
        for 1 to number of potential tokens
            PASSON(new token(v, potential))
    - Wait until at least one potential token for each real token is accepted
    while ∃ v ∈ V such that |Retained[v]| = 0
        msg ← GETMSG()
        if msg is “retained v”
            Retained[v] ← Retained[v] ∪ msg.sender
        else if msg is “release v”
            Retained[v] ← Retained[v] − msg.sender
    - Choose one potential token holder for each token to send the real token
    foreach v ∈ V
        a* ← the a ∈ Retained[v] with maximal Cap(a, v)
        SENDMSG(a*, “lock v”)
        foreach a ∈ Retained[v]
            if a ≠ a*
                SENDMSG(a, “release v”)
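A matching sketch of the owner’s side, under the same assumptions about message objects and transport:

    def and_monitor(V, n_potential, get_msg, send_msg, pass_on, Cap):
        # Circulate n_potential potential tokens per AND-constrained role.
        for v in V:
            for _ in range(n_potential):
                pass_on(v, potential=True)
        retained = {v: set() for v in V}
        # Wait until at least one agent is retained for every role.
        while any(not holders for holders in retained.values()):
            msg = get_msg()
            if msg.kind == "retained":
                retained[msg.value].add(msg.sender)
            elif msg.kind == "release":
                retained[msg.value].discard(msg.sender)
        # Lock the coalition: the most capable retainer for each role gets
        # the real token; all other retainers are released.
        for v in V:
            best = max(retained[v], key=lambda a: Cap[a, v])
            send_msg(best, ("lock", v))
            for a in retained[v] - {best}:
                send_msg(a, ("release", v))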
CHAPTER THREE - Experimental Results
LA-DCOP was empirically evaluated using an abstracted simulator that
offered fine control over a broad range of parameters. The results show that it
outperforms or performs comparably to other role allocation schemes. I also tested
the algorithm using various parameter settings which were not replicable using other
algorithms. The experiments are summarized in Table 1; I will present the results for
each experiment below.
Table 1: Summary of experiments performed.

Output model   GAP or non-GAP   AND or no AND   Summary of experiments
Economic       non-GAP          no AND          Compared to DSA and greedy algorithms; tested static and dynamic thresholds
Economic       non-GAP          AND             Tested retainers
Economic       GAP              no AND          Tested static thresholds
Economic       GAP              AND             No experiments performed
Non-economic   non-GAP          no AND          Tested static and dynamic thresholds
Non-economic   non-GAP          AND             No experiments performed
Non-economic   GAP              no AND          No experiments performed
Non-economic   GAP              AND             No experiments performed
For all experiments, I assumed five different classes of roles, and hence five
capability types. Agents were assigned capabilities independently of each other, and
each of the five capabilities of each agent was also determined independently of that
agent’s other capabilities. Tasks were initially assigned to agents randomly and the
simulation run for 1000 timesteps; the simulator was run 20 times for each parameter
setting and the results averaged over the 20 runs.
Experiments can be classified into eight general categories using three
parameters (this is shown in Table 1). LA-DCOP is compared to other algorithms only in the economic output, non-GAP, no-AND-constrained case, because this is the only configuration that the other algorithms support.
Output model: The first parameter specifies the type of output model used.
Two output models are supported: the economic output model and the non-economic
output model. With economic output, agents performing a role receive value for that
performance on each time step. Under the non-economic output model, roles have
durations, and value is received only if a single agent continuously performs the role
for the entire duration of the role. Note that the output generated under the non-economic output model will generally be less than that generated under the economic
output model, ceteris paribus. If a role has a duration of x time steps, then that role
could potentially generate x times the output under an economic output model as it
could under a non-economic output model. Correspondingly, both thresholds and
the greedy kick-out mechanism use output per remaining execution time to compare
the values of performing roles. This biases choices toward shorter roles.
GAP: The second parameter indicates whether agents can only perform a
single role at a time (this is a non-GAP setting) or can execute multiple roles at once
(this is a GAP setting). For GAP problems, tasks require an amount of resources
randomly chosen from the set {0.25, 0.5, 0.75}. Each agent has 1.0 resources that it
17
can expend each time step in performing roles. Thus, an agent can simultaneously
perform 4 roles at most, and 2 roles on average.
AND-constraints: The third parameter specifies whether AND constraints
exist between roles.
Other parameters that are varied in these experiments are the number of
agents, the number of roles, the chance for an agent to have a non-zero capability,
the amount of dynamism in the environment, and the threshold used for determining
if agents pass on tokens. Five parameters are unique to AND-constrained
experiments: the size of the AND-constrained sets, the percentage of roles that were
AND-constrained, whether retainers were used, and if retainers were used, the
maximum number of roles for which an agent could be retained and the maximum
number of agents a role could retain. In addition, two parameters were used for
experiments using dynamic thresholds, one specifying the type of function used to
update thresholds, and the other specifying the rate at which thresholds were
modified.
Percentage Capable: For each agent, capabilities were determined
independently. An agent had a random chance, equal to the percentage capable
parameter, to have a non-zero capability for each class of roles. If an agent was
determined to have a non-zero capability for a class of roles, then its actual capability
to perform that class of roles was drawn from a uniform(0, 1) distribution. For
example, if percentage capable were 0.6, then each agent would have a 60% chance
of having a non-zero capability for each capability type. As a result, for each class
of roles, about 60% of the agents would have some capability to perform that role,
and of those with non-zero capabilities, the average capability would be 0.5 (since
those capabilities would be randomly chosen between 0 and 1).
Dynamism: Dynamism is used in two different ways, depending on whether
economic output or non-economic output is used. If economic output is used,
dynamism reflects a percentage of roles that are randomly replaced by new roles on
every time step. Hence, dynamism of 0.1 corresponds to 10% of the roles being
replaced on every time step. New roles are chosen randomly among the different
classes of roles. If non-economic output is used, dynamism is used to find the
average number of timesteps that roles must be performed for output to be generated.
Average duration of a role is calculated as the reciprocal of dynamism; hence,
dynamism of 0.1 corresponds to an average role duration of 10 time steps.
Individual durations are drawn from discrete uniform distributions over positive
values with a minimum duration of 1 timestep and a maximum of twice the average
duration. For example, when dynamism is 0.1, the minimum duration a role can
have is 1 time step and the maximum is 20 timesteps, with the average duration of
roles being 10 timesteps.
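For concreteness, the two sampling rules just described could be implemented as follows (a sketch; these helpers are mine, not the simulator’s):

    import random

    def sample_capability(pct_capable):
        # With probability pct_capable the agent is capable of the role
        # class, with capability uniform on (0, 1); otherwise zero. Among
        # capable agents the mean capability is therefore 0.5.
        return random.uniform(0.0, 1.0) if random.random() < pct_capable else 0.0

    def sample_duration(dynamism):
        # Non-economic model: average duration is the reciprocal of
        # dynamism; individual durations are discrete uniform on
        # [1, 2 * average].
        avg = round(1.0 / dynamism)
        return random.randint(1, 2 * avg)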
k-AND: For simplicity, AND-constraints were assumed to always involve a
fixed number of roles. Thus, for a 3-AND setting, all AND-constraints involve three
roles which all must be performed simultaneously for output to be received from any
of them. I tested AND-constraints using 2-AND, 3-AND, 4-AND, and 5-AND
settings.
Retainers: When retainers are used, I will use the notation “x/y retainers” to
denote that agents can be retained for at most x roles, and each AND-constrained
role can retain at most y agents (y potential tokens are created for each token
corresponding to an AND-constrained role).
Comparisons
LA-DCOP was tested against two other algorithms. The first was DSA, an
established anytime, approximate DCOP algorithm [2]. DSA provides a good
baseline for comparing both the output and communication of LA-DCOP. The other
algorithm compared to LA-DCOP was a centralized greedy algorithm that allocated
roles to agents at each timestep by sequentially going through the tasks and choosing
the available agent with the highest capability to perform that task. The greedy
algorithm provides a reasonable approximation of the optimal allocation, which,
while theoretically calculable, was too computationally expensive to achieve for the
problem sizes under consideration. Because of the limitations of both DSA and the
simple greedy algorithm, these comparisons could only be made in the non-GAP
case using the economic output model with no AND-constraints. Also, all agents
have non-zero capabilities to perform all roles.
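A minimal sketch of this greedy baseline, under one plausible reading of the description (the original implementation is not given):

    def greedy_allocate(agents, roles, Cap):
        # Sweep the roles in order, assigning each to the still-available
        # agent with the highest capability for it (non-GAP: one role per
        # agent).
        available = set(agents)
        allocation = {}
        for r in roles:
            if not available:
                break
            best = max(available, key=lambda a: Cap[a, r])
            allocation[r] = best
            available.discard(best)
        return allocation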
DSA: Figure 3 shows the output for DSA and LA-DCOP as the number of
agents and roles increases. The number of agents and roles increases from 10 to
1400. Results are shown for LA-DCOP run with two different thresholds, 0.0 (no
threshold) and 0.5. This experiment was designed to show how LA-DCOP scales
with the size of the input problem, and in this case, the number of agents is equal to
the number of roles. Output here is measured in average output per role per
timestep, which is always a value between 0.0 and 1.0. Figure 4 plots the total
communication for this experiment and Figure 5 shows the communication per role
per timestep, comparing DSA and LA-DCOP with 0.0 and 0.5 thresholds.
Figure 3: Comparison of output for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role.
[Figure: output per role per timestep vs. problem size (#agents = #roles); series: DSA, LA-DCOP with 0.0 threshold, LA-DCOP with 0.5 threshold.]
Figure 3 clearly shows that LA-DCOP outperforms DSA in output when the
number of agents is equal to the number of roles. DSA output averages around 0.5
for each role during each timestep. This demonstrates that roles under DSA are no
more likely to be performed by highly capable agents than less capable agents, since
the average capability is 0.5. In contrast, the greedy, token-based algorithm used by
LA-DCOP is able to allocate roles to agents that have high capabilities to perform
those tasks, even without using thresholds (threshold of 0.0). Using a threshold of
0.5 yields an even better allocation in terms of output.
Figure 4: Comparison of total communication for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role.
[Figure: total communication in messages, log scale, vs. problem size (#agents = #roles); series: DSA, LA-DCOP with 0.0 threshold, LA-DCOP with 0.5 threshold.]
Figure 4 shows the total amount of communication required for DSA and LA-DCOP.
Figure 5 shows the communication per role in each timestep for DSA and LA-DCOP. Communication is measured in messages passed between two agents;
“broadcast” messages to all agents are counted as N different messages, where N is
the number of agents. A variable under DSA broadcasts its value to all variables
whenever its value changes, and so the number of messages per role used for DSA
increases as the number of agents increases. In contrast, a variable under LA-DCOP
sends a single message when it passes a token to another agent, thereby achieving
communication per role that is independent of the number of agents.
Figure 5: Communication per role for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role.
[Figure: communication per role in messages, log scale, vs. problem size (#agents = #roles); series: DSA, LA-DCOP with 0.0 threshold, LA-DCOP with 0.5 threshold.]
These communication behaviors are shown in Figure 5, where the number of messages per role per timestep increases with the number of agents and roles for DSA and holds steady for LA-DCOP. It is also not surprising that the total communication for DSA increases faster than it does for LA-DCOP. Higher thresholds for LA-DCOP also increase communication, though not as dramatically. This is because higher thresholds mean that tokens are less likely to be accepted by an agent (because more agents will have capabilities too low to pass the threshold test), and so there is increased communication.
Figure 6: Comparison of output per role for a centralized greedy algorithm and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform all tasks.
[Figure: output per role per time vs. problem size (#agents = #roles); series: Greedy, LA-DCOP with 0.0 threshold, LA-DCOP with 0.5 threshold.]
Greedy: Figure 6 shows the output for the centralized greedy algorithm and
LA-DCOP as the number of agents and tasks increases. The number of agents is
equal to the number of roles, and increases from 10 to 500. Results for LA-DCOP
with two different threshold settings, 0.0 (no thresholds) and 0.5, are shown. LA-DCOP performs very well compared to the centralized greedy algorithm. There is
no comparison of communication since the greedy algorithm is centralized and hence
has no communication.
Figure 7: Effect of percentage of capable agents and threshold on output. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
[Figure: output per role per time vs. threshold (0–0.9); one curve per percentage of capable agents, 10% through 100%.]
Economic output, non-GAP, no AND constraints
Figures 7 and 8 show the effect of thresholds on output for varying percentages of agents with non-zero capabilities for each role class. Both figures are based on the same data, but Figure 7 varies the threshold on the x-axis and Figure 8 varies the percentage of capable agents on the x-axis. This experiment involved 500 agents and 500 roles and a static environment.
Figure 8: Effect of percentage of capable agents and threshold on output. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
[Figure: output per role per time vs. percentage of capable agents (10%–100%); one curve per threshold, 0.0 through 0.9.]
Output increases as the percentage of capable agents increases. This makes sense because a higher percentage of capable agents means that there are more
agents with high capabilities that can take on roles. The threshold value that
maximizes output also increases as the percentage of capable agents increases. The
higher threshold forces tokens to be passed until the roles can be performed by
agents with capabilities higher than the threshold. However, if thresholds become
too high, there is a lack of agents with sufficiently high capabilities, and so some roles
can never be performed. In this case, thresholds actually hurt performance, because
they prevent roles that might otherwise have been performed (albeit by less capable
agents) from being performed at all.
Figure 9: Effect of percentage of capable agents and threshold on output of each performed role. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
[Figure: output per performed role vs. threshold (0–0.9); one curve per percentage of capable agents, 10% through 100%.]
Figure 10: Effect of thresholds and percentage of capable agents on the percentage of roles that are performed in each timestep. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
[Figure: percentage of roles performed per timestep vs. threshold (0–0.9); one curve per percentage of capable agents, 10% through 100%.]
Figure 9 shows the output received for each
role performed; note that as thresholds are increased, the output per performed role
increases to always stay above the threshold. However, Figure 10 shows that the
percentage of roles being performed per timestep decreases dramatically as
thresholds increase. The net result is that output degrades quite rapidly for high
thresholds, emphasizing the need for thresholds to be selected carefully.
Economic output, non-GAP, AND constraints
Figures 11–13 show the output achieved by LA-DCOP when there are 2-AND constraints between roles in a static environment and no thresholds. These experiments were conducted with 1500 agents and 1500 roles. Figure 11 shows the base LA-DCOP algorithm output using no retainers, Figure 12 shows the output
using 1/1 retainers, and Figure 13 shows the output using 5/5 retainers.
Figure 11: 2-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
[Figure: output per role per time vs. percentage of capable agents (10%–100%); one curve per percentage of AND-constrained roles, 10% through 100%.]
In Figure 11, differences between outputs for different percentages of AND-constrained roles are greatest for low percentages of capable agents. As the percentage of capable agents increases, this difference becomes small, and when all agents have some capability to perform all roles, there is no difference in output for different percentages of AND-constrained roles. Even at low percentages of capable agents, however, the absolute difference in outputs for the different percentages of AND-constrained roles is low. Furthermore, the output is close to that received
when there are no AND-constraints (see the 0.0 threshold line in Figure 8), which suggests there is little room to improve performance in this case by using retainers.
Indeed, we see in Figure 12 that adding minimal 1/1 retainers does little to
change performance. There is a small increase of about 0.02 or 0.03 in output for
low percentages of AND-constrained roles, but this is partially offset by a loss of
about 0.01 in output at high percentages of AND-constrained roles and high
percentages of capable agents.
Figure 12: 2-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
[Figure: output per role per time vs. percentage of capable agents (10%–100%); one curve per percentage of AND-constrained roles, 10% through 100%.]
Using 5/5 retainers did much to reduce any difference between output for different percentages of AND-constrained tasks, as shown in Figure 13. The outputs
for all percentages of AND-constrained tasks are very close to the unconstrained
outputs in Figure 8.
Figure 13: 2-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
[Figure: output per role per time vs. percentage of capable agents (10%–100%); one curve per percentage of AND-constrained roles, 10% through 100%.]
Figure 14: 5-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
[Figure: output per role per time vs. percentage of capable agents (10%–100%); one curve per percentage of AND-constrained roles, 10% through 100%.]
Retainers made much more of a difference when the AND-constrained set
sizes were larger. Figures 14, 15, and 16 show the output for no retainers, 1/1
retainers, and 5/5 retainers in a 5-AND case, also with equal numbers of roles and
agents and no thresholds. The results suggest that retainers may only be useful for
larger sets of AND-constrained roles. However, for high percentages of capable
agents, retainers actually lower the output, by as much as 0.10 per task for 1/1
retainers, and to a much lesser extent for 5/5 retainers.
Also, the number of agents that can be retained for each role and the number of roles for which each agent can be retained seem to factor significantly into the performance in this case. Figure 15 shows the output using 1/1 retainers. The curve is drastically different in shape from the case of no retainers or 5/5 retainers.
Figure 15: 5-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
[Figure: output per role per time vs. percentage of capable agents (10%–100%); one curve per percentage of AND-constrained roles, 10% through 100%.]
As shown by the performance of the 5/5 retainers in both Figure 13 and
Figure 16, retainers can make a significant impact on performance for AND-constrained tasks, greatly reducing the decrease in output associated with increasing
percentages of AND-constrained roles.
Figure 16: 5-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
[Figure: output per role per time vs. percentage of capable agents (10%–100%); one curve per percentage of AND-constrained roles, 10% through 100%.]
Economic output, GAP, no AND
Recall that in a GAP problem, performing a role requires resources. For the GAP experiments that I ran, roles require an amount of resources randomly selected from {0.25, 0.5, 0.75}, which means that an agent can simultaneously
execute 4 roles at most, and 2 roles on average. Thus, it is reasonable to expect that
a number of agents (all of which are capable of performing any role) could, on
average, perform at most twice their number of roles simultaneously. This is the
saturation load for the team, the expected maximum number of tasks that can be
executed at once. This is different from the non-GAP case, where an equal number
Figure 17: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500
roles, Economic, GAP, no AND, static environment.
1
0.9
output per role per time
0.8
10% capable
20% capable
0.7
30% capable
0.6
40% capable
50% capable
0.5
60% capable
70% capable
0.4
80% capable
0.3
90% capable
100% capable
0.2
0.1
0
0
0.1
0.2
0.3
0.4
0.5
0.6
threshold
0.7
0.8
0.9
1
35
Figure 18: Effect of threshold and percentage of capable agents on output. 1500 agents, 3000
roles, Economic, GAP, no AND, static environment.
0.8
0.7
10% capable
output per role per time
0.6
20% capable
30% capable
0.5
40% capable
50% capable
0.4
60% capable
70% capable
0.3
80% capable
90% capable
0.2
100% capable
0.1
0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
threshold
of agents and roles constitutes a fully loaded team, where every agent has a role to
perform and all tasks can be simultaneously performed. Lower percentages of
capable agents effectively reduces the number of agents for load purposes.
Figure 17 shows how output changes as the threshold and the percentage of
capable agents vary, when there are equal numbers of agents and roles. A greater
percentage of capable agents increases output for all thresholds, and also increases
the threshold that yields maximum reward.
Figure 19: Effect of threshold and percentage of capable agents on the percentage of roles performed per timestep. 1500 agents, 3000 roles, economic output, GAP, no AND-constraints, static environment.
[Figure: percentage of roles performed per timestep vs. threshold (0–0.9); one curve per percentage of capable agents, 10% through 100%.]
Figure 18 shows the case when there are twice as many roles as agents. This
is the expected full load for the system. Note that thresholds yield much smaller
gains as compared to Figure 17. In fact, Figure 18 is somewhat similar to Figure 7,
which showed output for the fully loaded non-GAP case. The non-GAP case shows
a greater decrease in output for very high thresholds. This is probably because the
expected number of highly capable agents is the same in both cases, but in the GAP
case, these agents are able to perform more tasks simultaneously. Figure 19 shows
the percentage of tasks that are performed per timestep. Comparing this to Figure
10, which shows the same statistic for the non-GAP case, it is clear that far more
roles are performed at high thresholds in the GAP case than the non-GAP case.
Figure 20: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic output, GAP, no AND-constraints, 1% dynamism.
[Figure: output per role per time vs. threshold (0–0.9); one curve per percentage of capable agents, 10% through 100%.]
LA-DCOP also holds up well with dynamics. Figures 20 and 21 show output
for dynamisms of 1% and 10%, respectively, in the case where there are equal
numbers of agents and roles. As can be seen, performance degrades slightly, but
even when 10% of the roles change on every timestep, LA-DCOP performs well and
appropriate threshold values continue to have a big impact.
Figure 21: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic output, GAP, no AND-constraints, 10% dynamism.
[Figure: output per role per time vs. threshold (0–0.9); one curve per percentage of capable agents, 10% through 100%.]
Non-economic output, non-GAP, no AND
Figure 22 shows the output for different thresholds and percentages of
capable agents under the non-economic output model with equal numbers of agents
and roles and 10% dynamism. Recall that this means that roles have a minimum
duration of 1 timestep and a maximum duration of 20 timesteps, with an average
duration of 10 timesteps. Also, agents now base their threshold and kick-out comparisons on output per unit time of remaining work that needs to be done.
This biases agents very heavily toward shorter tasks. For instance, if an agent has
two tokens A and B representing roles that it can perform with capabilities 0.2 and
0.9, respectively, but A has a duration of 1 timestep and B has a duration of 5 timesteps, then the agent will choose to keep A and kick out B, because the output per time is 0.2 for A and 0.18 for B.
Figure 22: Effect of threshold and percentage of capable agents on output. 500 agents, 500 roles, non-economic output, non-GAP, no AND-constraints, 10% dynamism.
[Figure: output per role per time vs. threshold (x-axis not to scale: 0, 0.02, 0.04, 0.06, 0.08, 0.1, 0.3, 0.5, 0.7, 0.9); one curve per percentage of capable agents, 10% through 100%.]
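This comparison can be written directly (a trivial sketch with illustrative names):

    def keep_rate(capability, remaining_duration):
        # Non-economic model: tokens are compared by output per unit of
        # remaining work, which biases agents toward shorter roles.
        return capability / remaining_duration

    # The example from the text: A (capability 0.2, duration 1) beats
    # B (capability 0.9, duration 5), since 0.2 > 0.18.
    assert keep_rate(0.2, 1) > keep_rate(0.9, 5)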
Also, outputs will be lower than under
economic output models. One consequence is that thresholds must also be lower.
Figure 22 shows very small thresholds on the left half and larger thresholds on the
right. Note that the x-axis values are not to scale. The fact that the base greedy
token algorithm without thresholds (threshold of 0.0) performs almost as well as or
outperforms the algorithm with all thresholds may be a consequence of the fact that
the minimum duration is 1 timestep, which is the same as the time it takes to pass a
token. As expected, increasing the percentage of capable agents increases the output.
CHAPTER FOUR – Conclusion and Future Work
As future multiagent domains increase in size and complexity, new
coordination algorithms must be developed to keep cooperative teams running
efficiently and effectively. LA-DCOP meets those rising challenges and performs
distributed, asynchronous, anytime role allocation for teams of cooperative agents.
Its scalability and effectiveness are a good indicator of its power on a problem known to be intractable. Empirical simulation shows that LA-DCOP can match centralized
algorithms and greatly outperform other distributed, approximate algorithms while
using orders of magnitude fewer messages.
Thresholds, one of the distinguishing features of LA-DCOP, improved
performance greatly in many instances, but also impaired performance under some
circumstances. Thresholds can be harmful when they exclude agents that would perform roles in the optimal allocation from performing any roles at all. This happens
especially when the number of tasks greatly exceeds the capacity of agents to
execute them. However, in the opposite circumstance, when there is an excess of
agents for the number of roles, thresholds can yield substantial improvements.
Potential tokens and associated retainers were very effective at mitigating the
inefficiencies of assembling a coalition to perform sets of constrained tasks. While
decreases in performance were seen in some cases, these were small when compared
to the gains when constrained sets were large.
It is clear that a mathematical theory must be developed to address some of
these issues, such as selecting a proper threshold to get maximal benefit. There is also substantial room to incorporate additional constraints besides just AND-constraints. Sequential constraints, where roles must be performed in a certain order for reward to be gained, are one possibility.
The success of LA-DCOP in outperforming DSA indicates that the
difficulties posed by multiagent coordination are not inherently insurmountable, but
merely beyond the reach of our current technology. LA-DCOP and other new algorithms will
push those boundaries farther out to admit tomorrow’s multiagent domains.
BIBLIOGRAPHY

[1] C. Castelpietra, L. Iocchi, D. Nardi, and R. Rosati. Coordination in multi-agent autonomous cognitive robotic systems. In Proceedings of the Second International Cognitive RoboCup Workshop, 2000.

[2] S. Fitzpatrick and L. Meertens. An experimental assessment of a stochastic, anytime, decentralized, soft colourer for sparse graphs. In Stochastic Algorithms: Foundations and Applications, Proceedings SAGA, 2001.

[3] P. Modi, W. Shen, M. Tambe, and M. Yokoo. An asynchronous complete method for distributed constraint optimization. In Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems, 2003.

[4] R. Nair, M. Tambe, and S. Marsella. Role allocation and reallocation in multiagent teams: Towards a practical analysis. In Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems, 2003.

[5] O. Shehory and S. Kraus. Task allocation via coalition formation among autonomous agents. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995.

[6] P. Stone and M. Veloso. Task decomposition, dynamic role assignment, and low-bandwidth communication for real-time strategic teamwork. Artificial Intelligence, 110:241-273, June 1999.

[7] G. Tidhar, A. S. Rao, and E. A. Sonenberg. Guided team selection. In Proceedings of the Second International Conference on Multi-agent Systems (ICMAS-96), 1996.