ALLOCATING ROLES IN LARGE SCALE TEAMS: AN EMPIRICAL EVALUATION

by Steven Okamoto

A Thesis Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
MASTER OF SCIENCE (COMPUTER SCIENCE)

December 2003

Copyright 2003 Steven Okamoto

ACKNOWLEDGEMENTS

Deep, heartfelt thanks to Milind Tambe for valuable discussions, unflagging support, and much-needed guidance; to Paul Scerri for wonderful work on the simulator, helpful suggestions on experiments, tireless discussions, and generally stimulating banter; to both of them for getting me into this whole wonderful business; and to Sven Koenig for his understanding assistance, keen insight, and very pointy questions.

TABLE OF CONTENTS

Acknowledgements
List of Tables and Figures
Abstract
Introduction
Chapter One – Extended Generalized Assignment Problem
Chapter Two – Low Cost, Approximate DCOP
Chapter Three – Experimental Results
Chapter Four – Conclusion and Future Work
Bibliography

LIST OF TABLES AND FIGURES

Table 1: Summary of experiments performed.
Figure 1: Algorithm 1: Agent algorithm. The algorithm run by each agent.
Figure 2: Algorithm 2: AND-constraint owner algorithm. The algorithm run by the owner of each AND-constrained set.
Figure 3: Comparison of output for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles.
Figure 4: Comparison of total communication for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role.
Figure 5: Communication per role for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles.
Figure 6: Comparison of output per role for a centralized greedy algorithm and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform all roles.
Figure 7: Effect of percentage of capable agents and threshold on output. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 8: Effect of percentage of capable agents and threshold on output.
500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 9: Effect of percentage of capable agents and threshold on output of each performed role. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 10: Effect of thresholds and percentage of capable agents on the percentage of roles that are performed in each timestep. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment.
Figure 11: 2-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 12: 2-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 13: 2-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 14: 5-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 15: 5-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 16: 5-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Figure 17: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic, GAP, no AND, static environment.
Figure 18: Effect of threshold and percentage of capable agents on output. 1500 agents, 3000 roles, economic, GAP, no AND, static environment.
Figure 19: Effect of threshold and percentage of capable agents on the percentage of roles performed per timestep. 1500 agents, 3000 roles, economic, GAP, no AND, static environment.
Figure 20: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic, GAP, no AND, 1% dynamism.
Figure 21: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic, GAP, no AND, 10% dynamism.
Figure 22: Effect of threshold and percentage of capable agents on output. 500 agents, 500 roles, non-economic, non-GAP, no AND, 10% dynamism.

ABSTRACT

Role allocation, the act of assigning tasks to agents, is an important coordination problem for multiagent teams. It is also an intractable problem that frequently must be solved quickly if the results are to be of any real-world use. Low communication, approximate DCOP (LA-DCOP) is a distributed, asynchronous, anytime, approximate role allocation algorithm designed to quickly find good allocations for large, cooperative teams in complex domains. Four key innovations – token-based access to roles, probabilistically calculated thresholds to guide allocation changes, potential tokens to facilitate efficient coalition formation to perform constrained tasks, and exploitation of agents' local knowledge – distinguish LA-DCOP and allow it to manage complexity and quickly find good solutions. I present empirical results demonstrating that LA-DCOP outperforms a prominent competing algorithm while using several orders of magnitude fewer messages.

INTRODUCTION

Multiagent systems of the future will involve large numbers of heterogeneous agents performing many diverse, interacting tasks in dynamic environments.
Success in such domains will require coordination algorithms with unprecedented flexibility, efficiency, and robustness. In addition, many applications will demand distributed and/or asynchronous algorithms that can be executed by individual autonomous agents with only limited communication. One coordination algorithm that must cope with these demands is the algorithm that performs role allocation, the assignment of tasks to agents.

Existing role allocation techniques are unable to cope with the rigors that future applications will place on them. Firstly, existing techniques do not scale well enough to coordinate the hundreds or thousands of heterogeneous agents that will make up future teams, taking far too long to select an allocation to be of practical use in general cases [4][6]. Secondly, existing algorithms cannot deal with the rapid dynamism inherent in many domains [6][7]. Dynamism also exacerbates the running-time problem, because it shortens the amount of time that can be devoted to computation before the world changes and the allocation is potentially invalidated. Many current algorithms must be rerun whenever the world changes, an unacceptable approach given the time constraints dynamism imposes, especially since even a small change in the situation may result in arbitrarily many role allocation changes. Thirdly, existing algorithms simplify problems by assuming that agents can only execute a single task at a time [5], which is an unnecessarily limiting restriction. Fourthly, existing algorithms do not take into account local knowledge that agents may have about the capabilities of other agents. Finally, many existing techniques rely on high communication between agents [1] or on centralized computation to find high-quality allocations, neither of which may be possible in future domains where communication costs are high and robustness is needed.

LA-DCOP is a new algorithm designed to overcome these limitations. Role allocation for cooperative teams can be cast as a distributed constraint optimization problem (DCOP). LA-DCOP is a distributed, asynchronous, token-based algorithm for finding approximate solutions to the kind of DCOPs that correspond to role allocation problems for cooperative teams. These types of DCOPs are difficult for conventional DCOP algorithms to solve exactly.

Going from a role allocation problem to a DCOP requires a few transformations. Role allocation corresponds naturally to the Generalized Assignment Problem (GAP), which assigns tasks to agents so as to maximize overall utility while respecting local resource constraints. GAP is known to be NP-complete. GAP can be extended to incorporate features such as dynamism and constraints between roles. This Extended GAP (E-GAP) can then be converted to a DCOP.

CHAPTER ONE – Extended Generalized Assignment Problem

A generalized assignment problem assigns roles from a set $R$ to agents from a set $E$, maximizing the value of the assignment while respecting local resource constraints. The value of an assignment is determined by the capabilities agents have for performing roles. The capability of each agent $e_i \in E$ to perform each role $r_j \in R$ is given by

$$Cap(e_i, r_j) \in [0, 1]$$

This capability represents the agent's competence at performing that role, its chance of success, and all other factors that may affect the success of its performing the role.
These factors may differ greatly from task to task, but it is reasonable to assume that there are classes of roles sharing similar factors. An agent then has equal capability to perform any role in such a class. For example, an agent's capability to fight a fire may be vastly different from its ability to render medical aid, but its capability to perform any of several firefighting roles corresponding to different fires will not differ significantly. The capability of an agent $e_i \in E$ to perform any role $r_j \in Class$ can then be represented by

$$Cap(e_i, Class) \in [0, 1]$$

where

$$\forall r_j, r_k \in Class, \quad Cap(e_i, r_j) = Cap(e_i, r_k)$$

In addition to capabilities, each agent also has resources which are used to perform roles. While there can be different kinds of resources, I restrict focus here to the case where there is a single type of resource, which can usually be thought of as time. Furthermore, all agents have an identical amount of resources, normalized to 1.0. The amount of resources required for an agent $e_i \in E$ to perform role $r_j \in R$ is given by $Resources(e_i, r_j)$.

Let $A = (a_{i,j})$ be an allocation matrix such that

$$a_{i,j} = \begin{cases} 1 & \text{if } e_i \text{ is performing } r_j \\ 0 & \text{otherwise} \end{cases}$$

Then the goal of GAP is to find the matrix $A$ that maximizes the value $f(A)$, where

$$f(A) = \sum_{e_i \in E} \sum_{r_j \in R} Cap(e_i, r_j) \cdot a_{i,j}$$

such that

$$\forall e_i \in E, \quad \sum_{r_j \in R} Resources(e_i, r_j) \cdot a_{i,j} \leq 1.0$$

and

$$\forall r_j \in R, \quad \sum_{e_i \in E} a_{i,j} \leq 1$$

That is, GAP maximizes the sum of the capabilities with which each role is being performed, while ensuring that no agent exceeds its resources and no role is performed by more than one agent.

With a GAP formulation of role allocation, agents are able to perform multiple roles at once. However, in order to incorporate additional aspects such as dynamics and constraints between tasks, GAP must be explicitly extended. Extended GAP (E-GAP) accommodates dynamics by allowing $E$, $R$, $Cap$, and $Resources$ to vary over time. This means that a single allocation $A$ is no longer sufficient; instead, the solution to E-GAP is a sequence of allocations, $\mathbf{A}$, indexed by time. Each item in the sequence is an allocation of roles to agents for a discrete time step.

Coordination constraints $\asymp$ exist on sets of roles. The only constraint I consider here is a simultaneous-execution constraint, although many others are possible. This is the AND-constraint, which specifies that the team receives a benefit for an agent performing a role in the constrained set only if all roles in that set are performed simultaneously. Let $AND_k \in \asymp$ be an AND-constraint on the set $R_{AND_k}$. Then we can quantify the value of performing a role in the AND-constrained set $R_{AND_k}$ by

$$\forall e_i \in E, \forall r_j \in R_{AND_k}, \quad Val(e_i, r_j, \asymp) = \begin{cases} Cap(e_i, r_j) & \text{if } \sum_{r_m \in R_{AND_k}} \sum_{e_n \in E} a_{n,m} = |R_{AND_k}| \\ 0 & \text{otherwise} \end{cases}$$

This says that the value for an agent performing an AND-constrained role is the agent's capability for performing that role if all the roles in the constrained set are being executed, and 0 otherwise. For roles that do not take part in an AND-constraint, the value of performing the role is just the capability of the agent performing it, as in GAP.

The goal of E-GAP is to maximize

$$f(\mathbf{A}) = \sum_t \sum_{e_i \in E} \sum_{r_j \in R} Val(e_i, r_j, \asymp, t) \cdot a_{i,j,t}$$

such that

$$\forall t, \forall e_i \in E, \quad \sum_{r_j \in R} Resources(e_i, r_j, t) \cdot a_{i,j,t} \leq 1.0$$

and

$$\forall t, \forall r_j \in R, \quad \sum_{e_i \in E} a_{i,j,t} \leq 1$$

That is, E-GAP maximizes the sum of all rewards over all time steps (up to a finite horizon), respecting local resource constraints at every time step and ensuring that no role is ever executed by more than one agent at a time.
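To make the objective concrete, the following sketch (not from the thesis; the container names cap, resources, and allocation are illustrative) evaluates the single-timestep GAP objective and checks both constraints:

    from collections import defaultdict

    def gap_value(cap, resources, allocation):
        """Evaluate f(A) for a single-timestep GAP instance, or return None
        if the allocation violates a constraint.

        cap[i][j]       -- capability Cap(e_i, r_j), in [0, 1]
        resources[i][j] -- Resources(e_i, r_j)
        allocation      -- set of (i, j) pairs where a_{i,j} = 1
        """
        spent = defaultdict(float)     # resources expended by each agent
        performers = defaultdict(int)  # number of agents assigned to each role
        value = 0.0
        for i, j in allocation:
            spent[i] += resources[i][j]
            performers[j] += 1
            value += cap[i][j]
        if any(s > 1.0 for s in spent.values()):
            return None  # an agent exceeds its normalized 1.0 resource budget
        if any(n > 1 for n in performers.values()):
            return None  # a role is performed by more than one agent
        return value

The E-GAP objective is then just the sum of such per-timestep values, with an AND-constrained role contributing zero in any timestep in which some role in its set is unassigned.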
CHAPTER TWO – Low Cost, Approximate DCOP

LA-DCOP provides an approximate solution to E-GAP by straightforwardly reformulating the problem as a DCOP. Agents are mapped to DCOP variables and roles are mapped to DCOP values. Because an agent in E-GAP can perform multiple roles at the same time, the DCOP variables must be able to take on multiple values at the same time. Capabilities are mapped to rewards associated with a variable taking on a value. Resource constraints are mapped to constraints on the values a variable can simultaneously take. The important E-GAP constraint that a role cannot be performed by more than one agent at a time leads to a complete graph of not-equals constraints in the DCOP. Complete graphs are difficult for conventional DCOP algorithms [3]. LA-DCOP surmounts this difficulty through the use of tokens. A token is created for each value (role) and distributed to the team. An agent can assign a value to its variable only if it holds the corresponding token for that value. If an agent chooses not to assign the value of a token to its variable, it passes that token on to another agent. This ensures that no value is assigned to more than one agent at a time, and it limits the communication needed to enforce this constraint to the passing of the tokens.

The central decision an agent must make is which tokens to keep and which to pass on. For the tokens it keeps, the agent assigns the corresponding values to its variable. For the tokens it passes on, the agent must decide to whom it will pass each token. An agent's decision is bound by the constraints imposed by E-GAP. In particular, it must ensure that it has sufficient resources to assign the values of all tokens it chooses to keep. This is straightforward to check. A more complicated question is whether the best interests of the team are served by the agent keeping tokens that it can assign while respecting local resource constraints. In some cases, the team may be better served by the agent passing along a token that it could have held, thereby allowing the token to be acquired by a more capable agent who will better perform the associated role. This is a decision-theoretic choice that can be made by comparing the expected value to the team of passing on the token against the expected value of keeping it. To simplify decision-making for agents (who may not have sufficient information to make these calculations), a threshold can be calculated and attached to the token. The threshold represents the minimum capability an agent must have for performing the role associated with the token in order to keep the token. An agent need only check its capability to perform the role against the threshold; if its capability is lower than the threshold, it passes on the token; otherwise, it keeps the token. In this way, distributed constraint optimization is converted to distributed constraint satisfaction, which is a much easier task. If the thresholds are chosen appropriately, this conversion still maintains satisfactory solutions.
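The token record and the threshold test are simple enough to sketch directly. This is my reading, with illustrative field names; the text above specifies only that a token carries its value, a threshold, a potential flag, an owner, and (as discussed below) a history of rejections:

    from dataclasses import dataclass, field

    @dataclass
    class Token:
        value: str               # the role (DCOP value) this token confers
        threshold: float         # minimum capability needed to keep the token
        owner: str = ""          # for potential tokens: the AND-set owner
        potential: bool = False  # True if this is a potential token
        visited: set = field(default_factory=set)  # agents that rejected it

    def passes_threshold(capability, token):
        """The threshold test: keep the token only if this agent's
        capability for the role exceeds the token's threshold."""
        return capability > token.threshold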
Ideally, the threshold would be computed using a probabilistic model with sufficient information about the team and the world-state to ensure that agents pass the threshold test if and only if the expected value to the team of their keeping the token is higher than the expected value to the team of passing the token on. This would take into account all agents' capabilities, the time it would take to reach an agent with sufficiently high capability who would be able to execute the role (given resource constraints), and the opportunity cost of not performing the role immediately. While it may be possible to identify such thresholds using exact information on the distribution of tasks and agents in certain limited situations, this is impractical for large systems because of the complexity of the calculation. In addition, dynamism will quickly render a calculated threshold obsolete, and recalculating the threshold every time the world-state changes would reintroduce one of the problems of existing role allocation algorithms. Instead, the threshold can be calculated using knowledge only of the distributions of capabilities and roles, information that may be much more widely available. This approach also allows the same calculated threshold to remain valid for different problem instances drawn from the same distribution, and it provides a natural way to factor dynamism into the calculation through its impact on the distributions. Another way to find appropriate threshold values is to use dynamic thresholds, essentially a very crude learning mechanism whereby tokens approximate the correct threshold as they pass through the system: the threshold value attached to a token is lowered each time the token is passed.

Once the tokens to be passed on are identified, the agent must choose to whom they will be passed. It is at this point that LA-DCOP allows agents to use information they may have about the structure of the team and about which agents may have high capabilities. At the very least, agents should not pass a token back to agents that have already rejected it because of the threshold, since this could cause thrashing. Thus, a history is attached to each token recording which agents have rejected it.
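Putting the history and the dynamic-threshold update together, PASSON might look like the following sketch, which builds on the Token record above. The linear decay rate and the random fallback recipient are my assumptions; the thesis leaves the update function, its rate, and the recipient choice open, noting only that agents may exploit local knowledge of teammates:

    import random

    def pass_on(token, teammates, self_id, decay=0.05):
        """Pass on a token this agent has declined to keep: record the
        rejection, lower a dynamic threshold, and prefer agents that have
        not yet rejected the token, to avoid thrashing."""
        token.visited.add(self_id)
        token.threshold = max(0.0, token.threshold - decay)  # dynamic threshold
        fresh = [a for a in teammates if a != self_id and a not in token.visited]
        # If everyone has rejected it, recirculate anyway; the lowered
        # threshold makes eventual acceptance more likely.
        candidates = fresh or [a for a in teammates if a != self_id]
        return random.choice(candidates)  # recipient of the token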
AND-constraints pose special problems because they require the simultaneous execution of tasks. This introduces the possibility of deadlock and starvation. To see how deadlock can occur, consider a simple example with two agents, Agent1 and Agent2, and four roles, A, B, C, and D, each of which requires an agent's full resources to perform, thus limiting agents to performing one role at a time. Suppose that A and B are AND-constrained and C and D are AND-constrained. It may be that Agent1 has higher capability to perform A than any other role; similarly, suppose that Agent2 has higher capability to perform C than any other role. Then if Agent1 gets the token for A, it will prefer to perform A above all other tasks, and likewise for Agent2 with C. Agent1 now waits for Agent2 to perform B, while Agent2 waits for Agent1 to perform D. Deadlock has occurred, and no roles are performed. Starvation, while less serious, would happen if C and D were not AND-constrained: Agent1 would choose to perform A and wait forever for Agent2 to perform B, while Agent2 happily performs C. Deadlock and starvation are just extreme cases of the inefficiencies that can emerge if agents naively ignore AND-constraints when choosing which role to perform and simply wait for other agents to perform roles that are constrained to their own.

To improve performance and avoid, or at least reduce, deadlock and starvation, LA-DCOP uses potential tokens for AND-constrained roles. The idea is that the tokens for each set of AND-constrained roles are given to a single agent, which becomes the "owner" for that set of roles. For each of the AND-constrained roles, the owner creates a small number of potential tokens, which circulate among the agents instead of the normal token representing that role. Potential tokens can be passed or accepted just like normal tokens, but whereas accepting a normal token means that the agent will perform the corresponding role, accepting a potential token only commits the agent to performing the role once a coalition to perform all of the constrained tasks is found. An agent that has accepted a potential token is free to perform other roles while waiting for the coalition to be fully formed. An agent that accepts a potential token is called a retainer for the corresponding role, and is said to be retained for the role, in the same sense that attorneys or other professionals in the real world are retained for their services. The owner of the set of roles (who need not be a retainer itself) is informed whenever an agent is retained or a previously retained agent opts out of the retainer agreement. When the owner detects that an agent has been retained for each role, it chooses one of the retainers for each of the constrained roles and issues them the corresponding real tokens. This is called locking the coalition. All potential tokens for the constrained roles are then revoked. Because of the asynchrony of the algorithm, there is still a small chance that a lock will fail (i.e., the owner will believe there is a retainer for each role when in fact one of the agents has since left the retainer agreement).

Figure 1 shows the basic algorithm run by each agent. It is event-driven, handling each message as it arrives.

Figure 1: Algorithm 1: Agent algorithm. The algorithm run by each agent.

Algorithm 1: Agent algorithm
VARMONITOR(Cap, Resources)
  V ← Ø, PV ← Ø
  while true
    msg ← GETMSG()
    if msg is a token
      token ← msg
      if token.threshold < Cap(token.value)
        if token.potential
          PV ← PV ∪ {token.value}
          SENDMSG(token.owner, "retained")
        else
          V ← V ∪ {token.value}
        if Σ_{v ∈ V} Resources(v) > 1.0
          out ← V − MAXCAP(V)
          foreach v ∈ out
            PASSON(new token(v))
          V ← V − out
        foreach pv ∈ PV
          if Σ_{v ∈ V ∪ {pv}} Resources(v) > 1.0 and pv ∉ MAXCAP(V ∪ {pv})
            PV ← PV − {pv}
            SENDMSG(pv.owner, "released")
            PASSON(new token(pv, potential))
      else
        PASSON(token)
    else if msg is "lock v"
      PV ← PV − {v}
      V ← V ∪ {v}
    else if msg is "release v"
      PV ← PV − {v}

If the message is a token, the agent first checks whether its capability to perform the role passes the threshold test. If not, the token is summarily passed on. If so, the agent temporarily accepts the token while deciding what to do next. The agent then calculates which subset of its current real tokens (including the newly accepted token, if it was a real token) maximizes its output while satisfying its resource constraint (this calculation is carried out by the MAXCAP function); all other real tokens are kicked out. The agent then goes through its list of potential tokens (including the newly accepted token, if it was a potential token) and determines, for each, whether its role would be executed were it actually a real token. If so, the token is kept (and the agent stays retained); if not, the token is passed on (and the agent ends the retainer agreement). This ensures that if a coalition lock is sent out and the agent is called upon to fulfill its retainer agreement, the agent will perform the role under its greedy role selection mechanism, thereby preventing coalition failure. Tokens are passed on using the PASSON function, which also updates the threshold of the token being passed if dynamic thresholds are being used.
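The thesis does not give MAXCAP's internals. Since an agent holds only a handful of tokens at a time, a brute-force sketch over subsets is plausible (shown here for the economic model, where a role's value is simply the agent's capability):

    from itertools import combinations

    def maxcap(held, cap, resources, budget=1.0):
        """Return the subset of held role tokens that maximizes summed
        capability while fitting within the agent's resource budget."""
        best, best_value = set(), 0.0
        for k in range(1, len(held) + 1):
            for subset in combinations(held, k):
                if sum(resources[v] for v in subset) <= budget:
                    total = sum(cap[v] for v in subset)
                    if total > best_value:
                        best, best_value = set(subset), total
        return best

Under the non-economic model described in Chapter Three, the comparison would instead use output per unit of remaining execution time.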
Figure 2 shows the algorithm executed by the owner of each set of AND-constrained roles. The owner first distributes the potential tokens, then waits until at least one potential token has been accepted for each role. Once the owner determines that an agent has been retained for every role, it locks the coalition and releases the other retainers.

Figure 2: Algorithm 2: AND-constraint owner algorithm. The algorithm run by the owner of each AND-constrained set.

Algorithm 2: Monitoring AND-constrained roles
ANDMONITOR(V)
  foreach v ∈ V
    for 1 to (number of potential tokens per role)
      PASSON(new token(v, potential))
  // wait until at least one potential token for each real token is accepted
  while ∃ v ∈ V such that |Retained[v]| = 0
    msg ← GETMSG()
    if msg is "retained v"
      Retained[v] ← Retained[v] ∪ {msg.sender}
    else if msg is "release v"
      Retained[v] ← Retained[v] − {msg.sender}
  // choose one retainer for each role and send it the real token
  foreach v ∈ V
    a* ← argmax_{a ∈ Retained[v]} Cap(a, v)
    SENDMSG(a*, "lock v")
    foreach a ∈ Retained[v]
      if a ≠ a*
        SENDMSG(a, "release v")
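On the owner's side, the lock decision reduces to a small bookkeeping check, sketched here with hypothetical message-sending and capability lookups (the retained map is assumed to be updated elsewhere as "retained" and "released" messages arrive):

    def try_lock_coalition(roles, retained, cap, send_msg):
        """Once every role in the AND-constrained set has at least one
        retainer, lock the coalition; otherwise keep waiting."""
        if not all(retained[v] for v in roles):
            return False  # some role still lacks a retainer
        for v in roles:
            # issue the real token to the most capable retainer...
            best = max(retained[v], key=lambda a: cap[a][v])
            send_msg(best, ("lock", v))
            # ...and release everyone else retained for this role
            for a in retained[v]:
                if a != best:
                    send_msg(a, ("release", v))
        return True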
CHAPTER THREE – Experimental Results

LA-DCOP was empirically evaluated using an abstracted simulator that offered fine control over a broad range of parameters. The results show that it outperforms, or performs comparably to, other role allocation schemes. I also tested the algorithm under parameter settings that other algorithms could not replicate. The experiments are summarized in Table 1; I present the results for each experiment below.

Table 1: Summary of experiments performed.

  Output model   GAP setting   AND setting   Experiments performed
  Economic       non-GAP       no AND        Compared to DSA and greedy algorithms; tested static and dynamic thresholds
  Economic       non-GAP       AND           Tested retainers
  Economic       GAP           no AND        Tested static thresholds
  Economic       GAP           AND           No experiments performed
  Non-economic   non-GAP       no AND        Tested static and dynamic thresholds
  Non-economic   non-GAP       AND           No experiments performed
  Non-economic   GAP           no AND        No experiments performed
  Non-economic   GAP           AND           No experiments performed

For all experiments, I assumed five different classes of roles, and hence five capability types. Agents were assigned capabilities independently of each other, and each of the five capabilities of an agent was determined independently of that agent's other capabilities. Tasks were initially assigned to agents randomly and the simulation run for 1000 timesteps; the simulator was run 20 times for each parameter setting and the results averaged over the 20 runs.

Experiments can be classified into eight general categories using three parameters, as shown in Table 1. LA-DCOP is compared to other algorithms only in the economic output, non-GAP, no-AND-constraint case, because this is the only configuration that the other algorithms support.

Output model: The first parameter specifies the type of output model used. Two output models are supported: the economic output model and the non-economic output model. With economic output, agents performing a role receive value for that performance on each time step. Under the non-economic output model, roles have durations, and value is received only if a single agent continuously performs the role for the entire duration of the role. Note that the output generated under the non-economic output model will generally be less than that generated under the economic output model, ceteris paribus: if a role has a duration of x time steps, then that role could potentially generate x times the output under an economic output model that it could under a non-economic output model. Correspondingly, both thresholds and the greedy kick-out mechanism use output per remaining execution time to compare the values of performing roles. This biases choices toward shorter roles.

GAP: The second parameter indicates whether agents can only perform a single role at a time (the non-GAP setting) or can execute multiple roles at once (the GAP setting). For GAP problems, roles require an amount of resources randomly chosen from the set {0.25, 0.5, 0.75}. Each agent has 1.0 resources that it can expend each time step in performing roles. Thus, an agent can simultaneously perform at most four roles, and two roles on average.

AND-constraints: The third parameter specifies whether AND-constraints exist between roles.

Other parameters varied in these experiments are the number of agents, the number of roles, the chance for an agent to have a non-zero capability, the amount of dynamism in the environment, and the threshold used for determining whether agents pass on tokens. Five parameters are unique to AND-constrained experiments: the size of the AND-constrained sets, the percentage of roles that were AND-constrained, whether retainers were used, and, if retainers were used, the maximum number of roles for which an agent could be retained and the maximum number of agents a role could retain. In addition, two parameters were used for experiments with dynamic thresholds: one specifying the type of function used to update thresholds, and the other specifying the rate at which thresholds were modified.

Percentage Capable: For each agent, capabilities were determined independently. An agent had a random chance, equal to the percentage capable parameter, of having a non-zero capability for each class of roles. If an agent was determined to have a non-zero capability for a class of roles, its actual capability to perform that class of roles was drawn from a uniform(0, 1) distribution. For example, if percentage capable were 0.6, each agent would have a 60% chance of having a non-zero capability for each capability type. As a result, for each class of roles, about 60% of the agents would have some capability to perform that role, and among those with non-zero capabilities the average capability would be 0.5 (since those capabilities are drawn uniformly between 0 and 1).
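As a sanity check on this setup, the capability-sampling scheme is easy to state in code (a sketch; the constant and function names are mine):

    import random

    NUM_CLASSES = 5  # five classes of roles, hence five capability types

    def sample_capabilities(pct_capable):
        """One agent's capability vector: for each role class, the agent
        has a pct_capable chance of a non-zero capability, which is then
        drawn from a uniform(0, 1) distribution."""
        return [random.uniform(0.0, 1.0) if random.random() < pct_capable
                else 0.0
                for _ in range(NUM_CLASSES)]

    # Example: with pct_capable = 0.6, about 60% of agents have some
    # capability for each class, and those that do average 0.5.
    caps = sample_capabilities(0.6)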
Dynamism: Dynamism is used in two different ways, depending on whether economic or non-economic output is used. If economic output is used, dynamism is the percentage of roles that are randomly replaced by new roles on every time step; hence, dynamism of 0.1 corresponds to 10% of the roles being replaced each time step. New roles are chosen randomly among the different classes of roles. If non-economic output is used, dynamism determines the average number of timesteps for which roles must be performed before output is generated. The average duration of a role is the reciprocal of the dynamism; hence, dynamism of 0.1 corresponds to an average role duration of 10 time steps. Individual durations are drawn from discrete uniform distributions over positive values, with a minimum duration of 1 timestep and a maximum of twice the average duration. For example, when dynamism is 0.1, the minimum duration a role can have is 1 timestep and the maximum is 20 timesteps, with the average duration being 10 timesteps.

k-AND: For simplicity, AND-constraints were assumed to always involve a fixed number of roles. Thus, in a 3-AND setting, all AND-constraints involve three roles, all of which must be performed simultaneously for output to be received from any of them. I tested AND-constraints using 2-AND, 3-AND, 4-AND, and 5-AND settings.

Retainers: When retainers are used, I write "x/y retainers" to denote that each agent can be retained for at most x roles, and each AND-constrained role can retain at most y agents (y potential tokens are created for each token corresponding to an AND-constrained role).

Comparisons

LA-DCOP was tested against two other algorithms. The first was DSA, an established anytime, approximate DCOP algorithm [2]. DSA provides a good baseline for comparing both the output and the communication of LA-DCOP. The other algorithm was a centralized greedy algorithm that allocated roles to agents at each timestep by sequentially going through the roles and choosing the available agent with the highest capability to perform each one. The greedy algorithm provides a reasonable approximation of the optimal allocation, which, while theoretically calculable, was too computationally expensive to compute for the problem sizes under consideration. Because of the limitations of both DSA and the simple greedy algorithm, these comparisons could only be made in the non-GAP case using the economic output model with no AND-constraints. In these comparisons, all agents also have non-zero capabilities to perform all roles.

DSA: Figure 3 shows the output for DSA and LA-DCOP as the number of agents and roles increases from 10 to 1400, with the number of agents equal to the number of roles. Results are shown for LA-DCOP run with two different thresholds, 0.0 (no threshold) and 0.5. This experiment was designed to show how LA-DCOP scales with the size of the input problem. Output is measured as average output per role per timestep, which is always a value between 0.0 and 1.0. Figure 4 plots the total communication for this experiment, and Figure 5 shows the communication per role per timestep, comparing DSA and LA-DCOP with 0.0 and 0.5 thresholds.
Figure 3: Comparison of output for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role. (Plot: output per role per timestep vs. problem size; series: DSA, LA-DCOP with 0.0 threshold, LA-DCOP with 0.5 threshold.)

Figure 3 clearly shows that LA-DCOP outperforms DSA in output when the number of agents is equal to the number of roles. DSA output averages around 0.5 per role per timestep, demonstrating that roles under DSA are no more likely to be performed by highly capable agents than by less capable agents, since the average capability is 0.5. In contrast, the greedy, token-based algorithm used by LA-DCOP allocates roles to agents with high capabilities for those roles, even without thresholds (threshold of 0.0). Using a threshold of 0.5 yields an even better allocation in terms of output.

Figure 4: Comparison of total communication for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role. (Plot: total messages, log scale, vs. problem size; series: DSA, LA-DCOP with 0.0 threshold, LA-DCOP with 0.5 threshold.)

Figure 4 shows the total amount of communication required by DSA and LA-DCOP, and Figure 5 shows the communication per role in each timestep. Communication is measured in messages passed between two agents; "broadcast" messages to all agents are counted as N messages, where N is the number of agents. Under DSA, a variable broadcasts its value to all other variables whenever its value changes, so the number of messages per role used by DSA increases as the number of agents increases. In contrast, under LA-DCOP a variable sends a single message when it passes a token to another agent, achieving communication per role that is independent of the number of agents.

Figure 5: Communication per role for DSA and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform every role. (Plot: messages per role, log scale, vs. problem size; same three series.)

These communication behaviors appear in Figure 5, where the number of messages per role per timestep increases with the number of agents and roles for DSA and holds steady for LA-DCOP. It is also not surprising that the total communication for DSA increases faster than it does for LA-DCOP. Higher thresholds for LA-DCOP also increase communication, though not as dramatically. This is because higher thresholds make tokens less likely to be accepted by any given agent (more agents have capabilities too low to pass the threshold test), so each token is passed more times.

Figure 6: Comparison of output per role for a centralized greedy algorithm and LA-DCOP. Economic output model, non-GAP, no AND-constraints, static environment. The number of agents is equal to the number of roles. Every agent has a non-zero capability to perform all roles. (Plot: output per role per timestep vs. problem size; series: Greedy, LA-DCOP with 0.0 threshold, LA-DCOP with 0.5 threshold.)
Greedy: Figure 6 shows the output for the centralized greedy algorithm and LA-DCOP as the number of agents and roles increases from 10 to 500, with the number of agents equal to the number of roles. Results are shown for LA-DCOP with two different threshold settings, 0.0 (no thresholds) and 0.5. LA-DCOP performs very well compared to the centralized greedy algorithm. There is no comparison of communication, since the greedy algorithm is centralized and hence uses no messages.

Economic output, non-GAP, no AND-constraints

Figures 7 and 8 show the effect of thresholds on output for varying percentages of agents with non-zero capabilities for each role class. Both figures are based on the same data, but Figure 7 varies the threshold on the x-axis while Figure 8 varies the percentage of capable agents on the x-axis. This experiment involved 500 agents and 500 roles in a static environment.

Figure 7: Effect of percentage of capable agents and threshold on output. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment. (Plot: output per role per timestep vs. threshold from 0 to 0.9; one series per percentage of capable agents, 10% through 100%.)

Figure 8: Effect of percentage of capable agents and threshold on output. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment. (Plot: output per role per timestep vs. percentage of capable agents from 10% to 100%; one series per threshold, 0.0 through 0.9.)

Output increases as the percentage of capable agents increases. This makes sense, because a higher percentage of capable agents means there are more agents with high capabilities available to take on roles. The threshold value that maximizes output also increases with the percentage of capable agents: a higher threshold forces tokens to be passed until the roles reach agents with capabilities above the threshold. However, if thresholds become too high, there is a lack of agents with sufficiently high capabilities, and some roles can never be performed. In this case, thresholds actually hurt performance, because they prevent roles that might otherwise have been performed (albeit by less capable agents) from being performed at all. Figure 9 shows the output received for each performed role.

Figure 9: Effect of percentage of capable agents and threshold on output of each performed role. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment. (Plot: output per performed role vs. threshold; one series per percentage of capable agents.)

Figure 10: Effect of thresholds and percentage of capable agents on the percentage of roles that are performed in each timestep. 500 agents, 500 roles, economic output, non-GAP, no AND-constraints, static environment. (Plot: percentage of roles performed per timestep vs. threshold; one series per percentage of capable agents.)
Note that as thresholds are increased, the output per performed role increases so as to always stay above the threshold. However, Figure 10 shows that the percentage of roles performed per timestep decreases dramatically as thresholds increase. The net result is that output degrades quite rapidly at high thresholds, emphasizing the need for thresholds to be selected carefully.

Economic output, non-GAP, AND-constraints

Figures 11 through 13 show the output achieved by LA-DCOP when there are 2-AND-constraints between roles in a static environment with no thresholds. These experiments were conducted with 1500 agents and 1500 roles. Figure 11 shows the output of the base LA-DCOP algorithm with no retainers, Figure 12 shows the output using 1/1 retainers, and Figure 13 shows the output using 5/5 retainers.

In Figure 11, differences between outputs for different percentages of AND-constrained roles are greatest at low percentages of capable agents. As the percentage of capable agents increases, this difference becomes small, and when all agents have some capability to perform all roles, there is no difference in output across percentages of AND-constrained roles. Even at low percentages of capable agents, however, the absolute difference in outputs for the different percentages of AND-constrained roles is low. Furthermore, the output is close to that received when there are no AND-constraints (see the 0.0 threshold line in Figure 8), which suggests there is little room to improve performance in this case by using retainers.

Figure 11: 2-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds. (Plot: output per role per timestep vs. percentage of capable agents; one series per percentage of AND-constrained roles, 10% through 100%.)

Indeed, we see in Figure 12 that adding minimal 1/1 retainers does little to change performance. There is a small increase of about 0.02 or 0.03 in output at low percentages of AND-constrained roles, but this is partially offset by a loss of about 0.01 in output at high percentages of AND-constrained roles and high percentages of capable agents.

Figure 12: 2-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.

Using 5/5 retainers did much to reduce any difference between outputs for different percentages of AND-constrained roles, as shown in Figure 13. The outputs for all percentages of AND-constrained roles are very close to the unconstrained outputs in Figure 8.

Figure 13: 2-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.
Retainers made much more of a difference when the AND-constrained set sizes were larger. Figures 14, 15, and 16 show the output for no retainers, 1/1 retainers, and 5/5 retainers in a 5-AND case, again with equal numbers of roles and agents and no thresholds. The results suggest that retainers may only be useful for larger sets of AND-constrained roles. However, at high percentages of capable agents, retainers actually lower the output, by as much as 0.10 per role for 1/1 retainers, and to a much lesser extent for 5/5 retainers. The number of agents that can be retained for each role and the number of roles for which each agent can be retained also seem to factor significantly into performance in this case: the curve for 1/1 retainers in Figure 15 is drastically different in shape from the curves for no retainers and for 5/5 retainers.

Figure 14: 5-AND-constrained results, no retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds. (Plot: output per role per timestep vs. percentage of capable agents; one series per percentage of AND-constrained roles, 10% through 100%.)

Figure 15: 5-AND-constrained results, 1/1 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.

Figure 16: 5-AND-constrained results, 5/5 retainers. Effect of percentage of capable agents and percentage of AND-constrained roles. 1500 agents, 1500 roles, no thresholds.

As shown by the performance of the 5/5 retainers in both Figure 13 and Figure 16, retainers can make a significant impact on performance for AND-constrained roles, greatly reducing the decrease in output associated with increasing percentages of AND-constrained roles.

Economic output, GAP, no AND

Recall that in a GAP problem, performing a role requires resources. For the GAP experiments I ran, roles require an amount of resources randomly selected from {0.25, 0.5, 0.75}, which means that an agent can simultaneously execute at most four roles, and two roles on average. Thus, it is reasonable to expect that a set of agents (all of which are capable of performing any role) could, on average, perform at most twice their number of roles simultaneously. This is the saturation load for the team: the expected maximum number of roles that can be executed at once. This differs from the non-GAP case, where an equal number of agents and roles constitutes a fully loaded team, in which every agent has a role to perform and all roles can be performed simultaneously. Lower percentages of capable agents effectively reduce the number of agents for load purposes.

Figure 17: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic, GAP, no AND, static environment. (Plot: output per role per timestep vs. threshold from 0 to 1; one series per percentage of capable agents.)
Figure 17 shows how output changes as the threshold and the percentage of capable agents vary when there are equal numbers of agents and roles. A greater percentage of capable agents increases output for all thresholds, and also increases the threshold that yields the maximum reward.

Figure 18 shows the case where there are twice as many roles as agents. This is the expected full load for the system. Note that thresholds yield much smaller gains compared to Figure 17. In fact, Figure 18 is somewhat similar to Figure 7, which showed output for the fully loaded non-GAP case. The non-GAP case shows a greater decrease in output at very high thresholds. This is probably because the expected number of highly capable agents is the same in both cases, but in the GAP case these agents are able to perform more roles simultaneously. Figure 19 shows the percentage of roles performed per timestep. Comparing this to Figure 10, which shows the same statistic for the non-GAP case, it is clear that far more roles are performed at high thresholds in the GAP case than in the non-GAP case.

Figure 18: Effect of threshold and percentage of capable agents on output. 1500 agents, 3000 roles, economic, GAP, no AND, static environment. (Plot: output per role per timestep vs. threshold; one series per percentage of capable agents.)

Figure 19: Effect of threshold and percentage of capable agents on the percentage of roles performed per timestep. 1500 agents, 3000 roles, economic, GAP, no AND, static environment.

LA-DCOP also holds up well under dynamics. Figures 20 and 21 show output for dynamism of 1% and 10%, respectively, in the case where there are equal numbers of agents and roles. Performance degrades slightly, but even when 10% of the roles change on every timestep, LA-DCOP performs well and appropriate threshold values continue to have a large impact.

Figure 20: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic, GAP, no AND, 1% dynamism. (Plot: output per role per timestep vs. threshold; one series per percentage of capable agents.)

Figure 21: Effect of threshold and percentage of capable agents on output. 1500 agents, 1500 roles, economic, GAP, no AND, 10% dynamism.
Non-economic output, non-GAP, no AND

Figure 22 shows the output for different thresholds and percentages of capable agents under the non-economic output model with equal numbers of agents and roles and 10% dynamism. Recall that this means roles have a minimum duration of 1 timestep and a maximum duration of 20 timesteps, with an average duration of 10 timesteps. Also, agents now base both the threshold test and the kick-out decision on output per unit of remaining execution time, which biases agents very heavily toward shorter roles. For instance, if an agent has two tokens A and B representing roles that it can perform with capabilities 0.2 and 0.9, respectively, but A has a duration of 1 timestep and B has a duration of 5 timesteps, then the agent will keep A and kick out B, because the output per unit time is 0.2 for A and only 0.18 for B. Outputs will also be lower than under the economic output model, and one consequence is that thresholds must be lower as well. Figure 22 shows very small thresholds on the left half of the x-axis and larger thresholds on the right; note that the x-axis values are not to scale.

Figure 22: Effect of threshold and percentage of capable agents on output. 500 agents, 500 roles, non-economic, non-GAP, no AND, 10% dynamism. (Plot: output per role per timestep vs. threshold from 0 to 0.9, x-axis not to scale; one series per percentage of capable agents.)

The fact that the base greedy token algorithm without thresholds (threshold of 0.0) performs almost as well as, or outperforms, the algorithm with any threshold may be a consequence of the minimum duration being 1 timestep, which is the same as the time it takes to pass a token. As expected, increasing the percentage of capable agents increases the output.

CHAPTER FOUR – Conclusion and Future Work

As future multiagent domains increase in size and complexity, new coordination algorithms must be developed to keep cooperative teams running efficiently and effectively. LA-DCOP meets these rising challenges, performing distributed, asynchronous, anytime role allocation for teams of cooperative agents. Its scalability and effectiveness are good indicators of its power on a problem known to be intractable. Empirical simulation shows that LA-DCOP can match centralized algorithms and greatly outperform other distributed, approximate algorithms while using orders of magnitude fewer messages.

Thresholds, one of the distinguishing features of LA-DCOP, improved performance greatly in many instances, but also impaired performance under some circumstances. Thresholds are harmful when they exclude agents that would have performed roles in the optimal allocation from performing any role at all. This happens especially when the number of roles greatly exceeds the capacity of the agents to execute them. In the opposite circumstance, when there is an excess of agents for the number of roles, thresholds can yield substantial improvements.

Potential tokens and their associated retainers were very effective at mitigating the inefficiencies of assembling a coalition to perform sets of constrained tasks.
While decreases in performance were seen in some cases, these were small compared to the gains when constrained sets were large.

It is clear that a mathematical theory must be developed to address some of these issues, such as selecting a proper threshold to obtain maximal benefit. There is also substantial room to incorporate additional constraints beyond AND-constraints. Sequential constraints, where roles must be performed in a certain order for reward to be gained, are one possibility.

The success of LA-DCOP in outperforming DSA indicates that the difficulties posed by multiagent coordination are not inherently insurmountable, but are limited only by our current technology. LA-DCOP and other new algorithms will push those boundaries farther out to admit tomorrow's multiagent domains.

BIBLIOGRAPHY

[1] C. Castelpietra, L. Iocchi, D. Nardi, and R. Rosati. Coordination in multi-agent autonomous cognitive robotic systems. In Proceedings of the Second International Cognitive RoboCup Workshop, 2000.

[2] S. Fitzpatrick and L. Meertens. An experimental assessment of a stochastic, anytime, decentralized, soft colourer for sparse graphs. In Stochastic Algorithms: Foundations and Applications, Proceedings SAGA, 2001.

[3] P. Modi, W. Shen, M. Tambe, and M. Yokoo. An asynchronous complete method for distributed constraint optimization. In Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems, 2003.

[4] R. Nair, M. Tambe, and S. Marsella. Role allocation and reallocation in multiagent teams: Towards a practical analysis. In Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems, 2003.

[5] O. Shehory and S. Kraus. Task allocation via coalition formation among autonomous agents. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995.

[6] P. Stone and M. Veloso. Task decomposition, dynamic role assignment, and low-bandwidth communication for real-time strategic teamwork. Artificial Intelligence, 110:241-273, June 1999.

[7] G. Tidhar, A. S. Rao, and E. A. Sonenberg. Guided team selection. In Proceedings of the Second International Conference on Multi-agent Systems (ICMAS-96), 1996.