From: AAAI-00 Proceedings. Copyright © 2000, AAAI (www.aaai.org). All rights reserved.

Coordination Failure and Congestion in Information Networks

A. M. Bell*, W. A. Sethares†, and J. A. Bucklew‡

*NASA Ames Research Center, Mail Stop 269-3, Moffett Field, CA 94035-1000, abell@mail.arc.nasa.gov.
†Department of Electrical and Computer Engineering, University of Wisconsin-Madison.
‡Department of Electrical and Computer Engineering, University of Wisconsin-Madison.

Abstract

Coordination failure, or agents' uncertainty about the action of other agents, may be an important source of congestion in large decentralized systems. The El Farol or Santa Fe bar problem provides a simple paradigm for congestion and coordination problems that may arise with over utilization of the Internet. This paper recasts the problem in a stochastic framework and derives a simple adaptive strategy that has intriguing optimization properties; a large collection of decentralized decision makers, each acting in their own best interests and with limited knowledge, converge to a solution that (optimally) solves a complex congestion and social coordination problem. A variation in which agents are allowed access to full information is not nearly as successful.

Introduction

This paper focuses on imperfect information and coordination failure across agents as a source of congestion in large decentralized systems. We utilize the scenario posed by Arthur (1994) as a simplified model of a large class of congestion and coordination problems that arise in modern engineering and economic systems. El Farol is a bar in Santa Fe. The bar is popular, but becomes overcrowded when more than sixty people attend on any given evening. Everyone enjoys themselves when fewer than sixty people go, but no one has a good time when the bar is overcrowded.

The El Farol or Santa Fe bar problem emphasizes the difficulty of coordinating the actions of independent agents without a centralized mechanism. The analogy between the bar problem and decentralized resource allocation is noted by Greenwald, Mishra and Parikh (1998), as well as in our previous work (Sethares and Bell 1998). Glance and Huberman (1994) and Huberman and Lukose (1997) consider the dynamics of congestion on the Internet when externalities similar to those found with public goods prevail. Unlike the standard public good framework, optimizing agents will not increase consumption of a publicly available resource until it experiences an inefficient level of congestion.

If agents were fully informed and could perfectly predict the behavior of other agents, the bar would never be crowded and all patrons would have a good time. The only source of congestion in this scenario, at least in a deterministic framework, is the inability of agents to coordinate their actions. In a previous treatment (Sethares and Bell 1998) we proposed a deterministic adaptive algorithm based on habit formation which enabled agents to coordinate attendance in a decentralized environment while avoiding the seemingly random fluctuations in aggregate attendance that Arthur's simulations demonstrated.

Here we consider the bar problem in a stochastic setting where agents' strategies are characterized by a probability of attending that evolves over time. There are several advantages to considering the stochastic version of the adaptive learning rule: a clearer problem statement, a simpler algorithm that is amenable to detailed analysis, and more general results. We analyze the dynamic and equilibrium characteristics of the system in relation to the mixed and pure strategy equilibria of the corresponding game. The type and characteristics of the equilibria observed depend crucially on the nature of the information available to agents. In particular, we show that limiting the information available to agents actually leads them to coordinate successfully on a Pareto efficient equilibrium, while providing more information leads to an inefficient outcome.

Our results emphasize the critical role that information may play in creating and alleviating congestion that arises from coordination failure. For example, in a complex environment such as the Internet, supplying routers with more information about the state of system-wide congestion may not result in better performance.

Problem Statement

Let $M$ be the total number of agents and $N$ be the maximum capacity of the bar, so that there are $M$ agents competing for the $N$ spaces at the bar. Without loss of generality agents have identical payoffs: $b$ is the payoff received for attending a crowded bar, $g$ is the payoff for attending an uncrowded bar, and $h$ is the payoff for staying home. Let $h$ be zero. The game is then $G = [M, \{S_i\}, \{u_i(s_i, s_{-i})\}]$, where $S_i$ consists of two strategies, stay home (indicated by 0) and go to the bar (indicated by 1), and the payoffs are

$$u_i(0, s_{-i}) = 0, \qquad u_i(1, s_{-i}) = g \ \text{ when } \sum s_{-i} \le N - 1, \qquad u_i(1, s_{-i}) = b \ \text{ when } \sum s_{-i} > N - 1.$$

In a deterministic setting a Nash equilibrium occurs when exactly $N$ agents choose to attend. There are $\binom{M}{N}$ such equilibria. Pure strategy equilibria are Pareto efficient: no agent can be made better off without making another agent worse off. There are no symmetric pure strategy equilibria. There is one symmetric mixed strategy Nash equilibrium, where every agent has the same probability of attending. Because of the variance in attendance that results from agents' mixed strategies, this equilibrium is not Pareto efficient.

Algorithm

Suppose that the agent initially attends $p$ percent of the time. Consistent with the desire to maximize pleasure and minimize painful experiences, the agent goes more often (increases $p$ slightly) if the bar is uncrowded but prefers to go less often (to decrease $p$) if the bar is crowded. Over time, the agent gathers information about the state of the bar and 'remembers' this in the form of the parameter $p$. This learning rule can be interpreted as a kind of habit formation or stimulus-response.

Let $k$ be a time (iteration) counter and let $p_i(k)$ designate the value of $p_i$ at time $k$. Let $\mu$ be a parameter that defines how much each agent changes $p_i$ in response to new information. Let $x_i(k)$ be independent Bernoulli random variables that are 1 with probability $p_i(k)$ and zero otherwise, and let $N(k)$ be the number of agents attending at time $k$:

$$N(k) = \sum_{i=1}^{M} x_i(k). \tag{1}$$

The evolution of the $p_i$ is then defined by

$$p_i(k+1) = \begin{cases} 0 & \text{if } p_i(k) - \mu (N(k) - N)\, x_i(k) < 0 \\ 1 & \text{if } p_i(k) - \mu (N(k) - N)\, x_i(k) > 1 \\ p_i(k) - \mu (N(k) - N)\, x_i(k) & \text{otherwise.} \end{cases} \tag{2}$$

The operation of the algorithm is uncomplicated. At each time $k$ the agent flips a biased coin, attending with probability $p_i(k)$. When the agent attends, $p_i(k)$ is adjusted, increasing it in proportion to $N - N(k)$ if the bar is uncrowded and decreasing it in proportion to $N(k) - N$ if the bar is crowded. When the agent does not attend, $x_i(k)$ is zero and $p_i(k+1) = p_i(k)$. Since the $p_i(k)$ represent probabilities, they must be constrained to lie within 0 and 1. Note that the stepsize does not decrease over time. The simplicity of the scheme makes it feasible to analyze the resulting behavior and, as demonstrated in a later section, the algorithm is efficient.

In Arthur's formulation of the problem, agents have access to information about attendance at the bar even on evenings when they do not themselves attend. This can be incorporated into (2), giving the update

$$p_i(k+1) = \begin{cases} 0 & \text{if } p_i(k) - \mu (N(k) - N) < 0 \\ 1 & \text{if } p_i(k) - \mu (N(k) - N) > 1 \\ p_i(k) - \mu (N(k) - N) & \text{otherwise,} \end{cases} \tag{3}$$

which mimics the information structure used by Arthur's agents. As will become clear, this information structure is a key element in the behavior of the algorithm. When agents base their updates on only their own experiences, as in (2), they utilize "partial information". In contrast, (3) utilizes "full information" because agents base their decisions on the full record of attendance.

A related version of the stochastic algorithm updates according to whether the bar is crowded or not:

$$p_i(k+1) = \begin{cases} 0 & \text{if } p_i(k) - \mu\, \mathrm{sgn}(N(k) - N)\, x_i(k) < 0 \\ 1 & \text{if } p_i(k) - \mu\, \mathrm{sgn}(N(k) - N)\, x_i(k) > 1 \\ p_i(k) - \mu\, \mathrm{sgn}(N(k) - N)\, x_i(k) & \text{otherwise.} \end{cases} \tag{4}$$
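The update rules (2), (3) and (4) differ only in which agents update and in whether the attendance error or just its sign is used, so one iteration of all three can be sketched in a few lines. The following is a minimal sketch rather than the authors' implementation: the function names `step` and `simulate`, the `mode` labels, the stepsize `mu = 0.01`, the seed, and the uniform initialization of $p_i(0)$ on [0, 1) are assumptions made for illustration; only M = 100 and N = 60 come from the text.

```python
import random


def step(p, N, mu, mode="partial", rng=random):
    """Advance the attendance probabilities p by one iteration.

    mode="partial" follows update (2), mode="full" follows update (3),
    and mode="sign" follows update (4). The mode names are hypothetical.
    Returns the new probabilities and the realized attendance N(k).
    """
    # Each agent flips a biased coin: x_i(k) = 1 with probability p_i(k).
    x = [1 if rng.random() < pi else 0 for pi in p]
    attendance = sum(x)        # N(k), equation (1)
    error = attendance - N     # N(k) - N

    new_p = []
    for pi, xi in zip(p, x):
        if mode == "partial":
            delta = mu * error * xi        # only attendees update
        elif mode == "full":
            delta = mu * error             # every agent updates
        elif mode == "sign":
            sign = (error > 0) - (error < 0)
            delta = mu * sign * xi         # attendees use only the sign
        else:
            raise ValueError(f"unknown mode: {mode}")
        # Clip to [0, 1] because p_i(k) is a probability.
        new_p.append(min(1.0, max(0.0, pi - delta)))
    return new_p, attendance


def simulate(M=100, N=60, mu=0.01, iterations=2000, mode="partial", seed=0):
    """Run the chosen update rule; mu, seed and the init range are assumed."""
    rng = random.Random(seed)
    p = [rng.random() for _ in range(M)]   # p_i(0) initialized randomly
    history = []
    for _ in range(iterations):
        p, attendance = step(p, N, mu, mode, rng)
        history.append(attendance)
    return p, history
```

Calling `simulate(mode="partial")` and `simulate(mode="full")` and comparing the two attendance histories gives a rough way to explore the contrast between the two information structures. How closely any single run matches the paper's figures depends on the assumed stepsize and initialization.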
Generic Behavior of the Algorithms

This section explores the generic behavior of the system when each of the $M = 100$ agents follows the strategy defined by (2) above. Though details of the various simulations differ, a typical case is illustrated in Figure 1. The probabilities $p_i(0)$ were initialized randomly.

Figure 1: Attendance with Partial Information (horizontal axis: time, iteration number)

Perhaps the most striking aspect of these simulations is the rapid convergence to near the optimal value of $N = 60$ and the associated decline in the variance of attendance. The outcome approaches that which would be chosen with centralized control, despite the fact that each agent is autonomous and makes the decision to go (or not to go) based on local information, that is, on its own experiences. In comparison, in Arthur's simulations mean attendance is approximately 60, but the variance does not decline over time, indicating that seats in the bar often remain unfilled, and often the bar is overcrowded. Note that the transient behavior in the initial periods is masked by the long time scale.

Figure 2: Probabilities with Partial Information (horizontal axis: time, iteration number)

Figure 2 shows values of the probabilities $p_i(k)$ over the course of a typical simulation run. By the final iteration, the agents have divided themselves into two groups. The probability parameter $p_i(k)$ for 60 of the agents has risen very near 1, indicating that they go to El Farol nearly every time. The remaining 40 agents attend less and less frequently, with their probability parameter very near zero. This division of the population appears nowhere in the algorithm statement; rather, it is an emergent property of the adaptive solution to the El Farol problem. When agents follow this adaptive strategy, El Farol looks more like Cheers. Despite the stochastic nature of the adaptive learning rule, it converges to a pure strategy Nash equilibrium.

Finally, Figures 3 and 4 use the "full information" algorithm (3) to investigate the effect of allowing the agents to update their probabilities at every iteration, whether they have personally attended the bar or not. This reflects the information structure in Arthur's setup. Figure 3 should be compared to Figure 1; there are far greater excursions about the optimal value, and about half the time the bar is overcrowded. Figure 4 should be compared to Figure 2; the probability parameters for these agents continue to bounce randomly about some fixed value as their probabilities all increase or decrease simultaneously in response to the same signals.

Figure 3: Attendance with Full Information (horizontal axis: time, iteration number)

Figure 4: Probabilities with Full Information (horizontal axis: time, iteration number)

Somewhat paradoxically, agents successfully coordinate their behavior and the system as a whole achieves a Pareto efficient outcome only when agents have access to less information. Several authors have noted a similar phenomenon in transportation routing. Mahmassani and Jayakrishnan (1991) use simulations to demonstrate that when individuals pursue a strict best response strategy, changing their route no matter how small the improvement over their current choice, the performance of the system degrades if more than 25% of drivers have access to real time information about congestion. Arnott, De Palma and Lindsey (1996, 1991) show that congestion can arise because of "concentration," or similar responses to common information, and that consequently, more information can lead to increased congestion.

Analysis of the Adaptive Solutions

The first step in the analysis of the dynamic behavior of the algorithms is to determine the conditions under which the means of the $p(k)$ remain fixed; that is, to determine the steady states of the averaged system. We consider the partial information case first. Taking the expectation of both sides of (2) gives

$$E\{p_i(k+1)\} = E\{p_i(k)\} - \mu\, E\{(N(k) - N)\, x_i(k)\},$$

assuming that the $p_i(k)$ are not at the boundary points 0 or 1. This expectation remains unchanged exactly when the update portion is zero, that is, when

$$E\{(N(k) - N)\, x_i(k)\} = 0.$$

Using (1) this can be rewritten

$$E\Big\{\Big(\sum_{j=1}^{M} x_j(k) - N\Big) x_i(k)\Big\} = E\Big\{\sum_{j=1,\, j \ne i}^{M} x_j(k) + 1 - N\Big\}\, p_i(k),$$

since $x_i(k)$ is 1 with probability $p_i(k)$ and zero otherwise. Because the term in parentheses is independent of $p_i(k)$ (recall that the $x_j(k)$ are independent Bernoulli random variables), this becomes

$$\Big(1 - N + \sum_{j=1,\, j \ne i}^{M} E\{x_j(k)\}\Big)\, p_i(k) = \Big(1 - N + \sum_{j=1,\, j \ne i}^{M} p_j(k)\Big)\, p_i(k).$$

In full vector form this is

$$\begin{pmatrix} \big(1 - N + \sum_{j \ne 1} p_j(k)\big)\, p_1(k) \\ \big(1 - N + \sum_{j \ne 2} p_j(k)\big)\, p_2(k) \\ \vdots \\ \big(1 - N + \sum_{j \ne M} p_j(k)\big)\, p_M(k) \end{pmatrix}. \tag{5}$$

Consider any candidate steady state $p^*$ with $N$ ones and $M - N$ zeroes. Let $I_1$ be the indices of the ones and $I_0$ be the indices of the zeroes. Then there are two kinds of terms in (5). When $i \in I_1$, $\sum_{j \ne i} p_j^* = N - 1$ and so

$$\Big(1 - N + \sum_{j \ne i} p_j^*\Big)\, p_i^* = (1 - N + N - 1)\, p_i^* = 0. \tag{6}$$

When $i \in I_0$, $\sum_{j \ne i} p_j^* = N$, $p_i^* = 0$, and hence

$$\Big(1 - N + \sum_{j \ne i} p_j^*\Big)\, p_i^* = 0. \tag{7}$$

Hence $p^*$ is a steady state. Now consider any $p^*$ for which $\sum_{j=1}^{M} p_j^* = N$ that is not of the form of $N$ ones and $M - N$ zeroes. Thus $0 < p_n^* < 1$ for at least one $n$. In this case, the relevant term in (5) is

$$\Big(1 - N + \sum_{j \ne n} p_j^*\Big)\, p_n^* = (1 - N + N - p_n^*)\, p_n^* = (1 - p_n^*)\, p_n^*.$$

This cannot be zero and hence $p^*$ is not a steady state. Hence the only steady states of algorithm (2) are at $p^*$ consisting of $N$ ones and $M - N$ zeroes. In particular, the symmetric mixed strategy equilibrium at $p_j = .6$ for all $j$ (for $g = 1$, $b \approx -0.98$) is not a steady state of this algorithm.

In contrast, a similar analysis can be carried out for the "full information" algorithm. Taking the expectation of both sides of (3) gives

$$E\{p_i(k+1)\} = E\{p_i(k)\} - \mu\, E\Big\{\sum_{j=1}^{M} x_j(k) - N\Big\}.$$

Steady states occur when $E\{\sum_{j=1}^{M} x_j(k) - N\} = 0$, which occurs only when

$$\sum_{j=1}^{M} E\{x_j(k)\} = \sum_{j=1}^{M} p_j(k) = N,$$

i.e., whenever $E\{N(k)\} = N$. Hence any $p^*$ with $\sum_{j=1}^{M} p_j^* = N$ is a steady state of this algorithm. Note that these are not mixed strategy equilibria of the El Farol game unless $p_i = .6$ for every agent. The expected return in the steady state is lower for agents whose individual probabilities are lower than average, as they face a higher probability that the bar will be crowded, and vice versa for those whose individual probabilities are higher than average. Consequently, a sensible learning rule might converge to a pure strategy equilibrium if agents adjust their parameters slowly over time.

To further understand the global behavior of the system we relate the algorithms utilized by individuals to a global cost function. The algorithm (2) can be derived as an approximation to an instantaneous gradient descent for minimization of the cost function

$$J(k) = \tfrac{1}{2}\big(E\{N(k)\} - N\big)^2, \tag{8}$$

where

$$E\{N(k)\} = \sum_{j=1}^{M} E\{x_j(k)\} = \sum_{j=1}^{M} p_j(k)$$

is the expected number of attendees at time $k$. The typical gradient strategy is to update the state using

$$p_i(k+1) = p_i(k) - \mu\, \frac{dJ(k)}{dp_i(k)}. \tag{9}$$

From (8), the derivative is

$$\frac{dJ(k)}{dp_i(k)} = \big(E\{N(k)\} - N\big)\, \frac{dE\{N(k)\}}{dp_i(k)} = E\{N(k)\} - N.$$

Replacing $E\{N(k)\}$ by its instantaneous value gives the approximation $(N(k) - N)$, which is an instantaneous gradient of $J(k)$. Substituting this into (9) gives

$$p_i(k+1) = p_i(k) - \mu\, (N(k) - N).$$

In the full information case this update occurs at every iteration, regardless of the agent's attendance. In the limited information case this update occurs only when $x_i(k) = 1$. Adding the a priori limits on $p_i(k)$ then gives the algorithms (2) and (3). Similarly, the sign algorithm (4) can be derived from the absolute value cost function

$$J(k) = \big|E\{N(k)\} - N\big|. \tag{10}$$

By analogy, these algorithms are variants of the Least Mean Square (LMS) algorithm, which are common in the context of linear filtering and adaptive system identification; (4) is an analog of the signed LMS algorithm.

For both algorithms $E\{N(k)\} = N$ in a steady state. However, because the limited information algorithm converges to a pure strategy equilibrium, the expected costs will be 0, whereas the expected costs with full information will be $\tfrac{1}{2}\mathrm{Var}[N(k)]$.

The adaptive mechanism thus provides a simple solution whereby a large collection of decentralized decision makers, each acting in their own best interests and with only limited knowledge, can solve a complex congestion and social coordination problem. Moreover, convergence to the solution is relatively rapid (depending on the initial conditions) and robust.

References

Arnott, R.; De Palma, A.; and Lindsey, R. 1996. Information and Usage of Free-access Congestible Facilities with Stochastic Capacity and Demand. International Economic Review 37(1):181-203.

Arnott, R.; De Palma, A.; and Lindsey, R. 1991. Does Providing Information to Drivers Reduce Traffic Congestion? Transportation Research A 25A(5):309-318.

Arthur, W. B. 1994. Inductive Reasoning and Bounded Rationality: The El Farol Problem. American Economic Review: Papers and Proceedings 1994 84(May):406-411.

Glance, N. S.; and Huberman, B. A. 1994. The Dynamics of Social Dilemmas. Scientific American (March):76-83.

Greenwald, A.; Mishra, B.; and Parikh, R. 1998. The Santa Fe Bar Problem Revisited: Theoretical and Practical Implications. Technical Report, New York University.

Huberman, B. A.; and Lukose, R. M. 1997. Social Dilemmas and Internet Congestion. Science 277(July 25):535-537.

Mahmassani, H. S.; and Jayakrishnan, R. 1991. System Performance and User Response Under Real-time Information in a Congested Traffic Corridor. Transportation Research A 25A(5):293-307.

Sethares, W. A.; and Bell, A. M. 1998. An Adaptive Solution to the El Farol Problem. Proceedings of the 36th Annual Allerton Conference on Communication, Control, and Computing, Allerton IL, Sept. 1998.