Artificial Agents Play the “Mad Mex Trust Game”:
A Computational Approach¹
Wu, D.J., S. Kimbrough and F. Zhong
Abstract
We investigate the “Mad Mex Trust Game,” which cannot easily be represented in strategic form. In
outline, the game is as follows. N players of various types are free to negotiate with each other. The
players and their types are identified and known to the other players. Players of a given type produce
a particular commodity at a certain rate. The well-being of each player depends upon having a
mixture of commodities; hence the players have an incentive to negotiate trades with players of other
types. After an agreement is reached, there is a fulfillment stage. Players are free to renege on their
agreements, and players are able to remember who has reneged and who hasn't. Will cooperative
behavior emerge, and under what conditions? What are efficient and effective mechanisms for trust
building in electronic markets? How do these mechanisms affect the emergence of trust and
cooperative behavior? What are the key ingredients in building distributed trust, and what destroys
trust? This game constitutes a more realistic model of negotiation support systems in electronic
markets, particularly on the Internet.
¹ This material is based upon work supported by, or in part by, DARPA contract DASW01 97 K
0007. File: Trust_HICSS35_J22.doc. Partial support by a mini-Summer research grant and a
research fellowship from the Safeguard Scientifics Center for Electronic Commerce Management,
LeBow College of Business at Drexel University is gratefully acknowledged. The corresponding
author is D.J. Wu; his current address is 101 North 33rd Street, Academic Building, Philadelphia, PA
19104. Email: wudj@drexel.edu. The Java code we developed in this study can be downloaded by
interested readers at www.lebow.drexel.edu/wu300/trust_game/.
1. Introduction
An important aspect of electronic commerce is that often it is not trusted (Tan and Thoen 2001), since
it is often difficult for a user to figure out whom to trust in online communities (Dasgupta 1988;
Schillo and Funk 1999). Recently, researchers have taken much interest in how to build trust in
electronic markets operating in environments such as the Internet. The literature has approached the
study of trust with various well-defined “trust games”.
Basically, there are two versions of the trust game: the classical trust game in economics, or the
investment game (Lahno 1995; Berg et al. 1994; Erev and Roth 1998), and the “Electronic Commerce
Trust Game” or the “Mad Mex Trust Game”². The former is well studied in the economics literature
and is regarded as revealing and of fundamental importance in social interaction and knowledge
management, as fundamental as the prisoner’s dilemma game (Hardin 1982; Lahno 1995). The latter
is due to Kimbrough and Tan; in it, the players exchange goods, such as red sauce or green sauce,
rather than money. In this paper, we focus on the Mad Mex Trust Game using artificial agents. We
leave the Economics Trust Game to a subsequent paper, where we plan to use the agent-based
approach as well. Will trust/cooperative behavior emerge? If so, under what conditions (and when
and how)? Put another way, what are the conditions that promote trust or distrust? How do we
explain, reveal, and understand the behavior of agents (what they are doing, and why they are doing
it)? The ultimate goals are to study the effects of markets, characteristics of markets, and market
mechanisms associated with systems of artificial agents.
² Named after its place of conception, a restaurant near the University of Pennsylvania, by
Kimbrough and Tan.
The contribution of this paper is the integration of several strands of research literature: the trust
literature (we focus here on the computational approach to social trust) and the electronic
communities and electronic markets literature (we focus here on what kinds of market mechanisms
facilitate, and what kinds disrupt, trust and cooperation).
The rest of the paper is organized as follows. Section 2 provides a brief literature review.
Section 3 outlines our key research methodologies and implementation details. Section 4 reports our
experimental findings for two agents. Section 5 reports further experiments in which an additional
player, a third agent, is introduced. Section 6 summarizes and discusses future research.
2. Literature Review
There are roughly two major streams in the trust literature. One stream is interested in developing
trust technology (e.g., security technologies such as passwords or digital watermarking).
Representative work can be found in the recent special section of CACM (December 2000). The
second stream focuses on social trust (Shapiro 1987) and the work on social capital (e.g., Uslaner
2000). Our interest is in the latter; e.g., an online trader may well have access to the trading system
but not cooperate with other online traders out of self-interest. In particular, we are interested in trust
based on cooperation (Güth et al. 1997), i.e., social trust viewed as cooperative behavior.
Methodologically, there are several approaches to the study of trust, illustrating a broad interest from
several disciplines in the social sciences. These include the behavioral approach (e.g., Das and Teng
1998; Mayer, Davis and Schoorman 1995); the philosophical and logical approach (e.g., Tan 2000;
Tan and Thoen 2001); the computer science approach (e.g., Holland and Lockett 1998; Zacharia,
Moukas and Maes 1999); the sociology approach (e.g., Shapiro 1987); the psychology approach (e.g.,
Güth et al. 1997); the classical economics approach (e.g., Ellison 1993); and the experimental
economics approach (e.g., Engle-Warnick 2000; Erev and Roth 1998; Sundali, Israeli and Janicki
2000). In this paper, we use an interdisciplinary approach that integrates the economics and computer
science approaches, i.e., the computational economics approach. Details of our methodology and
framework are provided in Section 3. We now define our specific trust game.
As mentioned earlier, there are several versions of the trust game. The following is our version of the
game, known in the literature as the investment game. There are two players, the principal and the
agent. The principal has some amount of money to invest, say x, so he hires an agent (she) to invest it
for him. The agent, in turn, gives the money to an investor or broker, who invests the money in the
market and truthfully reports to the agent on the return of the investment, say 3x. The agent then
decides how to split the profit with the principal. The game is played repeatedly, i.e., the principal
has the choice of whether or not to rehire the agent. Under some regularity conditions, it has been
shown in the literature that trust can be built if the game is played repeatedly (Lahno 1995).
In the Mad Mex Trust game, the money is replaced with goods. This game cannot easily be
represented in strategic form. In outline, the game is as follows. N players of various types are free to
negotiate with each other. The players and their types are identified and known to the other players.
Players of a given type produce a particular commodity at a certain rate. The well-being of each
player depends upon having a mixture of commodities; hence the players have an incentive to
negotiate trades with players of other types. After arriving at an agreement, there is a fulfillment
stage. Players are free to renege on their agreements, and players are able to remember who has
reneged and who hasn't.
We now describe our research framework and methodology in more detail.
3. Methodology and Implementations
In our framework, artificial agents are modeled as finite automata (Hopcroft and Ullman 1979). This
framework has been adopted by a number of previous investigations. Among them, Rubinstein
(1986), Sandholm and Crites (1995), Miller (1996) and many others used it to study the iterated
prisoner’s dilemma (IPD). Kimbrough, Wu and Zhong (2001a) used it to study the MIT “Beer
Game,” where genetic learning artificial agents played the game and managed a linear supply chain.
Wu and Sun (2001a, b) investigated the off-equilibrium behavior of artificial agents in an electronic
market price and capacity bidding game using genetic algorithms (Holland 1992). Arthur et al. (1996)
modeled a realistic stock marketplace composed of various genetic learning agents. Zhong,
Kimbrough and Wu (2001) studied the ultimatum game using reinforcement learning agents. These
are merely examples illustrating the acceptance of this framework in the literature. The reader is
referred to Kimbrough and Wu (2001) for a survey.
In this study, we depart from previous research by integrating several strands of approaches. First, we
study a different game, namely the Mad Mex Trust game, rather than games such as the IPD, Beer,
Bidding, or Ultimatum games. Second, in studying this game, we use a computational/evolutionary
approach with artificial agents, in contrast to classical or behavioral game-theoretic approaches.
Third, our agents use a reinforcement learning regime (Sutton and Barto 1998), Q-learning, as their
learning mechanism in game playing. Previous studies of the trust game are not computational (with
the exception of Zacharia et al., who employed a reputation rating mechanism). Finally, our agents
are identity-centric rather than strategy-centric as in previous studies (e.g., Kimbrough, Wu and
Zhong 2001a; Wu and Sun 2001a, b). That is, our agents may meaningfully be said to have individual
identities and behavior. They are not just naked strategies that play and succeed or fail. Individuals,
rather than populations as a whole, learn and adapt over time and with experience. Fielding these
kinds of agents, we believe, is needed for e-commerce applications.
We now describe our model and prototype implementations in more detail within the Q-learning
framework: how the rules, or state-action pairs, are embedded in our artificial agents; how the
rewards are set up; what the long-term goal of the game (the return) is; and, finally, the specific
Q-learning algorithm we designed.
Rules (State-Action pairs):
The Q-learning algorithm estimates the values of state-action pairs Q(s, a). At each decision point, the
state of an agent is determined by the information in its memory history, e.g., its own and its
opponent’s last trade volumes. The possible actions an agent can take at this decision point are the
integers between zero and its endowment. In this sense, the agent’s strategy is the mapping from its
memory of the last iteration to its current action. To balance exploration and exploitation, we use the
ε-greedy method, choosing a random action with small probability ε. The value of ε starts at 0.3 and
then decreases to 0.01 in steps of 0.000001.
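For concreteness, the following is a minimal Java sketch of this ε-greedy selection with the decaying
exploration rate; the class and method names are our own illustrative choices, not the authors'
released code.

    import java.util.Random;

    /** Sketch of epsilon-greedy action selection with a decaying
        exploration rate (epsilon: 0.3 down to 0.01 in steps of 1e-6). */
    public class EpsilonGreedy {
        private final Random rng = new Random();
        private double epsilon = 0.3;   // initial exploration probability

        /** Pick a trade volume in {0, ..., endowment} for one state,
            given that state's row of Q-values (one entry per action). */
        public int selectAction(double[] qRow) {
            int action = 0;
            if (rng.nextDouble() < epsilon) {
                action = rng.nextInt(qRow.length);        // explore
            } else {
                for (int a = 1; a < qRow.length; a++) {   // exploit
                    if (qRow[a] > qRow[action]) action = a;
                }
            }
            epsilon = Math.max(0.01, epsilon - 0.000001); // decay epsilon
            return action;
        }
    }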
Rewards:
The instant reward an agent receives is determined by a modified Cobb-Douglas function of the
amounts of the different types of sauces the agent possesses after each episode:

    U = ∏_i a_i^(1/n)

where n is the number of types of commodities in the market and a_i is the amount of commodity i
the agent holds. We chose this Cobb-Douglas utility function for our simulation because commodities
A and B have equal weight for the agents.
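As a minimal sketch (assuming the equal-weight form U = ∏_i a_i^(1/n) stated above; the method
name is our own), this utility can be computed as:

    /** Cobb-Douglas utility with equal weights 1/n over the amounts
        of the n commodity types an agent holds after an episode. */
    public static double cobbDouglas(double[] amounts) {
        double u = 1.0;
        for (double a : amounts) {
            u *= Math.pow(a, 1.0 / amounts.length);
        }
        return u;
    }

Note that if an agent ends an episode holding none of some commodity, its utility for that episode is
zero, which is why one-sided trades in the experiments below yield alternating zero payoffs.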
Returns:
The long-run return is simply the total utility an agent has obtained over all episodes played so far:

    R = ∑_i U_i

The goal of an agent at any iteration is to select actions that maximize its discounted long-term return
under its policy. The use of Q-learning ensures that the artificial agents are non-myopic.
Q-learning:
The learning algorithm used by the artificial agents is one-step Q-learning, described as follows:

    Initialize Q(s, a) to zero
    Repeat
        From the current state s, select an action a using the ε-greedy method
        Take action a; observe the immediate reward r and the next state s'
        Q(s, a) ← Q(s, a) + α [r + γ max_{a'} Q(s', a') − Q(s, a)]
        s ← s'
    Until the end of a trial

The experiment runs with learning rate α = 0.05 and discount factor γ = 0.95. The values of α and γ
are chosen to promote learning of cooperation.
4. Two-Agent Experiment Design and Results
We compare the following trust mechanisms: (1) moving average of the past five observations; (2)
exponential smoothing; (3) Zacharia-Moukas-Maes reputation rating; (4) tit-for-tat; and (5) most
recent move (or last move). We are interested in seeing whether, under each of these five
mechanisms, agents will trust each other and, if so, whether cooperative behavior will converge. By
comparing the five mechanisms, we are interested in which mechanism does better in terms of
building trust and promoting social welfare.
Experiment one: Two Q-learner agents play against each other
Two learning agents play the iterated trust game. In each iteration, both agents start with the same
endowment. Player A first offers its commodity to player B; upon receiving the commodity, player B
decides how much of its own commodity to trade in return. Since there is no third-party agent
handling the transaction, the exact trade volume is visible to both parties, so there is no information
asymmetry. The whole experiment is one long trial, i.e., the two artificial agents play the game an
indefinite number of times.
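To make the interaction concrete, here is a minimal Java sketch of one Q-learning trader in this loop,
combining the one-step update of Section 3 (α = 0.05, γ = 0.95) with a state built from the two last
trade volumes; all identifiers are our own illustrative choices.

    /** Sketch of a Q-learning trader whose state is the pair of last
        trade volumes (its own and its opponent's). */
    public class QTrader {
        private static final double ALPHA = 0.05;  // learning rate
        private static final double GAMMA = 0.95;  // discount factor
        private final int endowment;
        private final double[][] q;                // Q(s, a), zero-initialized

        public QTrader(int endowment) {
            this.endowment = endowment;
            int states = (endowment + 1) * (endowment + 1);
            this.q = new double[states][endowment + 1];
        }

        /** Encode (own last volume, opponent's last volume) as a state.
            With endowment 3, this gives 16 states of 4 actions each. */
        public int state(int myLast, int oppLast) {
            return myLast * (endowment + 1) + oppLast;
        }

        /** One-step Q-learning update after observing (s, a, r, s'). */
        public void update(int s, int a, double r, int sNext) {
            double best = q[sNext][0];
            for (double v : q[sNext]) best = Math.max(best, v);
            q[s][a] += ALPHA * (r + GAMMA * best - q[s][a]);
        }
    }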
To make it easy to test the validity of the experiment and to analyze the results, the endowment is set
rather small, at 3, so that each agent has 4 × 4 × 4 = 64 state-action pairs. Based on the utility
function, through long iterations one agent learns to switch its trade volume between 3 and 0, while
the other learns to switch between 0 and 2 correspondingly. This can be illustrated as:
Agent A (trade volume): 3 0 3 0 3 0 …
Agent A (utility):      0 6 0 6 0 6 …
Agent B (trade volume): 0 2 0 2 0 2 …
Agent B (utility):      9 0 9 0 9 0 …
Thus the utility of the first agent alternates between 0 and 6, giving it an average of 3; the second
agent gets 9 and 0 in turn, giving it an average of 4.5. Although this is still not as good as the
following outcome, which gives both agents an average utility of 4.5, it is better than sticking to
trading 1 unit or 2 units all the time.
Agent A (trade volume): 3 0 3 0 3 0 …
Agent A (utility):      0 9 0 9 0 9 …
Agent B (trade volume): 0 3 0 3 0 3 …
Agent B (utility):      9 0 9 0 9 0 …
Experiment two: Two Q-learner agents with reputation index play against each other
Experiment two includes a set of sub-experiments to test the efficiency of different reputation
mechanisms. At the end of each time period t, both agents rate each other. The rating given to one’s
opponent, r′, is simply the ratio of the opponent’s trade volume V′ to the endowment N:

    r′ = V′ / N

The reputation index is then updated based on this rating according to the different mechanisms. The
value of the reputation index is normalized in each mechanism, so that 1 is perfect and 0 is terrible.
The strategies of each agent are now mappings from the reputation information to possible actions.
We specifically test the following four reputation mechanisms.
1. Moving Average
The value of the reputation index is simply the arithmetic average of the most recent five ratings.
2. Exponential Smoothing
The reputation index is an exponentially weighted average of past ratings, with more recent ratings
weighted more heavily.
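As an illustration, the first two mechanisms might be realized as follows; the smoothing weight
lambda below is our own assumption, since the paper does not report the weights used.

    import java.util.ArrayDeque;
    import java.util.Deque;

    /** Sketch of the two simplest reputation indices over ratings in [0, 1]. */
    public class SimpleReputation {
        // Mechanism 1: arithmetic mean of the five most recent ratings.
        private final Deque<Double> window = new ArrayDeque<>();

        public double movingAverage(double rating) {
            window.addLast(rating);
            if (window.size() > 5) window.removeFirst();
            double sum = 0.0;
            for (double r : window) sum += r;
            return sum / window.size();
        }

        // Mechanism 2: exponentially weighted average of all past ratings.
        private double index = 0.0;
        private final double lambda = 0.3;   // assumed smoothing weight

        public double exponentialSmoothing(double rating) {
            index = lambda * rating + (1.0 - lambda) * index;
            return index;
        }
    }

Since each rating r′ = V′/N lies in [0, 1], both indices remain normalized, as required above.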
3. Reputation Rating (Zacharia-Moukas-Maes)
Introduced by Zacharia, Moukas and Maes (1999), every agent’s reputation index is updated after
each iteration based on its reputation value in the last iteration, R_{t-1}, its counterpart’s reputation,
R′, and the rating it received for this iteration, W_t. The recursive estimate of the reputation value of
an agent at time t can be expressed as:

    R_t = R_{t-1} + (1/θ) Φ(R_{t-1}) R′ (W_t − E_t)
    Φ(R) = 1 − 1 / (1 + e^{−(R − D)/σ})
    E_t = R_{t-1} / D
    W_t = V′_t / N

where V′_t is the trade volume of the agent’s counterpart and D is the range of the reputation values,
which is 1 here.
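The recursion translates directly into code. The paper fixes only D = 1, so the values of theta, sigma,
and the initial reputation below are illustrative assumptions of ours:

    /** Sketch of the Zacharia-Moukas-Maes recursive reputation update. */
    public class ZmmReputation {
        private final double theta = 10.0;  // damping factor (assumed)
        private final double sigma = 0.5;   // sigmoid width (assumed)
        private final double d = 1.0;       // range of reputation values (paper: D = 1)
        private double r = 0.5;             // current reputation R_{t-1} (initial value assumed)

        /** Update own reputation given the counterpart's reputation rPrime
            and its trade volume vPrime out of endowment n, so W_t = vPrime/n. */
        public double update(double rPrime, double vPrime, double n) {
            double w = vPrime / n;                                       // W_t
            double e = r / d;                                            // E_t = R_{t-1} / D
            double phi = 1.0 - 1.0 / (1.0 + Math.exp(-(r - d) / sigma)); // Phi(R_{t-1})
            r = r + (1.0 / theta) * phi * rPrime * (w - e);
            return r;
        }
    }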
4. Tit-for-tat
Under this mechanism, each agent trades the amount that its counterpart traded to it in the last time
period, i.e., V_t = V′_{t-1}.
5. Performance Comparison of various mechanisms
The total utility of each agent under the different reputation mechanisms is compared in Figures 1
and 2. Furthermore, the joint utility of both agents under the different reputation mechanisms is
compared in Figure 3.
-----------------------------------------
Insert Figure 1 here
-----------------------------------------
-----------------------------------------
Insert Figure 2 here
-----------------------------------------

-----------------------------------------
Insert Figure 3 here
-----------------------------------------

Furthermore, we assign different reputation mechanisms to the two agents and compare the total
points after 300,000 iterations. The following set of figures describes the performance of each
reputation mechanism against the other mechanisms.

-----------------------------------------
Insert Figures 4a – 4e here
-----------------------------------------
5. Three-Agent Experiment Design and Results
Three agents selling two goods
In this experiment, three agents trade two types of goods: agents B and C produce the same type of
good, while agent A produces a different type. At the beginning of each episode, agent A chooses the
agent with the higher reputation value from agents B and C and gives that agent its goods. The
reputations of the chosen agent and of agent A are updated after each episode. All three agents are
assigned the same reputation mechanism. We test the performance of different reputation
mechanisms: moving average, last move, exponential smoothing, and Zacharia-Moukas-Maes
reputation rating. Figure 5 displays the total utility of agent A under these mechanisms. The
experiment shows that the “most recent” reputation mechanism quickly wins out against the others,
except tit-for-tat: if all agents are using tit-for-tat, then obviously all agents cooperate and each agent
achieves its best performance.
-----------------------------------------
Insert Figure 5 here
-----------------------------------------
We now study the impact of an additional trade partner by comparing the total utility of agent A in
the two-agent and three-agent contexts. Not surprisingly, agent A benefits from the introduction of
the third player, as there is now competition between agents B and C. The results are summarized in
Table 1.
Table 1: Total utility of Agent A in the 2-agent and 3-agent contexts.

                          Three agents    Two agents
Moving Average            506829.6        337441
Most Recent Move          577475.8        279490.9
Exponential Smoothing     423258.9        449496.1
Zacharia-Moukas-Maes      452451.9        449606.6
What if agent A uses a different, fixed reputation mechanism, such as tit-for-tat, while agents B and C
use another, identical, reputation mechanism? Will agent A benefit from such differentiation? The
results are somewhat mixed and show that performance depends on what the others are using, as
summarized in Table 2.
Table 2: Total utility of Agent A with the tit-for-tat strategy playing against other reputation
mechanisms in two-agent and three-agent environments.

                          Two agents    Three agents
Moving Average            413824.6      418158.5
Most Recent Move          423078.2      389489.9
Exponential Smoothing     394672.8      429542.0
Zacharia-Moukas-Maes      423752.8      382129.0
Three agents selling three goods
We now let agent C sell a third type of sauce, i.e., we consider the general case in which each agent
produces a different good. The endowment of each agent in each period/episode is set to 3, reflecting
a steady-state production rate for each agent (i.e., each agent can produce a fixed amount of goods
during a period). Now we describe the trade game. At the beginning of each episode, each agent
decides simultaneously how much to give to each of the other two agents, expecting an exchange
from them. It turns out that the system can quickly become too complicated to be tractable, even in
the two-agent or three-agent learning situation (Sandholm and Crites 1995; Marimon, McGrattan and
Sargent 1990). In our setting, it would be very difficult for the Q-learning agents to learn the true
values of the state-action functions. In this initial exploration, we therefore start with one agent (A)
learning while fixing the strategies of the other two agents (B and C). We note in passing that we
leave the case of three agents learning simultaneously for future research.
We have identified from the literature the following heuristics for players B and C, suggested by
previous research, including the nice, nasty, fair, and modified tit-for-tat strategies (e.g., Axelrod
1984; Axelrod and Hamilton 1981). For benchmarking purposes, as in the literature, we also add the
random strategy. Table 3 describes these five strategies.
Table 3: Possible strategies of players B and C.

Random:             The agent randomly decides how many goods to give to the other two agents.

Nasty:              IF V′_{t-1} = 0 THEN V_t = 1 ELSE V_t = 0.

Fair:               IF V′_{t-1} = 0 THEN V_t = 0 ELSE V_t = 1.

Nice:               The agent always gives 1 unit of its good to each of the other two agents
                    (V_t = 1).

Modified Tit-4-Tat: The agent gives each opponent the amount that opponent gave to it in the last
                    episode (V_t = V′_{t-1}) when the total amount it got in the last episode does
                    not exceed its endowment. If the total exceeds its endowment, the agent gives
                    its good to the other agents in proportion to the amounts it was given in the
                    last episode.
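As a minimal sketch of the last rule (identifiers are ours), the proportional cap can be implemented as:

    /** Sketch of modified tit-for-tat: echo last episode's received amounts,
        scaled down proportionally when their sum exceeds the endowment. */
    public class ModifiedTitForTat {
        public static int[] decide(int[] receivedLast, int endowment) {
            int total = 0;
            for (int v : receivedLast) total += v;
            int[] give = new int[receivedLast.length];
            for (int i = 0; i < give.length; i++) {
                if (total <= endowment) {
                    give[i] = receivedLast[i];                    // V_t = V'_{t-1}
                } else {
                    // give in proportion to what each opponent gave last time
                    give[i] = (endowment * receivedLast[i]) / total;
                }
            }
            return give;
        }
    }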
We experiment with the behavior of our learning agent (A) in the following typical scenarios for
agents B and C³. Agents B and C use the following strategy pairs: random and random; tit-for-tat and
random; tit-for-tat and nasty; tit-for-tat and fair; tit-for-tat and tit-for-tat; tit-for-tat and nice; and,
finally, fair and nice. Our interest is in the performance of agent A under the above five reputation
mechanisms. Figures 6a–6e display the results for the moving average, most recent move,
exponential smoothing, Zacharia-Moukas-Maes rating, and tit-for-tat mechanisms, respectively.
-----------------------------------------
Insert Figures 6a – 6e here
-----------------------------------------
Can the artificial intelligent agent learn to cooperate? Based on the above experiments, the answer is
yes. When using exponential smoothing, agent A seems to learn slowly and its performance is
somewhat inferior; otherwise, it performs well under all the other reputation mechanisms. The value
of intelligence (Zhong, Kimbrough and Wu 2001) is further confirmed in this study, i.e., intelligence
pays: the learning agent can quickly learn how to exploit the other agents’ fixed strategies.
³ We note that it would be straightforward to conduct statistical significance tests over all possible
combinations of agent B’s and C’s strategies. However, we choose not to do so, since we believe
such statistical formalism would not add much additional insight here.
The results show that the emergence of trust depends on the strategies used by B and C, i.e., on the
“climate”. When the climate is nice, the agent learns to cooperate, and social welfare is maximized
and rather fairly distributed (almost equally split).
Comparing the five reputation mechanisms, except for exponential smoothing there does not seem to
be any significant difference in building trust. This is due to a commonality of these mechanisms:
they forgive, or discount, previous actions taken by other parties. This is interesting, and the role of
forgiveness in promoting trust building deserves further investigation. Again, we leave this for a
subsequent project.
Overall, this experiment demonstrates the promise of artificial intelligent agents in the Mad Mex trust
game, and indeed in market negotiation contexts generally.
6. Conclusions and Future research
It is well known in the literature that trust will emerge in repeated games such as the Mad Mex Trust
game studied here. However, this study deepens previous work by examining, within a framework
that allows parameter settings, when trust will and will not emerge. The agents here are
identity-centric, using reinforcement Q-learning, and their performance has been compared with that
of strategy-centric agents.
Artificial agents using Q-learning have been found to be capable of playing the Mad Mex Trust game
efficiently and effectively. Cooperative behaviors have emerged, and the conditions for such
cooperation, or trust building, have been studied experimentally. Several efficient and effective
mechanisms for trust building in electronic markets have been tested and compared. The study
explores, initially, how these mechanisms affect the emergence of trust and cooperative behavior.
Key ingredients in building, as well as destroying, distributed trust have been examined
experimentally. Can we find characteristics of trusting/distrusting systems? Our initial yet original
exploration sheds light on this question.
We believe our Mad Mex Trust game constitutes a more realistic model of negotiation support
systems in electronic markets, particularly on the Internet. We are actively investigating other forms
of trust games, including the classical investment game (or trust game) as well as the ultimatum
game (see Zhong, Kimbrough and Wu 2001 for initial results). Of particular interest, we plan to
investigate a closely related game, the Santa Fe Bar Game, first proposed by Arthur (1994). In the
long run, we hope to develop computational principles for understanding social trust.
7. References
1. Arthur, B. “Inductive Reasoning and Bounded Rationality,” The American Economic Review,
V. 84, No. 2, pp. 406-411, May 1994.
2. Arthur, B., Holland, J., LeBaron, B., Palmer, R., and Tayler, P. “Asset Pricing Under
Endogenous Expectations in an Artificial Stock Market,” Working Paper, Santa Fe Institute,
December 1996.
3. Axelrod, R. The Evolution of Cooperation. Basic Books, New York, N.Y., 1984.
4. Axelrod, R., and Hamilton, W. “The evolution of cooperation”, Science, Vol. 211, No. 27 pp.
1390-1396, March 1981.
5. Berg, J., Dickhaut, J., and McCabe, K. “Trust, Reciprocity, and Social History,” Games and
Economic Behavior, 10, pp. 122 – 142, 1994.
6. CACM (Communications of ACM), Special Section on Trusting Technology,
http://www.acm.org/cacm/1200/1200toc.html, Vol. 43, No. 12, December 2000.
7. Das, T.K., and Teng, B.-S. “Between Trust and Control: developing confidence in partner
cooperation in alliances”, Academy of Management Review, Vol. 23, No. 3, pp. 491-512,
1998.
8. Dasgupta, P. “Trust as a Commodity,” in D. Gambetta, editor, Trust: Making and Breaking
Cooperative Relations, pp. 49 – 72. Blackwell, Oxford and New York, 1988.
9. Erev, I. and Roth, A. “Predicting How People Play Games: Reinforcement Learning in
Experimental Games with Unique, Mixed Strategy Equilibria,” The American Economic
Review, 88, pp. 848 – 881, 1998.
10. Ellison, G. “Learning, Local Interaction, and Coordination,” Econometrica, V. 61, No. 5, pp.
1047-1071, September 1993.
11. Güth, W., Ockenfels, P., and Wendel, M. “Cooperation Based on Trust: An Experimental
Investigation,” Journal of Economic Psychology, 18, 15 – 43, 1997.
12. Hardin, R. “Exchange Theory on Strategic Bases,” Social Science Information, Vol. 21, No.
2, pp. 251-272, 1982.
13. Holland, C.P., and Lockett, A.G. “Business Trust and the Formation of Virtual Organizations,”
Proceedings of the 31st Annual Hawaii International Conference on System Sciences
(HICSS-31), IEEE Computer Society, 1998.
14. Holland, J. “Artificial Adaptive Agents in Economic Theory,” The American Economic Review,
81, pp. 365 – 370, 1991.
15. Hopcroft, J. and Ullman, J. Introduction to Automata Theory, Languages and Computation.
Addison-Wesley, Reading, MA, 1979.
16. Kimbrough, S., Wu, D.J., and Zhong, F. “Computers Play the Beer Game: Can Artificial
Agents Manage the Supply Chain?” HICSS-34, 2001.
17. Lahno, B. “Trust, Reputation, and Exit in Exchange Relationships”, Journal of Conflict
Resolution Vol. 39, No. 3, pp. 495-510, 1995.
18. Marimon, R., McGrattan, E., and Sargent, T. “Money as a Medium of Exchange in an
Economy with Artificially Intelligent Agents,” Journal of Economic Dynamics and Control,
Vol. 14, pp. 329-373, 1990.
19. Mayer, R.C., Davis, J.H., and Schoorman, F.D., “An Integrative Model of Organizational
Trust”, Academy of Management Review, Vol. 20, No 3, pp. 709-734, 1995.
20. Miller, J. “The Coevolution of Automata in the Repeated Prisoner’s Dilemma,” Journal of
Economic Behavior and Organization, 29, pp. 87-112, 1996.
21. Rubinstein, A. “Finite Automata Play the Repeated Prisoner’s Dilemma”, Journal of
Economic Theory 39, pp. 83-96, 1986.
22. Sandholm, T., and Crites, R. “Multiagent Reinforcement Learning in Iterated Prisoner's
Dilemma,” Biosystems, 37, pp. 147 - 166, 1995. Special Issue on the Prisoner's Dilemma.
23. Schillo, M. and Funk, P. “Who Can You Trust: Dealing with Deception”, in Proceedings of
the Workshop Deception, Fraud and Trust in Agent Societies at the Autonomous Agents
Conference, pp. 95-106, 1999.
24. Shapiro, S. P. “The Social Control of Impersonal Trust”, The American Journal of Sociology,
Vol. 93, No. 3, pp. 623-658, 1987.
25. Sundali, J., Israeli, A., and Janicki, T. “Reputation and Deterrence: Experimental Evidence
from the Chain Store Game,” Journal of Business and Economic Studies, Vol. 6, No. 1, pp. 1–
19, Spring 2000.
26. Tan, Y. H. and Thoen, W. “Formal Aspects of a Generic Model of Trust for Electronic
Commerce,” Working Paper, Erasmus University Research Institute for Decision and
Information Systems (EURIDIS), Erasmus University Rotterdam, The Netherlands, 2001.
27. Uslaner, E.M. “Social Capital and the Net,” Communications of the ACM,
http://www.acm.org/cacm/1200/1200toc.html, Vol. 43, No. 12, December 2000.
28. Van der Heijden, E.C.M., Nelissen, J.H.M., Potters, J.J.M., and Verbon, H.A.A. “Simple and
Complex Gift Exchange in the Laboratory,” Working Paper, Department of Economics and
CentER, Tilburg University.
29. Wolfram, S. Cellular Automata and Complexity. Addison-Wesley Publishing Company,
Reading, MA, 1994.
30. Zacharia, G., Moukas, A., and Maes, P. “Collaborative Reputation Mechanisms in Electronic
Marketplace,” HICSS-33, 1999.
31. Zhong, F., Kimbrough, S. and Wu, D.J. “Cooperative Agent Systems: Artificial Agents Play
the Ultimatum Game”, Working Paper, The Wharton School, University of Pennsylvania,
2001.
Figure 1: Total utility of agent A under different mechanisms for 300,000 episodes. [Chart: time vs.
total utility; series: moving average, recency, smoothing, Maes, tit-for-tat.]
Figure 2: Total utility of agent B under different mechanisms for 300,000 episodes. [Chart: time vs.
total utility; series: moving average, recency, smoothing, Maes, tit-for-tat.]
Figure 3: Joint utility of both agents under different mechanisms for 300,000 episodes. [Chart: time
vs. joint utility; series: moving average, recency, smoothing, Maes, tit-for-tat.]
Figure 4: The performance of different reputation mechanisms when playing against each other (a –
moving average, r – most recent move, s – exponential smoothing, m – Maes, t – tit-for-tat). [Five bar
charts, 4a–4e, one per mechanism, each showing own, opponent, and joint utility against each
opposing mechanism.]
Figure 5: Total utility of Agent A in the three-agent context. [Chart: time vs. total utility; series:
moving average, most recent move, exponential smoothing, Maes.]
Figure 6: Total utilities of the three agents under different reputation mechanisms and strategy
combinations. [Five bar charts, 6a–6e, each showing the utilities of agents A, B, and C under the
strategy pairs random-random, t-random, t-nasty, t-fair, t-t, t-nice, and fair-nice.]