CMSC 477/677
March 13, 2007
Prof. Marie desJardins
Multi-Agent Systems
–
–
Cooperative multi-agent systems
Competitive multi-agent systems
MAS Research Directions
– Organizational structures
–
–
Communication limitations
Learning in multi-agent systems
Jennings et al.’s key properties:
– Situated
–
–
Autonomous
Flexible:
Responsive to dynamic environment
Pro-active / goal-directed
Social interactions with other agents and humans
Research questions: How do we design agents to interact effectively to solve a wide range of problems in many different environments?
Cooperative vs. competitive
Homogeneous vs. heterogeneous
Macro vs. micro
Interaction protocols and languages
Organizational structure
Mechanism design / market economics
Learning
Cooperative MAS:
–
–
Distributed problem solving: Less autonomy
Distributed planning: Models for cooperation and teamwork
Competitive or self-interested MAS:
– Distributed rationality: Voting, auctions
– Negotiation: Contract nets
Distributed sensor network establishment
Distributed vehicle monitoring
Distributed delivery
Track vehicle movements using multiple sensors
Distributed sensor network establishment:
– Locate sensors to provide the best coverage
– Centralized vs. distributed solutions
Distributed vehicle monitoring:
–
–
Control sensors and integrate results to track vehicles as they move from one sensor’s “region” to another’s
Centralized vs. distributed solutions
Logistics problem: move goods from original locations to destination locations using multiple delivery resources (agents)
Dynamic, partially accessible, nondeterministic environment (goals, situation, agent status)
Centralized vs. distributed solution
Cooperative agents, working together to solve complex problems with local information
Partial Global Planning (PGP): A planningcentric distributed architecture
SharedPlans: A formal model for joint activity
Joint Intentions: Another formal model for joint activity
STEAM: Distributed teamwork; influenced by joint intentions and SharedPlans
Problem solving in the classical AI sense, distributed among multiple agents
– That is, formulating a solution/answer to some complex question
– Agents may be heterogeneous or homogeneous
– DPS implies that agents must be cooperative (or, if self-interested, then rewarded for working together)
(Grosz) -“Bratman (1992) describes three properties that must be met to have ‘shared cooperative activity’:
– Mutual responsiveness
–
–
Commitment to the joint activity
Commitment to mutual support”
Theoretical framework for joint commitments and communication
Intention : Commitment to perform an action while in a specified mental state
Joint intention : Shared commitment to perform an action while in a specified group mental state
Communication : Required/entailed to establish and maintain mutual beliefs and join intentions
SharedPlan for group action specifies beliefs about how to do an action and subactions
Formal model captures intentions and commitments towards the performance of individual and group actions
Components of a collaborative plan (p. 5):
–
–
–
–
Mutual belief of a (partial) recipe
Individual intentions-to perform the actions
Individual intentions-that collaborators succeed in their subactions
Individual or collaborative plans for subactions
Very similar to joint intentions
Implementation of joint intentions theory
–
–
–
Built in Soar framework
Applied to three “real” domains
Many parallels with SharedPlans
General approach:
–
–
–
Build up a partial hierarchy of joint intentions
Monitor team and individual performance
Communicate when need is implied by changing mental state & joint intentions
Key extension: Decision-theoretic model of communication selection (Teamcore)
Techniques to encourage/coax/force self-interested agents to play fairly in the sandbox
Voting : Everybody’s opinion counts (but how much?)
Auctions : Everybody gets a chance to earn value
(but how to do it fairly?)
Contract nets : Work goes to the highest bidder
Issues :
–
–
–
–
Global utility
Fairness
Stability
Cheating and lying
S is a Pareto-optimal solution iff
–
–
S’ ( x U x
(S’) > U x
(S) → y U y
(S’) < U y
(S)) i.e., if X is better off in S’, then some Y must be worse off
Social welfare, or global utility, is the sum of all agents’ utility
– If S maximizes social welfare, it is also Pareto-optimal (but not vice versa)
Which solutions are Pareto-optimal?
Y’s utility
Which solutions maximize global utility
(social welfare)?
X’s utility
If an agent can always maximize its utility with a particular strategy (regardless of other agents’ behavior) then that strategy is dominant
A set of agent strategies is in Nash equilibrium if each agent’s strategy S locally optimal, given the other agents’ i is strategies
–
–
No agent has an incentive to change strategies
Hence this set of strategies is locally stable
A
B Cooperate
Cooperate 3, 3
Defect
0, 5
Defect 5, 0 1, 1
Pareto-optimal and social welfare maximizing solution: Both agents cooperate
Dominant strategy and Nash equilibrium: Both agents defect
A
B
Cooperate
Cooperate 3, 3
Defect 5, 0
Defect
0, 5
1, 1
Why?
How should we rank the possible outcomes, given individual agents’ preferences (votes)?
Six desirable properties (which can’t all simultaneously be satisfied ):
– Every combination of votes should lead to a ranking
–
–
–
–
–
Every pair of outcomes should have a relative ranking
The ranking should be asymmetric and transitive
The ranking should be Pareto-optimal
Irrelevant alternatives shouldn’t influence the outcome
Share the wealth : No agent should always get their way
Pepperoni
Onions
Feta cheese
Sausage
Mushrooms
Anchovies
Peppers
Spinach
Plurality voting : the outcome with the highest number of votes wins
–
–
Irrelevant alternatives can change the outcome: The Ross
Perot factor
Borda voting : Agents’ rankings are used as weights, which are summed across all agents
Agents can “spend” high rankings on losing choices, making their remaining votes less influential
Binary voting : Agents rank sequential pairs of choices (“elimination voting”)
–
–
Irrelevant alternatives can still change the outcome
Very order-dependent
Many different types and protocols
All of the common protocols yield Paretooptimal outcomes
But … Bidders can agree to artificially lower prices in order to cheat the auctioneer
What about when the colluders cheat each other?
–
(Now that’s really not playing nicely in the sandbox!)
Simple form of negotiation
Announce tasks, receive bids, award contracts
Many variations: directed contracts, timeouts, bundling of contracts, sharing of contracts, …
There are also more sophisticated dialoguebased negotiation models
“Large-scale problem solving technologies”
Multiple (human and/or artificial) agents
Goal-directed (goals may be dynamic and/or conflicting)
Affects and is affected by the environment
Has knowledge, culture, memories, history, and capabilities (distinct from individual agents)
Legal standing is distinct from single agent
Q: How are MAS organizations different from human organizations?
Exploit structure of task decomposition
–
Establish “channels of communication” among agents working on related subtasks
Organizational structure:
–
–
Defines (or describes) roles, responsibilities, and preferences
Use to identify control and communication patterns:
Who does what for whom : Where to send which task announcements/allocations
Who needs to know what : Where to send which partial or complete results
Theoretical models: Speech act theory
Practical models:
– Shared languages like KIF, KQML, DAML
– Service models like DAML-S
– Social convention protocols
Send only relevant results at the right time
– Conserve bandwidth, network congestion, computational overhead of processing data
–
–
–
Push vs. pull
Reliability of communication (arrival and latency of messages)
Use organizational structures, task decomposition, and/or analysis of each agent’s task to determine relevance
Connectivity (network topology) strongly influences the effectiveness of an organization
Changes in connectivity over time can impact team performance:
– Move out of communication range
coordination
– failures
Changes in network structure
reduced (or increased) bandwidth, increased (or reduced) latency
Emerging field to investigate how teams of agents can learn individually and as groups
Distributed reinforcement learning: Behave as an individual, receive team feedback, and learn to individually contribute to team performance
Distributed reinforcement learning: Iteratively allocate “credit” for group performance to individual decisions
Genetic algorithms: Evolve a society of agents (survival of the fittest)
Strategy learning: In market environments, learn other agents’ strategies
Potential for change:
– Change parameters of organization over time
– That is, change the structures, add/delete/move agents, …
Adaptation techniques:
–
–
–
–
–
Genetic algorithms
Neural networks
Heuristic search / simulated annealing
Design of new processes and procedures
Adaptation of individual agents
Different types of “multi-agent systems”:
– Cooperative vs. competitive
– Heterogeneous vs. homogeneous
– Micro vs. macro
Lots of interesting/open research directions:
–
–
Effective cooperation strategies
“Fair” coordination strategies and protocols
–
–
Learning in MAS
Resourcelimited MAS (communication, …)