Metacognition for Effective
Deliberation in Artificial Agents
Darsana Josyula
18 November 2011
Bowie State University
Artificial Agents

Deliberation
– Process by which agents plan the tasks to perform in order to accomplish current goals
Action
– Process by which agents perform each task in the plan
Deliberation Time

Static Environments
– Deadlines
Dynamic Environments
– Change in Environment
– Preconditions not in effect
– Plans not effective
Approaches to Managing Deliberation Time


Hybrid architectures with deliberative, reactive and action selection components
– Gerhard Lakemeyer (Golog for Robotic Soccer)
– Deliberation time decided by the action selection component
Anytime algorithms (estimate the efficiency of a solution as a function of algorithm run time)
– Nikos Vlassis
– Deliberation time decided by the process that invokes the anytime algorithm
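
The anytime idea can be sketched as follows. This is a minimal illustration, not code from the talk: the names `anytime_min` and `deliberate`, the candidate plans, and the use of a step count in place of a wall-clock budget are all assumptions made for the example.

```python
import itertools

def anytime_min(candidates, cost):
    """Anytime search: after every evaluation, yield the best candidate so far.

    The solution can only improve as more run time is granted.
    """
    best, best_cost = None, float("inf")
    for candidate in candidates:
        current = cost(candidate)
        if current < best_cost:
            best, best_cost = candidate, current
        yield best, best_cost

def deliberate(candidates, cost, max_steps):
    """Invoking process: it, not the algorithm, decides when deliberation stops.

    max_steps is a hypothetical stand-in for a time budget.
    """
    result = None
    for result in itertools.islice(anytime_min(candidates, cost), max_steps):
        pass
    return result

# Hypothetical plans as (name, cost) pairs.
plans = [("walk", 9), ("bus", 5), ("taxi", 3), ("bike", 4)]
short = deliberate(plans, cost=lambda p: p[1], max_steps=2)
full = deliberate(plans, cost=lambda p: p[1], max_steps=len(plans))
```

With a budget of two steps the caller settles for the bus plan; with the full budget it finds the cheaper taxi plan.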
Approaches to Deliberation Time Management

Metacognition
– Awareness of one's own thoughts and the factors that influence one's thinking
– Based on estimates of allowable deliberation and action times
– Monitoring deliberation and action
– Making adjustments to deliberation and action processes
Metacognition in Deliberation and Action
[Diagram with components labeled Metacognition, Deliberation and Action]
Factors that influence Metacognition

Goals

Emotions

Resource Constraints

Plans

Performance Optimization

Influence of other Agents
Goals

Set of goals to be achieved


Choosing relevant goals
Type of goals
– Mandatory versus Desirable (Needs versus Wants)
– Desires versus Intentions (BDI agents)
–
Desires -> Intentions -> Actions
– Maintenance versus Achievement
– Conflicting goals

Source of goals

Progress towards goals
Metacognitive Monitoring and Control of Goal Processing

Marking the type of goals

Setting priorities for goals

Maintaining Expectations on progress towards goals
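
One way to picture the three activities above (marking goal types, setting priorities, keeping progress expectations) is the sketch below. The `Goal` fields, the priority scale and the example goals are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Goal:
    name: str
    mandatory: bool            # marked type: mandatory (need) versus desirable (want)
    priority: int              # assumed scale: larger means more important
    expected_progress: float   # fraction expected to be complete by now
    actual_progress: float = 0.0

    def behind_schedule(self) -> bool:
        """True when the progress expectation is violated."""
        return self.actual_progress < self.expected_progress

def order_goals(goals):
    """Mandatory goals first, then by descending priority (an assumed policy)."""
    return sorted(goals, key=lambda g: (not g.mandatory, -g.priority))

goals = [
    Goal("tidy workspace", mandatory=False, priority=5,
         expected_progress=0.2, actual_progress=0.5),
    Goal("recharge battery", mandatory=True, priority=2,
         expected_progress=0.8, actual_progress=0.3),
    Goal("deliver package", mandatory=True, priority=4,
         expected_progress=0.5, actual_progress=0.5),
]
ordered = [g.name for g in order_goals(goals)]
lagging = [g.name for g in goals if g.behind_schedule()]
```

Here the desirable goal is ordered last despite its high priority, and only the battery goal violates its progress expectation.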
Metacognitive Monitoring and Control of Goal Processing

Expectation failures
–
Possible anomalies to be evaluated
–
Create new goals to deal with expectation failures
–
NAG cycle
–
MCL
–
Indications, Failures and Responses
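
The flow from expectation failure to response can be sketched as below, assuming NAG stands for the note, assess, guide cycle of the Metacognitive Loop (MCL). The two lookup tables are hypothetical placeholders and do not reproduce the actual MCL ontologies.

```python
# Hypothetical ontologies: the entries are illustrative only.
INDICATION_TO_FAILURE = {"too_slow": "model_error", "no_progress": "plan_failure"}
FAILURE_TO_RESPONSE = {"model_error": "relearn_model", "plan_failure": "replan"}

def note(expectations, observations):
    """Note: emit an indication for every expectation that is violated."""
    indications = []
    for key, (low, high, indication) in expectations.items():
        if not low <= observations[key] <= high:
            indications.append(indication)
    return indications

def assess(indications):
    """Assess: map each indication to a hypothesised failure."""
    return [INDICATION_TO_FAILURE[i] for i in indications]

def guide(failures):
    """Guide: map each failure to a response, which becomes a new goal."""
    return [FAILURE_TO_RESPONSE[f] for f in failures]

def nag_cycle(expectations, observations):
    return guide(assess(note(expectations, observations)))

# Each expectation is (lowest allowed, highest allowed, indication if violated).
expectations = {
    "seconds_elapsed": (0, 10, "too_slow"),
    "distance_to_goal": (0, 5, "no_progress"),
}
responses = nag_cycle(expectations, {"seconds_elapsed": 4, "distance_to_goal": 8})
quiet = nag_cycle(expectations, {"seconds_elapsed": 4, "distance_to_goal": 2})
```

Only the distance expectation fails in the first call, so the single suggested response is to replan; the second call produces no responses.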
Emotions

Animals that exhibit more emotional behavior tend to be better
suited for survival


D. Keltner, “Darwin’s touch: Survival of the kindest,”
Psychology Today, February 11 2009.
Emotion has been shown to alter the cognitive process in
human beings allowing for different responses to the same
problem based on different emotional states.

L. Berkowitz, Causes and Consequences of Feelings,
Cambridge University Press, 2000.
Emotions - Russell's Circumplex model of Positive and
Negative Affect



Affective states organized in a circular structure in a 2D plane
Emotions arise from cognitive interpretations of neural sensations that are
the product of two independent neurophysiological systems
These neurophysiological systems correspond to the pleasure axis and
activation axis in the circumplex model
Emotions - Russell's Circumplex model of Positive and
Negative Affect


The circumplex model of affect is consistent with findings in cognitive
neuroscience, neuroimaging, and developmental studies of affect.
The circumplex model has been used to study the development of affective
disorders as well as the genetic and cognitive underpinnings of affective
processing within the central nervous system.
Emotions – Pleasure Axis Transitions

Represent the agent’s feelings about its own performance

Correspond to the number of expectation violations that occur



When no expectation violation occurs, the agent is pleased with
its performance and hence moves its state to the right on the
pleasure axis
When expectation violations occur, the system is frustrated by
its inability to quell the violations and hence moves to the left on
the pleasure axis
The intensity of the expectation violation (the difference
between the observed value and the expected value) decides
how far to the left the system moves with respect to its current
emotional state
Emotions - Activation Axis Transitions



The activation axis transitions are based on the observations of
the system and represent the system’s feeling of stress
As the number of observables that the system has to deal with
increases, the system becomes more stressed and hence its
emotional state moves upward in the activation axis
As the number of observables decreases, the system can relax
and hence its emotional state moves downward in the activation
axis
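
The pleasure-axis and activation-axis transitions described above can be sketched as a single state update. The class name `Affect`, the fixed step size, the clamping to [-1, 1] and the summing of violation intensities are assumptions made for this example; the slides do not specify them.

```python
from dataclasses import dataclass

def clamp(value, low=-1.0, high=1.0):
    return max(low, min(high, value))

@dataclass
class Affect:
    pleasure: float = 0.0     # horizontal axis: negative means displeased
    activation: float = 0.0   # vertical axis: positive means stressed

    def update(self, violations, n_observables, prev_observables, step=0.25):
        """violations: (observed, expected) pairs for each failed expectation."""
        if not violations:
            # No violation: pleased with performance, move right.
            self.pleasure = clamp(self.pleasure + step)
        else:
            # Move left in proportion to the intensity |observed - expected|.
            intensity = sum(abs(obs - exp) for obs, exp in violations)
            self.pleasure = clamp(self.pleasure - step * intensity)
        if n_observables > prev_observables:
            self.activation = clamp(self.activation + step)   # more stress
        elif n_observables < prev_observables:
            self.activation = clamp(self.activation - step)   # relax
        return self

state = Affect()
state.update(violations=[], n_observables=3, prev_observables=3)
calm = (state.pleasure, state.activation)
state.update(violations=[(7.0, 5.0)], n_observables=5, prev_observables=3)
stressed = (state.pleasure, state.activation)
```

The first update moves the state right on the pleasure axis; the second moves it left by twice the step (the violation has intensity 2) and upward on the activation axis.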
Metacognitive Monitoring and Control of Emotions

Monitoring number of observables


Maintaining Expectations on number of observables
Monitoring number of expectation violations
Metacognitive Monitoring and Control of Emotions


In the MCL model, failure nodes activate a set of possible
responses, and the highest-utility response that corresponds
to the type of failure is instantiated.
Which action is deemed to be the best is a learned metric that
could be altered for different emotional states; for instance,
when stressed the best action may simply be the quickest,
when relaxed the best action may be the slowest.
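
A minimal sketch of emotion-dependent response selection follows. The slides describe the "best action" metric as learned; the fixed scoring formula here is only a stand-in for such a metric, and the response names, utilities and durations are hypothetical.

```python
def best_response(responses, activation):
    """Pick the response with the highest emotion-adjusted score.

    activation is assumed to lie in [-1, 1]: positive means stressed,
    negative means relaxed. Under stress, long durations are penalised;
    when relaxed, they are tolerated or even favoured.
    """
    def score(response):
        return response["utility"] - activation * response["duration"]
    return max(responses, key=score)["name"]

# Hypothetical responses activated by a failure node.
responses = [
    {"name": "retry", "utility": 1.0, "duration": 1.0},
    {"name": "replan", "utility": 3.0, "duration": 4.0},
    {"name": "relearn", "utility": 4.0, "duration": 10.0},
]
when_stressed = best_response(responses, activation=1.0)
when_relaxed = best_response(responses, activation=-0.5)
```

The same failure therefore yields the quickest response under stress and the slowest, most thorough one when relaxed.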
Plans/Actions to Perform

How a goal is achieved

Is a plan to achieve the goal known?

If a plan is unknown, agent has to create a plan

Are the pros and cons of the plans known or unknown?

Which plan is better?


Success rate

Costs

Resource Usage
Conflicting plans
Metacognitive Monitoring and Control of Plans

Monitoring the success rate of plans adopted
– Maintaining expectations on rate of success for plans adopted
Monitoring the actual costs for adopted plans and resource usage
– Maintaining plan cost expectations
Monitoring the resource usage of adopted plans
– Maintaining resource usage expectations
Monitoring Contradictions
– Active Logic
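
Plan monitoring against expectations can be sketched as a running record per plan. The class `PlanRecord`, its fields and the example outcomes are hypothetical; the sketch covers only success rate and cost, and leaves out contradiction monitoring with Active Logic.

```python
from dataclasses import dataclass

@dataclass
class PlanRecord:
    name: str
    expected_success_rate: float   # maintained expectation
    expected_cost: float           # maintained expectation (mean cost per attempt)
    attempts: int = 0
    successes: int = 0
    total_cost: float = 0.0

    def record(self, succeeded, cost):
        """Monitor one execution of the plan."""
        self.attempts += 1
        self.successes += int(succeeded)
        self.total_cost += cost

    def violations(self):
        """Return which expectations the observed statistics violate."""
        if self.attempts == 0:
            return []
        found = []
        if self.successes / self.attempts < self.expected_success_rate:
            found.append("success_rate")
        if self.total_cost / self.attempts > self.expected_cost:
            found.append("cost")
        return found

plan = PlanRecord("route_a", expected_success_rate=0.75, expected_cost=10.0)
for succeeded, cost in [(True, 8.0), (False, 12.0), (False, 13.0), (True, 9.0)]:
    plan.record(succeeded, cost)
plan_violations = plan.violations()
```

After four executions the plan succeeds half the time at a mean cost of 10.5, so both expectations are flagged.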
Resource Constraints

Time Constraints
– Deadlines
Resource Constraints
– Minimize resource usage
Metacognitive Monitoring and Control of Resource
Constraints


Monitoring passage of Time
– Maintaining expectations on time requirements
– Active Logic
Monitoring usage of Resources
– Minimize resource usage
Influence of Other Agents

Competitive versus Cooperative

Cooperative agent changing to competitive or vice versa
Metacognitive Monitoring and Control of Influence of other
Agents

Maintaining Expectations on the influence of other agents
Performance Optimization



May adversely influence resource constraints
Very important in competitive settings, but may be important in
single agent settings as well
Examples
– Get better at achieving a goal
– Win against an opponent
– Collect more rewards even at the expense of spending more resources
Metacognitive Monitoring and Control of Performance
Optimization

Monitoring Performance Metrics

Maintaining Expectations of Performance Metrics
Metacognition – Vedantic Underpinnings

Goals
– (Kāma – desire / goal to be achieved)
Emotions
– (Krōdha – anger / emotional state)
Resource Constraints
– (Lōbha – greed / miserliness)
Plans
– (Mōha – delusion / Not seeing pros and cons)
Performance Optimization
– (Mada – pride / vanity)
Influence of Other Agents
– (Mātsarya – competitiveness)
Conclusion
Monitoring Expectations is the key?