Metacognition for Effective Deliberation in Artificial Agents
Darsana Josyula
18 November 2011
Bowie State University

Artificial Agents
  Deliberation – Process by which agents plan the tasks to perform in order to accomplish current goals
  Action – Process by which agents perform each task in the plan

Deliberation Time
  Static Environments
    – Deadlines
  Dynamic Environments
    – Change in Environment
    – Preconditions not in effect
    – Plans not effective

Approaches to Managing Deliberation Time
  Hybrid architectures with deliberative, reactive and action selection components
    – Gerhard Lakemeyer (Golog for Robotic Soccer)
    – Deliberation time decided by the action selection component
  Anytime algorithms (estimate the efficiency of a solution as a factor of algorithm run time)
    – Nikos Vlassis
    – Deliberation time decided by the process that invokes the anytime algorithm

Approaches to Deliberation Time Management
  Metacognition
    – Awareness of one's own thoughts and the factors that influence one's thinking
    – Based on estimates of allowable deliberation and action times
    – Monitoring deliberation and action
    – Making adjustments to deliberation and action processes

Metacognition in Deliberation and Action
  [Diagram: Metacognition, Deliberation, Action]

Factors that influence Metacognition
  Goals
  Emotions
  Resource Constraints
  Plans
  Performance Optimization
  Influence of other Agents

Goals
  Set of goals to be achieved
  Choosing relevant goals
  Type of goals
    – Mandatory versus Desirable (Needs versus Wants)
    – Desires versus Intentions (BDI agents)
    – Desires -> Intentions -> Actions
    – Maintenance versus Achievement
    – Conflicting goals
  Source of goals
  Progress towards goals

Metacognitive Monitoring and Control of Goal Processing
  Marking the type of goals
  Setting priorities for goals
  Maintaining Expectations on progress towards goal

Metacognitive Monitoring and Control of Goal Processing (continued)
  Expectation failures
    – Possible anomalies to be evaluated
    – Create new goals to deal with expectation failures
    – NAG cycle
    – MCL Indications, Failures and Responses

Emotions
  Animals that exhibit more emotional behavior tend to be better suited for survival
    D. Keltner, “Darwin’s touch: Survival of the kindest,” Psychology Today, February 11, 2009.
  Emotion has been shown to alter the cognitive process in human beings, allowing for different responses to the same problem based on different emotional states.
    L. Berkowitz, Causes and Consequences of Feelings, Cambridge University Press, 2000.

Emotions – Russell's Circumplex model of Positive and Negative Affect
  Affective states organized in a circular structure in a 2D plane
  Emotions arise from cognitive interpretations of neural sensations that are the product of two independent neurophysiological systems
  These neurophysiological systems correspond to the pleasure axis and activation axis in the circumplex model

Emotions – Russell's Circumplex model of Positive and Negative Affect (continued)
  The circumplex model of affects is consistent with findings in cognitive neuroscience, neuroimaging, and developmental studies of affects.
  The circumplex model has been used to study the development of affective disorders as well as the genetic and cognitive underpinnings of affective processing within the central nervous system.
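
A minimal sketch of how an affective state could be represented as a point in the 2D circumplex plane. The class name `AffectPoint`, the [-1, 1] style of coordinates, and the orientation (pleasure on the horizontal axis, activation on the vertical axis) are illustrative assumptions, not part of the talk.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectPoint:
    """Hypothetical representation of an affective state in the circumplex plane."""
    pleasure: float    # assumed horizontal axis: displeasure (-) to pleasure (+)
    activation: float  # assumed vertical axis: deactivation (-) to activation (+)

    def angle_degrees(self) -> float:
        """Position on the circle, measured counter-clockwise from the pleasure axis."""
        return math.degrees(math.atan2(self.activation, self.pleasure)) % 360.0

    def intensity(self) -> float:
        """Distance from the neutral centre of the plane."""
        return math.hypot(self.pleasure, self.activation)
```

Under these assumptions, a state is described by where it sits around the circle (its angle) and how far it is from the neutral centre (its intensity), which is one way to read "a circular structure in a 2D plane".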
Emotions – Pleasure Axis Transitions
  Represent the agent’s feelings about its own performance
  Correspond to the number of expectation violations that occur
  When no expectation violation occurs, the agent is pleased with its performance and hence moves its state to the right on the pleasure axis
  When expectation violations occur, the system is frustrated by its inability to quell the violations and hence moves to the left on the pleasure axis
  The intensity of the expectation violation (the difference between the observed value and the expected value) decides how far to the left the system moves with respect to its current emotional state

Emotions – Activation Axis Transitions
  The activation axis transitions are based on the observations of the system and represent the system’s feeling of stress
  As the number of observables that the system has to deal with increases, the system becomes more stressed and hence its emotional state moves upward on the activation axis
  As the number of observables decreases, the system can relax and hence its emotional state moves downward on the activation axis

Metacognitive Monitoring and Control of Emotions
  Monitoring number of observables
  Maintaining Expectations on number of observables
  Monitoring number of expectation violations

Metacognitive Monitoring and Control of Emotions (continued)
  In the MCL model, failure nodes activate a set of possible responses and instantiate the highest utility response that corresponds to the type of failure.
  Which action is deemed to be the best is a learned metric that could be altered for different emotional states; for instance, when stressed the best action may simply be the quickest, while when relaxed the best action may be the slowest.

Plans/Actions to Perform
  How a goal is achieved
  Is a plan to achieve the goal known?
  If a plan is unknown, the agent has to create a plan
  Are the pros and cons of the plans known or unknown?
  Which plan is better?
  Success rate
  Costs
  Resource Usage
  Conflicting plans

Metacognitive Monitoring and Control of Plans
  Monitoring the success rate of plans adopted
  Monitoring the actual costs for adopted plans and resource usage
  Maintaining plan cost expectations
  Monitoring the resource usage of adopted plans
  Maintaining expectations on rate of success for plans adopted
  Maintaining resource usage expectations
  Monitoring Contradictions
  Active Logic

Resource Constraints
  Time Constraints
  Deadlines
  Resource Constraints
  Minimize resource usage

Metacognitive Monitoring and Control of Resource Constraints
  Monitoring passage of Time
  Maintaining expectations on time requirements
  Active Logic
  Monitoring usage of Resources
  Minimize resource usage

Influence of Other Agents
  Competitive versus Cooperative
  Cooperative agent changing to competitive or vice versa

Metacognitive Monitoring and Control of Influence of other Agents
  Maintaining Expectations on the influence of other agents

Performance Optimization
  May adversely influence resource constraints
  Very important in competitive settings, but may be important in single agent settings as well
  Examples
    – Get better in achieving a goal
    – Win against an opponent
    – Collect more rewards even at the expense of spending more resources

Metacognitive Monitoring and Control of Performance Optimization
  Monitoring Performance Metrics
  Maintaining Expectations of Performance Metrics

Metacognition – Vedantic Underpinnings
  Goals – (Kāma – desire / goal to be achieved)
  Emotions – (Krōdha – anger / emotional state)
  Plans – (Mōha – delusion / Not seeing pros and cons)
  Resource Constraints – (Lōbha – greed / miserliness)
  Performance Optimization – (Mada – pride / vanity)
  Influence of Other Agents – (Mātsarya – competitiveness)

Conclusion
  Monitoring Expectations is the key?
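
As one way to make the concluding point concrete, the sketch below shows expectation monitoring driving the pleasure-axis and activation-axis transitions described in the Emotions slides. All names (`Expectation`, `AffectState`, `ExpectationMonitor`), the tolerance, the step sizes, and the clamping of both axes to [-1, 1] are assumptions made for illustration; the talk does not specify them.

```python
from dataclasses import dataclass


def clamp(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Keep an axis value inside an assumed [-1, 1] range."""
    return max(lo, min(hi, x))


@dataclass
class Expectation:
    name: str
    expected: float
    tolerance: float = 0.0  # hypothetical: deviation allowed before a violation is flagged

    def violation(self, observed: float) -> float:
        """Return the violation intensity |observed - expected|, or 0.0 if within tolerance."""
        gap = abs(observed - self.expected)
        return gap if gap > self.tolerance else 0.0


@dataclass
class AffectState:
    pleasure: float = 0.0    # right = pleased, left = frustrated
    activation: float = 0.0  # up = stressed, down = relaxed


class ExpectationMonitor:
    def __init__(self, expectations, step: float = 0.1, scale: float = 0.1):
        self.expectations = {e.name: e for e in expectations}
        self.state = AffectState()
        self.step = step    # assumed fixed shift per update
        self.scale = scale  # assumed weight applied to violation intensity
        self.prev_count = 0

    def update(self, observations: dict) -> list:
        """Check observations against expectations and adjust the affect state."""
        violations = []
        for name, value in observations.items():
            exp = self.expectations.get(name)
            if exp is None:
                continue  # no expectation is maintained for this observable
            intensity = exp.violation(value)
            if intensity > 0.0:
                violations.append((name, intensity))

        # Pleasure axis: move right when nothing is violated; otherwise move
        # left by an amount that grows with the total violation intensity.
        if violations:
            shift = -self.scale * sum(i for _, i in violations)
        else:
            shift = self.step
        self.state.pleasure = clamp(self.state.pleasure + shift)

        # Activation axis: move up when there are more observables than in the
        # previous update, down when there are fewer.
        count = len(observations)
        if count > self.prev_count:
            self.state.activation = clamp(self.state.activation + self.step)
        elif count < self.prev_count:
            self.state.activation = clamp(self.state.activation - self.step)
        self.prev_count = count
        return violations
```

The returned list of violations is where a metacognitive controller could attach further handling, for example creating new goals or selecting a response; that step is left out of this sketch.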