CS 416 Artificial Intelligence
Lecture 2
Agents

Chess Article
Deep Blue (IBM)
• 418 processors, 200 million positions per second
Deep Junior (Israeli Co.)
• 8 processors, 3 million positions per second
Kasparov
• 100 billion neurons in brain, 2 moves per second
But there are 85 billion ways to play the first four moves

Chess Article
1997 - Kasparov lost to Deep Blue
2002 - Kramnik tied Deep Junior (current World Champion)
2003 - Kasparov (current number 1) plays Deep Junior Jan 26 – Feb 7

Chess Article
Cognitive psychologists report chess is a game of pattern matching for humans
• But what patterns do we see?
• What rules do we use to evaluate perceived patterns?

What is an agent?
Perception
• Sensors receive input from environment
  – Keyboard clicks
  – Camera data
  – Bump sensor
Action
• Actuators impact the environment
  – Move a robotic arm
  – Generate output for computer display

Perception
Percept
• Perceptual inputs at an instant
• May include perception of internal state
Percept Sequence
• Complete history of all prior percepts
Do you need a percept sequence to play Chess?

An agent as a function
Agent maps percept sequence to action
• Agent: f(ps) → a; ps ∈ p*
  – Set of all inputs known as state space
Agent Function
• If inputs are finite, a table can store mapping
• Scalable?
• Reverse Engineering?

Evaluating agent programs
We agree on what an agent must do
Can we evaluate its quality?
Performance Metrics
• Very Important
• Frequently the hardest part of the research problem
• Design these to suit what you really want to happen

Rational Agent
For each percept sequence, a rational agent should select an action that maximizes its performance measure
Example: autonomous vacuum cleaner
• What is the performance measure?
• Penalty for eating the cat? How much?
• Penalty for missing a spot?
• Reward for speed?
• Reward for conserving power?
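
The vacuum cleaner questions above come down to choosing weights in a scoring function. The following is a minimal sketch of that idea, not the lecture's own formulation: the `EpisodeLog` fields, the `WEIGHTS` values, and the function name `performance` are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class EpisodeLog:
    """What happened during one cleaning run (hypothetical fields)."""
    spots_cleaned: int
    spots_missed: int
    cats_eaten: int
    seconds_elapsed: float
    energy_used_wh: float


# Hypothetical weights. Picking these numbers is the hard design problem:
# they encode "what you really want to happen".
WEIGHTS = {
    "spots_cleaned": 1.0,      # reward per spot cleaned
    "spots_missed": -2.0,      # penalty for missing a spot
    "cats_eaten": -1000.0,     # penalty for eating the cat (how much?)
    "seconds_elapsed": -0.01,  # reward for speed, expressed as a time cost
    "energy_used_wh": -0.1,    # reward for conserving power, as an energy cost
}


def performance(log: EpisodeLog) -> float:
    """Weighted sum of the logged events; higher is better."""
    return sum(weight * getattr(log, name) for name, weight in WEIGHTS.items())


if __name__ == "__main__":
    careful = EpisodeLog(spots_cleaned=10, spots_missed=1, cats_eaten=0,
                         seconds_elapsed=100.0, energy_used_wh=20.0)
    reckless = EpisodeLog(spots_cleaned=11, spots_missed=0, cats_eaten=1,
                          seconds_elapsed=50.0, energy_used_wh=10.0)
    print(performance(careful), performance(reckless))
```

With these assumed weights the careful run outscores the faster run that eats the cat. Shrinking the cat penalty far enough would reverse that ranking, which is why the metric has to be designed around the behavior you actually want.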
Learning and Autonomy
Learning
• To update the agent function in light of observed performance of percept-sequence to action pairs
  – Explore new parts of state space
  – Learn from trial and error
  – Change internal variables that influence action selection

Adding intelligence to agent function
At design time
• Some agents are designed with clear procedure to improve performance over time. Really the engineer’s intelligence.
  – Camera-based user identification
At run-time
• Agent executes complicated equation to map input to output
Between trials
• With experience, agent changes its program (parameters)

How big is your percept?
Dung Beetle
• Largely feed forward
Sphex Wasp
• Reacts to environment (feedback) but not learning
A Dog
• Reacts to environment and can significantly alter behavior

Qualities of a task environment
Fully Observable
• Agent need not store any aspects of state
  – The Brady Bunch as intelligent agents
  – Volume of observables may be overwhelming
Partially Observable
• Some data is unavailable
  – Maze
  – Noisy sensors

Qualities of a task environment
Deterministic
• Always the same outcome for state/action pair
Stochastic
• Not always predictable – random
Partially Observable vs. Stochastic
• My cats think the world is stochastic
• Physicists think the world is deterministic

Qualities of a task environment
Markovian
• Future state only depends on current state
Episodic
• Percept sequence can be segmented into independent temporal categories
  – Behavior at traffic light independent of previous traffic
Sequential
• Current decision could affect all future decisions
Which is easiest to program?
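
The deterministic, stochastic, and Markovian qualities above can be made concrete with a transition function. This is a rough sketch on an assumed two-square vacuum world; the state layout `(location, left_dirty, right_dirty)`, the action names, and the `slip_prob` parameter are hypothetical choices for illustration. Both functions are Markovian because the next state depends only on the current state and action, never on earlier history.

```python
import random

# Assumed state layout: (location, left_dirty, right_dirty),
# where location is "L" or "R". Assumed actions: "Left", "Right", "Suck".


def deterministic_step(state, action):
    """Same state/action pair always yields the same next state."""
    location, left_dirty, right_dirty = state
    if action == "Left":
        return ("L", left_dirty, right_dirty)
    if action == "Right":
        return ("R", left_dirty, right_dirty)
    if action == "Suck":
        if location == "L":
            return ("L", False, right_dirty)
        return ("R", left_dirty, False)
    raise ValueError(f"unknown action: {action!r}")


def stochastic_step(state, action, rng, slip_prob=0.2):
    """With probability slip_prob the action silently fails (state unchanged).

    The outcome is random, but its distribution depends only on the
    current state and action, so the environment is still Markovian.
    """
    if rng.random() < slip_prob:
        return state
    return deterministic_step(state, action)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs behave the same
    state = ("L", True, True)
    print(deterministic_step(state, "Suck"))
    print(stochastic_step(state, "Suck", rng))
```

Setting `slip_prob=0.0` recovers the deterministic environment, so the deterministic case can be read as the special case in which one outcome has probability 1.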
Qualities of a task environment
Static
• Environment doesn’t change over time
  – Crossword puzzle
Dynamic
• Environment changes over time
  – Driving a car
Semi-dynamic
• Environment is static, but performance metrics are dynamic
  – Drag racing

Qualities of a task environment
Discrete
• Values of a state space feature (dimension) are constrained to distinct values from a finite set
  – Blackjack: f(your cards, exposed cards) = action
Continuous
• Variable has infinite variation
  – Antilock brakes: f(vehicle speed, wheel velocity) = unlock
  – Are computers really continuous?

Qualities of a task environment
Towards a terse description of problem domains
• State space: features, dimensionality, degrees of freedom
• Observable?
• Predictable?
• Dynamic?
• Continuous?
• Performance metric

Building Agent Programs
The table approach
• Build a table mapping states to actions
  – Chess has 10^150 entries (10^80 atoms in the universe)
  – I’ve said memory is free, but keep it within the confines of the boundable universe
• Still, tables have their place
Discuss four agent program principles

Simple Reflex Agents
• Sense environment
• Match sensations with rules in database
• Rule prescribes an action
Reflexes can be bad
• Don’t put your hands down when falling backwards!
Inaccurate information
• Misperception can trigger reflex when inappropriate
But rules databases can be made large and complex

Simple Reflex Agents
Randomization
• The vacuum cleaner problem
[Figure: vacuum cleaner world; labels: Dirty, Left, Right]

Model-based Reflex Agents
So when you can’t see something, you model it!
• Create an internal variable to store your expectation of variables you can’t observe
• If I throw a ball to you and it falls short, do I know why?
  – Aerodynamics, mass, my energy levels…
  – I do have a model: ball falls short, throw harder

Model-based Reflex Agents
Admit it, you can’t see and understand everything
Models are very important!
• We all use models to get through our lives
  – Psychologists have many names for these context-sensitive models
• Agents need models too

Goal-based Agents
Lacking moment-to-moment performance measure
Overall goal is known
How to get from A to B?
• Current actions have future consequences
• Search and Planning are used to explore paths through state space from A to B

Utility-based Agents
Goal-directed agents that have a utility function
• Function that maps internal and external states into a scalar
  – A scalar is a number

Learning Agents
Learning Element
• Making improvements
Performance Element
• Selecting actions
Critic
• Provides learning element with feedback about progress
Problem Generator
• Provides suggestions for new tasks to explore state space

A taxi driver
Performance Element
• Knowledge of how to drive in traffic
Critic
• Observes tips from customers and horn honking from other cars
Learning Element
• Relates low tips to actions that may be the cause
Problem Generator
• Proposes new routes to try and improved driving skills

Review
Outlined families of AI problems and solutions
Next class we study search problems
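
As a recap of two of the agent designs covered in this lecture, here is a small sketch contrasting a randomized simple reflex agent with a model-based reflex agent in the vacuum cleaner world. The percept formats, the action names (including `"NoOp"`), and the class and function names are assumptions chosen for illustration. The model-based agent also assumes that `"Suck"` always succeeds.

```python
import random


def randomized_reflex_agent(is_dirty, rng):
    """Simple reflex agent that senses only dirt, not its location.

    It has no memory, so when the square is clean it moves at random.
    This keeps it from getting stuck repeating the same move forever.
    """
    if is_dirty:
        return "Suck"
    return rng.choice(["Left", "Right"])


class ModelBasedVacuumAgent:
    """Reflex agent with an internal model of squares it cannot see now."""

    def __init__(self):
        # Internal variable: squares the agent expects to be clean.
        self.believed_clean = set()

    def act(self, percept):
        location, is_dirty = percept  # location is "L" or "R"
        # Either the square is already clean, or it will be after "Suck"
        # (assumption: sucking always works).
        self.believed_clean.add(location)
        if is_dirty:
            return "Suck"
        if self.believed_clean >= {"L", "R"}:
            return "NoOp"  # model says everything is clean: stop moving
        return "Right" if location == "L" else "Left"


if __name__ == "__main__":
    rng = random.Random(0)
    print(randomized_reflex_agent(False, rng))

    agent = ModelBasedVacuumAgent()
    for percept in [("L", True), ("L", False), ("R", False)]:
        print(percept, agent.act(percept))
```

The reflex agent never stops moving because it cannot know when the job is done. The model-based agent can stop, but only for as long as its model of the unseen square stays accurate.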