CS 771 Artificial Intelligence
Intelligent Agents

What is AI?
• Views of AI fall into four categories:
  1. Thinking humanly
  2. Acting humanly
  3. Thinking rationally
  4. Acting rationally

Acting/Thinking Humanly/Rationally
• Acting humanly (the Turing test):
  – Natural language processing
  – Knowledge representation
  – Automated reasoning
  – Machine learning
  – (plus vision and robotics for the total Turing test)
• Thinking humanly:
  – Introspection; the General Problem Solver (Newell and Simon, 1961)
  – Cognitive science
• Thinking rationally:
  – Logic
  – Problems: how to represent and reason in a domain
• Acting rationally:
  – Agents: perceive and act

Agents and environments
• An agent perceives its environment via sensors and acts on the environment with its actuators
• The environment is the agent's world
• Agents include humans, robots, softbots, thermostats, etc.
• The agent function maps from percept histories to actions: f: P* → A
• The agent program runs on the physical architecture to produce f
• agent = architecture + program

Perception and action
• Perception: sensors receive input from the environment
  – Keyboard clicks
  – Camera data
  – Thermostat readings
• Percept: the state of all perceptual inputs at an instant
• Percept sequence: the complete history of all prior percepts
• Action: actuators affect the environment
  – Move a robotic arm
  – Generate output for a computer display
  – Turn on the furnace

Agents
• Human agent
  – Sensors: eyes, ears, nose, etc.
  – Actuators: hands, legs, mouth, and other body parts
• Air-conditioning agent
  – Sensors: thermostat
  – Actuators: furnace
• Robotic agent: part-picking robot
  – Sensors: cameras and infrared range finders
  – Actuators: various motors
• Computer agent: softbot
  – Sensors: keyboard, microphone, touchpad, joystick, ports (USB)
  – Actuators: screen, speaker, ports (network)

Vacuum-cleaner agent
• Percepts: location and state of the environment, e.g., [A, Dirty], [B, Clean]
• Actions: Left, Right, Suck, NoOp

Vacuum-cleaner agent
• Is this intelligent behavior?

Vacuum-cleaner agent
• A table of all possible percept sequences and their associated actions
• Pro: the designer determines what counts as intelligent behavior
  – An external characterization of the agent
• Con: the table is exponentially large
  – Chess would need about 10^150 entries (there are roughly 10^80 atoms in the universe)
  – Requires a lot of memory
  – Long lookup times
  – Human errors while constructing the table
  – Time-consuming to build
  – Inflexible once built

Good behavior: the concept of rationality
• Rational behavior is doing the best you can with what you know
  – To do the "right thing" you need goals
• Given a percept sequence, choose the action expected to maximize success in terms of some predefined goals for behavior
  – Choosing involves combining percepts with knowledge
  – Choosing sometimes necessitates seeking additional percepts
• Rational behavior depends on four things:
  – The performance measure
  – The agent's prior knowledge
  – The actions the agent can perform
  – The agent's percept sequence

Rational agents
• Rational agent: for each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has
• Performance measure: an objective criterion for success of an agent's behavior
• E.g., the performance measure of a vacuum-cleaner agent could be the amount of dirt cleaned up, the amount of time taken, the amount of electricity consumed, the amount of noise generated, etc.
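To make the performance-measure idea concrete, here is a minimal Python sketch (an illustration, not part of the course material) of the two-square vacuum world. The scoring scheme, +10 per square cleaned and -1 per move, is one arbitrary choice among the criteria the slide lists.

```python
import random

def vacuum_agent(percept):
    """Suck if the current square is dirty; otherwise move to the other square."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

def run(agent, steps=20):
    """Simulate the two-square world and score the agent."""
    state = {"A": random.choice(["Clean", "Dirty"]),
             "B": random.choice(["Clean", "Dirty"])}
    location, score = "A", 0
    for _ in range(steps):
        action = agent((location, state[location]))
        if action == "Suck" and state[location] == "Dirty":
            state[location] = "Clean"
            score += 10                        # reward dirt cleaned up
        elif action == "Right":
            location, score = "B", score - 1   # small cost per move
        elif action == "Left":
            location, score = "A", score - 1
        # "NoOp" leaves the world unchanged
    return score

print(run(vacuum_agent))
```

A different performance measure (e.g., charging for electricity or noise) would rank agents differently, which is exactly why the measure must be fixed before behavior can be judged rational.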
Rational agents
• Rationality is distinct from omniscience (being all-knowing, with infinite knowledge)
• An agent is autonomous if its behavior is determined by its own percepts and experience (with the ability to learn and adapt), rather than depending solely on built-in knowledge

Task environment
• Before we design an intelligent agent, we must specify its task environment
  – Task environments are essentially the problems to which rational agents are the solutions
• Specification of a task environment: PEAS
  – Performance measure
  – Environment
  – Actuators
  – Sensors

PEAS
• Example: Agent = taxi driver
  – Performance measure: safe, fast, legal, comfortable trip; maximize profits
  – Environment: roads, other traffic, pedestrians, customers
  – Actuators: steering wheel, accelerator, brake, signal, horn
  – Sensors: cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard

PEAS
• Example: Agent = medical diagnosis system
  – Performance measure: healthy patient, minimize costs and lawsuits
  – Environment: patient, hospital, staff
  – Actuators: screen display (questions, tests, diagnoses, treatments, referrals)
  – Sensors: keyboard (entry of symptoms, findings, patient's answers)

PEAS
• Example: Agent = part-picking robot
  – Performance measure: percentage of parts placed in correct bins
  – Environment: conveyor belt with parts, bins
  – Actuators: jointed arm and hand
  – Sensors: camera, joint-angle sensors
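One hypothetical way to make a PEAS specification machine-readable is a small record type; the field names below simply mirror the four PEAS components, and the taxi-driver entries are copied from the slide above.

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """A PEAS description of a task environment."""
    agent: str
    performance_measure: list[str]
    environment: list[str]
    actuators: list[str]
    sensors: list[str]

taxi = PEAS(
    agent="taxi driver",
    performance_measure=["safe", "fast", "legal", "comfortable trip",
                         "maximize profits"],
    environment=["roads", "other traffic", "pedestrians", "customers"],
    actuators=["steering wheel", "accelerator", "brake", "signal", "horn"],
    sensors=["cameras", "sonar", "speedometer", "GPS", "odometer",
             "engine sensors", "keyboard"],
)
```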
Properties of task environments
• Fully observable (vs. partially observable): the agent's sensors give it access to the complete state of the environment at each point in time
  – Fully observable environments are convenient because the agent need not maintain any internal state to keep track of the world
  – An environment may be partially observable because of noisy or inaccurate sensors, or because part of the state is simply missing from the sensor data
    • A vacuum agent with only a local dirt sensor cannot tell whether there is dirt in the other square
    • An automated taxi cannot see what other drivers are thinking
• Deterministic (vs. stochastic): the next state of the environment is completely determined by the current state and the action executed by the agent (otherwise it is stochastic)
  – The automated taxi is stochastic: one cannot predict the traffic, the engine might seize up without warning, and a tire might go flat
  – The vacuum-cleaner agent as described is deterministic
• Episodic (vs. sequential): the agent's experience is divided into atomic episodes, and decisions in the current episode do not depend on decisions or actions taken in previous episodes
  – An agent that has to spot defective parts on an assembly line bases each decision on the current part, regardless of previous decisions, and the current decision does not affect whether the next part is defective
  – Chess and the automated taxi are sequential: short-term actions can have long-term consequences
• Static (vs. dynamic): the environment is unchanged while the agent is deliberating (semi-dynamic if the environment itself does not change with the passage of time but the agent's performance score does)
  – The automated taxi is dynamic: other cars and the taxi itself keep moving while the driving algorithm decides what to do next
  – Chess with a clock is semi-dynamic
  – Crossword puzzles are static
• Discrete (vs. continuous): a limited number of distinct, clearly defined percepts and actions
  – The automated taxi is a continuous-state, continuous-time problem: the speed and location of the taxi and the other vehicles sweep through a range of continuous values
  – Chess has a finite number of discrete states (ignoring the clock) and a discrete set of percepts and actions
• Single-agent (vs. multi-agent): an agent operating by itself in an environment (multi-agent if other agents affect the agent's performance measure)
  – An agent solving a crossword puzzle by itself is clearly in a single-agent environment
  – An agent playing chess is in a two-agent environment

Example task environments:

Task environment           Observable  Deterministic?  Episodic?   Static?  Discrete?   Agents
Crossword puzzle           fully       deterministic   sequential  static   discrete    single
Chess with a clock         fully       strategic       sequential  semi     discrete    multi
Poker                      partially   stochastic      sequential  static   discrete    multi
Backgammon                 fully       stochastic      sequential  static   discrete    multi
Taxi driving               partially   stochastic      sequential  dynamic  continuous  multi
Medical diagnosis          partially   stochastic      sequential  dynamic  continuous  single
Image analysis             fully       deterministic   episodic    semi     continuous  single
Part-picking robot         partially   stochastic      episodic    dynamic  continuous  single
Refinery controller        partially   stochastic      sequential  dynamic  continuous  single
Interactive English tutor  partially   stochastic      sequential  dynamic  discrete    multi

Structure of agents
• Five basic agent types, in order of increasing generality:
  – Table-driven agents
  – Simple reflex agents
  – Model-based reflex agents
  – Goal-based agents
  – Utility-based agents

Table-driven agent
• The current state of the decision process is found by table lookup over the entire percept history

Simple reflex agents
• No memory: the action is chosen from the current percept alone
• Fails if the environment is partially observable
• Example: vacuum-cleaner world

Simple reflex agents
• Simple reflex agents are of limited intelligence
  – They work only if the environment is fully observable
• Problem with partial observability
  – Infinite loops are often unavoidable for simple reflex agents working in partially observable environments

Model-based reflex agent
• Handles partial observability by keeping track of the part of the world it can't see
  – Maintains some sort of internal state that depends on the percept history
  – Reflects at least some of the unobserved aspects of the current state
• Updating the internal state requires two kinds of knowledge:
  – Information about how the world evolves independently of the agent
  – How the agent's own actions affect the world

Model-based reflex agents
• Maintains a description of the current world state
• Models the state of the world by:
  – modeling how the world changes
  – modeling how its actions change the world
• This can work even with partial information
• It is unclear what to do without a clear goal

Goal-based agents
• Goals provide a reason to prefer one action over another
• We need to predict the future: we need to plan and search

Utility-based agents
• Some solutions to goal states are better than others; which one is best is given by a utility function
• Handles conflicting goals, only some of which can be achieved (e.g., speed and safety)

Learning agents
• How does an agent improve over time? By monitoring its performance and suggesting better models, new action rules, etc.
• Critic: tells the learning element how well the agent is doing with respect to a fixed performance measure
• Learning element: changes the action rules
• Problem generator: suggests exploratory actions that will lead to new and informative experiences
• Performance element: what we previously considered to be the entire agent
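As a rough sketch (an assumed Python rendering, not the course's own code) of how the two reflex designs differ: the simple reflex program consults only the current percept, while the model-based program threads an internal state through every call.

```python
def simple_reflex_agent(rules):
    """Agent program with no memory: act on the current percept only."""
    def program(percept):
        return rules.get(percept, "NoOp")    # match a condition-action rule
    return program

def model_based_reflex_agent(rules, update_state, initial_state):
    """Agent program that tracks unobserved parts of the world."""
    state, last_action = initial_state, None
    def program(percept):
        nonlocal state, last_action
        # Fold the last action and the new percept into the world model
        state = update_state(state, last_action, percept)
        last_action = rules.get(state, "NoOp")
        return last_action
    return program

# Usage with the vacuum world's condition-action rules:
vacuum_rules = {("A", "Dirty"): "Suck", ("B", "Dirty"): "Suck",
                ("A", "Clean"): "Right", ("B", "Clean"): "Left"}
agent = simple_reflex_agent(vacuum_rules)
print(agent(("A", "Dirty")))   # -> Suck
```

The model-based variant only pays off when update_state encodes how the world evolves and how actions change it; with a trivial update it degenerates into the simple reflex agent.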