Rational Agent - Wichita State University

CS 771
Artificial Intelligence
Intelligent Agents
What is AI?
• Views of AI fall into four categories
1. Thinking humanly
2. Acting humanly
3. Thinking rationally
4. Acting rationally
Acting/Thinking Humanly/Rationally
• Acting humanly:
– Natural language processing
– Knowledge representation
– Automated reasoning
– Machine learning
– Vision and robotics (for the total Turing Test)
• Thinking humanly:
– Introspection; the General Problem Solver (Newell and
Simon, 1961)
– Cognitive sciences
• Thinking rationally:
– Logic
– Problems: how to represent and reason in a domain
• Acting rationally:
– Agents: Perceive and act
Agents and environments
• An agent perceives its environment via sensors and acts in the
environment with its actuators
• Environment is an agent's world
• Agents include humans, robots, softbots, thermostats, etc.
• The agent function maps from percept histories to actions:
f: P* → A
• The agent program runs on the physical architecture to produce f
• agent = architecture + program
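A minimal sketch of this abstraction in Python (the class and method names here are illustrative, not from any particular library):

```python
# Sketch: the agent function f maps percept histories (P*) to actions.
# 'Agent' and 'program' are illustrative names, not a standard API.
class Agent:
    def __init__(self, program):
        self.program = program        # the agent program
        self.percept_history = []     # P*: everything perceived so far

    def act(self, percept):
        self.percept_history.append(percept)
        # f: P* -> A, realized by running the program on the full history
        return self.program(self.percept_history)
```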
Perception and action
• Perception: Sensors receive input from environment
– Keyboard clicks
– Camera data
– Thermostat
Percept: The state of all perceptual inputs at an instant
Percept sequence: Complete history of all prior percepts
• Action: Actuators impact the environment
– Move a robotic arm
– Generate output for computer display
– Turn on the furnace
Agents
• Human agent
– Sensors: eyes, ears, nose, etc.
– Actuators: hands, legs, mouth, and other body parts
• Air-conditioning agent
– Sensors: thermostat
– Actuators: furnace
• Robotic agent: Part-picking robot
– Sensors: cameras and infrared range finders
– Actuators: various motors
• Computer agent: Softbot
– Sensors: keyboard, microphone, touchpad, joystick, ports (USB)
– Actuators: screen, speaker, ports (network)
Vacuum-cleaner agent
• Percepts: location and state of the environment, e.g.,
[A,Dirty], [B,Clean]
• Actions: Left, Right, Suck, NoOp
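As a sketch, this percept/action vocabulary might be encoded as follows (the world representation is an assumption for illustration):

```python
# Hypothetical two-square vacuum world. A percept pairs the agent's
# location with that square's status, e.g. ("A", "Dirty").
ACTIONS = ("Left", "Right", "Suck", "NoOp")

def make_percept(world, location):
    """world maps square -> dirtiness, e.g. {"A": True, "B": False}."""
    return (location, "Dirty" if world[location] else "Clean")
```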
Vacuum-cleaner agent
Is this intelligent behavior?
Vacuum-cleaner agent
• A table of all possible percept sequences and associated actions
• Pro: Designer determines intelligent behavior
– External characterization of the agent
• Con: Table is exponentially large
– Chess would have 10^150 entries (only about 10^80 atoms in the
universe): far too much memory
– Long search time
– Human errors while constructing the table
– Time consuming to build
– Inflexible once built
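A table-driven vacuum agent could look like the sketch below; only a few one-step entries are shown, since a complete table needs one entry per possible percept sequence, which is exactly the exponential blow-up noted above:

```python
# Table-driven agent (sketch): look the action up from the entire
# percept sequence. A real table grows exponentially with history length.
TABLE = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("B", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    # ... one entry for every possible percept sequence
}

def table_driven_agent(percept_history):
    return TABLE.get(tuple(percept_history), "NoOp")
```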
Good behavior: The concept of rationality
• Rational behavior is doing the best you can with what you
know
– To do the “right thing” you need goals
• Given a percept sequence, choose the action expected to
maximize success in terms of some predefined goals for
behavior
– Choosing involves combining percepts with knowledge
– Choosing sometimes necessitates seeking additional percepts
• Rational behavior is based on four things:
– Performance measure
– Agent’s prior knowledge
– Actions the agent can perform
– Agent’s percept sequence
Rational agents
• Rational Agent: For each possible percept sequence, a
rational agent should select an action that is expected to
maximize its performance measure, based on the evidence
provided by the percept sequence and whatever built-in
knowledge the agent has.
• Performance measure: An objective criterion for success of
an agent's behavior
• E.g., performance measure of a vacuum-cleaner agent could
be amount of dirt cleaned up, amount of time taken, amount
of electricity consumed, amount of noise generated, etc.
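As an illustration only, such criteria could be combined into a single score; the weights below are assumptions, not from the slides:

```python
# Illustrative vacuum-agent performance measure: reward dirt cleaned,
# penalize time, electricity, and noise. Weights are arbitrary choices.
def performance_measure(dirt_cleaned, time_taken, electricity, noise):
    return 10 * dirt_cleaned - time_taken - 2 * electricity - noise
```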
Rational agents
• Rationality is distinct from omniscience (all-knowing
with infinite knowledge)
• An agent is autonomous if its behavior is determined
by its own percepts & experience (with ability to
learn and adapt) without depending solely on built-in knowledge
Task Environment
• Before we design an intelligent agent, we must
specify its task environment
– Task environments are essentially the problems to which
rational agents are the solutions
• Specification of task environment : PEAS
– Performance measure
– Environment
– Actuators
– Sensors
PEAS
• Example: Agent = taxi driver
– Performance measure: Safe, fast, legal, comfortable trip,
maximize profits
– Environment: Roads, other traffic, pedestrians, customers
– Actuators: Steering wheel, accelerator, brake, signal, horn
– Sensors: Cameras, sonar, speedometer, GPS, odometer,
engine sensors, keyboard
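A PEAS specification is just structured data, so it can be captured directly; this hypothetical record mirrors the taxi-driver slide:

```python
from dataclasses import dataclass

# Hypothetical container for a PEAS specification.
@dataclass
class PEAS:
    performance: list[str]
    environment: list[str]
    actuators: list[str]
    sensors: list[str]

taxi_driver = PEAS(
    performance=["safe", "fast", "legal", "comfortable trip", "maximize profits"],
    environment=["roads", "other traffic", "pedestrians", "customers"],
    actuators=["steering wheel", "accelerator", "brake", "signal", "horn"],
    sensors=["cameras", "sonar", "speedometer", "GPS", "odometer",
             "engine sensors", "keyboard"],
)
```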
PEAS
• Example: Agent = Medical diagnosis system
– Performance measure: Healthy patient, minimize costs, lawsuits
– Environment: Patient, hospital, staff
– Actuators: Screen display (questions, tests, diagnoses,
treatments, referrals)
– Sensors: Keyboard (entry of symptoms, findings, patient's
answers)
PEAS
• Example: Agent = Part-picking robot
– Performance measure: Percentage of parts in correct
bins
– Environment: Conveyor belt with parts, bins
– Actuators: Jointed arm and hand
– Sensors: Camera, joint angle sensors
Properties of task environment
• Fully observable (vs. partially observable): An agent's
sensors give it access to the complete state of the
environment at each point in time.
– Fully observable environments are convenient because the agent
need not maintain any internal state to keep track of the world
– An environment might be partially observable because of noisy
and inaccurate sensors or because part of the state is simply
missing from the sensor data
• A vacuum-cleaning agent with a local dirt sensor cannot tell whether there is
dirt in the other square
• An automated taxi cannot see what the other drivers are thinking
Properties of task environment
• Deterministic (vs. stochastic): The next state of the
environment is completely determined by the current state
and the action executed by the agent (otherwise it is
stochastic)
– The automated taxi's environment is stochastic
• One cannot predict the traffic
• The engine might seize up without warning
• Possibility of a flat tire
– The vacuum-cleaning agent as described is deterministic
Properties of task environment
• Episodic (vs. sequential): The agent’s experience is divided into
atomic episodes. Decisions during the current episode do not
depend on decisions/actions taken during previous
episodes
– An agent that has to spot defective parts on an assembly line bases
each decision on the current part, regardless of previous decisions;
moreover, the current decision does not affect whether the next
part is defective
– Chess and automated taxi agents are sequential
• Short-term actions can have long-term consequences
Properties of task environment
• Static (vs. dynamic): The environment is unchanged while an
agent is deliberating. (The environment is semidynamic if the
environment itself does not change with the passage of time
but the agent's performance score does)
– The automated taxi's environment is dynamic
• Other cars and the taxi itself keep moving while the driving algorithm decides what
to do next
– Chess agent with a clock is semi-dynamic
– Crossword puzzles are static
Properties of task environment
• Discrete (vs. continuous): A limited number of distinct, clearly
defined percepts and actions.
– Automated taxi driving is a continuous-state and continuous-time
problem
• Speed and location of taxi and other vehicles sweep through a range of continuous
values
– Chess environment has finite number of discrete states (without the
clock) and also a discrete set of percepts and actions
Properties of task environment
• Single agent (vs. multi-agent): An agent operating by itself in
an environment (multi-agent if other agents interfere with
the agent’s performance measure)
– An agent solving a crossword puzzle by itself is clearly in a single-agent
environment
– An agent playing chess is in a two-agent environment
Task environment       | Observable | Determ./stochastic | Episodic/sequential | Static/dynamic | Discrete/continuous | Agents
-----------------------|------------|--------------------|---------------------|----------------|---------------------|-------
Crossword puzzle       | fully      | determ.            | sequential          | static         | discrete            | single
Chess with clock       | fully      | strategic          | sequential          | semi           | discrete            | multi
Poker                  | partial    | stochastic         | sequential          | static         | discrete            | multi
Backgammon             | fully      | stochastic         | sequential          | static         | discrete            | multi
Taxi driving           | partial    | stochastic         | sequential          | dynamic        | continuous          | multi
Medical diagnosis      | partial    | stochastic         | sequential          | dynamic        | continuous          | single
Image analysis         | fully      | determ.            | episodic            | semi           | continuous          | single
Part-picking robot     | partial    | stochastic         | episodic            | dynamic        | continuous          | single
Refinery controller    | partial    | stochastic         | sequential          | dynamic        | continuous          | single
Interactive Eng. tutor | partial    | stochastic         | sequential          | dynamic        | discrete            | multi
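The six properties can likewise be encoded as data; this sketch (names assumed) shows two rows of the table above:

```python
from dataclasses import dataclass

# Illustrative encoding of the six task-environment properties.
@dataclass
class TaskEnvironment:
    observable: str   # "fully" / "partial"
    dynamics: str     # "determ." / "strategic" / "stochastic"
    episodes: str     # "episodic" / "sequential"
    change: str       # "static" / "semi" / "dynamic"
    state: str        # "discrete" / "continuous"
    agents: str       # "single" / "multi"

crossword = TaskEnvironment("fully", "determ.", "sequential",
                            "static", "discrete", "single")
taxi = TaskEnvironment("partial", "stochastic", "sequential",
                       "dynamic", "continuous", "multi")
```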
Structure of agents
• Five basic agent types in order of increasing generality:
– Table Driven agents
– Simple reflex agents
– Model-based reflex agents
– Goal-based agents
– Utility-based agents
Table Driven Agent
• The current state of the decision process is determined by a table
lookup over the entire percept history
Simple reflex agents
• No memory
• Fails if the environment is partially observable
• Example: the vacuum-cleaner world
Simple reflex agents
• Simple reflex models are of limited intelligence
– Works only if the environment is fully observable
• Problem with partial observability
– Infinite loops are often unavoidable for simple reflex agents working
in partially observable environments
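A sketch of a simple reflex agent for the vacuum world, decided entirely by condition-action rules over the current percept:

```python
# Simple reflex vacuum agent: decides from the current percept only,
# with no memory of earlier percepts.
def simple_reflex_vacuum_agent(percept):
    location, status = percept        # e.g. ("A", "Dirty")
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:                             # location == "B"
        return "Left"
```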
Model-based reflex agent
• Handles partial observability by keeping track of the part
of the world it can’t see
– Maintains some sort of internal state that depends on the
percept history
– Reflects at least some of the unobserved aspects of the current
state
• Updating the internal state requires two kinds of knowledge (see the
sketch after this list):
– Information about how the world evolves independent of the
agent
– How the agent’s own actions affect the world
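A sketch of this structure (all names illustrative; the two models correspond to the two kinds of knowledge just listed):

```python
# Model-based reflex agent (sketch). 'transition_model' captures how the
# world evolves and how actions change it; 'sensor_model' folds the new
# percept into the internal state estimate. Both models must accept
# last_action=None on the very first call.
class ModelBasedReflexAgent:
    def __init__(self, transition_model, sensor_model, rules, initial_state):
        self.state = initial_state
        self.transition_model = transition_model
        self.sensor_model = sensor_model
        self.rules = rules            # maps state -> action
        self.last_action = None

    def act(self, percept):
        # Predict how the world evolved, then correct with the new percept.
        self.state = self.transition_model(self.state, self.last_action)
        self.state = self.sensor_model(self.state, percept)
        self.last_action = self.rules(self.state)
        return self.last_action
```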
Model-based reflex agents
• Maintains a description of the current world state
• Models the state of the world by:
– modeling how the world changes
– modeling how its actions change the world
• This can work even with partial information
• It is unclear what to do without a clear goal
Goal-based agents
Goals provide a reason to prefer one action over another.
We need to predict the future: we need to plan and search.
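A minimal illustration of "predict and plan": breadth-first search over a hypothetical successor function, returning the first action of a plan that reaches a goal state:

```python
from collections import deque

# Goal-based action selection (sketch): search forward for a plan, then
# execute its first step. 'successors' yields (action, next_state) pairs.
def goal_based_action(start, is_goal, successors):
    frontier = deque([(start, [])])   # (state, plan so far)
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan[0] if plan else "NoOp"
        for action, next_state in successors(state):
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, plan + [action]))
    return "NoOp"                     # no plan reaches a goal
```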
Utility-based agents
Some solutions to goal states are better than others.
Which one is best is given by a utility function.
Handles conflicting goals, only some of which can be achieved (e.g., speed vs. safety).
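A sketch of utility-based selection for the speed/safety trade-off (the weights are assumptions for illustration):

```python
# Utility-based choice: score predicted outcomes with a utility function
# that trades off conflicting goals; here safety is weighted above speed.
def utility(speed, safety):
    return 0.3 * speed + 0.7 * safety   # arbitrary illustrative weights

def best_action(candidates):
    """candidates: iterable of (action, predicted_speed, predicted_safety)."""
    return max(candidates, key=lambda c: utility(c[1], c[2]))[0]
```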
Learning agents
• How does an agent improve over time?
– By monitoring its performance and suggesting better modeling, new
action rules, etc.
• Critic: tells the learning element how well the agent is doing with
respect to a fixed performance measure
• Learning element: changes the action rules
• Problem generator: suggests exploration, i.e., actions that will lead
to new and informative experience
• Performance element: the entire agent as considered previously
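A toy sketch of this feedback loop (every object here is hypothetical; the point is only the division of labor between critic and learning element):

```python
# Illustrative learning loop: a critic scores behavior against a fixed
# performance measure; the learning element adjusts the action rules.
def learning_loop(agent, environment, critic, learning_element, steps=100):
    for _ in range(steps):
        percept = environment.percept()      # hypothetical environment API
        action = agent.act(percept)
        environment.apply(action)
        feedback = critic(environment)       # how well is the agent doing?
        learning_element(agent, feedback)    # update the agent's rules
```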