belief state - Knowledge Representation & Reasoning at UIUC!

Decision Making Under Uncertainty
Lec #4: Planning and Sensing
UIUC CS 598: Section EA
Professor: Eyal Amir
Spring Semester 2006
Uses slides by José Luis Ambite, Son Tran, Chitta Baral and…
Paolo Traverso’s (http://sra.itc.it/people/traverso/) tutorial:
http://prometeo.ing.unibs.it/sschool/slides/traverso/traverso-slides.ps.gz
Some slides from http://www2.cs.cmu.edu/~mmv/planning/handouts/BDDplanning.pdf by Rune Jensen (http://www.itu.dk/people/rmj)
Last Time: Planning by Regression
• OneStepPlan(S) in the regression algorithm is
the backward image of the set of states S.
• Can be computed as the QBF formula:
$\exists x_{t+1}\,(\mathit{States}_{t+1}(x_{t+1}) \land R(x_t, a, x_{t+1}))$
• Quantified Boolean Formula (QBF):
$\exists x\,(x \land y) = (0 \land y) \lor (1 \land y)$
$\forall x\,(x \land y) = (0 \land y) \land (1 \land y)$
Last Time
• Planning with no observations:
– Can be done using belief states (sets of
states)
– Belief states can be encoded as OBDDs
• Complexity? – later today
• Other approaches:
– Use model-checking approaches
– Approximate belief state, e.g.,
(Petrick & Bacchus ’02, ’04)
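As a rough illustration of planning with belief states and no observations, the sketch below progresses an explicit set of states through an action. The three-cell corridor and the names `move_right` and `progress` are hypothetical; a real planner would typically encode the sets as OBDDs instead of Python sets.

```python
def move_right(cell, n_cells=3):
    """Deterministic move with a wall at the right end; returns successor states."""
    return {min(cell + 1, n_cells - 1)}


def progress(belief, action):
    """Image of a belief state: every state reachable from some member."""
    return frozenset(s2 for s in belief for s2 in action(s))


belief = frozenset({0, 1, 2})   # initial position unknown
for _ in range(2):
    belief = progress(belief, move_right)
print(sorted(belief))
```

After two moves the belief state has shrunk to a single cell, although nothing was ever observed.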
The Model Checking Problem
Determine whether a formula is true in a model
1. A domain of interest is described by a
semantic model
2. A desired property of the domain is described
by a logical formula
3. Check whether the domain satisfies the desired
property by checking whether the formula is
true in the model
Motivation: Formal verification of dynamic systems
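A very small explicit-state instance of this problem: the model is a transition graph, the property is an invariant ("prop holds in every reachable state"), and checking means exploring the reachable states. This is only a sketch under those assumptions; the traffic-light model and the names `reachable` and `always` are made up for illustration, and symbolic model checkers work on formulas or OBDDs rather than enumerating states.

```python
from collections import deque


def reachable(initial, successors):
    """Breadth-first search over the transition graph."""
    seen = set(initial)
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def always(initial, successors, prop):
    """True if prop holds in every state reachable from the initial states."""
    return all(prop(s) for s in reachable(initial, successors))


# Hypothetical model: a traffic light with an unreachable "off" state.
LIGHT = {"red": ["green"], "green": ["yellow"], "yellow": ["red"], "off": ["off"]}
print(always({"red"}, LIGHT, lambda s: s != "off"))
print(always({"red"}, LIGHT, lambda s: s != "yellow"))
```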
Now: Sensing Actions
• Current solutions for Nondeterministic
Planning:
– Conditional planning: condition on
observations that you make now
– Condition on belief state
Medication Example (Deterministic)
• Problem
– A patient is infected. He can take medicine, which cures him if
he is hydrated; if he takes it while not hydrated, he dies. To
become hydrated, the patient can drink. The check action lets us
determine whether the patient is hydrated.
• Goal: not infected and not dead.
• Classical planners cannot solve this kind of problem
because
– it contains incomplete information: we don’t know whether he is
initially hydrated or not.
– it has a sensing action: in order to determine whether he is
hydrated, the check action is required.
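One possible encoding of this domain, as a sketch. The fluent names and the functions `med`, `dr`, `chk` and `goal` are illustrative choices, and the model assumes that medicine taken while not hydrated is always fatal, which is one reading of the problem statement.

```python
from typing import NamedTuple


class State(NamedTuple):
    infected: bool
    hydrated: bool
    dead: bool


def med(s):
    """Take medicine: cured if hydrated, dead otherwise (assumed reading)."""
    return s._replace(infected=False) if s.hydrated else s._replace(dead=True)


def dr(s):
    """Drink: the patient becomes hydrated."""
    return s._replace(hydrated=True)


def chk(s):
    """Sensing action: returns an observation and leaves the state unchanged."""
    return s.hydrated


def goal(s):
    return not s.infected and not s.dead


# Incomplete information: hydration is unknown in the initial state.
INITIAL = [State(infected=True, hydrated=h, dead=False) for h in (True, False)]
print([goal(med(s)) for s in INITIAL])
```

Taking the medicine immediately reaches the goal in only one of the two possible initial states, which is why the unknown fluent matters.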
Planning with sensing actions and
incomplete information
• How to reason about the knowledge of agents?
• What is a plan?
– Conditional plans: contain sensing actions and
conditionals such as “if-then-else” structures
• In contrast - Conformant plans: a sequence of
actions which leads to the goal regardless of the
value of the unknown fluents in the initial state
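The two kinds of plan can be compared on the medication example with a small interpreter. This is a sketch under assumptions: plans are nested Python lists, a conditional is written as the tuple `("if", fluent, then_plan, else_plan)`, and the branch reads the true value of the fluent, which is only legitimate after a sensing action such as chk has made it known.

```python
# Hypothetical action model for the medication example.
ACTIONS = {
    "chk": lambda s: s,  # sensing only, no effect on the world
    "dr": lambda s: {**s, "hyd": True},
    "med": lambda s: {**s, "inf": False} if s["hyd"] else {**s, "dead": True},
}


def execute(plan, state):
    """Run a (possibly conditional) plan from one concrete world state."""
    for step in plan:
        if isinstance(step, tuple):  # ("if", fluent, then_plan, else_plan)
            _, fluent, then_plan, else_plan = step
            state = execute(then_plan if state[fluent] else else_plan, state)
        else:
            state = ACTIONS[step](state)
    return state


def goal(state):
    return not state["inf"] and not state["dead"]


CONDITIONAL = ["chk", ("if", "hyd", ["med"], ["dr", "med"])]
CONFORMANT = ["dr", "med"]
INITIAL = [{"inf": True, "hyd": h, "dead": False} for h in (True, False)]

for plan in (CONDITIONAL, CONFORMANT, ["med"]):
    print(plan, [goal(execute(plan, s)) for s in INITIAL])
```

In this particular model both the conditional plan and the sequence dr; med reach the goal from either initial state, while med alone does not.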
Plan tree examples
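[Figure: five plan trees, from the empty tree (nil) to trees over actions a, b, b1, b2, c, c1, c2, d, d1, d2 that branch on the fluents f, g, h. The corresponding plans are:]
[]
[a]
[a;b]
[a;b;if(f,c,d)]
a;if(f,[b1;if(g,c1,c2)]; [b2;if(h,d1,d2)])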
Plan trees (cont)
Example
[Figure: plan tree for the medication example on Path and Time axes. Nodes are labelled (1,1), (2,1), (2,2), (3,2); edges are labelled chk, hyd, med, dr.]
Why plan trees?
• Think of each node as a
state that the agent might
be in during the plan
execution.
• The root is the initial state.
• Every leaf can be the final
state.
• The goal is satisfied if it
holds in every final state,
i.e., in every leaf of the tree
[Figure: plan tree with nodes (1,1), (2,1), (2,2), (3,2) on Path and Time axes.]
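The leaf condition can be checked mechanically. The sketch below assumes states are frozensets of the fluents that are true, plans are nested lists with `("if", fluent, then_plan, else_plan)` tuples, and each conditional splits the current set of possible states in two; the function names `leaves` and `satisfied` are illustrative.

```python
# Hypothetical action model for the medication example.
ACTIONS = {
    "chk": lambda s: s,
    "dr": lambda s: s | {"hyd"},
    "med": lambda s: s - {"inf"} if "hyd" in s else s | {"dead"},
}


def leaves(plan, belief):
    """Return the sets of possible final states, one per leaf of the plan tree."""
    if not belief:
        return []            # branch that cannot occur
    if not plan:
        return [belief]
    step, rest = plan[0], plan[1:]
    if isinstance(step, tuple):  # ("if", fluent, then_plan, else_plan)
        _, fluent, then_plan, else_plan = step
        yes = frozenset(s for s in belief if fluent in s)
        no = belief - yes
        return leaves(then_plan + rest, yes) + leaves(else_plan + rest, no)
    return leaves(rest, frozenset(ACTIONS[step](s) for s in belief))


def goal(s):
    return "inf" not in s and "dead" not in s


def satisfied(plan, belief):
    """The goal must hold in every state of every leaf."""
    return all(goal(s) for leaf in leaves(plan, belief) for s in leaf)


INIT = frozenset({frozenset({"inf", "hyd"}), frozenset({"inf"})})
PLAN = ["chk", ("if", "hyd", ["med"], ["dr", "med"])]
print(len(leaves(PLAN, INIT)), satisfied(PLAN, INIT), satisfied(["med"], INIT))
```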
Limitations of Approach
• Can condition only on current sensing
• No accumulation of knowledge
• Forward-search approach – can we do
better?
• Our regression algorithm from last time:
– Regress, and allow merging of sets/actions
A,B when there is a sensing action that can
distinguish the members of A,B
Sensing Actions
• Current solutions for Nondeterministic
Planning:
– Conditional planning: condition on
observations that you make now
– Condition on belief state
Conditioning on Belief State
• Planning Domain D=<S,A,O,I,T,X>
– S set of states
– A set of actions
– O set of observations
– $I \subseteq S$ initial belief state
– $T \subseteq S \times A \times S$ transition relation (trans. model)
– $X \subseteq S \times O$ observation relation (obs. model)
Due to (Bertoli & Pistore; ICAPS 2004)
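The tuple D can be written down directly as a data structure. This is a minimal sketch, not the representation used in the cited paper: the class name `Domain`, the method `is_well_formed`, and the two-state hydration instance are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    S: frozenset  # states
    A: frozenset  # actions
    O: frozenset  # observations
    I: frozenset  # initial belief state, a subset of S
    T: frozenset  # transition relation: triples (s, a, s')
    X: frozenset  # observation relation: pairs (s, o)

    def is_well_formed(self):
        """Check the subset conditions stated in the definition."""
        return (
            self.I <= self.S
            and all(s in self.S and a in self.A and s2 in self.S
                    for s, a, s2 in self.T)
            and all(s in self.S and o in self.O for s, o in self.X)
        )


# Hypothetical instance: drinking makes a dry patient hydrated.
DOMAIN = Domain(
    S=frozenset({"dry", "hyd"}),
    A=frozenset({"dr"}),
    O=frozenset({"obs_dry", "obs_hyd"}),
    I=frozenset({"dry", "hyd"}),
    T=frozenset({("dry", "dr", "hyd"), ("hyd", "dr", "hyd")}),
    X=frozenset({("dry", "obs_dry"), ("hyd", "obs_hyd")}),
)
print(DOMAIN.is_well_formed())
```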
Conditioning on Belief State
• Plan P=<C,c0,act,evolve> for planning
domain D – what we need to find
– C set of belief states
• belief states = contexts in (Bertoli & Pistore ’04)
– $c_0 \in C$ initial belief state
– act: $C \times O \to A$ action function
– evolve: $C \times O \to C$ belief-state evolution func.
• Very similar to belief-state MDPs
• Represents an infinite set of executions
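Read this way, a plan is a finite-state controller driven by observations. The sketch below stores act and evolve as dictionaries and treats missing entries as "the plan stops here", which is an assumption of this example; the context names and observation names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    C: frozenset   # belief states (contexts)
    c0: str        # initial belief state
    act: dict      # (c, o) -> action
    evolve: dict   # (c, o) -> next belief state


def run_on_observations(plan, observations):
    """Feed a sequence of observations to the plan; return actions and final context."""
    c, actions = plan.c0, []
    for o in observations:
        actions.append(plan.act[(c, o)])
        c = plan.evolve[(c, o)]
    return actions, c


# Hypothetical plan: drink first if the patient is observed to be dry.
PLAN = Plan(
    C=frozenset({"start", "ready", "done"}),
    c0="start",
    act={("start", "obs_dry"): "dr",
         ("start", "obs_hyd"): "med",
         ("ready", "obs_hyd"): "med"},
    evolve={("start", "obs_dry"): "ready",
            ("start", "obs_hyd"): "done",
            ("ready", "obs_hyd"): "done"},
)
print(run_on_observations(PLAN, ["obs_dry", "obs_hyd"]))
print(run_on_observations(PLAN, ["obs_hyd"]))
```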
Conditioning on Belief State
• Configuration (s,o,c,a) for planning domain
D – a state of the executor
– $s \in S$ world state
– $o \in X(s)$ observation made in state s
– $c \in C$ belief state that the executor holds
– a = act(c,o) the action to be taken with this
belief state and observation
• How do we evolve a configuration?
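One plausible answer, assumed here rather than taken from the slides: (s,o,c,a) may evolve into (s',o',c',a') when (s,a,s') is in T, o' is in X(s'), c' = evolve(c,o) and a' = act(c',o'). The sketch below computes these successors for a small hypothetical domain and plan, and treats a configuration with no defined successor as terminal.

```python
# Hypothetical domain (T, X) and plan (ACT, EVOLVE) for illustration.
T = {("dry", "dr", "hyd"), ("hyd", "dr", "hyd"), ("hyd", "med", "cured")}
X = {("dry", "obs_dry"), ("hyd", "obs_hyd"), ("cured", "obs_cured")}
ACT = {("start", "obs_dry"): "dr",
       ("start", "obs_hyd"): "med",
       ("ready", "obs_hyd"): "med"}
EVOLVE = {("start", "obs_dry"): "ready",
          ("start", "obs_hyd"): "done",
          ("ready", "obs_hyd"): "done"}


def successors(config):
    """All configurations that may follow (s, o, c, a) under the assumed rule."""
    s, o, c, a = config
    c_next = EVOLVE.get((c, o))
    result = set()
    if c_next is None:
        return result
    for s1, a1, s_next in T:
        if (s1, a1) != (s, a):
            continue
        for sx, o_next in X:
            # a' = act(c', o') must be defined for the successor to exist.
            if sx == s_next and (c_next, o_next) in ACT:
                result.add((s_next, o_next, c_next, ACT[(c_next, o_next)]))
    return result


start = ("dry", "obs_dry", "start", "dr")
step1 = successors(start)
print(step1)
print(successors(next(iter(step1))))
```

The second call returns the empty set because act is undefined in the context "done", so the execution ends there.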
Example
A planning problem P for a planning domain D=<S,A,O,I,T,X>:
• $I \subseteq S$ is the set of initial states
• $G \subseteq S$ is the set of goal states
[Figure: the sets I and G.]
Example: Patient + Wait between
Check and Medication
[Figure: plan tree for the medication example on Path and Time axes. Nodes are labelled (1,1), (2,1), (2,2), (3,2); edges are labelled chk, hyd, med, dr.]
Left-Over Issues
• Limitation
• Languages for specifying nondeterministic
effects, sensing (similar to STRIPS?)
– Your Presentation
• Complexity
• Probabilistic domains – next class
Homework
1. Read readings for next time:
[Michael Littman; Brown U Thesis 1996]
chapter 2 (Markov Decision Processes)