Towards Model-lite Planning
A Proposal For Learning & Planning with Incomplete Domain Models
Sungwook Yoon
Subbarao Kambhampati
Supported by DARPA Integrated Learning Program
A Planning Problem
Suppose you have a super fast planner and a target application.
What is the first problem you have to solve? Is it a problem from the application?
Domain Engineering is hard → Model-lite Planning
Towards Model-lite Planning - Sungwook Yoon
Snapshot of the talk
• This is a proposal. We formulate learning and planning
problems and solution methods for them. We have tested
our ideas on some problems, but verification is still
ongoing
• We propose
– Representation for model-lite planning
• Probabilistic logic; incompleteness is quantified
• Explicit consideration of domain invariants
– Learning of the domain model
• Updating the probabilities and finding new axioms
– Planning with the model
• A deterministic planning domain with an incomplete model needs probabilistic planning
• Find the most plausible plan that respects the current domain model
Representation
• Precondition Axiom: p_Ai, A → pre_i
• Uncertainty is quantified as a probability
• Effect Axiom: e_Ai, A → effect_i
• Facilitates learning
Domain Model - Blocksworld
Precondition Axioms (relate actions with current-state facts):
• 0.9, Pickup(x) -> armempty()
• 1, Pickup(x) -> clear(x)
• 1, Pickup(x) -> ontable(x)
Effect Axioms (relate actions with next-state facts):
• 0.8, Pickup(x) -> holding(x)
• 0.8, Pickup(x) -> not armempty()
• 0.8, Pickup(x) -> not ontable(x)
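The weighted axioms above can be held in a small data structure. The following is only a sketch: the `Axiom` class, its field names, and the `uncertain` helper are hypothetical names introduced for illustration, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axiom:
    prob: float     # probability attached to the axiom
    action: str     # action schema, e.g. "Pickup(x)"
    literal: str    # fact implied by the action
    positive: bool  # False means the fact is negated ("not ...")
    kind: str       # "pre" = current-state fact, "eff" = next-state fact


# The Blocksworld Pickup axioms from the slide.
PICKUP_AXIOMS = [
    Axiom(0.9, "Pickup(x)", "armempty()", True, "pre"),
    Axiom(1.0, "Pickup(x)", "clear(x)", True, "pre"),
    Axiom(1.0, "Pickup(x)", "ontable(x)", True, "pre"),
    Axiom(0.8, "Pickup(x)", "holding(x)", True, "eff"),
    Axiom(0.8, "Pickup(x)", "armempty()", False, "eff"),
    Axiom(0.8, "Pickup(x)", "ontable(x)", False, "eff"),
]


def uncertain(axioms):
    """Return the axioms whose probability is below 1 (the incomplete part of the model)."""
    return [a for a in axioms if a.prob < 1.0]
```

Keeping precondition and effect axioms as separate records, rather than as one operator, mirrors the slide's point that each axiom can carry its own probability.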
Representation
• One modeling problem
• A conjunction of effects has different semantics if the
probability of each effect is specified independently
• Add a hidden variable O with axiom (e, A → O), then add deterministic
axioms for each effect: (1, O → eff1), (1, O → eff2), …
• We can also alleviate this problem with explicit domain
invariant properties
Effect Axiom (relates actions with next-state facts):
• 0.8, Pickup(x) -> holding(x)
• 0.8, Pickup(x) -> not armempty()
• 0.8, Pickup(x) -> not ontable(x)
Static Property (relates facts within a state):
• 1, holding(x) -> not armempty()
• 1, holding(x) -> not ontable(x)
• Writing explicit domain invariant properties is easier than
writing an initial state generator and a set of operators that
respect such properties
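A small numeric sketch shows why the semantics differ. It assumes the three Pickup effects are each given probability 0.8: treated as independent, the chance that all three hold together is 0.8 × 0.8 × 0.8, whereas with a hidden outcome variable O of probability 0.8 and deterministic axioms O → eff_i, all effects hold together with probability 0.8. The function names are hypothetical.

```python
import math


def joint_independent(effect_probs):
    """Probability that all effects hold when each is specified independently."""
    return math.prod(effect_probs)


def joint_with_hidden_outcome(outcome_prob, n_effects):
    """Probability that all effects hold when a hidden variable O (probability
    outcome_prob) implies each effect deterministically (probability 1)."""
    return outcome_prob * 1.0 ** n_effects


independent = joint_independent([0.8, 0.8, 0.8])  # about 0.512
coupled = joint_with_hidden_outcome(0.8, 3)       # 0.8
```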
Learning the domain model
• Given a trajectory of states and actions, S1, A1, S2, A2, …, Sn, An, Sn+1
– We can learn precondition axioms from (S1,A1), (S2,A2), …, (Sn,An)
– We can learn effect axioms from (A1,S2), (A2,S3), …, (An,Sn+1)
– We can learn domain invariant properties from each state (S1), …, (Sn+1)
– The weights (probabilities) of the axioms can be updated with a simple
perceptron update
• There are readily available packages for weighted logic learning
– Alchemy (MLN)
– ProbLog
• Structure learning
– Alchemy provides structure learning too
– We can also enumerate all the possible axioms (very costly for planning)
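The way a trajectory splits into training pairs can be sketched as follows. This is an illustration under stated assumptions, not the proposed learner: it uses plain frequency counting in place of the perceptron update or Alchemy, works on ground facts only, and the function name and the toy trajectory are made up.

```python
from collections import Counter


def estimate_axiom_weights(states, actions):
    """Estimate weights for axioms 'A -> fact' from S1, A1, ..., An, Sn+1.

    states  : list of sets of ground facts, length n + 1
    actions : list of ground action names, length n
    Returns (pre, eff): dicts mapping (action, fact) to the fraction of
    occurrences of the action in which the fact held before / after it.
    """
    if len(states) != len(actions) + 1:
        raise ValueError("need exactly one more state than actions")
    seen = Counter()
    pre_hits, eff_hits = Counter(), Counter()
    for i, action in enumerate(actions):
        seen[action] += 1
        for fact in states[i]:       # pair (S_i, A_i): precondition evidence
            pre_hits[(action, fact)] += 1
        for fact in states[i + 1]:   # pair (A_i, S_i+1): effect evidence
            eff_hits[(action, fact)] += 1
    pre = {key: hits / seen[key[0]] for key, hits in pre_hits.items()}
    eff = {key: hits / seen[key[0]] for key, hits in eff_hits.items()}
    return pre, eff


# Toy trajectory: the second pickup_a fails and leaves the state unchanged.
table = {"armempty", "clear_a", "ontable_a"}
states = [table, {"holding_a"}, table, table]
actions = ["pickup_a", "putdown_a", "pickup_a"]
pre, eff = estimate_axiom_weights(states, actions)
```

Here `pre[("pickup_a", "armempty")]` reflects how often the arm was empty when pickup_a was taken, and `eff[("pickup_a", "holding_a")]` how often the block ended up held.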
Model-lite planning Probabilistic
Planning
• As stated before, with incomplete domain
knowledge, a deterministic planning domain
should be treated as a probabilistic domain
• The resulting plan should be maximally consistent
with the current domain model
• We develop a planning technique for this purpose
– A plan that is maximally plausible, given the
probabilistic axioms, initial state and goal
• MPE solution to a Bayes Net problem
– Build on plangraph
Probabilistic Plangraph
[Figure: probabilistic plangraph for a two-block Blocksworld (blocks A and B).
Proposition level 1: clear_a, clear_b, armempty, ontable_a, ontable_b.
Action level 1: pickup_a, pickup_b, and a noop for each level-1 proposition.
Proposition level 2: level 1 plus holding_a, holding_b.
Action level 2: pickup_a, pickup_b, stack_a_b, stack_b_a, and a noop for each level-2 proposition.
Proposition level 3: level 2 plus on_a_b, on_b_a.
Edge weights shown: 0.8. Red lines indicate mutexes.]
• Domain invariant properties can be asserted too
• How do we generate a weighted clause?
– 0.95, pickup_b’ v holding_b
Can we view the probabilistic
plangraph as a Bayes net?
[Figure: the two-block Blocksworld plangraph (propositions clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b, on_b_a; actions pickup_a, pickup_b, stack_a_b, stack_b_a, and noops) read as a Bayes net. Labels shown: 0.5 at the first proposition level, 0.8 on edges, and 0.9 on the domain invariant property. Evidence variables are marked.]
• Domain invariant properties can be asserted too (0.9)
How do we find a solution?
• MPE (Most Probable Explanation)
• Solvers for it are available
MPE as MaxSat
• Prior work by James D. Park, AAAI 2002
• Set –log(P) as the weight of the clauses

A  B  P
T  T  0.7
F  T  0.3
T  F  0.2
F  F  0.8

Weighted Clauses
-log 0.7   –A v –B
-log 0.3    A v –B
-log 0.2   –A v B
-log 0.8    A v B

Intuitive explanation
• Each clause is violated by exactly one row of the table, and a high-probability row has a small weight, so violating its clause is cheap
• Thus the MaxSat solution, which minimizes the total weight of violated clauses, gives the highest-probability instantiation
• For A -> B (T T: 1, T F: 0), the clause –A v B gets infinite weight, which complies with our intuitive understanding
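The encoding of the two-variable table can be checked with a brute-force search. This is a minimal sketch, not a real MaxSat solver: it enumerates all assignments, which is only feasible for tiny examples, and the clause representation and function names are assumptions made for illustration.

```python
import itertools
import math

# Each clause is (weight, literals). A literal (var, value) is satisfied when
# the variable takes that value; a clause is satisfied when any literal is.
CLAUSES = [
    (-math.log(0.7), [("A", False), ("B", False)]),  # violated only by A=T, B=T
    (-math.log(0.3), [("A", True), ("B", False)]),   # violated only by A=F, B=T
    (-math.log(0.2), [("A", False), ("B", True)]),   # violated only by A=T, B=F
    (-math.log(0.8), [("A", True), ("B", True)]),    # violated only by A=F, B=F
]


def violated_weight(assignment, clauses):
    """Sum the weights of all clauses that the assignment fails to satisfy."""
    total = 0.0
    for weight, literals in clauses:
        if not any(assignment[var] == value for var, value in literals):
            total += weight
    return total


def brute_force_maxsat(variables, clauses):
    """Return (cost, assignment) minimizing the total weight of violated clauses."""
    best = None
    for values in itertools.product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        cost = violated_weight(assignment, clauses)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


cost, assignment = brute_force_maxsat(["A", "B"], CLAUSES)
probability = math.exp(-cost)  # recovers the table entry of the chosen row
```

Because every assignment violates exactly one clause, its cost is –log of its table entry, so the cheapest assignment is the row with the largest probability.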
Probabilistic Plangraph to MaxSat
[Figure: the two-block Blocksworld plangraph (propositions clear_a, clear_b, armempty, ontable_a, ontable_b, holding_a, holding_b, on_a_b, on_b_a; actions pickup_a, pickup_b, stack_a_b, stack_b_a, and noops) with each probability label replaced by a clause weight. Labels shown: -log0.5 at the first proposition level, -log0.8 on edges, and -log0.9 on the domain invariant property. Evidence variables are marked.]
• Domain invariant properties can be asserted too (-log0.9)
• For each probabilistic weight p, we give –log(1-p). That’s it.
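One reading of the –log(1-p) rule is sketched below. It assumes an axiom "p, antecedent -> consequent" becomes the clause (not antecedent) v consequent, which is violated only when the antecedent holds and the consequent does not, an event taken here to have probability 1 - p. The function name and the clause representation are hypothetical.

```python
import math


def axiom_to_clause(p, antecedent, consequent):
    """Turn the weighted axiom 'p, antecedent -> consequent' into (weight, clause).

    The clause is a list of (variable, required_value) literals standing for
    (not antecedent) v consequent. Its weight is -log(1 - p); a deterministic
    axiom (p = 1) becomes a hard clause with infinite weight.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    weight = math.inf if p == 1.0 else -math.log(1.0 - p)
    return weight, [(antecedent, False), (consequent, True)]


# Edge pickup_b -> holding_b with probability 0.8 from the plangraph.
soft_weight, soft_clause = axiom_to_clause(0.8, "pickup_b", "holding_b")
# A domain invariant asserted with certainty becomes a hard clause.
hard_weight, hard_clause = axiom_to_clause(1.0, "holding_b", "not_armempty")
```

The more certain an axiom is, the more expensive it is to violate its clause, which matches the infinite weight given to deterministic implications.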
Exploding Blocksworld
Current Status (ongoing)
• Learning test
– Generated random-wandering Blocksworld data and fed it
to Alchemy together with correct and incorrect axioms
– Alchemy assigned higher weights to the correct axioms and lower
weights to the incorrect axioms
• Planning test – Tested on probabilistic planning problems
– Hand tested on a couple of instances of the Slippery Gripper
domain
• Hand encoded the clauses and assigned the weights
• Passed the resulting clauses to a MaxSat solver
• Got the desired results
– On Exploding Blocksworld
• Implemented a generic MaxSat encoder for probabilistic planning
problems
• Tested on a couple of problems from Exploding Blocksworld
• Finds the desired output frequently (not always)
Summary
• We can learn precondition axioms and effect
axioms separately.
– A -> Prec, A->Effect
– Facilitates the learning
• Domain axioms or invariant properties can be
provided, learned, and used explicitly
– This is easier for the domain modeler
• For planning, we can apply the probabilistic
plangraph approach
– We proposed using MaxSat to solve probabilistic
planning problems
– An interesting parallel to encoding deterministic planning as SAT
Domain Learning – Related Work
• Logical Filtering (Chang & Amir, ICAPS’06)
– Update belief state and domain transition model
– Experiments involved planning
• Probabilistic operator learning (Zettlemoyer,
Pasula and Kaelbling, AAAI’05)
– Experiments involved planning
• ARMS (Yang, Wu and Jiang, ICAPS’05)
– No observation besides initial state and goal
Probabilistic Planning in Plangraph – Related Work
• PGraphplan, Paragraph
• Both search for plans in the Graphplan
framework
• PGraphplan searches for a consistent plan that
maximizes the goal-reaching probability
– Forward probability propagation
• Paragraph searches for a plan that minimizes
the cost to reach the goal
– Backward plan search