AIPS2000 Planning Competition Results

AIPS-2000
Planning Competition
Fahiem Bacchus
University of Toronto
4/17/00
Fahiem Bacchus
1
Overview
• AIPS-98 featured the first competition.
• 4 competitors in a STRIPS track, 2 in an ADL track
• This year's competition:
• 15 competitors
• A fully automatic track with STRIPS & ADL domains
• A hand-tailored track allowing domain-dependent information.
Made Possible By
• Michael Ady, Winter-City Software, Edmonton, Alberta,
Canada.
• John DiMarco, and the Department of Computer Science,
University of Toronto, systems support staff.
• The other members of the organizing committee: Henry
Kautz, David E. Smith, Derek Long, Hector Geffner, &
Jana Koehler
• The competitors who were willing to subject their work to
a very public scrutiny.
• Franz Inc., for providing a free copy of Allegro Common
Lisp for Linux.
The Competitors
1. Blackbox
Yi-Cheng Huang
Bart Selman
Cornell University
Henry Kautz
AT&T Research
• Constructs a Graphplan planning graph, converts it into a Boolean
satisfiability problem, and then attempts to solve that problem with
various satisfiability engines.
• Competed in AIPS-98.
• Competed in the fully automated track.
The Competitors
2. MIPS
Stefan Edelkamp
Malte Helmert
University of Freiburg
• “Intelligent Model checking and Planning System”
• Uses BDDs to compactly store and maintain sets of propositionally
represented states, and combines a heuristic symbolic search engine
with a heuristic single-state search engine.
• Competed in the fully automated track.
The Competitors
3. System R
Fangzhen Lin
Hong Kong University of Science and Technology
• Based on a regression/progression algorithm, like a sound version of
the original STRIPS algorithm.
• Competed in both the fully automated and hand-tailored tracks.
• In the hand-tailored track it used domain-specific information about
(1) the ordering of subgoals; (2) pruning of unachievable goals; and
(3) the way a subgoal is solved by regressing it to a new conjunctive
goal.
The Competitors
4. FF
Joerg Hoffmann
Albert Ludwigs University
• FF (Fast-Forward) employs heuristic search like HSP, but extends
the HSP heuristic to include information from GRAPHPLAN's plan
extraction phase.
• Employs a local search strategy that combines hill-climbing with
systematic search.
• Competed in the fully automated track.
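As a rough illustration of combining hill-climbing with systematic search, the sketch below runs a breadth-first search from the current state until it finds a state with a strictly better heuristic value, then commits to that state (a scheme usually called enforced hill-climbing). It is a minimal sketch under stated assumptions, not FF's implementation: the function `enforced_hill_climbing`, the number-line domain, and the heuristic table `H` are hypothetical names chosen for the example.

```python
from collections import deque


def enforced_hill_climbing(start, is_goal, successors, h):
    """Hill-climb on h, using breadth-first search to escape plateaus.

    successors(state) yields (action, next_state) pairs. Returns a list of
    actions, or None if no strictly better state can be reached.
    """
    state, plan = start, []
    while not is_goal(state):
        best = h(state)
        queue = deque([(state, [])])
        seen = {state}
        improved = None
        # Systematic part: breadth-first search for a better state.
        while queue and improved is None:
            current, path = queue.popleft()
            for action, nxt in successors(current):
                if nxt in seen:
                    continue
                seen.add(nxt)
                if h(nxt) < best or is_goal(nxt):
                    improved = (nxt, path + [action])
                    break
                queue.append((nxt, path + [action]))
        if improved is None:
            return None  # dead end: this scheme is incomplete
        # Hill-climbing part: commit to the improvement, never backtrack.
        state, plan = improved[0], plan + improved[1]
    return plan


# Hypothetical toy domain: walk from 0 to 6 on a number line.
# H has plateaus (equal neighbouring values) that plain hill-climbing
# could not cross.
H = [3, 3, 3, 2, 1, 1, 0]


def line_successors(s):
    for step in (1, -1):
        if 0 <= s + step < len(H):
            yield step, s + step


ehc_plan = enforced_hill_climbing(0, lambda s: s == 6, line_successors, lambda s: H[s])
```

Because each improvement is committed to without backtracking, a search of this kind can fail on problems that a complete search would solve.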
The Competitors
5. HSP2
Hector Geffner
Blai Bonet
Universidad Simon Bolivar
• A heuristic-search planner descended from the HSP planner that
competed in the AIPS-98 contest.
• Planning instances are mapped into state-space search problems that
are solved with heuristics extracted from the representation.
• HSP2 supports both forward and backward search and several
heuristics, and uses a weighted A* search.
• Competed in the fully automated track.
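For readers unfamiliar with weighted A*, the sketch below shows the general idea: nodes are expanded in order of f = g + w * h, so a weight w > 1 makes the search greedier at the cost of optimality guarantees. This is an illustrative sketch, not HSP2's code: the function `weighted_astar`, the weight value, and the number-line domain are assumptions made for the example.

```python
import heapq
from itertools import count


def weighted_astar(start, is_goal, successors, h, w=2.0):
    """Best-first search on f = g + w * h.

    successors(state) yields (action, next_state, cost) triples.
    Returns a list of actions, or None if the goal is unreachable.
    """
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(w * h(start), next(tie), 0, start, [])]
    best_g = {start: 0}
    while frontier:
        _, _, g, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        if g > best_g.get(state, float("inf")):
            continue  # stale queue entry; a cheaper path was found later
        for action, nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + w * h(nxt), next(tie), new_g, nxt, plan + [action]),
                )
    return None


# Hypothetical toy domain: reach 5 from 0 using steps of +1 or +2.
def step_successors(s):
    for step in (1, 2):
        yield step, s + step, 1


wa_plan = weighted_astar(0, lambda s: s == 5, step_successors, lambda s: abs(5 - s))
```

With w = 1 and an admissible heuristic this reduces to ordinary A*.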
The Competitors
6. IPP
Jana Koehler
Schindler Lifts Ltd.
Joerg Hoffmann
Michael Brenner
University of Freiburg
• IPP is based on searching GraphPlan planning graphs that have been
extended to handle ADL actions.
• Essentially the same system as that entered in the AIPS-98
competition.
• Competed in the fully automated track.
The Competitors
7. PropPlan
Michael Fourman
University of Edinburgh
• PropPlan uses naive breadth-first state-space search but employs
Ordered Binary Decision Diagrams to optimize its state-space
exploration.
• Operators are represented directly by efficient operations on BDDs.
• Like GraphPlan, the competition version of PropPlan utilizes forward
chaining to establish a layered set of reachable states until the
goal set is reached, then backward chaining for plan extraction.
• Competed in the fully automated track.
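The layered forward-chaining and backward-extraction scheme can be sketched with explicit Python sets standing in for BDDs. This is only an illustration of the control flow, not PropPlan's implementation: the function `layered_plan`, the tuple layout of actions, and the three-location `MOVES` domain are all assumptions made for the example.

```python
def layered_plan(init, goal_test, actions):
    """Forward-chain layer by layer, then extract a plan backward.

    init is a frozenset of facts; actions is a list of
    (name, preconditions, add_effects, delete_effects) with frozenset fields.
    Returns a list of action names, or None if the goal is unreachable.
    """
    layers = [{init}]
    reached = {init}
    # Forward phase: each layer holds the states first reached at that depth.
    while not any(goal_test(s) for s in layers[-1]):
        frontier = set()
        for s in layers[-1]:
            for _name, pre, add, delete in actions:
                if pre <= s:
                    t = (s - delete) | add
                    if t not in reached:
                        frontier.add(t)
        if not frontier:
            return None
        reached |= frontier
        layers.append(frontier)

    # Backward phase: walk the layers in reverse, picking at each step a
    # predecessor state and an action that produces the current target.
    target = next(s for s in layers[-1] if goal_test(s))
    plan = []
    for i in range(len(layers) - 1, 0, -1):
        for s in layers[i - 1]:
            step = next(
                (name for name, pre, add, delete in actions
                 if pre <= s and (s - delete) | add == target),
                None,
            )
            if step is not None:
                plan.append(step)
                target = s
                break
    plan.reverse()
    return plan


# Hypothetical toy domain: move from location a to c via b.
MOVES = [
    ("a_to_b", frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"})),
    ("b_to_c", frozenset({"at_b"}), frozenset({"at_c"}), frozenset({"at_b"})),
]

bfs_plan = layered_plan(frozenset({"at_a"}), lambda s: "at_c" in s, MOVES)
```

A BDD-based planner would represent each layer as a single Boolean function rather than enumerating states one by one, which is where the compactness comes from.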
The Competitors
8. TokenPlan
Yannick Meiller
Patrick Fabiani
ONERA - Center of Toulouse
• Based on the use of colored Petri nets, which can encode mutex
relations through the tokens' colors. Depending on how tokens are
propagated in the net, a range of search techniques can be emulated.
• Competed in the fully automated track.
The Competitors
9. STAN
Maria Fox
Derek Long
University of Durham
• STAN employs a hybrid of two planning strategies:
1. The original Graphplan-based STAN algorithm.
2. A forward planner using a heuristic function based on the length
of the relaxed plan (as in HSP and FF).
• Uses automatic domain analysis to select between these strategies.
• The domain analysis techniques include type and invariant detection,
as well as the automatic identification of certain combinatorial
optimization sub-problems.
• Competed in the fully automated track.
The Competitors
10. BDDPlan
Hans-Peter Stoerr
Dresden University of Technology
• BDDPlan uses BDDs to support reasoning in the Fluent Calculus, a
framework for reasoning about actions in first-order logic.
• Model checking algorithms are used to do an implicit breadth-first
search.
• Competed in the fully automated track.
The Competitors
11. AltAlt
Biplav Srivastava
Terry Zimmerman
BinhMinh Do
XuanLong Nguyen
Zaiqing Nie
Ullas Nambiar
Romeo Sanchez
Arizona State University
• AltAlt uses effective and admissible heuristics extracted from the
planning graph to drive a backward state-space search.
• Competed in the fully automatic track.
The Competitors
12. GRT
Ioannis Refanidis
Ioannis Vlahavas
Dimitris Vrakas
Aristotle University
• The GRT (Greedy Regression Table) planner is a heuristic state-space
planner. Like HSP, it employs a best-first search using estimated
distances to the goal.
• Competed in the fully automatic track.
The Competitors
13. PbR
Jose Luis Ambite
Craig Knoblock
Steve Minton
University of Southern California/Information Sciences Institute
• Planning by Rewriting (PbR) generates plans by using a set of plan
rewriting rules and local search techniques to transform an
easy-to-generate, poor-quality initial plan into a higher-quality plan.
• The rewriting rules were developed semi-automatically: some
proposed by a learning algorithm, some defined manually.
• The initial plan generators were hand-coded for each domain.
• Competed in the hand-tailored track.
The Competitors
14. SHOP
Dana Nau
Yue (Jason) Cao
Hector Munoz-Avila
Amnon Lotem
University of Maryland
• SHOP (Simple Hierarchical Ordered Planner) is an HTN planning
system that plans for tasks in the same order that they will later be
executed.
• This simplifies goal interactions and provides a complete world state
at each step of the planning process, as with forward-chaining planners.
• The complete state information allows SHOP to encode effective
domain-specific planning algorithms and knowledge.
• Competed in the hand-tailored track.
The Competitors
15. TALplanner
Jonas Kvarnstrom
Patrick Doherty
Patrik Haslum
Linkoping University
• TALplanner is a forward-chaining planner based on the TLPlan system
(Bacchus & Kabanza).
• Domain-dependent search control knowledge is expressed declaratively
as formulas of a temporal logic and used to control forward chaining.
• TALplanner uses TAL, a narrative temporal logic for reasoning about
action and change.
• Competed in the hand-tailored track.
The Results
Domain 1: Logistics World
• Move a set of packages between locations
using trucks within the same city and
airplanes between cities.
• Limited interaction between goals.
[Chart: Fully Automated Logistics Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. X-axis ticks: 4 to 15. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]
Planners doing well enough to scale to bigger problems:
• System R
• GRT
• HSP2
• STAN
• MIPS
• FF
[Chart: Fully Automated Logistics Time Comparison. Y axis: seconds, log scale, 0.1 to 1000. Planners: System R, FF, HSP2, STAN, GRT, MIPS.]
Logistics Domain Time
• FF and MIPS are a bit better.
• Then STAN and HSP2.
• Then GRT.
[Chart: Fully Automated Logistics # Steps Comparison. Y axis: 0 to 160. X-axis ticks: 4 to 15. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]
[Chart: Fully Automated Logistics # Steps Comparison. Y axis: 0 to 600. Planners: MIPS, System R, FF, HSP2, STAN, GRT.]
Logistics Plan Length
• STAN is generating the shortest plans.
• GRT, MIPS, and FF are about the same.
• HSP2 is a bit worse.
• System R generates very long plans in this domain.
Hand Tailored
[Chart: Hand Tailored Logistics Time Comparison. Y axis: seconds, log scale, 0.01 to 1000. Planners: System R, SHOP, TALplanner.]
[Chart: Hand Tailored Logistics # Steps Comparison. Y axis: 0 to 4000. X-axis ticks: 16 to 98. Planners: System R, SHOP, TALplanner.]
[Chart: Hand Tailored Logistics # Steps Comparison. Y axis: 0 to 700. Planners: SHOP, TALplanner.]
Hand Tailored Logistics
• TALplanner is extremely fast (the largest
problems in less than a second).
• SHOP is next, then System R.
• SHOP and TALplanner are generating short
plans (with SHOP a bit better).
• System R is generating very long plans.
Domain 2: Blocks World
• Stack a set of blocks.
• A high degree of interaction between goals.
• Easy for people, can be hard for automated
planners.
[Chart: Fully Automated Blocks Time Comparison. Y axis: seconds, log scale, 0.001 to 1000. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]
Blocks
• A wide variation of performance.
• Only FF, System R, and HSP2 are able to
solve the larger problems.
[Chart: Fully Automated Blocks Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. Planners: System R, FF, HSP2.]
Blocks
• System R scales very consistently in this
domain.
• FF can occasionally solve large problems
fast, but it has many misses.
[Chart: Fully Automated Blocks # Steps Comparison. Y axis: 0 to 250. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]
[Chart: Fully Automated Blocks # Steps Comparison. Y axis: 0 to 250. X-axis ticks: 17 to 50. Planners: System R, FF, HSP2.]
Blocks
• System R generates short plans.
• FF, when it succeeds, also generates short
plans.
• HSP2 can generate very long plans in this
domain.
Blocks Hand Tailored
[Chart: Blocks Hand Tailored Time Comparison. Y axis: seconds, log scale, 0.01 to 100. Planners: System R, PbR, SHOP, TALplanner.]
[Chart: Hand Tailored Blocks # Steps Comparison. Y axis: 0 to 400. Planners: System R, PbR, SHOP, TALplanner.]
Blocks—Hand Tailored
• TALplanner is very fast.
• System R is next, then SHOP and PbR.
• They all generate plans of a similar length.
Blocks Some Harder Problems
[Chart: Hand Tailored Blocks Time Comparison. Y axis: seconds, log scale, 0.1 to 100000. X-axis ticks: 100 to 500. Planners: PbR, System R, SHOP, TALplanner.]
Blocks
• TALplanner scales very well in this domain,
solving 500-block problems in about 1.5
seconds.
• System R also scales well enough to solve
the hardest problems.
Domain 3: Schedule World
• Machine a collection of parts.
• Goals are mostly non-interacting, but they
compete for “resources” (time on machines),
and different goals on the same part can
clobber one another.
• Originally a Prodigy domain.
Schedule World
• This is an ADL domain with many actions
requiring conditional effects.
• Only Mips, FF, HSP2, IPP, PropPlan, and
BDDPlan could deal with this domain.
[Chart: Fully Automated Schedule Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. X-axis ticks: 1 to 50. Planners: MIPS, FF, HSP2, IPP, PropPlan, BDDPlan.]
Schedule
• FF is the only planner that scales to the
harder problems on this domain.
• Solution lengths are roughly comparable.
Schedule—Hand Tailored
[Chart: Hand Tailored Schedule Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. X-axis ticks: 1 to 50. Planners: PbR, BDDPlan, TALplanner.]
[Chart: Hand Tailored Schedule # Steps Comparison. Y axis: 0 to 100. Planners: PbR, BDDPlan, TALplanner.]
Schedule—Hand Tailored
• TALplan is generating short solutions in
about 0.15 seconds on the largest problems.
• PbR takes longer and generates inferior
plans.
• Interestingly, FF takes 10-25 seconds on
the largest problems and generates slightly
longer solutions than TALplanner, but does so
fully automatically.
Domain 4: Freecell World
• Freecell is a solitaire card game that comes
with Microsoft Windows.
• Freecell demo.
[Chart: Fully Automatic Freecell Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. Planners: BlackBox, MIPS, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, GRT.]
[Chart: Fully Automatic Freecell # Steps Comparison. Y axis: 0 to 250. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, TokenPlan, STAN, GRT.]
Freecell
• STAN takes the least time, but it does not
solve the hardest problems and generates
long solutions.
• FF is best at the harder problems, but cannot
solve all the problems.
• HSP2 also solves some larger problems (but
takes a long time on them).
Freecell—Fully Automatic
[Chart: Hand Tailored Freecell Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. X-axis ticks: 1 to 13. Planners: System R, SHOP, TALplanner.]
[Chart: Hand Tailored Freecell # Steps Comparison. Y axis: log scale, 1 to 10000. Planners: System R, SHOP, TALplanner.]
Freecell
• TALplanner can be fast and solves all of the
problems, but it can generate very long
plans (4500 steps).
• SHOP generates reasonable plans in a
reasonable amount of time. System R takes
longer.
• None of these planners performs much
better than the fully automatic planners in
this domain.
Domain 5: Mic10 Elevator World
• Based on the Miconic-10 Elevator controller
developed by Schindler Lifts Ltd.
• Contributed by Jana Koehler.
• Problems involve controlling a sophisticated
elevator to move passengers to their destination.
• Various constraints on movement, including
priority passengers, passengers that must go nonstop, passengers that must be accompanied.
Miconic 10
Four separate problems:
• A simple STRIPS version.
• A simplified ADL version.
• Jana’s original ADL version.
• The original ADL version plus the
constraint that only 6 people can be on
board the elevator at a time.
[Chart: STRIPS Miconic 10 Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. X-axis ticks: 1 to 60. Planners: System R, TokenPlan, STAN, AltAlt, GRT.]
[Chart: STRIPS Miconic # Steps Comparison. Y axis: 0 to 180. Planners: System R, TokenPlan, STAN, AltAlt, GRT.]
STRIPS Miconic10
• STAN is slightly faster and generates
slightly shorter solutions.
• GRT also does well on both criteria.
[Chart: Simple Miconic 10 Time Comparison. Y axis: seconds, 0 to 70. Planner: HSP2.]
[Chart: Simple ADL Miconic10 # Steps Comparison. Y axis: 0 to 120. Planner: HSP2.]
Simple Miconic 10
• HSP2 was the only planner to successfully
solve the problems.
• Generates reasonable solutions quickly in
this domain.
[Chart: Full Miconic 10 Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. X-axis ticks: 2 to 60. Planners: FF, IPP, PropPlan.]
[Chart: Full Miconic 10 # Steps Comparison. Y axis: 0 to 120. Planners: FF, IPP, PropPlan.]
Full ADL-Miconic10
• PropPlan generates minimal length
solutions.
• FF is faster.
Miconic10—ADL+Constraint
[Chart: Miconic 10 + Constraint Time Comparison. Y axis: seconds, log scale, 0.01 to 10000. Planners: PropPlan, TALplanner.]
[Chart: Miconic 10 + Constraint # Steps Comparison. Y axis: 0 to 40. Planners: PropPlan, TALplanner.]
Miconic10 + Constraints
• PropPlan is generating minimal length
plans.
• TALPlan is generating reasonably good
plans, quickly.
• PropPlan is not using any domain specific
control.
Distinguished Planners
• Schindler Lifts Ltd. is providing a special
award for performance on the Miconic-10
domain.
Distinguished Planners
• Celcorp is providing a set of awards for
performance in the competition.
• There are many metrics, and it is impossible
to say that any one planner was the best.
• But some planners did distinguish
themselves in different ways.
• I have selected a set of such planners as
worthy of “special distinction.”
Distinguished Planners
Group B
• STAN
• HSP2
• MIPS
• System R
Distinguished Planners
Group A
• Two planners demonstrated performance
that was even more distinguished.
• TALplanner
• FF