slides

Flow-based Heuristics for Optimal
Planning: Landmarks and Merges
Blai Bonet
Universidad Simon Bolivar
Caracas, Venezuela
[email protected]
Menkes van den Briel
NICTA & Australian National University
Canberra, Australia
[email protected]
Outline
• Background and motivation
• Base model
• Strengthening the base model
– Adding constraints derived from landmarks
– Adding constraints derived from merges
• Results
• Conclusions
Background
An LP-Based Heuristic for Optimal
Planning. Van den Briel, Benton,
Kambhampati, and Vossen. CP2007.
An Admissible Heuristic for SAS+
Planning Obtained from the State
Equation. Bonet. IJCAI2013.
Flow-based linear programming model
• SAS+ planning task with action cost 𝑃 =< 𝑉, 𝐴, 𝑠0 , 𝑠∗ , 𝑐 >
– 𝑉 set of variables each with a finite domain 𝐷𝑋
– 𝐴 set of actions < 𝑃𝑟𝑒, 𝑃𝑜𝑠𝑡, 𝑃𝑟𝑒𝑣 >
– 𝑠0 initial state
– 𝑠∗ goal state (partial valuation)
– 𝑐 non-negative action cost
• Domain transition graph (DTG)
– Node for each value in 𝐷𝑋
– Labeled arc for each transition 𝑇𝑋
𝑇𝑋 = 𝑥, 𝑎, 𝑥 ′ : 𝑋 = 𝑥 ∈ 𝑃𝑟𝑒, 𝑋 = 𝑥′ ∈ 𝑃𝑜𝑠𝑡 ∪
⊥, 𝑎, 𝑥 ′ : 𝑋 ∉ 𝑉𝑎𝑟 𝑃𝑟𝑒 , 𝑋 = 𝑥′ ∈ 𝑃𝑜𝑠𝑡 ∪
𝑥, 𝑎, 𝑥 : 𝑋 = 𝑥 ∈ 𝑃𝑟𝑒𝑣
Example
T
P
1
2
Load(P, T, 1)
Unload(P, T, 1)
1
1
Load(P, T, 1)
Unload(P, T, 1)
2
Drive(T, 1, 2)
Drive(T, 2, 1)
2
Load(P, T, 2)
Unload(P, T, 2)
Load(P, T, 2)
Unload(P, T, 2)
T
DTG(P)
DTG(T)
Flow-based linear programming model
• A flow for a planning task 𝑃 =< 𝑉, 𝐴, 𝑠0 , 𝑠∗ , 𝑐 > is a function
𝑓: 𝐴 → ℝ+ mapping action labels into non-negative real
numbers
• Minimize
𝑎∈𝐴 𝑐
𝑎 𝑓(𝑎)
• Subject to
(𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎)
−
𝑥,𝑎,𝑥 ′ ∈𝑇𝑋 𝑓
𝑎 =0
Flow-based linear programming model
• Minimize
𝑎∈𝐴 𝑐
𝑎 𝑓(𝑎)
• Subject to
(𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎)
−
𝑥,𝑎,𝑥 ′
1 if 𝑥 ∉ 𝑠0 , 𝑥 ∈ 𝑠∗
−1 if 𝑥 ∈ 𝑠0 , 𝑥 ∉ 𝑠∗
∈𝑇𝑋 𝑓 𝑎 =
0 otherwise
Flow-based linear programming model
• Minimize
𝑎∈𝐴 𝑐
𝑎 𝑓(𝑎)
• Subject to
(𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎)
−
𝑥,𝑎,𝑥 ′
1 if 𝑥 ∉ 𝑠0 , 𝑥 ∈ 𝑠∗
−1 if 𝑥 ∈ 𝑠0 , 𝑥 ∉ 𝑠∗
∈𝑇𝑋 𝑓 𝑎 ≥
0 otherwise
Due to incomplete actions
(see paper for details)
Theorem 1. The solution to the base model 𝑕𝑏𝑎𝑠𝑒
can be used as an admissible heuristic
Flow-based heuristic
• Base heuristic is effective and efficient
– Bonet. IJCAI2013
Domain
Airport(50)
Barman-opt11(20)
Blocks(35)
Depot(22)
Driverlog(20)
Elevators-opt08(30)
Elevators-opt11(20)
Floortile-opt11(20)
Freecell(80)
Grid(5)
Gripper(20)
Logistics00(28)
Logistics98(35)
Miconic(150)
Mprime(35)
Mystery(30)
Nomystery-opt11(20)
Openstacks-opt08(30
Openstacks-opt11(20)
Openstacks(30)
Parcprinter-08(30)
Parcprinter-opt11(20)
LM-cut
28
4
28
7
13
22
17
6
15
2
7
20
6
141
22
19
14
20
15
7
18
13
base
20
4
28
7
11
9
7
4
35
1
6
15
2
50
18
15
10
15
7
7
28
20
#Problems solved in 30mins
Domain
Parking-opt11(20)
Pathways-noneg(30)
Pegsol-08(30)
Pegsol-opt11(30)
Pipesworld-notank(50)
Pipesworld-tank(50)
Psr-small(50)
Rovers(40)
Satellite(36)
Scanalyzer-08(30)
Scanalyzere-opt11(20)
Sokoban-opt08(30)
Sokoban-opt11(20)
Tidybot-opt11(20)
Tpp(30)
Transport-opt08(30)
Transport-opt11(20)
Trucks(30)
Visitall-opt11(20)
Woodworking-opt08(30)
Woodworking-opt11(20)
Zenotravel(20)
Total(1396)
LM-cut
2
5
27
17
17
11
49
7
7
15
12
28
20
13
6
11
6
10
10
16
11
12
756
base
1
4
28
18
15
10
50
6
6
13
10
16
15
5
8
10
6
9
17
12
7
9
594
Flow-based heuristic
• Base heuristic is weak and can be improved significantly
– Van den Briel, Benton, Kambhampati, and Vossen. CP2007
Problem
logitstics4-0
logitstics4-1
logitstics4-2
logitstics5-1
logitstics5-2
logistics6-1
logistics6-9
logistics12-0
logistics15-1
driverlog1
driverlog2
driverlog3
driverlog4
driverlog6
driverlog7
driverlog13
driverlog19
driverlog20
Opt
20
19
15
17
8
14
24
42
7
19
12
16
11
13
26
h+
19
17
13
15
8
13
21
39
66
6
14
11
12
10
12
21
89
84
LP–
16
14
10
12
6
10
18
32
54
3
12
8
11
8
11
15
60
60
Heuristic value at initial state
Problem
zenotravel1
zenotravel2
zenotravel3
zenotravel4
zenotravel5
zenotravel6
zenotravel13
zenotravel19
zenotravel20
tpp1
tpp2
tpp3
tpp4
tpp5
tpp6
tpp28
tpp29
tpp30
Opt
1
6
6
8
11
11
26
5
8
11
14
19
25
h+
1
4
5
6
11
11
23
62
4
7
10
13
17
21
LP–
1
3
4
5
8
8
18
46
50
3
6
9
12
15
21
150
174
Flow-based heuristic
• Base heuristic is weak and can be improved significantly
– Van den Briel, Benton, Kambhampati, and Vossen. CP2007
Problem
logitstics4-0
logitstics4-1
logitstics4-2
logitstics5-1
logitstics5-2
logistics6-1
logistics6-9
logistics12-0
logistics15-1
driverlog1
driverlog2
driverlog3
driverlog4
driverlog6
driverlog7
driverlog13
driverlog19
driverlog20
Opt
20
19
15
17
8
14
24
42
7
19
12
16
11
13
26
h+
19
17
13
15
8
13
21
39
66
6
14
11
12
10
12
21
89
84
LP–
16
14
10
12
6
10
18
32
54
3
12
8
11
8
11
15
60
60
LP
20
19
15
17
8
14
24
42
67
7
19
11
16
11
13
24
97
90
Heuristic value at initial state
Problem
zenotravel1
zenotravel2
zenotravel3
zenotravel4
zenotravel5
zenotravel6
zenotravel13
zenotravel19
zenotravel20
tpp1
tpp2
tpp3
tpp4
tpp5
tpp6
tpp28
tpp29
tpp30
Opt
1
6
6
8
11
11
26
5
8
11
14
19
25
h+
1
4
5
6
11
11
23
62
4
7
10
13
17
21
LP–
1
3
4
5
8
8
18
46
50
3
6
9
12
15
21
150
174
LP
1
6
6
8
11
11
24
67
69
5
8
11
14
19
25
Motivation
• Base heuristic is effective and efficient yet can be improved
significantly
– Adding constraints derived from landmarks
Bonet. IJCAI2013
– Adding constraints derived from merges
Van den Briel, Benton, Kambhampati, and Vossen. CP2007
Landmarks
• A landmark 𝐿 ⊆ 𝐴 for a planning task 𝑃 =< 𝑉, 𝐴, 𝑠0 , 𝑠∗ , 𝑐 > is
a subset of actions such that every plan 𝑃 contains at least
one action in 𝐿
• A set of landmarks ℒ (state dependent) introduces the
following constraints
∀𝐿 ∈ ℒ
𝑎∈𝐿 𝑓(𝑎) ≥ 1
Theorem 2. The set of landmark constraints is
admissible. As a result, the solution to the base
ℒ
model with landmark constraints 𝑕𝑏𝑎𝑠𝑒
can be used
as an admissible heuristic.
ℒ
Further, 𝑕𝑏𝑎𝑠𝑒
dominates 𝑕ℒ∗ . In particular,
𝐿𝑀−𝑐𝑢𝑡
∗
𝑕𝑏𝑎𝑠𝑒
≥ 𝑕𝐿𝑀−𝑐𝑢𝑡
≥ 𝑕𝐿𝑀−𝑐𝑢𝑡
Results
• Heuristic performance across common solved tasks
– Base model versus Base model with LM-cut landmarks
Base
LM-cut
Base
LM-cut
Base
Heuristic comparison
Base
Results
• Coverage results
– f10 Zhu-Givan method, f20 LM-cut method
Domain
Airport(50)
Barman-opt11(20)
Blocks(35)
Depot(22)
Driverlog(20)
Elevators-opt08(30)
Elevators-opt11(20)
Floortile-opt11(20)
Freecell(80)
Grid(5)
Gripper(20)
Logistics00(28)
Logistics98(35)
Miconic(150)
Mprime(35)
Mystery(30)
Nomystery-opt11(20)
Openstacks-opt08(30
Openstacks-opt11(20)
Openstacks(30)
Parcprinter-08(30)
Parcprinter-opt11(20)
base
f 10
20
4
28
7
11
9
7
4
35
1
6
15
2
50
18
15
10
15
7
7
28
20
f 20
26
4
28
7
11
9
7
4
47
2
7
14
3
57
20
16
8
12
7
8
28
20
#Problems solved in 30mins
25
4
29
7
13
18
15
6
27
1
5
20
6
140
20
16
12
15
7
7
29
20
Domain
Parking-opt11(20)
Pathways-noneg(30)
Pegsol-08(30)
Pegsol-opt11(30)
Pipesworld-notank(50)
Pipesworld-tank(50)
Psr-small(50)
Rovers(40)
Satellite(36)
Scanalyzer-08(30)
Scanalyzer-opt11(20)
Sokoban-opt08(30)
Sokoban-opt11(20)
Tidybot-opt11(20)
Tpp(30)
Transport-opt08(30)
Transport-opt11(20)
Trucks(30)
Visitall-opt11(20)
Woodworking-opt08(30)
Woodworking-opt11(20)
Zenotravel(20)
Total(1396)
base
f 10
f 20
1
4
28
18
15
10
50
6
6
13
10
16
15
5
8
10
6
9
17
12
7
9
1
4
26
16
16
10
50
6
6
13
9
20
16
11
8
10
5
9
17
14
9
9
1
5
27
17
11
9
50
7
7
11
8
27
20
8
8
11
6
9
19
20
15
11
594
630
749
Example
T
P
1
2
L(1)
U(1)
1
1
L(1)
U(1)
2
U(2)
D(1, 2)
D(2, 1)
L(1)
2
L(2)
U(2)
T
D(2, 1)
DTG(T)
D(1, 2)
U(1)
D(2, 1)
D(1, 2)
L(2)
T
DTG(P)
2
1
2
L(2)
1
D(2, 1)
D(1, 2)
DTG(P, T)
U(2)
Merges
• A merge combines two or more variables into one “super”
variable.
• A merged variable introduces the following constraints
1 if 𝑥 ∉ 𝑠0 , 𝑥 ∈ 𝑠∗
(𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎) −
𝑥,𝑎,𝑥 ′ ∈𝑇𝑋 𝑓 𝑎 ≥ −1 if 𝑥 ∈ 𝑠0 , 𝑥 ∉ 𝑠∗
0 otherwise
𝑓(𝑎) ≥
′)
𝑓(𝑎
𝑎′∈𝐶𝑜𝑝𝑖𝑒𝑠(𝑎,𝑍)
Theorem 3. The set of merge constraints is
admissible. As a result, the solution to the base
model with merge constraints can be used as an
admissible heuristic
Merges
T
P
1
2
Q: 1
T: 1
Q
2
3
2
3
1
2
3
3
1
P:
1
2
T
DTG(P, T, Q)
2
T
3
1
2
3
Dynamic merging
• Dynamic merging is dynamic in all ways
– Incrementally merge more variables
– Incrementally merge variables
• Very simple merge strategy
– Merge prevail conditions with the preconditions of an action
𝑚𝑒𝑟𝑔𝑒(𝑝, 𝑞) for all atoms 𝑝 ∈ 𝑃𝑟𝑒𝑣(𝑎) and 𝑞 ∈ 𝑃𝑟𝑒(𝑎)
Example
T
P
1
2
L(1)
U(1)
1
1
L(1)
U(1)
2
D(1, 2)
D(2, 1)
2
L(2)
U(2)
L(2)
U(2)
T
DTG(P)
DTG(T)
Example
T
P
1
2
L(1)
U(1)
1
1
L(1)
U(1)
D(1, 2)
2
D(2, 1)
2
L(2)
L(2)
U(2)
U(2)
T
h=2
DTG(P)
DTG(T)
Example
T
P
1
2
L(1)
U(1)
1
1
L(1)
U(1)
D(2, 1)
2
L(2)
L(1)
2
L(2)
U(2)
U(2)
T
D(2, 1)
h=2
DTG(T)
D(1, 2)
U(1)
D(2, 1)
D(1, 2)
L(2)
T
DTG(P)
2
1
D(1, 2)
2
1
D(2, 1)
D(1, 2)
DTG(P, T)
U(2)
Example
T
P
1
2
L(1)
U(1)
1
1
L(1)
U(1)
D(2, 1)
2
L(2)
L(1)
2
L(2)
U(2)
U(2)
T
D(2, 1)
h=4
DTG(T)
D(1, 2)
U(1)
D(2, 1)
D(1, 2)
L(2)
T
DTG(P)
2
1
D(1, 2)
2
1
D(2, 1)
D(1, 2)
DTG(P, T)
U(2)
Results
• All experiments run with Fast Downward
– Helmert, Journal of Artificial Intelligence Research, 2006
• Setup
–
–
–
–
AMD Opteron 6378 CPUs running at 2.4GHz
2Gb memory limit
1800 seconds timeout
Using IBM CPLEX 12.5.1 as the LP solver
• Heuristics
–
–
–
–
LM-cut
h*LM-cut
base
fXY
• X = 0, without landmarks, X = 1, with Zhu-Givan, X = 2, with LM-cut
• Y = 0, without merge, Y = 1, with merge
Results
• Coverage results
Base
Merge
LM-cut
Domain
Airport(50)
Barman-opt11(20)
Blocks(35)
Depot(22)
Driverlog(20)
Elevators-opt08(30)
Elevators-opt11(20)
Floortile-opt11(20)
Freecell(80)
Grid(5)
Gripper(20)
Logistics00(28)
Logistics98(35)
Miconic(150)
Mprime(35)
Mystery(30)
Nomystery-opt11(20)
Openstacks-opt08(30
Openstacks-opt11(20)
Openstacks(30)
Parcprinter-08(30)
Parcprinter-opt11(20)
LM-cut
28
4
28
7
13
22
17
6
15
2
7
20
6
141
22
19
14
20
15
7
18
13
Base
LM-cut
Base
LM-cut Merge
h*LM-cut
28
4
28
7
13
19
16
6
15
2
6
20
6
141
22
18
14
17
12
7
18
13
base
#Problems solved in 30mins
f 01
20
4
28
7
11
9
7
4
35
1
6
15
2
50
18
15
10
15
7
7
28
20
f 10
22
0
28
6
15
21
17
2
33
2
20
22
7
58
24
20
14
10
5
7
30
20
f 11
26
4
28
7
11
9
7
4
47
2
7
14
3
57
20
16
8
12
7
8
28
20
22
0
28
5
15
21
17
2
28
2
20
22
7
140
24
20
14
10
5
7
30
20
f 20
25
4
29
7
13
18
15
6
27
1
5
20
6
140
20
16
12
15
7
7
29
20
f 21
22
0
29
5
15
21
17
5
29
2
20
22
10
141
25
17
14
10
5
5
30
20
𝒉𝑳𝑴−𝒄𝒖𝒕
≥ 𝒉∗𝑳𝑴−𝒄𝒖𝒕 ≥ 𝒉𝑳𝑴−𝒄𝒖𝒕
𝒃𝒂𝒔𝒆
Results
• Coverage results
Base
Merge
LM-cut
Domain
Parking-opt11(20)
Pathways-noneg(30)
Pegsol-08(30)
Pegsol-opt11(30)
Pipesworld-notank(50)
Pipesworld-tank(50)
Psr-small(50)
Rovers(40)
Satellite(36)
Scanalyzer-08(30)
Scanalyzer-opt11(20)
Sokoban-opt08(30)
Sokoban-opt11(20)
Tidybot-opt11(20)
Tpp(30)
Transport-opt08(30)
Transport-opt11(20)
Trucks(30)
Visitall-opt11(20)
Woodworking-opt08(30)
Woodworking-opt11(20)
Zenotravel(20)
Total(1396)
Base
LM-cut
Base
LM-cut Merge
LM-cut
2
5
27
17
17
11
49
7
7
15
12
28
20
13
6
11
6
10
10
16
11
12
h*LM-cut
1
5
27
17
16
9
49
7
7
13
10
28
20
13
6
11
6
10
10
16
11
12
756
736
base
f 01
f 10
f 11
f 20
f 21
1
4
28
18
15
10
50
6
6
13
10
16
15
5
8
10
6
9
17
12
7
9
1
4
28
18
13
10
50
6
6
12
9
18
15
1
11
10
5
10
17
25
18
13
1
4
26
16
16
10
50
6
6
13
9
20
16
11
8
10
5
9
17
14
9
9
1
4
26
16
15
10
50
7
6
11
8
20
16
1
11
10
5
12
17
28
20
13
1
5
27
17
11
9
50
7
7
11
8
27
20
8
8
11
6
9
19
20
15
11
1
5
27
17
13
10
50
7
7
11
8
27
19
1
10
11
6
11
19
28
20
13
594
683
630
766
749
785
Results
• Heuristic performance across common solved tasks
– LM-cut versus Base model with landmarks and merges
Base
LM-cut
Merge
Base
LM-cut
Merge
LM-cut
Heuristic comparison
LM-cut
Results
• Time spent by 𝐴∗ search for finding a solution
– LM-cut versus Base model with LM-cut landmarks and merges
Base
LM-cut
Merge
LM-cut
Time comparison
Conclusions
• Flow-based heuristics are effective and efficient yet can be
improved significantly
– Adding constraints derived from landmarks
– Adding constraints derived from merges
• Dynamic merging incrementally merges more and more
variables
– So far, only incorporated a very simple merge strategy
• Flow-based heuristics provide a rich area for future research
Blai Bonet
Universidad Simon Bolivar
Caracas, Venezuela
[email protected]
Menkes van den Briel
NICTA & Australian National University
Canberra, Australia
[email protected]