Flow-based Heuristics for Optimal Planning: Landmarks and Merges Blai Bonet Universidad Simon Bolivar Caracas, Venezuela [email protected] Menkes van den Briel NICTA & Australian National University Canberra, Australia [email protected] Outline • Background and motivation • Base model • Strengthening the base model – Adding constraints derived from landmarks – Adding constraints derived from merges • Results • Conclusions Background An LP-Based Heuristic for Optimal Planning. Van den Briel, Benton, Kambhampati, and Vossen. CP2007. An Admissible Heuristic for SAS+ Planning Obtained from the State Equation. Bonet. IJCAI2013. Flow-based linear programming model • SAS+ planning task with action cost 𝑃 =< 𝑉, 𝐴, 𝑠0 , 𝑠∗ , 𝑐 > – 𝑉 set of variables each with a finite domain 𝐷𝑋 – 𝐴 set of actions < 𝑃𝑟𝑒, 𝑃𝑜𝑠𝑡, 𝑃𝑟𝑒𝑣 > – 𝑠0 initial state – 𝑠∗ goal state (partial valuation) – 𝑐 non-negative action cost • Domain transition graph (DTG) – Node for each value in 𝐷𝑋 – Labeled arc for each transition 𝑇𝑋 𝑇𝑋 = 𝑥, 𝑎, 𝑥 ′ : 𝑋 = 𝑥 ∈ 𝑃𝑟𝑒, 𝑋 = 𝑥′ ∈ 𝑃𝑜𝑠𝑡 ∪ ⊥, 𝑎, 𝑥 ′ : 𝑋 ∉ 𝑉𝑎𝑟 𝑃𝑟𝑒 , 𝑋 = 𝑥′ ∈ 𝑃𝑜𝑠𝑡 ∪ 𝑥, 𝑎, 𝑥 : 𝑋 = 𝑥 ∈ 𝑃𝑟𝑒𝑣 Example T P 1 2 Load(P, T, 1) Unload(P, T, 1) 1 1 Load(P, T, 1) Unload(P, T, 1) 2 Drive(T, 1, 2) Drive(T, 2, 1) 2 Load(P, T, 2) Unload(P, T, 2) Load(P, T, 2) Unload(P, T, 2) T DTG(P) DTG(T) Flow-based linear programming model • A flow for a planning task 𝑃 =< 𝑉, 𝐴, 𝑠0 , 𝑠∗ , 𝑐 > is a function 𝑓: 𝐴 → ℝ+ mapping action labels into non-negative real numbers • Minimize 𝑎∈𝐴 𝑐 𝑎 𝑓(𝑎) • Subject to (𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎) − 𝑥,𝑎,𝑥 ′ ∈𝑇𝑋 𝑓 𝑎 =0 Flow-based linear programming model • Minimize 𝑎∈𝐴 𝑐 𝑎 𝑓(𝑎) • Subject to (𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎) − 𝑥,𝑎,𝑥 ′ 1 if 𝑥 ∉ 𝑠0 , 𝑥 ∈ 𝑠∗ −1 if 𝑥 ∈ 𝑠0 , 𝑥 ∉ 𝑠∗ ∈𝑇𝑋 𝑓 𝑎 = 0 otherwise Flow-based linear programming model • Minimize 𝑎∈𝐴 𝑐 𝑎 𝑓(𝑎) • Subject to (𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎) − 𝑥,𝑎,𝑥 ′ 1 if 𝑥 ∉ 𝑠0 , 𝑥 ∈ 𝑠∗ −1 if 𝑥 ∈ 𝑠0 , 𝑥 ∉ 𝑠∗ ∈𝑇𝑋 𝑓 𝑎 ≥ 0 otherwise Due to incomplete actions (see paper for details) Theorem 1. The solution to the base model 𝑏𝑎𝑠𝑒 can be used as an admissible heuristic Flow-based heuristic • Base heuristic is effective and efficient – Bonet. IJCAI2013 Domain Airport(50) Barman-opt11(20) Blocks(35) Depot(22) Driverlog(20) Elevators-opt08(30) Elevators-opt11(20) Floortile-opt11(20) Freecell(80) Grid(5) Gripper(20) Logistics00(28) Logistics98(35) Miconic(150) Mprime(35) Mystery(30) Nomystery-opt11(20) Openstacks-opt08(30 Openstacks-opt11(20) Openstacks(30) Parcprinter-08(30) Parcprinter-opt11(20) LM-cut 28 4 28 7 13 22 17 6 15 2 7 20 6 141 22 19 14 20 15 7 18 13 base 20 4 28 7 11 9 7 4 35 1 6 15 2 50 18 15 10 15 7 7 28 20 #Problems solved in 30mins Domain Parking-opt11(20) Pathways-noneg(30) Pegsol-08(30) Pegsol-opt11(30) Pipesworld-notank(50) Pipesworld-tank(50) Psr-small(50) Rovers(40) Satellite(36) Scanalyzer-08(30) Scanalyzere-opt11(20) Sokoban-opt08(30) Sokoban-opt11(20) Tidybot-opt11(20) Tpp(30) Transport-opt08(30) Transport-opt11(20) Trucks(30) Visitall-opt11(20) Woodworking-opt08(30) Woodworking-opt11(20) Zenotravel(20) Total(1396) LM-cut 2 5 27 17 17 11 49 7 7 15 12 28 20 13 6 11 6 10 10 16 11 12 756 base 1 4 28 18 15 10 50 6 6 13 10 16 15 5 8 10 6 9 17 12 7 9 594 Flow-based heuristic • Base heuristic is weak and can be improved significantly – Van den Briel, Benton, Kambhampati, and Vossen. CP2007 Problem logitstics4-0 logitstics4-1 logitstics4-2 logitstics5-1 logitstics5-2 logistics6-1 logistics6-9 logistics12-0 logistics15-1 driverlog1 driverlog2 driverlog3 driverlog4 driverlog6 driverlog7 driverlog13 driverlog19 driverlog20 Opt 20 19 15 17 8 14 24 42 7 19 12 16 11 13 26 h+ 19 17 13 15 8 13 21 39 66 6 14 11 12 10 12 21 89 84 LP– 16 14 10 12 6 10 18 32 54 3 12 8 11 8 11 15 60 60 Heuristic value at initial state Problem zenotravel1 zenotravel2 zenotravel3 zenotravel4 zenotravel5 zenotravel6 zenotravel13 zenotravel19 zenotravel20 tpp1 tpp2 tpp3 tpp4 tpp5 tpp6 tpp28 tpp29 tpp30 Opt 1 6 6 8 11 11 26 5 8 11 14 19 25 h+ 1 4 5 6 11 11 23 62 4 7 10 13 17 21 LP– 1 3 4 5 8 8 18 46 50 3 6 9 12 15 21 150 174 Flow-based heuristic • Base heuristic is weak and can be improved significantly – Van den Briel, Benton, Kambhampati, and Vossen. CP2007 Problem logitstics4-0 logitstics4-1 logitstics4-2 logitstics5-1 logitstics5-2 logistics6-1 logistics6-9 logistics12-0 logistics15-1 driverlog1 driverlog2 driverlog3 driverlog4 driverlog6 driverlog7 driverlog13 driverlog19 driverlog20 Opt 20 19 15 17 8 14 24 42 7 19 12 16 11 13 26 h+ 19 17 13 15 8 13 21 39 66 6 14 11 12 10 12 21 89 84 LP– 16 14 10 12 6 10 18 32 54 3 12 8 11 8 11 15 60 60 LP 20 19 15 17 8 14 24 42 67 7 19 11 16 11 13 24 97 90 Heuristic value at initial state Problem zenotravel1 zenotravel2 zenotravel3 zenotravel4 zenotravel5 zenotravel6 zenotravel13 zenotravel19 zenotravel20 tpp1 tpp2 tpp3 tpp4 tpp5 tpp6 tpp28 tpp29 tpp30 Opt 1 6 6 8 11 11 26 5 8 11 14 19 25 h+ 1 4 5 6 11 11 23 62 4 7 10 13 17 21 LP– 1 3 4 5 8 8 18 46 50 3 6 9 12 15 21 150 174 LP 1 6 6 8 11 11 24 67 69 5 8 11 14 19 25 Motivation • Base heuristic is effective and efficient yet can be improved significantly – Adding constraints derived from landmarks Bonet. IJCAI2013 – Adding constraints derived from merges Van den Briel, Benton, Kambhampati, and Vossen. CP2007 Landmarks • A landmark 𝐿 ⊆ 𝐴 for a planning task 𝑃 =< 𝑉, 𝐴, 𝑠0 , 𝑠∗ , 𝑐 > is a subset of actions such that every plan 𝑃 contains at least one action in 𝐿 • A set of landmarks ℒ (state dependent) introduces the following constraints ∀𝐿 ∈ ℒ 𝑎∈𝐿 𝑓(𝑎) ≥ 1 Theorem 2. The set of landmark constraints is admissible. As a result, the solution to the base ℒ model with landmark constraints 𝑏𝑎𝑠𝑒 can be used as an admissible heuristic. ℒ Further, 𝑏𝑎𝑠𝑒 dominates ℒ∗ . In particular, 𝐿𝑀−𝑐𝑢𝑡 ∗ 𝑏𝑎𝑠𝑒 ≥ 𝐿𝑀−𝑐𝑢𝑡 ≥ 𝐿𝑀−𝑐𝑢𝑡 Results • Heuristic performance across common solved tasks – Base model versus Base model with LM-cut landmarks Base LM-cut Base LM-cut Base Heuristic comparison Base Results • Coverage results – f10 Zhu-Givan method, f20 LM-cut method Domain Airport(50) Barman-opt11(20) Blocks(35) Depot(22) Driverlog(20) Elevators-opt08(30) Elevators-opt11(20) Floortile-opt11(20) Freecell(80) Grid(5) Gripper(20) Logistics00(28) Logistics98(35) Miconic(150) Mprime(35) Mystery(30) Nomystery-opt11(20) Openstacks-opt08(30 Openstacks-opt11(20) Openstacks(30) Parcprinter-08(30) Parcprinter-opt11(20) base f 10 20 4 28 7 11 9 7 4 35 1 6 15 2 50 18 15 10 15 7 7 28 20 f 20 26 4 28 7 11 9 7 4 47 2 7 14 3 57 20 16 8 12 7 8 28 20 #Problems solved in 30mins 25 4 29 7 13 18 15 6 27 1 5 20 6 140 20 16 12 15 7 7 29 20 Domain Parking-opt11(20) Pathways-noneg(30) Pegsol-08(30) Pegsol-opt11(30) Pipesworld-notank(50) Pipesworld-tank(50) Psr-small(50) Rovers(40) Satellite(36) Scanalyzer-08(30) Scanalyzer-opt11(20) Sokoban-opt08(30) Sokoban-opt11(20) Tidybot-opt11(20) Tpp(30) Transport-opt08(30) Transport-opt11(20) Trucks(30) Visitall-opt11(20) Woodworking-opt08(30) Woodworking-opt11(20) Zenotravel(20) Total(1396) base f 10 f 20 1 4 28 18 15 10 50 6 6 13 10 16 15 5 8 10 6 9 17 12 7 9 1 4 26 16 16 10 50 6 6 13 9 20 16 11 8 10 5 9 17 14 9 9 1 5 27 17 11 9 50 7 7 11 8 27 20 8 8 11 6 9 19 20 15 11 594 630 749 Example T P 1 2 L(1) U(1) 1 1 L(1) U(1) 2 U(2) D(1, 2) D(2, 1) L(1) 2 L(2) U(2) T D(2, 1) DTG(T) D(1, 2) U(1) D(2, 1) D(1, 2) L(2) T DTG(P) 2 1 2 L(2) 1 D(2, 1) D(1, 2) DTG(P, T) U(2) Merges • A merge combines two or more variables into one “super” variable. • A merged variable introduces the following constraints 1 if 𝑥 ∉ 𝑠0 , 𝑥 ∈ 𝑠∗ (𝑥 ′ ,𝑎,𝑥)∈𝑇𝑋 𝑓(𝑎) − 𝑥,𝑎,𝑥 ′ ∈𝑇𝑋 𝑓 𝑎 ≥ −1 if 𝑥 ∈ 𝑠0 , 𝑥 ∉ 𝑠∗ 0 otherwise 𝑓(𝑎) ≥ ′) 𝑓(𝑎 𝑎′∈𝐶𝑜𝑝𝑖𝑒𝑠(𝑎,𝑍) Theorem 3. The set of merge constraints is admissible. As a result, the solution to the base model with merge constraints can be used as an admissible heuristic Merges T P 1 2 Q: 1 T: 1 Q 2 3 2 3 1 2 3 3 1 P: 1 2 T DTG(P, T, Q) 2 T 3 1 2 3 Dynamic merging • Dynamic merging is dynamic in all ways – Incrementally merge more variables – Incrementally merge variables • Very simple merge strategy – Merge prevail conditions with the preconditions of an action 𝑚𝑒𝑟𝑔𝑒(𝑝, 𝑞) for all atoms 𝑝 ∈ 𝑃𝑟𝑒𝑣(𝑎) and 𝑞 ∈ 𝑃𝑟𝑒(𝑎) Example T P 1 2 L(1) U(1) 1 1 L(1) U(1) 2 D(1, 2) D(2, 1) 2 L(2) U(2) L(2) U(2) T DTG(P) DTG(T) Example T P 1 2 L(1) U(1) 1 1 L(1) U(1) D(1, 2) 2 D(2, 1) 2 L(2) L(2) U(2) U(2) T h=2 DTG(P) DTG(T) Example T P 1 2 L(1) U(1) 1 1 L(1) U(1) D(2, 1) 2 L(2) L(1) 2 L(2) U(2) U(2) T D(2, 1) h=2 DTG(T) D(1, 2) U(1) D(2, 1) D(1, 2) L(2) T DTG(P) 2 1 D(1, 2) 2 1 D(2, 1) D(1, 2) DTG(P, T) U(2) Example T P 1 2 L(1) U(1) 1 1 L(1) U(1) D(2, 1) 2 L(2) L(1) 2 L(2) U(2) U(2) T D(2, 1) h=4 DTG(T) D(1, 2) U(1) D(2, 1) D(1, 2) L(2) T DTG(P) 2 1 D(1, 2) 2 1 D(2, 1) D(1, 2) DTG(P, T) U(2) Results • All experiments run with Fast Downward – Helmert, Journal of Artificial Intelligence Research, 2006 • Setup – – – – AMD Opteron 6378 CPUs running at 2.4GHz 2Gb memory limit 1800 seconds timeout Using IBM CPLEX 12.5.1 as the LP solver • Heuristics – – – – LM-cut h*LM-cut base fXY • X = 0, without landmarks, X = 1, with Zhu-Givan, X = 2, with LM-cut • Y = 0, without merge, Y = 1, with merge Results • Coverage results Base Merge LM-cut Domain Airport(50) Barman-opt11(20) Blocks(35) Depot(22) Driverlog(20) Elevators-opt08(30) Elevators-opt11(20) Floortile-opt11(20) Freecell(80) Grid(5) Gripper(20) Logistics00(28) Logistics98(35) Miconic(150) Mprime(35) Mystery(30) Nomystery-opt11(20) Openstacks-opt08(30 Openstacks-opt11(20) Openstacks(30) Parcprinter-08(30) Parcprinter-opt11(20) LM-cut 28 4 28 7 13 22 17 6 15 2 7 20 6 141 22 19 14 20 15 7 18 13 Base LM-cut Base LM-cut Merge h*LM-cut 28 4 28 7 13 19 16 6 15 2 6 20 6 141 22 18 14 17 12 7 18 13 base #Problems solved in 30mins f 01 20 4 28 7 11 9 7 4 35 1 6 15 2 50 18 15 10 15 7 7 28 20 f 10 22 0 28 6 15 21 17 2 33 2 20 22 7 58 24 20 14 10 5 7 30 20 f 11 26 4 28 7 11 9 7 4 47 2 7 14 3 57 20 16 8 12 7 8 28 20 22 0 28 5 15 21 17 2 28 2 20 22 7 140 24 20 14 10 5 7 30 20 f 20 25 4 29 7 13 18 15 6 27 1 5 20 6 140 20 16 12 15 7 7 29 20 f 21 22 0 29 5 15 21 17 5 29 2 20 22 10 141 25 17 14 10 5 5 30 20 𝒉𝑳𝑴−𝒄𝒖𝒕 ≥ 𝒉∗𝑳𝑴−𝒄𝒖𝒕 ≥ 𝒉𝑳𝑴−𝒄𝒖𝒕 𝒃𝒂𝒔𝒆 Results • Coverage results Base Merge LM-cut Domain Parking-opt11(20) Pathways-noneg(30) Pegsol-08(30) Pegsol-opt11(30) Pipesworld-notank(50) Pipesworld-tank(50) Psr-small(50) Rovers(40) Satellite(36) Scanalyzer-08(30) Scanalyzer-opt11(20) Sokoban-opt08(30) Sokoban-opt11(20) Tidybot-opt11(20) Tpp(30) Transport-opt08(30) Transport-opt11(20) Trucks(30) Visitall-opt11(20) Woodworking-opt08(30) Woodworking-opt11(20) Zenotravel(20) Total(1396) Base LM-cut Base LM-cut Merge LM-cut 2 5 27 17 17 11 49 7 7 15 12 28 20 13 6 11 6 10 10 16 11 12 h*LM-cut 1 5 27 17 16 9 49 7 7 13 10 28 20 13 6 11 6 10 10 16 11 12 756 736 base f 01 f 10 f 11 f 20 f 21 1 4 28 18 15 10 50 6 6 13 10 16 15 5 8 10 6 9 17 12 7 9 1 4 28 18 13 10 50 6 6 12 9 18 15 1 11 10 5 10 17 25 18 13 1 4 26 16 16 10 50 6 6 13 9 20 16 11 8 10 5 9 17 14 9 9 1 4 26 16 15 10 50 7 6 11 8 20 16 1 11 10 5 12 17 28 20 13 1 5 27 17 11 9 50 7 7 11 8 27 20 8 8 11 6 9 19 20 15 11 1 5 27 17 13 10 50 7 7 11 8 27 19 1 10 11 6 11 19 28 20 13 594 683 630 766 749 785 Results • Heuristic performance across common solved tasks – LM-cut versus Base model with landmarks and merges Base LM-cut Merge Base LM-cut Merge LM-cut Heuristic comparison LM-cut Results • Time spent by 𝐴∗ search for finding a solution – LM-cut versus Base model with LM-cut landmarks and merges Base LM-cut Merge LM-cut Time comparison Conclusions • Flow-based heuristics are effective and efficient yet can be improved significantly – Adding constraints derived from landmarks – Adding constraints derived from merges • Dynamic merging incrementally merges more and more variables – So far, only incorporated a very simple merge strategy • Flow-based heuristics provide a rich area for future research Blai Bonet Universidad Simon Bolivar Caracas, Venezuela [email protected] Menkes van den Briel NICTA & Australian National University Canberra, Australia [email protected]
© Copyright 2024 ExpyDoc