Searching for Macro-operators with Automatically Generated Heuristics
István T. Hernádvölgyi
University of Ottawa
[email protected]
Dec 30, 2015
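Motivation
[Figure: solution methods plotted by time (axis from 1 sec to 1 week) against solution quality (axis values 18, 50, 85, 455). Labeled points: Korf 97, Korf 85, Sims 70 and Osterlund 95, us.]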
Subgoals and Macros
[Figure sequence: macro table for a five-element permutation puzzle with goal state 1 2 3 4 5 and operators o1, o2, o3, o4. Subgoal i puts element i in place; the table holds one macro for each position the element can occupy.]

Macro table (14 entries):
Subgoal 1: {}, {o1}, {o2}, {o3}, {o4}
Subgoal 2: {}, {o2,o1,o2}, {o3,o2,o3}, {o4,o3,o4}
Subgoal 3: {}, {o3,o1,o3}, {o4,o2,o4}
Subgoal 4: {}, {o4,o1,o4}

Worked example, start state 4 1 2 5 3:
State        Solution so far
4 1 2 5 3    {o1}
1 4 2 5 3    {o1,o2,o1,o2}
1 2 4 5 3    {o1,o2,o1,o2,o4,o2,o4}
1 2 3 5 4    {o1,o2,o1,o2,o4,o2,o4,o4,o1,o4}
Also shown on the final slide: {o1,o2,o1,o2,o4,o2,o1,o4} and {o2,o4,o3,o2}

[Figure: merging subgoals 2 and 3 into a single subgoal. Their 4 + 3 entries are replaced by 12 entries, one for each joint placement of elements 2 and 3.]
Merged subgoal (2, 3): {}, {o3,o1,o3}, {o4,o2,o4}, {o1,o2,o1}, {o1,o3,o2}, {o1,o4,o1,o3}, {o2,o3,o1}, {o3,o2,o3}, {o2,o4,o2}, {o3,o4,o1}, {o2,o3,o4,o2,o3}, {o4,o3,o4}
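The macro-table idea on these slides can be sketched in a few lines of Python. The slides do not define the operators, so the sketch assumes that o_i reverses the first i + 1 elements of the state; this assumption reproduces the states in the worked example but is not confirmed by the source. The names `find_macro`, `build_macro_table` and `solve` are hypothetical. Each macro is found by breadth-first search in a space where elements that are not yet fixed are treated as "don't care".

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5)
# ASSUMPTION: operator o_i reverses the first i + 1 elements of the state.
PREFIX = {"o1": 2, "o2": 3, "o3": 4, "o4": 5}


def apply_op(state, name):
    k = PREFIX[name]
    return state[:k][::-1] + state[k:]


def apply_macro(state, macro):
    for name in macro:
        state = apply_op(state, name)
    return state


def find_macro(i, p):
    """Shortest operator sequence that moves element i from position p to
    position i and leaves elements 1..i-1 in place (positions are 1-indexed)."""
    start = [0] * len(GOAL)              # 0 marks a "don't care" element
    for e in range(1, i):
        start[e - 1] = e
    start[p - 1] = i
    start = tuple(start)
    queue, seen = deque([(start, ())]), {start}
    while queue:
        state, path = queue.popleft()
        if state[:i] == GOAL[:i]:
            return path
        for name in PREFIX:
            nxt = apply_op(state, name)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + (name,)))
    raise ValueError("subgoal cannot be reached")


def build_macro_table():
    n = len(GOAL)
    return {(i, p): find_macro(i, p)
            for i in range(1, n) for p in range(i, n + 1)}


def solve(state, table):
    """Solve the subgoals in order by concatenating table macros."""
    solution = []
    for i in range(1, len(state)):       # the last element falls into place
        macro = table[(i, state.index(i) + 1)]
        state = apply_macro(state, macro)
        solution.extend(macro)
    return state, solution


TABLE = build_macro_table()
final_state, solution = solve((4, 1, 2, 5, 3), TABLE)
```

Solving never searches at run time: each subgoal is a single table lookup, which is why macro solutions are fast to produce but longer than optimal ones.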
Results
The cost of merging subgoals

                          #macros             avg/max length   States expanded to build table   Time to solve 10,000 problems
18 subgoals               258                 5.6 / 13         108 million (1 hour)
6 subgoals (8,3,2,2,1)    2902 + 96 million   9.7 / 13         ?
Solution: use heuristics to speed up search for each macro
• For each subgoal, create a heuristic specifically for that subgoal
• Can combine this with the heuristic for the previous subgoal (because that subgoal is a subset of the current subgoal)
• These heuristics must be created automatically: the search spaces are unintuitive and the heuristics are “throw away”
Previous Work
Automatically Generated Heuristics
• Absolver [Prieditis90]
• Pattern Database [CulbersonSchaeffer96]
– 15 Puzzle, “don’t care tiles”, Rubik’s Cube [Korf97]
– Domain Abstraction [HolteHernadvolgyi99]
– Planning (Strips) [Edelkamp2000]
$h(s) = H[s']$
$\phi : D_1 \to D_2, \quad |D_1| > |D_2|$
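Pattern Database
[Figure: 8-puzzle goal state 1 2 3 / 4 5 6 / 7 8 and its pattern, in which only tiles 1, 2, 3 and 6 are kept. The state space shrinks from 181,440 to 15,120 states.]
Pattern Database
[Figure: the state 6 3 4 / 8 7 5 / 1 2 mapped to its pattern, which keeps tiles 6, 3, 1 and 2.]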
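A pattern database for the 8-puzzle example can be sketched as a breadth-first search backwards from the abstract goal. The sketch assumes the pattern keeps tiles 1, 2, 3, 6 and the blank, which is consistent with the 15,120 abstract states quoted on the slide; the names `abstract`, `build_pdb` and `h` are hypothetical. Moves are reversible, so distances from the goal equal distances to the goal.

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # row-major 3x3 board, 0 is the blank
PATTERN = {1, 2, 3, 6, 0}            # ASSUMPTION: tiles kept by the pattern
STAR = -1                            # every other tile becomes "don't care"


def abstract(state):
    return tuple(t if t in PATTERN else STAR for t in state)


def neighbors(state):
    """States reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            j = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[j] = nxt[j], nxt[blank]
            yield tuple(nxt)


def build_pdb():
    """Map every abstract state to its distance from the abstract goal."""
    start = abstract(GOAL)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in neighbors(state):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


PDB = build_pdb()


def h(state):
    """Admissible heuristic: look up the abstract image of the state."""
    return PDB[abstract(state)]
```

The table is built once and each heuristic evaluation is then a single dictionary lookup.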
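Domain Abstraction
[Figure: 8-puzzle goal state 1 2 3 / 4 5 6 / 7 8 under a domain abstraction. The state space shrinks from 181,440 to 5,040 states.]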
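The state counts quoted for the 8-puzzle can be checked by hand. The first two lines below follow from the puzzle itself and from a pattern that keeps four tiles plus the blank. The third line assumes a domain abstraction that merges the eight tiles into classes of sizes 3, 3 and 2 and keeps the blank distinct; these class sizes are an assumption chosen because they reproduce 5,040, since the mapping used on the slide did not survive extraction.

```latex
\begin{align*}
\text{8-puzzle:} \quad & \frac{9!}{2} = \frac{362{,}880}{2} = 181{,}440 \\
\text{pattern with 4 tiles and the blank:} \quad & 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 = \frac{9!}{4!} = 15{,}120 \\
\text{classes of sizes } 3, 3, 2, 1: \quad & \frac{9!}{3!\,3!\,2!\,1!} = \frac{362{,}880}{72} = 5{,}040
\end{align*}
```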
Rubik’s Cube
[Figure: unfolded cube showing faces U, D, L, R, F and B, with each cubie labeled by the faces it touches. Corners: URF, URB, ULB, ULF, DRF, DRB, DLB, DLF. Edges: UF, UL, UR, UB, LF, LB, DL, RF, RB, DF, DB, DR.]
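State Representation

Goal state (subscripts indicate orientation)
Corner position:  URF URB ULB ULF DRF DRB DLB DLF
Corner cubie:     URF URB ULB ULF DRF DRB DLB DLF
Orientation:      0   0   0   0   0   0   0   0
Edge position:    UF UL UR UB LF LB DL RF RB DF DB DR
Edge cubie:       UF UL UR UB LF LB DL RF RB DF DB DR
Orientation:      0  0  0  0  0  0  0  0  0  0  0  0

Typical start state
Corner position:  URF URB ULB ULF DRF DRB DLB DLF
Corner cubie:     DRF ULB ULF DLB DLF DRF URB DRB
Orientation:      2   0   1   0   1   0   2   0
Edge position:    UF UL UR UB LF LB DL RF RB DF DB DR
Edge cubie:       DL UR RB LB LB LF UF DR UL DF DB RF
Orientation:      0  1  0  0  0  0  1  1  0  0  0  1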
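Macro Search Space

Goal state for the subgoal “fix RF and RB” (* = don’t care)
Corner position:  URF URB ULB ULF DRF DRB DLB DLF
Corner cubie:     URF URB ULB ULF *   *   *   *
Orientation:      0   0   0   0
Edge position:    UF UL UR UB LF LB DL RF RB DF DB DR
Edge cubie:       UF UL UR UB LF LB DL RF RB *  *  *
Orientation:      0  0  0  0  0  0  0  0  0

Typical start state for this subgoal
Corner position:  URF URB ULB ULF DRF DRB DLB DLF
Corner cubie:     URF URB ULB ULF *   *   *   *
Orientation:      0   0   0   0
Edge position:    UF UL UR UB LF LB DL RF RB DF DB DR
Edge cubie:       UF UL UR UB LF LB DL *  *  RB *  RF
Orientation (cubies that are not *, in the order listed): 0 0 0 0 0 0 0 1 0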
Domain Abstraction
• “don’t care” about a cubie’s orientation
• Make 2 (or more) cubies indistinguishable from each other
• Continue to add abstractions until the abstract space is sufficiently small (2 million entries)
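Abstraction Example

Abstracted goal state (x = orientation doesn’t matter, * = don’t care)
Corner position:  URF URB ULB ULF DRF DRB DLB DLF
Corner mask:      A   A   B   B   *   *   *   *
Corner orientations: x x 0 0
Edge position:    UF UL UR UB LF LB DL RF RB DF DB DR
Edge mask:        C  C  C  C  D  D  D  RF RB *  *  *
Edge orientations: x x x x x x x 0 0

Abstracted start state (typical)
Corner position:  URF URB ULB ULF DRF DRB DLB DLF
Corner mask:      A   A   B   B   *   *   *   *
Corner orientations: x x 0 0
Edge position:    UF UL UR UB LF LB DL RF RB DF DB DR
Edge mask:        C  C  C  C  D  D  D  *  *  RB *  RF
Edge orientations: x x x x x x x 1 0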
The cost of merging subgoals

                          #macros             avg/max length   States expanded to build table
18 subgoals               258                 5.6 / 13         108 million (1 hour)
6 subgoals (8,3,2,2,1)    2902 + 96 million   9.7 / 13         2 billion + 250,000/problem (27 hours + 1 second/problem)
Contributions
• Automatically Generated Heuristics for Finding Macro-operators
• Merging Subgoals to Obtain Shorter Solutions
• 44% Improvement for Rubik’s Cube
• Method is General and Mostly Automatic
Using Symmetries
URF, URB, ULB, ULF, UF, UL, UR, UB
LF, LB, DL
RF, RB
DF, DB (> DR)
DRF, DRB
DLF (> DLB)
LF, RF, DR    RF, RB, DR    LB, RB, DL
DLF, DLB
DRF (> DRB)
LF, RB    LF, LB    LF, RF
Kociemba’s Two-phase Algorithm
$\langle U, D, R^2, L^2, F^2, B^2 \rangle$
$(x, y, z), \quad x \in C_o,\ y \in E_o,\ z \in UD$
$C_o$: Corner Orientations, $3^7 = 2187$
$E_o$: Edge Orientations, $2^{11} = 2048$
$UD$: UD-slice edges, $\binom{12}{4} = 495$
$(x, y, z), \quad x \in C,\ y \in E_{UD\,face},\ z \in E_{UD\,slice}$
$h = \max(h^*(x, y),\ h^*(x, z),\ h^*(y, z))$
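A short calculation suggests why the heuristic takes a maximum over pairs of coordinates instead of using one table over all three. The arithmetic below assumes the formula is applied to the three coordinates whose sizes are listed above (2187, 2048 and 495); the slide does not say which phase the formula belongs to.

```latex
\begin{align*}
|C_o| \cdot |E_o| \cdot |UD| &= 2187 \cdot 2048 \cdot 495 = 2{,}217{,}093{,}120 \\
|C_o| \cdot |E_o| &= 2187 \cdot 2048 = 4{,}478{,}976 \\
|C_o| \cdot |UD| &= 2187 \cdot 495 = 1{,}082{,}565 \\
|E_o| \cdot |UD| &= 2048 \cdot 495 = 1{,}013{,}760
\end{align*}
```

Each pairwise table has a few million entries at most, while a single table over all three coordinates would need more than two billion.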
Open Questions
• Do the improvements carry over to other spaces?
• Does the order of the subgoals matter?
Motivation
$4.3 \times 10^{19}$
43,252,003,274,489,856,000
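Motivation
[Figure: solution methods plotted by time (axis from 1 sec to 1 week) against solution quality (axis values 18, 50, 85, 455). Labeled points: Korf 97, Korf 85, Kociemba 92, Kloosterman 90, Thistlethwaite 80, Sims 70 and Osterlund 95, human.]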
Merging Consecutive Subgoals
Serial Decomposability [Korf83]
$f(s_1, s_2, \ldots, s_k) = (f_1(s_1),\ f_2(s_1, s_2),\ \ldots,\ f_k(s_1, s_2, \ldots, s_k))$
Idea:
[Equations: consecutive components $f_i, \ldots, f_{i+j}$ are grouped into a single component $g$, giving $l$ groups from the $k$ components, and each group $g$ is handled by one macro $m$.]
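Previous Work
Macro-Operators
• Stabilizer Chain [Sims70, Osterlund95]
• Bi-directional Partial-Match [Korf85]
$\mathrm{Stab}(\alpha_1, \alpha_2, \ldots, \alpha_k): \quad G \ge G_{\alpha_1} \ge G_{\alpha_1, \alpha_2} \ge \ldots \ge G_{\alpha_1, \alpha_2, \ldots, \alpha_k} = 1$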
Domain Abstraction
1. select two invariant cubies: c1 and c2
2. toss coin
3. if (heads) // mask orientations
4.     if (both orientations masked)
5.         goto 1
6.     mask orientations of c1 and c2
7. else // mask cubie identity
8.     if (mask(c1) == mask(c2)) goto 1
9.     select new mask A
10.    if (!masked(c1) && !masked(c2))
11.        c1 := A, c2 := A
12.    else if (masked(c1) && !masked(c2))
13.        all cubies with mask(c1) and c2 := A
14.    else if (!masked(c1) && masked(c2))
15.        all cubies with mask(c2) and c1 := A
16.    else // both have a mask
17.        cubies with mask(c1) and mask(c2) := A
18. H := expand(N)
19. if (H == 0) goto 1
20. return H
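The randomized procedure above can be sketched in Python with the goto statements replaced by a loop. The slide does not define `expand`, so the sketch assumes it is a caller-supplied function that tries to build the pattern database within a size limit and returns an empty result when the abstract space is still too large. `toy_expand` is a hypothetical stand-in used only to make the example runnable; it is not a real pattern database builder.

```python
import random


def random_abstraction(invariant, expand, limit, rng=random):
    """Coarsen the abstraction at random until expand() succeeds."""
    ident = {c: None for c in invariant}   # identity mask per cubie, None = unmasked
    orient_masked = set()                  # cubies whose orientation is ignored
    labels = 0
    while True:
        c1, c2 = rng.sample(invariant, 2)              # step 1
        if rng.random() < 0.5:                         # steps 2-6: mask orientations
            if c1 in orient_masked and c2 in orient_masked:
                continue
            orient_masked.update((c1, c2))
        else:                                          # steps 7-17: mask identity
            m1, m2 = ident[c1], ident[c2]
            if m1 is not None and m1 == m2:
                continue
            label = f"M{labels}"                       # step 9: new mask
            labels += 1
            for c in invariant:
                shares_mask = ident[c] is not None and ident[c] in (m1, m2)
                if c in (c1, c2) or shares_mask:
                    ident[c] = label
        table = expand(ident, orient_masked, limit)    # step 18
        if table:                                      # step 19: retry when empty
            return table, ident, orient_masked


def toy_expand(ident, orient_masked, limit):
    """HYPOTHETICAL stand-in for expand(N): succeeds once a crude size
    measure of the abstraction drops to the limit."""
    classes = {m if m is not None else c for c, m in ident.items()}
    size = len(classes) + (len(ident) - len(orient_masked))
    return {"size": size} if size <= limit else {}


# Cubies already fixed before the subgoal "fix RF and RB".
INVARIANT = ["URF", "URB", "ULB", "ULF", "UF", "UL", "UR", "UB", "LF", "LB", "DL"]
table, ident, orient_masked = random_abstraction(
    INVARIANT, toy_expand, limit=10, rng=random.Random(0))
```

The four cases in steps 10 to 17 collapse into one rule in the sketch: the new mask is given to c1, to c2, and to every cubie that already shares a mask with either of them.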
Search
• IDA* [Korf85]
• Maximum of 3 pattern databases (5M entries)
• 18 subgoals
• 6 subgoals
• 10,000 random instances
18 subgoals: URF, URB, ULB, ULF, UF, UL, UR, UB, LF, LB, DL, RF, RB, DF, DB, (DR), DRF, DRB, DLF, (DLB)
6 subgoals: {URF, URB, ULB, ULF, UF, UL, UR, UB}, {LF, LB, DL}, {RF, RB}, {DF, DB, (DR)}, {DRF, DRB}, {DLF, (DLB)}
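The search setup on this slide, IDA* guided by the maximum of three pattern databases, can be illustrated on a toy problem. The sketch below is not the Rubik's Cube implementation: it uses a hypothetical five-element puzzle in which operator o_i reverses the first i + 1 elements, and three small pattern databases that each keep two elements. The names `build_pdb` and `ida_star` and the choice of patterns are assumptions made for the example.

```python
from collections import deque
from math import inf

N = 5
GOAL = tuple(range(1, N + 1))
# ASSUMPTION (toy domain): o_i reverses the first i + 1 elements.
OPS = {f"o{i}": i + 1 for i in range(1, N)}


def apply_op(state, name):
    k = OPS[name]
    return state[:k][::-1] + state[k:]


def build_pdb(kept):
    """Pattern database that keeps only the elements in `kept`."""
    def phi(state):
        return tuple(x if x in kept else 0 for x in state)
    start = phi(GOAL)
    dist = {start: 0}
    queue = deque([start])
    while queue:                      # operators are self-inverse, so this
        s = queue.popleft()           # forward search gives goal distances
        for name in OPS:
            t = apply_op(s, name)
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return lambda state: dist[phi(state)]


FOUND = object()


def ida_star(start, heuristics):
    """Iterative-deepening A* using the maximum of several heuristics."""
    def h(s):
        return max(f(s) for f in heuristics)

    path, moves = [start], []

    def search(g, bound):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f
        if state == GOAL:
            return FOUND
        smallest = inf
        for name in OPS:
            nxt = apply_op(state, name)
            if nxt in path:           # avoid cycles on the current path
                continue
            path.append(nxt)
            moves.append(name)
            result = search(g + 1, bound)
            if result is FOUND:
                return FOUND
            smallest = min(smallest, result)
            path.pop()
            moves.pop()
        return smallest

    bound = h(start)
    while True:
        result = search(0, bound)
        if result is FOUND:
            return list(moves)
        if result == inf:
            return None
        bound = result


HEURISTICS = [build_pdb({1, 2}), build_pdb({3, 4}), build_pdb({4, 5})]
plan = ida_star((4, 1, 2, 5, 3), HEURISTICS)
```

Taking the maximum keeps the combined heuristic admissible because every individual pattern database is admissible, so the plan returned by IDA* is a shortest one for this toy puzzle.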
Statistics