Learning to Support Constraint Programmers Susan L. Epstein 1 Gene Freuder 2 and Rick Wallace 2 1 Department of Computer Science Hunter College and The Graduate Center of The City University of New York 2 Cork Constraint Computation Centre
Mar 27, 2015

Austin Bartlett
Learning to Support Constraint Programmers

Susan L. Epstein (1), Gene Freuder (2), and Rick Wallace (2)

(1) Department of Computer Science, Hunter College and The Graduate Center of The City University of New York

(2) Cork Constraint Computation Centre

Facts about ACE

Learns to solve constraint satisfaction problems
Learns search heuristics
Can transfer what it learns on simple problems to solve more difficult ones
Can export knowledge to ordinary constraint solvers
Both a learner and a test bed
Heuristic but complete: will find a solution, eventually, if one exists
Guarantees high-quality, not optimal, solutions
Begins with substantial domain knowledge

Outline

The task: constraint satisfaction
Performance results
Reasoning mechanism
Learning
Representations

The Problem Space

Constraint satisfaction problem <X, D, C>
Solution: assign a value to every variable consistent with constraints
Many real-world problems can be represented and solved this way (design and configuration, planning and scheduling, diagnosis and testing)

Variables: A, B, C, D
Domains: A {1,2,3}, B {1,2,4,5,6}, C {1,2}, D {1,3}
Constraints: A = B, A > D, C ≠ D

[Figure: constraint graph on A, B, C, D with the allowed value pairs on each edge. A-B: (1 1) (2 2). A-D: (2 1) (3 1) (3 2). C-D: (1 3) (2 1) (2 3).]
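The example problem is small enough to check exhaustively. The sketch below is only an illustration, not ACE's representation: it assumes the constraints can be written as Python predicates on pairs of variables, and the names `domains`, `constraints`, `is_solution`, and `all_solutions` are introduced here for the example.

```python
from itertools import product

# The example problem from the slide: domains and binary constraints.
domains = {"A": [1, 2, 3], "B": [1, 2, 4, 5, 6], "C": [1, 2], "D": [1, 3]}
constraints = [
    ("A", "B", lambda a, b: a == b),   # A = B
    ("A", "D", lambda a, d: a > d),    # A > D
    ("C", "D", lambda c, d: c != d),   # C ≠ D
]

def is_solution(assignment):
    """True if a complete assignment satisfies every constraint."""
    return all(test(assignment[x], assignment[y]) for x, y, test in constraints)

def all_solutions():
    """Enumerate every complete assignment and keep the consistent ones."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        candidate = dict(zip(names, values))
        if is_solution(candidate):
            yield candidate

solutions = list(all_solutions())
print(solutions)
```

Exhaustive enumeration is only workable for toy problems like this one; the rest of the talk is about searching the space more intelligently.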

A Challenging Domain

Constraint solving is NP-hard
Problem class parameters: <n, k, d, t>
  n = number of variables
  k = maximum domain size
  d = edge density (% of possible constraints)
  t = tightness (% of value pairs excluded)
Complexity peak: values for d and t that make problems hardest
Heavy-tailed distribution of difficulty [Gomes et al., 2002]
Problem may have multiple or no solutions
Unexplored choices may be good
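One way to read the class parameters <n, k, d, t> is as a recipe for generating random binary problems. The sketch below is an assumption-laden illustration rather than the generator used in the experiments: it gives every variable exactly k values (the slide only says k is the maximum), picks a fraction d of the possible edges, and excludes a fraction t of the value pairs on each chosen edge. The function name `random_csp` and the `seed` parameter are hypothetical.

```python
import random
from itertools import combinations, product

def random_csp(n, k, d, t, seed=0):
    """Generate a random binary CSP in the class <n, k, d, t> (illustrative only)."""
    rng = random.Random(seed)
    variables = list(range(n))
    # Assumption: every domain has exactly k values.
    domains = {v: list(range(k)) for v in variables}
    possible_edges = list(combinations(variables, 2))
    edges = rng.sample(possible_edges, round(d * len(possible_edges)))
    value_pairs = list(product(range(k), repeat=2))
    excluded = {
        edge: set(rng.sample(value_pairs, round(t * len(value_pairs))))
        for edge in edges
    }
    return domains, excluded

# A problem from the class reported later in the talk.
domains, excluded = random_csp(30, 8, 0.24, 0.66)
print(len(excluded), "constraints")
```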

Finding a Path to a Solution

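Sequence of decision pairs (select variable, assign value)
Optimal length: 2n for n variables
For n variables with domain size d, there are (d+1)^n possible states, because each variable is either unassigned or holds one of its d values

[Figure: a solution path of alternating "Select a variable" and "Assign a value" decisions ending at "Solution".]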

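Solution Method

Search from initial state to goal

Domains: A {1,2,3}, B {1,2,4,5,6}, C {1,2}, D {1,3}
Constraints: A = B, A > D, C ≠ D

[Figure: constraint graph with the allowed value pairs on each edge. A-B: (1 1) (2 2). A-D: (2 1) (3 1) (3 2). C-D: (1 3) (2 1) (2 3).]

[Figure: search tree with nodes B (B=1), A (A=1), C (C=1, C=2), and D (D=1, D=3); the D branches end in "No", and the tree continues with A=2 …]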
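The search pictured on this slide can be sketched as plain depth-first search with chronological retraction. This is a minimal illustration under stated assumptions, not ACE's solver: constraints are Python predicates, the variable order is fixed in advance, and the helper names `consistent` and `backtrack` are introduced here.

```python
domains = {"A": [1, 2, 3], "B": [1, 2, 4, 5, 6], "C": [1, 2], "D": [1, 3]}
constraints = [
    ("A", "B", lambda a, b: a == b),
    ("A", "D", lambda a, d: a > d),
    ("C", "D", lambda c, d: c != d),
]

def consistent(var, value, assignment):
    """Check var=value against every constraint whose other variable is assigned."""
    for x, y, test in constraints:
        if x == var and y in assignment and not test(value, assignment[y]):
            return False
        if y == var and x in assignment and not test(assignment[x], value):
            return False
    return True

def backtrack(order, assignment=None):
    """Depth-first search over a fixed variable order; returns a solution or None."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):
        return dict(assignment)
    var = order[len(assignment)]              # select a variable
    for value in domains[var]:                # assign a value
        if consistent(var, value, assignment):
            assignment[var] = value
            result = backtrack(order, assignment)
            if result is not None:
                return result
            del assignment[var]               # retract and try the next value
    return None

solution = backtrack(["B", "A", "C", "D"])
print(solution)
```

With the order B, A, C, D the search first explores the fruitless B=1 subtree, as in the figure, before it reaches the solution under B=2.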

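Consistency Maintenance

Some values may initially be inconsistent
Value assignment can restrict domains

Domains: A {1,2,3}, B {1,2,4,5,6}, C {1,2}, D {1,3}
Constraints: A = B, A > D, C ≠ D

[Figure: constraint graph with the allowed value pairs on each edge. A-B: (1 1) (2 2). A-D: (2 1) (3 1) (3 2). C-D: (1 3) (2 1) (2 3).]

[Figure: search tree with filtered domains. After B=1 and A=1, C is {1,2} and D has no other possibilities, so the branch ends in "No". The figure also shows B=2 and the domains A {1,2}, C {1,2}, D {1,3}.]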
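Forward checking, one of the consistency methods named later in the talk, illustrates how an assignment can restrict domains. The sketch below is a simplified version under assumptions of this example (predicate constraints, domains copied rather than restored on retraction); the function name `forward_check` is hypothetical.

```python
domains = {"A": [1, 2, 3], "B": [1, 2, 4, 5, 6], "C": [1, 2], "D": [1, 3]}
constraints = [
    ("A", "B", lambda a, b: a == b),
    ("A", "D", lambda a, d: a > d),
    ("C", "D", lambda c, d: c != d),
]

def forward_check(var, value, domains, constraints):
    """Return the domains left after var=value, or None if some domain is emptied."""
    new_domains = {name: list(values) for name, values in domains.items()}
    new_domains[var] = [value]
    for x, y, test in constraints:
        if x == var:
            new_domains[y] = [b for b in new_domains[y] if test(value, b)]
            if not new_domains[y]:
                return None
        elif y == var:
            new_domains[x] = [a for a in new_domains[x] if test(a, value)]
            if not new_domains[x]:
                return None
    return new_domains

after_b1 = forward_check("B", 1, domains, constraints)
print(after_b1)                                        # A is reduced to [1]
print(forward_check("A", 1, after_b1, constraints))    # None: D has no values left
```

The second call reproduces the dead end in the figure: once A=1, no value of D satisfies A > D, so the branch can be abandoned without trying C at all.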

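Retraction

When an inconsistency arises, a retraction method removes a value and returns to an earlier state

Where’s the error?

Domains: A {1,2,3}, B {1,2,4,5,6}, C {1,2}, D {1,3}

[Figure: constraint graph with the allowed value pairs on each edge. A-B: (1 1) (2 2). A-D: (2 1) (3 1) (3 2). C-D: (1 3) (2 1) (2 3).]

[Figure: search tree. After B=1 and A=1, C is {1,2} and D has no values left ("No!"); a point in the tree is marked "Here!". The figure also shows B=2 and the domains A {1,2}, C {1,2}, D {1,3}.]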

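Variable Ordering

A good variable ordering can speed search

Domains: A {1,2,3}, B {1,2,4,5,6}, C {1,2}, D {1,3}

[Figure: constraint graph with the allowed value pairs on each edge. A-B: (1 1) (2 2). A-D: (2 1) (3 1) (3 2). C-D: (1 3) (2 1) (2 3).]

[Figure: search tree that selects A first. A=1 leaves B {1,2}, C {1,2}, and no values for D ("No"). A=2 leaves B {1,2}, C {1,2}, D {1,2}.]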
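The effect of variable ordering can be measured even without consistency maintenance. The sketch below counts how many value assignments plain backtracking tries under two fixed orders on the example problem. It is an illustration only; `count_tries` and its counting convention (one try per value considered) are assumptions of this example.

```python
domains = {"A": [1, 2, 3], "B": [1, 2, 4, 5, 6], "C": [1, 2], "D": [1, 3]}
constraints = [
    ("A", "B", lambda a, b: a == b),
    ("A", "D", lambda a, d: a > d),
    ("C", "D", lambda c, d: c != d),
]

def consistent(var, value, assignment):
    """Check var=value against constraints whose other variable is assigned."""
    for x, y, test in constraints:
        if x == var and y in assignment and not test(value, assignment[y]):
            return False
        if y == var and x in assignment and not test(assignment[x], value):
            return False
    return True

def count_tries(order):
    """Backtrack over a fixed variable order; return (solution, values tried)."""
    tries = 0

    def extend(assignment):
        nonlocal tries
        if len(assignment) == len(order):
            return dict(assignment)
        var = order[len(assignment)]
        for value in domains[var]:
            tries += 1
            if consistent(var, value, assignment):
                assignment[var] = value
                result = extend(assignment)
                if result is not None:
                    return result
                del assignment[var]
        return None

    return extend({}), tries

solution_b, tries_b = count_tries(["B", "A", "C", "D"])
solution_a, tries_a = count_tries(["A", "D", "C", "B"])
print(tries_b, tries_a)
```

Both orders reach the same solution, but starting with A and then D tries 9 values where starting with B tries 18, because the tight constraint A > D is tested early.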

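Value Ordering

A good value ordering can speed search too

Domains: A {1,2,3}, B {1,2,4,5,6}, C {1,2}, D {1,3}

[Figure: constraint graph with the allowed value pairs on each edge. A-B: (1 1) (2 2). A-D: (2 1) (3 1) (3 2). C-D: (1 3) (2 1) (2 3).]

[Figure: search path without retraction. A=2 leaves B {1,2}, C {1,2}, D {1,3}; D=1 leaves B {1,2}, C {1,2}; B=2 leaves C {1,2}; C=2 completes the assignment.]

Solution: A=2, B=2, C=2, D=1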

Constraint Solvers Know…

Several consistency methods
Several retraction methods
Many variable ordering heuristics
Many value ordering heuristics

… but the interactions among them are not well understood, nor is one combination best for all problem classes.

Goals of the ACE Project

Characterize problem classes
Learn to solve classes of problems well
Evaluate mixtures of known heuristics
Develop new heuristics
Explore the role of planning in solution

Outline

The task: constraint satisfaction
Performance results
ACE
Reasoning mechanism
Learning
Representation

Experimental Design

Specify problem class, consistency and retraction methods
Average performance across 10 runs
Learn on L problems (halt at 10,000 steps)
To-completion testing on T new problems
During testing, use only heuristics judged accurate during learning
Evaluate performance on
  Steps to solution
  Constraint checks
  Retractions
  Elapsed time

ACE Learns to Solve Hard Problems

<30, 8, .24, .66> near the complexity peak
Learn on 80 problems
10 runs, binned in sets of 10 learning problems
Discards 26 of 38 heuristics
Outperforms MinDomain, an “off-the-shelf” heuristic

[Figure: "Steps to solution" / "Steps in Total Learning". Steps (0 to 2500) against bins 1 to 8 of the learning tasks (tasks 1-10 through 71-80), with average and median curves. Means in blue, medians in red.]

ACE Rediscovers Brélaz Heuristic

Graph coloring: assign different colors to adjacent nodes.

Graph coloring is a kind of constraint satisfaction problem.

Brélaz: Minimize dynamic domain, break ties with maximum forward degree.

ACE learned this consistently on different classes of graph coloring problems.

[Epstein & Freuder, 2001]

Color each vertex red, blue, or green so that each pair of adjacent vertices has different colors.
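The Brélaz rule as stated on this slide can be sketched as a variable-selection function: smallest dynamic domain first, ties broken by the largest number of unassigned neighbors. This is an illustration of the rule only, not the heuristic as ACE learned or represented it; the graph, the name `brelaz_choice`, and the tie-breaking by listing order are assumptions of the example.

```python
def brelaz_choice(current_domains, neighbors, assigned):
    """Pick the unassigned variable with the smallest current domain,
    breaking ties by maximum forward degree (unassigned neighbors)."""
    unassigned = [v for v in current_domains if v not in assigned]

    def key(v):
        forward_degree = sum(1 for u in neighbors[v] if u not in assigned)
        return (len(current_domains[v]), -forward_degree)

    return min(unassigned, key=key)

# A hypothetical 4-vertex graph: a triangle w-x-y plus a pendant vertex z on w.
neighbors = {"w": ["x", "y", "z"], "x": ["w", "y"], "y": ["w", "x"], "z": ["w"]}
colors = ["red", "blue", "green"]
start = {v: list(colors) for v in neighbors}
first = brelaz_choice(start, neighbors, assigned=set())

# After coloring w red, its neighbors lose red from their dynamic domains.
after_w = {"w": ["red"], "x": ["blue", "green"], "y": ["blue", "green"], "z": ["blue", "green"]}
second = brelaz_choice(after_w, neighbors, assigned={"w"})
print(first, second)
```

All domains start equal, so the first choice is decided by forward degree alone; afterwards x and y tie on both criteria and the first one listed is taken.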

ACE Discovers a New Heuristic

“Maximize the product of degree and forward degree at the top of the search tree”

Exported to several traditional approaches:
  Min Domain
  Min Domain/Degree
  Min Domain + degree preorder
Learned on small problems but tested in 10 runs on n = 150, domain size 5, density .05, tightness .24
Reduced search tree size by 25% – 96%

[Epstein, Freuder, Wallace, Morozov, & Samuels 2002]
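The learned heuristic quoted above can be written as a scoring function over the constraint graph. The sketch below is a plain reading of the quoted sentence, not the exported implementation: it assumes degree means the number of neighbors and forward degree the number of unassigned neighbors, and it leaves the restriction to the top of the search tree to the caller. The name `max_degree_product` is hypothetical.

```python
def max_degree_product(neighbors, assigned):
    """Pick the unassigned variable maximizing degree * forward degree."""
    unassigned = [v for v in neighbors if v not in assigned]

    def score(v):
        degree = len(neighbors[v])
        forward_degree = sum(1 for u in neighbors[v] if u not in assigned)
        return degree * forward_degree

    return max(unassigned, key=score)

# Constraint graph of the A, B, C, D example used earlier in the talk.
neighbors = {"A": ["B", "D"], "B": ["A"], "C": ["D"], "D": ["A", "C"]}
first = max_degree_product(neighbors, assigned=set())
second = max_degree_product(neighbors, assigned={"A"})
print(first, second)
```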

Outline

The task: constraint satisfaction
Performance results
Reasoning mechanism
Learning
Representation

Constraint-Solving Heuristic

Uses domain knowledge
What problem classes does it work well on?
Is it valid throughout a single solution?
Can its dual also be valid?
How can heuristics be combined?

… and where do new heuristics come from?

FORR (For the Right Reasons)

General architecture for learning and problem solving
Multiple learning methods, multiple representations, multiple decision rationales
Specialized by domain knowledge
Learns useful knowledge to support reasoning
Specify whether a rationale is correct or heuristic
Learns to combine rationales to improve problem solving

[Epstein 1992]

An Advisor Implements a Rationale

Class-independent action-selection rationale
Supports or opposes actions by comments
Expresses opinion direction by strengths
Limitedly-rational procedure

[Figure: an Advisor takes the current problem state and the actions as input and produces comments of the form < strength, action, Advisor >.]

Advisor Categories

Tier 1: rationales that correctly select a single action
Tier 2: rationales that produce a set of actions directed to a subgoal
Tier 3: heuristic rationales that select a single action

Choosing an Action

[Figure: decision flowchart. Input: current state and possible actions. Tier 1, reaction from perfect knowledge (Advisors Victory, T-11 … T-1n): if there is a decision, take action. If not, Tier 2, planning triggered by situation recognition (planners P-1, P-2 … P-k): if there is a decision, begin plan. If not, Tier 3, heuristic reactions (Advisors T-31, T-32 … T-3m): voting, then take action.]
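The flowchart can be sketched as a function that consults the tiers in order. This is a simplified illustration under several assumptions, not FORR's code: Advisors are plain Python callables, a tier-2 planner returns a list of actions of which only the first is taken here, tier-3 comments are (strength, action) pairs, and voting is a weighted sum of strengths. All names, including the two example Advisors and their weights, are hypothetical.

```python
from collections import defaultdict

def choose_action(state, actions, tier1, tier2, tier3, weights):
    """Consult tier 1, then tier 2, then vote among tier-3 Advisors."""
    # Tier 1: the first Advisor that reaches a decision settles the matter.
    for advisor in tier1:
        action = advisor(state, actions)
        if action is not None:
            return action
    # Tier 2: a planner may propose a sequence of actions; begin that plan.
    for planner in tier2:
        plan = planner(state, actions)
        if plan:
            return plan[0]
    # Tier 3: weighted voting over the comments of heuristic Advisors.
    totals = defaultdict(float)
    for name, advisor in tier3.items():
        for strength, action in advisor(state, actions):
            totals[action] += weights.get(name, 0.0) * strength
    return max(actions, key=lambda a: totals[a])

# Two hypothetical tier-3 Advisors for variable selection.
def min_domain(state, actions):
    return [(10 - len(state["domains"][v]), v) for v in actions]

def max_degree(state, actions):
    return [(state["degree"][v], v) for v in actions]

state = {
    "domains": {"A": [1, 2, 3], "B": [1, 2, 4, 5, 6], "C": [1, 2], "D": [1, 3]},
    "degree": {"A": 2, "B": 1, "C": 1, "D": 2},
}
tier3 = {"min_domain": min_domain, "max_degree": max_degree}
weights = {"min_domain": 1.0, "max_degree": 0.5}
choice = choose_action(state, ["A", "B", "C", "D"], [], [], tier3, weights)
print(choice)
```

With no tier-1 or tier-2 decision, D wins the vote: it has one of the smallest domains and one of the highest degrees.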

ACE’s Domain Knowledge

Consistency maintenance methods: forward checking, arc consistency
Backtracking methods: chronological
21 variable ordering heuristics
19 value ordering heuristics
3 languages whose expressions have interpretations as heuristics
Graph theory knowledge, e.g., connected, acyclic
Constraint solving knowledge, e.g., “only one arc consistency pass is required on a tree”

An Overview of ACE

The task: constraint satisfaction
Performance results
ACE
Reasoning mechanism
Learning
Representation

What ACE Learns

Weighted linear combination for comment strengths
  For voting in tier 3 only
  Includes only valuable heuristics
  Indicates relative accuracy of valuable heuristics
New, learned heuristics
How to restructure tier 3
When random choice is the right thing to do
Acquire knowledge that supports heuristics (e.g., typical solution path length)

Digression-based Weight Learning

Learn from trace of each solved problem
Reward decisions on perfect solution path
Shorter paths reward variable ordering
Longer paths reward value ordering
Blame digression-producing decisions in proportion to error
Valuable Advisor’s weight > baseline’s

[Figure: a solution path of alternating "Select a variable" and "Assign a value" decisions leading to "Solution", with an error that starts a digression off the path.]
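The slide gives the idea of the weight update but not its formula, so the sketch below is a loose illustration with invented details. It assumes each traced decision records which Advisors supported it, whether it lay on the solution path, and the size of any digression it caused. The reward of 1.0, the penalty of 0.5 per digression step, and the names `update_weights`, `valuable_advisors`, and `baseline` are all assumptions. The path-length rule for variable versus value ordering is not modeled.

```python
def update_weights(weights, trace, reward=1.0, penalty_per_step=0.5):
    """Reward supporters of on-path decisions; blame supporters of
    digression-producing decisions in proportion to the digression size."""
    new_weights = dict(weights)
    for decision in trace:
        for advisor in decision["supporters"]:
            if decision["on_solution_path"]:
                new_weights[advisor] += reward
            else:
                new_weights[advisor] -= penalty_per_step * decision["digression_size"]
    return new_weights

def valuable_advisors(weights, baseline="baseline"):
    """Keep only Advisors whose weight exceeds the baseline Advisor's weight."""
    return [a for a in weights if a != baseline and weights[a] > weights[baseline]]

# A hypothetical trace of three decisions from one solved problem.
trace = [
    {"supporters": ["min_domain", "baseline"], "on_solution_path": True, "digression_size": 0},
    {"supporters": ["max_degree"], "on_solution_path": False, "digression_size": 4},
    {"supporters": ["min_domain"], "on_solution_path": True, "digression_size": 0},
]
weights = update_weights({"min_domain": 0.0, "max_degree": 0.0, "baseline": 0.0}, trace)
print(weights, valuable_advisors(weights))
```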

Learning New Advisors

Advisor grammar on pairs of concerns
  Maximize or minimize
  Product or quotient
  Stage
Monitor all expressions
Use good ones collectively
Use best ones individually
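The grammar described above can be pictured as an enumeration of candidate expressions. The sketch below is only a guess at its shape: the three concerns and the two stage labels are hypothetical stand-ins, and a quotient is taken in the listed order of the pair. It does show that the heuristic reported earlier in the talk (maximize the product of degree and forward degree at the top of the search tree) is one expression in such a space.

```python
from itertools import combinations, product

# Hypothetical concerns and stages; the slide does not list them.
concerns = ["domain size", "degree", "forward degree"]
stages = ["top of the search tree", "rest of the search tree"]

# Every expression: a direction, an operator, a pair of concerns, and a stage.
expressions = [
    (direction, operator, pair, stage)
    for pair, direction, operator, stage in product(
        combinations(concerns, 2),
        ["maximize", "minimize"],
        ["product", "quotient"],
        stages,
    )
]

def describe(expression):
    """Render an expression as an English phrase."""
    direction, operator, (first, second), stage = expression
    return f"{direction} the {operator} of {first} and {second} at the {stage}"

print(len(expressions))
print(describe(expressions[0]))
```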

Outline

The task: constraint satisfaction
Performance results
ACE
Reasoning mechanism
Learning
Representation


Representation of Experience

State describes variables and value assignments, impossible future values, prior state, connected components, constraint checks incurred, dynamic edges, trees
History of successful decisions
… plus other significant decisions become training examples

     Is   Can be   Cannot be
A    —    1        2
B    2    —        —
C    —    1,2      —
D    —    1,3      —
Checks incurred: 4
1 acyclic component: A,C,D
Dynamic edges: AD, CD

Representation of Learned Knowledge

Weights for Advisors
Solution size distribution
Latest error: greatest number of variables bound at retraction

ACE’s Status Report

41 Advisors in tiers 1 and 3
3 languages in which to express additional Advisors
5 experimental planners
Problem classes: random, coloring, geometric, logic, n-queens, small world, and quasigroup (with and without holes)
Learns to solve hard problems
Learns new heuristics
Transfers to harder problems
Divides and conquers problems
Learns when not to reason

Current ACE Research

Further weight-learning refinements
Learn appropriate restart parameters
More problem classes, consistency methods, retraction methods, planners, and Advisor languages
Learn appropriate consistency checking methods
Learn appropriate backtracking methods
Learn to bias initial weights
Metaheuristics to reformulate the architecture
Modeling strategies

… and, coming soon, ACE on the Web

Acknowledgements

Continued thanks for their ideas and efforts go to:

Diarmuid Grimes

Mark Hennessey

Tiziana Ligorio

Anton Morozov

Smiljana Petrovic

Bruce Samuels

Students of the FORR study group

The Cork Constraint Computation Centre

and, for their support, to:

The National Science Foundation

Science Foundation Ireland

Is ACE Reinforcement Learning?

Similarities:
  Unsupervised learning through trial and error
  Delayed rewards
  Learns a policy

Primary differences:
  Reinforcement learning learns a policy represented as the estimated values of states it has experienced repeatedly … but ACE is unlikely to revisit a state; instead it learns how to act in any state
  Q-learning learns state-action preferences … but ACE learns a policy that combines action preferences

How is ACE like STAGGER?

                  STAGGER              ACE
Learns            Boolean classifier   Search control preference function for a sequence of decisions in a class of problems
Represents        Weighted booleans    Weighted linear function
Supervised        Yes                  No
New elements      Failure-driven       Success-driven
Initial bias      Yes                  Under construction
Real attributes   Yes                  No

[Schlimmer 1987]

How is ACE like SAGE.2?

Both learn search control from unsupervised experience, reinforce decisions on a successful path, gradually introduce new factors, specify a threshold, and transfer to harder problems, but…

                        SAGE.2            ACE
Learns on               Same task         Different problems in a class
Represents              Symbolic rules    Weighted linear function
Reinforces              Repeating rules   Correct comments
Failure response        Revise            Reduce weight
Proportional to error   No                Yes
Compares states         Yes               No
Random benchmarks       No                Yes
Subgoals                No                Yes
Learns during search    Yes               No

[Langley 1985]