Dimensions in Synthesis Sumit Gulwani sumitg@microsoft.com Microsoft Research, Redmond May 2012.

Dimensions in Synthesis

Sumit Gulwani, sumitg@microsoft.com

Microsoft Research, Redmond

May 2012

Program Synthesis

• Synthesize a program in some underlying language from user intent using some search technique.
• Why today?
  – Variety of (cheap) computational devices and platforms
    • Billions of non-experts have access to these devices!
  – Enabling technology is now available
    • Better search algorithms
    • Faster machines (good application for multi-cores)

Dimensions in Synthesis

PPDP 2010: “Dimensions in Program Synthesis”, Gulwani.

• Concept Language (Application)
  – Programs
    • Straight-line programs
  – Automata
  – Queries
  – Sequences
• User Intent (Ambiguity)
  – Logic, Natural Language
  – Examples, Demonstrations/Traces
• Search Technique (Algorithm)
  – SAT/SMT solvers (Formal Methods)
  – A*-style goal-directed search (AI)
  – Version space algebras (Machine Learning)

Compilers vs. Synthesizers

Dimension        | Compilers                                                  | Synthesizers
Concept Language | Executable program                                         | Variety of concepts: program, automata, query, sequence
User Intent      | Structured language                                        | Variety/mixed form of constraints: logic, examples, traces
Search Technique | Syntax-directed translation (no new algorithmic insights)  | Uses some kind of search (discovers new algorithmic insights)

Potential Users of Synthesis Technology

• Students and Teachers
• End-Users
• Algorithm Designers
• Software Developers

Most Transformational Target

Most Useful Target

• Vision for End-users: Enable people to have (automated) personal assistants.
• Vision for Education: Enable every student to have access to free & high-quality education.

Organization

Lecture 1: Algorithms
• Synthesis of Straight-line Programs from Logic
  – Bit-vector Algorithms
  – Geometry Constructions

Lecture 2: Applications
• Intelligent Tutoring Systems

Lecture 3: Ambiguity
• Synthesis from Examples & Keywords

Lab

Intelligent Tutoring Systems

Technical Goals:
• Identify a useful task that can be formalized as a synthesis problem.
• Propose an appropriate user interaction model.
• Propose an appropriate search technique.

Synthesizing Bitvector Algorithms

PLDI 2011: Gulwani, Jha, Tiwari, Venkatesan

Bitvector Algorithms

Straight-line programs that use
  – Arithmetic Operators: +, -, *, /
  – Logical Operators: Bitwise and/or/not, Shift left/right

Examples of Bitvector Algorithms

Turn-off rightmost 1-bit: Z & (Z-1)

  Z         = 1 0 1 0 1 1 0 0
  Z-1       = 1 0 1 0 1 0 1 1
  Z & (Z-1) = 1 0 1 0 1 0 0 0
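The turn-off-rightmost-1-bit example can be replayed directly. This is a minimal Python sketch; the function name `turn_off_rightmost_one` is an illustrative choice, not something from the slides.

```python
def turn_off_rightmost_one(z: int) -> int:
    """Clear the lowest set bit of a non-negative integer: Z & (Z-1)."""
    return z & (z - 1)

z = 0b10101100
print(format(z, "08b"))                          # 10101100
print(format(z - 1, "08b"))                      # 10101011
print(format(turn_off_rightmost_one(z), "08b"))  # 10101000
```

Subtracting 1 flips the rightmost 1-bit and every 0 below it, so the AND keeps only the bits above that position.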

Examples of Bitvector Algorithms

Turn-off rightmost contiguous sequence of 1-bits: Z & (1 + (Z | (Z-1)))

  Z                     = 1 0 1 0 1 1 0 0
  Z & (1 + (Z | (Z-1))) = 1 0 1 0 0 0 0 0

Ceil of average of two integers without overflowing: (Y|Z) - ((Y⊕Z) >> 1)
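Both formulas on this slide can be checked exhaustively on small words. The sketch below assumes 8-bit non-negative inputs for the check, and the function names are illustrative. Python integers do not overflow, so the "without overflowing" property shows up only in that every intermediate value stays within the operands' width.

```python
def turn_off_rightmost_run(z: int) -> int:
    """Clear the rightmost contiguous run of 1-bits: Z & (1 + (Z | (Z-1)))."""
    return z & (1 + (z | (z - 1)))

def ceil_avg(y: int, z: int) -> int:
    """Ceiling of (y + z) / 2 computed as (Y|Z) - ((Y xor Z) >> 1)."""
    return (y | z) - ((y ^ z) >> 1)

print(format(turn_off_rightmost_run(0b10101100), "08b"))  # 10100000
print(ceil_avg(3, 4))                                      # 4
```

The average formula follows from the identity y + z = 2*(y|z) - (y^z), which is why no intermediate sum of y and z is ever formed.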

Examples of Bitvector Algorithms

Higher order half of product of x and y

  o1 := and(x,0xFFFF);
  o2 := shr(x,16);
  o3 := and(y,0xFFFF);
  o4 := shr(y,16);
  o5 := mul(o1,o3);
  o6 := mul(o2,o3);
  o7 := mul(o1,o4);
  o8 := mul(o2,o4);
  o9 := shr(o5,16);
  o10 := add(o6,o9);
  o11 := and(o10,0xFFFF);
  o12 := shr(o10,16);
  o13 := add(o7,o11);
  o14 := shr(o13,16);
  o15 := add(o14,o12);
  res := add(o15,o8);

Round up to next highest power of 2

  o1 := sub(x,1);
  o2 := shr(o1,1);
  o3 := or(o1,o2);
  o4 := shr(o3,2);
  o5 := or(o3,o4);
  o6 := shr(o5,4);
  o7 := or(o5,o6);
  o8 := shr(o7,8);
  o9 := or(o7,o8);
  o10 := shr(o9,16);
  o11 := or(o9,o10);
  res := add(o11,1);
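The two straight-line programs can be transliterated one operation per line and compared against reference answers. This sketch assumes 32-bit unsigned words, emulated with a mask because Python integers are unbounded. The last step of the rounding program adds 1 to `o11`, the fully smeared value, on the assumption that this is the intended final operand.

```python
M32 = 0xFFFFFFFF  # emulate 32-bit unsigned arithmetic

def mulhi32(x: int, y: int) -> int:
    """Higher-order 32 bits of the 64-bit product x * y."""
    o1 = x & 0xFFFF
    o2 = x >> 16
    o3 = y & 0xFFFF
    o4 = y >> 16
    o5 = (o1 * o3) & M32
    o6 = (o2 * o3) & M32
    o7 = (o1 * o4) & M32
    o8 = (o2 * o4) & M32
    o9 = o5 >> 16
    o10 = (o6 + o9) & M32
    o11 = o10 & 0xFFFF
    o12 = o10 >> 16
    o13 = (o7 + o11) & M32
    o14 = o13 >> 16
    o15 = (o14 + o12) & M32
    return (o15 + o8) & M32

def round_up_pow2(x: int) -> int:
    """Smallest power of 2 that is >= x, for 1 <= x <= 2**31."""
    o1 = (x - 1) & M32
    o3 = o1 | (o1 >> 1)
    o5 = o3 | (o3 >> 2)
    o7 = o5 | (o5 >> 4)
    o9 = o7 | (o7 >> 8)
    o11 = o9 | (o9 >> 16)
    return (o11 + 1) & M32

print(hex(mulhi32(0xFFFFFFFF, 0xFFFFFFFF)))  # 0xfffffffe
print(round_up_pow2(1000))                   # 1024
```

In `round_up_pow2` each shift-and-or pair from the slide is folded into one line; the sequence of values `o1, o3, ..., o11` is unchanged.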

Problem Definition

Given:
• Specification of desired functionality
• Specification of library components f1, …, fn

Synthesize a straight-line program

  O_π1 := f_π1(V_π1); … ; O_πn := f_πn(V_πn);

where
• Each variable in V_πj is either an input or some O_πk where k<j
• π is a permutation of 1...n

that meets the desired specification.

Problem Definition: Turn-off rightmost 1 bit

• Specification of desired functionality
• Specification of library components

Synthesis Constraint

• Verification Constraint
• Synthesis Constraint

Idea # 1: Reduce Second-order Quantification in Synthesis Constraint to First Order

The second-order quantified variable represents which component goes on which location (line #) and from which location it gets its input arguments. We encode this by location variables L.

Example: Possible programs that use 2 components and their Representation using Location Variables

Encoding Well-formedness of Programs

The following constraint ensures that L assignments correspond to well-formed programs.

• Consistency Constraint: Every line in the program should have at most one component.
• Acyclicity Constraint: A variable should be initialized before being used.

Encoding data-flow

The following constraint describes connections between inputs and outputs of various components.
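The consistency and acyclicity constraints can be illustrated by enumerating every location assignment for a two-component library and keeping the well-formed ones. Everything concrete in this sketch is an assumption made for illustration: the component names `dec` and `and`, a single program input at location 0, and the tuple layout of an assignment. The slides encode the same conditions as logical constraints over L rather than as Python checks.

```python
from itertools import permutations, product

# Hypothetical component library: (name, number of inputs).
COMPONENTS = [("dec", 1), ("and", 2)]
NUM_INPUTS = 1  # the program input occupies location 0

def well_formed(out_loc, in_locs):
    """out_loc[i]: line holding component i; in_locs[i]: locations feeding its inputs."""
    n = len(out_loc)
    lines = range(NUM_INPUTS, NUM_INPUTS + n)
    # Consistency: every component sits on a valid line and no line is shared.
    consistent = len(set(out_loc)) == n and all(l in lines for l in out_loc)
    # Acyclicity: each input is read from a strictly earlier location.
    acyclic = all(0 <= src < out_loc[i] for i in range(n) for src in in_locs[i])
    return consistent and acyclic

def enumerate_programs():
    n = len(COMPONENTS)
    locs = range(NUM_INPUTS + n)
    for out_loc in permutations(range(NUM_INPUTS, NUM_INPUTS + n)):
        choices = [list(product(locs, repeat=arity)) for _, arity in COMPONENTS]
        for in_locs in product(*choices):
            if well_formed(out_loc, in_locs):
                yield out_loc, in_locs

programs = list(enumerate_programs())
for out_loc, in_locs in programs:
    print(out_loc, in_locs)
```

Each printed pair is one possible program: `out_loc` says which line each component occupies, and `in_locs` says where each of its arguments is read from, which is the data-flow information the connection constraint turns into equalities.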

Idea # 2: Using CEGIS style procedure to solve the Synthesis Constraint

Synthesis constraint is of the form: ∃L ∀Y F(L,Y)

1. Choose some values y1,..,yn for Y.
2. Finite Synthesis Step: solve ∃L F(L,y1) ∧ … ∧ F(L,yn).
   – No solution: Failure.
   – Solution L = S: go to step 3.
3. Verification Step: does ∀Y F(S,Y) hold? Equivalently, check ∃Y ¬F(S,Y).
   – No solution: return S.
   – Solution Y = yn+1: add yn+1 to the chosen values and go to step 2.
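The CEGIS loop can be sketched end to end on the turn-off-rightmost-1-bit problem. This is not the solver-based implementation the slides describe: brute-force enumeration of wirings stands in for the finite synthesis query, and an exhaustive check over all 4-bit inputs stands in for the verification query. The 4-bit width, the two-component library, and all function names are assumptions made for the example.

```python
from itertools import permutations, product

WIDTH = 4
MASK = (1 << WIDTH) - 1

# Hypothetical component library: (name, arity, semantics).
LIB = [
    ("dec", 1, lambda a: (a - 1) & MASK),
    ("and", 2, lambda a, b: a & b),
]

def spec(x, out):
    """Desired functionality: out equals x with its rightmost 1-bit cleared."""
    for i in range(WIDTH):
        if (x >> i) & 1:
            return out == (x & ~(1 << i) & MASK)
    return out == 0

def candidates():
    """All straight-line programs: a component order plus a wiring per line."""
    for order in permutations(range(len(LIB))):
        wirings = [list(product(range(line + 1), repeat=LIB[c][1]))
                   for line, c in enumerate(order)]
        for wiring in product(*wirings):
            yield order, wiring

def run(prog, x):
    order, wiring = prog
    vals = [x]  # location 0 holds the input
    for c, srcs in zip(order, wiring):
        vals.append(LIB[c][2](*(vals[s] for s in srcs)))
    return vals[-1]

def finite_synthesis(examples):
    """Find a program that is correct on the chosen inputs only."""
    for prog in candidates():
        if all(spec(x, run(prog, x)) for x in examples):
            return prog
    return None

def verify(prog):
    """Return a counterexample input, or None if prog meets the spec everywhere."""
    for x in range(1 << WIDTH):
        if not spec(x, run(prog, x)):
            return x
    return None

def cegis(initial=(0,)):
    examples = list(initial)
    while True:
        prog = finite_synthesis(examples)
        if prog is None:
            return None          # Failure: no program fits the examples
        cex = verify(prog)
        if cex is None:
            return prog          # verified on all inputs
        examples.append(cex)     # learn from the counterexample

prog = cegis()
print(prog)
```

Each counterexample rules out at least the current candidate, so the loop terminates on this finite search space.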

Experiments: Comparison with Brute-force Search

Program        | Brahma        | AHA
Name | lines   | iters | time  | time
P1   | 2       | 2     | 3     | 0.1
P2   | 2       | 3     | 3     | 0.1
P3   | 2       | 3     | 1     | 0.1
P4   | 2       | 2     | 3     | 0.1
P5   | 2       | 3     | 2     | 0.1
P6   | 2       | 2     | 2     | 0.1
P7   | 3       | 2     | 1     | 2
P8   | 3       | 2     | 1     | 1
P9   | 3       | 2     | 6     | 7
P10  | 3       | 14    | 76    | 10
P11  | 3       | 7     | 57    | 9
P12  | 3       | 9     | 67    | 10
P13  | 4       | 4     | 6     | X
P14  | 4       | 4     | 60    | X
P15  | 4       | 8     | 119   | X
P16  | 4       | 5     | 62    | X
P17  | 4       | 6     | 78    | 109
P18  | 6       | 5     | 46    | X
P19  | 6       | 5     | 35    | X
P20  | 7       | 6     | 108   | X
P21  | 8       | 5     | 28    | X
P22  | 8       | 8     | 279   | X
P23  | 10      | 8     | 1668  | X
P24  | 12      | 9     | 224   | X
P25  | 16      | 11    | 2779  | X

Synthesizing Geometry Constructions

PLDI 2011: Gulwani, Korthikanti, Tiwari.

Ruler/Compass based Geometry Constructions

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

[Figure: triangle XYZ with lines L1 and L2 meeting at N, the center of circle C]

Other Examples of Geometry Constructions

• Draw a regular hexagon given a side.
• Given 3 parallel lines, draw an equilateral triangle whose vertices lie on the parallel lines.
• Given 4 points, draw a square whose sides contain those points.

Significance

• Good platform for teaching logical reasoning.
  – Visual Nature:
    • Makes it more accessible.
    • Exercises both logical/visual abilities of left/right brain.
  – Fun Aspect:
    • Ruler/compass restrictions make it fun, as in sports.
• Application in dynamic geometry or animations.
  – “Constructive” geometry macros (unlike numerical methods) enable fast re-computation of derived objects from free (moving) objects.

Programming Language for Geometry Constructions

Types: Point, Line, Circle

Methods:
• Ruler(Point, Point) -> Line
• Compass(Point, Point) -> Circle
• Intersect(Circle, Circle) -> Pair of Points
• Intersect(Line, Circle) -> Pair of Points
• Intersect(Line, Line) -> Point

Geometry Program: A straight-line composition of the above methods.

Example Problem: Program

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);

[Figure: the construction on triangle XYZ, showing circles C1, C2, D1, D2, points P1, P2, R1, R2, lines L1, L2, center N, and circle C]
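The ten-step program can be executed on concrete coordinates. The sketch below assumes a floating-point representation (points as `(x, y)` tuples, lines as pairs of points, circles as `(center, radius)`), and the helper names are illustrative. Only the methods this particular program uses are implemented, so `Intersect(Line, Circle)` is omitted.

```python
import math

def compass(center, through):
    """Circle centered at `center` that passes through `through`."""
    return (center, math.dist(center, through))

def ruler(p, q):
    """Line through p and q, stored as the pair of points."""
    return (p, q)

def intersect_circles(c1, c2):
    """Both intersection points of two circles assumed to cross."""
    (x1, y1), r1 = c1
    (x2, y2), r2 = c2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1's center to the chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))    # half the chord length
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = h * (y2 - y1) / d, h * (x2 - x1) / d
    return (mx + ox, my - oy), (mx - ox, my + oy)

def intersect_lines(l1, l2):
    """Intersection point of two non-parallel lines."""
    (x1, y1), (x2, y2) = l1
    (x3, y3), (x4, y4) = l2
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def circumcircle(X, Y, Z):
    C1 = compass(X, Y)
    C2 = compass(Y, X)
    P1, P2 = intersect_circles(C1, C2)
    L1 = ruler(P1, P2)              # perpendicular bisector of XY
    D1 = compass(Z, X)
    D2 = compass(X, Z)
    R1, R2 = intersect_circles(D1, D2)
    L2 = ruler(R1, R2)              # perpendicular bisector of XZ
    N = intersect_lines(L1, L2)
    return compass(N, X)

center, radius = circumcircle((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
print(center, radius)
```

For this right triangle the circumcenter is the midpoint of the hypotenuse, so the result should be close to center (2, 1.5) and radius 2.5, up to floating-point error.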

Specification Language for Geometry Programs

Conjunction of predicates over arithmetic expressions

Predicates p := e1 = e2 | e1 ≠ e2 | e1 ≤ e2

Arithmetic Expressions e := Distance(Point, Point) | Slope(Point, Point) | e1 ± e2 | c

Example Problem: Precondition/Postcondition

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

Precondition: Slope(X,Y) ≠ Slope(X,Z) ∧ Slope(X,Y) ≠ Slope(Z,X)

Postcondition: LiesOn(X,C) ∧ LiesOn(Y,C) ∧ LiesOn(Z,C)

where LiesOn(X,C) ≡ Distance(X,Center(C)) = Radius(C)

Verification/Synthesis Problem for Geometry Programs

• Let P be a geometry program that computes outputs O from inputs I.
• Verification Problem: Check the validity of the following Hoare triple.

    Assume Pre(I);
    P
    Assert Post(I,O);

• Synthesis Problem: Given Pre(I), Post(I,O), find P such that the above Hoare triple is valid.

Approaches to Verification Problem

Pre(I), P, Post(I,O)

a) Symbolic decision procedures are complex.

Randomized Polynomial Identity Testing

• Problem: Given two polynomials P1 and P2, determine whether they are equivalent.
• The naïve deterministic algorithm of expanding polynomials to compare them term-wise is exponential.
• A simple randomized test is probabilistically sufficient:
  – Choose random values r for polynomial variables x.
  – If P1(r) ≠ P2(r), then P1 is not equivalent to P2.
  – Otherwise P1 is equivalent to P2 with high probability.
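The randomized test fits in a few lines. In this sketch the polynomials are Python callables, and evaluation is done modulo a large prime so that the sample space is a finite field. The modulus, the number of trials, and the names `probably_equal`, `lhs`, `rhs`, and `wrong` are all assumptions of the example rather than details from the slides.

```python
import random

PRIME = (1 << 61) - 1  # 2^61 - 1, a Mersenne prime used as the field size

def probably_equal(p1, p2, num_vars, trials=20, rng=random):
    """Compare two polynomials at random points of the field Z_PRIME."""
    for _ in range(trials):
        r = [rng.randrange(PRIME) for _ in range(num_vars)]
        if p1(*r) % PRIME != p2(*r) % PRIME:
            return False  # a witness: the polynomials definitely differ
    return True           # equal on every sample: equivalent with high probability

lhs = lambda x, y: (x + y) ** 2
rhs = lambda x, y: x * x + 2 * x * y + y * y
wrong = lambda x, y: x * x + y * y

print(probably_equal(lhs, rhs, 2))    # True
print(probably_equal(lhs, wrong, 2))  # False, except with negligible probability
```

A "not equal" answer is always correct. An "equal" answer can be wrong only if every random point happens to be a root of the difference polynomial, which for a low-degree polynomial over a field this large is extremely unlikely.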

Approaches to Verification Problem

Pre(I), P, Post(I,O)

a) Symbolic decision procedures are complex.

b) New efficient approach: Random Testing!
   1. Choose I’ randomly from the set { I | Pre(I) }.
   2. Compute O’ := P(I’).
   3. If O’ satisfies Post(I’,O’), output “Verified”.

Correctness Proof of (b):
• Objects constructed by P can be described using polynomial ops (+,-,*), square-root & division operator.
• The randomized polynomial identity testing algorithm lifts to square-root and division operators as well!
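Approach (b) can be sketched for the circumcircle problem. The sketch makes several assumptions: the precondition is checked as a non-collinearity test on the signed area, the postcondition uses a floating-point tolerance, and the program under test computes the circumcenter in closed form rather than through ruler/compass steps. A deliberately wrong program that uses the incenter is included to show a failing run. All names are illustrative.

```python
import math
import random

def pre(X, Y, Z):
    """Precondition: X, Y, Z are not collinear (signed area is non-zero)."""
    area2 = (Y[0] - X[0]) * (Z[1] - X[1]) - (Y[1] - X[1]) * (Z[0] - X[0])
    return abs(area2) > 1e-3

def post(X, Y, Z, C):
    """Postcondition: LiesOn(X,C), LiesOn(Y,C) and LiesOn(Z,C), up to tolerance."""
    center, radius = C
    tol = 1e-6 * (1 + radius)
    return all(abs(math.dist(p, center) - radius) <= tol for p in (X, Y, Z))

def circumcircle(X, Y, Z):
    """Program P: center is the intersection of two perpendicular bisectors."""
    ax, ay = X
    bx, by = Y
    cx, cy = Z
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    N = ((a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d,
         (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d)
    return (N, math.dist(N, X))

def incenter_circle(X, Y, Z):
    """Wrong program: uses the incenter as the center."""
    a, b, c = math.dist(Y, Z), math.dist(X, Z), math.dist(X, Y)
    s = a + b + c
    N = ((a * X[0] + b * Y[0] + c * Z[0]) / s, (a * X[1] + b * Y[1] + c * Z[1]) / s)
    return (N, math.dist(N, X))

def random_input(rng):
    while True:
        I = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(3)]
        if pre(*I):
            return I

def random_test(program, rng, trials=3):
    for _ in range(trials):
        I = random_input(rng)
        if not post(*I, program(*I)):
            return "Failed"
    return "Verified"

print(random_test(circumcircle, random.Random(0)))
print(random_test(incenter_circle, random.Random(0)))
```

The wrong program passes only on inputs where incenter and circumcenter coincide, and a randomly drawn triangle is essentially never equilateral, so a single random input is already very likely to expose it.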

Idea 1 (from Theory): Symbolic Reasoning -> Concrete

Synthesis Algorithm:
// First obtain a random input-output example.
1. Choose I’ randomly from the set { I | Pre(I) }.
2. Compute O’ s.t. Post(I’,O’) using numerical methods.
// Now obtain a construction that can generate O’ from I’ (using exhaustive search).
3. S := I’;
4. While (S does not contain O’)
5.    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
6. Output construction steps for O’.

Error Probability of the algorithm is extremely low.

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

  …
  L1 = Ruler(P1,P2);
  …
  L2 = Ruler(R1,R2);
  N = Intersect(L1,L2);
  C = Compass(N,X);

• For an equilateral △XYZ, incenter coincides with circumcenter N.
• But what are the chances of choosing a random △XYZ to be an equilateral one?

[Figure: triangle XYZ with lines L1 and L2 meeting at N, the center of circle C]

Idea 2 (from PL): High-level Abstractions

Synthesis algorithm times out because programs are large.

• Identify a library of commonly used patterns (pattern = “sequence of geometry methods”)
  – E.g., perpendicular/angular bisector, midpoint, tangent, etc.
• Replace
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
  with
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }
• Two key advantages:
  – Search space: large depth -> small depth
  – Easier to explain solutions to students.

Use of high-level abstractions reduces program size

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

Using only primitive methods (10 steps):

1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);

Using the PBisector library method (4 steps):

1. L1 = PBisector(X,Y);
2. L2 = PBisector(X,Z);
3. N = Intersect(L1,L2);
4. C = Compass(N,X);

Idea 3 (from AI): Goal Directed Search

Synthesis algorithm is inefficient because the search space is too wide and hence still huge.

• Prune forward search by using A* style heuristics. Replace
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }
  with
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods, IsGood(M(O1,O2)) }
• Example: If a method constructs a line L that passes through a desired output point, then L is “good” (i.e., worth constructing).
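The forward search, with and without the IsGood filter, can be sketched on the circumcircle problem. Everything concrete here is an assumption made for illustration: objects are tagged tuples of floats, the library is limited to `ruler`, `pbisector`, `compass` and line-line `intersect`, the goal circle is given directly instead of being computed numerically, and `goal_directed` treats the goal circle's center as the desired output point. It is a toy version of the idea, not the authors' implementation.

```python
import math
from itertools import permutations

def key(obj):
    """Hashable identity for an object; rounding absorbs float noise."""
    return (obj[0],) + tuple(round(v, 6) for v in obj[1:])

def same(o1, o2, tol=1e-6):
    return o1[0] == o2[0] and all(abs(a - b) <= tol for a, b in zip(o1[1:], o2[1:]))

def _line(a, b, px, py):
    """Line a*x + b*y + c = 0 with unit normal (a, b) through (px, py)."""
    n = math.hypot(a, b)
    a, b = a / n, b / n
    if a < -1e-12 or (abs(a) <= 1e-12 and b < 0):  # canonical sign
        a, b = -a, -b
    return ("L", a, b, -(a * px + b * py))

def ruler(p, q):
    if p[0] == q[0] == "P":
        return _line(q[2] - p[2], p[1] - q[1], p[1], p[2])

def pbisector(p, q):
    if p[0] == q[0] == "P":
        return _line(q[1] - p[1], q[2] - p[2], (p[1] + q[1]) / 2, (p[2] + q[2]) / 2)

def compass(p, q):
    if p[0] == q[0] == "P":
        return ("C", p[1], p[2], math.hypot(q[1] - p[1], q[2] - p[2]))

def intersect(l1, l2):
    if l1[0] == l2[0] == "L":
        det = l1[1] * l2[2] - l2[1] * l1[2]
        if abs(det) > 1e-9:
            return ("P", (l1[2] * l2[3] - l2[2] * l1[3]) / det,
                    (l2[1] * l1[3] - l1[1] * l2[3]) / det)

LIB_METHODS = [ruler, pbisector, compass, intersect]

def forward_search(inputs, goal, is_good=lambda obj: True, max_rounds=4):
    """Grow S with LibMethods until it contains the goal; returns (found, S)."""
    S = {key(o): o for o in inputs}
    for _ in range(max_rounds):
        new = {}
        for o1, o2 in permutations(list(S.values()), 2):
            for method in LIB_METHODS:
                obj = method(o1, o2)  # None when the argument types do not fit
                if obj is None or key(obj) in S or not is_good(obj):
                    continue
                new[key(obj)] = obj
                if same(obj, goal):
                    S.update(new)
                    return True, S
        if not new:
            break
        S.update(new)
    return False, S

def goal_directed(goal):
    """IsGood: keep lines through the goal center, the center itself, and the goal."""
    _, cx, cy, _ = goal
    def is_good(obj):
        if obj[0] == "L":
            return abs(obj[1] * cx + obj[2] * cy + obj[3]) <= 1e-6
        if obj[0] == "P":
            return same(obj, ("P", cx, cy))
        return same(obj, goal)
    return is_good

X, Y, Z = ("P", 0.0, 0.0), ("P", 4.0, 0.0), ("P", 0.0, 3.0)
goal = ("C", 2.0, 1.5, 2.5)  # circumcircle of XYZ, assumed known here

found_all, S_all = forward_search([X, Y, Z], goal)
found_good, S_good = forward_search([X, Y, Z], goal, goal_directed(goal))
print(found_all, len(S_all), found_good, len(S_good))
```

Both searches reach the goal circle, but the goal-directed one keeps far fewer objects in S, because side lines, stray circles and points other than the center are never added.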

Effectiveness of Goal-directed search

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

• L1 and L2 are immediately constructed since they pass through output point N.
• On the other hand, other lines like angular bisectors are not eagerly constructed.

[Figure: triangle XYZ with lines L1 and L2 meeting at N, the center of circle C]

Experimental Results

25 benchmark problems.

• such as: Construct a square whose extended sides pass through 4 given points.
• 18 problems took less than 1 second, 4 problems took between 1 and 3 seconds, and 3 problems took between 13 and 82 seconds.
• Idea 2 (high-level abstractions) reduces programs of size 3-45 to 3-13.
• Idea 3 (goal-directedness) improves performance by a factor of 10-1000 on most problems.

Search space Exploration: With/without goal-directedness

Optional Advance Preparation

• Lecture 2
  – Section 4 in WAMBSE 2012 keynote paper “Synthesis from Examples”, Gulwani.
• Lab
  – Section 4 in WAMBSE 2012 keynote paper.
  – NCERT Online Book Website. http://ncert.nic.in/NCERTS/textbook/textbook.htm
• Lecture 3
  – Sections 1-3 in WAMBSE 2012 keynote paper

Intelligent Tutoring Systems

• Motivation
  – Online learning sites: Khan Academy, edX, Udacity, Coursera
    • Increasing class sizes with even less personal attention
  – New technologies: Tablets/Smartphones, NUI, Cloud
• Various Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading
  – Content Entry
• Various Domains
  – K-12: Mathematics, Physics, Chemistry
  – Undergraduate: Introductory Programming, Automata Theory
  – Language Learning
