Towards a recommender system for generalizing and refining code templates
Coen De Roover, Tim Molderez
Page 1: A recommender system for generalizing and refining code templates

Towards a recommender system for generalizing and refining code templates

Coen De Roover, Tim Molderez

Page 2: A recommender system for generalizing and refining code templates

User-specified templates are ubiquitous

code transformation

code search

code generation

Page 3: A recommender system for generalizing and refining code templates

Life without templates?

manual implementations

tools taking exotic specifications

Imperative Program Transformation by Rewriting 53

It says that an assignment x := v (where v is a variable) can be replaced by x := c if the “last assignment” to v was v := c (where c is a constant). The side condition formalises the notion of “last assignment”, and will be explained later in the paper:

\[
n : (x := v) \;\Longrightarrow\; x := c
\quad\text{if}\quad
\begin{array}{l}
n \vdash A^{\triangle}\bigl(\neg\mathit{def}(v) \;U\; \mathit{def}(v) \wedge \mathit{stmt}(v := c)\bigr)\\
\mathit{conlit}(c)
\end{array}
\]

The rewrite language has several important properties:

– The specification is in the form of a rewrite system with the advantages of succinctness and intuitiveness mentioned above.

– The rewrite system works over a control flow graph representation of the program. It does this by identifying and manipulating graph blocks which are based on the idea of basic blocks but with finer granularity.

– The rewrites are executable. An implementation exists to automatically determine when the rewrite applies and to perform the transformation just from the specification.

– The relation between the conditions on the control flow graph and the operational semantics of the program seems to lend itself to formal reasoning about the transformation.

The paper is organised as follows. §2 covers earlier work in the area and provides the motivation for this work. §3 describes our method of rewriting over control graphs. §4 describes the form of side conditions for those rewrites. §5 gives three examples of common transformations and their application when given as rewrites. §6 discusses what has been achieved and possible applications of this work.

2 Background

Implementing optimising transformations is hard: building a good optimising compiler is a major effort. If a programmer wishes to adapt a compiler to a particular task, for example to improve the optimisation of certain library calls, intricate knowledge of the compiler internals is necessary. This contrasts with the description of such optimisations in textbooks [1,3,26], where they are often described in a few lines of informal English. It is not surprising, therefore, that the program transformation community has sought declarative ways of programming transformations, to enable experimentation without excessive implementation effort. The idea to describe program transformations by rewriting is almost as old as the subject itself. One early implementation can be found in the TAMPR system by Boyle, which has been under development since the early ’70s [8,9]. TAMPR starts with a specification, which is translated to pure lambda calculus, and rewriting is performed on the pure lambda expressions. Because programs

Page 4: A recommender system for generalizing and refining code templates

But specifying templates is still hard…

often requires multiple iterations

refinement: no unwanted matches

generalization: no required matches are missed

no support for editing process

code template = source code + meta-variables + matching directives

no disciplined methods for generalizing/refining templates

no automated support in the form of recommender system

Page 5: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

I’m a “structural search and replace” on steroids

Page 6: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

template

matches

Page 7: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

template

matches

Page 8: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

template

matches

[<component>]@[<directive>]

Page 9: A recommender system for generalizing and refining code templates

directive                applies to                                       semantics
match                    any                                              match is in source code, with matching type and properties
orsimple                 qualified name                                   match is any name resolving to name in template
(equals ?var)            any                                              exposes match
child, child+, child*    any                                              match is corresponding child of parent match, nested within that child (+), or either (*)
match|set                list                                             match has at least given elements, in any order
match|regexp             list                                             match has elements described by regexp
(type ?type)             type/variable declaration/reference, expression  match resolves to, is of, or declares the type of its argument
(subtype ?type),         type/variable declaration/reference              match resolves to a (transitive +, reflexive *) subtype of the given argument
  (subtype+ ?type),
  (subtype* ?type)
(refers-to ?var)         expression                                       match lexically refers to local, parameter or field denoted by its argument
(invokes ?method)        invocation expression                            match invokes given argument

[<component>]@[<directive>]

[….acceptVisitor(…)]@[(invokes ?method)]
[public void acceptVisitor(ComponentVisitor v)…]@[(equals ?method)]

constraining syntax, structure, data flow, control flow of matches

grouping of templates
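
As a rough illustration of what matching a template with meta-variables involves, here is a minimal sketch. It is not Ekeko/X's matcher: code and templates are modelled as nested tuples rather than Java ASTs, directives are left out entirely, and the names `is_metavar` and `match` are hypothetical.

```python
def is_metavar(node):
    """Meta-variables are written with a leading '?', as in the slides."""
    return isinstance(node, str) and node.startswith("?")


def match(template, code, bindings=None):
    """Return meta-variable bindings if `template` matches `code`, else None."""
    bindings = dict(bindings or {})
    if is_metavar(template):
        # A meta-variable that is already bound must bind the same subtree again.
        if template in bindings and bindings[template] != code:
            return None
        bindings[template] = code
        return bindings
    if isinstance(template, tuple) and isinstance(code, tuple):
        if len(template) != len(code):
            return None
        for t_child, c_child in zip(template, code):
            bindings = match(t_child, c_child, bindings)
            if bindings is None:
                return None
        return bindings
    # Plain leaves (keywords, names, literals) must be identical.
    return bindings if template == code else None


# Toy usage: the template "return ?v;" against the code "return age;".
result = match(("return", "?v"), ("return", ("name", "age")))
```

In this toy model, a directive such as `(invokes ?method)` would be an extra predicate checked on the matched node before the bindings are accepted.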

Page 10: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations


Page 11: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

atomic mutation: introduce-variable

    return age;   =>   return ?v;

composite mutation: generalize-aliases

    public class Book {
        private Integer count;
        public Integer getCount() {
            return count;
        }
    }

    =>

    public class Book {
        private Integer ?v1;
        public Integer getCount() {
            return [?v2]@[(refers-to ?v1)];
        }
    }
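
The difference between an atomic and a composite mutation can be sketched as follows. This is a simplified model under stated assumptions: templates are nested tuples, `introduce_variable`, `paths_to` and `generalize_all` are hypothetical names, and the composite reuses one meta-variable everywhere. The generalize-aliases operator on the slide instead introduces a second meta-variable constrained by `refers-to`.

```python
def introduce_variable(template, path, name):
    """Atomic mutation: replace the subtree at `path` with meta-variable `name`."""
    if not path:
        return name
    index, rest = path[0], path[1:]
    children = list(template)
    children[index] = introduce_variable(children[index], rest, name)
    return tuple(children)


def paths_to(template, leaf, prefix=()):
    """Yield the path (tuple of child indices) of every occurrence of `leaf`."""
    if template == leaf:
        yield prefix
    elif isinstance(template, tuple):
        for i, child in enumerate(template):
            yield from paths_to(child, leaf, prefix + (i,))


def generalize_all(template, leaf, name):
    """Composite mutation: apply introduce_variable at every occurrence of `leaf`."""
    for path in list(paths_to(template, leaf)):
        template = introduce_variable(template, path, name)
    return template


# Toy usage mirroring the slide: "return age;" becomes "return ?v;".
atomic = introduce_variable(("return", "age"), (1,), "?v")

book = ("class", "Book",
        ("field", "Integer", "count"),
        ("method", "getCount", ("return", "count")))
composite = generalize_all(book, "count", "?v1")
```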

Page 12: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

(Operator.
  "add-directive-invokes"
  operators/add-directive-invokes
  :refinement
  "Add directive invokes."
  opscope-subject
  applicability|methodinvocation
  "Requires matches to invoke the binding for the meta-variable."
  [(make-operand "Meta-variable (e.g., ?v)" opscope-variable validity|variable)])

constraints on their subject

constraints on their operands

constraints enable checking applicability of operator, validity of its operands + generating possible values!

(Operator.
  "remove-node"
  operators/remove-node
  :destructive
  "Remove from template."
  opscope-subject
  applicability|deleteable
  "Removes its selection from the template."
  [])
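
The idea that each operator carries a constraint on its subject, so that a tool can enumerate which operators apply to a selected node, can be sketched like this. The sketch is an assumption-laden toy and not the Ekeko/X implementation: the `Operator` dataclass, the tuple-based node model and both predicates are hypothetical, and only the operator names, categories and descriptions are taken from the slide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Operator:
    name: str
    category: str                        # e.g. "refinement" or "destructive"
    applicable: Callable[[object], bool]  # constraint on the operator's subject
    description: str


def is_method_invocation(node):
    # Assumption: invocation nodes are tuples tagged with "invoke".
    return isinstance(node, tuple) and node[:1] == ("invoke",)


def is_deleteable(node):
    # Assumption: in this toy model every node may be removed.
    return True


OPERATORS = [
    Operator("add-directive-invokes", "refinement", is_method_invocation,
             "Add directive invokes."),
    Operator("remove-node", "destructive", is_deleteable,
             "Remove from template."),
]


def applicable_operators(node):
    """List the names of all operators whose subject constraint accepts `node`."""
    return [op.name for op in OPERATORS if op.applicable(node)]


on_invocation = applicable_operators(("invoke", "comp", "acceptVisitor"))
on_return = applicable_operators(("return", "age"))
```

The same predicates that check applicability can also drive the search: a mutation step only needs to pick among the operators returned for a node.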

Page 13: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

given enumeration of desired matches

populate -> determine fitness -> best individual good enough?

    yes: yay!
    no:  select parents -> crossover & mutation -> determine fitness (repeat)
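
The loop on this slide (populate, determine fitness, stop when the best individual is good enough, otherwise select parents and apply crossover and mutation) can be sketched generically. This is a minimal sketch, not the authors' search: truncation selection, elitism, the parameter defaults and the bit-string toy problem are all assumptions chosen to keep the example small.

```python
import random


def evolve(fitness, random_individual, crossover, mutate,
           pop_size=20, good_enough=1.0, max_generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]  # populate
    for _ in range(max_generations):
        population.sort(key=fitness, reverse=True)                  # determine fitness
        if fitness(population[0]) >= good_enough:                   # good enough?
            break
        parents = population[: pop_size // 2]                       # select parents
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [population[0]] + children                     # keep the best so far
    return max(population, key=fitness)


# Toy problem (assumption): evolve a list of 8 bits towards all ones.
def ones_fitness(bits):
    return sum(bits) / len(bits)


def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(8)]


def one_point_crossover(a, b, rng):
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def flip_mutation(bits, rng):
    return [1 - b if rng.random() < 0.1 else b for b in bits]


best = evolve(ones_fitness, random_bits, one_point_crossover, flip_mutation)
```

In the template setting, an individual would be a group of templates and the fitness would be computed against the enumeration of desired matches.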

Page 14: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

individual = group of templates

- copies of existing group (from editor state)
- one template group per row, consisting of a template per column in desired matches (from scratch)

populate

Page 15: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

a) match each template group in population against program

b) determine precision and recall w.r.t. desired matches

c) penalize excess use of directives (KISS)

determine fitness

concurrently!
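
Steps b) and c) can be illustrated with a small fitness function. The slide does not say how precision, recall and the directive penalty are combined, so the F1 score and the linear penalty weight below are assumptions, and the function name and parameters are hypothetical.

```python
def template_fitness(found, desired, directive_count, penalty=0.01):
    """Score a template group by its matches, minus a penalty per directive.

    found:           matches produced by the template group
    desired:         matches the user enumerated as wanted
    directive_count: number of directives used in the group
    """
    found, desired = set(found), set(desired)
    hits = len(found & desired)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(desired) if desired else 0.0
    # Assumption: combine precision and recall as their harmonic mean (F1).
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    # Assumption: linear penalty, so simpler template groups score higher (KISS).
    return max(0.0, f1 - penalty * directive_count)


perfect = template_fitness({"a", "b"}, {"a", "b"}, directive_count=0)
too_general = template_fitness({"a", "b", "c", "d"}, {"a", "b"}, directive_count=0)
```

A template group that matches too much loses precision, one that misses desired matches loses recall, and the penalty breaks ties in favour of fewer directives.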

Page 16: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

[Figure: subtree crossover. Parents: (x+y)+3 and (y+1)*(x/2), each with a marked crossover point. Offspring: (x/2)+3. The remaining subtrees are discarded (labelled GARBAGE).]

Figure 2.5: Example of subtree crossover. Note that the trees on the left are actually copies of the parents. So, their genetic material can freely be used without altering the original individuals.

to crossover operations frequently exchanging only very small amounts of genetic material (i.e., small subtrees); many crossovers may in fact reduce to simply swapping two leaves. To counter this, Koza (1992) suggested the widely used approach of choosing functions 90% of the time and leaves 10% of the time. Many other types of crossover and mutation of GP trees are possible. They will be described in Sections 5.2 and 5.3, pages 42–46.

The most commonly used form of mutation in GP (which we will call subtree mutation) randomly selects a mutation point in a tree and substitutes the subtree rooted there with a randomly generated subtree. This is illustrated in Figure 2.6. Subtree mutation is sometimes implemented as crossover between a program and a newly generated random program; this operation is also known as “headless chicken” crossover (Angeline, 1997).

Another common form of mutation is point mutation, which is GP’s rough equivalent of the bit-flip mutation used in genetic algorithms (Goldberg, 1989). In point mutation, a random node is selected and the primitive stored there is replaced with a different random primitive of the same arity taken from the primitive set. If no other primitives with that arity exist, nothing happens to that node (but other nodes may still be mutated). When subtree mutation is applied, this involves the modification of exactly one subtree. Point mutation, on the other hand, is typically applied on a

[Genetic programming, a field guide]

select parents

mutation crossover
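
Subtree crossover as described in the excerpt can be sketched on expression trees encoded as nested tuples of the form (operator, left, right). The encoding and the helper names `get`, `replace`, `node_paths` and `subtree_crossover` are assumptions made for this illustration. The deterministic call at the end reproduces the excerpt's example, where the parents (x+y)+3 and (y+1)*(x/2) yield the offspring (x/2)+3.

```python
import random


def get(tree, path):
    """Return the subtree reached by following child indices in `path`."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]


def node_paths(tree, prefix=()):
    """Yield the path of every node; index 0 of a tuple holds the operator."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from node_paths(child, prefix + (i,))


def subtree_crossover(a, b, rng):
    """Replace a random subtree of `a` with a random subtree of `b`."""
    path_a = rng.choice(list(node_paths(a)))
    path_b = rng.choice(list(node_paths(b)))
    return replace(a, path_a, get(b, path_b))


parent_a = ("+", ("+", "x", "y"), 3)               # (x+y)+3
parent_b = ("*", ("+", "y", 1), ("/", "x", 2))     # (y+1)*(x/2)

# Crossover points chosen by hand to mirror the excerpt's figure.
offspring = replace(parent_a, (1,), get(parent_b, (2,)))

# Randomised variant; the parents themselves are never modified.
random_offspring = subtree_crossover(parent_a, parent_b, random.Random(0))
```

Because tuples are immutable, the parents stay intact, which corresponds to the excerpt's remark that crossover works on copies. Subtree mutation follows the same pattern, with a randomly generated tree taking the place of the second parent.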

Page 17: A recommender system for generalizing and refining code templates

1/ advanced code templates in Ekeko/X
2/ formal operators for template mutation
3/ genetic search for mutation recommendations

best templates after 30 iterations

[public void acceptVisitor(ComponentVisitor v);]@[(invoked-by ?v17892744)]
comp.acceptVisitor(v)
0.91

[public void acceptVisitor(ComponentVisitor v){...}]@[(invoked-by ?v20420073)]
comp.acceptVisitor(v)
0.91

[public void acceptVisitor(ComponentVisitor v) ??v23406365]@[(invoked-by ??v23499077)]
?v23184877(v)
0.90

desired

[….acceptVisitor(…)]@[(invokes ?method)]
[public void acceptVisitor(ComponentVisitor v)…]@[(equals ?method)]

Page 18: A recommender system for generalizing and refining code templates

Ongoing Experiment

RQ1: how effective is the search in finding template changes?

RQ2: do users find the recommended changes helpful?

RQ3: do composite, template-specific mutations converge more quickly to a solution than generic code mutations?

fine-tuning: very sensitive to probabilities of crossover and mutation, quality of RNG, diversity in population, …

Page 19: A recommender system for generalizing and refining code templates
Page 20: A recommender system for generalizing and refining code templates

recommending edits