CREST Genetic Improvement Justyna Petke Centre for Research in Evolution, Search and Testing University College London

Genetic Improvement - phdopen.mimuw.edu.pl/zima15/petke-slides/gi.pdf

Oct 04, 2020


CREST

Genetic Improvement

Justyna Petke

Centre for Research in Evolution, Search and Testing, University College London


Thank you

Mark Harman Yue Jia Alexandru Marginean


What does the word “Computer” mean?

Oxford Dictionary

“a person who makes calculations, especially with a calculating machine.”

Wikipedia

“The term "computer", in use from the mid 17th century, meant "one who computes": a person performing mathematical calculations.”


in the beginning ...


The First Computer?


Different people have different opinions


Who are the programmers?


it’s always been people


lots of people


why people?

human computers seem quaint today

will human programmers seem quaint tomorrow?


programming is changing


Requirements

Functional Requirements

Non-Functional Requirements


Functional Requirements: functionality of the program

Non-Functional Requirements: memory, execution time, battery, size, bandwidth


Software Design Process


Multiplicity

Multiple Devices

Conflicting Objectives

Multiple Platforms


Which requirements must be human-coded?

Functional Requirements: humans have to define these

Non-Functional Requirements


Which requirements are essential to humans?

Functional Requirements: humans have to define these

Non-Functional Requirements: we can optimise these


Pareto Front


each circle is a program found by a machine


different non-functional properties have different Pareto program fronts
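As a rough illustration of what a Pareto front is, the sketch below filters a set of candidate programs down to those that no other candidate beats on every measured property. The two properties (runtime in seconds, memory in MB) and all the numbers are made up for illustration, and both properties are assumed to be minimised.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimised)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (time_s, memory_mb) measurements for five program variants.
candidates = [(1.0, 50), (0.8, 70), (1.2, 40), (0.9, 80), (1.1, 60)]
front = pareto_front(candidates)
```

Swapping in a different pair of properties (say, energy and size) would generally select a different subset of programs, which is the point the slide makes.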


Failed Test Cases


Why can’t functional properties be optimisation objectives?


Optimisation


2.5 times faster but failed 1 test case?


double the battery life but failed 2 test cases?


Conformant GI vs. Heretical GI

Conformant GI: functional correctness is king; conforms to traditional views

Heretical GI: correctness means having sufficient resources for computation; conforms to a compelling new orthodoxy

there’s nothing correct about a flat battery


can it work?


Software Uniqueness

500,000,000 LoC

one has to write approximately 6 statements before one is writing unique code

M. Gabel and Z. Su. A study of the uniqueness of source code. (FSE 2010)

The space of candidate programs is far smaller than we might suppose.


Software Robustness

after one-line changes, up to 89% of programs that compile run without error

W. B. Langdon and J. Petke. Software is Not Fragile. (CS-DC 2015)

Software engineering artefacts are more robust than is often assumed.


Genetic Improvement for Software Specialisation


Question

Can we improve the efficiency of an already highly-optimised piece of software using genetic programming?


Contributions

Introduction of multi-donor software transplantation

Use of genetic improvement as a means to specialise software


Genetic Improvement


Program Representation

Changes at the level of lines of source code

Each individual is composed of a list of changes

Specialised grammar used to preserve syntax
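One way to picture this representation is sketched below: an individual is just a list of line-level edits, and a program variant is obtained by replaying those edits against the original source. The `Edit` type, the `apply_edits` helper and the choice to place copied lines in front of the target line are all assumptions made for illustration; the sketch also works on raw lines and does not model the specialised grammar that keeps variants syntactically valid.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Edit:
    op: str                       # "delete", "copy" or "replace"
    target: int                   # index of the host line the edit applies to
    source: Optional[int] = None  # index into the code bank (unused by delete)


def apply_edits(host: List[str], bank: List[str], edits: List[Edit]) -> List[str]:
    """Replay the edits against the original host line numbering."""
    replaced = {}  # target index -> replacement line, or None if deleted
    inserted = {}  # target index -> lines copied in front of that line
    for e in edits:
        if e.op == "delete":
            replaced[e.target] = None
        elif e.op == "replace":
            replaced[e.target] = bank[e.source]
        elif e.op == "copy":
            inserted.setdefault(e.target, []).append(bank[e.source])
    out = []
    for i, line in enumerate(host):
        out.extend(inserted.get(i, []))
        new_line = replaced.get(i, line)
        if new_line is not None:
            out.append(new_line)
    return out


# Hypothetical four-line host program and a code bank with one extra line.
host = ["a = 1", "assert a == 1", "b = a + 1", "print(b)"]
bank = host + ["b = a * 2"]
individual = [Edit("delete", 1), Edit("replace", 2, 4)]
variant = apply_edits(host, bank, individual)
```

Because the individual stores only the changes, the original program is never modified and an empty list of edits reproduces it exactly.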



Code Transplants

GP has access to both:

• the host program to be evolved

• the donor program(s)

code bank contains all lines of source code GP has access to
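A minimal sketch of how such a code bank might be assembled is given below. The helper name `build_code_bank` and the decision to add only donor lines that are not already present are assumptions for illustration, not details taken from the slides.

```python
def build_code_bank(host_lines, donor_programs):
    """Host lines first, then each donor line that is not already in the bank."""
    bank = list(host_lines)
    seen = set(bank)
    for donor_lines in donor_programs:
        for line in donor_lines:
            if line not in seen:
                seen.add(line)
                bank.append(line)
    return bank


# Hypothetical host and two hypothetical donors that overlap with it.
host = ["x = 0;", "x++;", "return x;"]
donor_a = ["x = 0;", "x += 2;"]
donor_b = ["x += 2;", "x *= 3;"]
bank = build_code_bank(host, [donor_a, donor_b])
```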


Mutation

Addition of one of the following operations:

delete

copy

replace
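Mutation as described above can be sketched as appending one randomly chosen operation to an individual's list of changes. The tuple layout `(op, target, source)`, the uniform random choices and the sizes used in the example are assumptions for illustration only.

```python
import random

OPS = ("delete", "copy", "replace")


def mutate(individual, host_len, bank_len, rng):
    """Return a new individual with one random edit appended."""
    op = rng.choice(OPS)
    target = rng.randrange(host_len)
    # delete needs no source line; copy and replace draw one from the code bank
    source = None if op == "delete" else rng.randrange(bank_len)
    return individual + [(op, target, source)]


rng = random.Random(0)
child = mutate([], host_len=10, bank_len=15, rng=rng)
grandchild = mutate(child, host_len=10, bank_len=15, rng=rng)
```

Each call leaves its input untouched, so a parent and its mutated child can sit in the same population.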



Crossover

Concatenation of two individuals by appending their two lists of mutations
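In code, this form of crossover is a plain list concatenation. The sketch below assumes each individual is a list of `(op, target, source)` tuples, which is an illustrative encoding rather than the one used in the actual tool.

```python
def crossover(parent_a, parent_b):
    """Child carries all edits of the first parent followed by all of the second."""
    return list(parent_a) + list(parent_b)


a = [("delete", 3, None)]
b = [("replace", 7, 12), ("copy", 1, 4)]
child = crossover(a, b)
```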


Fitness

Based on solution quality and efficiency in terms of lines of source code

Avoids environmental bias


Fitness

Test cases are sorted into groups

One test case is sampled uniformly from each group

Avoids overfitting
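The sampling step can be sketched as follows. How the test cases are grouped is not stated on the slide, so the three groups below are purely hypothetical, as is the helper name `sample_tests`.

```python
import random


def sample_tests(groups, rng):
    """Pick one test case uniformly at random from each non-empty group."""
    return [rng.choice(group) for group in groups if group]


# Hypothetical grouping of six test cases into three groups.
groups = [["t1", "t2", "t3"], ["t4", "t5"], ["t6"]]
rng = random.Random(1)
batch = sample_tests(groups, rng)
```

Drawing a fresh batch for each evaluation round means no single test case is seen every time, which is one plausible reading of how the scheme limits overfitting.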


Selection

Fixed number of generations

Fixed population size

Initial population contains single-mutation individuals


Selection

Top half of population selected

Based on a threshold fitness value

Mutation and Crossover applied
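A minimal sketch of one generation under these rules is shown below. The 50/50 split between mutation and crossover, the fallback when no individual reaches the threshold, and the toy individuals are all assumptions; the slides only state that the top half is selected, that a fitness threshold applies, and that the population size is fixed.

```python
import random


def next_generation(population, fitness, threshold, mutate, crossover, rng):
    """Breed a same-sized population from the top half that meets the threshold."""
    size = len(population)
    ranked = sorted(population, key=fitness, reverse=True)
    parents = [ind for ind in ranked[: size // 2] if fitness(ind) >= threshold]
    if not parents:
        return list(population)  # nothing qualified; keep the old population
    children = []
    while len(children) < size:
        if rng.random() < 0.5:
            children.append(mutate(rng.choice(parents), rng))
        else:
            children.append(crossover(rng.choice(parents), rng.choice(parents)))
    return children


# Toy stand-ins: an individual is a list of numbers and fitness is their sum.
def toy_mutate(ind, rng):
    return ind + [rng.randrange(10)]


def toy_crossover(a, b):
    return a + b


population = [[1], [5], [3], [9]]
new_population = next_generation(
    population, sum, 4, toy_mutate, toy_crossover, random.Random(0)
)
```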



Filtering

Mutations in best individuals are often independent

Greedy approach used to combine best individuals
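One plausible reading of the greedy step is sketched below: mutations harvested from the best individuals are tried one at a time and kept only if the combined variant scores better. The acceptance rule, the ordering of the mutations and the additive scoring function are assumptions for illustration.

```python
def greedy_combine(mutations, evaluate):
    """Add mutations one at a time, keeping each only if the score improves."""
    kept = []
    best = evaluate(kept)
    for m in mutations:
        score = evaluate(kept + [m])
        if score > best:
            kept.append(m)
            best = score
    return kept, best


# Hypothetical per-mutation gains; independence makes the total simply additive.
gains = {"m1": 3, "m2": -1, "m3": 2, "m4": 0}
ordered = sorted(gains, key=gains.get, reverse=True)
kept, best = greedy_combine(ordered, lambda edits: sum(gains[m] for m in edits))
```

The approach only makes sense because the mutations are often independent: if they interacted strongly, adding them one by one could miss good combinations.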


Motivation for choosing a SAT solver

Boolean satisfiability (SAT) example:

x1 ∨ x2 ∨ ¬x4

¬x2 ∨ ¬x3

• xi : a Boolean variable

• xi, ¬xi : a literal

• ¬x2 ∨ ¬x3 : a clause
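To make the example concrete, the sketch below encodes the two clauses with signed integers (a common convention, as in the DIMACS CNF format: 1 means x1 and -4 means ¬x4) and enumerates every assignment by brute force. This is only workable for tiny formulas and says nothing about how MiniSAT itself searches.

```python
from itertools import product

# (x1 ∨ x2 ∨ ¬x4) ∧ (¬x2 ∨ ¬x3)
clauses = [[1, 2, -4], [-2, -3]]


def satisfies(assignment, clauses):
    """True if every clause contains at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def all_models(clauses, n_vars):
    """Enumerate every satisfying assignment by brute force."""
    models = []
    for values in product([False, True], repeat=n_vars):
        assignment = dict(enumerate(values, start=1))
        if satisfies(assignment, clauses):
            models.append(assignment)
    return models


models = all_models(clauses, n_vars=4)
```

Of the 16 possible assignments, 10 satisfy both clauses; the formula is therefore satisfiable.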


Motivation for choosing a SAT solver

Bounded Model Checking

Planning

Software Verification

Automatic Test Pattern Generation

Combinational Equivalence Checking

Combinatorial Interaction Testing

and many other applications


Motivation for choosing a SAT solver

MiniSAT-hack track in SAT solver competitions

- good source for software transplants


Question

Can we evolve a version of the MiniSAT solver that is faster than any of the human-improved versions of the solver?


Experiments: Setup

Solvers used:

MiniSAT2-070721

Test cases used:

∼ 2.5% improvement for general benchmarks (SSBSE’13)


Question

Can we evolve a version of the MiniSAT solver that is faster than any of the human-improved versions of the solver for a particular problem class?


Experiments: Setup

Solvers used:

MiniSAT2-070721

Test cases used:

from Combinatorial Interaction Testing field


Combinatorial Interaction Testing

Use of SAT-solvers limited due to poor scalability

SAT benchmarks containing millions of clauses

It takes hours to days to generate a CIT test suite using SAT


Experiments: Setup

Host program:

MiniSAT2-070721 (478 lines in main algorithm)

Donor programs:

MiniSAT-best09 (winner of ’09 MiniSAT-hack competition)

MiniSAT-bestCIT (best for CIT from ’09 competition)

- total of 104 new lines


Results

Solver                  Donor           Lines  Seconds
MiniSAT (original)      -               1.00   1.00
MiniSAT-best09          -               1.46   1.76
MiniSAT-bestCIT         -               0.72   0.87
MiniSAT-best09+bestCIT  -               1.26   1.63


Question

How much runtime improvement can we achieve?


Results

Solver                  Donor           Lines  Seconds
MiniSAT (original)      -               1.00   1.00
MiniSAT-best09          -               1.46   1.76
MiniSAT-bestCIT         -               0.72   0.87
MiniSAT-best09+bestCIT  -               1.26   1.63
MiniSAT-gp              best09          0.93   0.95


Results

Donor: best09

13 delete, 9 replace, 1 copy

Among changes:

3 assertions removed

1 deletion on variable used for statistics


Results

Mainly if and for statements switched off

Decreased iteration count in for loops


Results

Solver                  Donor           Lines  Seconds
MiniSAT (original)      -               1.00   1.00
MiniSAT-best09          -               1.46   1.76
MiniSAT-bestCIT         -               0.72   0.87
MiniSAT-best09+bestCIT  -               1.26   1.63
MiniSAT-gp              best09          0.93   0.95
MiniSAT-gp              bestCIT         0.72   0.87


Results

Donor: bestCIT

1 delete, 1 replace

Among changes:

1 assertion deletion

1 replace operation triggers 95% of donor code


Results

Solver                  Donor           Lines  Seconds
MiniSAT (original)      -               1.00   1.00
MiniSAT-best09          -               1.46   1.76
MiniSAT-bestCIT         -               0.72   0.87
MiniSAT-best09+bestCIT  -               1.26   1.63
MiniSAT-gp              best09          0.93   0.95
MiniSAT-gp              bestCIT         0.72   0.87
MiniSAT-gp              best09+bestCIT  0.94   0.96


Results

Donor: best09+bestCIT

50 delete, 20 replace, 5 copy

Among changes:

5 assertions removed

4 semantically equivalent replacements

3 operations used for statistics removed

∼ half of the mutations remove dead code


Results

Solver                  Donor           Lines  Seconds
MiniSAT (original)      -               1.00   1.00
MiniSAT-best09          -               1.46   1.76
MiniSAT-bestCIT         -               0.72   0.87
MiniSAT-best09+bestCIT  -               1.26   1.63
MiniSAT-gp              best09          0.93   0.95
MiniSAT-gp              bestCIT         0.72   0.87
MiniSAT-gp              best09+bestCIT  0.94   0.96
MiniSAT-gp-combined     best09+bestCIT  0.54   0.83


Results

Combining results:

37 delete, 15 replace, 4 copy

56 out of 100 mutations used

Among changes:

8 assertions removed

95% of the bestCIT donor code executed


Conclusions

Introduced multi-donor software transplantation

Used genetic improvement as a means to specialise software

Achieved 17% runtime improvement on MiniSAT for the Combinatorial Interaction Testing domain by combining best individuals
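The 17% figure can be read off the Seconds column of the results table, assuming that column reports runtime relative to the original MiniSAT (the original's value of 1.00 suggests this, but the slides do not say so explicitly). A small check:

```python
def relative_saving(relative_runtime):
    """Fraction of runtime saved, given runtime relative to a baseline of 1.0."""
    return 1.0 - relative_runtime


# Seconds values taken from the results table.
combined = relative_saving(0.83)  # MiniSAT-gp-combined
best_cit = relative_saving(0.87)  # MiniSAT-bestCIT, best for CIT from the '09 competition
```

Under the same assumption, the combined variant saves about 17% against the original, compared with about 13% for MiniSAT-bestCIT.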


Inter-version transplantation: MiniSat

[Diagram labels: GP, Programs, MiniSat (v1, v2, ..., vn), Improved, Non-functional property, Test, Fitness, Sensitivity Analysis]

Multi-donor transplant, specialised for CIT

17% faster

GECCO Humie silver medal

Justyna Petke, Mark Harman, William B. Langdon and Westley Weimer. Using Genetic Improvement & Code Transplants to Specialise a C++ Program to a Problem Class (EuroGP’14)


Genetic Improvement of Programs: Bowtie2

[Diagram labels: GP, Programs, Bowtie2, Non-functional property, Test, Fitness, Sensitivity Analysis]

70 times faster

30+ interventions

HC clean up: 7

slight semantic improvement

W. B. Langdon and M. Harman. Optimising Existing Software with Genetic Programming. TEC 2015


Cuda

Genetic Improvement of Programs

[Diagram: GP, Programs, Cuda, Improved, Fitness, Test, Non-functional property, Sensitivity Analysis]

W. B. Langdon and M. Harman. Genetically Improved CUDA C++ Software. EuroGP 2014

7 times faster
updated for new hardware
automated updating


Memory speed trade offs

[Diagram: GP, Fitness, Test, Non-functional property, Sensitivity Analysis; System with malloc, System with optimised malloc]

Fan Wu, Westley Weimer, Mark Harman, Yue Jia and Jens Krinke. Deep Parameter Optimisation. Conference on Genetic and Evolutionary Computation (GECCO 2015)

Improve execution time by 12% or achieve a 21% memory consumption reduction


Reducing energy consumption

[Diagram: GP, Fitness, Test, Non-functional property, Sensitivity Analysis; MiniSat (CIT, AProVE, Ensemble) and Improved MiniSat (CIT, AProVE, Ensemble)]

Bobby R. Bruce, Justyna Petke, Mark Harman. Reducing Energy Consumption Using Genetic Improvement. Conference on Genetic and Evolutionary Computation (GECCO 2015)

Energy consumption can be reduced by as much as 25%


Grow and graft new functionality

[Diagram: GP, Fitness, Test, Non-functional property, Sensitivity; Human Knowledge, Feature, Grow, Graft, Host System]

Mark Harman, Yue Jia and Bill Langdon. Babel Pidgin: SBSE can grow and graft entirely new functionality into a real world system. Symposium on Search-Based Software Engineering, SSBSE 2014 (Challenge track)

Challenge Track Award


Real world cross system transplantation

[Diagram: GP, Fitness, Test, Non-functional property, Sensitivity Analysis; Donor with feature, Host, Host’ with feature]

Earl T. Barr, Mark Harman, Yue Jia, Alexandru Marginean, and Justyna Petke. Automated Software Transplantation (ISSTA 2015)

Successfully autotransplanted new functionality and passed all regression tests for 12 out of 15 real world systems


Automated Software Transplantation

E.T. Barr, M. Harman, Y. Jia, A. Marginean & J. Petke

ACM Distinguished Paper Award at ISSTA 2015

coverage in

article in

2647 shares of


Why Autotransplantation?

Video Player: Why not handle H.264?

Start from scratch

Check open source repositories

~100 players


Human Organ Transplantation


Automated Software Transplantation

[Diagram: Donor containing the Organ with its ENTRY and vein (V); Host receiving the Organ; Organ Test Suite]

Manual Work:
Organ Entry
Organ’s Test Suite
Implantation Point


μTrans

Inputs: Host, Donor, Organ Test Suite, Implantation Point, Organ Entry

Stage 1: Static Analysis
Stage 2: Genetic Programming
Stage 3: Organ Implantation

Output: Host Beneficiary


Stage 1: Static Analysis

[Diagram: Donor with organ entry (OE), ENTRY, Vein and Organ; Host with Implantation Point]

Dependency Graph: Stm: x = 10; -> Decl: int x;
Matching Table: Donor: int X -> Host: int A, B, C
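The matching table can be pictured as a map from each donor variable to the host variables of the same type that are in scope at the implantation point. The following is a minimal Python sketch under that reading; the function name `build_matching_table`, the (name, type) representation, and the use of exact type equality as "compatibility" are assumptions made for illustration, not details taken from the tool.

```python
from collections import defaultdict

def build_matching_table(donor_vars, host_vars_in_scope):
    """Map each donor variable to the type-compatible host variables.

    Both arguments are lists of (name, type) pairs. Type compatibility is
    simplified to exact type equality (an assumption of this sketch).
    """
    by_type = defaultdict(list)
    for name, ctype in host_vars_in_scope:
        by_type[ctype].append(name)
    return {name: list(by_type[ctype]) for name, ctype in donor_vars}

# The slide's example: donor "int X" may bind to host ints A, B or C.
table = build_matching_table(
    donor_vars=[("X", "int")],
    host_vars_in_scope=[("A", "int"), ("B", "int"), ("C", "int"), ("buf", "char *")],
)
print(table)  # {'X': ['A', 'B', 'C']}
```

A donor variable with no same-typed host variable gets an empty candidate list, which a search stage would have to treat as unbindable.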


Stage 2: GP (Genetic Programming)

Statements: S1 S2 S3 S4 S5 … Sn

Matching Table
Donor Variable ID    Host Variable ID (set)
V1D                  V1H V2H
V2D                  V3H V4H V5H

Individual
Var Matching    M1: V1D -> V1H    M2: V2D -> V4H
Statements      S1 S7 S73

Algorithm 1: Generate the initial population P; the function choose returns an element from a set uniformly at random.

Input: V, the organ vein; SD, the donor symbol table; OM, the host type-to-variable map; Sp, the size of population; v, statements in individual; m, mappings in individual.

1: P := ∅
2: for i := 1 to Sp do
3:     m, v := ∅, ∅
4:     for all sd ∈ SD do
5:         sh := choose(OM[sd])
6:         m := m ∪ {sd → sh}
7:     v := { choose(V) }
8:     P := P ∪ {(m, v)}
9: return P

variables used in donor at LO. For each individual, Alg. 1 first uniformly selects a type compatible binding from the host’s variables in scope at the implantation point to each of the organ’s parameters. We then uniformly select one statement from the organ, including its vein, and add it to the individual. The GP system records which statements have been selected and favours statements that have not yet been selected.
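Algorithm 1 can be sketched in Python as follows. This is an illustrative reading, not the tool's code: the dictionary-based data layout and all identifiers are assumptions, the host map is indexed by the donor symbol's type, and the bias towards statements not yet selected is left out for brevity.

```python
import random

def initial_population(vein, donor_symbols, host_type_map, pop_size, rng=random):
    """Build pop_size individuals, each a (mapping, statements) pair.

    vein          -- list of candidate statements (V)
    donor_symbols -- dict: donor symbol -> type (SD)
    host_type_map -- dict: type -> host variables in scope (OM)
    """
    population = []
    for _ in range(pop_size):
        # Bind every donor symbol to a uniformly chosen, type-compatible host variable.
        mapping = {sd: rng.choice(host_type_map[ctype])
                   for sd, ctype in donor_symbols.items()}
        # Seed the individual with one uniformly chosen statement.
        statements = [rng.choice(vein)]
        population.append((mapping, statements))
    return population

pop = initial_population(
    vein=["S1", "S2", "S3"],
    donor_symbols={"V1D": "int", "V2D": "int"},
    host_type_map={"int": ["V1H", "V2H", "V3H"]},
    pop_size=4,
    rng=random.Random(0),  # fixed seed only to make the sketch repeatable
)
```

Each individual starts with a complete variable binding but only a single statement, so the statement list grows through the search operators.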

Search Operators. During GP, µTrans applies crossover with a probability of 0.5. We define two custom, crossover operators: fixed-two-points and uniform crossover. The fixed-two-points crossover is the standard fixed-point operator separately applied to the organ’s map from host variables to organ parameters and the statement vector, restricted to each vector’s centre point. The uniform crossover operator produces only one offspring, whose host to organ map is the crossover of its parents’ and whose V and O statements are the union of its parents. The use of union here is novel. Initially, we used conventional fixed-point crossover on organ and vein statements vectors, but convergence was too slow. Adopting union sped convergence, as desired. Fig. 4 shows an example of applying the crossover operators on the two individuals on the left.

After crossover, one of the two mutation operators is applied with a probability of 0.5. The first operator uniformly replaces a binding in the organ’s map from host variables to its parameters with a type compatible alternative. In our running example, say an individual currently maps the host variable curs to its N_init parameter. Since curs is not a valid array length in idct, the individual fails the organ test suite. Say the remap operator chooses to remap N_init. Since its type is int, the remap operator selects a new variable from among the int variables in scope at the insertion point, which include ‘hit_eof, curs, tos, length, ...’. The correct mapping is N_list to length; if the remap operator selects it, the resulting individual will be more fit.
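The uniform crossover operator described above (one offspring, crossed-over variable map, union of the parents' statements) might look like the Python sketch below. The per-binding coin flip is an assumed reading of "crossover of its parents' maps", and the individual representation and names are hypothetical.

```python
import random

def uniform_crossover(parent_a, parent_b, rng=random):
    """Produce one offspring from two (mapping, statements) individuals.

    Assumes both parents bind the same set of donor parameters.
    """
    map_a, stmts_a = parent_a
    map_b, stmts_b = parent_b
    # Inherit each binding from either parent with equal probability.
    child_map = {k: (map_a[k] if rng.random() < 0.5 else map_b[k]) for k in map_a}
    # Statements are the order-preserving union of both parents' statements.
    child_stmts = list(dict.fromkeys(stmts_a + stmts_b))
    return child_map, child_stmts

a = ({"V1D": "V1H", "V2D": "V4H"}, ["S1", "S7"])
b = ({"V1D": "V2H", "V2D": "V3H"}, ["S7", "S73"])
child = uniform_crossover(a, b, rng=random.Random(0))
print(child[1])  # ['S1', 'S7', 'S73']
```

Because the union never discards a statement, offspring grow monotonically under this operator, which fits the reported faster convergence.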

The second operator mutates the statements of the organ. First, it uniformly picks t, an offset into the organ’s statement list. When adding or replacing, it first uniformly selects an index into the over-organ’s statement array. To add, it inserts the selected statement at t in the organ’s statement list; to replace, it overwrites the statement at t with the selected statement. In essence, the over-organ defines a large set of addition and replacement operations, one for each unique statement, weighted by the frequency of that statement’s appearance in the over-organ. Fig. 5 shows an example of applying µTrans’s mutation operators.
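Both mutation operators can be sketched in a few lines of Python. This is a hedged illustration: the names are hypothetical, the even split between "add" and "replace" is an assumption (the text does not give the ratio), and only the two statement edits named above are modelled.

```python
import random

def remap_mutation(individual, matching_table, rng=random):
    """Rebind one donor parameter to another type-compatible host variable."""
    mapping, statements = individual
    param = rng.choice(sorted(mapping))
    alternatives = [h for h in matching_table[param] if h != mapping[param]]
    new_mapping = dict(mapping)
    if alternatives:  # keep the binding when no alternative exists
        new_mapping[param] = rng.choice(alternatives)
    return new_mapping, list(statements)

def statement_mutation(individual, over_organ, rng=random):
    """Add or replace one statement at a uniformly chosen offset t."""
    mapping, statements = individual
    new_statements = list(statements)
    # A uniform index into the over-organ weights statements by frequency.
    chosen = over_organ[rng.randrange(len(over_organ))]
    if not new_statements:
        return dict(mapping), [chosen]
    t = rng.randrange(len(new_statements))
    if rng.random() < 0.5:
        new_statements.insert(t, chosen)  # add
    else:
        new_statements[t] = chosen        # replace
    return dict(mapping), new_statements

rng = random.Random(0)
ind = ({"N_init": "curs"}, ["S1"])
remapped = remap_mutation(ind, {"N_init": ["hit_eof", "curs", "tos", "length"]}, rng)
mutated = statement_mutation(ind, ["S1", "S2", "S2", "S3"], rng)
```

Both functions return fresh individuals rather than editing in place, so elites copied into the next generation are never altered by accident.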

At each generation, we select top 10% most fit individuals (i.e. elitism) and insert them into the new generation. We use tournament selection to select 60% of the population for reproduction. Parents must be compilable; if the proportion of possible parents is less than 60% of the population, Alg. 1 generates new individuals. At the end of evolution, an organ that passes all the tests is selected uniformly at random and inserted into the host at Hl.
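The selection scheme above (10% elitism plus tournament selection of 60% of the population as parents) can be sketched as follows. The tournament size of 2 is an assumption, since the text does not state it, and the compilability filter on parents is omitted; all names are illustrative.

```python
import random

def select(population, fitness, elite_frac=0.10, parent_frac=0.60,
           tournament_size=2, rng=random):
    """Return (elites, parents) for building the next generation."""
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:max(1, round(len(ranked) * elite_frac))]
    parents = []
    for _ in range(round(len(population) * parent_frac)):
        # Each tournament keeps the fittest of a few random contenders.
        contenders = rng.sample(population, tournament_size)
        parents.append(max(contenders, key=fitness))
    return elites, parents

# Toy population where an individual's fitness is its own value.
toy_population = list(range(20))
elites, parents = select(toy_population, fitness=lambda x: x, rng=random.Random(1))
print(elites, len(parents))  # [19, 18] 12
```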

Fitness Function. Let $I_C$ be the set of individuals that can be compiled. Let $T$ be the set of unit tests used in GP, $TX_i$ and $TP_i$ be the set of non-crashed tests and passed tests for the individual $i$ respectively. Our fitness function follows:

\[
\mathit{fitness}(i) =
\begin{cases}
\frac{1}{3}\left(1 + \frac{|TX_i|}{|T|} + \frac{|TP_i|}{|T|}\right) & i \in I_C \\
0 & i \notin I_C
\end{cases}
\tag{1}
\]

For the autotransplantation goal, a viable candidate must, at a minimum, compile. At the other extreme, a successful candidate passes all of the $T_D^O$, the developer-provided test suite that defines the functionality we seek to transplant. These poles form a continuum. In between fall those individuals who execute tests to termination, even if they fail. Our fitness function therefore contains three equally-weighted fitness components. The first checks whether the individual compiles properly. The second rewards an individual for executing test cases to termination without crashing and last rewards an individual for passing tests in $T_D^O$.
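As a worked illustration of Eq. (1), the sketch below computes the fitness from test counts. The function signature is hypothetical; only the formula itself comes from the text.

```python
def fitness(compiles, n_tests, n_not_crashed, n_passed):
    """Eq. (1): three equally weighted components, each worth one third."""
    if not compiles:
        return 0.0
    return (1 + n_not_crashed / n_tests + n_passed / n_tests) / 3

print(fitness(False, 10, 0, 0))   # does not compile: 0.0
print(fitness(True, 10, 0, 0))    # compiles, every test crashes: one third
print(fitness(True, 10, 8, 5))    # 8 tests terminate, 5 pass: about 0.767
print(fitness(True, 10, 10, 10))  # passes everything: 1.0
```

An individual that merely compiles already scores one third, which gives the search a gradient before any test passes.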

Implementation. Implemented in TXL and C, µScalpel realizes µTrans and comprises 28k SLoCs, of which 16k is TXL [17], and 12k is C code. µScalpel inherits the limitations of TXL, such as its stack limit which precludes parsing large programs and its default C grammar’s inability to properly handle preprocessor directives. As an optimisation we inline all the function calls in the organ. Inlining eases slice reduction, eliminating unneeded parameters, returns and aliasing. For the construction of the call graphs, we use GNU cflow, and inherit its limitations related to function pointers.

5. EMPIRICAL STUDY

This section explains the subjects, test suites, and research questions we address in our empirical evaluation of automated code transplantation as realized in our tool, µScalpel.

Subjects. We transplant code for five donors into three hosts. We used the following criteria to choose these programs. First, they had to be written in C, because µScalpel currently operates only on C programs. Second, they had to be popular real-world programs people use. Third, they had to be diverse. Fourth, the host is the system we seek to augment, so it had to be large and complex to present a significant transplantation challenge, while, fifth, the organ we transplant could come from anywhere, so donors had to reflect a wide range of sizes. To meet these constraints, we perused GitHub, SourceForge, and GNU Savannah in August 2014, restricting our attention to popular C projects in different application domains.

Presented in Tab. 1, our donors include the audio streaming client IDCT, the simple archive utility MYTAR, GNU Cflow (which extracts call graphs from C source code), Webserver (which handles HTTP requests; see footnote 3), the command line encryption utility TuxCrypt, and the H.264 codec x264. Our hosts include Pidgin, GNU Cflow (which we use as both a donor and a host), SOX, a cross-platform command line audio processing utility, and VLC, a media player. We use x264 and VLC in our case study in Sec. 7; we use the rest in Sec. 6.

These programs are diverse: their application domains span chat, static analysis, sound processing, audio streaming, archiving, encryption, and a web server. The donors vary in size from 0.4–63k SLoC and the hosts are large, all greater

Footnote 3: https://github.com/Hepia/webserver.

Weak Proxies: Does it compile? Does it execute test cases without crashing?

Strong Proxies: Does it produce the correct output?


Research Questions

[Diagram: Host, Organ, Donor]

Do we break the initial functionality? (Regression Tests)
Have we really added new functionality? (Acceptance Tests)
How about the computational effort?
Is autotransplantation useful?


Research Questions

Do we break the initial functionality?
Have we really added new functionality?
How about the computational effort?
Is autotransplantation useful?

Empirical Study: 15 Transplantations, 300 Runs, 5 Donors, 3 Hosts

Case Study: H.264 Encoding Transplantation


Validation

Regression Tests
Augmented Regression Tests
Donor Acceptance Tests
Acceptance Tests
Manual Validation
Host Beneficiary


Subjects

Minimal size: 0.4k
Max size: 422k
Average Donor: 16k
Average Host: 213k

Subjects     Type    Size (KLOC)
Idct         Donor   2.3
Mytar        Donor   0.4
Cflow        Donor   25
Webserver    Donor   1.7
TuxCrypt     Donor   2.7
Pidgin       Host    363
Cflow        Host    25
SoX          Host    43
Case Study
x264         Donor   63
VLC          Host    422
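The summary figures on this slide can be recomputed from the table. The short Python check below assumes the averages include the case-study subjects (x264 and VLC), which is the reading under which the numbers match.

```python
from statistics import mean

# (subject, type, size in KLOC), copied from the table above
subjects = [
    ("Idct", "Donor", 2.3), ("Mytar", "Donor", 0.4), ("Cflow", "Donor", 25),
    ("Webserver", "Donor", 1.7), ("TuxCrypt", "Donor", 2.7),
    ("Pidgin", "Host", 363), ("Cflow", "Host", 25), ("SoX", "Host", 43),
    ("x264", "Donor", 63), ("VLC", "Host", 422),
]

donor_avg = mean(size for _, kind, size in subjects if kind == "Donor")
host_avg = mean(size for _, kind, size in subjects if kind == "Host")
smallest = min(size for _, _, size in subjects)
largest = max(size for _, _, size in subjects)

print(round(donor_avg), round(host_avg), smallest, largest)  # 16 213 0.4 422
```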



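Experimental Methodology and Setup

[Diagram: μSCALPEL takes the Host with its Implantation Point, the Donor with its organ entry (OE) and the Organ Test Suite, and produces the Host Beneficiary with the Organ at the Implantation Point]

64 bit Ubuntu 14.10, 16 GB RAM, 8 threads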


Empirical Study

Donor   Host     All Passed   Regression   Regression++   Acceptance
Idct    Pidgin   16           20           17             16
Mytar   Pidgin   16           20           18             20
Web     Pidgin   0            20           0              18
Cflow   Pidgin   15           20           15             16
Tux     Pidgin   15           20           17             16
Idct    Cflow    16           17           16             16
Mytar   Cflow    17           17           17             20
Web     Cflow    0            0            0              17
Cflow   Cflow    20           20           20             20
Tux     Cflow    14           15           14             16
Idct    SoX      15           18           17             16
Mytar   SoX      17           17           17             20
Web     SoX      0            0            0              17
Cflow   SoX      14           16           15             14
Tux     SoX      13           13           13             14
TOTAL            188/300      233/300      196/300        256/300


In 12 out of 15 experiments we successfully autotransplanted new functionality
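The totals and the "12 out of 15" figure can be cross-checked against the per-pair counts in the results table. The sketch below assumes that an experiment counts as successful when at least one of its 20 runs passed all test suites; that criterion is an interpretation, not stated on the slide.

```python
# (donor, host): (all passed, regression, regression++, acceptance), out of 20 runs
results = {
    ("Idct", "Pidgin"): (16, 20, 17, 16), ("Mytar", "Pidgin"): (16, 20, 18, 20),
    ("Web", "Pidgin"): (0, 20, 0, 18),    ("Cflow", "Pidgin"): (15, 20, 15, 16),
    ("Tux", "Pidgin"): (15, 20, 17, 16),  ("Idct", "Cflow"): (16, 17, 16, 16),
    ("Mytar", "Cflow"): (17, 17, 17, 20), ("Web", "Cflow"): (0, 0, 0, 17),
    ("Cflow", "Cflow"): (20, 20, 20, 20), ("Tux", "Cflow"): (14, 15, 14, 16),
    ("Idct", "SoX"): (15, 18, 17, 16),    ("Mytar", "SoX"): (17, 17, 17, 20),
    ("Web", "SoX"): (0, 0, 0, 17),        ("Cflow", "SoX"): (14, 16, 15, 14),
    ("Tux", "SoX"): (13, 13, 13, 14),
}

totals = [sum(column) for column in zip(*results.values())]
successful = sum(1 for counts in results.values() if counts[0] > 0)

print(totals, successful)  # [188, 233, 196, 256] 12
```

The three pairs with no fully passing run are the three Webserver transplants.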


Empirical Study

Execution Time (minutes)

Donor   Host     Average     Std. Dev.      Total
Idct    Pidgin   5           7              97
Mytar   Pidgin   3           1              65
Web     Pidgin   8           5              160
Cflow   Pidgin   58          16             1151
Tux     Pidgin   29          10             574
Idct    Cflow    3           5              59
Mytar   Cflow    3           1              53
Web     Cflow    5           2              102
Cflow   Cflow    44          9              872
Tux     Cflow    31          11             623
Idct    SoX      12          17             233
Mytar   SoX      3           1              60
Web     SoX      7           3              132
Cflow   SoX      89          53             74
Tux     SoX      34          13             94
Total            334 (min)   10 (Average)   72 (hours)
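The summary row can be reproduced from the per-pair figures exactly as printed on the slide. The check below is only arithmetic on those printed values: the averages are summed, the standard deviations are averaged, and the totals are summed and converted to whole hours.

```python
# (average, std. dev., total) in minutes, one tuple per donor/host pair
rows = [
    (5, 7, 97), (3, 1, 65), (8, 5, 160), (58, 16, 1151), (29, 10, 574),
    (3, 5, 59), (3, 1, 53), (5, 2, 102), (44, 9, 872), (31, 11, 623),
    (12, 17, 233), (3, 1, 60), (7, 3, 132), (89, 53, 74), (34, 13, 94),
]

sum_of_averages = sum(avg for avg, _, _ in rows)
mean_std_dev = sum(std for _, std, _ in rows) / len(rows)
total_hours = sum(total for _, _, total in rows) // 60

print(sum_of_averages, round(mean_std_dev), total_hours)  # 334 10 72
```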

Feedback-Controlled Random TestGeneration

Kohsuke Yatoh1∗, Kazunori Sakamoto2, Fuyuki Ishikawa2, Shinichi Honiden12

1University of Tokyo, Japan,2National Institute of Informatics, Japan

{k-yatoh, exkazuu, f-ishikawa, honiden}@nii.ac.jp

ABSTRACTFeedback-directed random test generation is a widely usedtechnique to generate random method sequences. It lever-ages feedback to guide generation. However, the validity offeedback guidance has not been challenged yet. In this pa-per, we investigate the characteristics of feedback-directedrandom test generation and propose a method that exploitsthe obtained knowledge that excessive feedback limits thediversity of tests. First, we show that the feedback loopof feedback-directed generation algorithm is a positive feed-back loop and amplifies the bias that emerges in the candi-date value pool. This over-directs the generation and limitsthe diversity of generated tests. Thus, limiting the amountof feedback can improve diversity and effectiveness of gener-ated tests. Second, we propose a method named feedback-controlled random test generation, which aggressively con-trols the feedback in order to promote diversity of generatedtests. Experiments on eight different, real-world applicationlibraries indicate that our method increases branch cover-age by 78% to 204% over the original feedback-directed al-gorithm on large-scale utility libraries.

Categories and Subject DescriptorsD.2.5 [Software Engineering]: Testing and Debugging—Testing tools

General TermsAlgorithms, Reliability, Verification

KeywordsRandom testing, Test generation, Diversity

∗The author is currently affiliated with Google Inc., Japan,and can be reached at [email protected].

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.ISSTA’15 , July 12–17, 2015, Baltimore, MD, USACopyright 2015 ACM 978-1-4503-3620-8/15/07 ...$15.00.

1. INTRODUCTIONFeedback-directed random testing [17] is a promising tech-

nique to automatically generate software tests. The tech-nique can create random method sequences using publicmethods from the classes of a system-under-test (SUT). Itis a general and test oracle independent technique to gen-erate software tests. Due to its generality and flexibility,many researchers have used feedback-directed random test-ing. Some researchers leveraged feedback-directed randomtesting as a part of their proposed methods [5, 25]. Othersused feedback-directed random testing to prove their the-ories on random testing [11, 12]. There is an interestingstudy that mined SUT specifications by analyzing the dy-namic behavior of SUT observed during feedback-directedrandom testing [18]. In addition, feedback-directed randomtesting has already been adopted by industries and under-gone intensive use [19].Despite its importance, characteristics of feedback-directed


Page 114: Genetic Improvement

Case Study

Transplant Time & Test Suites

        Time (hours)  Regression  Regression++  Acceptance
H.264   26            100%        100%          100%

Feedback-Controlled Random Test Generation

Kohsuke Yatoh1∗, Kazunori Sakamoto2, Fuyuki Ishikawa2, Shinichi Honiden1,2

1University of Tokyo, Japan, 2National Institute of Informatics, Japan

{k-yatoh, exkazuu, f-ishikawa, honiden}@nii.ac.jp

ABSTRACT
Feedback-directed random test generation is a widely used technique to generate random method sequences. It leverages feedback to guide generation. However, the validity of feedback guidance has not been challenged yet. In this paper, we investigate the characteristics of feedback-directed random test generation and propose a method that exploits the obtained knowledge that excessive feedback limits the diversity of tests. First, we show that the feedback loop of feedback-directed generation algorithm is a positive feedback loop and amplifies the bias that emerges in the candidate value pool. This over-directs the generation and limits the diversity of generated tests. Thus, limiting the amount of feedback can improve diversity and effectiveness of generated tests. Second, we propose a method named feedback-controlled random test generation, which aggressively controls the feedback in order to promote diversity of generated tests. Experiments on eight different, real-world application libraries indicate that our method increases branch coverage by 78% to 204% over the original feedback-directed algorithm on large-scale utility libraries.

Categories and Subject Descriptors
D.2.5 [Software Engineering]: Testing and Debugging—Testing tools

General Terms
Algorithms, Reliability, Verification

Keywords
Random testing, Test generation, Diversity

∗The author is currently affiliated with Google Inc., Japan, and can be reached at [email protected].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ISSTA’15, July 12–17, 2015, Baltimore, MD, USA
Copyright 2015 ACM 978-1-4503-3620-8/15/07 ...$15.00.

1. INTRODUCTION
Feedback-directed random testing [17] is a promising technique to automatically generate software tests. The technique can create random method sequences using public methods from the classes of a system-under-test (SUT). It is a general and test oracle independent technique to generate software tests. Due to its generality and flexibility, many researchers have used feedback-directed random testing. Some researchers leveraged feedback-directed random testing as a part of their proposed methods [5, 25]. Others used feedback-directed random testing to prove their theories on random testing [11, 12]. There is an interesting study that mined SUT specifications by analyzing the dynamic behavior of SUT observed during feedback-directed random testing [18]. In addition, feedback-directed random testing has already been adopted by industries and undergone intensive use [19].

Despite its importance, characteristics of feedback-directed random testing have seldom been studied. To the best of our knowledge, some studies have proposed extensions to feedback-directed random testing [14, 27], but they failed to analyze the nature of feedback-directed random testing. Specifically, the idea of feedback guidance had never been challenged. In this paper we investigate characteristics of feedback-directed random testing by using a model SUT and propose a new technique that exploits the obtained knowledge that excessive feedback over-directs generation, amplifies bias, and limits the diversity of generated tests.

We address two research questions in this paper.

RQ1: Why does the test effectiveness stop increasing at different points depending on random seeds?

RQ2: Can our proposed technique lessen the dependency on random seeds and improve the overall performance of test generation?

The resulting test effectiveness of feedback-directed random testing should differ because of its randomness. However, the observed difference is much larger than expected. For example, the interquartile range marks 10% in our preliminary experiment on the model SUT. This spoils the credibility of feedback-directed random testing.

There are three contributions in this paper.

• We hypothesize that feedback guidance over-directs the generation and limits the diversity of generated tests and show that both average score and variance of test effectiveness improve by limiting the amount of feedback.
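The feedback loop discussed in the excerpt above can be pictured with a toy generator. The sketch below is a simplified illustration, not the authors' algorithm and not any existing tool: it works on plain integer values rather than method sequences, and the operation table `OPS`, the `feedback` switch, and every function name are assumptions made for this example. With feedback switched on, each new result re-enters the candidate value pool, so values derived from early picks are drawn again later.

```python
import random

# Toy "system under test": (name, number of arguments, function).
# These operations are hypothetical and exist only for illustration.
OPS = [
    ("add", 2, lambda a, b: a + b),
    ("neg", 1, lambda a: -a),
    ("div", 2, lambda a, b: a // b),  # raises ZeroDivisionError when b == 0
]

def generate(ops, seeds, steps, feedback=True, seed=0):
    """Randomly apply operations to values drawn from a candidate pool.

    When `feedback` is True, every new result is added to the pool and can
    be picked as an argument in later steps. When it is False, arguments
    are always drawn from the initial seed values.
    """
    rng = random.Random(seed)
    pool = list(seeds)
    tests = []
    for _ in range(steps):
        name, arity, fn = rng.choice(ops)
        args = tuple(rng.choice(pool) for _ in range(arity))
        try:
            result = fn(*args)
        except Exception:
            continue  # discard calls that fail, keep the pool unchanged
        tests.append((name, args, result))
        if feedback and result not in pool:
            pool.append(result)  # the feedback step
    return tests, pool

tests_fb, pool_fb = generate(OPS, [0, 1, 2], steps=50, feedback=True)
tests_no, pool_no = generate(OPS, [0, 1, 2], steps=50, feedback=False)
print("pool size with feedback:", len(pool_fb))
print("pool size without feedback:", len(pool_no))
```

Comparing the two pools after a run gives a rough feel for how much of the generated input space stems from values fed back by earlier steps.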

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. Copyright is held by the owner/author(s). Publication rights licensed to ACM.

ISSTA’15, July 13–17, 2015, Baltimore, MD, USA
ACM. 978-1-4503-3620-8/15/07
http://dx.doi.org/10.1145/2771783.2771805

[Badge: ISSTA Artifact Evaluated (AEC): Consistent, Complete, Well Documented, Easy to Reuse]

VLC

H264


within 26 hours performed a task that took developers an avg of 20 days of elapsed time

Page 115: Genetic Improvement

Automated Software Transplantation

Alexandru Marginean: Automated Software Transplantation

[Diagram labels: H, D, O, O, ENTRY, V, Organ’s Test Suite]

Manual Work:
Organ Entry
Organ’s Test Suite
Implantation Point

μTrans
Host, Donor
Stage 1: Static Analysis
Stage 2: Genetic Programming
Stage 3: Organ Implantation
Host Beneficiary
Organ’s Test Suite

Validation
Host Beneficiary
Regression Tests
Augmented Regression Tests
Donor Acceptance Tests
Acceptance Tests
Manual Validation
RQ1.a, RQ1.b, RQ2

Page 116: Genetic Improvement

* http://crest.cs.ucl.ac.uk/autotransplantation/MuScalpel.html

Page 117: Genetic Improvement

GI Applications

Bug fixing

Page 118: Genetic Improvement

* http://dijkstra.cs.virginia.edu/genprog/

* http://people.csail.mit.edu/fanl/

Kali, SPR, ClearView (from MIT)

and other …

Claire Le Goues, Stephanie Forrest, Westley Weimer: Current challenges in automatic software repair. Software Quality Journal 21(3): 421-443 (2013)

Page 119: Genetic Improvement

GI Applications

Bug fixing

Improving energy consumption

Porting old code to new hardware

Grafting new functionality into an existing system

Specialising software for a particular problem class

Other

Page 120: Genetic Improvement

What if fitness is expensive to compute?

GI4GI: Improving Genetic Improvement Fitness Functions
Mark Harman & Justyna Petke
(Genetic Improvement Workshop 2015)

Page 121: Genetic Improvement

GI4GI: Energy Optimisation Example

many factors affecting energy consumption, including:

screen behaviour

memory access

device communications

CPU utilisation

Page 122: Genetic Improvement

GI4GI: Energy Optimisation Example

a hardware-dependent linear energy model for GI:

Post-compiler software optimization for reducing energy (ASPLOS’14) Schulte et al.
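The model's formula did not survive in this transcript, so the following is only a minimal sketch of what a hardware-dependent linear energy model can look like: estimated power is a weighted sum of per-cycle hardware counter rates, and energy is power multiplied by runtime. The counter names, the `Counters` class, and the coefficient values are all assumptions for illustration; they are not taken from the cited paper, and real coefficients would have to be fitted for each machine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Counters:
    """Hardware counter readings for one program run (hypothetical fields)."""
    seconds: float
    cycles: float
    instructions: float
    flops: float
    cache_accesses: float
    cache_misses: float

# Made-up coefficients; in practice these are fitted per target machine.
COEFFS = {"const": 30.0, "ins": 20.0, "flops": 10.0, "tca": 5.0, "mem": 500.0}

def estimated_power(c: Counters, k=COEFFS) -> float:
    """Linear power estimate (watts) from per-cycle counter rates."""
    return (
        k["const"]
        + k["ins"] * c.instructions / c.cycles
        + k["flops"] * c.flops / c.cycles
        + k["tca"] * c.cache_accesses / c.cycles
        + k["mem"] * c.cache_misses / c.cycles
    )

def estimated_energy(c: Counters, k=COEFFS) -> float:
    """Energy estimate (joules): runtime multiplied by estimated power."""
    return c.seconds * estimated_power(c, k)

run = Counters(seconds=2.0, cycles=1000.0, instructions=500.0,
               flops=100.0, cache_accesses=200.0, cache_misses=10.0)
print(estimated_energy(run))
```

Because the estimate needs only counter readings, it can serve as a cheap stand-in for physically measuring energy each time a program variant is evaluated.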

Page 123: Genetic Improvement

GI4GI: Energy Optimisation Example

Idea:

Use GI to evolve a fitness function f for energy consumption.

Use f to improve energy consumption of software.
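The two steps can be sketched in miniature. This is a rough illustration under strong assumptions, not the GI4GI method itself: a plain (1+1) evolutionary search over the weights of a linear model stands in for GI, the measurement samples are made up, and every name (`evolve_fitness`, `pick_most_efficient`, `SAMPLES`, `VARIANTS`) is hypothetical.

```python
import random

def predict(weights, features):
    """Linear energy estimate f(features) for a given weight vector."""
    return sum(w * x for w, x in zip(weights, features))

def error(weights, samples):
    """Mean squared error of f against measured energy values."""
    return sum((predict(weights, f) - e) ** 2 for f, e in samples) / len(samples)

def evolve_fitness(samples, generations=2000, seed=0):
    """Step 1: evolve the weights of f with a simple (1+1) search."""
    rng = random.Random(seed)
    best = [0.0] * len(samples[0][0])
    best_err = error(best, samples)
    for _ in range(generations):
        child = [w + rng.gauss(0.0, 0.5) for w in best]  # mutate all weights
        child_err = error(child, samples)
        if child_err <= best_err:  # keep the child only if it is no worse
            best, best_err = child, child_err
    return best, best_err

def pick_most_efficient(weights, variants):
    """Step 2: use f as a cheap fitness to choose among program variants."""
    return min(variants, key=lambda name: predict(weights, variants[name]))

# Made-up data: (feature vector, measured energy) pairs.
SAMPLES = [
    ((1.0, 0.5, 0.1), 41.0),
    ((1.0, 0.9, 0.3), 51.0),
    ((1.0, 0.2, 0.8), 42.0),
    ((1.0, 0.7, 0.6), 50.0),
]
# Made-up feature vectors for two hypothetical program variants.
VARIANTS = {"original": (1.0, 0.8, 0.4), "patched": (1.0, 0.3, 0.4)}

weights, err = evolve_fitness(SAMPLES)
print("model error:", err)
print("preferred variant:", pick_most_efficient(weights, VARIANTS))
```

The expensive measurements are needed only to build f; afterwards each candidate variant is scored by evaluating f, which is the point of asking what to do when fitness is costly to compute.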

Page 124: Genetic Improvement

GI4GI

Page 125: Genetic Improvement

GI Growth

Page 126: Genetic Improvement

Growing Area
1st International Genetic Improvement Workshop
at GECCO 2015, Madrid, Spain
www.geneticimprovementofsoftware.com

Page 127: Genetic Improvement

GI Growth

Special Issue on GI

Special Session on GI
* http://www.wcci2016.org/

Page 128: Genetic Improvement

Conclusions

Functional Requirements: humans have to define these

Non-Functional Requirements: we can optimise these

Page 129: Genetic Improvement

Summary

Search Based Optimisation

Software Engineering

S B S E

Genetic Improvement

Combinatorial Interaction Testing

Page 130: Genetic Improvement

Research Opportunities

Page 131: Genetic Improvement

Contact me if you want to visit CREST:

j.petke at ucl.ac.uk

Centre for Research on Evolution, Search and Testing
University College London

Page 132: Genetic Improvement

20 mins walk

[Map labels:]
National Gallery
Nelson’s Column
Eros
Royal Courts of Justice
St. Paul’s
Tate Modern
Globe Theatre
Covent Garden Market
Westminster Abbey
Houses of Parliament
London Eye
British Museum
Madame Tussaud’s
Sherlock Holmes Museum
Marble Arch
Natural History Museum

Page 133: Genetic Improvement

COWs
CREST Open Workshop

Roughly one per month

Discussion based

Recorded and archived
http://crest.cs.ucl.ac.uk/cow/

Page 134: Genetic Improvement

COWs

http://crest.cs.ucl.ac.uk/cow/

Page 135: Genetic Improvement

COWs

http://crest.cs.ucl.ac.uk/cow/

#Total Registrations 1512
#Unique Attendees 667
#Unique Institutions 244
#Countries 43
#Talks 421

(Last updated on November 4, 2015)

Page 136: Genetic Improvement

CREST Open Workshop (COW)

Page 137: Genetic Improvement

http://crest.cs.ucl.ac.uk/cow/

45th COW
Genetic Improvement
25-26 January 2016

Page 138: Genetic Improvement

Dynamic Adaptive Search Based Software Engineering

EPSRC Grant

Stirling

Birmingham

York

UCL

DAASE

Page 139: Genetic Improvement

Summary

Search Based Optimisation

Software Engineering

S B S E

Genetic Improvement

Combinatorial Interaction Testing

COWs
Visitor Scheme
Open positions

Page 140: Genetic Improvement

Pictures used with thanks from these sources

Pickering's Harem: [Public domain], via Wikimedia Commons

IBM 026 Card Punch: By Ben Franske (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

BBC_Micro: [Public domain], via Wikimedia Commons

Programmer: Bundesarchiv, B 145 Bild-F031434-0006 / Gathmann, Jens / CC-BY-SA [CC-BY-SA-3.0-de (http://creativecommons.org/licenses/by-sa/3.0/de/deed.en)], via Wikimedia Commons

IBM PC: By Boffy B (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

IMac: By Matthieu Riegler, Wikimedia Commons [CC-BY-3.0 (http://creativecommons.org/licenses/by/3.0)], via Wikimedia Commons

Ada Lovelace: By Alfred Edward Chalon [Public domain], via Wikimedia Commons

Stonehenge: By Yuanyuan Zhang [All rights reserved] via Flickr

Bath Abbey: By Yuanyuan Zhang [All rights reserved] via Flickr