Transcript
Buenos Aires, June 2016. Eduardo Poggi
Agenda
Introduction, Representing Hypotheses, Operators, Fitness Function, Genetic Programming, Models of Evolution and Learning
Introduction
Genetic Algorithms (GAs) provide a learning method based on principles of biological evolution.
New hypotheses are generated by mutating and recombining current hypotheses.
At each step we have a population of hypotheses from which we select the most fit.
GAs perform a parallel search over different parts of the hypothesis space.
Introduction
Evolution can be seen as a robust adaptation method, widely used as a model in biology.
GAs can search hypothesis spaces whose elements interact in complex ways, such that the influence of each element on the overall fitness of the hypothesis is hard to model.
GAs are easily parallelizable, so they can take advantage of falling processing costs and increasing hardware performance.
Introduction
A hypothesis is good if it has a high “fitness” value.
In classification: the accuracy of the hypothesis on test examples.
In games: the number of games won against other hypotheses of the same population.
Parameters
A genetic algorithm has the following parameters:
Fitness: a function that gives a score to every hypothesis.
Fitness_threshold: once the best fitness exceeds it, we stop and return the hypothesis with the best fitness.
p: the number of hypotheses in the current population.
r: the fraction of the population replaced by crossover.
m: the mutation rate.
Algorithm
Initialize population P with p hypotheses at random.
For each hypothesis h, compute its fitness.
While max fitness < Fitness_threshold do:
create a new generation Ps and replace P with it.
Return the hypothesis with highest fitness.
Algorithm
New Population
Select: probabilistically pick (1-r)·p members of P and add them to Ps. The probability of selecting hypothesis hi is P(hi) = Fitness(hi) / Σj Fitness(hj).
Crossover: probabilistically select r·p/2 pairs of hypotheses from P according to P(hi). For each pair (h1, h2), produce two offspring by applying the crossover operator, and add all offspring to Ps.
Mutate: choose m·p members of Ps with uniform probability and invert one randomly selected bit in each.
Update: replace P with Ps.
Evaluate: for each h in P, compute its fitness.
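The generation step above can be sketched in Python, with hypotheses encoded as bit strings. This is an illustrative sketch; all function names are my own, not from the slides:

```python
import random

def next_generation(P, fitness, r, m):
    """One GA generation: select (1-r)*p survivors, create r*p offspring
    by single-point crossover, then mutate m*p members of the result."""
    p = len(P)
    scores = [fitness(h) for h in P]
    total = sum(scores)
    probs = [s / total for s in scores]

    # Select: (1-r)*p members with P(hi) = Fitness(hi) / sum_j Fitness(hj)
    Ps = random.choices(P, weights=probs, k=round((1 - r) * p))

    # Crossover: r*p/2 pairs, each producing two offspring
    for _ in range(round(r * p / 2)):
        h1, h2 = random.choices(P, weights=probs, k=2)
        point = random.randrange(1, len(h1))
        Ps.append(h1[:point] + h2[point:])
        Ps.append(h2[:point] + h1[point:])

    # Mutate: invert one random bit in m*p uniformly chosen members
    for i in random.sample(range(len(Ps)), round(m * len(Ps))):
        h = list(Ps[i])
        b = random.randrange(len(h))
        h[b] = '1' if h[b] == '0' else '0'
        Ps[i] = ''.join(h)
    return Ps
```

With r = 0.6 and p = 10, for example, 4 members survive selection, 3 pairs produce 6 offspring, and the new population again has 10 members.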
Representing Hypotheses
Hypotheses are usually represented as if-then rules encoded as bit strings.
For example:
sky: sunny, cloudy, rain = 100, 010, 001
wind: weak, strong = 10, 01
play_tennis: yes, no = 10, 01
IF (sky = cloudy ∨ rain) ∧ (wind = strong) THEN play_tennis = no
= 011 01 01
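The encoding above can be produced mechanically: one bit per attribute value, set to 1 when the condition allows that value. A minimal sketch (names are illustrative):

```python
def encode(values, allowed):
    """One bit per possible value, 1 iff that value satisfies the condition."""
    return ''.join('1' if v in allowed else '0' for v in values)

# IF (sky = cloudy or rain) AND (wind = strong) THEN play_tennis = no
rule = ' '.join([
    encode(['sunny', 'cloudy', 'rain'], {'cloudy', 'rain'}),  # -> '011'
    encode(['weak', 'strong'], {'strong'}),                   # -> '01'
    encode(['yes', 'no'], {'no'}),                            # -> '01'
])
# rule == '011 01 01'
```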
Representing Hypotheses
Some bit combinations may represent undesirable hypotheses. In those cases we can do one of the following:
Use a different encoding.
Modify the operators so they never construct such combinations.
Assign low fitness values to those strings.
Genetic Operators - Crossover
The most common is called single-point crossover (sexual recombination):
Initial strings: 11101001000 and 00001010101
Crossover mask: 11111000000
Offspring: 11101010101 and 00001001000
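Expressed as code, mask-based single-point crossover is a sketch like the following, reproducing the slide's example (function name is illustrative):

```python
def crossover(s1, s2, mask):
    """Single-point crossover via a mask: offspring 1 takes bits from s1
    where the mask is 1 and from s2 where it is 0; offspring 2 is the
    mirror image."""
    o1 = ''.join(a if m == '1' else b for a, b, m in zip(s1, s2, mask))
    o2 = ''.join(b if m == '1' else a for a, b, m in zip(s1, s2, mask))
    return o1, o2

crossover('11101001000', '00001010101', '11111000000')
# -> ('11101010101', '00001001000')
```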
Genetic Operators - Mutation
The idea is to make small random changes to a string.
Under a uniform distribution, we select one bit and flip its value.
Input string: 00001010101. Output string: 01001010101 (the second bit was selected and flipped).
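A minimal sketch of point mutation (the `flip` helper is illustrative):

```python
import random

def flip(s, i):
    """Invert bit i of bit string s."""
    return s[:i] + ('1' if s[i] == '0' else '0') + s[i + 1:]

def mutate(s):
    """Point mutation: flip one uniformly chosen bit."""
    return flip(s, random.randrange(len(s)))

flip('00001010101', 1)  # -> '01001010101' (the slide's example)
```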
Fitness Function and Selection
If we want to learn classification rules, possible fitness functions are accuracy or complexity.
Sometimes the system is complex and it is hard to assign a fitness value to an individual rule.
We may instead have a method to measure the overall performance of a whole system of rules.
Fitness Function and Selection
Selection: the typical approach is “fitness-proportionate selection”, also called “roulette wheel selection”: P(hi) = Fitness(hi) / Σj Fitness(hj).
Another option is rank selection: sort the hypotheses by fitness, then select based on rank alone.
Fitness function: the square of the classification accuracy over the training data, Fitness(h) = (Correct(h))^2, where Correct(h) is the % of examples correctly classified by h.
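The two formulas above can be sketched as follows (names are illustrative):

```python
def selection_probs(fitnesses):
    """Fitness-proportionate ("roulette wheel") probabilities:
    P(hi) = Fitness(hi) / sum_j Fitness(hj)."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def fitness(correct_pct):
    """Square of the classification accuracy (%) on the training data."""
    return correct_pct ** 2

probs = selection_probs([fitness(50), fitness(100)])
# squaring the accuracy: twice the accuracy -> 4x the selection weight
```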
An Example - GABIL
The GABIL system (Kenneth De Jong et al., 1993) uses genetic algorithms to learn Boolean concepts represented as a disjunction of rules.
If condition1 then class = + If condition2 then class = - If condition3 then class = +
Representation - GABIL
We represent the set of rules as a string of bits. Each rule contains a precondition and a consequent. If we assume only two attributes, a1 and a2, then the hypothesis
IF a1 = T and a2 = F THEN c = T; IF a2 = T THEN c = F
corresponds to the string (fields a1, a2, c for each rule):
10 01 10 11 10 01
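Decoding such a string back into per-rule fields can be sketched as follows, using the 2-bits-per-field layout shown above (function name and layout default are illustrative):

```python
def decode(bits, fields=(('a1', 2), ('a2', 2), ('c', 2))):
    """Split a GABIL-style bit string into fixed-width per-rule fields
    (here: 2 bits per attribute, positions meaning T and F)."""
    width = sum(w for _, w in fields)
    rules = []
    for start in range(0, len(bits), width):
        rule, pos = {}, start
        for name, w in fields:
            rule[name] = bits[pos:pos + w]
            pos += w
        rules.append(rule)
    return rules

decode('100110111001')
# -> [{'a1': '10', 'a2': '01', 'c': '10'},
#     {'a1': '11', 'a2': '10', 'c': '01'}]
```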
Genetic Operators - GABIL
Parameters:
r = 0.6 (fraction of the parent population replaced by crossover)
m = 0.001 (mutation rate)
p = 100-1000 (population size)
Mutation: each bit is chosen for inversion with probability 0.001.
Fitness function: the square of the classification accuracy over the training data, Fitness(h) = (Correct(h))^2.
Experiments: GABIL's accuracy was 92.1%, compared with C4.5 and other systems whose performance ranged from 91.2% to 96.6%.
Extensions - GABIL
AddCondition: generalizes over some attribute by changing a 0 bit to a 1.
DropCondition: replaces all the bits of a substring with 1s.
De Jong et al. reported mixed results: improvements on some problems and lower performance on others.
This suggests the possibility of having the GA evolve its own hypothesis-search method.
GP - Genetic Programming
Genetic Programming is a form of evolutionary computation in which the individuals in the population are computer programs.
Programs are normally represented by trees. A program is executed by evaluating the tree.
Example:
(Tree diagram lost in transcription; in prefix notation:)
(+ (sin x) (sqrt (+ (^ x 2) y)))
F = sin(x) + sqrt(x^2 + y)
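Program trees of this kind can be represented as nested tuples and evaluated recursively. A minimal sketch (the operator set and names are illustrative):

```python
import math

# Program trees as nested tuples: (operator, child1, child2, ...)
OPS = {
    '+': lambda a, b: a + b,
    '^': lambda a, b: a ** b,
    'sin': math.sin,
    'sqrt': math.sqrt,
}

def evaluate(tree, env):
    """Evaluate a program tree given variable bindings in env."""
    if isinstance(tree, tuple):
        op, *args = tree
        return OPS[op](*(evaluate(a, env) for a in args))
    return env.get(tree, tree)  # variable name, or a numeric constant

f = ('+', ('sin', 'x'), ('sqrt', ('+', ('^', 'x', 2), 'y')))
evaluate(f, {'x': 0.0, 'y': 4.0})  # -> 2.0, i.e. sin(0) + sqrt(0 + 4)
```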
GP - Vocabulary
To apply genetic programming one has to define the functions that can be applied:
Example: sin, cos, sqrt, +, -, etc. A genetic programming algorithm explores the
space of combinations of these functions. The fitness of an individual is determined by
executing the program on a set of training data.
GP - Cross Over
The crossover operator replaces a subtree of one program with a subtree of another program.
(Tree diagrams lost in transcription; the two parents in prefix notation:)
(+ (sin x) (^ (+ x y) 2)), i.e. sin(x) + (x+y)^2
(+ (sin x) (sqrt (+ (^ x 2) y))), i.e. sin(x) + sqrt(x^2 + y)
GP - Result
(Tree diagrams lost in transcription.) The two offspring are obtained by exchanging a subtree of one parent with a subtree of the other, e.g. swapping the subtree (+ x y) of the first parent with the subtree (^ x 2) of the second.
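Subtree crossover can be sketched over tuple-encoded trees. Here the crossover points are given as explicit paths for determinism; the encoding, path convention, and names are illustrative, not from the slides:

```python
def get_sub(tree, path):
    """Return the subtree at the given path of child indices
    (1-based, because index 0 holds the operator)."""
    for i in path:
        tree = tree[i]
    return tree

def replace_sub(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_sub(tree[i], path[1:], new),) + tree[i + 1:]

def subtree_crossover(t1, p1, t2, p2):
    """Exchange the subtrees at paths p1 and p2 between two parents."""
    return (replace_sub(t1, p1, get_sub(t2, p2)),
            replace_sub(t2, p2, get_sub(t1, p1)))

a = ('+', ('sin', 'x'), ('^', ('+', 'x', 'y'), 2))            # sin(x)+(x+y)^2
b = ('+', ('sin', 'x'), ('sqrt', ('+', ('^', 'x', 2), 'y')))  # sin(x)+sqrt(x^2+y)
o1, o2 = subtree_crossover(a, (2, 1), b, (2, 1, 1))           # swap (x+y), x^2
```

In a full GP system the paths would be chosen at random rather than supplied by the caller.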
Blocks
Learning an algorithm for stacking blocks: learn a program that stacks the blocks so that they spell “UNIVERSAL”.
(Figure: blocks labeled with the letters of “UNIVERSAL” lying scattered on a table.)
Blocks
The only valid operations are:
move a block from the stack to the surface of the table;
move a block from the surface of the table to the stack.
The primitive terminals are:
CS (current stack): returns the name of the block on top of the stack.
TB (top correct block): returns the name of the topmost block such that it and all blocks beneath it are in correct order.
NN (next necessary): returns the name of the block needed above TB to spell the word “UNIVERSAL”.
Blocks
Additional functions are:
(MS x): move block x to the stack, if x is on the table.
(MT x): move block x to the table, if x is on the stack.
(EQ x y): returns true if x = y.
(NOT x): returns the complement of x.
(DU x y): do x until expression y is true.
Blocks
What is the fitness function? In this experiment the author provided 166 training examples, each with a different initial block configuration. The fitness of a program is the number of these problems it solves correctly.
Initial population: 100 programs. After 10 generations the genetic programming strategy found the following program:
(EQ (DU (MT CS) (NOT CS)) (DU (MS NN) (NOT NN)))
which solved all 166 problems correctly.
Models of Evolution and Learning
Jean-Baptiste Pierre Antoine de Monet, Chevalier de Lamarck (1744-1829).
Lamarckian evolution: Lamarck proposed that evolution over many generations was directly influenced by the experiences of individual organisms during their lifetimes. In particular, he proposed that an organism's experiences directly affected the genetic makeup of its offspring.
This conjecture is attractive because it would presumably allow more efficient evolutionary progress than the generate-and-test process of genetic algorithms, which ignores the experience gained during an individual's lifetime.
Despite its appeal, current scientific evidence contradicts Lamarck's model. The currently accepted view is that an individual's genetic makeup is, in fact, unaffected by the lifetime experience of its biological parents.
Nevertheless, some computational studies have shown that Lamarckian processes can, in some cases, improve the effectiveness of genetic algorithms.
Baldwin Effect
After J. M. Baldwin (1896). The Baldwin effect relies on the following:
If a species is evolving in a changing environment, there will be evolutionary pressure in favor of individuals with the capacity to learn during their lifetimes.
Individuals able to learn many traits depend less on their genetic code. Such individuals can support a more diverse gene pool, relying on individual learning to make up for missing or suboptimal traits in that code. A more diverse gene pool can, in turn, support faster evolutionary adaptation.
Fitness
Better individuals are preferred, but the best is not always picked and the worst is not necessarily excluded; nothing is guaranteed. Selection is a mixture of greedy exploitation and adventurous exploration, with similarities to simulated annealing (SA).
Crowding
Crowding: individuals fitter than the rest begin to reproduce quickly, so that copies of them, or very similar individuals, come to occupy a large fraction of P, reducing the diversity of the population and slowing further progress.
Some techniques to avoid crowding:
Use tournament or rank selection instead of fitness-proportionate selection.
Reduce an individual's fitness when similar individuals are present in the population (fitness sharing).
Restrict which individuals may recombine to produce offspring, limiting the formation of species or clusters of very similar individuals.
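Tournament selection, one of the remedies above, can be sketched as follows (name and parameters are illustrative):

```python
import random

def tournament_select(P, fitness, k=2):
    """Tournament selection: sample k individuals uniformly and keep the
    fittest. Selection pressure depends on relative rank within the
    tournament, not on raw fitness, which mitigates crowding."""
    return max(random.sample(P, k), key=fitness)
```

Larger k increases selection pressure; k equal to the population size always returns the global best.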
GAs vs. Hill-Climbing Search?
Genetic algorithms perform a randomized beam search for the best hypothesis.
This differs from more traditional hill-climbing search in that the search can move abruptly through the hypothesis space, for example when an offspring is completely different from its parents.
As a result, the algorithm is less likely to get stuck in local maxima.
Parallel Genetic Algorithms
It seems very natural to produce a parallel version of genetic algorithms.
Some approaches divide the population into groups called demes (“local population of closely related organisms”).
Each deme resides on a computational node, and a standard search is carried out on each node.
There is communication and combination among demes but these are rare compared to those occurring within demes.
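The rare cross-deme communication can be sketched as ring migration of each deme's best individual; the scheme and names are illustrative, and real systems vary in topology and frequency:

```python
def migrate(demes, fitness):
    """Ring migration: each deme sends a copy of its best individual to
    the next deme, where the migrant replaces the worst member."""
    bests = [max(d, key=fitness) for d in demes]
    out = []
    for i, d in enumerate(demes):
        incoming = bests[(i - 1) % len(demes)]      # migrant from previous deme
        d = sorted(d, key=fitness)                  # worst first
        out.append([incoming] + d[1:])              # drop worst, add migrant
    return out
```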
Summary
Genetic algorithms perform a parallel, randomized search that seeks to optimize a fitness function.
The search resembles the mechanism of biological evolution. Genetic algorithms have many applications outside machine learning; in machine learning, the goal is to evolve a population of hypotheses to find the hypothesis with the highest accuracy.
Genetic programming is similar to genetic algorithms, except that the individuals we evolve are computer programs.
eduardopoggi@yahoo.com.ar
eduardo-poggi
http://ar.linkedin.com/in/eduardoapoggi
https://www.facebook.com/eduardo.poggi
@eduardoapoggi
Biblio
Chapter 9 of Mitchell, Machine Learning. Guerra, Algoritmos Genéticos.