Source: staff.ustc.edu.cn/~ketang/PPT/DDEC201406.pdf

Recent Advances in Data-driven Evolutionary Computation

Ke Tang

USTC-Birmingham Joint Research Institute in Intelligent Computation and Its Applications (UBRI)
School of Computer Science and Technology

University of Science and Technology of China

June 2014 @ IWSIS&IWMWC


Outline

•  A data-driven perspective on Evolutionary Algorithms
•  History-Based Topological Speciation (HTS)
•  Population-based Algorithm Portfolios based on Estimated Performance Matrix (EPM-PAP)
•  Summary


A data-driven perspective on EAs

•  Evolutionary Algorithms (EAs): Algorithms that mimic natural evolution.

•  There are quite a few famous branches of EAs:
  –  Genetic Algorithms
  –  Genetic Programming
  –  Evolution Strategies
  –  Evolutionary Programming
  –  Particle Swarm Optimizer
  –  Ant Colony Optimization
  –  Differential Evolution
  –  etc.

A data-driven perspective on EAs

•  Despite their different historical backgrounds, most EAs share a similar framework.

•  The Framework of EAs:

© Xin Yao

What Is an Evolutionary Algorithm?

(OK, you can open your eyes and wake up now.)

1. Generate the initial population P(0) at random, and set i ← 0;

2. REPEAT

(a) Evaluate the fitness of each individual in P(i);

(b) Select parents from P(i) based on their fitness in P(i);

(c) Generate offspring from the parents using crossover and mutation to form P(i+1);

(d) i ← i + 1;

3. UNTIL halting criteria are satisfied
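To make the framework concrete, below is a minimal Python sketch of this generate-and-test loop. The truncation selection, arithmetic crossover, and Gaussian mutation are illustrative choices of my own, not prescribed by the slides.

```python
import random

def evolutionary_algorithm(fitness, dim, pop_size=50, max_gen=100,
                           lower=-5.0, upper=5.0, mutation_scale=0.1):
    """Generic EA loop: initialize, then repeatedly evaluate/select/vary."""
    # 1. Generate the initial population P(0) at random
    pop = [[random.uniform(lower, upper) for _ in range(dim)]
           for _ in range(pop_size)]
    for gen in range(max_gen):                          # 3. UNTIL halting criterion
        # 2(a) Evaluate the fitness of each individual (minimization)
        ranked = sorted(pop, key=fitness)
        # 2(b) Select parents based on fitness (here: truncation selection)
        parents = ranked[:pop_size // 2]
        # 2(c) Generate offspring using crossover and mutation
        offspring = []
        for _ in range(pop_size):
            p1, p2 = random.sample(parents, 2)
            child = [(a + b) / 2.0 for a, b in zip(p1, p2)]               # crossover
            child = [x + random.gauss(0, mutation_scale) for x in child]  # mutation
            offspring.append(child)
        pop = offspring                                 # 2(d) i <- i + 1
    return min(pop, key=fitness)

# usage: minimize the sphere function in 10 dimensions
best = evolutionary_algorithm(lambda x: sum(v * v for v in x), dim=10)
```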

A data-driven perspective on EAs

•  Most EAs are Stochastic Population-Based Generate-and-Test algorithms.

•  An EA iteratively samples from a solution space, and can thus be viewed as a data-generating process.

[Figure: sampling viewed as a data matrix, with one row per individual (Individual 1 … Individual n) and columns x1 … xD plus fitness]

A data-driven perspective on EAs

•  The data contain a lot of information:
  –  The candidate solutions (individuals)
  –  Their corresponding fitness
  –  The “source” of an individual (e.g., which operator was applied to which parents to generate it)

•  Data-driven Evolutionary Computation: To exploit the data generated by an EA during its search process to enhance the EA’s performance.
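As an illustration of what such data might look like in practice, a minimal archive of the search history could be structured as below. The field names are my own, not from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class HistoryRecord:
    """One sampled point: the candidate solution, its fitness, and its 'source'."""
    solution: Tuple[float, ...]
    fitness: float
    operator: Optional[str] = None        # e.g. "crossover", "mutation"
    parents: Tuple[int, ...] = ()         # indices of the parent records, if any

@dataclass
class SearchHistory:
    """Append-only archive of everything an EA has evaluated so far."""
    records: List[HistoryRecord] = field(default_factory=list)

    def log(self, solution, fitness, operator=None, parents=()):
        self.records.append(
            HistoryRecord(tuple(solution), fitness, operator, tuple(parents)))
        return len(self.records) - 1      # index usable as a parent reference later
```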

A data-driven perspective on EAs

•  Data-driven Evolutionary Computation is a methodology rather than a new idea, and has been adopted in many existing works, e.g.,
  –  Surrogate-Assisted Evolutionary Algorithms
  –  Various parameter adaptation/self-adaptation schemes

•  The key issues of DDEC:
  –  What information is to be exploited from the data, and how.
  –  How an EA will benefit from the obtained information.


Outline

•  A data-driven perspective on Evolutionary Algorithms
•  History-Based Topological Speciation (HTS)
•  Population-based Algorithm Portfolios based on Estimated Performance Matrix (EPM-PAP)
•  Summary

HTS - Background

•  Traditional optimization
  –  Finds a single global optimum of a problem

•  Challenge brought by a multimodal problem
  –  There might be more than one optimum that are (roughly) equally good.
  –  The goal becomes finding multiple optima of the problem.

•  Why?
  –  Provides the user with a range of choices (more informed decisions)
  –  Reveals insights into the problem (inspires innovations)

HTS - Background

•  When employing EAs to find multiple optima, a procedure called speciation is usually required.

•  Speciation: partitioning a population into a few species.
  –  Niche: a region of attraction on the fitness landscape
  –  Species: a group of individuals occupying the same niche
  –  Species seed: the best (fittest) individual of a species

HTS - Background

•  A typical speciation procedure (see the sketch below)
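As referenced above, here is a minimal sketch of a common seed-based speciation loop; the `same_species` test is the pluggable sub-algorithm discussed on the following slides (distance-based, Hill-Valley, HTS, …).

```python
def speciate(population, fitness, same_species):
    """Partition a population into species around the fittest individuals (seeds).

    `same_species(a, b)` decides whether two individuals occupy the same niche.
    """
    # Process individuals from best to worst fitness (assuming minimization)
    ordered = sorted(population, key=fitness)
    seeds, species = [], []
    for ind in ordered:
        for seed_idx, seed in enumerate(seeds):
            if same_species(ind, seed):
                species[seed_idx].append(ind)   # join an existing species
                break
        else:
            seeds.append(ind)                   # new species seed
            species.append([ind])
    return seeds, species
```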

HTS - Background

•  Most speciation methods rely on a sub-algorithm to determine whether two individuals are of the same species.

•  Speciation methods
  –  Distance-based: Determines whether two individuals are of the same species according to their distance
  –  Topology-based: Determines whether two individuals are of the same species according to the fitness landscape topography

[Diagram: Speciation → Distance-Based; Topology-Based → Hill-Valley, Recursive Middling]

Distance-based Speciation

•  Two individuals are assigned to the same species if their distance is smaller than a predefined threshold called the niche radius.

•  Introduces an additional parameter (i.e., the niche radius), which is difficult to tune.

•  Makes strong assumptions, i.e., equally sized and spherically shaped niches.
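A minimal sketch of this distance-based test (Euclidean distance is used here; the niche radius is the user-supplied threshold mentioned above):

```python
import math

def same_species_distance(a, b, niche_radius):
    """Distance-based test: same species iff Euclidean distance < niche_radius."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return dist < niche_radius
```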

Topology-based Speciation

–  Hill-Valley (HV)
–  Recursive Middling (RM)

•  Make weaker assumptions than distance-based methods.

•  Require sampling new points in order to capture the landscape topography:
  –  When more FEs are spent on speciation, fewer are available for the evolutionary algorithm to converge.
  –  Not very attractive, especially when fitness evaluation is costly.
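For reference, a minimal sketch of the Hill-Valley test as it is usually described: sample a few interior points on the segment between the two individuals; a sampled point worse than both endpoints indicates a valley between them. The number of samples and the interpolation ratios below are illustrative.

```python
def same_species_hill_valley(a, b, fitness, num_samples=3):
    """Hill-Valley test: a and b share a niche if no sampled interior point
    on the segment between them is worse than both endpoints (minimization)."""
    worst_endpoint = max(fitness(a), fitness(b))
    for i in range(1, num_samples + 1):
        t = i / (num_samples + 1)                       # interior interpolation ratio
        point = [(1 - t) * x + t * y for x, y in zip(a, b)]
        if fitness(point) > worst_endpoint:             # found a valley: different niches
            return False
    return True
```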

History-Based Topological Speciation

•  Research question: Could topology-based speciation be FE-free, so that its benefits can be better appreciated?

•  Approach: Capture the landscape topography based exclusively on the search history.

History-Based Topological Speciation

•  Topology-based speciation methods can be interpreted as testing a sequence of points along the segment ab between two individuals.
  –  This sequence is infinite and cannot be tested directly.
  –  The RM method “approximates” it by sampling a few points on ab.

•  Basic idea of HTS: Approximate the sequence using only history data/points.

•  What is a “good” approximation? [Figure: examples of bad and good approximations omitted]

History-Based Topological Speciation

•  Conceptually, HTS follows a two-step procedure:
  1.  Construct a finite discrete approximate sequence from the history points.
  2.  Test the approximate sequence to reach a final decision (trivial).

•  More formally, the problem of finding the best approximation can be stated as an optimization problem over the available history points. [Formal statement omitted in this transcript]
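The formal statement is not reproduced here. Purely as an illustration of the idea, and not the published HTS algorithm, a history-based test could project archived points onto the segment ab, keep those lying close to it as the approximate sequence, and apply a valley check to that sequence, at zero additional fitness evaluations since only archived (point, fitness) pairs are used. The `band` parameter and the valley check are my own simplifications.

```python
def same_species_history_based(a, b, fa, fb, history, band=0.1):
    """Illustrative history-based test (not the published HTS algorithm).

    `history` is a list of (point, fitness) pairs already evaluated by the EA.
    Points whose distance to the segment ab is within `band` * |ab| form the
    approximate sequence, ordered by their projection onto ab.
    """
    ab = [y - x for x, y in zip(a, b)]
    ab_len2 = sum(d * d for d in ab) or 1e-12
    approx = []
    for p, fp in history:
        ap = [y - x for x, y in zip(a, p)]
        t = sum(u * v for u, v in zip(ap, ab)) / ab_len2        # projection ratio on ab
        if 0.0 < t < 1.0:
            foot = [x + t * d for x, d in zip(a, ab)]
            dist2 = sum((u - v) ** 2 for u, v in zip(p, foot))  # squared distance to ab
            if dist2 <= (band ** 2) * ab_len2:
                approx.append((t, fp))
    approx.sort()                                               # order along ab
    # Valley check on the approximate sequence (minimization)
    worst_endpoint = max(fa, fb)
    return all(fp <= worst_endpoint for _, fp in approx)
```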


HTS: Experiments

•  Compared methods
  –  HTS
  –  The traditional distance-based method
     •  DIS: niche radius = σ, computed using Deb and Goldberg’s formula
     •  DIS-: niche radius = (1/5)σ
     •  DIS+: niche radius = 5σ
  –  HV
     •  HV1: sample size = 1
     •  HV3: sample size = 3
     •  HV5: sample size = 5
  –  RM
  –  The fictional idealized method RM*
     •  Works the same as RM, but with all FE cost artificially neglected
     •  Used to obtain a kind of performance upper bound for topology-based speciation

HTS: Experiments

•  Different methods are integrated into the same evolutionary framework for comparison
  –  Crowding Differential Evolution with Species Conservation

•  Benchmark functions
  –  F1–F6: 6 two-dimensional functions with various properties (number of optima: 2–10)
  –  F7–F10: MMP functions in 4, 8, 16, 32 dimensions, respectively (number of optima: 48)
  –  F11: A composition of 50 random 32-dimensional shifted rotated ellipsoidal sub-functions coupled via the max operator (number of optima: 50)

•  The goal is to find all optima of the benchmark functions.

HTS: Experiments

•  Performance measure: The distance error of the last generation is used to measure the performance of the algorithm.

•  Win/Draw/Lose of HTS versus every other method
  –  Both the t-test and the Wilcoxon rank-sum test are used.
  –  A difference is considered statistically significant if it is asserted so by both tests at the 0.05 significance level.
  –  A draw is counted when no statistically significant difference is observed.

   Method:   DIS-    DIS    DIS+   HV1    HV3    HV5    RM     RM*
   W/D/L:    9/0/2   2/5/4  4/5/2  9/0/2  9/0/2  9/0/2  9/0/2  1/3/7

Outline

•  A data-driven perspective on Evolutionary Algorithms
•  History-Based Topological Speciation (HTS)
•  Population-based Algorithm Portfolios based on Estimated Performance Matrix (EPM-PAP)
•  Summary

PAP - Background

•  A scenario frequently encountered in the real world:
  –  A number of optimization problems
  –  A time budget T
  –  A number of optimization algorithms (e.g., GA, ES, EP, EDA, DE, PSO…)

  We want to obtain the best (or as good as possible) solutions for all the problems within T.

PAP - Background

•  Intuitively, the total time budget T can be used for two purposes:
  (1) to identify the best algorithm
  (2) to search for the best solution

•  In general, the more time we spend on (2), the better the solutions we will achieve.

•  Different problems may favor different algorithms. Finding the best algorithm for a problem can be very time consuming.

PAP - Background

General thoughts:

•  Arbitrarily pick an algorithm for every problem?
  –  T will solely be used to search for solutions
  –  Too risky

•  Carefully identify the best algorithm for each problem?
  –  A lot of time will be used for algorithm selection
  –  The time left for searching for good solutions might be insufficient.

•  Try to find a single algorithm suitable for all problems?
  –  Sounds like a good trade-off
  –  The advantages of having a set of different algorithms are not fully utilized.

PAP - Background

•  How about establishing a good “portfolio” of algorithms (i.e., a combination of multiple algorithms) for all problems?

  Advantages:
  –  Makes use of the advantages of different algorithms, rather than putting all the eggs (time) into a single basket (algorithm).
  –  Hopefully not too time-consuming, since only one portfolio is needed for all problems.

PAP - Background

•  Algorithm Portfolios “invest” limited time in multiple algorithms, fully utilizing their respective advantages to maximize the expected utility of a problem-solving episode.

•  Analogy to economics: one allocates one’s money to different financial assets (stocks, bonds, etc.) in order to maximize the expected returns while minimizing risks.

•  Population-based Algorithm Portfolios (PAP)
  –  Conceptually similar to Algorithm Portfolios
  –  Aims to solve a set of problems rather than a single one
  –  Focuses on population-based algorithms (e.g., EAs)

PAP - Background

•  The General Framework of PAP:
  1.  Select the constituent algorithms from a pool of candidate algorithms.
  2.  Construct a concrete PAP instantiation with the constituent algorithms.
  3.  Apply the PAP instantiation to each problem.
  4.  Output the best solution obtained for each problem.

PAP - Background

Which candidate algorithms should serve as constituent algorithms depends on how a PAP instantiation is built:
  –  A PAP instantiation maintains multiple sub-populations.
  –  Each sub-population is evolved with a constituent algorithm.
  –  Information is shared among sub-populations by periodically activating a migration scheme.

PAP - Background

•  Pseudo-code of a PAP instantiation (the original listing is not reproduced in this transcript; a sketch follows):
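A minimal sketch of such an instantiation, assuming each constituent algorithm exposes `init_population()` and `evolve_one_generation()`, and that migration copies the best individuals between sub-populations. These interface names and the migration details are illustrative, not taken from the paper.

```python
import random

def run_pap_instantiation(constituent_algorithms, fitness, max_gen,
                          migration_interval, migration_size=1):
    """Sketch of a PAP instantiation: one sub-population per constituent
    algorithm, with periodic migration of the best individuals."""
    subpops = [alg.init_population() for alg in constituent_algorithms]
    for gen in range(1, max_gen + 1):
        # Evolve each sub-population with its own constituent algorithm
        subpops = [alg.evolve_one_generation(pop, fitness)
                   for alg, pop in zip(constituent_algorithms, subpops)]
        # Periodically share information among sub-populations
        if gen % migration_interval == 0 and len(subpops) > 1:
            for i, pop in enumerate(subpops):
                migrants = sorted(pop, key=fitness)[:migration_size]
                j = random.choice([k for k in range(len(subpops)) if k != i])
                # Replace the worst individuals of the target sub-population
                target = sorted(subpops[j], key=fitness)
                subpops[j] = target[:len(target) - migration_size] + migrants
    # Output the best solution found across all sub-populations
    return min((ind for pop in subpops for ind in pop), key=fitness)
```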

PAP - Background

•  The above pseudo-code looks very simple.

•  But… if combining different EAs in this way did not lead to any advantage over using a single algorithm, why bother seeking the so-called constituent algorithms?

PAP - Background

•  A preliminary experiment
  –  4 candidate EAs: CMA-ES, G3PCX, SaNSDE, wPSO
  –  11 PAP instantiations (with 2, 3, and 4 constituent algorithms)

•  For each PAP instantiation
  –  Compare it to each of its constituent algorithms alone, to verify whether there is any advantage of PAP over its constituent algorithms.
  –  Compare it to G-CMA-ES, to verify whether a portfolio of some “weak” algorithms could outperform the state of the art.

•  Benchmark problems
  –  13 numerical problems from a classical benchmark suite
  –  14 numerical problems from the CEC2005 benchmark suite
  –  Dimension: 30

PAP - Background

•  Total fitness evaluations (FEs) for each problem: 300,000

•  30 independent runs on each problem

•  For convenience of implementation, all constituent algorithms of a PAP instantiation evolve for the same number of generations.

•  Parameters of constituent algorithms are not fine-tuned, but set according to rules or default values suggested in the original publications.

•  migration_interval = MAX_GEN/20, migration_size = 1

PAP - Background

•  Wilcoxon test results (significance level 0.05): “w-d-l” stands for “win-draw-lose”.

PAP - Background

•  Observation: Combining EAs in this simple way does lead to performance improvement.

•  Is that due to fine-tuning of the migration parameters?

•  Consider four values for each of the two migration parameters and test the 16 resulting parameter pairs.
  –  migration_interval: MAX_GEN divided by 20, 30, 40, 50
  –  migration_size: 1, 2, 4, 8

•  Sensitivity analysis
  –  For each pair, 30 runs were executed on all 27 benchmark functions.
  –  For each instantiation, count the number of benchmark functions on which all 16 parameter pairs made no difference.
  –  The Kruskal-Wallis one-way analysis of variance by ranks, with significance level 0.05, was employed.

PAP - Background

•  Results: The performance of a PAP instantiation is not very sensitive to the migration parameters, but relies on the constituent algorithms.

EPM-PAP

On Choosing Constituent Algorithms

•  Let F = {fk | k = 1, 2, …, n} be a given problem set and A = {aj | j = 1, 2, …, m} be a set of candidate EAs. Choosing constituent algorithms for PAP is formulated as seeking the subset Ã = {ai | i = 1, 2, …, l} of A that leads to the best overall performance on F:

$$\tilde{A}_{\mathrm{opt}} = \arg\max_{\tilde{A} \subseteq A} U(\tilde{A}, F, T)$$

•  A most straightforward approach: enumerate all possible subsets and employ a procedure like statistical racing to find the best one.

  This would be even more time consuming than selecting a single algorithm!

EPM-PAP

•  Recall that we expect a good PAP instantiation to under-perform a candidate EA (say, aj) only with small probability.

•  Let Pi,jk denote the probability that constituent algorithm ai outperforms candidate aj on problem fk. Assuming independence between constituent algorithms, the above statement can be written, for an algorithm aj on problem fk, as:

$$R_{jk} = \prod_{i=1}^{l} \left(1 - P_{i,jk}\right)$$

•  Averaging over all problems and all candidate EAs, we get

$$R = \frac{1}{mn} \sum_{j=1}^{m} \sum_{k=1}^{n} \prod_{i=1}^{l} \left(1 - P_{i,jk}\right) \qquad (1)$$

EPM-PAP

What is an Estimated Performance Matrix (EPM)?

•  A matrix that records the performance of each candidate EA.

•  For each aj, the corresponding EPM, denoted by EPMj, is an r-by-n matrix.

•  This matrix can be obtained by running aj on each of the n problems r times.

•  Each element of EPMj is the objective value of the best solution that aj obtained on a problem in a single run.

•  Since each element of EPMj is obtained with only a small portion of T, it can be viewed as a conservative estimate of the solution quality achieved by running aj with the full budget T on the same problem.
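As an illustration (not code from the paper), constructing the EPMs could look like the following; `run_once` is an assumed helper that runs one candidate EA on one problem with the reduced budget and returns the best objective value found.

```python
def build_epm(candidate, problems, r, run_once):
    """Build the r-by-n Estimated Performance Matrix for one candidate EA.

    EPM[row][k] is the best objective value the candidate obtained on
    problems[k] in one independent run (row = run index, minimization).
    `run_once(candidate, problem)` is an assumed helper using a small
    portion of the total budget T.
    """
    return [[run_once(candidate, problem) for problem in problems]
            for _ in range(r)]

# One EPM per candidate EA in the pool A:
# epms = {name: build_epm(alg, problems, r=10, run_once=run_once)
#         for name, alg in candidate_eas.items()}
```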

EPM-PAP

•  With the help of some statistical tests, EPMs provide all the information needed to calculate Eq. (1).

Good news:

•  No need to compare the performance of all possible subsets with a tedious procedure like statistical racing.

•  Estimating the performance of individual candidate EAs is sufficient for constituent algorithm subset selection.

EPM-PAP

Detailed steps for choosing constituent algorithms (a sketch follows the list):

1.  Apply each candidate EA aj to each problem for r independent runs. The final population obtained in each run is stored.

2.  Construct the EPM for each aj based on the quality of the best solution it obtained in each run.

3.  Enumerate all possible subsets of A and calculate the corresponding R using Eq. (1) and the EPMs.

4.  Select the subset with the smallest R as the constituent algorithms for PAP.
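A minimal sketch of steps 3 and 4, with each Pi,jk estimated as the fraction of pairwise run comparisons in which candidate ai beats aj on problem fk. The slides mention using statistical tests for this estimate; the simple frequency estimate and the fixed subset size below are my own simplifications for illustration.

```python
from itertools import combinations

def estimate_p(epm_i, epm_j, k):
    """Estimate P_{i,jk}: fraction of run pairs where a_i's best value
    beats a_j's best value on problem f_k (minimization)."""
    col_i = [row[k] for row in epm_i]
    col_j = [row[k] for row in epm_j]
    wins = sum(1 for x in col_i for y in col_j if x < y)
    return wins / (len(col_i) * len(col_j))

def risk(subset, epms, names, n_problems):
    """R from Eq. (1): average probability that the portfolio built from
    `subset` under-performs a candidate EA on a problem."""
    total = 0.0
    for j in names:                          # every candidate EA a_j
        for k in range(n_problems):          # every problem f_k
            prod = 1.0
            for i in subset:                 # every constituent algorithm a_i
                prod *= 1.0 - estimate_p(epms[i], epms[j], k)
            total += prod
    return total / (len(names) * n_problems)

def choose_constituents(epms, n_problems, size):
    """Enumerate all subsets of the given size and keep the one with smallest R."""
    names = list(epms)
    return min(combinations(names, size),
               key=lambda subset: risk(subset, epms, names, n_problems))

# e.g. choose_constituents(epms, n_problems=27, size=2)
```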

EPM-PAP: Experiments

•  4 candidate EAs: CMA-ES, G3PCX, SaNSDE, wPSO

•  Benchmark problems
  –  13 numerical problems from the classical benchmark suite used in [6]
  –  14 numerical problems from the CEC2005 benchmark suite [7]
  –  Dimension: 30

EPM-PAP: Experiments

•  Total fitness evaluations (FEs) for each problem: 400,000, 800,000, and 1,200,000, respectively (three time budgets, referred to as T1, T2, and T3 below).

•  25 independent runs on each problem

•  For convenience of implementation, all constituent algorithms of a PAP instantiation evolve for the same number of generations.

•  Parameters of constituent algorithms are not fine-tuned.

•  migration_interval = MAX_GEN/20, migration_size = 1

•  PAPs with 2 and 3 constituent algorithms are considered.

EPM-PAP: Experiments

•  Wilcoxon test results (significance level 0.05): “w-d-l” stands for “win-draw-lose”

              Time Budget   SaNSDE    wPSO      G3PCX    CMA-ES    F-Race    Intra-AOTA
  EPM-PAP-2   T1            8-14-5    17-10-0   21-6-0   8-13-6    9-14-4    6-15-6
              T2            7-14-6    16-10-1   20-7-0   9-14-4    7-15-5    5-18-4
              T3            6-15-6    17-9-1    21-6-0   10-14-3   7-14-6    6-18-3
  EPM-PAP-3   T1            9-11-7    19-7-1    21-5-1   10-10-4   10-13-4   5-17-5
              T2            8-17-2    17-9-1    20-7-0   9-12-6    9-12-6    5-20-2
              T3            9-16-2    17-10-0   21-6-0   9-14-4    9-14-4    6-20-1

EPM-PAP: Experiments

•  Performance ranking of all possible EPM-PAP-2 and EPM-PAP-3 instantiations

  With 2 constituent algorithms:
  Rank   Time Budget = T1      Time Budget = T2      Time Budget = T3
  1      SaNSDE + CMA-ES       SaNSDE + CMA-ES       SaNSDE + CMA-ES
  2      wPSO + CMA-ES         wPSO + CMA-ES         wPSO + CMA-ES
  3      SaNSDE + wPSO         SaNSDE + wPSO         SaNSDE + wPSO
  4      SaNSDE + G3PCX        SaNSDE + G3PCX        SaNSDE + G3PCX
  5      G3PCX + CMA-ES        G3PCX + CMA-ES        G3PCX + CMA-ES
  6      wPSO + G3PCX          wPSO + G3PCX          wPSO + G3PCX

  With 3 constituent algorithms:
  Rank   Time Budget = T1         Time Budget = T2         Time Budget = T3
  1      SaNSDE+wPSO+CMA-ES       SaNSDE+wPSO+CMA-ES       SaNSDE+wPSO+CMA-ES
  2      SaNSDE+G3PCX+CMA-ES      SaNSDE+G3PCX+CMA-ES      SaNSDE+G3PCX+CMA-ES
  3      SaNSDE+wPSO+G3PCX        SaNSDE+wPSO+G3PCX        wPSO+G3PCX+CMA-ES
  4      wPSO+G3PCX+CMA-ES        wPSO+G3PCX+CMA-ES        SaNSDE+wPSO+G3PCX

EPM-PAP: Experiments

•  Success rates of the EPM-based selection procedure: how likely did it select the best constituent algorithm subset?

              Time Budget   SR1    SR2
  EPM-PAP-2   T1            40%    88%
              T2            56%    100%
              T3            72%    100%
  EPM-PAP-3   T1            16%    84%
              T2            36%    88%
              T3            56%    100%

Outline

•  A data-driven perspective on Evolutionary Algorithms
•  History-Based Topological Speciation (HTS)
•  Population-based Algorithm Portfolios based on Estimated Performance Matrix (EPM-PAP)
•  Summary

Summary

•  Data-driven EAs make use of data analytics approaches to gain useful information from the data generated during search.

•  Two recent examples of DDEC have been introduced.

•  In HTS, history data are utilized to determine whether two individuals belong to the same species.

•  In EPM-PAP, history data are utilized to select constituent algorithms for PAP.

•  Different contexts in EC may induce significantly different data analytics problems, where a lot of work could be done.

Collaborators

–  Mr. Lingxi Li (HTS)
–  Dr. Fei Peng (EPM-PAP)
–  Prof. Xin Yao (EPM-PAP)
–  Prof. Guoliang Chen (EPM-PAP)

References

HTS
•  L. Li and K. Tang, “History-Based Topological Speciation for Multimodal Optimization,” IEEE Transactions on Evolutionary Computation, in press (DOI: 10.1109/TEVC.2014.2306677).

EPM-PAP
•  F. Peng, K. Tang, G. Chen and X. Yao, “Population-based Algorithm Portfolios for Numerical Optimization,” IEEE Transactions on Evolutionary Computation, 14(5): 782-800, October 2010.

•  K. Tang, F. Peng, G. Chen and X. Yao, “Population-based Algorithm Portfolios with automated constituent algorithms selection,” Information Sciences, in press (DOI: 10.1016/j.ins.2014.03.105).

Thanks for your time! Q&A?

