Spurious Dependencies and EDA Scalability. Elizabeth Radetic and Martin Pelikan, Missouri Estimation of Distribution Algorithms Laboratory (MEDAL), University of Missouri, St. Louis, MO, http://medal.cs.umsl.edu/. MEDAL Report No. 2010002: http://medal.cs.umsl.edu/files/2010002.pdf
Spurious Dependencies and EDA Scalability
Elizabeth Radetic and Martin Pelikan
Missouri Estimation of Distribution Algorithms Laboratory (MEDAL)
University of Missouri, St. Louis, MO
http://medal.cs.umsl.edu/
Estimation of distribution algorithms (EDAs)
- Replace standard crossover and mutation by
  - building a probabilistic model of selected solutions, and
  - sampling the probabilistic model to generate new solutions.
- Can solve many problems intractable with standard EAs.

Model accuracy
- It is important that the EDA model is accurate.
- Types of inaccuracies for dependency-based models
  - Missing dependencies.
  - Spurious, unnecessary dependencies.
- Most prior work focused on missing dependencies.

This study
- Focus on effects of spurious dependencies.
  - Theoretical study for population sizing.
  - Empirical study for the number of generations.
Elizabeth Radetic and Martin Pelikan Spurious Dependencies and EDA Scalability
Outline
1. Model accuracy.
2. Spurious dependencies
   - Model for spurious dependencies.
   - Effects on population sizing.
   - Effects on the number of generations.
3. Experiments.
4. Conclusions and future work.
Dependency-Based Probabilistic Models in EDAs
Dependency-based probabilistic models
- Encode dependencies and independencies between variables.
- Dependency structure decomposes the problem.
- Subproblems should be of bounded order.

Examples
- Marginal product models.
- Bayesian networks.
Marginal Product Model
- Variables are divided into linkage groups.
- Defines problem decomposition into separable subproblems.
- Distribution of each group encoded by probability table.
- We assume binary representation of candidate solutions.
Martin Pelikan, Probabilistic Model-Building GAs
How to Learn a Tree Model?
- Mutual information:
  \[ I(X_i, X_j) = \sum_{a,b} P(X_i = a, X_j = b) \log \frac{P(X_i = a, X_j = b)}{P(X_i = a)\, P(X_j = b)} \]
- Goal
  - Find the tree that maximizes mutual information between connected nodes.
  - Will minimize Kullback-Leibler divergence.
- Algorithm
  - Prim's algorithm for maximum spanning trees.
Prim's Algorithm
- Start with a graph with no edges.
- Add an arbitrary node to the tree.
- Iterate
  - Hang a new node to the current tree.
  - Prefer addition of edges with large mutual information (greedy approach).
- Complexity: O(n^2)
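The tree-learning steps above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in any of the cited algorithms: the function names `mutual_information` and `learn_tree`, the choice of position 0 as the arbitrary root, and the representation of the population as a list of equal-length tuples are all assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(population, i, j):
    """Empirical mutual information (in nats) between string positions i and j."""
    n = len(population)
    joint = Counter((x[i], x[j]) for x in population)
    marg_i = Counter(x[i] for x in population)
    marg_j = Counter(x[j] for x in population)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((marg_i[a] / n) * (marg_j[b] / n)))
    return mi


def learn_tree(population):
    """Prim-style maximum spanning tree over pairwise mutual information.

    Returns a list `parent` where parent[v] is the tree parent of position v
    (None for the root). Uses O(n^2) mutual-information evaluations.
    """
    n_vars = len(population[0])
    parent = [None] * n_vars
    in_tree = [False] * n_vars
    in_tree[0] = True  # arbitrary root (assumption: position 0)
    # best_gain[v]: largest mutual information between v and any tree node so far
    best_gain = [0.0] * n_vars
    for v in range(1, n_vars):
        best_gain[v] = mutual_information(population, 0, v)
        parent[v] = 0
    for _ in range(n_vars - 1):
        # Greedy step: hang the outside node with the strongest link to the tree.
        v = max((u for u in range(n_vars) if not in_tree[u]),
                key=lambda u: best_gain[u])
        in_tree[v] = True
        for u in range(n_vars):
            if not in_tree[u]:
                gain = mutual_information(population, v, u)
                if gain > best_gain[u]:
                    best_gain[u] = gain
                    parent[u] = v
    return parent
```

For instance, on a population in which positions 0 and 1 always agree and position 2 is independent of both, the sketch links position 1 to position 0, and the mutual information of that pair equals ln 2.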
Variants of PMBGAs with Tree Models
- COMIT (Baluja, Davies, 1997)
  - Tree models.
- MIMIC (DeBonet, 1996)
  - Chain distributions.
- BMDA (Pelikan, Mühlenbein, 1998)
  - Forest distribution (independent trees or tree)
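Beyond Pairwise Dependencies: ECGA
- Extended Compact GA (ECGA) (Harik, 1999).
- Consider groups of string positions.

[Figure: a string and its model; string positions are divided into groups and the model stores one probability table per group.]
  One-bit group:    0: 86%, 1: 14%
  Three-bit group:  000: 17%, 001: 2%, ..., 111: 24%
  Two-bit group:    00: 16%, 01: 45%, 10: 35%, 11: 4%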
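To make the marginal product model concrete, the following Python sketch estimates one probability table per linkage group from a set of selected solutions and then samples a new solution group by group. It is a minimal illustration under stated assumptions: the names `estimate_mpm` and `sample_mpm` are hypothetical, the linkage groups are given rather than learned, and solutions are tuples of bits.

```python
import random
from collections import Counter


def estimate_mpm(population, groups):
    """Estimate one probability table per linkage group.

    `groups` is a list of tuples of string positions. Each table maps an
    observed joint setting of the group's bits to its relative frequency.
    """
    n = len(population)
    tables = []
    for group in groups:
        counts = Counter(tuple(x[i] for i in group) for x in population)
        tables.append({setting: c / n for setting, c in counts.items()})
    return tables


def sample_mpm(tables, groups, n_bits, rng=random):
    """Sample one solution by drawing every group independently from its table."""
    x = [0] * n_bits
    for group, table in zip(groups, tables):
        settings = list(table.keys())
        weights = list(table.values())
        chosen = rng.choices(settings, weights=weights, k=1)[0]
        for pos, bit in zip(group, chosen):
            x[pos] = bit
    return x
```

Because each group is sampled as a unit, joint settings that never occur in the selected population (here, for example, a group of two bits that always agree) cannot be produced by sampling.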
Model Accuracy
Types of inaccuracies
- Missing dependencies.
- Spurious, unnecessary dependencies.

Example: Trap-5
- \[ f_{trap5}(X_1, \ldots, X_n) = \sum_{i=1}^{n/5} trap_5(X_{5i-4} + X_{5i-3} + X_{5i-2} + X_{5i-1} + X_{5i}) \]
- \[ trap_5(u) = \begin{cases} 5 & \text{if } u = 5 \\ 4 - u & \text{otherwise} \end{cases} \]
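The trap-5 definition can be written directly in Python. The sketch below assumes the solution is a sequence of 0/1 values whose length is a multiple of 5 and that each trap covers five consecutive positions, as in the formula above; the function names are chosen for this example only.

```python
def trap5(u):
    """Fitness of one 5-bit partition containing u ones."""
    return 5 if u == 5 else 4 - u


def f_trap5(x):
    """Sum of trap5 over consecutive, non-overlapping 5-bit partitions of x."""
    if len(x) % 5 != 0:
        raise ValueError("string length must be a multiple of 5")
    return sum(trap5(sum(x[i:i + 5])) for i in range(0, len(x), 5))
```

The all-ones string is the global optimum, while every partition with fewer than five ones is rewarded for moving toward all zeros. This is why a model that misses the dependencies inside each partition is misled.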
Onemax Model of Spurious Dependencies
Onemax is the sum of bits in the binary string
- \[ onemax(X_1, \ldots, X_n) = \sum_{i=1}^{n} X_i \]

Perfect and spurious models for onemax
- Perfect model assumes no dependence at all.
- Spurious model assumes linkage groups of order k_spurious > 1.
- Parameter k_spurious controls order of spurious dependencies.
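A small Python sketch of onemax and of a fixed spurious linkage structure may help here. How the slides assign positions to groups is not stated, so the sketch assumes that groups consist of consecutive positions and that n is divisible by k_spurious; `spurious_groups` is a hypothetical helper name.

```python
def onemax(x):
    """Number of ones in the binary string x."""
    return sum(x)


def spurious_groups(n, k_spurious):
    """Partition positions 0..n-1 into consecutive linkage groups of size k_spurious.

    k_spurious = 1 gives the perfect (fully independent) model for onemax;
    larger values give models with spurious dependencies.
    Assumption: n is divisible by k_spurious.
    """
    if k_spurious < 1 or n % k_spurious != 0:
        raise ValueError("n must be a positive multiple of k_spurious")
    return [tuple(range(start, start + k_spurious))
            for start in range(0, n, k_spurious)]
```

With k_spurious = 1 every bit forms its own group. With k_spurious = 2 a 6-bit problem is modeled with three groups of two bits each, even though onemax contains no interactions between bits.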
Effects of Spurious Models on EDA Performance
Two main effects of spurious dependencies
- Population size.
- Number of generations.

Population sizing decomposition
- Population size requirements should increase.
- Effects depend on learning, but are sometimes substantial.

Number of generations
- Number of generations may decrease due to weaker variation.
- Effects are not expected to be substantial.
EDA Population Sizing and Spurious Dependencies
Population sizing decomposition
- Initial supply
  - Initial population is random.
  - Ensure sufficient supply of partial solutions for each group.
- Decision making
  - Decision making between partial solutions is stochastic.
  - Ensure that best partial solution wins in each group.
- Model building
  - Ensure accurate enough models to find the optimum.
  - The reason for spurious dependencies, not the effect.

Focus in this work
- Initial supply.
- Decision making.
Population Sizing: Initial Supply
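Initial supply for perfect model (Goldberg et al., 2001)
\[ N = 2 \ln(2m) \]
where m is the number of linkage groups (m = n for the perfect onemax model).

Initial supply for arbitrary k_spurious
\[ N = 2^{k_{spurious}} \left( k_{spurious} \ln 2 + \ln \frac{n}{k_{spurious}} \right) \]

Initial-supply population increase factor
\[ \gamma_{is} = 2^{k_{spurious} - 1} \, \frac{k_{spurious} \ln 2 + \ln \frac{n}{k_{spurious}}}{\ln 2 + \ln n} \]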
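The initial-supply expressions can be evaluated numerically with a short Python sketch. It assumes the bound N = 2^k (k ln 2 + ln(n/k)) for linkage groups of order k = k_spurious on an n-bit onemax, which reduces to 2 ln(2n) for the perfect model (k = 1). The function names are illustrative only.

```python
import math


def initial_supply_size(n, k_spurious=1):
    """Initial-supply population size for n bits and linkage groups of order k_spurious."""
    k = k_spurious
    return 2 ** k * (k * math.log(2) + math.log(n / k))


def initial_supply_factor(n, k_spurious):
    """Ratio of the initial-supply size with spurious groups to that of the perfect model."""
    return initial_supply_size(n, k_spurious) / initial_supply_size(n, 1)
```

Calling `initial_supply_factor(300, k)` for k = 1, ..., 5 traces how the initial-supply factor grows with the order of spurious dependencies on a 300-bit problem; the dominant term is the exponential 2^(k - 1).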
Population Sizing: Decision Making
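Decision making for perfect model (Harik et al., 1997)
\[ N = -\frac{1}{2} \ln\alpha \, \sqrt{\pi (n - 1)} \]

Decision making for arbitrary k_spurious
\[ N = -2^{k_{spurious} - 2} \ln\alpha \, \sqrt{\pi (n - 1)} \]

Decision-making population increase factor (ratio of the two expressions above)
\[ \gamma_{dm} = \frac{2^{k_{spurious} - 2}}{1/2} = 2^{k_{spurious} - 1} \]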
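The decision-making (gambler's ruin) bound can be evaluated in the same way. The sketch assumes the form N = -2^(k - 2) ln(alpha) sqrt(pi (n - 1)) with k = k_spurious, which gives -(1/2) ln(alpha) sqrt(pi (n - 1)) for the perfect model; alpha is read here as the tolerated probability of error, and `decision_making_size` is a name chosen for this example.

```python
import math


def decision_making_size(n, alpha, k_spurious=1):
    """Gambler's-ruin population size for n-bit onemax with groups of order k_spurious.

    alpha is the allowed probability of error (0 < alpha < 1), so ln(alpha) < 0
    and the returned size is positive.
    """
    return -(2 ** (k_spurious - 2)) * math.log(alpha) * math.sqrt(math.pi * (n - 1))
```

Dividing `decision_making_size(n, alpha, k)` by `decision_making_size(n, alpha, 1)` gives 2^(k - 1) regardless of n and alpha, so under this model each additional bit in a spurious linkage group doubles the required population size.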
Number of Generations
Effects of spurious dependencies on number of generations
- Spurious dependencies weaken the mixing.
- This reduces the effects of variation.
- This should reduce the number of generations until convergence (assuming a large enough population).
- No theoretical model as of now.
Description of Experiments
Operators
- Binary tournament selection without replacement.
- Three replacement types
  - Full replacement.
  - Elitist replacement (50% worst are replaced).
  - Restricted tournament replacement (niching).
- Models with various levels of spurious linkage.

Parameters
- Optimal population size obtained by bisection.
- Runs stop when a solution close enough to the optimum is reached (allow one linkage group to end up incorrect).
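Determining the population size by bisection can be sketched as follows. This is a generic illustration, not the exact procedure used in the experiments: `is_reliable` is a hypothetical callback that reports whether a given population size is sufficient (for example, whether a batch of independent runs all succeed), and the starting size and the 10% stopping tolerance are assumptions.

```python
def bisection_population_size(is_reliable, n_start=16, tolerance=0.1):
    """Find a small population size N for which is_reliable(N) is True.

    Phase 1 doubles N until it is reliable; phase 2 bisects between the last
    unreliable size and the first reliable one until the bracket is within
    `tolerance` (relative to its lower end) or cannot be narrowed further.
    """
    low = high = n_start
    while not is_reliable(high):
        low, high = high, high * 2
    while high - low > 1 and (high - low) / low > tolerance:
        mid = (low + high) // 2
        if is_reliable(mid):
            high = mid
        else:
            low = mid
    return high
```

With a deterministic stand-in such as `lambda N: N >= 137`, the sketch returns 144 for the default tolerance and 137 when the tolerance is set to 0. In an actual EDA experiment the callback is stochastic, which is one reason for repeating the bisection several times and averaging.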
Population Size (Full Replacement)
[Figure 2: Growth of the population size with respect to the group size for a problem of 300 bits. Panel (a), population size, shows the actual population sizes (experiment) compared to the theoretical models (gambler's ruin and initial supply) as a function of the spurious linkage group size. Panel (b), population size ratio, shows the ratio of the population sizes with spurious linkage and the population sizes with no spurious linkage.]

[Figure 3: Growth of the population size with respect to the spurious linkage group size (bits per group) for problem sizes of 60, 120, 180, 240, and 300 bits. Panels: (a) full replacement, (b) elitist replacement, (c) RTR.]
1(a) shows the average number of spurious linkage groups (groups of size at least 2) for each problem size. The results indicate that the number of such groups increases approximately linearly with problem size. Figure 1(b) shows the average size of spurious linkage groups. For each problem size, the size of spurious linkage groups is close to two, indicating that larger linkage groups were created only rarely. Finally, figure 1(c) shows the average linkage group size when both spurious linkage groups and independent bits are taken into account. The average group size is between 1 and 2 and it increases slightly with problem size. Similar results for ECGA on onemax were reported in ref. [28]. The results presented here thus reaffirm the need for studying spurious dependencies and their effects.

4.2.2 Population sizing with spurious linkage

This section presents the results of using fixed MPM models with spurious dependencies on onemax. To confirm the theory presented in section 3, figure 2 compares the experimental results for the population size to the predictions made by the initial supply and gambler's ruin population sizing models. As expected, the gambler's ruin model matches the experimental results more closely than that for the initial supply. The gambler's ruin model can therefore help to determine the impact of spurious dependencies on EDA population sizing. Since the population size is one of the key factors that affect EDA performance, this can provide guidelines for predicting the overall impact of spurious dependencies on EDA performance.

Additional results on the effects of spurious dependencies on the population size for all problem sizes and replacement methods are shown in figure 3. These results illustrate that, in each case, the population size grows approximately exponentially with the spurious linkage group size. Further-
- Increase of population size with k_spurious is exponential.
- Theory provides a conservative bound.
Population Size (All Replacement Strategies)
[Figure 3: Growth of the population size with respect to the spurious linkage group size (bits per group) for problem sizes of 60, 120, 180, 240, and 300 bits. Panels: (a) full replacement, (b) elitist replacement, (c) RTR.]
- The increase of population size with k_spurious is similar in all cases.
Number of Generations (All Replacement Strategies)
[Figure 4: Growth of the number of generations with respect to the spurious linkage group size (bits per group) for problem sizes of 60, 120, 180, 240, and 300 bits. Panels: (a) full replacement, (b) elitist replacement, (c) RTR.]

[Figure 5: The population size, the number of generations, and the number of evaluations for the 300-bit onemax with varying numbers of spurious linkage groups of size 2 (the remaining groups are of size 1), plotted against the average spurious linkage group size for full, elitist, and RTR replacement. Panels: (a) population size, (b) number of generations, (c) number of evaluations.]
more, elitist replacement requires somewhat smaller populations than full replacement, and RTR outperforms other replacement strategies in terms of the population size. However, in each case the relative increase of population sizes with the size of spurious linkage groups is similar.

The results for ECGA presented earlier (figure 1) indicated that typically the models in ECGA combined only a fraction of the bits into spurious linkage groups, most of which were of size 2. The results of figure 5 illustrate the effects of having different proportions of such linkage groups. More specifically, figure 5 illustrates the results obtained when using a variable number of spurious linkage groups of size 2 on a 300-bit onemax (the remaining groups are of size 1). From these results, it is clear that the required population size increases with the number of spurious linkage groups. An upper bound for this scenario should of course be provided by EDAs with MPMs with the fixed order of spurious dependencies k_spurious = 2.

In summary, the results presented in this subsection demonstrate that spurious dependencies tend to increase the required population size for EDAs and that the theory from section 3 provides an accurate estimation of the effects of spurious dependencies on the population size.

4.2.3 Number of generations with spurious linkage

According to the results in figures 4 and 5, the number of generations until optimum does not seem to be strongly affected by the order of spurious dependencies when full and elitist replacement methods are used. In fact, figure 4 shows that the number of generations required to find an accurate solution decreases slightly with the size of spurious linkage groups. Figure 5 shows similar results on a smaller scale.
- Full and elitist replacement
  - Number of generations slightly decreases with k_spurious.
- Niching (restricted tournament replacement)
  - Number of generations dramatically increases!
Spurious Linkage in Multivariate EDAs
Experiment
- Use optimal population size in ECGA.
- Observe spurious dependencies in actual models.
[Figure 1: The average number of spurious linkage groups (groups of size ≥ 2), the average size of linkage groups of size ≥ 2, and the average linkage group size (including all linkage groups) for ECGA on onemax, plotted against problem size (number of bits). Three replacement strategies are considered: full replacement, elitist replacement, and RTR. For each problem size and replacement strategy, the results represent an average over 100 runs (10 bisections of 10 runs each). Panels: (a) number of spurious linkage groups, (b) average size of spurious linkage groups, (c) average linkage group size.]
Problems of size n = 60 to 300 bits in increments of 60 were tested. Population sizes were determined empirically using the bisection method [27, 22] to ensure 10 successful consecutive runs. For each problem size and each test scenario, 10 independent bisections were performed, for a total of 100 independent runs. A run was considered successful when a string was found with at most one suboptimal linkage group (with the linkage groups depending on the used model). For the base cases with no spurious linkage and the experiments with ECGA, at most 1 bit was allowed to be incorrect. Full population convergence was not required and each run was terminated when one solution of the desired quality had been found. This allowed the same stopping criterion for all tested replacement methods including those with niching, which prevents full convergence.

Binary tournament selection without replacement was used to select parent populations. To incorporate the offspring into the population, three replacement strategies were tested: (1) full replacement, where the child population completely replaces the parent population; (2) elitist replacement, where the worst 50% of the parent population was replaced with the child population; and (3) restricted tournament replacement (RTR) [10, 22], where for each offspring w solutions were randomly selected from the parent population and the one genotypically closest to the offspring was replaced if its fitness was worse. The window size in RTR was set to min(n, 0.05N) as suggested by ref. [22].

The maximum number of generations allowed was set to 5n for most runs, but this limit was increased to 10000n when RTR was used with spurious dependencies. The need for this increase was due to the effects of niching, as will be discussed in section 4.2.3.
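The restricted tournament replacement step described above can be sketched in Python. This is a minimal illustration under stated assumptions: genotypic closeness is measured by Hamming distance, ties are broken by sample order, the window is rounded down to an integer of at least 1, and the function names are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions in which two equal-length strings differ."""
    return sum(u != v for u, v in zip(a, b))


def rtr_window(n, pop_size):
    """Window size min(n, 0.05 N), rounded down and kept at least 1 (rounding is an assumption)."""
    return max(1, int(min(n, 0.05 * pop_size)))


def rtr_insert(population, fitnesses, child, child_fitness, window, rng=random):
    """Try to insert one offspring using restricted tournament replacement.

    Picks `window` distinct random members, finds the one closest to the child,
    and replaces it in place only if its fitness is worse than the child's.
    Returns the replaced index, or None if the child was discarded.
    """
    candidates = rng.sample(range(len(population)), window)
    closest = min(candidates, key=lambda idx: hamming(population[idx], child))
    if fitnesses[closest] < child_fitness:
        population[closest] = child
        fitnesses[closest] = child_fitness
        return closest
    return None
```

Because an offspring competes only with a similar member of the population, dissimilar solutions are preserved. This is the niching effect that prevents full convergence of the population.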
4.2 Results

This section presents experimental results. First, the results that depict the accuracy of ECGA models on onemax are presented. The effects of spurious dependencies on the population size and the time to convergence are then shown.

4.2.1 Spurious linkage for ECGA on onemax

The results in figure 1 provide an insight into the number and size of spurious dependencies discovered by ECGA on onemax using the population size obtained with bisection. Specifically, figure
Conclusions and Future Work
Conclusions
- Population size increases exponentially with k_spurious.
- Number of generations mostly unaffected.
- But for niching, the number of generations skyrockets!
- Spurious dependencies should not be ignored.

Future work
- From our model to multivariate EDAs
  - In most EDAs, population sizing is driven by model building.
  - Almost always the models contain spurious dependencies.
  - How do the models interact?
- Dramatic increase in the number of generations with niching
  - Explain why.
  - Propose ways to deal with it.
Acknowledgments
- NSF; NSF CAREER grant ECS-0547013.
- University of Missouri; High Performance Computing Collaboratory sponsored by Information Technology Services; Research Award; Research Board.