IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION

Letters

Decomposition-Based Multiobjective Evolutionary Algorithm with an Ensemble of Neighborhood Sizes

Shi-Zheng Zhao, Ponnuthurai Nagaratnam Suganthan, Senior
Member, IEEE, and Qingfu Zhang, Senior Member, IEEE
Abstract—The multiobjective evolutionary algorithm based on decomposition (MOEA/D) has demonstrated superior performance by winning the multiobjective optimization algorithm competition at CEC 2009. For effective performance of MOEA/D, the neighborhood size (NS) parameter has to be tuned. In this letter, an ensemble of different NSs with online self-adaptation (ENS-MOEA/D) is proposed to overcome this shortcoming. Our experimental results on the CEC 2009 competition test instances show that an ensemble of different NSs with online self-adaptation yields superior performance over implementations with only one fixed NS.
Index Terms—Decomposition, multiobjective optimization, self-adaptation.
I. Introduction

A MULTIOBJECTIVE optimization problem (MOP) can be defined mathematically as follows [1]:

minimize F(x) = (f1(x), ..., fm(x))T
subject to x ∈ Ω        (1)

where Ω is the decision (variable) space, F : Ω → Rm consists of m real-valued objective functions, and Rm is called the objective space. In many real-world applications, since the objectives in (1) conflict with one another, no point in Ω can minimize all the objectives at the same time.
Let u, v ∈ Rm; u is said to dominate v if and only if ui ≤ vi for every i ∈ {1, ..., m} and uj < vj for at least one index j ∈ {1, ..., m}. A point x∗ ∈ Ω is Pareto optimal if there is no other point x ∈ Ω such that F(x) dominates F(x∗). F(x∗) is then called a Pareto optimal (objective) vector. In other words, any improvement of a Pareto optimal point in one objective must lead to deterioration in at least one other objective. The set of all the Pareto optimal objective vectors is the Pareto front (PF) [1].
Many multiobjective evolutionary algorithms (MOEAs) have been developed to find a set of representative Pareto optimal solutions in a single run. Most of them are Pareto dominance based. Guided mainly by dominance-based fitness measures of individual solutions, these algorithms push the whole population toward the PF. NSGA-II, SPEA-II, and PAES [1] have been among the most popular Pareto-dominance-based MOEAs in the past.

Manuscript received December 24, 2010; revised May 31, 2011; accepted August 4, 2011.
S.-Z. Zhao and P. N. Suganthan are with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore (e-mail: [email protected]; [email protected]).
Q. Zhang is with the School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, U.K. (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TEVC.2011.2166159
Multiobjective evolutionary algorithm based on decomposition (MOEA/D) [3] is a recent MOEA. Using conventional aggregation approaches, MOEA/D decomposes the approximation of the PF into a number of single-objective optimization subproblems. The objective of each subproblem is a (linear or nonlinear) weighted aggregation of all the objectives in the MOP under consideration. Neighborhood relations among these subproblems are defined based on the distances among their aggregation weight vectors. Each subproblem is optimized by using information mainly from its neighboring subproblems. The neighborhood size (NS) plays a crucial role in MOEA/D [5]. Arguably, different multiobjective problems need different NSs, and even for a particular problem, using different NSs at different search stages could improve the algorithm performance. When some solutions are trapped in a locally optimal region, a large NS is required to increase diversity and help these solutions escape from the trapped region. However, once the globally optimal area has been found, a small NS is favorable for local exploitation.
Ensemble learning has proven to be very efficient and effective for adjusting algorithmic control parameters and operators in an online manner [7]–[9]. In this letter, we propose to use an ensemble of different NSs in MOEA/D and to dynamically adjust their selection probabilities based on their previous performance. We compare the resultant algorithm, called ENS-MOEA/D, with the MOEA/D proposed in [3] on the CEC
2009 MOEA Contest benchmark problems [4]. Our results indicate that ensemble learning improves the performance of MOEA/D significantly.

Fig. 1. Performance of all the variants of MOEA/D with different fixed NSs and ENS-MOEA/D.
Fig. 2. Selection probabilities of different NSs in ENS-MOEA/D. (a) UF3. (b) UF9.

II. Review of MOEA/D

There are several variants of MOEA/D. In this letter, we use MOEA/D with dynamical resource allocation [3], which won the CEC 2009 multiobjective algorithm contest. To decompose (1), MOEA/D needs N evenly spread weight vectors λ1, ..., λN. Each λj = (λj1, ..., λjm)T satisfies ∑_{k=1}^{m} λjk = 1 and λjk ≥ 0 for all k. Let z∗ = (z∗1, ..., z∗m)T, where z∗i = min{fi(x) | x ∈ Ω}. Then the problem of approximating the PF of (1) can be decomposed into N scalar optimization subproblems, and the objective function of the jth minimization subproblem is

g^te(x | λj, z∗) = max_{1≤i≤m} λji |fi(x) − z∗i|.        (2)

Since z∗ is often unknown before the search, the algorithm substitutes the lowest fi-value found during the search for z∗i [2].

During the search, MOEA/D maintains:
1) a population of N points x1, ..., xN ∈ Ω, where xi is the current solution to the ith subproblem;
2) FV1, ..., FVN, where FVi is the F-value of xi, i.e., FVi = F(xi) for each i = 1, ..., N;
3) z = (z1, ..., zm)T, where zi is the best value found so far for objective fi.

For each weight vector, its NS-neighborhood is the set of the NS closest weight vectors to it. Correspondingly, each solution and each subproblem have their own NS-neighborhoods.
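This neighborhood construction can be sketched as follows. This is a minimal Python sketch; the function name ns_neighborhoods is ours, not from the letter, but the construction it implements is the one described above: neighborhoods are defined by distances among the weight vectors.

```python
import math

def ns_neighborhoods(weights, ns):
    """For each weight vector, return the indices of the ns closest
    weight vectors (Euclidean distance, including itself); these
    indices define the NS-neighborhood of the corresponding
    subproblem and of its current solution."""
    return [sorted(range(len(weights)),
                   key=lambda j: math.dist(w, weights[j]))[:ns]
            for w in weights]
```

For instance, with the three two-objective weight vectors (1, 0), (0.5, 0.5), and (0, 1) and NS = 2, the neighborhood of the first subproblem consists of itself and the middle vector.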
At each generation, a set of the current solutions is selected. For each selected solution xi, MOEA/D does the following.
1) Set the mating and update range P to be the NS-neighborhood of xi with a large probability, and the whole population otherwise.
2) Randomly select three current solutions from P.
3) Apply genetic operators to the selected solutions to generate a new solution y; repair y if necessary. Compute F(y).
4) Replace a small number of solutions in P by y if y is better than they are for their subproblems.

No solution is replaced in Step 4 if y is not better than any solution in P for its subproblem. When such a case happens, we say that the update fails; otherwise, it is successful.
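Step 4 above can be sketched as follows, assuming the Tchebycheff aggregation of (2). This is a minimal Python sketch: the identifiers (try_update, max_replace) are ours, and the actual replacement limit used in [3] may differ.

```python
def try_update(pool, solutions, objs, weights, z_star, y, y_obj, max_replace=2):
    """Replace at most max_replace solutions in the mating/update range
    whose subproblems the new solution y improves (smaller Tchebycheff
    value, as in (2)). Returns True if the update succeeded, i.e., at
    least one solution was replaced, and False if it failed."""
    def g_te(f, lam):
        # Tchebycheff aggregation g^te(x | lambda, z*) of (2).
        return max(l * abs(fi - zi) for l, fi, zi in zip(lam, f, z_star))

    replaced = 0
    for j in pool:
        if replaced >= max_replace:
            break
        if g_te(y_obj, weights[j]) < g_te(objs[j], weights[j]):
            solutions[j] = y
            objs[j] = list(y_obj)
            replaced += 1
    return replaced > 0
```

The returned flag is exactly the success/failure distinction used by the ensemble: an update that replaces no solution counts as a failure for the NS that produced y.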
III. Ensemble of NSs for MOEA/D

To evolve the solution of a subproblem, only the current solutions of its neighboring subproblems are exploited in MOEA/D. Usually, a larger NS makes the search more global, whereas a smaller NS encourages local search. Hence, by appropriately adjusting the NS for each subproblem, the performance of MOEA/D can be enhanced. However, for diverse problems, a trial-and-error approach to tuning NSs can be too demanding. Motivated by these observations, we employ an ensemble of NSs that are selected according to their historical performance in generating promising solutions.
In ENS-MOEA/D, K fixed NSs are used as a pool of candidates. During the evolution, an NS is chosen for each subproblem from the pool based on the candidates' previous performance in generating improved solutions. The fixed number of previous generations used to estimate the success probabilities is defined as the learning period (LP). At generation G > LP − 1, the probability of choosing the kth NS (k = 1, 2, ..., K) is updated by

pk,G = Rk,G / ∑_{k=1}^{K} Rk,G        (3)

where

Rk,G = ( ∑_{g=G−LP}^{G−1} FEs_successk,g ) / ( ∑_{g=G−LP}^{G−1} FEsk,g ) + ε,   k = 1, 2, ..., K; G > LP.        (4)
Rk,G represents the proportion of improved solutions generated with the kth NS within the previous LP generations. By improved solutions, we mean the solutions that successfully entered the next generation. FEsk,g is the number of solutions generated with the kth NS in generation g, and FEs_successk,g is the number of improved solutions among them; both are summed over the previous LP generations in (4). The small constant ε = 0.05 is used to avoid possible zero selection probabilities. To ensure that the probabilities of choosing the NSs always sum to 1, Rk,G is divided by ∑_{k=1}^{K} Rk,G in calculating pk,G.

Fig. 3. Convergence graphs in terms of the mean IGD obtained by the two methods. (a) F1. (b) F2. (c) F3. (d) F4. (e) F5. (f) F6. (g) F7. (h) F8. (i) F9. (j) F10.
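The probability update of (3) and (4) can be sketched as follows. This is a minimal Python sketch; successes and totals are assumed to hold, per NS, the LP-window sums of FEs_successk,g and FEsk,g, and the function name is ours.

```python
def update_probabilities(successes, totals, eps=0.05):
    """Eqs. (3)-(4): for each NS k, R_k is its success rate over the
    last LP generations plus eps (to avoid zero probabilities);
    p_k is R_k normalized so that the probabilities sum to 1."""
    R = [s / t + eps for s, t in zip(successes, totals)]
    total = sum(R)
    return [r / total for r in R]
```

Note that even an NS with no successes in the window keeps a small nonzero probability of ε after normalization, so it can still be tried in later generations.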
The procedure of ENS-MOEA/D is presented below.
1) Initialization.
Set the generation counter G = 0 and the probability pk,G = 1/K for each k. Generate a uniform spread of N weight vectors λ1, ..., λN. Generate an initial population x1, ..., xN ∈ Ω by uniformly randomly sampling from the search space. Initialize z = (z1, ..., zm)T by setting zi to the best value found so far for objective fi(x). Initialize the utility πi of each subproblem.
2) Optimization Loop.
If the stopping criterion is not satisfied, repeat:
  For k = 1 to N, based on the current probability pk,G, select one NS from the NS pool and then define the neighborhood of subproblem k.
  Apply MOEA/D [3] for one generation (MOEA/D is shown in Appendix 2); G = G + 1.
  Update FEsk,G and FEs_successk,G for all the NSs.
  If mod(G, LP) == 0
    Update the probability pk,G of each NS based on the current FEsk,G and FEs_successk,G as in (3) and (4).
    Reinitialize FEsk,g and FEs_successk,g to 0 for the next LP loop.
  Endif
Else, stop and output the final N solutions.
Endif
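The per-subproblem NS choice in the loop above can be sketched as roulette-wheel selection over the current probabilities pk,G. This is a minimal Python sketch; the rng parameter is ours, added only to make the draw reproducible.

```python
import random

def select_ns(ns_pool, probabilities, rng=random.random):
    """Pick one NS from the candidate pool by roulette-wheel selection
    according to the current selection probabilities p_k."""
    r = rng()
    cumulative = 0.0
    for ns, p in zip(ns_pool, probabilities):
        cumulative += p
        if r < cumulative:
            return ns
    return ns_pool[-1]  # guard against floating-point round-off
```

With the two-objective pool {30, 60, 90, 120} and equal probabilities, a draw of r = 0.6 falls in the third slice and selects NS = 90.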
IV. Experimental Results

ENS-MOEA/D is tested on the first ten CEC 2009 Contest unconstrained test instances [4]. The IGD performance measure is used, as in the CEC 2009 Contest. The IGD measures both the convergence and the diversity of the obtained solutions.
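The IGD measure can be sketched as follows. This is a minimal Python sketch of the usual definition (the mean distance from points of a sampled reference Pareto front to the nearest obtained point); the exact reference sets for the test instances are specified in [4].

```python
import math

def igd(reference_front, approximation):
    """Inverted generational distance: the mean distance from each
    point of the reference Pareto front to its nearest point in the
    obtained approximation. Lower is better; a low IGD requires both
    convergence to the PF and good spread along it."""
    return sum(min(math.dist(r, a) for a in approximation)
               for r in reference_front) / len(reference_front)
```

Because every reference point contributes its nearest-neighbor distance, an approximation that clusters in one region of the PF is penalized even if its points lie exactly on the front.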
The common parameter settings are the same as in MOEA/D [3]. N is set to 600 for two objectives and 1000 for three objectives; CR = 1.0 and F = 0.5 in the DE operators; the number of function evaluations (FEs) is 300 000. The four NSs for the two-objective problems are 30, 60, 90, and 120, where NS = 60 is the original parameter setting of the MOEA/D in [3]; the five NSs for the three-objective problems are 60, 80, 100, 120, and 140, where 100 is the original setting for NS in the MOEA/D [3]. Both MOEA/D and ENS-MOEA/D are implemented in MATLAB. The final N population members are used to compute the IGD values for both MOEA/D¹ and ENS-MOEA/D in this letter.

We conducted a parameter sensitivity investigation of LP for ENS-MOEA/D using four different values (10, 25, 50, and 75) on the ten benchmark instances. By observing the mean IGD values over 25 runs, we conclude that ENS-MOEA/D is not very sensitive to LP on most of the instances. In the experiments reported in this letter, LP = 50.
The mean IGD values over 25 runs of all the variants of MOEA/D with different fixed NSs and of ENS-MOEA/D are ranked in Fig. 1. In both bar charts in Fig. 1, the different fixed NS settings are shown on the horizontal axis, while the vertical axis represents the accumulated rank value of each variant. Our results show that ENS-MOEA/D yields superior performance (smallest accumulation of ranks) over all the implementations with only one fixed NS setting.

¹MOEA/D results presented at http://cswww.essex.ac.uk/staff/zhang/webofmoead.htm used 100 filtered points and 150 filtered points from the final population to compute IGD values for the two-objective and three-objective problems, respectively.
To investigate the self-adaptive selection of different NSs in ENS-MOEA/D, the average selection probability of each NS as the evolution progresses is plotted in Fig. 2. At the beginning, each NS is assigned the same selection probability. For UF3, NS = 30 is rarely used during the whole evolution, because NS = 30 cannot yield better results during the search. On the contrary, NS = 90 and NS = 120 are often assigned high probabilities, especially in the last search stage. For UF9, it is obvious that NS = 80 dominates the others when the FEs are around 200 k. However, in the last stage, NS = 80 receives a very small selection probability. Overall, no single NS dominates the others throughout the whole evolution, and the selection probabilities of the NSs vary considerably. From this observation, we conclude that an ensemble of different NSs with online self-adaptation is necessary for MOEA/D.
The comparison, in terms of the IGD value, between the original MOEA/D and the proposed ENS-MOEA/D is presented in Fig. 3 and Table I. The t-test at the 5% significance level has been conducted to compare the final IGD values obtained at 300 000 function evaluations by the two methods. h = 1 in Table I means that the difference is significant and h =
TABLE I
Means of the Final IGD Values Obtained by the Two Methods