Meta online learning: experiments on a unit commitment problem (ESANN2014)


Jialin Liu, Olivier [email protected],[email protected]

Black-box Noisy Optimization
Objective function: $\mathrm{fitness} : \mathbb{R}^d \to \mathbb{R}$
Optimum: $\theta^* = \operatorname{argmin}_{\theta \in \mathbb{R}^d} \mathrm{fitness}(\theta)$

Some NOAs
• RSAES: Self-Adaptive Evolution Strategy with resampling;
• Fabian's algorithm: a first-order method using gradients estimated by finite differences [?, ?];
• Noisy Newton's algorithm: a second-order method using a Hessian matrix also approximated by finite differences [?].

Compare Solvers Early
• $k_n \le n$: lag
• Why this lag?
  (i) comparing current recommendations → comparing good points → very close fitness → very expensive
  (ii) algorithms' ranking is usually stable → let us save time by comparing "older" recommendations

Solvers and Notations
µ: parent population size in ES
λ: population size in ES
d: search space dimension
n: generation index
$\sigma_n$: step size at generation n
$r_n$: resampling number at generation n

For all NOPA:
$k_n = \lceil n^{0.1} \rceil$, $r_n = n^3$, $s_n = 15n$
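The schedules used by every NOPA can be written as small helper functions. This is a minimal sketch that assumes the flattened formulas read as k_n = ⌈n^0.1⌉, r_n = n³ and s_n = 15n; the function names are hypothetical and not taken from the authors' code.

```python
import math


def lag_index(n: int) -> int:
    """k_n = ceil(n^0.1): index of the older recommendation that gets compared."""
    return math.ceil(n ** 0.1)


def comparison_time(m: int) -> int:
    """r_m = m^3: iteration at which the m-th comparison takes place."""
    return m ** 3


def comparison_samples(m: int) -> int:
    """s_m = 15 m: evaluations spent on each solver in the m-th comparison (assumed reading)."""
    return 15 * m


def is_comparison_iteration(n: int) -> bool:
    """True when n = r_m for some integer m >= 1, i.e. n is a perfect cube."""
    if n < 1:
        return False
    m = round(n ** (1.0 / 3.0))
    return m ** 3 == n


# The first comparisons happen at iterations 1, 8, 27 and 64.
print([comparison_time(m) for m in range(1, 5)])
```

Because r_m grows cubically while s_m grows linearly, comparisons become rare as n grows, so most of the budget stays with the solvers themselves.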

Table 1: Solvers in experiments
Notation        Algorithm and parametrization
RSAES           $\lambda = 10d$, $\mu = 5d$, $r_n = 10n^2$
Fabian1         $\sigma_n = 10/n^{0.49}$, $a = 100$
Fabian2         $\sigma_n = 10/n^{0.05}$, $a = 100$
Newton1         $\sigma_n = 10/n$, $r_n = n^2$
Newton2         $\sigma_n = 100/n^4$, $r_n = n^2$
P.12345         NOPA of the 5 solvers above
P.12345 + S.    P.12345 with information sharing
P.22            NOPA of 2 (identical) Fabian1
P.22 + S.       P.22 with information sharing
P.222           NOPA of 3 (identical) Fabian1
P.222 + S.      P.222 with information sharing


Abstract
• Online learning = real-time machine learning "on the fly".
• Meta online learning = combining several online learning algorithms from a given set (termed portfolio) of algorithms ≃ combining Noisy Optimization Algorithms (NOPA = noisy optimization portfolio algorithm).
• Goals: (i) mitigating the effect of a bad choice of online learning algorithm; (ii) parallelization; (iii) combining the strengths of different algorithms.
• This paper:
  - Portfolios are classical in combinatorial optimization; we test portfolios for noisy optimization.
  - Recently, a methodology termed lag has been proposed for NOPA. We test the lag methodology experimentally on various problems.

Noisy Optimization Portfolio Algorithm (NOPA)
Iteration n of the portfolio $\{S_1, \dots, S_M\}$ containing M NOAs:
• Initialization module: if n = 0, initialize all $S_i$, $i \in \{1, \dots, M\}$.
• For $i \in \{1, \dots, M\}$:
  – Update module: apply iterations of solver $S_i$ until it has received at least n data samples.
  – Let $\theta_{i,n}$ be the current recommendation of solver $S_i$.
• Comparison module: if $n = r_m$ for some m, then
  – For $i \in \{1, \dots, M\}$, perform $s_m$ evaluations of the (stochastic) reward $R(\theta_{i,k_n})$ and define $y_i$ as the average reward.
  – Define $i^* \leftarrow \operatorname{argmin}_{i \in \{1, \dots, M\}} y_i$.
• Recommendation module: $\tilde{\theta} = \theta_{i^*,n}$
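The four modules can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: `ToySolver` is a hypothetical stand-in for a real NOA, `noisy_loss` is an assumed toy objective, and the schedules are assumed to be r_m = m³, s_m = 15m and k_n = ⌈n^0.1⌉. Since the comparison module takes an argmin, the evaluated quantity is treated as a loss to minimize. Evaluations spent in comparisons are not charged to any budget here, which is a simplification.

```python
import math
import random


def noisy_loss(theta, rng):
    """Assumed toy objective: sphere function plus unit Gaussian noise (to be minimized)."""
    return sum(x * x for x in theta) + rng.gauss(0.0, 1.0)


class ToySolver:
    """Hypothetical stand-in for a NOA: random search with 5-fold resampling."""

    def __init__(self, dim, step_size, rng):
        self.rng = rng
        self.step_size = step_size
        self.theta = [1.0] * dim
        self.evals = 0  # number of noisy evaluations consumed so far

    def _mean_loss(self, theta, k):
        self.evals += k
        return sum(noisy_loss(theta, self.rng) for _ in range(k)) / k

    def iterate(self):
        cand = [x + self.step_size * self.rng.gauss(0.0, 1.0) for x in self.theta]
        if self._mean_loss(cand, 5) < self._mean_loss(self.theta, 5):
            self.theta = cand

    def recommend(self):
        return list(self.theta)


def run_nopa(solvers, n_iterations, rng):
    history = [[] for _ in solvers]  # history[i][n - 1] stores theta_{i,n}
    best, m = 0, 1                   # current i*, index of the next comparison
    recommendation = None
    for n in range(1, n_iterations + 1):
        # Update module: run each solver until it has used at least n evaluations.
        for i, solver in enumerate(solvers):
            while solver.evals < n:
                solver.iterate()
            history[i].append(solver.recommend())
        # Comparison module: at n = r_m, compare the lagged recommendations theta_{i,k_n}.
        if n == m ** 3:
            k = math.ceil(n ** 0.1)
            s_m = 15 * m
            y = [sum(noisy_loss(history[i][k - 1], rng) for _ in range(s_m)) / s_m
                 for i in range(len(solvers))]
            best = min(range(len(solvers)), key=y.__getitem__)
            m += 1
        # Recommendation module: current point of the selected solver.
        recommendation = history[best][-1]
    return best, recommendation


rng = random.Random(0)
portfolio = [ToySolver(2, 0.3, rng), ToySolver(2, 0.05, rng)]
best, theta = run_nopa(portfolio, 30, rng)
print(best, theta)
```

Note that the comparison uses the old points `history[i][k - 1]`, while the recommendation returns the selected solver's latest point; this is the lag idea in code form.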

Experiments
Table 2: Artificial problem $R(\theta) = -\|\theta - \theta^*\|^2 + \|\theta - \theta^*\|^z \times \text{Gaussian}$. n: evaluation number. z: rate at which the variance decreases around the optimum. Entries compare $\log(R(\tilde{\theta}_n))/\log(n)$.

z    d = 2                                    d = 5
0    Newton1 < RSAES ≃ P.12345* < ...         Newton1 < RSAES ≃ P.12345* < ...
1    P.12345 < Fabian1 ≃ P.22* < ...          Fabian1 < P.22* < P.222* < ...
2    P.12345* ≪ Fabian1 ≃ P.22* < ...         Fabian1 < P.12345 < P.22* < ...

Discussion: NOPAs are usually not far from the best of their NOAs. In small dimension, with noise variance decreasing quickly to 0 around the optimum (z = 2), the NOPA outperforms all its NOAs.
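The artificial problem of Table 2 and the log-log rate reported there can be sketched as follows. This is a minimal sketch: the formula is implemented as written in the caption, the Gaussian is assumed to be standard normal, and `empirical_rate` is a hypothetical helper applied to a positive regret value such as ||θ − θ*||² (the sign convention of the poster's criterion is not spelled out).

```python
import math
import random


def noisy_reward(theta, theta_star, z, rng):
    """R(theta) = -||theta - theta*||^2 + ||theta - theta*||^z * Gaussian.

    z = 0 gives constant noise variance; z = 2 makes the noise vanish
    quickly near the optimum.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(theta, theta_star)))
    return -dist ** 2 + dist ** z * rng.gauss(0.0, 1.0)


def empirical_rate(regret, n):
    """log(regret)/log(n) for a positive regret after n evaluations; more negative is faster."""
    return math.log(regret) / math.log(n)


rng = random.Random(0)
theta_star = [0.0, 0.0]
for z in (0, 1, 2):
    print(z, noisy_reward([0.1, 0.1], theta_star, z, rng))
```

With z = 2 the noise term scales like the squared distance, so comparisons between points near the optimum stay informative; with z = 0 the noise dominates there.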

Table 3: Stochastic Unit Commitment problems, conformant planning. St: number of stocks.

Problem size    Considered NOA or NOPA
St, T, d        P.22           P.22 + S.      P.222          P.222 + S.     Best NOA       Worst NOA
3, 21, 63       0.61 ± 0.07    0.63 ± 0.03    0.63 ± 0.05    0.63 ± 0.07    0.49 ± 0.08    0.81 ± 0.05
4, 21, 84       0.75 ± 0.02    0.75 ± 0.03    0.79 ± 0.05    0.76 ± 0.03    0.69 ± 0.06    1.27 ± 0.06
5, 21, 105      0.53 ± 0.04    0.58 ± 0.08    0.58 ± 0.03    0.52 ± 0.05    0.58 ± 0.04    1.44 ± 0.16
6, 15, 90       0.40 ± 0.05    0.39 ± 0.06    0.37 ± 0.06    0.39 ± 0.06    0.38 ± 0.06    0.96 ± 0.13
6, 21, 126      0.53 ± 0.08    0.54 ± 0.08    0.55 ± 0.07    0.54 ± 0.07    0.54 ± 0.07    1.78 ± 0.37
8, 15, 120      0.53 ± 0.03    0.50 ± 0.05    0.53 ± 0.02    0.51 ± 0.05    0.51 ± 0.04    1.70 ± 0.10
8, 21, 168      0.69 ± 0.04    0.77 ± 0.09    0.73 ± 0.06    0.71 ± 0.04    0.71 ± 0.06    2.68 ± 0.02
7, 21, 147      0.70 ± 0.07    0.70 ± 0.05    0.70 ± 0.07    0.70 ± 0.07    0.69 ± 0.06    2.28 ± 0.08

Discussion: Given the same budget, a NOPA of identical solvers can outperform its NOAs. RSAES is usually the best NOA for small dimensions, and variants of Fabian for large dimensions.

Table 4: Approximate convergence rates $\log(-R(\tilde{\theta}_n))/\log(n)$ for Cart-Pole, a multimodal problem, using NN. n: evaluation number.

Solver              2 neurons, d = 9         4 neurons, d = 17        8 neurons, d = 33
1 (RSAES)           -0.458033±0.045014       -0.421535±0.045643       -0.351726±0.051705
2 (Fabian1)         0.002226±5.29923e-05     0.002089±1.57766e-04     0.00221±8.14518e-05
3 (Fabian2)         0.002318±9.80792e-05     0.002238±1.14289e-04     0.00236±1.51244e-04
4 (Newton1)         0.002229±6.08973e-05     -0.030731±0.111294       0.002247±1.19829e-04
5 (Newton2)         0.00227±5.2989e-05       0.002217±7.80888e-05     0.002307±9.96404e-05
6 (P.12345)         -0.408705±0.068428       -0.3917±0.071791         -0.320399±0.050338
7 (P.12345 + S.)    -0.42743±0.05709         -0.403707±0.056173       -0.354043±0.069576

Discussion: Fabian and Newton can't solve this multimodal problem ⇒ one solver is much better than the others ⇒ easy for NOPA.

Conclusion
• Main conclusion:
  Usual: Portfolio of Algorithms for Combinatorial Optimization;
  New: Portfolio of Algorithms for Noisy Optimization.
• "Sharing" not that good.
• NOPA is sometimes better than its NOAs even if all NOAs are identical!
• We show mathematically [?] and empirically a log(M) shift when using M solvers, when working on the log-log scale (the usual scale in noisy optimization).
• Portfolio = approximately as efficient as the best solver, except when one iteration of one algorithm monopolizes most of the budget, as RSAES does in the unit commitment problem.
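The log(M) shift can be made plausible with a back-of-the-envelope calculation. This is a heuristic sketch under stated assumptions, not the proof cited above: the best solver alone is assumed to reach a simple regret $\mathrm{SR}(n) \approx C\,n^{-\alpha}$ after n evaluations, the portfolio is assumed to split its budget evenly among the M solvers, and the cost of comparisons is neglected. The symbols SR, C and α are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{SR}_{\text{best}}(n) &\approx C\, n^{-\alpha}
  && \text{(best solver alone, budget } n\text{)} \\
\mathrm{SR}_{\text{portfolio}}(n) &\approx C \left(\frac{n}{M}\right)^{-\alpha}
  && \text{(each solver receives } n/M \text{ evaluations)} \\
\log \mathrm{SR}_{\text{portfolio}}(n) &\approx \log C - \alpha \bigl(\log n - \log M\bigr)
\end{align*}
```

On the log-log scale the portfolio's curve is then the best solver's curve shifted by log(M) along the log(n) axis: the slope −α is unchanged and only the offset differs.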
