Extreme Value Analysis of Simulated Annealing, Simulated Quantum Annealing and a Mean-Field Model
Damian Steiger
Ising spin glasses
$J_{ij} \in \{\pm 1\}$, $h_i = 0$
[Figure: spin-glass graph with vertices labelled 0 to 127]
$$H_{\text{Ising}} = -\sum_{i<j} J_{ij}\, \sigma^z_i \sigma^z_j - \sum_i h_i\, \sigma^z_i, \qquad \sigma^z_i \in \{\pm 1\}$$
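As an illustration of the Hamiltonian above, the energy of a spin configuration can be evaluated as in the following sketch. The function names and the 4-spin ring are hypothetical and chosen only for the example; the graph used in the talk is not reproduced here.

```python
import random


def ising_energy(spins, couplings, fields=None):
    """Return H = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i for spins s_i in {-1, +1}.

    couplings maps an edge (i, j) with i < j to its coupling J_ij.
    fields is an optional list of local fields h_i (treated as 0 if omitted).
    """
    energy = 0.0
    for (i, j), coupling in couplings.items():
        energy -= coupling * spins[i] * spins[j]
    if fields is not None:
        for i, field in enumerate(fields):
            energy -= field * spins[i]
    return energy


def random_pm1_couplings(edges, rng):
    """Draw J_ij uniformly from {-1, +1} for every edge."""
    return {edge: rng.choice((-1, 1)) for edge in edges}


# Hypothetical toy instance: a ring of 4 spins with h_i = 0.
rng = random.Random(0)
ring_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
ring_couplings = random_pm1_couplings(ring_edges, rng)
ring_energy = ising_energy([1, -1, 1, -1], ring_couplings)
```

With $h_i = 0$ the energy is unchanged when all spins are flipped, so every instance has at least two degenerate ground states.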
NP-hard
NP-hard problems
▪ Travelling salesman
▪ Knapsack
▪ Bin packing
▪ and many others…
Heuristic Solvers inspired by Physics
Image credit ANFF NSW node, University of New South
Simulated Annealing, Simulated Quantum Annealing, Mean-Field Model
Simulated Annealing
▪ Single spin updates
▪ Linear schedule
Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by Simulated Annealing. Science 220, 671–680. (1983).
Image credit ANFF NSW node, University of New South
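The two ingredients named on this slide, single spin updates and a linear schedule, can be sketched as below. This is a minimal illustration, not the implementation used for the results: it assumes Metropolis acceptance and a schedule that is linear in the inverse temperature β, and all names and parameter values are hypothetical.

```python
import math
import random


def simulated_annealing(n_spins, couplings, n_sweeps, beta_start, beta_end, rng):
    """One annealing run for H = -sum_{i<j} J_ij s_i s_j; returns the final spins."""
    # Adjacency list so that each single-spin update only touches its neighbours.
    neighbours = [[] for _ in range(n_spins)]
    for (i, j), coupling in couplings.items():
        neighbours[i].append((j, coupling))
        neighbours[j].append((i, coupling))

    spins = [rng.choice((-1, 1)) for _ in range(n_spins)]
    for sweep in range(n_sweeps):
        # Assumption: the schedule is linear in the inverse temperature beta.
        fraction = sweep / max(n_sweeps - 1, 1)
        beta = beta_start + (beta_end - beta_start) * fraction
        for i in range(n_spins):  # one sweep = one update attempt per spin
            local_field = sum(coupling * spins[j] for j, coupling in neighbours[i])
            delta_e = 2.0 * spins[i] * local_field  # energy change if spin i flips
            # Metropolis rule: always accept downhill moves, uphill with exp(-beta * dE).
            if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
                spins[i] = -spins[i]
    return spins


# Toy usage: two ferromagnetically coupled spins end up aligned at low temperature.
final_spins = simulated_annealing(2, {(0, 1): 1}, n_sweeps=5,
                                  beta_start=50.0, beta_end=50.0,
                                  rng=random.Random(1))
```

The number of sweeps plays the role of the annealing time: more sweeps mean a slower schedule between the same start and end temperatures.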
Simulated Quantum Annealing
$$H(t) = -A(t) \sum_i \sigma^x_i - B(t) \sum_{i<j} J_{ij}\, \sigma^z_i \sigma^z_j$$
[Figure: annealing schedules A(t) and B(t); axes: steps and energy (a.u.), ticks from 0.0 to 1.0]
Mean-Field Model
Shin et al. How "Quantum" is the D-Wave Machine? arXiv:1401.7087 (2014).
$$H(t) = -A(t) \sum_i \sigma^x_i - B(t) \sum_{i<j} J_{ij}\, \sigma^z_i \sigma^z_j$$
$$H(t) = -A(t) \sum_i \sin\theta_i - B(t) \sum_{i<j} J_{ij} \cos\theta_i \cos\theta_j$$
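The classical mean-field energy replaces each spin operator by an angle $\theta_i$. A small sketch of evaluating it follows; the function name and the toy values of A and B are hypothetical, and the schedules A(t), B(t) are not modelled here.

```python
import math


def mean_field_energy(thetas, couplings, a, b):
    """Return H = -A sum_i sin(theta_i) - B sum_{i<j} J_ij cos(theta_i) cos(theta_j).

    thetas is a list of angles, couplings maps an edge (i, j) to J_ij,
    and a, b are the schedule values A(t), B(t) at one instant.
    """
    transverse = sum(math.sin(theta) for theta in thetas)
    interaction = sum(coupling * math.cos(thetas[i]) * math.cos(thetas[j])
                      for (i, j), coupling in couplings.items())
    return -a * transverse - b * interaction


# theta_i = pi/2 minimizes the A term; theta_i in {0, pi} gives Ising spins cos(theta_i) = +1 or -1.
energy_transverse = mean_field_energy([math.pi / 2, math.pi / 2], {(0, 1): 1}, a=1.0, b=0.0)
energy_aligned = mean_field_energy([0.0, 0.0], {(0, 1): 1}, a=0.0, b=1.0)
```

At angles 0 or π the second term reduces to the Ising energy with spins $\cos\theta_i = \pm 1$.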
Distribution of time to solution for these algorithms?
Test with Random Problem Instances
1. Create 20'000 random instances for each system size
2. Optimize the annealing time for each algorithm
3. Find the single-run success probability s for each instance
4. Calculate the mean number of repetitions τ needed to find the ground state for each instance
5. Make a histogram
[Figure: histogram of instances (10^0 to 10^4) vs. mean number of repetitions τ (10^0 to 10^9), log-log axes; SA, 288 spins]
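Steps 3 to 5 above can be sketched as follows. The sketch assumes independent repetitions, so that the number of repetitions until the first success is geometrically distributed with mean τ = 1/s; the one-decade bins mirror the logarithmic axis of the plot. Function names and the example counts are hypothetical.

```python
import math


def mean_repetitions(n_successes, n_runs):
    """Estimate tau = 1/s from the observed single-run success frequency s."""
    if n_successes == 0:
        return math.inf  # ground state never found: tau cannot be estimated
    return n_runs / n_successes


def decade_histogram(taus, n_decades):
    """Count values of tau in bins [10^k, 10^(k+1)) for k = 0 .. n_decades - 1."""
    counts = [0] * n_decades
    for tau in taus:
        if math.isinf(tau) or tau < 1.0:
            continue
        decade = int(math.floor(math.log10(tau)))
        if decade < n_decades:
            counts[decade] += 1
    return counts


# Hypothetical success counts out of 1000 runs for four instances.
example_taus = [mean_repetitions(k, 1000) for k in (500, 20, 2, 0)]
example_counts = decade_histogram(example_taus, n_decades=5)
```

Instances with very small s dominate the right-hand tail of the histogram, which is the part the extreme value analysis is concerned with.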
Scaling
[Figure: histograms of instances vs. mean number of repetitions τ (log-log axes) for SA with 32 spins and with 512 spins; a question mark indicates the unknown scaling between them]
Predicting the scaling would be easier if a parametric model of the distribution function were known.
Extreme Value Theory provides a parametric model for the hard instances.
Central Limit Theorem
[Figure: a sequence $X_1, X_2, \dots, X_n$ arranged in blocks of size n, showing the sum $S_n = X_1 + \dots + X_n$, the block maximum, and the excesses over a threshold u. Central Limit Theorem: for n large enough, $S_n$ follows approx. a normal df. Limit law for block maxima: for n large enough, the block maximum follows approx. a GEV df. Limit law for excesses: for a threshold u high enough, the excesses follow approx. a GP df.]
Let $X_1, X_2, \dots, X_n$ be iid random variables with distribution F.
For n large enough: $S_n$ follows approximately a normal distribution (if F satisfies some assumptions).
Limit Law for Block Maxima
[Figure: a sequence $X_1, X_2, \dots, X_n$ arranged in blocks of size n, with the block maximum of each block highlighted]
Let $X_1, X_2, \dots, X_n$ be iid random variables with distribution F.
For n large enough: the block maximum follows approximately a Generalized Extreme Value (GEV) distribution (if F satisfies some assumptions).
Limit Law for Excesses
[Figure: a sequence $X_1, X_2, \dots, X_n$ with a threshold u; each exceedance of u and its excess over u are highlighted]
Let $X_1, X_2, \dots, X_n$ be iid random variables with distribution F.
For u large enough: the excesses over u follow approximately a Generalized Pareto (GP) distribution (if F satisfies some assumptions).
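The block maxima and the excesses that these two limit laws refer to can be extracted from a sample as in the sketch below. The helper names and the toy sample are hypothetical; choosing the block size n or the threshold u in practice is a separate question that the sketch does not address.

```python
def block_maxima(samples, block_size):
    """Split samples into consecutive full blocks and return each block's maximum."""
    last_start = len(samples) - block_size
    return [max(samples[start:start + block_size])
            for start in range(0, last_start + 1, block_size)]


def excesses_over_threshold(samples, threshold):
    """Return x - u for every exceedance x > u."""
    return [x - threshold for x in samples if x > threshold]


# Toy sample: two full blocks of size 3 (the trailing value is dropped),
# and three exceedances of the threshold u = 4.
toy_sample = [1, 5, 2, 8, 3, 9, 4]
toy_maxima = block_maxima(toy_sample, block_size=3)
toy_excesses = excesses_over_threshold(toy_sample, threshold=4)
```

The excess approach uses every observation above u, whereas the block approach keeps only one value per block.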
Scaling
[Figure: histograms of instances vs. mean number of repetitions τ (log-log axes) for SA with 32 spins and with 512 spins; a question mark indicates the unknown scaling between them]
Extreme Value Theory
[Figure: probability density vs. log(x). Above the threshold u, a GP probability density is fitted to the observed excesses $y_i$; the fitted density spans the range of observations $x_i$ and continues into the range of extrapolation.]
Balkema - de Haan - Pickands Theorem
Pickands III, J. Statistical inference using extreme order statistics. Ann. Stat. 3, 119–131. (1975).
Balkema, A. A. & de Haan, L. Residual life time at great age. Ann. Probab. 2, 792–804. (1974).
Reiss, R.-D. & Thomas, M. Statistical Analysis of Extreme Values: with Applications to Insurance, Finance, Hydrology and Other Fields. 3rd ed. (Birkhäuser, 2007).
Let $X_1, X_2, \dots$ be iid random variables with distribution F. If F satisfies some assumptions and the threshold u is high enough:
$$\Pr\{X - u \le y \mid X > u\} \approx H_\xi(y) := 1 - \left(1 + \frac{\xi y}{\tilde\sigma}\right)^{-1/\xi} \quad \text{(generalized Pareto)}$$
For $\xi > 0$: $E\left(Y^k\right) = \infty$ for $k \ge 1/\xi$.
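To make the generalized Pareto form concrete, the sketch below implements its distribution function, draws synthetic excesses by inverting it, and recovers the shape parameter with the Pickands estimator from the 1975 reference above. This is an illustration only: the fitting procedure used for the results is not stated on the slides, the Pickands estimator is a stand-in, and the synthetic sample (ξ = 0.5, scale 1) is made up for the example.

```python
import math
import random


def gp_cdf(y, xi, sigma):
    """Generalized Pareto df H_xi(y) = 1 - (1 + xi * y / sigma)^(-1/xi) for y >= 0."""
    if y <= 0.0:
        return 0.0
    if xi == 0.0:
        return 1.0 - math.exp(-y / sigma)  # limit xi -> 0: exponential distribution
    arg = 1.0 + xi * y / sigma
    if arg <= 0.0:
        return 1.0  # xi < 0 and y beyond the end point -sigma / xi
    return 1.0 - arg ** (-1.0 / xi)


def gp_sample(xi, sigma, rng):
    """Draw one GP variate by inverting the df at a uniform random number."""
    p = rng.random()  # uniform in [0, 1)
    if xi == 0.0:
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


def pickands_shape(excesses, k):
    """Pickands estimate of xi from the k-th, 2k-th and 4k-th largest values."""
    if 4 * k > len(excesses):
        raise ValueError("need at least 4 * k observations")
    ordered = sorted(excesses, reverse=True)
    top, middle, low = ordered[k - 1], ordered[2 * k - 1], ordered[4 * k - 1]
    return math.log((top - middle) / (middle - low)) / math.log(2.0)


# Synthetic excesses with known shape, used only to exercise the estimator.
rng = random.Random(1)
synthetic_excesses = [gp_sample(0.5, 1.0, rng) for _ in range(20000)]
xi_estimate = pickands_shape(synthetic_excesses, k=5000)
```

A shape parameter ξ ≥ 1 means that even the mean of the excesses diverges (k = 1 in the moment condition), which is why ξ is a natural measure of how heavy the tail of hard instances is.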
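Generalized Pareto Distribution
[Figure: log(probability density) vs. x for the three cases of the shape parameter ξ, with scale parameter σ > 0. ξ > 0: power-law tail $\propto x^{-1-1/\xi}$; ξ = 0: exponential tail $\exp(-x/\sigma)$; ξ < 0: finite end point at $-\sigma/\xi$.]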
Optimizing Algorithms
Optimize algorithms for random Ising spin glasses
[Figure: number of spin flips (10^6 to 10^8) vs. sweeps (10^1 to 10^4), log-log axes, for SA with 200 spins; one curve per quantile: 0.999, 0.99, 0.95, 0.9, 0.75, 0.5, 0.25, 0.1, 0.01]
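One way to read such a plot is to compute, for each number of sweeps, a chosen quantile over instances of the total effort, and then pick the number of sweeps that minimizes it. The sketch below assumes the effort is sweeps × spins × mean repetitions 1/s, which is an assumption about how the plotted quantity is defined; the success probabilities are toy values invented for the example, not measured data.

```python
import math


def spin_flips_to_solution(success_probability, n_sweeps, n_spins):
    """Assumed cost model: (mean repetitions 1/s) x sweeps per run x spins per sweep."""
    if success_probability == 0.0:
        return math.inf
    return n_sweeps * n_spins / success_probability


def quantile(values, q):
    """Simple empirical quantile: the element at position floor(q * n) of the sorted values."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def best_sweeps(success_by_sweeps, n_spins, q):
    """Return (optimal sweeps, cost per sweeps) for the q-quantile over instances."""
    costs = {
        n_sweeps: quantile(
            [spin_flips_to_solution(s, n_sweeps, n_spins) for s in probabilities], q)
        for n_sweeps, probabilities in success_by_sweeps.items()
    }
    return min(costs, key=costs.get), costs


# Toy data: per-instance success probabilities at three annealing times.
toy_success = {10: [0.01, 0.02], 100: [0.5, 0.4], 1000: [1.0, 0.9]}
optimal_sweeps, quantile_costs = best_sweeps(toy_success, n_spins=2, q=0.5)
```

In this toy example the intermediate annealing time is optimal: very short runs rarely succeed, and very long runs spend more sweeps than the gain in success probability repays.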
Bimodality changes
[Figure: histograms of instances vs. success probability (0.0 to 1.0) for SQA with 200 spins; two panels: 10000 sweeps and 150 sweeps]
Results
Distribution of SQA
[Figure: histograms of instances (10^0 to 10^4) vs. mean number of repetitions τ (10^0 to 10^9), log-log axes, for SQA with 32, 72, 128, 200 and 288 spins]
Distribution of SA
[Figure: histograms of instances (10^0 to 10^4) vs. mean number of repetitions τ (10^0 to 10^9), log-log axes, for SA with 32, 72, 128, 200, 288, 392 and 512 spins]
[Figure: shape parameter ξ (-0.5 to 2.0) vs. number of spins (32, 72, 128, 200, 288, 392, 512)]
[Figure: histograms of instances vs. mean number of repetitions τ (log-log axes) for SA with 288 spins and SQA with 288 spins]
[Figure: log(probability density) vs. x for the generalized Pareto distribution with scale parameter σ > 0. ξ > 0: power-law tail $\propto x^{-1-1/\xi}$; ξ = 0: exponential tail $\exp(-x/\sigma)$; ξ < 0: finite end point at $-\sigma/\xi$.]
Tail Comparison
[Figure: shape parameter ξ (-0.5 to 2.0) vs. number of spins (32, 72, 128, 200, 288, 392, 512); series: SQA (Low Temp, Slow), SA (Fast)]
Optimising helps
[Figure: shape parameter ξ (-0.5 to 2.0) vs. number of spins (32, 72, 128, 200, 288, 392, 512), built up over three panels; series: SQA (Low Temp, Slow), SQA (Low Temp, Fast), SQA (High Temp, Fast)]
Comparing the algorithms
[Figure: shape parameter ξ (-0.5 to 2.0) vs. number of spins (32, 72, 128, 200, 288, 392, 512), built up over four panels; series: SQA (Low Temp, Fast), SA (Fast), MF (High Temp, Fast), SQA (High Temp, Fast)]
Summary
▪ SQA and MF have worse tail behaviour than SA
▪ SQA has better tail behaviour if it is run faster, and better still if it is run faster at a higher temperature
[Figure: shape parameter ξ (-0.5 to 2.0) vs. number of spins (32, 72, 128, 200, 288, 392, 512), two panels]
Collaborators
Matthias Troyer (ETH Zürich)
Troels Rønnow (ETH Zürich)
Ilia Zintchenko (ETH Zürich)