Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University

Jan 19, 2016


Terry Cullom
Transcript
Page 1: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Regret Minimization and Job Scheduling

Yishay Mansour
Tel Aviv University

Page 2: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.


Page 3: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Decision Making under uncertainty

• Online algorithms
  – Stochastic models
  – Competitive analysis
  – Absolute performance criteria
• A different approach:
  – Define “reasonable” strategies
  – Compete with the best (in retrospect)
  – Relative performance criteria

Page 4: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Routing

• Model: Each day
  1. select a path from source to destination
  2. observe the latencies
  – Each day the values differ
• Strategies: All source-destination paths
• Loss: The average latency on the selected path
• Performance Goal: match the latency of the best single path

Page 5: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Financial Markets: options

• Model: stock or cash. Each day, set the portfolio, then observe the outcome.
• Strategies: invest either all in stock or all in cash
• Gain: based on daily changes in the stock
• Performance Goal: Implements an option!

[Figure: portfolio split between CASH and STOCK]

Page 6: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Machine learning – Expert Advice

• Model: each time
  1. observe expert predictions
  2. predict a label
• Strategies: experts (online learning algorithms)
• Loss: errors
• Performance Goal: match the error rate of the best expert
  – In retrospect

[Figure: experts 1–4 with predictions 1, 0, 1, 1]

Page 7: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Parameter Tuning

• Model: Multiple parameters.
• Strategies: settings of parameters
• Optimization: any
• Performance Goal: match the best setting of parameters

Page 8: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Parameter Tuning

• Development Cycle
  – develop product (software)
  – test performance
  – tune parameters
  – deliver “tuned” product
• Challenge: can we combine
  – testing
  – tuning
  – runtime

Page 9: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Regret Minimization: Model

• Actions A = {1, … , N}
• Time steps: t ∈ {1, … , T}
• At time step t:
  – Agent selects a distribution p_t(i) over A
  – Environment returns costs c_t(i) ∈ [0,1]
    • Adversarial setting
  – Online loss: l_t(on) = Σ_i c_t(i) p_t(i)
• Cumulative loss: L_T(on) = Σ_t l_t(on)

Page 10: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

External Regret

• Relative Performance measure:
  – compares to the best strategy in A
    • The basic class of strategies
• Online cumulative loss: L_T(on) = Σ_t l_t(on)
• Action i cumulative loss: L_T(i) = Σ_t c_t(i)
• Best action: L_T(best) = MIN_i {L_T(i)} = MIN_i {Σ_t c_t(i)}
• External Regret = L_T(on) – L_T(best)
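The definitions above can be checked numerically. The following is a minimal sketch, assuming the play is given as a list of distributions p[t][i] and the costs as a list c[t][i]; the function name `external_regret` is hypothetical and not from the slides.

```python
def external_regret(p, c):
    """External regret L_T(on) - L_T(best).

    p[t][i]: probability the agent puts on action i at step t.
    c[t][i]: cost of action i at step t, assumed to lie in [0, 1].
    """
    n_actions = len(c[0])
    # Online cumulative loss: L_T(on) = sum_t sum_i c_t(i) p_t(i)
    online = sum(
        sum(ci * pi for ci, pi in zip(ct, pt)) for ct, pt in zip(c, p)
    )
    # Cumulative loss of each fixed action: L_T(i) = sum_t c_t(i)
    per_action = [sum(ct[i] for ct in c) for i in range(n_actions)]
    return online - min(per_action)


# Uniform play against costs that always favour action 1 (index 1).
p = [[0.5, 0.5], [0.5, 0.5]]
c = [[1.0, 0.0], [1.0, 0.0]]
print(external_regret(p, c))
```

In this toy run the online loss is 1 and the best fixed action has loss 0, so the regret is 1.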

Page 11: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

External Regret Algorithm

• Goal: Minimize Regret
• Algorithm:
  – Track the regrets
  – Weights proportional to the regret
• Formally: At time t
  – Compute the regret to each action
    • Y_t(i) = L_t(on) − L_t(i), and r_t(i) = MAX{Y_t(i), 0}
    • p_{t+1}(i) = r_t(i) / Σ_i r_t(i)
  – If all r_t(i) = 0, select p_{t+1} arbitrarily.
• R_t = (r_t(1), …, r_t(N)) and ΔR_t = Y_t − Y_{t−1}
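The update rule above can be sketched in a few lines. This is an illustrative implementation only: the function name `regret_matching` is hypothetical, and the uniform distribution is used as the "arbitrary" choice whenever all regrets are zero, which is an assumption of this sketch.

```python
def regret_matching(costs):
    """Play weights proportional to the positive regrets r_t(i).

    costs[t][i] is c_t(i) in [0, 1]. Returns the list of distributions
    played, the online loss L_T(on), and the per-action losses L_T(i).
    """
    n = len(costs[0])
    uniform = [1.0 / n] * n          # assumed tie-break when all regrets are 0
    p = uniform
    online_loss = 0.0                # L_t(on)
    action_loss = [0.0] * n          # L_t(i)
    played = []
    for ct in costs:
        played.append(p)
        online_loss += sum(ci * pi for ci, pi in zip(ct, p))
        action_loss = [a + ci for a, ci in zip(action_loss, ct)]
        # r_t(i) = max{L_t(on) - L_t(i), 0}
        r = [max(online_loss - a, 0.0) for a in action_loss]
        total = sum(r)
        p = [ri / total for ri in r] if total > 0 else uniform
    return played, online_loss, action_loss


played, online, per_action = regret_matching([[1.0, 0.0]] * 100)
print(online - min(per_action))
```

On this constant cost sequence the algorithm pays 0.5 in the first step and then moves all its weight to the cheaper action.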

Page 12: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

External Regret Algorithm: Analysis

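R_t = (r_t(1), …, r_t(N)) and ΔR_t = Y_t − Y_{t−1}

• LEMMA: ΔR_t ∙ R_{t−1} = 0
  – ΔR_t(i) = l_t(on) − c_t(i), so
    ΔR_t ∙ R_{t−1} = Σ_i (l_t(on) − c_t(i)) r_{t−1}(i) = Σ_i l_t(on) r_{t−1}(i) − Σ_i c_t(i) r_{t−1}(i)
  – Since p_t(i) = r_{t−1}(i) / Σ_j r_{t−1}(j):
    Σ_i l_t(on) r_{t−1}(i) = [Σ_i c_t(i) p_t(i)] Σ_i r_{t−1}(i) = Σ_i c_t(i) r_{t−1}(i)
• LEMMA: (max_i r_T(i))^2 ≤ Σ_{t=1}^{T} |ΔR_t|^2 ≤ T N

[Figure: the vectors R_{t−1} and R_t]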

Page 13: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

External regret: Bounds

• Average regret goes to zero
  – No regret
  – Hannan [1957]
• Explicit bounds
  – Littlestone & Warmuth ‘94
  – CFHHSW ‘97
  – External regret = O(log N + √(T log N))

Page 14: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.


Page 15: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Dominated Actions

• Model: action y dominates x if y is always better than x
• Goal: Not to play dominated actions
• Goal (unknown model): The fraction of times dominated actions are played is vanishing

Cost of action x:  .3  .8  .9  .3  .6
Cost of action y:  .2  .4  .7  .1  .3

Page 16: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Internal/Swap Regret

• Internal Regret
  – Regret(x,y) = ∑_{t: a(t)=x} (c_t(x) − c_t(y))
  – Internal Regret = max_{x,y} Regret(x,y)
• Swap Regret
  – Swap Regret = ∑_x max_y Regret(x,y)
• Swap regret ≥ External Regret
  – ∑_x max_y Regret(x,y) ≥ max_y ∑_x Regret(x,y)
• Mixed actions
  – Regret(x,y) = ∑_t (c_t(x) − c_t(y)) p_t(x)
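The three notions can be compared on a small example. The sketch below uses the mixed-action definition Regret(x,y) = ∑_t (c_t(x) − c_t(y)) p_t(x); all function names are hypothetical. It also uses the identity ∑_x Regret(x,y) = L_T(on) − L_T(y), which is why the maximum over y of the column sums is the external regret.

```python
def pairwise_regret(p, c):
    """reg[x][y] = sum_t (c_t(x) - c_t(y)) * p_t(x)."""
    n = len(c[0])
    reg = [[0.0] * n for _ in range(n)]
    for pt, ct in zip(p, c):
        for x in range(n):
            for y in range(n):
                reg[x][y] += (ct[x] - ct[y]) * pt[x]
    return reg


def internal_regret(reg):
    # max over pairs (x, y)
    return max(max(row) for row in reg)


def swap_regret(reg):
    # sum over x of the best replacement y for x
    return sum(max(row) for row in reg)


def external_regret(reg):
    # max over a single y of the summed regret of replacing every x by y
    n = len(reg)
    return max(sum(reg[x][y] for x in range(n)) for y in range(n))


# The agent always picks the action that turns out to cost 1.
p = [[1.0, 0.0], [0.0, 1.0]]
c = [[1.0, 0.0], [0.0, 1.0]]
reg = pairwise_regret(p, c)
print(internal_regret(reg), swap_regret(reg), external_regret(reg))
```

Here each fixed action also has loss 1, so the external regret is 1, while swapping each action for the other would have saved 2 in total.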

Page 17: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Dominated Actions and Regret

• Assume action y dominates action x
  – For any t: c_t(x) > c_t(y) + δ
• Assume we used action x for n times
  – Regret(x,y) > δ n
• If SwapRegret < R then
  – Fraction of time the dominated action is used
  – At most R/δ

Page 18: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Calibration

• Model: each step predict a probability and observe outcome
• Goal: prediction calibrated with outcome
  – During time steps where the prediction is p the average outcome is (approx) p

[Figure: predicting the probability of rain. Predictions: .3, .5, .3, .3, .5, each followed by an observed outcome. Calibration: prediction .3 has average outcome 1/3; prediction .5 has average outcome 1/2]

Page 19: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Calibration to Regret

• Reduction to Swap/Internal regret:
  – Discrete Probabilities
    • Say: {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}
  – Loss of action x at time t: (x – c_t)^2
  – y*(x) = argmax_y Regret(x,y)
    • y*(x) = avg(c_t | x)
  – Consider R(x, y*(x))
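The claim y*(x) = avg(c_t | x) can be illustrated as follows. This is a sketch under stated assumptions: the outcome sequence is made up for the example (chosen so the averages are 1/3 and 1/2), the function names are hypothetical, and y is allowed to be any real number rather than a point of the discrete grid.

```python
from collections import defaultdict


def best_replacement(predictions, outcomes):
    """For each predicted value x, return avg(c_t | prediction = x)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, c in zip(predictions, outcomes):
        sums[x] += c
        counts[x] += 1
    return {x: sums[x] / counts[x] for x in sums}


def regret_to(predictions, outcomes, x, y):
    """Regret(x, y): squared loss saved by predicting y whenever x was predicted."""
    return sum(
        (x - c) ** 2 - (y - c) ** 2
        for xp, c in zip(predictions, outcomes)
        if xp == x
    )


predictions = [0.3, 0.5, 0.3, 0.3, 0.5]
outcomes = [1, 1, 0, 0, 0]            # hypothetical outcomes
best = best_replacement(predictions, outcomes)
print(best)
print(regret_to(predictions, outcomes, 0.3, best[0.3]))
```

The regret towards the conditional average is small but positive for the prediction .3, and it is never smaller for any other replacement value, since the mean minimizes squared loss.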

Page 20: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Internal regret

• No internal regret
  – [Foster & Vohra], [Hart & Mas-Colell]
    • Based on the approachability theorem [Blackwell ’56]
  – Explicit Bounds
  – [Cesa-Bianchi & Lugosi ’03]
    • Internal regret = O(log N + √(T log N))
  – [Blum & Mansour]
    • Swap regret = O(log N + √T N)

Page 21: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Regret: External vs Internal

• External regret
  – You should have bought S&P 500
  – Match boy_i to girl_i
• Internal regret
  – Each time you bought IBM you should have bought SUN
  – Stable matching
• Limitations:
  – No state
  – Additive over time

Page 22: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.


[Even-Dar, Mansour, Nadav, 2009]

Page 23: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Routing Games

• Atomic
  – Finite number of players
  – Player i transfers flow from s_i to t_i
• Splittable flows
• Cost_i = Σ_{p ∈ (s_i, t_i)} Latency(p) * flow_i(p)

[Figure: network with sources s1, s2 and targets t1, t2; player 1 splits its flow f1 into f1,L and f1,R, player 2 splits its flow f2 into f2,T and f2,B. Latency on edge e = L_e(f1,L + f2,T)]

Page 24: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Cournot Oligopoly [Cournot 1838]

• Firms select production level
• Market price depends on the TOTAL supply
• Firms maximize their profit = revenue - cost
• Best response dynamics converges for 2 players [Cournot 1838]
  – Two-player oligopoly is a super-modular game [Milgrom, Roberts 1990]
• Diverges for n ≥ 5 [Theocharis 1960]

[Figure: two firms produce quantities X and Y at costs Cost1(X) and Cost2(Y); the market price P is plotted against the overall quantity]

Page 25: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Resource Allocation Games

• Advertisers set budgets: $5M, $10M, $17M, $25M
• Each advertiser wins a proportional market share
  – The $25M advertiser's allocated rate = 25 / (5+10+17+25)
• Utility:
  – Concave utility from allocated rate
  – Quasi-linear with money
  – U = f(25 / (5+10+17+25)) − $25M
• The best response dynamics generally diverges for linear resource allocation games

Page 26: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Properties of Selfish Routing, Cournot Oligopoly and Resource Allocation Games

1. Closed convex strategy set
2. A (weighted) social welfare is concave: there exist λ_1, …, λ_n > 0 such that λ_1 u_1(x) + λ_2 u_2(x) + … + λ_n u_n(x) is concave
3. The utility of a player is convex in the vector of actions of other players

Games with these properties: Socially Concave Games

Page 27: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

The relation between socially concave games and concave games

Concave Games [Rosen 65]
• The utility of a player is strictly concave in her own strategy
• A sufficient condition for equilibrium uniqueness

[Figure: Venn diagram of game classes: Normal Form Games (with mixed strategies), Zero Sum Games, Socially Concave Games, Concave Games, and games with a Unique Nash Equilibrium; atomic splittable routing, Resource Allocation, and Cournot are marked as examples]

Page 28: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

The average action and average utility converge to NE

If each player uses a procedure without regret in socially concave games then their joint play converges to Nash equilibrium:

Theorem 1: The average action profile converges to NE

Theorem 2: The average daily payoff of each player converges to her payoff in NE

[Figure: the actions of Player 1, Player 2, …, Player n on Day 1, Day 2, Day 3, …, Day T; the average of days 1…T is an approximate Nash equilibrium, with an approximation parameter that depends on T]

Page 29: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Convergence of the “average action” and “average payoff” are two different things!

• Here the average action converges to (½,½) for every player
• But the average cost is 2, while the average cost in NE is 1

[Figure: the s–t network, showing the routes chosen on even days and on odd days]

Page 30: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

The Action Profile Itself Need Not Converge

[Figure: the s–t network, showing the routes chosen on even days and on odd days]

Page 31: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Correlated Equilibrium

• CE: A joint distribution Q
• Each time t, a joint action is drawn from Q
  – Each player's action is a best response (BR)
• Theorem [HM,FV]: Multiple players playing low internal (swap) regret converge to CE

Action x:  .3  .8  .9  .3  .6
Action y:  .2  .4  .7  .1  .3

Page 32: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

[Even-Dar, Kleinberg, Mannor, Mansour, 2009]

Page 33: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Job Scheduling: Motivating Example

[Figure: a Load Balancer between users and servers]

GOAL: Minimize load on servers

Page 34: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Online Algorithms

• Job scheduling
  – N unrelated machines
    • machine = action
  – each time step:
    • a job arrives
      – has different loads on different machines
    • algorithm schedules the job on some machine
      – Given its loads
  – Goal:
    • minimize the loads
      – makespan or L2
• Regret minimization
  – N actions
    • machines
  – each time step
    • First, algorithm selects an action (machine)
    • Then, observes the losses
      – Job loads
  – Goal:
    • minimize the sum of losses

Page 35: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Modeling Differences: Information

• Information model:
  – what does the algorithm know when it selects an action/machine
• Known cost:
  – First observe costs then select action
  – job scheduling
• Unknown cost:
  – First select action then observe costs
  – Regret Minimization

Page 36: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Modeling Differences: Performance

• Theoretical Performance measure:
  – comparison class
    • job scheduling: best (offline) assignment
    • regret minimization: best static algorithm
  – Guarantees:
    • job scheduling: multiplicative
    • regret minimization: additive and vanishing.
• Objective function:
  – job scheduling: global (makespan)
  – regret minimization: additive.

Page 37: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Formal Model

• N actions
• Each time step t algorithm ON
  – selects a (fractional) action: p_t(i)
  – observes losses c_t(i) in [0,1]
• Average losses of ON
  – for action i at time T: ON_T(i) = (1/T) Σ_{t<T} p_t(i) c_t(i)
• Global cost function:
  – C_∞(ON_T(1), … , ON_T(N)) = max_i ON_T(i)
  – C_d(ON_T(1), … , ON_T(N)) = [Σ_i (ON_T(i))^d]^{1/d}
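To make the two global cost functions concrete, here is a small sketch. The function names are hypothetical, and the average is taken over all T observed steps, which is an assumption about how the index range in the definition of ON_T(i) is meant.

```python
def average_loads(p, c):
    """ON_T(i): average of p_t(i) * c_t(i) over the T observed steps."""
    T = len(c)
    n = len(c[0])
    return [sum(p[t][i] * c[t][i] for t in range(T)) / T for i in range(n)]


def makespan(loads):
    """C_inf: the largest average load."""
    return max(loads)


def ld_norm(loads, d):
    """C_d: the L_d norm of the load vector."""
    return sum(x ** d for x in loads) ** (1.0 / d)


p = [[0.5, 0.5], [0.5, 0.5]]
c = [[1.0, 0.0], [0.0, 1.0]]
loads = average_loads(p, c)
print(loads, makespan(loads), ld_norm(loads, 2))
```

Unlike the additive loss of standard regret minimization, both costs are functions of the whole load vector.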

Page 38: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Formal Model

• Static Optimum:
  – Consider any fixed distribution α
    • Every time play α
  – Static optimum α* minimizes cost C
• Formally:
  – Let α ◊ L = (α(1)L(1), … , α(N)L(N))
    • Hadamard (or Schur) product.
  – best fixed α*(L) = arg min_α C(α ◊ L)
    • where L_T(i) = (1/T) Σ_t c_t(i)
  – static optimality C*(L) = C(α*(L) ◊ L)

Page 39: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Example

• Two machines, makespan:

observed loads (L1, L2) | α*(L)                    | final loads (L1, L2)           | makespan
(L1, L2)                | (L2/(L1+L2), L1/(L1+L2)) | (L1·L2/(L1+L2), L1·L2/(L1+L2)) | L1·L2/(L1+L2)
(4, 2)                  | (1/3, 2/3)               | (4/3, 4/3)                     | 4/3
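The two-machine formula can be reproduced in code. The sketch below generalizes it to N machines by balancing α(i)·L(i) across machines, i.e. α*(i) proportional to 1/L(i); that generalization, the requirement that every L(i) > 0, and the function name are assumptions of this example rather than statements from the slides.

```python
def static_opt_makespan(L):
    """Return (alpha*, C*(L)) for the makespan cost.

    Assumes every observed load L[i] is strictly positive. The optimum
    equalizes alpha(i) * L(i), so alpha(i) is proportional to 1 / L(i).
    """
    inverse = [1.0 / x for x in L]
    total = sum(inverse)
    alpha = [v / total for v in inverse]
    return alpha, 1.0 / total


alpha, cost = static_opt_makespan([4.0, 2.0])
print(alpha, cost)
```

For the observed loads (4, 2) this gives the distribution (1/3, 2/3) and makespan 4/3, matching the last row of the table.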

Page 40: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Our Results: Adversarial General

• General Feasibility Result:
  – Assume C convex and C* concave
    • includes makespan and L_d norm for d > 1.
  – There exists an online algorithm ON, which for any loss sequence L:
    C(ON) < C*(L) + o(1)
  – Rate of convergence about √(N/T)

Page 41: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Our Results: Adversarial Makespan

• Makespan Algorithm
  – There exists an algorithm ON
  – for any loss sequence L: C(ON) < C*(L) + O(log2 N / √T)
• Benefits:
  – very simple and intuitive
  – improved regret bound
• Two actions: p_t(1) = 1/2 + Δ/√T, where Δ is the difference between the online loads of the two machines

Page 42: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Our Results: Adversarial Lower Bound

• We show that for many non-convex C there is a non-vanishing regret
  – includes L_d norm for d < 1
• Non-vanishing regret: ratio > 1
  – There is a sequence of losses L such that C(ON) > (1+γ) C*(L), where γ > 0

Page 43: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Preliminary: Local vs. Global

[Figure: the time axis partitioned into blocks B1, B2, …, Bk]

Low regret in each block implies overall low regret

Page 44: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Preliminary: Local vs. Global

• LEMMA:
  – Assume C convex and C* concave,
  – Assume a partition of time into blocks B_i
  – In each time block B_i the regret is at most R_i
  Then: C(ON) − C*(L) ≤ Σ_i R_i

Page 45: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Preliminary: Local vs. Global

Proof:
  C(ON) ≤ Σ C(ON(B_i))                        (C is convex)
  Σ C*(L(B_i)) ≤ C*(L)                        (C* is concave)
  C(ON(B_i)) – C*(L(B_i)) ≤ R_i               (low regret in each B_i)
  Σ [C(ON(B_i)) – C*(L(B_i))] ≤ Σ R_i
  C(ON) – C*(L) ≤ Σ R_i
QED

• Enough to bound the regret on subsets.

Page 46: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

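Example

[Figure: losses arriving on machines M1 and M2 at t=1 and t=2, with the resulting loads under each benchmark]

• static opt: α* = (1/2, 1/2), cost = 3/2
• local opt: α* = (1/3, 2/3) at t=1 and (2/3, 1/3) at t=2, cost = 4/3
• global offline opt: (0,1) at t=1 and (1,0) at t=2, cost = 1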
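The three costs in this example can be recomputed once the arriving losses are fixed. The slide's loss values did not survive in the transcript, so the sketch below assumes losses (2, 1) at t=1 and (1, 2) at t=2, a hypothetical choice that reproduces the stated costs 3/2, 4/3 and 1; loads are summed rather than averaged, and the function name is also an assumption.

```python
def final_loads(alphas, losses):
    """Total load per machine when alphas[t] is played against losses[t]."""
    n = len(losses[0])
    return [
        sum(a[i] * l[i] for a, l in zip(alphas, losses)) for i in range(n)
    ]


losses = [(2.0, 1.0), (1.0, 2.0)]     # hypothetical arrival losses

static = final_loads([(0.5, 0.5), (0.5, 0.5)], losses)
local = final_loads([(1 / 3, 2 / 3), (2 / 3, 1 / 3)], losses)
offline = final_loads([(0.0, 1.0), (1.0, 0.0)], losses)

print(max(static), max(local), max(offline))
```

The ordering static ≥ local ≥ offline illustrates why the static optimum is a weaker benchmark than the offline assignment used in competitive analysis.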

Page 47: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Stochastic case:

• Each time t the costs are drawn from a joint distribution,
  – i.i.d. over time steps, not between actions

INTUITION: Assume two actions (machines)
• Load Distribution:
  – With probability ½: (1,0)
  – With probability ½: (0,1)
• Which policy minimizes makespan regret?!
• Regret components:
  – MAX(L(1),L(2)) = sum/2 + |Δ|/2
  – sum = L(1)+L(2) and Δ = L(1)−L(2)

Page 48: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Stochastic case: Static OPT

• Natural choice (model based)
  – Always select action (½, ½)
• Observations:
  – Assume T/2+Δ times (1,0) and T/2−Δ times (0,1)
  – Loads (T/4+Δ/2, T/4−Δ/2)
  – Makespan = T/4+Δ/2 > T/4
  – Static OPT: T/4 − Δ^2/T < T/4
    • W.h.p. OPT is T/4 − O(1)
  – sum = T/2 and E[|Δ|] = O(√T)
  – Regret = O(√T)
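The two quantities compared above follow from short arithmetic, sketched here for concrete numbers. The function names are hypothetical; the static optimum uses the two-machine makespan formula L1·L2/(L1+L2) with L1 = T/2+Δ and L2 = T/2−Δ.

```python
def half_half_makespan(T, delta):
    """Makespan of always playing (1/2, 1/2), for delta >= 0."""
    l1 = T / 2 + delta       # number of (1,0) arrivals
    l2 = T / 2 - delta       # number of (0,1) arrivals
    return max(l1, l2) / 2   # equals T/4 + delta/2


def static_opt(T, delta):
    """Best fixed distribution in hindsight: L1*L2/(L1+L2) = T/4 - delta^2/T."""
    l1 = T / 2 + delta
    l2 = T / 2 - delta
    return l1 * l2 / (l1 + l2)


T, delta = 100, 5
print(half_half_makespan(T, delta), static_opt(T, delta))
```

With T = 100 and Δ = 5 the gap is 2.5 + 0.25, dominated by the Δ/2 term, which is of order √T for a typical random sequence.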

Page 49: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Can we do better ?!

Page 50: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Stochastic case: Least Loaded

• Least loaded machine:
  – Select the machine with the lower current load
• Observation:
  – Machines have the same load (diff ≤ 1): |Δ| ≤ 1
  – Sum of loads: E[sum] = T/2
  – Expected makespan = T/4
• Regret
  – Least Loaded Makespan LLM = T/4 ± √T
  – Regret = MAX{LLM − T/4, 0} = O(√T)
    • Regret considers only the “bad” regret
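A short simulation illustrates the |Δ| ≤ 1 observation. This is a sketch under assumptions: the machine is chosen before the cost is revealed (the unknown-cost model), ties go to the first machine, and the function name and seed are arbitrary choices for the example.

```python
import random


def least_loaded(T, seed=0):
    """Simulate T steps of least loaded on costs (1,0) or (0,1), each w.p. 1/2."""
    rng = random.Random(seed)
    load = [0, 0]
    for _ in range(T):
        # Pick the machine with the lower current load (ties go to machine 0).
        i = 0 if load[0] <= load[1] else 1
        cost = (1, 0) if rng.random() < 0.5 else (0, 1)
        load[i] += cost[i]
        # The loads never drift apart by more than one unit.
        assert abs(load[0] - load[1]) <= 1
    return load


load = least_loaded(1000)
print(load, max(load))
```

The balance invariant holds on every run; what fluctuates is the sum of the loads, which is where the ±√T in the makespan comes from.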

Page 51: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Can we do better ?!

Page 52: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Stochastic case: Optimized Finish

• Algorithm:
  – Select action (½, ½)
    • For T − 4√T steps
  – Play least loaded afterwards.
• Claim: Regret = O(T^{1/4})
  – Until T − 4√T steps (w.h.p.) Δ < 2√T
  – There exists a time t in [T − 4√T, T] with:
    • Δ = 0 and sum = T/2 + O(T^{1/4})
    • From 1 to t: regret = O(T^{1/4})
    • From t to T: regret = O(√(T−t)) = O(T^{1/4})

Page 53: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Can we do better ?!

Page 54: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Stochastic case: Any time

• An algorithm which has low regret for any t
  – Does not plan for the final horizon T
• Variant of least loaded:
  – Least loaded weight: ½ + T^{−1/4}
• Claim: Regret = O(T^{1/4})
  – Idea:
    • Regret = max{(L1 + L2)/2 – T/4, 0} + Δ
    • Every O(T^{1/2}) steps Δ = 0
    • Very near (½, ½)

Page 55: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Can we do better ?!

Page 56: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Stochastic case: Logarithmic Regret

• Algorithm:
  – Use phases
  – Length of phases shrinks exponentially
    • T_k = T_{k−1}/2 and T_1 = T/2
    • Log T phases
  – Every phase cancels deviations of the previous phase
    • Deviation from the expectation
• Works for any probabilities and actions!
  – Assuming the probabilities are known.
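The phase schedule can be written out explicitly. The sketch below only computes the phase lengths T_1 = T/2, T_k = T_{k−1}/2; rounding down with integer division and stopping at length 1 are assumptions, as is the function name. How each phase cancels the previous phase's deviation is not shown.

```python
def phase_lengths(T):
    """Phase lengths T_1 = T/2, T_k = T_{k-1}/2, using integer division."""
    lengths = []
    length = T // 2
    while length >= 1:
        lengths.append(length)
        length //= 2
    return lengths


print(phase_lengths(16))
```

For T = 16 there are four phases, which matches the "Log T phases" count.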

Page 57: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Can we do better ?!

Page 58: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Stochastic case

• Assume that each action’s cost is drawn from a joint distribution,
  – i.i.d. over time steps
• Theorem (makespan/L_d)
  – Known distribution
    • Regret = O(log T / T)
  – Unknown distributions
    • Regret = O(log2 T / T)

Page 59: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Summary

• Regret Minimization
  – External
  – Internal
  – Dynamics
• Job Scheduling and Regret Minimization
  – Different global function
  – Open problems:
    • Exact characterization
    • Lower bounds

Page 60: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.


Page 61: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan Algorithm

• Outline:
  – Simple algorithm for two machines
    • Regret O(1/√T)
    • simple and almost memory-less
  – Recursive construction:
    • Given three algorithms, two for k/2 actions and one for 2 actions, build an algorithm for k actions
    • Main issue: what kind of feedback to “propagate”.
    • Regret O(log2 N / √T)
      – better than the general result.

Page 62: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan: Two Machines

• Intuition:
  – Keep online’s loads balanced
• Failed attempts:
  – use standard regret minimization
    • In case of an unbalanced input sequence L, the algorithm will put most of the load on a single machine
  – use optimum to drive the probabilities
• Our approach:
  – Use the online loads
    • not opt or static cumulative loads

Page 63: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

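Makespan Algorithm: Two actions

• At time t maintain probabilities p_{t,1} and p_{t,2} = 1 − p_{t,1}
• Initially p_{1,1} = p_{1,2} = ½
• At time t:
  p_{t+1,1} = p_{t,1} + (p_{t,2} l_{t,2} − p_{t,1} l_{t,1}) / √T = 1/2 + (ON_{t,2} − ON_{t,1}) / √T
  – here ON_{t,i} is the cumulative online load of machine i through time t
• Equivalently: p_{t,1} = 1/2 + Δ/√T, with Δ the difference between the two online loads
• Remarks:
  – uses online loads
  – Almost memory-less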
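The two-action update can be sketched as follows, assuming a step size of 1/√T and cumulative (unnormalized) online loads. The function name is hypothetical, and the final clipping to [0, 1] is only a guard against floating-point drift added for this example: with losses in [0, 1] and T ≥ 1 the update itself keeps p_{t,1} inside [0, 1].

```python
import math


def two_machine_makespan(losses):
    """Run the two-action update on losses[t] = (l_t1, l_t2), each in [0, 1].

    Returns the cumulative online loads and the final probability p_1.
    """
    T = len(losses)
    step = 1.0 / math.sqrt(T)
    p1 = 0.5
    load = [0.0, 0.0]
    for l1, l2 in losses:
        p2 = 1.0 - p1
        load[0] += p1 * l1
        load[1] += p2 * l2
        # Shift probability towards the machine whose online load grew less.
        p1 = p1 + step * (p2 * l2 - p1 * l1)
        p1 = min(1.0, max(0.0, p1))   # floating-point guard (assumption)
    return load, p1


load, p1 = two_machine_makespan([(1.0, 0.0)] * 100)
print(load, p1)
```

When only the first machine ever incurs loss, the probability on it decays geometrically and its total load stays below √T, while the static optimum (always use the second machine) has cost 0.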

Page 64: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan Algorithm: Analysis

• View the change in probabilities as a walk on the line.

[Figure: the line from 0 to 1 with ½ marked]

Page 65: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan Algorithm: Analysis

• Consider a small interval of length ε
• Total change in loads:
  – identical on both
    • started and ended with the same Δ
• Consider only losses in the interval
  – local analysis
• Local opt is also in the interval
• Online used “similar” probability
  – loss of at most ε per step

Page 66: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan Algorithm: Analysis

• Simplifying assumptions:
  – The walk is “balanced” in every interval
    • add “virtual” losses to return to initial state
    • only O(√T) additional losses
      – relates the learning rate to the regret
  – Losses “cross” interval’s boundary line
    • needs more sophisticated “bookkeeping”.
      – make sure an update affects at most two adjacent intervals.
  – Regret accounting
    • loss in an interval
    • additional “virtual” losses.

Page 67: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan: Recursive algorithm

• Recursive algorithm:

[Figure: algorithm A3 on top of algorithms A1 and A2]

Page 68: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan: Recursive

• The algorithms:
  – Algorithms A1 and A2:
    • Each has “half” of the actions
      – gets the actual losses and “balances” them
    • each works in isolation.
      – simulating and not considering actual loads.
  – Algorithm A3
    • gets the average load in A1 and A2
      – balances the “average” loads.

[Figure: A1 and A2 receive the losses l_{t,1}, …, l_{t,k/2}, … of their halves of the actions; A3 receives AVG(l_{t,i} q_{t,i}) from A1 and AVG(l_{t,i} q’_{t,i}) from A2]

Page 69: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

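Makespan: Recursive

• The combined output:

[Figure: A1 outputs q_1, … and A2 outputs q_{k/2}, … for the losses l_1, … and l_{k/2}, …; A3 receives AVG(l_{t,i} q_{t,i}) from each of them and outputs p_1 and p_2; the outputs are multiplied, giving the combined distribution q_1 p_1, … , q_{k/2} p_2, …]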
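The way the three outputs are multiplied together can be sketched as follows. The function name and the example numbers are hypothetical; only the combination rule (A1's distribution scaled by p_1, A2's distribution scaled by p_2) is taken from the slide, and the feedback passed up to A3 is not modelled here.

```python
def combine(p, q1, q2):
    """Combine A3's split p = (p1, p2) with the distributions of A1 and A2.

    q1 and q2 are distributions over the two halves of the actions; the
    result is a distribution over all k actions.
    """
    p1, p2 = p
    return [p1 * q for q in q1] + [p2 * q for q in q2]


combined = combine((0.25, 0.75), [0.5, 0.5], [0.2, 0.8])
print(combined, sum(combined))
```

Because p and each q sum to one, the combined vector is again a probability distribution over the k actions.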

Page 70: Regret Minimization and Job Scheduling Yishay Mansour Tel Aviv University.

Makespan: Recursive

• Analysis (intuition):
  – Assume perfect ZERO regret
    • just for intuition …
  – The output of A1 and A2 is completely balanced
    • The average equals the individual loads
      – maximum = average = minimum
  – The output of A3 is balanced
    • the contribution of A1 machines is equal to that of A2
• Real Analysis:
  – need to bound the amplification in the regret.

• Real Analysis:– need to bound the amplification in the regret.