
Hawk: Hybrid Datacenter Scheduling

Pamela Delgado, Florin Dinu,

Anne-Marie Kermarrec, Willy Zwaenepoel

USENIX ATC 2015

Introduction: datacenter scheduling

[Diagram: jobs 1..N, each made up of tasks, are submitted to a scheduler that places the tasks on the cluster]

Introduction: centralized scheduling

[Diagram: a single centralized scheduler places the tasks of jobs 1..N on the cluster]

Good: placement
Not so good: scheduling latency

Introduction: distributed scheduling

[Diagram: distributed schedulers 1..N, one per job, each place their job's tasks on the cluster]

Good: scheduling latency
Not so good: placement

Outline


1) Introduction

2) HAWK hybrid scheduling

• Rationale

• Design

3) Evaluation

• Simulation

• Real cluster implementation

4) Conclusion

Hybrid scheduling

[Diagram: a centralized scheduler and distributed schedulers 1..N share the same cluster; jobs 1..M are split between them]

Hawk: hybrid scheduling

Long jobs: centralized
Short jobs: distributed

[Diagram: long jobs 1..M are submitted to the centralized scheduler; short jobs 1..N are submitted to distributed schedulers 1..N]

Long/short: estimated execution time vs. a cut-off
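The long/short split above can be expressed as a simple dispatch rule. A minimal sketch in Python, assuming each job carries an estimated task runtime and the cut-off is a configurable constant (all names and values here are hypothetical, not Hawk's actual API):

    import random
    from dataclasses import dataclass

    CUTOFF_SECONDS = 100.0  # hypothetical long/short cut-off

    @dataclass
    class Job:
        job_id: str
        estimated_task_runtime: float  # seconds, e.g. estimated from history

    def dispatch(job, centralized_scheduler, distributed_schedulers):
        """Route long jobs to the centralized scheduler and short jobs to a
        randomly chosen distributed scheduler (the hybrid split)."""
        if job.estimated_task_runtime >= CUTOFF_SECONDS:
            centralized_scheduler.submit(job)   # long job: careful placement
        else:
            random.choice(distributed_schedulers).submit(job)  # short job: low latency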

Rationale for Hawk

[Diagram: typical production workloads have many short jobs that use little of the resources and few long jobs that use most of the resources]

Rationale for Hawk (continued)

Source: Design Insights for MapReduce from Diverse Production Workloads, Chen et al., 2012

[Chart: percentage of long jobs vs. percentage of task-seconds consumed by long jobs, across production workloads (both axes 0-100%)]

Long jobs: a minority of the jobs, but they take up most of the resources.
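Here "task-seconds" is the total task runtime a job consumes (the sum of its tasks' durations). A quick sketch, with made-up numbers, of how the two percentages behind this observation are computed:

    # Hypothetical workload: (is_long, task_seconds) per job -- not data from the paper.
    jobs = [(False, 30), (False, 45), (False, 20), (True, 5000), (False, 60)]

    long_task_seconds = [ts for is_long, ts in jobs if is_long]
    pct_long_jobs = 100.0 * len(long_task_seconds) / len(jobs)
    pct_long_task_seconds = 100.0 * sum(long_task_seconds) / sum(ts for _, ts in jobs)

    print(pct_long_jobs)          # 20.0 -> long jobs are a minority of jobs
    print(pct_long_task_seconds)  # ~97  -> but they account for most task-seconds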

Hawk: hybrid scheduling

[Diagram: long jobs go to the centralized scheduler; short jobs go to distributed schedulers 1..N]

Long jobs: few jobs, so the centralized scheduler keeps scheduling latency reasonable; they use the bulk of the resources, so they get good placement.
Short jobs: latency sensitive, so they get fast distributed scheduling; they use few resources, so they can trade away not-so-good placement.

BEST OF BOTH WORLDS
Good: scheduling latency for short jobs
Good: placement for long jobs

Hawk: distributed scheduling

• Sparrow
• Work-stealing

Sparrow

[Diagram: a distributed scheduler sends task reservations to randomly chosen nodes (power of two choices)]
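A minimal sketch of the power-of-two-choices idea behind Sparrow's probing: sample a couple of nodes at random and enqueue the task reservation at the least-loaded one. This illustrates the general technique only; Sparrow's actual implementation adds late binding, which is omitted here, and the Node class is a hypothetical stand-in.

    import random

    class Node:
        def __init__(self, name):
            self.name = name
            self.queue = []  # queued task reservations

    def place_reservation(task, nodes, probe_count=2):
        """Probe a few random nodes and enqueue the reservation at the one
        with the shortest queue (power of two choices)."""
        probed = random.sample(nodes, probe_count)
        target = min(probed, key=lambda n: len(n.queue))
        target.queue.append(task)
        return target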


Sparrow and high load

[Diagram: under high load, random placement has a low likelihood of finding a free node for a task]

High load + job heterogeneity: head-of-line blocking
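A rough back-of-the-envelope for why random probing struggles here: if a fraction rho of the nodes is busy and each task probes d nodes at random, then all d probes hit busy nodes with probability about rho to the power d (assuming independent probes, which is only an approximation):

    def prob_no_free_node(rho, probes=2):
        """Probability that every probe lands on a busy node, assuming each
        probe independently hits a busy node with probability rho."""
        return rho ** probes

    print(prob_no_free_node(0.5))  # 0.25 at 50% utilization
    print(prob_no_free_node(0.9))  # 0.81 at 90% utilization: short tasks usually end up queued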

Hawk work-stealing

[Diagram: a node that becomes free steals queued short-task reservations from another node]

1. Free node: contact a random node and ask for probes (reservations).
2. Random node: send back the short-task reservations waiting in its queue.

Under high load there is a high probability of contacting a node with a backlog.
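A minimal sketch of the stealing step, assuming each node exposes its queue of reservations and short-task reservations can be told apart. The Reservation and Node types are hypothetical stand-ins; in Hawk the exchange happens between node daemons over the network.

    import random
    from dataclasses import dataclass, field

    @dataclass
    class Reservation:
        task_id: str
        is_short: bool

    @dataclass
    class Node:
        name: str
        queue: list = field(default_factory=list)

    def steal_short_reservations(free_node, all_nodes):
        """A node that has just become free contacts one randomly chosen node
        and takes over the short-task reservations waiting in its queue."""
        victim = random.choice([n for n in all_nodes if n is not free_node])
        stolen = [r for r in victim.queue if r.is_short]
        victim.queue = [r for r in victim.queue if not r.is_short]
        free_node.queue.extend(stolen)  # the idle node will service these next
        return stolen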

Hawk cluster partitioning

[Diagram: a small partition of the cluster is reserved; the centralized and distributed schedulers do not coordinate]

No coordination, so the challenge is that there may be no free nodes left for mice (short jobs)!

Short jobs can be scheduled anywhere.
Long jobs go only to the non-reserved nodes.
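A minimal sketch of the placement constraint, with the size of the reserved partition as a hypothetical parameter:

    def eligible_nodes(job_is_short, nodes, reserved_fraction=0.1):
        """Short jobs may be placed anywhere; long jobs only on the
        non-reserved nodes, so the reserved partition always keeps some room
        for short tasks."""
        n_reserved = int(len(nodes) * reserved_fraction)
        return list(nodes) if job_is_short else list(nodes[n_reserved:])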

Hawk design summary

• Hybrid scheduler: long jobs centralized, short jobs distributed
• Work-stealing
• Cluster partitioning

Evaluation: 1. Simulation

• Sparrow simulator

• Google trace

• Vary number of nodes to vary cluster utilization

• Measure: Job running time

• Report 50th and 90th percentiles for short and long jobs

• Normalized to Sparrow
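Concretely, each reported point is a percentile of Hawk's job running times divided by the same percentile under Sparrow. A small sketch of that normalization, using placeholder running times rather than data from the paper:

    import numpy as np

    def normalized_percentile(hawk_runtimes, sparrow_runtimes, pct):
        """Ratio of Hawk's to Sparrow's job running time at a given percentile;
        values below 1.0 mean Hawk is faster at that percentile."""
        return np.percentile(hawk_runtimes, pct) / np.percentile(sparrow_runtimes, pct)

    hawk = [12, 15, 14, 40, 90]      # placeholder running times (seconds)
    sparrow = [20, 25, 22, 60, 95]
    print(normalized_percentile(hawk, sparrow, 50))
    print(normalized_percentile(hawk, sparrow, 90))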

Simulated results: short jobs

[Chart: Hawk/Sparrow job running time, 50th and 90th percentiles, vs. number of nodes in the cluster (10,000 to 50,000); lower is better]

Better across the board.

Simulated results: long jobs

[Chart: Hawk/Sparrow job running time, 50th and 90th percentiles, vs. number of nodes in the cluster (10,000 to 50,000); lower is better]

Better except under high load.
At very high utilization the penalty comes from partitioning.

Decomposing Hawk

1. Hawk minus centralized
2. Hawk minus stealing
3. Hawk minus partitioning
(normalized to Hawk)

Decomposing Hawk: no centralized

[Chart: Hawk-no-centralized relative to Hawk, 50th and 90th percentiles for short and long jobs]

Decomposing Hawk: no stealing

[Chart: Hawk-no-stealing relative to Hawk, 50th and 90th percentiles for short and long jobs; one bar is off the scale at 19.6]

Decomposing Hawk: no partitioning

[Chart: Hawk-no-partition relative to Hawk, 50th and 90th percentiles for short and long jobs; one bar is off the scale at 11.9]

Decomposing Hawk summary

[Chart: all three variants side by side, 50th and 90th percentiles for short and long jobs; off-scale values 19.6 (no stealing) and 11.9 (no partition)]

Absence of any component reduces Hawk's performance!

Sensitivity analysis

1. Incorrect estimates of runtime
2. Cut-off between long and short
3. Details of stealing

Bottom line: the benefits of Hawk remain despite variation.
See paper for details.

Evaluation: 2. Implementation

[Diagram: the Hawk scheduler coordinates with Hawk daemons running on the cluster nodes]

Experiment

• 100-node cluster

• Subset of Google trace

• Vary inter-arrival time to vary cluster utilization

• Measure: Job running time

• Report 50th and 90th percentile for short and long jobs

• Normalized to Sparrow


Short jobs

[Chart: Hawk/Sparrow job running time, real vs. simulated, 50th and 90th percentiles, as the inter-arrival time is varied from 1x to 2.25x; lower is better]

Long jobs

[Chart: Hawk/Sparrow job running time, real vs. simulated, 50th and 90th percentiles, as the inter-arrival time is varied from 1x to 2.25x; lower is better]

Implementation

1. Hawk works well in a real cluster.
2. Good correspondence between implementation and simulation.

Related work

Centralized: Hadoop Fair Scheduler (EuroSys'10), Quincy (SOSP'09)
Two-level: YARN (SoCC'13), Mesos (NSDI'11)
Distributed schedulers: Omega (EuroSys'13), Sparrow (SOSP'13)
Hybrid schedulers: Mercury

Conclusion

• Hawk: hybrid scheduler

long: centralized, short: distributed

work-stealing

cluster partitioning

• Hawk provides good results for short and long jobs

• Even under high cluster utilization

