Multi-Resource Packing for Cluster Schedulers. Robert Grandl, Aditya Akella, Srikanth Kandula, Ganesh Ananthanarayanan, Sriram Rao


Jan 17, 2016


Transcript
Page 1: Multi-Resource Packing  for  Cluster Schedulers

Multi-Resource Packing for Cluster Schedulers

Robert Grandl, Aditya Akella

Srikanth Kandula, Ganesh Ananthanarayanan, Sriram Rao

Page 2: Multi-Resource Packing  for  Cluster Schedulers

Diverse Resource Requirements

Tasks need varying amounts of each resource, e.g., memory [100 MB to 17 GB], CPU [2% of a core to 6 cores]

Need to match tasks with machines based on resources

Demands for resources are not correlated: correlation coefficient across resource demands [–0.11, 0.33]

Page 3: Multi-Resource Packing  for  Cluster Schedulers

Current Schedulers do not Pack

Resources allocated in terms of “slots”

Resource Fragmentation

Figure: tasks T1 (2 GB), T2 (2 GB), and T3 (4 GB) placed on Machine A (4 GB memory) and Machine B (4 GB memory). Panels: Current Schedulers, Packer Schedulers.
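The fragmentation on this slide can be reproduced with a small sketch. It assumes the current scheduler behaves like a spreading policy (place each task on the machine with the most free memory) and uses plain best-fit as a stand-in for a packer; both policies and all function names are illustrative assumptions, not the schedulers from the talk.

```python
def place(tasks, capacity, choose):
    """Place tasks one by one; return (placement, names of unplaced tasks)."""
    free = dict(capacity)
    placement, unplaced = {}, []
    for name, need in tasks:
        # Machines that still have enough free memory, in a stable order.
        fits = sorted(m for m in free if free[m] >= need)
        if not fits:
            unplaced.append(name)
            continue
        machine = choose(fits, free)
        free[machine] -= need
        placement[name] = machine
    return placement, unplaced


def spread(fits, free):
    # Assumed stand-in for slot-style allocation: most free memory first.
    return max(fits, key=lambda m: free[m])


def best_fit(fits, free):
    # Assumed stand-in for a packer: tightest machine that still fits.
    return min(fits, key=lambda m: free[m])


tasks = [("T1", 2), ("T2", 2), ("T3", 4)]   # memory demand in GB
capacity = {"A": 4, "B": 4}                 # memory per machine in GB

spread_placement, spread_unplaced = place(tasks, capacity, spread)
packed_placement, packed_unplaced = place(tasks, capacity, best_fit)

print("spread:", spread_placement, "unplaced:", spread_unplaced)
print("packed:", packed_placement, "unplaced:", packed_unplaced)
```

Under spreading, T1 and T2 land on different machines and leave 2 GB free on each, so T3 cannot start even though 4 GB is free in total. Best-fit co-locates T1 and T2 and keeps Machine B whole for T3.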

Page 4: Multi-Resource Packing  for  Cluster Schedulers

Current Schedulers do not Pack

Resources allocated in terms of “slots”

Over-allocation

Figure: Machine A (4 GB memory, 20 MB/s in network) with tasks T1 (2 GB mem, 20 MB/s in network), T2 (2 GB mem, 20 MB/s in network), and T3 (2 GB mem). Panels: Current Schedulers, Packer Schedulers.
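A short sketch of why checking memory alone over-allocates the network. The slide gives no network demand for T3, so the value 0 below is an assumption, as are the dictionary keys and helper names.

```python
machine = {"mem_gb": 4, "net_in_mb_s": 20}

tasks = {
    "T1": {"mem_gb": 2, "net_in_mb_s": 20},
    "T2": {"mem_gb": 2, "net_in_mb_s": 20},
    "T3": {"mem_gb": 2, "net_in_mb_s": 0},   # assumed: no network demand shown
}


def fits(running, candidate, resources):
    """True if adding candidate keeps every checked resource within capacity."""
    for r in resources:
        used = sum(tasks[t][r] for t in running)
        if used + tasks[candidate][r] > machine[r]:
            return False
    return True


def co_schedule(order, resources):
    """Admit tasks in order, checking only the listed resources."""
    running = []
    for t in order:
        if fits(running, t, resources):
            running.append(t)
    return running


order = ["T1", "T2", "T3"]
mem_only = co_schedule(order, ["mem_gb"])
all_resources = co_schedule(order, ["mem_gb", "net_in_mb_s"])

net_demand_mem_only = sum(tasks[t]["net_in_mb_s"] for t in mem_only)
print(mem_only, "network demand:", net_demand_mem_only, "MB/s")
print(all_resources)
```

With the memory-only check, T1 and T2 run together and ask for 40 MB/s on a 20 MB/s link. Checking both resources runs T1 with T3 instead and defers T2.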

Page 5: Multi-Resource Packing  for  Cluster Schedulers

Current Schedulers do not Pack

Slots allocated purely on fairness considerations

Cluster: [18 Cores, 36 GB]

Job: [Task Prof.], # tasks
A: [1 Core, 2 GB], 18
B: [3 Cores, 1 GB], 6
C: [3 Cores, 1 GB], 6

Current Schedulers (DRF, DRF share = 1/3)

Interval    A         B         C         Resources used
0 to t      6 tasks   2 tasks   2 tasks   18 cores, 16 GB
t to 2t     6 tasks   2 tasks   2 tasks   18 cores, 16 GB
2t to 3t    6 tasks   2 tasks   2 tasks   18 cores, 16 GB

Durations: A: 3t, B: 3t, C: 3t

Packer Schedulers

Interval    A         B         C         Resources used
0 to t      18 tasks  0 tasks   0 tasks   18 cores, 36 GB
t to 2t     0 tasks   6 tasks   0 tasks   18 cores, 6 GB
2t to 3t    0 tasks   0 tasks   6 tasks   18 cores, 6 GB

Durations: A: t, B: 2t, C: 3t

33% improvement
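The numbers on this slide can be checked with a few lines of arithmetic. The sketch below replays both schedules from the slide and reads the 33% figure as the drop in average job duration (3t to 2t), which is an interpretation rather than something the slide states; the helper names are illustrative.

```python
cluster = (18, 36)  # (cores, GB of memory)

# job -> ((cores per task, GB per task), number of tasks)
jobs = {"A": ((1, 2), 18), "B": ((3, 1), 6), "C": ((3, 1), 6)}

# Tasks started per job in each interval of length t.
drf_schedule = [{"A": 6, "B": 2, "C": 2}] * 3
packer_schedule = [{"A": 18}, {"B": 6}, {"C": 6}]


def usage(alloc):
    """Total (cores, GB) consumed by one interval's allocation."""
    cores = sum(jobs[j][0][0] * n for j, n in alloc.items())
    mem = sum(jobs[j][0][1] * n for j, n in alloc.items())
    return cores, mem


def completion_times(schedule):
    """Interval index (in units of t) at which each job finishes."""
    remaining = {j: n for j, (_, n) in jobs.items()}
    done = {}
    for step, alloc in enumerate(schedule, start=1):
        # Every interval must fit inside the cluster.
        assert all(u <= c for u, c in zip(usage(alloc), cluster))
        for j, n in alloc.items():
            remaining[j] -= n
            if remaining[j] == 0 and j not in done:
                done[j] = step
    return done


drf_done = completion_times(drf_schedule)
packer_done = completion_times(packer_schedule)

drf_avg = sum(drf_done.values()) / len(drf_done)
packer_avg = sum(packer_done.values()) / len(packer_done)
improvement = 100 * (drf_avg - packer_avg) / drf_avg

print("DRF usage per interval:", usage(drf_schedule[0]))
print("DRF durations:", drf_done, "Packer durations:", packer_done)
print(f"Average duration improvement: {improvement:.0f}%")
```

Both schedules finish all work at 3t, so the makespan is unchanged in this example; the gain comes from jobs A and B finishing earlier.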

Page 6: Multi-Resource Packing  for  Cluster Schedulers

Is it all about packing?

Multi-dimensional bin packing is NP-hard for number of dimensions ≥ 2
Several heuristics proposed
But they do not apply here: size of the ball, contiguity of allocation, resource demands are elastic in time

Will perfect packing suffice?

Competing objectives: cluster utilization vs. job completion times vs. fairness

Page 7: Multi-Resource Packing  for  Cluster Schedulers

Something reasonably simple that can be applied

Intuition behind the solution

Cluster efficiency Job completion time

Cluster efficiency Fairness

Page 8: Multi-Resource Packing  for  Cluster Schedulers

Tetris

Pack tasks along multiple resources (A)
    Cosine similarity between task demand vector and machine resource vector

Multi-resource version of SRTF (T)
    Favor jobs with small remaining duration and small resource consumption

Incorporate Fairness (F)
    Fairness knob f in (0, 1]
    f → 0: close to perfect fairness
    f = 1: most efficient scheduling
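The packing term on this slide is a cosine similarity between a task's demand vector and a machine's resource vector. A minimal sketch follows; the example vectors are made up, and a real scheduler would probably normalize each dimension by machine capacity first so that units do not dominate the score (an assumption, not something the slide states).

```python
import math


def cosine_similarity(demand, free):
    """Cosine of the angle between a demand vector and a free-resource vector."""
    dot = sum(d * f for d, f in zip(demand, free))
    norm = math.sqrt(sum(d * d for d in demand)) * math.sqrt(sum(f * f for f in free))
    return dot / norm if norm else 0.0


free = (6, 2)            # hypothetical machine: 6 cores and 2 GB free
cpu_heavy = (3, 1)       # hypothetical task: 3 cores, 1 GB
memory_heavy = (1, 2)    # hypothetical task: 1 core, 2 GB

cpu_score = cosine_similarity(cpu_heavy, free)
mem_score = cosine_similarity(memory_heavy, free)
print(f"cpu-heavy: {cpu_score:.3f}, memory-heavy: {mem_score:.3f}")
```

The CPU-heavy task has the same shape as the free resources, so it scores close to 1 and would be preferred on this machine; the memory-heavy task scores about 0.71.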

while (resources R are free)
    among FJ jobs furthest from fair share
        score(j) = max over tasks t in j with demand(t) ≤ R of [A(t, R) + T(j)]
    pick j*, t* = argmax score(j)
    R = R – demand(t*)
end while

(simplified) Scheduling procedure
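The simplified procedure can be turned into a runnable sketch under several assumptions. "FJ jobs" is read here as the ceil(f × J) jobs furthest from their fair share, A(t, R) is taken to be cosine similarity, and T(j) is a made-up bonus of 1 / (1 + remaining work). The `deficit` field (how far a job is below its fair share) is held fixed for simplicity. None of these choices is confirmed by the slides.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def srtf_bonus(job):
    # Hypothetical T(j): larger for jobs with less remaining work.
    return 1.0 / (1.0 + job["remaining"])


def schedule(jobs, free, f):
    """Greedy loop for one machine; returns the (job, task) picks in order."""
    free = list(free)
    pending = {name: list(job["tasks"]) for name, job in jobs.items()}
    picks = []
    while True:
        active = [j for j in pending if pending[j]]
        # Assumed reading of "FJ": the ceil(f * J) jobs furthest from fair share.
        k = max(1, math.ceil(f * len(active)))
        candidates = sorted(active, key=lambda j: -jobs[j]["deficit"])[:k]
        best = None
        for j in candidates:
            for i, t in enumerate(pending[j]):
                if all(d <= r for d, r in zip(t, free)):   # demand(t) <= R
                    score = cosine(t, free) + srtf_bonus(jobs[j])
                    if best is None or score > best[0]:
                        best = (score, j, i)
        if best is None:          # nothing fits, or no work left
            break
        _, j, i = best
        t = pending[j].pop(i)
        free = [r - d for r, d in zip(free, t)]            # R = R - demand(t*)
        picks.append((j, t))
    return picks


# Hypothetical jobs: task demands are (cores, GB).
jobs = {
    "X": {"tasks": [(1, 4), (1, 4)], "remaining": 10, "deficit": 0.1},
    "Y": {"tasks": [(3, 1)], "remaining": 2, "deficit": 0.5},
}

efficient = schedule(jobs, free=(4, 8), f=1.0)
fairer = schedule(jobs, free=(4, 8), f=0.5)
print([j for j, _ in efficient], [j for j, _ in fairer])
```

With f = 1 every job competes on score, so the better-aligned task from X is picked first. With f = 0.5 only the job furthest from its fair share (Y) is considered in the first round, which changes the order of the picks.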

Page 9: Multi-Resource Packing  for  Cluster Schedulers

Task Requirements and resource usages

Peak usage demand estimates for tasks

Learning task requirements
    From tasks that have finished in the same phase
    Coefficient of variation [0.022, 0.41]
    Collecting statistics from recurring jobs

Resource Tracker
    Measures actual usage of resources
    Enforces allocations
    Aware of activities on the cluster other than task assignment: ingest and evacuation

Figure: Resource Tracker, Machine - In Network. Y axis: MBytes/sec (ticks 0, 512, 850, 1024); X axis: Time (sec). Series: In Network Used, In Network Free, Task In Network Estimates.
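One way to picture learning requirements from finished tasks in the same phase is sketched below. The sample values are invented for illustration, and using the maximum observed peak as the estimate is an assumption; the slides do not say which statistic is used.

```python
import statistics

# Hypothetical peak inbound-network usage (MB/s) of finished tasks in one phase.
finished_peaks = [100, 110, 90, 105, 95]

mean_peak = statistics.mean(finished_peaks)
# Coefficient of variation: population standard deviation divided by the mean.
cv = statistics.pstdev(finished_peaks) / mean_peak

# Assumed estimator: remaining tasks of the phase get the largest peak seen so far.
estimate = max(finished_peaks)

print(f"mean={mean_peak:.1f} MB/s, CV={cv:.3f}, estimate={estimate} MB/s")
```

A low coefficient of variation within a phase is what makes this kind of estimate usable: the tasks that have finished are good predictors for those still to run.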

Page 10: Multi-Resource Packing  for  Cluster Schedulers

Evaluation

Prototype atop Hadoop 2.3
    Tetris as a pluggable scheduler to the RM (ResourceManager)
    Implement RT (Resource Tracker) as an NM (NodeManager) service
    Modified AM/RM (ApplicationMaster/ResourceManager) resource allocation protocol

Large scale evaluation
    Cluster capacity: 250 Nodes
    4 hour synthetic workload
    60 jobs with complementary task demands

Figure: CDF plot with axis label "Reduction (%) in Job Duration - DRF".

Reduces average job duration by up to 40%

Reduces makespan by 39%

Page 11: Multi-Resource Packing  for  Cluster Schedulers

Evaluation

Trace-driven simulation

Facebook production traces analysis

Fairness knob: fewer than 6% of jobs slow down, by no more than 8% on average

Knob value of 0.75 offers nearly the best possible efficiency with little unfairness

Figure: two panels titled "Fairness Knob - Fair" and "Fairness Knob - DRF", each with axis label Slowdown (%).

Page 12: Multi-Resource Packing  for  Cluster Schedulers

Conclusion

Identify the importance of scheduling all relevant resources in a cluster
    Resource Fragmentation
    Over-allocation and Interference

New scheduler that packs tasks along multiple resources
    Reduce makespan
    Job Completion Time

Enable a trade-off between packing efficiency and fairness
    Fairness Knob

Come and see our poster!