INSTITUTE OF COMPUTING TECHNOLOGY

An Adaptive Task Creation Strategy for Work-Stealing Scheduling

Lei Wang, Huimin Cui, Yuelu Duan, Fang Lu, Xiaobing Feng, Pen-Chung Yew

ICT, Chinese Academy of Sciences, China

University of Minnesota, U.S.A.

Dec 17, 2015

Adam McKinney

Page 2: Forecast

[Diagram: multi-cores; fine-grained parallelism; tasks; work-stealing; adaptive task granularity; an adaptive task creation strategy]

Page 3: Outline

An adaptive task creation strategy

A new data attribute -- taskprivate

Evaluations

Conclusions

Page 4: Background

Cilk, Cilk++, X10, OpenMP 3.0, TBB, TPL …

Parallel programming languages and libraries that support task-level parallelism

Programmer: divides work into tasks instead of threads

Runtime system: maps and schedules tasks onto physical threads

Key technique: work-stealing scheduling

Page 5: Granularity

Too fine: scheduling overhead dominates.

Too coarse: potential parallelism is lost, which can cause starvation.

[Figure: task trees with cut-off = 3 and cut-off = 1]
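The effect of a depth cut-off on the number of tasks can be sketched in plain sequential C. This is an illustration only, not code from the deck: `fib_cutoff`, `count_tasks` and the `tasks_created` counter are hypothetical names, and the counter merely stands in for the cost of creating a stealable task.

```c
#include <assert.h>

/* Hypothetical counter standing in for the cost of creating a stealable task. */
long tasks_created = 0;

/* Plain sequential recursion: what runs once the cut-off has been reached. */
long fib_seq(int n) {
    return n < 2 ? n : fib_seq(n - 1) + fib_seq(n - 2);
}

/* Recursion with a depth cut-off: calls at depth < cutoff count their two
   children as tasks; deeper calls fall back to ordinary function calls. */
long fib_cutoff(int n, int depth, int cutoff) {
    if (n < 2) return n;
    if (depth >= cutoff) return fib_seq(n);
    tasks_created += 2;  /* two children would be spawned here */
    long a = fib_cutoff(n - 1, depth + 1, cutoff);
    long b = fib_cutoff(n - 2, depth + 1, cutoff);
    return a + b;
}

/* Reset the counter, run once, and report how many tasks were created. */
long count_tasks(int n, int cutoff) {
    tasks_created = 0;
    long r = fib_cutoff(n, 0, cutoff);
    assert(r == fib_seq(n));  /* the cut-off must not change the result */
    return tasks_created;
}
```

With this sketch, `count_tasks(10, 1)` creates 2 tasks and `count_tasks(10, 3)` creates 14: a small cut-off leaves few tasks for other threads to steal, while a large one pays the creation cost many more times.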

Page 6: An unbalanced computation tree

[Figure: computation tree, nodes colored by processor] P0 – red, P1 – blue, P2 – green, P3 – yellow.

Page 7: A cut-off strategy

[Figure: computation tree under a cut-off, nodes colored by processor] P0 – red, P1 – blue, P2 – green, P3 – yellow.

Load imbalance

Page 8: An adaptive task creation strategy -- AdaptiveTC

[Figure: computation tree under AdaptiveTC with a special task marked, nodes colored by processor] P0 – red, P1 – blue, P2 – green, P3 – yellow.

A special task

Page 9: AdaptiveTC

When executing a spawn statement, the runtime creates one of:
a task
a function call (a fake task)
a special task

Adaptively switching between tasks and fake tasks to get better performance:
a task → a fake task (cut-off)
a fake task → a task (a special task)

Keeping idle threads busy

Improving performance

Good load balancing
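The switching between tasks, fake tasks and special tasks can be pictured as a small decision function. The sketch below is a guess at the shape of such a policy, not the AdaptiveTC runtime: `worker_state`, its fields (`cutoff`, `base_depth`, `in_fake_task`, `need_task`) and `choose_spawn_kind` are all hypothetical, and the assumption that an idle thread signals its need through a flag is mine.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SPAWN_TASK, SPAWN_FAKE_TASK, SPAWN_SPECIAL_TASK } spawn_kind;

/* Hypothetical per-worker state; field names are illustrative only. */
typedef struct {
    int  cutoff;        /* levels of real tasks created before cutting off */
    int  base_depth;    /* depth at which task creation last (re)started   */
    bool in_fake_task;  /* currently running inside a fake task?           */
    bool need_task;     /* assumed flag: an idle thread is asking for work */
} worker_state;

/* Decide what a spawn statement at the given depth turns into. */
spawn_kind choose_spawn_kind(worker_state *w, int depth) {
    assert(depth >= w->base_depth);
    if (w->in_fake_task) {
        if (w->need_task) {
            /* fake task -> task: publish work via a special task */
            w->need_task = false;
            w->in_fake_task = false;
            w->base_depth = depth;
            return SPAWN_SPECIAL_TASK;
        }
        return SPAWN_FAKE_TASK;  /* stay cheap: plain function call */
    }
    if (depth - w->base_depth >= w->cutoff) {
        /* task -> fake task: the cut-off has been reached */
        w->in_fake_task = true;
        return SPAWN_FAKE_TASK;
    }
    return SPAWN_TASK;
}
```

In this sketch a worker with `cutoff = 2` creates tasks at depths 0 and 1, falls back to function calls below that, and resumes task creation only after `need_task` has been set.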

Page 10: N-queen problem

Which Cilk programs are correct?

(1)

    cilk int nqueens(int depth, int n, char x[]) {
        ...
        tmpx = (char *)malloc(n * sizeof(char));
        memcpy(tmpx, x, n * sizeof(char));
        sn += spawn nqueens(depth + 1, n, tmpx);
        free(tmpx);
        ...
        sync;
        return sn;
    }

(2)

    cilk int nqueens(int depth, int n, char x[]) {
        ...
        tmpx = (char *)malloc(n * sizeof(char));
        memcpy(tmpx, x, n * sizeof(char));
        sn += spawn nqueens(depth + 1, n, tmpx);
        ...
        sync;
        free(x);
        return sn;
    }

(3)

    cilk int nqueens(int depth, int n, char x[]) {
        ...
        tmpx = Cilk_alloca(n * sizeof(char));
        memcpy(tmpx, x, n * sizeof(char));
        sn += spawn nqueens(depth + 1, n, tmpx);
        ...
        sync;
        return sn;
    }

Page 11: A new data attribute -- taskprivate

Workspace copying
Not easy to program
Overhead is high

taskprivate
Introduced for workspace variables

An AdaptiveTC program for nqueens

    cilk int nqueens(int depth, int n, char x[])
    taskprivate: (x[]) (n * sizeof(char));
    {
        int sn = 0;
        if (depth >= n) {
            sn++;
            return sn;
        }
        for (j = 0; j < n; j++) {
            if (place(depth, j, x)) {
                x[depth] = j;
                sn += spawn nqueens(depth + 1, n, x);
            }
        }
        sync;
        return sn;
    }

In a fake task (a function call)

    x[depth] = j;
    sn += nqueens(depth + 1, n, x);

In a task

    x[depth] = j;
    tmpx = Cilk_alloca(n * sizeof(char));
    memcpy(tmpx, x, n * sizeof(char));
    sn += nqueens(depth + 1, n, tmpx);
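The difference between the two expansions can be tried out in plain sequential C. The sketch below is an illustration under assumptions, not the AdaptiveTC compiler's output: the body of `place` is my own guess at the helper the slide calls, and `copy_depth`, `copies_made` and `count_solutions` are hypothetical names. Rows above `copy_depth` copy the workspace as a task would, and deeper rows share the caller's array as a fake task would.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed helper: may a queen go in column j of row `depth`, given the
   columns x[0..depth-1] already chosen for the rows above? */
int place(int depth, int j, const char *x) {
    for (int i = 0; i < depth; i++) {
        int d = x[i] - j;
        if (d == 0 || d == depth - i || d == i - depth) return 0;
    }
    return 1;
}

long copies_made = 0;  /* counts workspace copies, for illustration only */

/* Rows with depth < copy_depth behave like tasks (private copy of x);
   deeper rows behave like fake tasks (reuse the caller's x in place). */
int nqueens(int depth, int n, char *x, int copy_depth) {
    if (depth >= n) return 1;
    int sn = 0;
    for (int j = 0; j < n; j++) {
        if (!place(depth, j, x)) continue;
        x[depth] = (char)j;
        if (depth < copy_depth) {
            char *tmpx = malloc((size_t)n);
            if (tmpx == NULL) abort();
            memcpy(tmpx, x, (size_t)n);
            copies_made++;
            sn += nqueens(depth + 1, n, tmpx, copy_depth);
            free(tmpx);  /* safe here: the sequential child has returned */
        } else {
            sn += nqueens(depth + 1, n, x, copy_depth);
        }
    }
    return sn;
}

/* Reset the copy counter and count all solutions on an n x n board. */
int count_solutions(int n, int copy_depth) {
    char x[32] = {0};
    assert(n >= 1 && n <= 32);
    copies_made = 0;
    return nqueens(0, n, x, copy_depth);
}
```

Both modes return the same count, for example 92 for `n = 8`, but `copy_depth = 0` makes no copies at all, which is the saving the slide attributes to running as a fake task.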

Page 12: Test system, test cases

8 cores: 2-processor quad-core Intel Xeon E5520 (2.26 GHz, 8 GB memory)

8 test cases: 6 are backtracking search programs, 2 are divide-and-conquer programs.

Compared systems: Cilk-5.4.6, Tascell (PPoPP’09), AdaptiveTC

gcc -O3

Page 13: Test case 1 -- performance

Nqueen-array(16)

[Figure: speedup vs. number of threads (1 to 8) for Cilk, Cilk-SYNCHED, Tascell, AdaptiveTC]

Time (seconds)   1 thread   8 threads
C                61         61
Cilk             198        24.57
Cilk-SYNCHED     184        22.41
Tascell          85         14.24
AdaptiveTC       66         8.27

Page 14: Test case 1 -- analysis

Breakdown of overhead

[Figure: stacked bars per system, 0% to 100%; legend: working, taskprivate variable]

System         Overhead
Tascell        28.7%
Cilk           69.2%
Cilk-SYNCHED   67%
AdaptiveTC     7.9%

Load balanced: the usage of cores with 8 threads

System       Busy    Idle
Tascell      83.3%   16.7%
Cilk         99.9%   0.1%
AdaptiveTC   99.0%   1.0%
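The slides do not define how the overhead percentages are computed. They are roughly consistent with the share of a one-thread run that exceeds the sequential C time in the timing table for test case 1; for Cilk, (198 - 61) / 198 is about 69.2%. The helper below only encodes that observation, and `overhead_fraction` is a hypothetical name.

```c
#include <assert.h>

/* Share of a one-thread parallel run spent beyond the sequential C time.
   Assumption: this matches how the slide's overhead figures were derived. */
double overhead_fraction(double t_one_thread, double t_sequential) {
    assert(t_one_thread > 0.0);
    return (t_one_thread - t_sequential) / t_one_thread;
}
```

Calling `overhead_fraction(198, 61)` gives about 0.692 for Cilk. The values for the other systems land within a point of the slide's figures, a gap that may come from the timings being rounded to whole seconds.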

Page 15: Test case 2 -- performance

Nqueen-compute(16)

[Figure: speedup vs. number of threads (1 to 8) for Cilk, Cilk-SYNCHED, Tascell, AdaptiveTC]

Time (seconds)   1 thread   8 threads
C                554        554
Cilk             669        85
Cilk-SYNCHED     661        88
Tascell          627        114
AdaptiveTC       612        77

Page 16: Test case 2 -- analysis

Breakdown of overhead

[Figure: stacked bars per system, 0% to 100%; legend: working, taskprivate variable, deque/nested function]

System         Overhead
Tascell        11.7%
Cilk           17.2%
Cilk-SYNCHED   16.2%
AdaptiveTC     9.5%

Load balanced: the usage of cores with 8 threads

System       Busy    Idle
Tascell      79.2%   20.8%
Cilk         99.9%   0.1%
AdaptiveTC   99.1%   0.9%

Page 17: Experimental results

[Figures: speedup vs. # of threads (1 to 8) for Cilk, Cilk-SYNCHED, Tascell, AdaptiveTC on four benchmarks: Sudoku (input_balance tree), Knight's tour(6*6), Strimko, Pentomino(13)]

Page 18: Experimental results (cont’d)

[Figures: speedup vs. # of threads (1 to 8) for Cilk, Tascell, AdaptiveTC on Comp(60000) and Fib(45)]

Figure: Speedup with 8 threads, baseline is Cilk’s execution time

[Bar chart: Cilk, Cilk-SYNCHED, Tascell, AdaptiveTC on Nqueen_array(16), Nqueen_compute(16), Strimko, Knight's Tour(6*6), Sudoku (balance_tree), Pentomino(13), Fib(45), Comp(60000), Average]

System         Speedup
Cilk           1
Cilk-SYNCHED   1.07
Tascell        1.5
AdaptiveTC     2.24

Page 19: Conclusions -- AdaptiveTC

An adaptive task creation strategy controls task granularity.
Reducing the system overhead
Achieving good load balancing

A new data attribute, taskprivate, is introduced for workspace variables.
Improving programmability
Reducing the cost of workspace copying with an adaptive task creation strategy

Page 20: Thanks!