Page 1: Cilk - An Efficient Multithreaded Runtime System

Cilk: An Efficient Multithreaded Runtime System

Mohanadarshan - 148241N

Shareek Ahamed - 148201T

Authors: Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou

MIT Laboratory for Computer Science, Cambridge

Agenda

● What is Cilk ?

● Why Cilk ?

● Introduction

● Scheduling & Work Stealing

● How it Works ?

● Fibonacci Calculation

● Performance in Cilk Applications

● Current Usage

● Related Works

● Cilk Plus

● Conclusion

What is Cilk ?

● Cilk is a C-based runtime system for multi-threaded parallel programming.

● Cilk guarantees efficient and predictable performance.

● Lightweight fork and join

○ Its own scheduler (a work-stealing scheduler)

● Provable bounds on performance and space

● World-class chess programs such as StarTech, *Socrates, and Cilkchess were developed with Cilk.

Why Cilk ?

Multithreading is a popular way to implement dynamic, asynchronous, concurrent programs.

● A multithreaded system provides the programmer with a means to create, synchronize, and schedule threads.

● Cilk reduces the complexity of implementing multithreaded programs.

● Programmers do not have to manage that complexity themselves; they only need to identify the regions that can run in parallel.

● Cilk optimizes:

➔ Total work

➔ Critical path
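The two quantities above can be made concrete on a small dag. The following is a minimal Python sketch rather than Cilk code; the dag, the thread names, and the costs are invented for illustration only.

```python
from functools import lru_cache

# Hypothetical dag: thread name -> (cost, threads that must finish first).
DAG = {
    "a": (1, []),
    "b": (2, ["a"]),
    "c": (3, ["a"]),
    "d": (1, ["b", "c"]),
}

def total_work(dag):
    """Total work: the summed cost of all threads (time on one processor)."""
    return sum(cost for cost, _ in dag.values())

def critical_path(dag):
    """Critical path: the cost of the longest chain of dependent threads."""
    @lru_cache(maxsize=None)
    def finish(node):
        cost, preds = dag[node]
        return cost + max((finish(p) for p in preds), default=0)
    return max(finish(node) for node in dag)

work = total_work(DAG)       # 1 + 2 + 3 + 1
span = critical_path(DAG)    # longest chain is a -> c -> d
print(work, span, work / span)
```

The ratio of total work to critical path gives the average parallelism of the computation, which limits how many processors can be kept busy.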

Introduction

Introduction (contd..)

● A Cilk program is a set of procedures.

● A procedure is a sequence of threads.

● Cilk threads are:

○ Represented by nodes in the DAG

○ Non-blocking: they run to completion without waiting or suspension, as atomic units of execution

● Threads can spawn child threads

○ downward edges connect a parent to its children

Introduction (contd..)

● A child & parent can run concurrently.

○ Non-blocking threads --> a child cannot return a value to its parent.

○ The parent spawns a successor that receives values from its children

● A thread & its successor are parts of the same Cilk procedure.

○ connected by horizontal arcs

● Children’s returned values are received before their successor begins:

○ They constitute data dependencies.

○ Connected by curved arcs

How it Works ?

● spawn T (k, ?x)

- spawn a child thread

● spawn_next T(k, ?x)

- A successor thread is spawned the same way as a child, except the keyword spawn_next is used

● send_argument( k, value )

- sends value to the argument slot of a waiting closure specified by continuation k.

[Figure: a Parent thread creates a Child with spawn and a Successor with spawn_next; the Child passes its value to the Successor with send_argument.]
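The three primitives on this slide can be modelled as a closure with empty argument slots and a counter of missing arguments. The Python sketch below is an illustration under that assumption, not the Cilk runtime; the names `Closure`, `missing`, `ready`, and `add` are hypothetical.

```python
class Closure:
    """A thread plus its argument slots; ready once every slot is filled."""
    def __init__(self, thread, nslots):
        self.thread = thread
        self.args = [None] * nslots
        self.missing = nslots            # join counter: slots still empty

ready = []                               # closures whose arguments have all arrived

def send_argument(k, value):
    """k is a continuation, modelled here as a (closure, slot index) pair."""
    closure, slot = k
    closure.args[slot] = value
    closure.missing -= 1
    if closure.missing == 0:             # last argument arrived: post as ready
        ready.append(closure)

def add(x, y):
    return x + y

# spawn_next: create the successor with two empty slots (?x and ?y).
successor = Closure(add, 2)
send_argument((successor, 0), 3)         # first child reports its value
assert not ready                         # still waiting for the second slot
send_argument((successor, 1), 4)         # second child reports its value
closure = ready.pop()
result = closure.thread(*closure.args)
print(result)
```

Because the successor only becomes ready after its last argument arrives, no thread ever has to block while waiting for a child.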

Scheduling

Every processor has its own:

- Scheduler

- Ready queue

The scheduler is invoked when a thread ends:

- It schedules another local thread or steals one from another processor

Work Stealing

● Cilk uses a run-time scheduling technique called work stealing.

● Works well on dynamic, asynchronous, MIMD-style programs.

● Work stealing:

○ A processor with no work selects a victim from which to steal work.

○ It takes the shallowest thread in the victim’s spawn tree.

● In Cilk, thieves choose the victims randomly.

Work Stealing (contd..)

void func f( )
{
    work;
    spawn g( );
    work;
    work;
    work;
    ….
    work;
}

thread void func g( )
{
    work;
    work;
    work;
}

Worker1 Worker2

How it Works ? (Example :Fibonacci)

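thread fib ( cont int k, int n ) {
    if ( n < 2 )
        send_argument( k, n );           /* base case: send n to continuation k */
    else {
        cont int x, y;
        spawn_next sum ( k, ?x, ?y );    /* successor waits for both x and y */
        spawn fib ( x, n - 1 );          /* each child sends its result to x or y */
        spawn fib ( y, n - 2 );
    }
}

thread sum ( cont int k, int x, int y ) {
    send_argument ( k, x + y );          /* forward the sum to continuation k */
}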
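To see how the continuation-passing version actually produces a value, the program above can be mimicked in Python with a single worker and a LIFO list of ready closures. This is a sketch under simplifying assumptions (one processor, no stealing); `EMPTY`, `Closure`, `ready`, `summ`, and `run` are hypothetical names, and a continuation is modelled as a (closure, slot index) pair.

```python
EMPTY = object()   # marks an argument slot still waiting for a value ("?x" in Cilk)
ready = []         # LIFO list of ready closures: a one-worker stand-in for the scheduler

class Closure:
    def __init__(self, thread, *args):
        self.thread = thread
        self.args = list(args)
        self.missing = sum(a is EMPTY for a in self.args)   # join counter

def spawn(thread, *args):
    """Create a closure; post it as ready if no argument is missing."""
    closure = Closure(thread, *args)
    if closure.missing == 0:
        ready.append(closure)
    return closure

spawn_next = spawn   # same mechanics; a successor usually starts with empty slots

def send_argument(k, value):
    closure, slot = k
    closure.args[slot] = value
    closure.missing -= 1
    if closure.missing == 0:
        ready.append(closure)

def fib(k, n):
    if n < 2:
        send_argument(k, n)
    else:
        s = spawn_next(summ, k, EMPTY, EMPTY)
        spawn(fib, (s, 1), n - 1)    # slot 1 of the successor plays the role of x
        spawn(fib, (s, 2), n - 2)    # slot 2 of the successor plays the role of y

def summ(k, x, y):
    send_argument(k, x + y)

def run(n):
    sink = Closure(lambda value: None, EMPTY)   # receives the final result
    spawn(fib, (sink, 0), n)
    while ready:                                # run each thread to completion
        closure = ready.pop()
        closure.thread(*closure.args)
    return sink.args[0]

print(run(10))   # 55
```

Note that no thread ever waits: `fib` returns immediately after spawning, and `summ` only starts once both of its slots are filled.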

Fibonacci Calculation

Ready Queue

if ( !readyDeque.isEmpty() )
    take the deepest thread
else
    steal the shallowest thread from the readyDeque of a randomly selected victim
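The rule above can be sketched as a ready pool grouped by spawn-tree level. The Python below is an illustrative model, not the Cilk scheduler itself; `ReadyPool`, `post`, and `next_thread` are hypothetical names, and a failed steal simply returns `None` so the caller can retry.

```python
import random

class ReadyPool:
    """Ready threads of one processor, grouped by spawn-tree level (0 = root)."""
    def __init__(self):
        self.levels = {}                          # level -> list of ready threads

    def post(self, level, thread):
        self.levels.setdefault(level, []).append(thread)

    def is_empty(self):
        return not self.levels

    def _take(self, level):
        threads = self.levels[level]
        thread = threads.pop()
        if not threads:                           # drop levels that became empty
            del self.levels[level]
        return level, thread

    def take_deepest(self):
        return self._take(max(self.levels))

    def take_shallowest(self):
        return self._take(min(self.levels))

def next_thread(me, pools, rng=random):
    """One scheduling step for processor `me`: local work first, otherwise steal."""
    if not pools[me].is_empty():
        return pools[me].take_deepest()
    victims = [p for p in range(len(pools)) if p != me]
    victim = rng.choice(victims)                  # victim chosen uniformly at random
    if pools[victim].is_empty():
        return None                               # failed steal attempt
    return pools[victim].take_shallowest()

pools = [ReadyPool(), ReadyPool()]
pools[0].post(0, "f")
pools[0].post(1, "g")
pools[0].post(2, "h")
stolen = next_thread(1, pools)   # processor 1 is idle, so it steals from processor 0
local = next_thread(0, pools)    # processor 0 continues with its deepest thread
print(stolen, local)
```

Stealing the shallowest thread tends to hand the thief a large piece of work, while the owner keeps working depth-first on its own subtree.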

Performance in Cilk Application

Experiments were run on a CM5 supercomputer to document the efficiency of the work-stealing scheduler.

Tested Applications

1. fib (Fibonacci)

2. queens (placing N queens on an N x N chessboard)

3. pfold (protein folding)

4. ray (ray-tracing algorithm for graphics rendering)

5. knary (runs an empty “for” loop at each node)

6. *Socrates (parallel chess program that uses the Jamboree search algorithm)

Performance in Cilk Application (contd..)

T_serial ⇒ Time taken to run the serial C program (compiled with gcc)

T_1 ⇒ Time taken to run the Cilk program on 1 processor (the work)

T_∞ ⇒ Critical path length of the Cilk computation, measured by timestamping each thread

T_P ⇒ Execution time of the Cilk program on P processors

T_serial / T_1 ⇒ Efficiency of the Cilk program

⇒ Efficiency is close to 1 for programs with moderately long threads, so the Cilk overhead is small.
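As a small worked example of these ratios, the Python sketch below computes the efficiency defined above, together with the speedup T_1 / T_P, which is an added assumption not defined on this slide. The timings are illustrative placeholders, not measurements from the paper.

```python
def efficiency(t_serial, t_1):
    """T_serial / T_1: how close the 1-processor Cilk run is to the serial C run."""
    return t_serial / t_1

def speedup(t_1, t_p):
    """T_1 / T_P: how much faster the P-processor run is than the 1-processor run."""
    return t_1 / t_p

# Illustrative timings in seconds, NOT measurements from the paper.
t_serial, t_1, t_p = 8.0, 10.0, 2.0
print(efficiency(t_serial, t_1))
print(speedup(t_1, t_p))
```

An efficiency near 1 means the Cilk version costs little more than the plain C version when both run on a single processor.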

Performance of Cilk on various applications

Performance in Cilk Application

Finding 33rd Fibonacci Number

Example applications

Virus shell assembly

Graphics rendering

n-body simulation

Heuristic search

Dense and sparse matrix computations

Friction-stir welding simulation

Artificial evolution

Related Works

EARTH (An Efficient Architecture for Running THreads)

EARTH supports an adaptive, event-driven multithreaded execution model with two thread levels:

● threaded procedures

● fibers

A threaded procedure is invoked asynchronously, forking a parallel thread of execution.

A threaded procedure is statically divided into fibers: fine-grained threads that communicate through dataflow-like synchronization operations.

EARTH vs. CILK

EARTH Model CILK Model

Note:

- EARTH has its origin in the static dataflow model.

- By comparison, the features of the Cilk model are similar to those of the EARTH model.

Cilk Plus

● Maintained by Intel

● Only 3 keywords

– cilk_spawn

– cilk_sync

– cilk_for

● Available in Intel compilers and in a GCC branch.

More info - http://www.cilkplus.org/

https://www.youtube.com/watch?v=mv5i3MEvX98

Cilk Plus

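#include <cilk/cilk.h>

int fib (int n)
{
    if (n < 2) return n;
    else
    {
        int x, y;
        x = cilk_spawn fib (n-1);   /* child may run in parallel with the caller */
        y = cilk_spawn fib (n-2);
        cilk_sync;                  /* wait for both children before using x and y */
        return (x+y);
    }
}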

- Easier to implement than Cilk!

- Less complex than Cilk!

Conclusion

● Pros

➔ Guaranteed runtime & space usage

➔ Good performance

➔ Critical Path is short compared to total work

➔ Low Overhead

➔ Very Simple to Use

● Cons

➔ Only suitable for tree-like computations

➔ Continuations are confusing

➔ No shared memory

Thank You ...

