Page 1: Cilk - An Efficient Multithreaded Runtime System

Cilk: An Efficient Multithreaded Runtime System

Mohanadarshan - 148241N

Shareek Ahamed - 148201T

Authors: Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul,

Charles E. Leiserson, Keith H. Randall and Yuli Zhou

MIT Laboratory for Computer Science, Cambridge

Page 2: Cilk - An Efficient Multithreaded Runtime System

Agenda

● What is Cilk?

● Why Cilk?

● Introduction

● Scheduling & Work Stealing

● How it Works

● Fibonacci Calculation

● Performance in Cilk Applications

● Current Usage

● Related Works

● Cilk Plus

● Conclusion

Page 3: Cilk - An Efficient Multithreaded Runtime System

What is Cilk?

● Cilk is a C-based runtime system for multi-threaded parallel programming.

● Cilk guarantees efficient and predictable performance

● Lightweight fork and join

○ Own scheduler (Work Stealing Scheduler)

● Provable bounds on performance and space

● World-class chess programs such as StarTech, *Socrates, and Cilkchess were developed in Cilk.

Page 4: Cilk - An Efficient Multithreaded Runtime System

Why Cilk?

Multithreading is a common way to implement dynamic, asynchronous, concurrent programs.

● A multithreaded system provides the programmer with a means to create,

synchronize, and schedule threads.

● Cilk reduces the complexity of implementing multithreaded programs.

● Programmers do not have to manage this complexity; they only need to identify the regions that can run in parallel.

● Cilk lets the programmer focus on reducing two measures:

➔ Total work

➔ Critical path

Page 5: Cilk - An Efficient Multithreaded Runtime System

Introduction

Page 6: Cilk - An Efficient Multithreaded Runtime System

Introduction (contd..)

● A Cilk program is a set of procedures

● A procedure is a sequence of threads

● Cilk threads are:

○ Represented by nodes in the dag

○ Non-blocking: each thread runs to completion without waiting or suspending, so it is an atomic unit of execution

● Threads can spawn child threads

○ Downward edges connect a parent to its children

Page 7: Cilk - An Efficient Multithreaded Runtime System

Introduction (contd..)

● A child and its parent can run concurrently.

○ Because threads are non-blocking, a child cannot return a value directly to its parent.

○ Instead, the parent spawns a successor that receives the values from its children.

● A thread and its successor are parts of the same Cilk procedure.

○ They are connected by horizontal arcs.

● A successor begins only after it has received the values sent by the children:

○ These values constitute data dependencies.

○ They are shown as curved arcs.
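The dag described on these slides can be made concrete with a small sketch. The Python below builds a hypothetical four-thread dag (one parent, two children, one successor) and computes its total work and critical path. The thread names, the unit costs, and the helper functions work and critical_path are assumptions made for illustration; they are not part of Cilk.

```python
# Hypothetical dag: each key is a thread, each value lists the threads
# that depend on it (spawn, successor, and data-dependency edges).
dag = {
    "parent":    ["child_a", "child_b", "successor"],  # spawn and successor edges
    "child_a":   ["successor"],                        # data dependency
    "child_b":   ["successor"],                        # data dependency
    "successor": [],
}
# Assumed unit cost for every thread.
cost = {"parent": 1, "child_a": 1, "child_b": 1, "successor": 1}


def work(dag, cost):
    """Total work: the sum of the costs of all threads."""
    return sum(cost[t] for t in dag)


def critical_path(dag, cost):
    """Critical path: the most expensive chain of dependent threads."""
    longest = {}

    def heaviest_from(t):
        # Cost of the heaviest dependency chain that starts at thread t.
        if t not in longest:
            longest[t] = cost[t] + max(
                (heaviest_from(s) for s in dag[t]), default=0
            )
        return longest[t]

    return max(heaviest_from(t) for t in dag)


total_work = work(dag, cost)
span = critical_path(dag, cost)
print(total_work, span, total_work / span)
```

Dividing the work by the critical path gives the average parallelism of the dag, which bounds how many processors the computation can keep busy.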

Page 8: Cilk - An Efficient Multithreaded Runtime System

How it Works

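● spawn T (k, ?x)

- Spawns a child thread T.

- A leading ? marks an argument that is still missing; x becomes a continuation that refers to that empty slot.

● spawn_next T (k, ?x)

- Spawns a successor thread. The syntax is the same as for a child, except that the keyword spawn_next is used.

● send_argument (k, value)

- Sends value to the argument slot of a waiting closure specified by the continuation k.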

[Figure: a Parent thread creates a Child with spawn and a Successor with spawn_next; the Child passes its result to the Successor with send_argument.]

Page 9: Cilk - An Efficient Multithreaded Runtime System

Scheduling

Every processor has its own:

- Scheduler

- Ready queue

The scheduler is invoked when a thread ends:

- It schedules a thread from the local ready queue or steals one from another processor

Page 10: Cilk - An Efficient Multithreaded Runtime System

Work Stealing

● Cilk uses a runtime scheduling technique called work stealing.

● It works well on dynamic, asynchronous, MIMD-style programs.

● Work stealing:

○ A processor with no work (the thief) selects a victim from which to take work.

○ It takes the shallowest ready thread in the victim’s spawn tree.

● In Cilk, thieves choose their victims uniformly at random.

Page 11: Cilk - An Efficient Multithreaded Runtime System

Work Stealing (contd..)

void func f( )
{
    work;
    spawn g( );
    work;
    work;
    work;
    ….
    work;
}

thread void func g( )
{
    work;
    work;
    work;
}

[Figure: execution of f( ) and g( ) across Worker1 and Worker2]

Page 18: Cilk - An Efficient Multithreaded Runtime System

How it Works (Example: Fibonacci)

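thread fib ( cont int k, int n ) {
    if ( n < 2 )
        send_argument( k, n );          /* base case: send n to continuation k */
    else {
        cont int x, y;
        spawn_next sum ( k, ?x, ?y );   /* successor waits for x and y */
        spawn fib ( x, n - 1 );         /* each child sends its result to x or y */
        spawn fib ( y, n - 2 );
    }
}

thread sum ( cont int k, int x, int y ) {
    send_argument ( k, x + y );         /* pass the total on to k */
}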
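The continuation-passing style of the fib program can be modelled in ordinary Python. The sketch below is only an approximation under stated assumptions: the Closure class, the MISSING marker, the single LIFO ready list, and the run helper are all hypothetical stand-ins for the Cilk runtime, and the successor thread is named total because sum is a Python builtin. It shows how a closure waits until its join counter reaches zero before it becomes ready.

```python
MISSING = object()   # marks an argument slot that a child will fill later


class Closure:
    """A thread plus its argument slots; ready once no slot is MISSING."""

    def __init__(self, thread, args):
        self.thread = thread
        self.args = list(args)
        # Join counter: number of arguments still missing.
        self.join_counter = sum(a is MISSING for a in self.args)


ready = []           # one LIFO pool stands in for the real scheduler


def spawn(thread, *args):
    """Create a child closure; all of its arguments are already present."""
    ready.append(Closure(thread, args))


def spawn_next(thread, *args):
    """Create a successor closure; return one continuation per MISSING slot."""
    closure = Closure(thread, args)
    if closure.join_counter == 0:
        ready.append(closure)
    return [(closure, i) for i, a in enumerate(args) if a is MISSING]


def send_argument(k, value):
    """Fill the slot named by continuation k; post the closure once it is full."""
    closure, index = k
    closure.args[index] = value
    closure.join_counter -= 1
    if closure.join_counter == 0:
        ready.append(closure)


def fib(k, n):
    if n < 2:
        send_argument(k, n)
    else:
        x, y = spawn_next(total, k, MISSING, MISSING)
        spawn(fib, x, n - 1)
        spawn(fib, y, n - 2)


def total(k, x, y):          # plays the role of the Cilk thread "sum"
    send_argument(k, x + y)


def run(n):
    """Run fib(n) to completion and return the value sent to the root slot."""
    root = Closure(lambda value: None, [MISSING])   # hypothetical root closure
    spawn(fib, (root, 0), n)
    while ready:
        closure = ready.pop()
        closure.thread(*closure.args)               # threads never block
    return root.args[0]


print(run(10))
```

No thread ever waits for a result: fib returns immediately after spawning, and the addition happens later in a separate thread once both slots are filled.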

Page 19: Cilk - An Efficient Multithreaded Runtime System

Fibonacci Calculation

Page 20: Cilk - An Efficient Multithreaded Runtime System

Ready Queue

if ( ! readyDeque.isEmpty() )
    take deepest thread from own readyDeque
else
    steal shallowest thread from readyDeque of randomly selected victim
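The scheduling rule above can be tried out in a small sequential simulation. The following Python is a sketch under simplifying assumptions, not the Cilk scheduler itself: each worker's ready structure is a plain deque, every inner thread spawns exactly two children, there are no successors or joins, and the workers take turns in rounds instead of running in parallel. The function name simulate and its parameters are made up for this illustration.

```python
import random
from collections import deque


def simulate(num_workers, depth, seed=0):
    """Run a binary spawn tree of the given depth on simulated workers.

    Returns (threads executed per worker, number of successful steals).
    """
    rng = random.Random(seed)
    deques = [deque() for _ in range(num_workers)]
    deques[0].append(depth)              # the root thread starts on worker 0
    executed = [0] * num_workers
    steals = 0

    while any(deques):
        for w in range(num_workers):     # one scheduling step per worker per round
            if deques[w]:
                task = deques[w].pop()           # own work: deepest (newest) thread
            else:
                victim = rng.randrange(num_workers)
                if victim == w or not deques[victim]:
                    continue                     # failed steal; retry next round
                task = deques[victim].popleft()  # steal: shallowest (oldest) thread
                steals += 1
            executed[w] += 1
            if task > 0:                 # an inner thread spawns two children
                deques[w].append(task - 1)
                deques[w].append(task - 1)

    return executed, steals


executed, steals = simulate(num_workers=4, depth=10)
print(executed, steals)
```

Stealing from the shallow end tends to hand the thief a large subtree, so it can stay busy for a while before it has to steal again.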

Page 21: Cilk - An Efficient Multithreaded Runtime System

Performance in Cilk Application

Experiments were run on a CM5 supercomputer to document the efficiency of the work-stealing scheduler.

Tested Applications

1. fib (Fibonacci)

2. queens (placing N queens on an N x N chessboard)

3. pfold (protein folding)

4. ray (ray-tracing algorithm for graphics rendering)

5. knary (runs an empty “for” loop at each node)

6. *Socrates (parallel chess program that uses the Jamboree search algorithm)

Page 22: Cilk - An Efficient Multithreaded Runtime System

Performance in Cilk Application (contd..)

Tserial ⇒ Time taken to run the serial C program (compiled with gcc)

T1 ⇒ Time taken to run the Cilk program on 1 processor

T∞ ⇒ Critical path of the Cilk computation, measured by timestamping each thread

TP ⇒ Execution time of the Cilk program on P processors

Tserial / T1 ⇒ Efficiency of the Cilk program

⇒ Efficiency is close to 1 for programs with moderately long threads, so the Cilk overhead is small.
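The quantities defined above combine into a few simple ratios. The sketch below computes efficiency (Tserial / T1), average parallelism (T1 / T∞), and a predicted run time using the simple model TP ≈ T1/P + T∞. That model, with its constants taken as 1, is an assumption here, and the timings are made-up numbers for illustration only; they are not measurements from these slides.

```python
def efficiency(t_serial, t_1):
    """Efficiency: serial C time divided by 1-processor Cilk time."""
    return t_serial / t_1


def average_parallelism(t_1, t_inf):
    """Work divided by critical path."""
    return t_1 / t_inf


def predicted_time(t_1, t_inf, p):
    """Assumed model: T_P is roughly T_1 / P + T_inf (constants taken as 1)."""
    return t_1 / p + t_inf


# Made-up timings in seconds, for illustration only.
t_serial, t_1, t_inf = 8.0, 10.0, 0.5

print("efficiency:", efficiency(t_serial, t_1))
print("average parallelism:", average_parallelism(t_1, t_inf))
for p in (1, 4, 16, 64):
    t_p = predicted_time(t_1, t_inf, p)
    # processors, predicted time, predicted speedup T_1 / T_P
    print(p, t_p, t_1 / t_p)
```

Under this model the speedup stays close to P while P is well below the average parallelism, and it levels off once the critical-path term dominates.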

Page 23: Cilk - An Efficient Multithreaded Runtime System

Performance of Cilk on various applications

Page 24: Cilk - An Efficient Multithreaded Runtime System

Performance in Cilk Application

Finding 33rd Fibonacci Number

Page 25: Cilk - An Efficient Multithreaded Runtime System

Example applications

Virus shell assembly

Graphics rendering

n-body simulation

Heuristic search

Dense and sparse matrix computations

Friction-stir welding simulation

Artificial evolution

Page 26: Cilk - An Efficient Multithreaded Runtime System

Related Work

EARTH (An Efficient Architecture for Running THreads)

EARTH supports an adaptive, event-driven multithreaded execution model with two thread levels:

● threaded procedures

● fibers

A threaded procedure is invoked asynchronously, forking a parallel thread of execution.

A threaded procedure is statically divided into fibers: fine-grain threads that communicate through dataflow-like synchronization operations.

Page 27: Cilk - An Efficient Multithreaded Runtime System

EARTH vs. CILK

[Figure: side-by-side diagrams of the EARTH model and the Cilk model]

Note: - EARTH has its origin in the static dataflow model

- By comparison, the features of the Cilk model are similar to those of the EARTH model

Page 28: Cilk - An Efficient Multithreaded Runtime System

Cilk Plus

● Maintained by Intel

● Only 3 keywords

– cilk_spawn

– cilk_sync

– cilk_for

● Available in Intel compilers and in a GCC branch.

More info - http://www.cilkplus.org/

https://www.youtube.com/watch?v=mv5i3MEvX98

Page 29: Cilk - An Efficient Multithreaded Runtime System

Cilk Plus

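#include <cilk/cilk.h>

int fib (int n)
{
    if (n < 2) return n;
    else
    {
        int x, y;
        x = cilk_spawn fib (n-1);   /* may run in parallel with the caller */
        y = cilk_spawn fib (n-2);
        cilk_sync;                  /* wait for both spawned calls */
        return (x+y);
    }
}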

- Easier to write than the continuation-passing Cilk version of fib

- Less complex: no explicit continuations or successor threads

Page 30: Cilk - An Efficient Multithreaded Runtime System

Conclusion

● Pros

➔ Guaranteed runtime & space usage

➔ Good performance

➔ Critical Path is short compared to total work

➔ Low Overhead

➔ Very Simple to Use

● Cons

➔ Only suitable for tree-like computations

➔ Continuations are confusing

➔ No shared memory

Page 31: Cilk - An Efficient Multithreaded Runtime System

Thank You ...