An Operational Semantics for Skeletons Marco Aldinucci ISTI – CNR National Research Council Pisa, Italy Marco Danelutto Computer Science Dept. University of Pisa Pisa, Italy ParCo 2003, Dresden, Germany

Dec 22, 2015


An Operational Semantics for Skeletons

Marco Aldinucci

ISTI – CNR, National Research Council

Pisa, Italy

Marco Danelutto

Computer Science Dept., University of Pisa

Pisa, Italy

ParCo 2003, Dresden, Germany


Outline

• Skeletons
• Semantics – motivations
• The schema of semantics
• Axioms – rules
• Example
• Concluding remarks


Skeletons

Skeletons are language constructs
• well-defined input-output behavior
• parallelism exploitation patterns
• (sometimes) can be nested
• several prepackaged implementations

Two main families
• Data Parallel (map, reduce, scan, …)
• Task & Stream Parallel (farm, pipeline, …)


Motivations

Usually: formal functional semantics, informal parallel behavior

Describe skeletons
• in-out relationship (functional behavior)
• parallel behavior
• in a uniform and precise way (non-steady state)
• in a structural way

Theoretical work motivated by concrete needs
• Enable and automate performance-driven source-to-source optimizations (same in/out, different parallel behaviors)
• Compare the expressive power of different skeleton sets


farm (pipe (seq f1) (seq f2))

[Figure: a farm of two pipelines, each with stages f1 and f2, mapped on PE1 – PE4 with a scheduler and a gatherer, connected by a channel, network, …; e.g. with ASI: f1 = filter, f2 = render.]

Sequential source code is just plugged in. Data items arrive in sequence; we cannot assume data is already distributed; the data distribution cost is large. Several farm scheduling policies are possible, as well as several data mappings.

pipe (map fc (seq f1) fd) (map gc (seq f2) gd)

[Figure: a pipe of two maps on PE1 – PE6, with three f1 workers and three f2 workers plus the fd, fc, gd and gc functions.]


Running example language: Lithium

Stream and Data Parallel skeletons
• farm, pipe
• map, reduce, D&C, …

Can be freely nested

All skeletons have a stream as in/out

Java-based (skeletons are Java classes)

Implemented and running [FGCS 19(5):2003] http://www.di.unipi.it/~marcod/Lithium/ or sourceforge

Macro data-flow run-time

Supports heterogeneous COWs (clusters of workstations)

Includes parallel structure optimization (performance-driven, source-to-source)


The schema of semantics

Axioms, three kinds per skeleton:
1. Describe skeletons within the steady state
2. Mark the beginning of the stream *
3. Manage the end of the stream *

Six rules:
1. Two describing parallel execution (SP, DP)
   – They have a cost
2. Four to navigate the program structure
   – No cost; they ensure strict execution order

Look only at the SP/DP rules to figure out program performance


The meaning of labels

Labels represent an enumeration of PEs

Two kinds of labels:
• On streams, labels represent data mapping: a stream item x labeled 3 means x is available on PE3
• On arrows, labels represent computation mapping: an arrow labeled 4 means such computation is performed by PE4

Re-labeling a stream with O(l,x) means communicating it

The semantics may embed a user-defined policy O(l,x)

Cost depends on the label (topology) and on the data item x (size)
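To make the labeling concrete, here is a minimal Python sketch (not Lithium code). A labeled stream item is modeled as a (label, value) pair, `round_robin` is one possible user-defined policy O(l,x), and `comm_cost` is a made-up cost function that depends on the two labels and on the item size. All names, and the latency and per-byte figures, are assumptions for illustration only.

```python
import sys
from itertools import count

def round_robin(n_pes):
    """Build a hypothetical policy O(l, x): ignore l and x, cycle over n_pes PEs."""
    ticket = count()
    def policy(label, x):
        return next(ticket) % n_pes
    return policy

def comm_cost(src, dst, x, latency=1.0, per_byte=0.01):
    """Assumed cost model: free if the item stays put, else latency plus a size term."""
    if src == dst:
        return 0.0
    return latency + per_byte * sys.getsizeof(x)

def relabel(item, policy):
    """Re-label an item (i.e. communicate it); return the new item and the cost paid."""
    src, x = item
    dst = policy(src, x)
    return (dst, x), comm_cost(src, dst, x)

# x1, available on PE3, is sent to the first PE chosen by the policy (PE0).
policy = round_robin(2)
moved, cost = relabel((3, "x1"), policy)
```

A different topology or scheduling strategy would only change `policy` and `comm_cost`; the re-labeling step itself stays the same.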


Axioms (steady state)

1. Apply the inner skeleton F param to the stream head x
   a. The arrow label gets the left-hand side stream label
   b. Labels on the right-hand side may change (stream items may be bounced elsewhere)

2. Recur on the tail of the stream

3. Expressions 1 & 2 are joined by the :: operator


Lithium axioms (for stream parallel skeletons)

seq: embeds sequential code; the stream is unfolded, labels are unchanged

farm:
a. the stream item is distributed according to the O policy
b. a reference to the tail of the stream follows the head

pipe:
a. the arrow label gets the stream label – the step happens locally
b. the label doesn’t change – the 1st stage Δ1 is kept local
c. a re-labeling R is inserted between the 1st and 2nd stage – it will map the 2nd stage elsewhere
d. the tail is expected from the same source
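The stream parallel axioms can be read as a recursive unfolding of a skeleton over a labeled stream. The Python sketch below is an approximation under stated assumptions, not the authors' formalization: skeleton terms are nested tuples, labels are strings, the policy O(l,x) is a round-robin over farm workers, and the pipe re-labeling simply appends a stage index to the current label (chosen so that labels such as 02 and 12 come out). None of these encodings come from Lithium itself.

```python
from itertools import count

def round_robin(n_workers):
    """Hypothetical policy O(l, x): ignore l and x, cycle over worker labels."""
    ticket = count()
    return lambda label, x: str(next(ticket) % n_workers)

def unfold(term, items, label, policy):
    """Rewrite `term <items>_label` into one expression per stream item.

    An expression is (steps, item, label): the steps still to apply, in
    application order, to an item currently available at `label`.
    """
    if not items:
        return []
    head, tail = items[0], items[1:]
    kind = term[0]
    if kind == "seq":        # embed sequential code; labels unchanged
        expr, tail_label = ([("seq", term[1])], head, label), label
    elif kind == "farm":     # head distributed by O; the tail follows it
        tail_label = policy(label, head)
        expr = unfold(term[1], [head], tail_label, policy)[0]
    elif kind == "pipe":     # 1st stage local, R maps the 2nd stage elsewhere
        steps1, _, here = unfold(term[1], [head], label, policy)[0]
        there = here + "2"   # assumed re-labeling: append the stage index
        steps2, _, _ = unfold(term[2], [head], there, policy)[0]
        expr, tail_label = (steps1 + [("R", there)] + steps2, head, here), label
    else:
        raise ValueError(f"unknown skeleton: {kind}")
    return [expr] + unfold(term, tail, tail_label, policy)

def show(expr):
    """Render an expression with the outermost operation on the left."""
    steps, item, label = expr
    ops = " ".join(f"R_{p}" if k == "R" else f"seq {p}" for k, p in reversed(steps))
    return f"{ops} <{item}>_{label}"

program = ("farm", ("pipe", ("seq", "f1"), ("seq", "f2")))
items = [f"x{i}" for i in range(1, 8)]
exprs = unfold(program, items, "", round_robin(2))
```

Joining `show(e)` for each element of `exprs` with `::` gives a flat sequence in which neither farm nor pipe appears any more, only seq stages, re-labelings and labeled items.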


Lithium axioms (DP skeletons)


Lithium rules overview


sp rules details

• Many semantics for each program
• i=j=1 is always possible, i.e. no stream parallelism is exploited
• All of them are “functionally confluent”, i.e. they describe the same in-out relationship
• All of them describe the same parallel behavior, but with different degrees of parallelism


Example (2-way, 2-stage pipeline)

farm (pipe (seq f1) (seq f2)) < x1, x2, x3, x4, x5, x6, x7 >

• Mark the beginning of the stream
• Add the stream label

• Apply the farm inner skeleton to the 1st element
• Recur on the tail and change the stream label
• Assume a round-robin scheduling policy O(l,x) with 2 elements (2 pipelines)
• Iterate the same operation on the whole stream
• farm has now disappeared
• two different labels on streams: 0 and 1

• Apply the pipe inner skeletons (stages) to the item
• A re-labeling operation R is introduced in the middle
• Iterate the same operation on the whole stream
• pipe has now disappeared
• two different labels on streams: 0 and 1
• two different labels on R: 02, 12


Example (continued)

• This formula can no longer be reduced by axioms

• The sp rule can be applied:

“Any rightmost sequence of expressions can be reduced provided the streams exploit different labels”

• In this case the longest sequence includes two expressions, i.e. the max. parallelism degree is 2 (matching the double-pipeline startup phase)


Example (continued)

• Due to the re-labeling we have 4 adjacent expressions exploiting different labels: 02, 12, 0, 1 – i.e. a max. parallelism degree of 4
• The step can be iterated up to the end of the stream
• The max. parallelism degree is 4 since no more than 4 different labels appear adjacently (easy to prove)
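The counting argument can be checked with a small simulation. This Python sketch is a simplified model rather than the semantics itself: each stream item is reduced to the sequence of labels it passes through (0 then 02, or 1 then 12, assuming the round-robin policy with 2 pipelines), and one sp step reduces the longest run of adjacent pending expressions, taken from the stream head, whose current labels are pairwise different. The function names are hypothetical.

```python
def label_paths(n_items, n_pipes=2):
    """Labels each item visits: farm worker w, then the re-labeled 2nd stage w2."""
    return [[str(i % n_pipes), f"{i % n_pipes}2"] for i in range(n_items)]

def sp_degrees(paths):
    """Apply sp steps until every item is fully reduced.

    Returns the parallelism degree (number of expressions reduced) of each step.
    """
    pending = [list(p) for p in paths]   # remaining labels per item, head first
    degrees = []
    while pending:
        seen, run = set(), 0
        for labels in pending:           # longest prefix with distinct labels
            if labels[0] in seen:
                break
            seen.add(labels[0])
            run += 1
        for labels in pending[:run]:     # each selected expression advances a stage
            labels.pop(0)
        pending = [labels for labels in pending if labels]
        degrees.append(run)
    return degrees

degrees = sp_degrees(label_paths(7))
```

For the seven-item stream of the example, `degrees` starts at 2 (the startup phase) and never exceeds 4, in line with the counts given above; `sum(degrees)` is 14, i.e. two stage applications per item.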


Example (continued)

• Count parallelism
• Count communications
• or reason about it

By iterating the SP rule we eventually get

That can be joined to form the output stream


Summary

Operational semantics for skeletons
• Describes both functional and parallel behavior
• User-defined mapping/scheduling
• User-defined comm/comp costs
• General, easy to extend
• No similar results within the skeleton community

Enables performance reasoning
• Skeleton normal form [PDCS99, FGCS03, web]
• Provably correct automatic optimizations

Formally describe your brand new skeleton and its performance


Mammography app. (Lithium)

[Figure: raw vs. optimized version; the optimized version is 15 – 20% better.]


Thank you

Questions?

www.di.unipi.it/~aldinuc


Stream skeletons

farm
• functionally the identity!
• a.k.a. parameter sweeping, embarrassingly parallel, replica manager, …
• for some other groups it is instead apply-to-all

pipe
• parallel functional composition
• pipe f1 f2 < x > computes f2 ( f1 x )
• f1 , f2 executed in parallel on different data items
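The functional (in-out) reading of these two skeletons fits in a few lines. The Python sketch below models a stream as an iterable and a skeleton as a function from stream to stream; it deliberately says nothing about parallel behavior, and the names are illustrative rather than Lithium's Java classes.

```python
def seq(f):
    """Wrap sequential code f: apply it to every item of the stream."""
    def run(stream):
        return (f(x) for x in stream)
    return run

def pipe(stage1, stage2):
    """Functional composition: the output stream of stage1 feeds stage2."""
    def run(stream):
        return stage2(stage1(stream))
    return run

def farm(worker):
    """Functionally the identity: a farm computes exactly what its worker does."""
    return worker

# Hypothetical stage functions standing in for f1 and f2.
def f1(x):
    return x + 1

def f2(x):
    return x * 2

program = farm(pipe(seq(f1), seq(f2)))
out = list(program([1, 2, 3]))   # each item x becomes f2(f1(x))
```

Since `farm` returns its worker unchanged, replacing `farm(pipe(...))` with `pipe(...)` cannot alter `out`; only the parallel behavior, which this sketch does not model, would differ.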


Describe skeletons

• Usually only the functional behavior is described
• Parallel behavior does matter for performance

Usually performance is described by cost formulas

[Formula: cost T of scan with operator Op, in terms of n, p, t_Op, g, l and comm_size]

• Doesn’t describe the behavior, just the cost
• What happens if Op is parallel?

• Not compositional
• Handmade for each architecture
• Data layout not described


Axioms (begin/end of the stream)

Begin of stream marking:

End of stream management:


An example of reduction