HaLoop: Efficient Iterative Data Processing On Large Scale Clusters by Yingyi Bu, Bill Howe, Magdalena Balazinska, & Michael D. Ernst. Presentation by Carl Erhard & Zahid Mian.
Feb 22, 2016

Transcript
Page 1: Presentation by  Carl Erhard & Zahid Mian

1

HaLoop: Efficient Iterative Data Processing On Large Scale Clusters by Yingyi Bu, Bill Howe, Magdalena Balazinska, & Michael D. Ernst

Presentation by Carl Erhard & Zahid Mian

Page 2: Citations

Many of the slides in this presentation were taken from the author's website, which can be found here: http://www.ics.uci.edu/~yingyib/

Page 3: Agenda

• Introduction / Motivation
• Example Algorithms Used
• Architecture of HaLoop
• Task Scheduling
• Caching and Indexing
• Experiments & Results
• Conclusion / Discussion

Page 4: Motivation

• MapReduce can't express recursion/iteration
• Lots of interesting programs need loops
– graph algorithms
– clustering
– machine learning
– recursive queries
• Dominant solution: use a driver program outside of MapReduce
• Hypothesis: making MapReduce loop-aware affords optimization
– …and lays a foundation for scalable implementations of recursive languages

Bill Howe, UW
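The slide's "driver program outside of MapReduce" pattern can be sketched in a few lines of Python (illustrative only: `run_job` stands in for submitting a full MapReduce job, and the friend data is a toy example):

```python
# The external-driver workaround: a client-side script owns the loop and
# resubmits a job until the output stops changing (a fixpoint).
def run_driver(run_job, initial, max_iters=100):
    current = initial
    for i in range(max_iters):
        result = run_job(current)      # stands in for one full MapReduce job
        if result == current:          # convergence check lives in the client
            return result, i
        current = result
    return current, max_iters

# Toy "job": extend a set of reachable pairs by one hop over an edge list.
edges = {("Eric", "Elisa"), ("Elisa", "Tom"), ("Elisa", "Harry")}

def one_hop(found):
    return found | {(a, c) for (a, b) in found for (x, c) in edges if x == b}

reached, iters = run_driver(one_hop, {("Eric", "Eric")})
```

Every pass pays full job-startup cost and the loop test itself runs outside the framework; these are the overheads a loop-aware runtime can remove.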

Page 5: Thesis – Make a Loop Framework

• Observation: MapReduce has proven successful as a common runtime for non-recursive declarative languages
– HIVE (SQL)
– Pig (RA with nested types)
• Observation: Many people roll their own loops
– Graphs, clustering, mining, recursive queries
– Iteration managed by external script
• Thesis: With minimal extensions, we can provide an efficient common runtime for recursive languages
– Map, Reduce, Fixpoint

Bill Howe, UW

Page 7: PageRank in MapReduce

[Figure: dataflow for PageRank as plain MapReduce. Map tasks read the current ranks Ri and the link table splits (L-split0, L-split1); reducers "Join & compute rank"; a second map/reduce pass does "Aggregate fixpoint evaluation"; the client checks "Converged?", sets i = i + 1, and loops until done.]

Bill Howe, UW

Page 8: PageRank in MapReduce

What's the Problem? L is loop invariant, but:
1. L is loaded on each iteration
2. L is shuffled on each iteration
3. Fixpoint evaluated as a separate MapReduce job per iteration

[Figure: the same dataflow as the previous slide, annotated with these three problem points.]

Bill Howe, UW
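A toy Python simulation of the loop (not Hadoop code; the link graph matches the example used later in the deck) makes the complaint concrete: the invariant table L is rescanned on every pass, even though only the small rank table changes:

```python
# L: the loop-invariant link table (same toy graph as the HaLoop example).
links = {
    "a.com": ["b.com", "c.com", "d.com"],
    "b.com": [],
    "c.com": ["a.com", "e.com"],
    "d.com": ["b.com"],
    "e.com": ["d.com", "c.com"],
}

def pagerank_iteration(ranks, damping=0.85):
    """One join-and-aggregate pass: the whole of L is walked again."""
    n = len(links)
    new_ranks = {url: (1 - damping) / n for url in links}
    for url, rank in ranks.items():        # join Ri with L ...
        for dest in links[url]:            # ... and emit rank contributions
            new_ranks[dest] += damping * rank / len(links[url])
    return new_ranks

ranks = {url: 1.0 / len(links) for url in links}
for _ in range(10):                        # the experiments run 10 iterations
    ranks = pagerank_iteration(ranks)      # L re-read and re-shuffled each time
```

In a real cluster each pass re-reads and re-shuffles L across the network; HaLoop's reducer input cache exists to avoid exactly that.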

Page 9: Example 2: Descendant Query

Friend: find all friends within two hops of Eric.

R0 = {Eric, Eric}
R1 = {Eric, Elisa}
R2 = {Eric, Tom; Eric, Harry}
R3 = {}

Bill Howe, UW
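The example can be run as a small semi-naive loop in Python (an illustration of the algorithm, not HaLoop code): each round joins only the newly discovered delta with the invariant Friend table, then removes tuples already seen:

```python
# Friend: the loop-invariant edge table from the slide.
friend = {("Eric", "Elisa"), ("Elisa", "Tom"), ("Elisa", "Harry")}

def descendants(start, max_hops):
    """People reachable from `start` in at most `max_hops` hops."""
    seen = {start}
    delta = {start}                    # delta S0 = {start}
    for _ in range(max_hops):
        # join the delta with Friend, then drop tuples we've already seen
        delta = {c for b in delta for (x, c) in friend if x == b} - seen
        if not delta:                  # R3 = {}: fixpoint reached, stop
            break
        seen |= delta
    return seen - {start}

within_two = descendants("Eric", max_hops=2)   # {"Elisa", "Tom", "Harry"}
```

The empty delta is the same signal as the slide's R3 = {}: nothing new was produced, so the loop stops.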

Page 10: Descendant Query in MapReduce

[Figure: dataflow. Map tasks read the current result Si and the friend table splits (Friend0, Friend1); reducers perform the Join (compute next generation of friends) and Dupe-elim (remove the ones we've already seen); the client asks "Anything new?", sets i = i + 1, and loops until done.]

Bill Howe, UW

Page 11: Descendant Query in MapReduce

What's the Problem? Friend is loop invariant, but:
1. Friend is loaded on each iteration
2. Friend is shuffled on each iteration

[Figure: the same dataflow as the previous slide, annotated with these two problem points.]

Bill Howe, UW

Page 12: HaLoop – The Solution

HaLoop offers the following solutions to these problems:
1. A new programming model & architecture for iterative programs
2. Loop-aware task scheduling
3. Caching for loop-invariant data
4. Caching for fixpoint evaluation

Page 13: HaLoop Architecture

Page 14: HaLoop Architecture

Note: the loop control (i.e., determining when execution has finished) is pushed from the application into the infrastructure.

Page 15: HaLoop Architecture

• HaLoop will work given the following is true:

Ri+1 = R0 ∪ (Ri ⋈ L)

• In other words, the next result is a join of the previous result and loop-invariant data L.
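That recurrence is directly executable. A minimal Python sketch (relations as sets of pairs, with the join matching R's second column against L's first) shows it converging on a transitive closure:

```python
def join(r, l):
    """R joined with L: (a, c) wherever (a, b) is in R and (b, c) is in L."""
    return {(a, c) for (a, b) in r for (x, c) in l if x == b}

def fixpoint(r0, l, max_iters=100):
    """Iterate R_{i+1} = R0 union (R_i join L) until R stops changing."""
    r = r0
    for _ in range(max_iters):
        nxt = r0 | join(r, l)
        if nxt == r:                   # nothing new: fixpoint reached
            return r
        r = nxt
    return r

# Transitive closure of a tiny chain: start from the edges themselves.
edges = {(1, 2), (2, 3), (3, 4)}
closure = fixpoint(edges, edges)       # adds (1, 3), (2, 4), then (1, 4)
```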

Page 16: HaLoop Programming Interface

• AddMap and AddReduce
  Used to add a map/reduce pair to the loop
• SetFixedPointThreshold
  Set a bound on the distance between iterations
• ResultDistance
  A function that returns the distance between iterations
• SetMaxNumOfIterations
  Set the maximum number of iterations the loop can take

Page 17: HaLoop Programming Interface (cont.)

• SetIterationInput
  A function which returns the input for a given iteration
• AddStepInput
  A function which allows injection of additional data between the Map and Reduce
• AddInvariantTable
  Add a table which is loop invariant

Page 18: API's in Action – PageRank in HaLoop

Page 19: PageRank In HaLoop

[Figure: the HaLoop pipeline MapRank → ReduceRank → MapAggregate → ReduceAggregate, looping over the rank table (initially R0) and the loop-invariant link table L.]

Page 20: PageRank In HaLoop

Input to MapRank is R0 ∪ L:

Source URL   Dest/Rank           Source File
a.com        1.0                 #2
a.com        b.com,c.com,d.com   #1
b.com        1.0                 #2
b.com                            #1
c.com        1.0                 #2
c.com        a.com, e.com        #1
d.com        1.0                 #2
d.com        b.com               #1
e.com        1.0                 #2
e.com        d.com,c.com         #1

Only these values are given to the reducer.

Page 21: PageRank In HaLoop

ReduceRank (Calculate New Rank) emits a new rank for each destination:

Destination   New Rank
b.com         1.5
c.com         1.5
d.com         1.5
a.com         1.5
e.com         1.5
b.com         1.5
d.com         1.5
c.com         1.5

Page 22: PageRank In HaLoop

MapAggregate is the Identity map: the table of new ranks from the previous slide passes through unchanged.

Page 23: PageRank In HaLoop

ReduceAggregate (Aggregate) sums the ranks per destination, producing R1:

Destination   New Rank
a.com         1.5
b.com         3.0
c.com         3.0
d.com         3.0
e.com         1.5

Page 24: PageRank In HaLoop

Compare R0 and R1. If not under threshold, repeat.

R1:
Destination   New Rank
a.com         1.5
b.com         3.0
c.com         3.0
d.com         3.0
e.com         1.5

R0:
Destination   New Rank
a.com         1.0
b.com         1.0
c.com         1.0
d.com         1.0
e.com         1.0

Page 25: PageRank In HaLoop

The next iteration's input to MapRank is R1 ∪ L:

Source URL   Dest/Rank           Source File
a.com        1.5                 #2
a.com        b.com,c.com,d.com   #1
b.com        1.5                 #2
b.com                            #1
c.com        1.5                 #2
c.com        a.com, e.com        #1
d.com        1.5                 #2
d.com        b.com               #1
e.com        1.5                 #2
e.com        d.com,c.com         #1

Page 26: HaLoop Inter-Iteration Locality

• One goal of HaLoop is to schedule map/reduce tasks on the same machine as the data.
– Scheduling the first iteration is no different than Hadoop.
– Subsequent iterations put tasks that access the same data on the same physical node.

Page 27: HaLoop Scheduling

• The master node keeps a map of node ID → filesystem partition.
• When a node becomes free, the master tries to assign a task related to data contained on that node.
• If a task is required on a node with a full load, it will utilize a nearby node.
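The scheduling rule can be sketched as follows (the data structures are assumed for illustration; HaLoop's actual scheduler lives inside Hadoop's master):

```python
# The master's view: which filesystem partitions live on which node.
node_partitions = {"node1": {"p0", "p1"}, "node2": {"p2"}, "node3": {"p3"}}

def assign_task(free_node, pending, nearby):
    """Give a freed node a task over its local data if possible,
    otherwise fall back to a partition held by a nearby node."""
    local = node_partitions[free_node] & pending
    if local:
        task = min(local)              # deterministic pick for the sketch
        pending.discard(task)
        return task                    # inter-iteration locality preserved
    for neighbor in nearby:            # full/busy node: use a nearby one
        remote = node_partitions[neighbor] & pending
        if remote:
            task = min(remote)
            pending.discard(task)
            return task
    return None                        # nothing left to schedule
```

Here `pending` holds the partitions that still need a task this iteration, and `nearby` is an ordered list of neighbor nodes to borrow from.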

Page 28: Caching And Indexing

• Mapper Input Cache
• Reducer Input Cache
• Reducer Output Cache
• Why is there no Mapper Output Cache?
• HaLoop indexes cached data
– Keys and values stored in separate local files
– Reduces I/O seek time (forward only)
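A toy version of that layout (file formats assumed for illustration): keys live in one local file along with the offset and length of each value in a second file, so a lookup scans the small key file forward and does a single seek into the value file:

```python
import os
import tempfile

def write_cache(pairs, key_path, val_path):
    """Write sorted (key, value) pairs: keys + offsets in one file, values in another."""
    with open(key_path, "w") as kf, open(val_path, "w") as vf:
        for key, value in sorted(pairs):
            kf.write(f"{key}\t{vf.tell()}\t{len(value)}\n")
            vf.write(value)

def read_value(key, key_path, val_path):
    with open(key_path) as kf:             # forward-only scan of the key file
        for line in kf:
            k, offset, length = line.rstrip("\n").split("\t")
            if k == key:
                with open(val_path) as vf:
                    vf.seek(int(offset))   # one seek into the value file
                    return vf.read(int(length))
    return None                            # key not cached

cache_dir = tempfile.mkdtemp()
key_file = os.path.join(cache_dir, "keys.idx")
val_file = os.path.join(cache_dir, "values.dat")
write_cache([("b.com", "1.0"), ("a.com", "b.com,c.com,d.com")], key_file, val_file)
```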

Page 29: Approach: Inter-iteration caching

[Figure: a loop body (map → reduce) annotated with four cache points:]
• Mapper input cache (MI)
• Mapper output cache (MO)
• Reducer input cache (RI)
• Reducer output cache (RO)

Bill Howe, UW

Page 30: RI: Reducer Input Cache

Bill Howe, UW

• Provides:
– Access to loop-invariant data without map/shuffle
• Used by:
– Reducer function
• Assumes:
1. Mapper output for a given table constant across iterations
2. Static partitioning (implies: no new nodes)

• PageRank
– Avoid shuffling the network at every step
• Transitive Closure
– Avoid shuffling the graph at every step
• K-means
– No help

Page 31: RO: Reducer Output Cache

Bill Howe, UW

• Provides:
– Distributed access to output of previous iterations
• Used by:
– Fixpoint evaluation
• Assumes:
1. Partitioning constant across iterations
2. Reducer output key functionally determines reducer input key

• PageRank
– Allows distributed fixpoint evaluation
– Obviates extra MapReduce job
• Transitive Closure
– No help
• K-means
– No help

Page 32: MI: Mapper Input Cache

Bill Howe, UW

• Provides:
– Access to non-local mapper input on later iterations
• Used:
– During scheduling of map tasks
• Assumes:
1. Mapper input does not change

• PageRank
– Subsumed by use of Reducer Input Cache
• Transitive Closure
– Subsumed by use of Reducer Input Cache
• K-means
– Avoids non-local data reads on iterations > 0

Page 33: Cache Rebuilding

• The cache must be reconstructed when:
– The hosting node fails
– The hosting node has a full load (the M/R job needs to be scheduled on a different, substitute node)
• The process is transparent to the user.

Page 34: Results: PageRank

– Run only for 10 iterations
– Join and aggregate in every iteration
– Overhead in the first step for caching input
– Catches up soon and outperforms Hadoop
– Low shuffling time: time between RPC invocation by reducer and sorting of keys

Page 35: Results: Descendant Query

– Join and duplicate elimination in every iteration.
– Less striking performance on LiveJournal: it is a social network with high fan-out, so excessive duplicate generation dominates the join cost and reducer input caching is less useful.

Page 36: Reducer Input Cache Benefit

Transitive Closure
Billion Triples Dataset (120GB)
90 small instances on EC2

[Chart: overall run time.]

Bill Howe, UW

Page 37: Reducer Input Cache Benefit

Transitive Closure
Billion Triples Dataset (120GB)
90 small instances on EC2

[Chart: overall run time, Livejournal, 12GB.]

Bill Howe, UW

Page 38: Reducer Input Cache Benefit

Bill Howe, UW

Transitive Closure
Billion Triples Dataset (120GB)
90 small instances on EC2

[Chart: reduce and shuffle time of the join step, Livejournal, 12GB.]

Page 39: Reducer Input Cache Benefit

[Figure: the PageRank-in-MapReduce dataflow again, with map tasks over Ri and the L-splits feeding the "Join & compute rank" reducers, followed by the "Aggregate fixpoint evaluation" job.]

Page 40: Reducer Output Cache Benefit

Bill Howe, UW

[Charts: fixpoint evaluation time (s) vs. iteration #, for the Livejournal dataset (50 EC2 small instances) and the Freebase dataset (90 EC2 small instances).]

Page 41: Mapper Input Cache Benefit

Bill Howe, UW

5% non-local data reads; ~5% improvement

Page 42: Conclusions

• Relatively simple changes to MapReduce/Hadoop can support arbitrary recursive programs
– TaskTracker (cache management)
– Scheduler (cache awareness)
– Programming model (multi-step loop bodies, cache control)
• Optimizations
– Caching loop-invariant data realizes the largest gain
– Good to eliminate the extra MapReduce step for termination checks
– Mapper input cache benefit inconclusive; need a busier cluster
• Future Work
– Analyze expressiveness of Map Reduce Fixpoint
– Consider a model of Map (Reduce+) Fixpoint

Page 43: The Good …

• HaLoop extends MapReduce:
– Easier programming of iterative algorithms
– Efficiency improvement due to loop awareness and caching
– Lets users reuse major building blocks from existing application implementations in Hadoop
– Fully backward compatible with Hadoop

Page 44: The Questionable …

• Only useful for algorithms which can be expressed as Ri+1 = R0 ∪ (Ri ⋈ L).
• Imposes constraints: fixed partition function for each iteration.
• Does not improve asymptotic running time: still O(M+R) scheduling decisions and O(M*R) state kept in memory, plus more overhead.
• Not completely novel: iMapReduce and Twister.
• People still do iteration using traditional MapReduce: Google, Nutch, Mahout…

Page 45: BACKUP


Page 47: API's in Action – Descendant Query

Page 48: Descendant Query in HaLoop

[Figure: the HaLoop pipeline MapJoin → ReduceJoin → MapDistinct → ReduceDistinct, looping over the delta ΔS0 and the loop-invariant friend table F.]

Page 49: Descendant Query in HaLoop

Input to MapJoin is ΔS0 ∪ F:

Eric    Eric    #2
Eric    Elisa   #1
Elisa   Tom     #1
Elisa   Harry   #1

Page 50: Descendant Query in HaLoop

After the join, each tuple is tagged with the iteration that produced it:

Eric    Eric    1
Eric    Elisa   1
Elisa   Tom     1
Elisa   Harry   1
Eric    Tom     1
Eric    Harry   1
Elisa   Eric    1

Page 51: Descendant Query in HaLoop

After duplicate elimination, the new delta ΔS1:

Eric    Eric
Eric    Elisa
Elisa   Tom
Elisa   Harry
Eric    Tom
Eric    Harry
Elisa   Eric

Page 52: Descendant Query in HaLoop

The next iteration's MapJoin input from ΔS1:

Eric    Eric    #2
Elisa   Eric    #2
Tom     Eric    #2
Harry   Eric    #2
Tom     Elisa   #2
Harry   Elisa   #2

Page 53: Descendant Query in HaLoop

After the join, the iteration-2 tuples:

Eric    Eric    2
Elisa   Eric    2
Tom     Eric    2
Harry   Eric    2
Tom     Elisa   2
Harry   Elisa   2

Page 54: Descendant Query in HaLoop

Iteration-2 tuples from the join:

Eric    Eric    2
Elisa   Eric    2
Tom     Eric    2
Harry   Eric    2
Tom     Elisa   2
Harry   Elisa   2

Step Input:

Eric    Eric    1
Eric    Elisa   1
Elisa   Tom     1
Elisa   Harry   1
Eric    Tom     1
Eric    Harry   1
Elisa   Eric    1
Eric    Eric    0

Result after duplicate elimination:

Eric    Eric
Tom     Eric
Harry   Eric
Tom     Elisa
Harry   Elisa
Eric    Elisa
Eric    Tom
Eric    Harry
Elisa   Tom
Elisa   Harry

Step Input allows the union of all iterations to be input.

Page 55: Fixpoint Algorithm Example