Giraph Neil Butcher

Giraph - University of Notre Dame · 2018. 10. 2. · •Developed by Apache •Based off ‘Pregel’ •Utilizes Hadoop MapReduce framework to target graph problems •Open Source

Aug 17, 2020

Giraph

Neil Butcher


Background
• Giraph: scalable platform for implementing graph algorithms
• Developed by Apache
• Based off ‘Pregel’
• Utilizes Hadoop MapReduce framework to target graph problems
• Open Source


Advantages of Solving Problems with Giraph
• Message-based communication: no locks
• Global synchronization: no semaphores
• Simple to program
• Massively parallel: task-based programming
• Fault tolerant: saves intermediate results


Giraph Algorithms: Basic Idea
• Algorithms are written from the perspective of a vertex
• Vertices send messages to each other to share pertinent information


How it Works
• The ‘compute’ function has the ability to:
– Modify the state of the vertex and its outgoing edges
– Send messages to other vertices
– Receive messages sent in the previous superstep
• Things that happen during a superstep:
– The ‘compute’ function is invoked on each vertex that received a message in the previous superstep
– The next superstep begins only after all vertices have completed their work
– If no messages are in flight, the program halts


Single Source Shortest Path Algorithm
• Read updates from other vertices, find minimum
• Send distance to other vertices
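
The two steps above (read the incoming updates and keep the minimum, then send the new distance onward) can be sketched as a small superstep simulation. This is plain Java, not the Giraph API: the class name `SsspSketch`, the `edges[v] = {target, weight}` layout, and seeding the source with a 0.0 message are assumptions made for illustration, and edge weights are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java simulation of vertex-centric SSSP (hypothetical, not the Giraph API). */
class SsspSketch {

    /** edges[v] holds {target, weight} pairs for the outgoing edges of vertex v. */
    static double[] shortestPaths(int[][][] edges, int source) {
        double[] dist = new double[edges.length];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);

        // Messages delivered at the start of the current superstep.
        Map<Integer, List<Double>> inbox = new HashMap<>();
        // Assumption: seed the source with distance 0 so it is active in superstep 0.
        List<Double> seed = new ArrayList<>();
        seed.add(0.0);
        inbox.put(source, seed);

        // Halt when no messages are in flight.
        while (!inbox.isEmpty()) {
            Map<Integer, List<Double>> outbox = new HashMap<>();
            // "compute" runs only on vertices that received a message.
            for (Map.Entry<Integer, List<Double>> entry : inbox.entrySet()) {
                int v = entry.getKey();
                // Read updates from other vertices, find minimum.
                double min = Double.POSITIVE_INFINITY;
                for (double message : entry.getValue()) {
                    min = Math.min(min, message);
                }
                if (min < dist[v]) {
                    dist[v] = min;
                    // Send distance to other vertices.
                    for (int[] edge : edges[v]) {
                        outbox.computeIfAbsent(edge[0], k -> new ArrayList<>())
                              .add(min + edge[1]);
                    }
                }
            }
            // Global barrier: the next superstep sees only the messages sent in this one.
            inbox = outbox;
        }
        return dist;
    }
}
```

Vertices that are never reached keep a distance of infinity, since they never receive a message and so never run their compute step.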


Single Source Shortest Path Example
[Slides 6 to 10: example figures not preserved in this transcript]

More Complex Example: PageRank
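
As a rough illustration only, and not the code from the slide, a common vertex-centric formulation of PageRank has every vertex sum its incoming messages, update its rank, and send its rank divided by its out-degree to each neighbor. The sketch below simulates that in plain Java. The class name `PageRankSketch`, the fixed number of supersteps, and the damping parameter are assumptions, and every vertex is assumed to have at least one outgoing edge.

```java
import java.util.Arrays;

/** Plain-Java simulation of vertex-centric PageRank (hypothetical, not the Giraph API). */
class PageRankSketch {

    /** outEdges[v] lists the targets of the outgoing edges of vertex v. */
    static double[] pageRank(int[][] outEdges, int supersteps, double damping) {
        int n = outEdges.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);

        for (int step = 0; step < supersteps; step++) {
            // Messages are summed as they arrive, one running total per target vertex.
            double[] incoming = new double[n];
            for (int v = 0; v < n; v++) {
                // Each vertex splits its current rank evenly over its outgoing edges.
                for (int target : outEdges[v]) {
                    incoming[target] += rank[v] / outEdges[v].length;
                }
            }
            // After the barrier, each vertex computes its new rank from what it received.
            for (int v = 0; v < n; v++) {
                rank[v] = (1.0 - damping) / n + damping * incoming[v];
            }
        }
        return rank;
    }
}
```

For example, `PageRankSketch.pageRank(new int[][] {{1}, {2}, {0}}, 30, 0.85)` runs 30 supersteps on a three-vertex cycle. Unlike shortest path, this computation does not stop when messages run out, so the sketch stops after a fixed number of supersteps.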


Giraph Job Lifetime


Implementing an Algorithm in Giraph
• Define a Vertex class
– Subclass of existing implementations
• Define a VertexInputFormat to read the graph
• Define a VertexOutputFormat that defines how to extract the result based on the Vertex final state
• Many other features can be utilized to improve performance


Aggregators
• Each vertex can store values that can be read by all vertices in the following superstep
• Can maintain values (sum, min, max, accumulate, user defined, etc.)
• Aggregators must be registered on the master
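
The timing described above, where values contributed in one superstep become readable in the next, can be sketched with a minimal sum aggregator. This is a plain-Java illustration, not Giraph's aggregator classes: `SumAggregatorSketch` and its method names are hypothetical, and `endSuperstep()` stands in for the barrier at which contributions are merged.

```java
/** Plain-Java sketch of a sum aggregator (hypothetical, not the Giraph API). */
class SumAggregatorSketch {
    private double pending;    // contributions made during the current superstep
    private double published;  // value visible to vertices, from the previous superstep

    /** Called by a vertex during compute to contribute a value. */
    void aggregate(double value) {
        pending += value;
    }

    /** Value readable by all vertices; reflects the previous superstep only. */
    double getAggregatedValue() {
        return published;
    }

    /** Stand-in for the superstep barrier: publish the total and reset for the next round. */
    void endSuperstep() {
        published = pending;
        pending = 0.0;
    }
}
```

Resetting the total at each barrier is one design choice. An aggregator that accumulates across supersteps would skip the reset.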


Combiners
• User-defined function to combine messages before being sent or delivered
• Saves on network and memory
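
A minimal sketch of the idea, assuming messages are plain numbers addressed to integer vertex ids: all messages bound for the same vertex are folded into one with a user-supplied function before delivery. The class `CombinerSketch` and the parallel-array message layout are hypothetical and are not Giraph's combiner interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Plain-Java sketch of message combining (hypothetical, not the Giraph API). */
class CombinerSketch {

    /**
     * targets[i] is the destination vertex of the message values[i].
     * Returns one combined message per destination vertex.
     */
    static Map<Integer, Double> combine(int[] targets, double[] values,
                                        BinaryOperator<Double> combiner) {
        Map<Integer, Double> combined = new HashMap<>();
        for (int i = 0; i < targets.length; i++) {
            // The first message for a target is stored as-is; later ones are folded in.
            combined.merge(targets[i], values[i], combiner);
        }
        return combined;
    }
}
```

For shortest path, a minimum combiner such as `(a, b) -> Math.min(a, b)` fits, because a vertex only ever uses the smallest distance it receives. Three messages to the same vertex then travel as one.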


Checkpointing
• Can be expensive but necessary
• Ensures no single point of failure
• Store work at user-defined intervals
• Restart on failure
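
The store-and-restart cycle can be sketched with a toy computation in which every superstep adds 1 to each vertex value. Everything here is hypothetical: the class `CheckpointSketch`, the in-memory "checkpoint", and the single simulated failure at superstep `failAt` are illustration only, and a real job would write checkpoints to durable storage.

```java
/** Plain-Java sketch of checkpoint and restart (hypothetical, not the Giraph API). */
class CheckpointSketch {

    /**
     * Runs the given number of supersteps, saving state every `interval` supersteps
     * (interval must be positive). Fails once at superstep `failAt`; a negative
     * value means no failure.
     */
    static int[] run(int[] initial, int supersteps, int interval, int failAt) {
        int[] state = initial.clone();
        int[] checkpoint = initial.clone(); // last saved state
        int checkpointStep = 0;             // superstep at which that state was saved
        boolean failed = false;
        int step = 0;
        while (step < supersteps) {
            if (step % interval == 0) {
                // Store work at the user-defined interval.
                checkpoint = state.clone();
                checkpointStep = step;
            }
            if (step == failAt && !failed) {
                // Simulated failure: restart from the last checkpoint.
                failed = true;
                state = checkpoint.clone();
                step = checkpointStep;
                continue;
            }
            // The toy "compute": add 1 to every vertex value.
            for (int i = 0; i < state.length; i++) {
                state[i] += 1;
            }
            step++;
        }
        return state;
    }
}
```

The run with a failure produces the same final state as the run without one. The cost is the repeated supersteps between the last checkpoint and the failure, which is the trade-off behind choosing the interval.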


ZooKeeper Responsibilities: Computation State
• Handles partition/worker mapping
• Global state
• Checkpoint paths, aggregator values, statistics


Master Responsibilities: Coordination
• Assigns partitions to workers
– Hash mapping is default
– Can be user defined
• Monitors workers
• Coordinates supersteps (ending, starting, etc.)
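
One way to picture a hash mapping is to hash the vertex id into a partition and then map partitions onto workers. The sketch below is an assumption-laden illustration, not Giraph's partitioner: the class `HashPartitionSketch`, the use of `Math.floorMod`, and the round-robin assignment of partitions to workers are all hypothetical choices.

```java
/** Plain-Java sketch of hash partitioning (hypothetical, not the Giraph API). */
class HashPartitionSketch {

    /** Maps a vertex id to a partition in [0, partitionCount). */
    static int partitionOf(Object vertexId, int partitionCount) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(vertexId.hashCode(), partitionCount);
    }

    /** Assumption: partitions are spread over workers round-robin. */
    static int workerOf(int partition, int workerCount) {
        return partition % workerCount;
    }
}
```

A user-defined mapping would replace `partitionOf`, for instance to keep vertices that exchange many messages in the same partition.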


Worker Responsibilities: Vertices
• Workers are assigned vertices
• Perform compute
• Pass messages between vertices
• Compute local aggregation values
