Distributed Systems Tree and Flood Algorithms Rik Sarkar University of Edinburgh 2016/2017
Distributed Computation
• How to send messages to all nodes efficiently
• How to compute sums of values at all nodes efficiently
• Broadcasting messages
• Computing sums in a tree
• Computing trees in a network
Distributed Systems, Edinburgh, 2016
Ref: NL
Network as a graph
• Diameter
  – The maximum distance between 2 nodes in the network
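• Radius
  – Roughly half the diameter (formally, the smallest maximum distance from any one node to all others, which is at least half the diameter)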
• Spanning tree of a graph:
  – A subgraph which is a tree, and reaches all nodes of the graph
  – If network has n nodes
    • How many edges does a spanning tree have?
Computing sums in a tree
• Suppose root wants to know sum of values at all nodes
• It sends “compute” message to all children
• They forward the message to all their children (unless it is a leaf node)
• The values move upward from leaves
• Each node adds values from all children and its own value
• Sends it to its parent
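The steps above can be sketched as a small recursive simulation. This is only an illustration under stated assumptions: the tree is given as a `children` dictionary, and the names `tree_sum`, `children` and `values` are hypothetical, not part of the lecture. Descending into a child stands in for the “compute” message, and the returned number stands in for the partial sum sent to the parent.

```python
def tree_sum(children, values, node):
    """Return the sum of the values in the subtree rooted at `node`.

    The recursive call models the "compute" message going down the tree;
    the return value models the partial sum travelling back up.
    """
    total = values[node]                      # the node's own value
    for child in children.get(node, []):      # a leaf has no children
        total += tree_sum(children, values, child)
    return total                              # "sent" to the parent


# Hypothetical tree: root 'r' has children 'a' and 'b'; 'a' has children 'c' and 'd'.
children = {"r": ["a", "b"], "a": ["c", "d"]}
values = {"r": 1, "a": 2, "b": 3, "c": 4, "d": 5}
root_total = tree_sum(children, values, "r")
print(root_total)
```

In a real network each node would run its own copy of this logic and exchange messages; the recursion merely mimics that order of events on one machine.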
Computing sums in a tree
• What can you compute other than sums?
• How many messages does it take?
• How much time does it take?
Global Message broadcast
• Message must reach all nodes in the network
  – Different from broadcast transmission in LAN
  – All nodes in a large network cannot be reached with single transmission
Flooding for Broadcast
• The source sends a Flood message to all neighbors
• The message has
  – Type Flood
  – Unique id: (source id, message seq)
  – Data
Flooding for Broadcast
• The source sends a Flood message, with a unique message id, to all neighbors
• Every node p that receives a flood message m, does the following:
  – If m.id was seen before, discard m
  – Otherwise, add m.id to list of previously seen messages and send m to all neighbors of p
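The rule above can be tried out with a minimal single-machine simulation. The function name `flood`, the adjacency dictionary `neighbors` and the example network are assumptions made for illustration. The sketch also assumes that the source records its own message id as seen, so that copies coming back to it are discarded.

```python
from collections import deque


def flood(neighbors, source, msg_id):
    """Simulate flooding one message; return (nodes reached, messages sent)."""
    seen = {node: set() for node in neighbors}    # per-node list of seen ids
    seen[source].add(msg_id)                      # assumption: source marks its own id
    # Each queue entry is (receiver, message id) for one message in flight.
    in_flight = deque((q, msg_id) for q in neighbors[source])
    sent = len(neighbors[source])
    while in_flight:
        p, m_id = in_flight.popleft()
        if m_id in seen[p]:
            continue                              # seen before: discard m
        seen[p].add(m_id)                         # remember the id ...
        for q in neighbors[p]:                    # ... and send m to all neighbors of p
            in_flight.append((q, m_id))
            sent += 1
    reached = {p for p in neighbors if msg_id in seen[p]}
    return reached, sent


# Hypothetical 4-node network: a square 0-1-2-3-0.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
reached, sent = flood(neighbors, source=0, msg_id=(0, 1))
print(sorted(reached), sent)
```

In this sketch every node forwards the message exactly once to each of its neighbors, so the number of messages equals the sum of the node degrees, that is 2|E|.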
Flooding for broadcast
• Storage
  – Each node needs to store a list of flood ids seen before
  – If a protocol requires x floods, then each node must store x ids
    • (there is a way to reduce this. Think!)
Assumptions
• We are assuming:
  – Nodes are working in synchronous communication rounds (e.g. transmissions occur in intervals of 1 second exactly)
  – Messages from all neighbors arrive at the same time, and are processed together
  – In each round, each node can successfully send 1 message to each neighbor
  – Any necessary computation can be completed before the next round
Communication complexity
• The message/communication complexity is:
  – O(|E|)
  – Worst case: O(n^2)
Reducing Communication complexity (slightly)
• Node p need not send message m to any node from which it has already received m
  – Needs to keep track of which nodes have sent the message
  – Saves some messages
  – Does not change asymptotic complexity
Time complexity
• The number of rounds needed to reach all nodes: at most the diameter of G
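A round-by-round sketch makes the round count concrete. It assumes the synchronous model stated earlier (all sends of a round are delivered together). The names `flood_rounds` and `path` are hypothetical and chosen only for this example.

```python
def flood_rounds(neighbors, source):
    """Return the number of synchronous rounds until every node has the message."""
    informed = {source}
    frontier = [source]          # nodes that first received the message last round
    rounds = 0
    while len(informed) < len(neighbors):
        next_frontier = []
        for p in frontier:                    # all sends of one round happen together
            for q in neighbors[p]:
                if q not in informed:
                    informed.add(q)
                    next_frontier.append(q)
        if not next_frontier:
            raise ValueError("network is not connected")
        frontier = next_frontier
        rounds += 1
    return rounds


# Hypothetical path network 0-1-2-3-4, whose diameter is 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
worst_case = max(flood_rounds(path, s) for s in path)
print(flood_rounds(path, 0), flood_rounds(path, 2), worst_case)
```

A source at the end of the path needs as many rounds as the diameter, while a source in the middle needs fewer; the diameter is the worst case over all possible sources.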
Computing Tree from a network
• BFS tree
  – The Breadth first search tree
  – With a specified root node
BFS Tree
• Breadth first search tree
  – Every node has a parent pointer
  – And zero or more child pointers
  – BFS Tree construction algorithm sets these pointers
BFS Tree Construction algorithm
• Breadth first search tree
  – The root (source) node decides to construct a tree
  – Uses flooding to construct a tree
  – Every node p on getting the message forwards to all neighbors
  – Additionally, every node p stores parent pointer: node from which it first received the message
    • If multiple neighbors had first sent p the message in the same round, choose parent arbitrarily. E.g. node with smallest id
  – p informs its parent of the selection
    • Parent creates a child pointer to p
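The construction can be sketched as a round-based simulation. The function name `build_bfs_tree` and the example network are assumptions made for illustration. Ties between neighbors that deliver the message in the same round are broken by smallest id, which is one of the arbitrary choices the algorithm allows.

```python
def build_bfs_tree(neighbors, root):
    """Return (parent, children) pointers built by a synchronous flood from root."""
    parent = {root: None}
    children = {p: [] for p in neighbors}
    frontier = [root]
    while frontier:
        # For each node not yet in the tree, collect every neighbor
        # that sends it the message in this round.
        senders = {}
        for p in frontier:
            for q in neighbors[p]:
                if q not in parent:
                    senders.setdefault(q, []).append(p)
        for q, candidates in senders.items():
            chosen = min(candidates)        # tie-break: node with smallest id
            parent[q] = chosen              # q stores its parent pointer
            children[chosen].append(q)      # q informs parent; parent adds child pointer
        frontier = list(senders)            # newly added nodes forward next round
    return parent, children


# Hypothetical network: square 0-1-3-2-0 with an extra node 4 attached to 3.
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
parent, children = build_bfs_tree(neighbors, root=0)
print(parent)
```

Node 3 hears from both 1 and 2 in the same round and picks 1 as its parent, so node 2 ends up as a leaf.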
BFS Tree
• Property: BFS tree is a shortest path tree
  – For source s and any node p
  – The shortest path between s and p is contained in the BFS tree
Time & message complexity
• Asymptotically same as Flooding
Tree based broadcast
• Send message to all nodes using tree
  – BFS tree is a spanning tree: connects all nodes
• Flooding on the tree
• Receive message from parent, send to children
Tree based broadcast
• Simpler than flooding: send message to all children
• Communication: Number of edges in spanning tree: n-1
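The n-1 message count can be checked with a short sketch. It assumes the tree is already built and given as a hypothetical `children` dictionary; the function name `tree_broadcast` is likewise illustrative.

```python
def tree_broadcast(children, root):
    """Broadcast from root down a tree; return (delivery order, messages sent)."""
    delivered = [root]
    messages = 0
    stack = [root]
    while stack:
        p = stack.pop()
        for c in children.get(p, []):    # send only to children, never back to the parent
            messages += 1
            delivered.append(c)
            stack.append(c)
    return delivered, messages


# Hypothetical spanning tree on n = 5 nodes, rooted at node 0.
children = {0: [1, 2], 1: [3], 3: [4]}
delivered, messages = tree_broadcast(children, root=0)
print(len(delivered), messages)
```

Each tree edge carries exactly one message, so the count is one less than the number of nodes reached.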
Aggregation: Find the sum of values at all nodes
• With BFS tree
• Start from leaf nodes
  – Nodes without children
  – Send the value to parent
• Every other node:
  – Wait for all children to report
  – Sum values from children + own value
  – Send to parent
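The "wait for all children to report" step can be sketched with a per-node counter of outstanding children. This is a single-machine illustration under the assumption that the tree is given as parent pointers; `convergecast_sum`, `parent` and `values` are hypothetical names.

```python
from collections import deque


def convergecast_sum(parent, values):
    """Sum all values at the root; return (total, messages sent)."""
    pending = {p: 0 for p in parent}            # children that have not reported yet
    for p, par in parent.items():
        if par is not None:
            pending[par] += 1
    partial = dict(values)                      # each node starts with its own value
    ready = deque(p for p in parent if pending[p] == 0)   # leaves start
    messages = 0
    total = None
    while ready:
        p = ready.popleft()
        par = parent[p]
        if par is None:                         # the root has heard from all children
            total = partial[p]
            continue
        partial[par] += partial[p]              # p sends its partial sum to its parent
        messages += 1
        pending[par] -= 1
        if pending[par] == 0:                   # parent has waited for all children
            ready.append(par)
    return total, messages


# Hypothetical BFS tree given as parent pointers (root is node 0).
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 3}
values = {0: 10, 1: 20, 2: 30, 3: 40, 4: 50}
total, messages = convergecast_sum(parent, values)
print(total, messages)
```

Every node except the root sends exactly one message, so the upward phase uses n-1 messages in this sketch.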
Aggregation
• Without the tree
• Flood from all nodes:
  – O(|E|) cost per node
  – O(n*|E|) total cost: expensive
  – Each node needs to store flood ids from n nodes
    • Requires Ω(n) storage at each node
  – Good fault tolerance
    • If a few nodes fail during operation, all the rest still get some value
Aggregation
• With Tree
• Also called Convergecast
Aggregation
• With Tree
• Once tree is built, any node can use it for broadcast
  – Just flood on the tree
• Any node can use it for convergecast
  – First flood a message on the tree requesting data
  – Nodes store parent pointer
  – Then receive data
• What is the drawback of tree based aggregation?
• Fault tolerance not very good
  – If a node fails, the messages in its subtree will be lost
  – Will need to rebuild the tree for future operations
Computing Trees:
• What if the edges have weights?
Aggregation using Trees:
• What if the edges have weights?
• The cost may not be O(n) since weights can be higher
• How to get the best performance?
Minimum spanning tree is
• A spanning tree (reaches all nodes)
• With minimum possible total weight
• How can we compute a minimum spanning tree efficiently in a distributed system?
• (remember, a node knows only its neighbors and edge weights)
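To make "minimum possible total weight" concrete, the sketch below computes a minimum spanning tree with Kruskal's algorithm. This is a centralized method swapped in purely for illustration: it assumes one process knows every edge, so it is not an answer to the distributed question above. The names `kruskal_mst` and `edges` and the example weights are hypothetical.

```python
def kruskal_mst(nodes, edges):
    """Return (tree edges, total weight) of a minimum spanning tree.

    Centralized: assumes global knowledge of all edges, unlike a node in a
    distributed system, which knows only its neighbors and edge weights.
    """
    leader = {v: v for v in nodes}          # union-find forest

    def find(v):
        while leader[v] != v:
            leader[v] = leader[leader[v]]   # path halving
            v = leader[v]
        return v

    tree, total = [], 0
    for w, u, v in sorted(edges):           # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                        # edge joins two components: keep it
            leader[ru] = rv
            tree.append((u, v, w))
            total += w
    return tree, total


# Hypothetical weighted network on 4 nodes; each edge is (weight, u, v).
edges = [(1, 0, 1), (4, 1, 2), (3, 0, 2), (2, 2, 3), (5, 1, 3)]
tree, total = kruskal_mst(range(4), edges)
print(len(tree), total)
```

The result has n-1 edges, as any spanning tree does, and no other spanning tree of this example network has a smaller total weight.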