Corona: Robust Low Atomicity Peer-To-Peer Systems
Rizal Mohd Nor, Mikhail Nesterenko, Christian Scheideler


Feb 16, 2016


Transcript


Corona
Robust Low Atomicity Peer-To-Peer Systems
Rizal Mohd Nor, Mikhail Nesterenko, Christian Scheideler

Overlay Networks and Stabilization
- self-stabilization guarantees that, starting from an arbitrary initial state, the system will converge to a legal state in finite time
- stabilization is good for overlay networks:
  - an overlay network is an effective way to distribute information at scale
  - many users constantly leave and join
  - faults and inconsistencies are the norm
  - esoteric faulty states may be reached
  - large scale precludes centralized fault tolerance and initialization

Outline
- overlay networks and programming model
- necessary conditions
- corona
  - bottom level
  - skip-list

Overlay Network Terminology
- in an overlay network, any pair of peers can establish a link
- topology
  - unstructured: low maintenance, high search cost
  - structured: predictable performance, fast search, higher maintenance
- our focus: structured overlay networks

Related Work
Self-stabilizing overlay networks:
- Sorted lists and rings [Cramer & Fuhrmann 2005] [Shaker & Reeves 2005] [Onus & Richa & S 2007]
- Skip lists [Clouser & Nesterenko & S 2008]
- Delaunay graph [Jacob & Ritscher & S & Schmid 2009]
- Skip graph [Jacob & Richa & S & Schmid & Täubig 2009]
- Chord [Kniesburges & Koutsopoulos & S 2011]
- De Bruijn graph [Richa & S & Stevens 2011]
- Universal [Berns & Ghosh & Pemmaraju 2011]

Only simple communication models: shared memory or synchronous message passing

Our Approach
Self-stabilizing skip list (and skip graph) for the asynchronous message passing model

Basic assumptions:
- actions: triggered by messages (or by a true predicate), only compare-store-send
- atomic action execution
- weak fairness: an action enabled in all but finitely many states will be executed infinitely often
- fair message receipt: every message is eventually received (i.e., triggers an action)
- no FIFO needed!

Necessary Conditions
Graph model:
- nodes with unique, ordered identifiers (ids)
- directed links
  - application level: knowledge (msgs, connections)
  - network level: reliable channels of unbounded capacity
[Figure: nodes a, b, c, d, e]

Necessary Conditions
- graph has to be initially weakly connected

- ids of non-existing nodes are not present in the system
[Figure: nodes a, b, c, d, e] A disconnected graph will always remain disconnected
[Figure: nodes a, b, c, d, e] A non-existing identifier, or an unresponsive one, may leave the graph disconnected forever

Skip-List
- skip-list: enables searches and updates in logarithmic work
- level 0 is a sorted list of nodes (peers)
- only a fraction of the nodes is promoted to each subsequent level
- 1-2 skip-list: at each level i > 0, two nodes a and b are neighbors at level i if the distance between a and b at level i-1 is no less than 2 and no more than 3 hops
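The 1-2 skip-list condition above can be checked mechanically. The following is a minimal sketch, not the authors' code: the function names and the example levels over nodes a to k are assumptions chosen for illustration.

```python
def is_one_two_level(lower, upper):
    """True if consecutive nodes of `upper` are 2 or 3 hops apart in `lower`."""
    pos = {node: i for i, node in enumerate(lower)}
    if any(node not in pos for node in upper):
        return False  # a promoted node must exist one level below
    idx = [pos[node] for node in upper]
    return all(2 <= b - a <= 3 for a, b in zip(idx, idx[1:]))


def is_one_two_skip_list(levels):
    """Check every pair of adjacent levels, bottom level first."""
    return all(is_one_two_level(lo, up) for lo, up in zip(levels, levels[1:]))


# Assumed example: level 0 is the sorted list, higher levels are promotions.
levels = [
    list("abcdefghik"),  # level 0
    list("behk"),        # level 1
    list("ek"),          # level 2
    list("k"),           # level 3
]
print(is_one_two_skip_list(levels))
```

A level whose promoted nodes are adjacent below (1 hop) or more than 3 hops apart fails the check.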

[Figure: 1-2 skip list over nodes a, b, c, d, e, f, g, h, i, k, levels 0 to 3]

Execution Model
- notation
  - left node: lower id
  - right node: higher id
  - process UP state: promoted to upper levels
  - process DOWN state: not promoted to upper levels
- channel links act as storage
[Figure: nodes a, b, c, d, e]

L-corona (bottom level)
- invariant: at the bottom level, each node stores the IDs of the closest left and right neighbors seen so far (in l and r)
- actions
  - receive ID from right: update r, forward remaining ID to r
  - receive ID from left: update l, forward remaining ID to l
  - true: send own ID to r & l
[Figure: nodes a, b, c, d, e]

L-corona (example, one step per slide; the actions list repeats on every slide)
1. Node a sends its id
2. Node e receives a's id, forwards to c
3. Node c receives a's id, forwards to b
4. Node b receives a's id, sets its new left neighbor
5. Node b sends its id to node a
6. Node a receives b's id, sets its new right neighbor

L-corona
Processes execute actions one by one
[Figure: seven animation frames over nodes a, b, c, d, e, with ids c, d and e in transit]

S-corona
- constructs the 1-2 skip list at each level
- lower levels have to stabilize before a level can stabilize
- actions for process p
  - receive state from right: if (right & p state = UP) then p state := DOWN
  - receive state from left: if (left & right & p state = DOWN) then p state := UP
  - true: send state to right & left nodes
- starting from an arbitrary state where level i is linearized by L-corona
- state is always UP if right neighbor is undefined
[Figure: level i and level i+1]

S-corona (example)
[Figure: level i and level i+1 over nodes a, b, c, d, e, f, g, h, i, k] Left and right neighbor state is DOWN
[Figure: level i and level i+1 over nodes a, b, c, d, e, f, g, h, i, k] Change state to UP
[Figure: three further animation frames of level i and level i+1]

Correctness Proof Outline
- level by level stabilization proof
- assume bottom level is connected
- stabilization of L-corona
  - closure: a link connecting consecutive nodes is never removed
  - convergence: a path connecting consecutive nodes is always shortened
- stabilization of S-corona (by level)
  - assume lower level(s) are stable
  - closure: a node's UP/DOWN state only changes a finite number of times
  - convergence: the predicate rules of a skip-list are eventually satisfied

Skip-graph
- a single node can disconnect a skip list
- solution: concurrently construct a skip-list out of the remaining nodes that are not upgraded to the next levels
[Figure: skip-graph with 2 layers over nodes a, b, c, d, e, f, g, h, i, k, levels 0 to 3]

Conclusion
Corona: first self-stabilizing overlay network under asynchronous message passing

Open problems:
- What if ids of non-existing nodes are present in the system?
- Churn
- Byzantine behavior
- Fault containment

Any questions?
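The L-corona actions from the bottom-level slides can be tried out in a small simulation. This is a sketch under stated assumptions, not the authors' implementation: integer ids 1 to 5 stand in for nodes a to e, the initial neighbor table is made up (but weakly connected), channels start empty, and a simple round-based scheduler replaces the asynchronous one. "Forward remaining ID" is read here as: keep the closer id, pass the other one on toward it. All names (`Node`, `receive`, `timeout`, `run`) are hypothetical.

```python
from collections import deque


class Node:
    """One peer at the bottom level: own id plus closest known neighbors."""

    def __init__(self, ident, left=None, right=None):
        self.id = ident
        self.l = left   # closest smaller id seen so far, or None
        self.r = right  # closest larger id seen so far, or None


def receive(node, ident, channel):
    """Compare-store-send step for one received id; messages are (dest, id)."""
    if ident > node.id:                      # id received from the right
        if node.r is None:
            node.r = ident
        elif ident < node.r:                 # closer: keep it, pass old r on
            channel.append((ident, node.r))
            node.r = ident
        elif ident > node.r:                 # farther: delegate it to r
            channel.append((node.r, ident))
    elif ident < node.id:                    # id received from the left
        if node.l is None:
            node.l = ident
        elif ident > node.l:                 # closer: keep it, pass old l on
            channel.append((ident, node.l))
            node.l = ident
        elif ident < node.l:                 # farther: delegate it to l
            channel.append((node.l, ident))


def timeout(node, channel):
    """The always-enabled action: send own id to l and r."""
    for dest in (node.l, node.r):
        if dest is not None:
            channel.append((dest, node.id))


def is_linearized(nodes):
    """True if every node points exactly at its sorted-order neighbors."""
    ids = sorted(nodes)
    for i, ident in enumerate(ids):
        want_l = ids[i - 1] if i > 0 else None
        want_r = ids[i + 1] if i + 1 < len(ids) else None
        if (nodes[ident].l, nodes[ident].r) != (want_l, want_r):
            return False
    return True


def run(nodes, max_rounds=100):
    """Rounds of: every node fires its timeout, then all messages are delivered."""
    channel = deque()
    rounds = 0
    while not is_linearized(nodes) and rounds < max_rounds:
        for ident in sorted(nodes):
            timeout(nodes[ident], channel)
        while channel:
            dest, ident = channel.popleft()
            receive(nodes[dest], ident, channel)
        rounds += 1
    return rounds


# Assumed initial state: a knows e, c knows b, d knows e, e knows c.
nodes = {
    1: Node(1, right=5),
    2: Node(2),
    3: Node(3, left=2),
    4: Node(4, right=5),
    5: Node(5, left=3),
}
rounds = run(nodes)
table = {i: (n.l, n.r) for i, n in sorted(nodes.items())}
print(rounds, table)
```

Each forwarded id moves strictly closer to its final position, so message delivery within a round always terminates. The real algorithm does not depend on rounds or on delivery order.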

Topology Updates (Joins)
- Joins require L-corona to linearize the lower level first; S-corona then restores the correct 1-2 skip list structure
- Consider a node c' joining at node c
[Figure: level i and level i+1 over nodes a, b, c, c', d] Node c' sends its ID to c
[Figure: level i and level i+1 over nodes a, b, c, c', d] Nodes c and d update their right and left neighbors following the L-corona algorithm

Topology Updates (Joins)
- Once the lower level stabilizes, S-corona can correctly stabilize
[Figure: level i and level i+1 over nodes a, b, c, c', d]
[Figure: level i and level i+1 over nodes a, b, c, c', d] Only node c will be enabled to change state to UP by following the S-corona algorithm

Topology Updates (departures)
- Similar to joins, L-corona must stabilize before S-corona can correctly stabilize.

[Figure: level i and level i+1 over nodes a, b, c, c', d] Departing node c'
[Figure: level i and level i+1 over nodes a, b, c, d] From the algorithm, node c receives node d's UP status
[Figure: level i and level i+1 over nodes a, b, c, d] Node c is UP and node d is UP, so node c changes its state to DOWN

Minimal Detectors
- recall: it is necessary for
  - the channel connectivity graphs to be initially connected
  - non-existing identifiers to be not present in the system
- thus, crashed nodes have to be removed from the system
- proposed solution:
  - redesign detectors used by Chandra & Toueg to detect failed nodes
  - remove crashed ids
- challenge
  - identify the weakest kind of detectors sufficient for self-stabilization
  - identify absolute minimality of these detectors

Byzantine fault containment
- a Byzantine node is a faulty node that behaves arbitrarily
  - can corrupt its own local state
  - send arbitrary messages to neighboring nodes
  - may also send messages aimed to disrupt the peer-to-peer system
- many security attacks are modeled as Byzantine faults
- propose Byzantine fault containment
  - extend Corona to identify a subset of Byzantine nodes
  - isolate Byzantine nodes from affecting other nodes

Fault Containment
- Corona is a self-stabilizing algorithm designed to converge to a desired stable state starting from an arbitrary initial state arising from a large number of faults
- however, a large number of faults in a peer-to-peer system is rare; mostly, faults are limited to a specific region
- motivation:
  - contain the effects of a limited transient fault to a local set of nodes
  - do not perturb the operations of the larger set in the network
- propose fault containment
  - extend Corona's algorithm to recover very fast from a bounded number of faults

Resistance to Churn
- churn
  - participating nodes join and leave the system continuously
  - may cause a partition in the topology
  - may render the topology inefficient
- propose
  - improve Corona to maintain correct topology despite churn
- methods
  - redesign Corona to allow non-interfering concurrent topology updates
  - extend self-stabilizing dining philosophers
  - allow only one neighbor to modify its state and links at a time
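The S-corona promotion rules that the join and departure slides rely on can be sketched for a single, already linearized level. This is an illustrative reading, not the authors' code. The condition "right & p state = UP" is taken to mean that both the right neighbor and p are UP, and the left-receive condition is taken to mean that the left neighbor, the right neighbor and p are all DOWN. As a simplification, nodes read neighbor states directly instead of exchanging messages. The function names are hypothetical.

```python
UP, DOWN = "UP", "DOWN"


def s_corona_step(states, i):
    """Apply the S-corona rules at position i; return True if its state changed."""
    left = states[i - 1] if i > 0 else None
    right = states[i + 1] if i + 1 < len(states) else None
    new = states[i]
    if right is None:                       # no right neighbor: always UP
        new = UP
    elif states[i] == UP and right == UP:   # state received from the right
        new = DOWN
    elif states[i] == DOWN and left == DOWN and right == DOWN:
        new = UP                            # state received from the left
    changed = new != states[i]
    states[i] = new
    return changed


def stabilize(states):
    """Sweep over the level until no node is enabled to change its state."""
    states = list(states)
    changed = True
    while changed:
        changed = False
        for i in range(len(states)):
            if s_corona_step(states, i):
                changed = True
    return states


def is_legal_promotion(states):
    """Rightmost node is UP and consecutive UP nodes are 2 or 3 hops apart."""
    ups = [i for i, s in enumerate(states) if s == UP]
    return states[-1] == UP and all(2 <= b - a <= 3 for a, b in zip(ups, ups[1:]))


# Arbitrary (illegal) initial state: every node of the level claims to be UP.
final = stabilize([UP] * 10)
print(final, is_legal_promotion(final))
```

Under this reading the sweep terminates: once a node's right neighbor has settled, the node itself can change at most once more, so the level settles from right to left.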