
1 PASTRY. 2 Pastry paper “ Pastry: Scalable, decentralized object location and routing for large- scale peer-to-peer systems ” by Antony Rowstron (Microsoft.

Jan 11, 2016


Milton Wilcox
Transcript
Page 1: PASTRY

Page 2: Sources

Pastry paper: “Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems” by Antony Rowstron (Microsoft Research) and Peter Druschel (Rice University), IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November 2001

Pastry homepage: http://research.microsoft.com/en-us/um/people/antr/Pastry/default.htm

Page 3: Related work

Chord [Sigcomm ’01]

CAN [Sigcomm ’01]

Tapestry [TR UCB/CSD-01-1141]

PNRP [unpub.]

Viceroy [PODC ’02]

Kademlia [IPTPS ’02]

Small World [Kleinberg ’99, ’00]

Plaxton Trees [Plaxton et al. ’97]

Generalized Hypercube [Bhuyan et al. ’84]

Page 4: Pastry

Generic p2p location and routing substrate (DHT)

Self-organizing overlay network (join, departures, locality repair)

Consistent hashing

Lookup/insert object in < $\log_{2^b} N$ routing steps (expected)

O(log N) per-node state

Network locality heuristics

Scalable, fault resilient, self-organizing, locality aware, secure

Page 5: Pastry: Object distribution

Consistent hashing

128 bit circular id space, from 0 to $2^{128} - 1$

nodeIds (uniform random)

objIds/keys (uniform random)

Invariant: node with numerically closest nodeId maintains object

[Figure: ring of nodeIds covering the id space, with an objId/key placed on it]
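
The invariant above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the helper names (`make_id`, `ring_distance`, `responsible_node`) are hypothetical, ids are derived by truncating SHA-1 to 128 bits, and ties are broken toward the smaller nodeId.

```python
import hashlib

ID_BITS = 128
ID_SPACE = 1 << ID_BITS  # ids live in [0, 2**128 - 1]


def make_id(data):
    """Hash bytes to a 128-bit id (truncated SHA-1; an assumption of this sketch)."""
    digest = hashlib.sha1(data).digest()       # 160-bit digest
    return int.from_bytes(digest[:16], "big")  # keep the first 128 bits


def ring_distance(a, b):
    """Shortest distance between two ids on the circular id space."""
    d = (a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def responsible_node(key, node_ids):
    """Return the nodeId numerically closest to key (ties go to the smaller id)."""
    return min(node_ids, key=lambda n: (ring_distance(n, key), n))
```

For example, `responsible_node(make_id(b"some-object"), node_ids)` names the node that would maintain the object, given a list of live nodeIds.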

Page 6: Pastry: Object insertion/lookup

Msg with key X is routed to live node with nodeId closest to X

Problem: complete routing table not feasible

[Figure: Route(X) drawn on the circular id space from 0 to $2^{128} - 1$, ending at key X]

Page 7: Pastry Node

CMPT 880: P2P Systems - SFU

Represented by a 128-bit randomly chosen nodeId (hash of IP address or public key)

NodeId is written as digits in base $2^b$ (b is a configuration parameter; typical value 2 or 4)

NodeIds are evenly distributed along the circular namespace (0 to $2^{128} - 1$)

Routes a message to its destination in O(log N) steps, where N is the size of the network

Node state contains:

Leaf set (L)

Routing table (R)

Neighborhood set (M)
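
To make the base-$2^b$ representation concrete, here is a minimal Python sketch that splits an id into digits. The function name `to_digits` and its parameters are hypothetical, chosen for illustration only.

```python
def to_digits(node_id, b=4, bits=128):
    """Split an id into bits/b digits of base 2**b, most significant digit first."""
    if bits % b != 0:
        raise ValueError("b must divide the id length in bits")
    mask = (1 << b) - 1
    n_digits = bits // b
    # Shift the wanted digit down to the low b bits, then mask it out.
    return [(node_id >> (b * (n_digits - 1 - i))) & mask for i in range(n_digits)]
```

With b = 4 a 128-bit nodeId has 32 hexadecimal digits; with b = 2 it has 64 base-4 digits.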

Page 8: Pastry node state

Leaf set: the numerically closest nodes, L/2 on each side (L is a configuration parameter, typically 16 or 32)

Routing table (prefix-based)

Neighborhood set: the M physically closest nodes

Page 9: Pastry node state (Leaf Set)

Serves as a fallback for the routing table and contains:

L/2 numerically closest larger nodeIds

L/2 numerically closest smaller nodeIds

Size of L is typically $2^b$ or $2 \times 2^b$

Nodes in L are numerically close (could be geographically diverse)
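
As a rough illustration of how a leaf set could be picked from a list of known nodeIds, consider the sketch below. The function `leaf_set` and its parameters are hypothetical; it assumes integer ids on a circular space and that "larger" means clockwise from the local node.

```python
def leaf_set(own_id, known_ids, size=16, id_space=1 << 128):
    """Return (smaller, larger): the size/2 closest ids on each side of own_id."""
    half = size // 2
    others = [n for n in set(known_ids) if n != own_id]
    # Clockwise offset from own_id; small offsets are the closest larger ids.
    by_clockwise = sorted(others, key=lambda n: (n - own_id) % id_space)
    larger = by_clockwise[:half]
    # The largest clockwise offsets are the closest smaller ids.
    smaller = [n for n in reversed(by_clockwise) if n not in larger][:half]
    return smaller, larger
```

Both lists are ordered nearest first, so the last element on each side is the furthest leaf in that direction.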

Page 10: Pastry node state: Neighborhood set (M)

Contains the IP addresses and nodeIds of the closest nodes according to the proximity metric

Size of |M| is typically $2^b$ or $2 \times 2^b$

Not used in routing, but instead for maintaining locality properties

Page 11: Node state: Routing Table

Matrix of $\log_{2^b} N$ rows and $2^b - 1$ columns (N is the number of nodes in the network)

Entries in row n match the first n digits of the current nodeId, and the column number is the digit that follows the matched digits

Format: matched digits-column number-rest of ID

$\log_{2^b} N$ rows populated on average

Page 12: Routing table example

Node 10233102 (2), (b = 2, l = 8)

Row | Column 0 | Column 1 | Column 2 | Column 3
0 | 02212102 | - | 22301203 | 31203203
1 | - | 11301233 | 12230203 | 13021022
2 | 10031203 | 10132102 | - | 10323302
3 | 10200230 | 10211302 | 10222302 | -
4 | 10230322 | 10231000 | 10232121 | -
5 | 10233001 | - | 10233232 |
6 | - | | 10233120 |

A dash marks the column of the node's own digit in that row; blank cells are empty entries.
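
The placement rule behind this table (row = number of shared leading digits, column = the next digit of the other nodeId) can be sketched in Python. The helper names `shl` and `table_slot` are illustrative, and ids are assumed to be equal-length digit strings.

```python
def shl(a, b):
    """Length of the digit prefix shared by two ids given as digit strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def table_slot(own_id, other_id):
    """Return (row, column) for other_id in own_id's routing table.

    Returns None when the two ids are identical (no slot applies).
    """
    row = shl(own_id, other_id)
    if row == len(own_id):
        return None
    col = int(other_id[row], 16)  # handles digits up to base 16 (b <= 4)
    return row, col
```

For node 10233102, `table_slot("10233102", "10200230")` yields row 3, column 0, matching the position of 10200230 in the example.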

Page 13: Pastry: Routing

Tradeoff

O(log N) routing table size: $2^b \cdot \log_{2^b} N + 2l$

O(log N) message forwarding steps
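
To get a feel for the numbers, the following sketch evaluates the state-size expression for a given network. It is illustrative only: the function names are hypothetical, the logarithm is rounded up to a whole number of rows, and `l` is taken as given in the expression.

```python
def rows_needed(n_nodes, b):
    """Smallest r with (2**b)**r >= n_nodes, i.e. the ceiling of log base 2**b of N."""
    r = 0
    while (1 << (b * r)) < n_nodes:
        r += 1
    return r


def state_entries(n_nodes, b, l):
    """Per-node entries following the expression 2^b * log_{2^b} N + 2l."""
    return (2 ** b) * rows_needed(n_nodes, b) + 2 * l
```

With N = 10^6 and b = 4 this gives 5 rows, so routing takes about 5 hops; choosing b = 2 instead doubles the hop count to 10 while shrinking each row from 16 to 4 columns.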

Page 14: Prefix Routing

Node IDs and keys from randomized namespace (SHA-1)

Incremental routing towards destination ID

Each node has a small set of outgoing routes

log(n) neighbors per node, log(n) hops between any node pair

[Figure: a message addressed to ABCE passes nodes A930, AB5F, and ABC0 on its way to the node with ID ABCE]

Page 15: Pastry: Routing table (# 10233102)

L nodes in leaf set

$\log_{2^b} N$ rows (actually $\log_{2^b} 2^{128} = 128/b$)

$2^b$ columns

L neighbors

Page 16: Pastry: Routing procedure

D: message key

$L_i$: i-th closest nodeId in the leaf set

shl(A, B): length of the prefix shared by nodes A and B

$R_j^i$: (j, i)th entry of the routing table

(1) Node is in the leaf set

(2) Forward message to a closer node (better match)

(3) Forward towards a numerically closer node (not a better match)

Page 17: Pastry: Routing procedure

if (destination D is within range of our leaf set)
    forward to numerically closest member
else
    let l = length of shared prefix
    let d = value of l-th digit in D’s address
    if ($R_l^d$ exists)
        forward to $R_l^d$
    else
        forward to a known node (from L ∪ R ∪ M) that
        (a) shares at least as long a prefix
        (b) is numerically closer than this node
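
The procedure above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: ids are assumed to be equal-length hex strings, the routing table is a dictionary keyed by (row, digit), wraparound of the circular id space is ignored, and the names `shl` and `next_hop` are chosen for this example.

```python
def shl(a, b):
    """Length of the prefix (in digits) shared by two hex id strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(me, key, leaf_set, routing_table, neighborhood=()):
    """Return the id to forward to, or `me` if no better node is known."""
    def gap(node):
        # Numeric distance to the key (no wraparound in this sketch).
        return abs(int(node, 16) - int(key, 16))

    if key == me:
        return me

    leaves = list(leaf_set)
    values = [int(n, 16) for n in leaves]
    # (1) Key within the leaf set's range: pick the numerically closest member.
    if leaves and min(values) <= int(key, 16) <= max(values):
        return min(leaves + [me], key=gap)

    # (2) Routing table entry that shares one more digit with the key.
    l = shl(key, me)
    entry = routing_table.get((l, key[l]))
    if entry is not None:
        return entry

    # (3) Rare case: any known node with an equally long prefix that is closer.
    known = set(leaves) | set(routing_table.values()) | set(neighborhood)
    better = [n for n in known if shl(n, key) >= l and gap(n) < gap(me)]
    return min(better, key=gap) if better else me
```

Returning `me` signals that the local node is the best known destination for the key.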

Page 18: Pastry: Routing procedure

If message with key D is within range of leaf set, forward to numerically closest leaf

Else forward to node that shares at least one more digit with D in its prefix than current nodeId

If no such node exists, forward to node that shares at least as many digits with D as current nodeId but is numerically nearer than current nodeId

Page 19: Pastry: Routing

Properties:

$\log_{2^b} N$ steps

O(log N) state

[Figure: a lookup for key d46a1c on the nodeId ring, forwarded through nodes that match progressively longer prefixes of the key (d13da3, d4213f, d462ba); nodes 65a1fc, d467c4, and d471f1 are also shown]

Page 20: Pastry: Locality properties

Assumption: scalar proximity metric

e.g. ping/RTT delay, # IP hops (traceroute), subnet masks

a node can probe its distance to any other node

Proximity invariant: Each routing table entry refers to a node close to the local node (in the proximity space), among all nodes with the appropriate nodeId prefix.
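
One way to picture the proximity invariant is as a selection among candidates for a single routing table slot. The sketch below is a hedged illustration: `best_entry` is a hypothetical helper, ids are digit strings, and `proximity` stands for any scalar metric the node can probe (for example a measured RTT). It picks the nearest candidate, which is stronger than the "close" node the invariant asks for.

```python
def best_entry(own_id, row, col_digit, candidates, proximity):
    """Among candidates eligible for slot (row, col_digit), pick the nearest one.

    proximity: callable mapping a node id to a scalar distance from the local node.
    Returns None when no candidate has the required prefix.
    """
    # An eligible node shares the first `row` digits and then has digit `col_digit`.
    prefix = own_id[:row] + col_digit
    eligible = [c for c in candidates if c != own_id and c.startswith(prefix)]
    return min(eligible, key=proximity, default=None)
```

A usage example with made-up RTT values: given `rtt = {"d13da3": 80.0, "d4213f": 20.0}`, the call `best_entry("65a1fc", 0, "d", rtt, rtt.get)` selects d4213f for row 0, column d.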

Page 21: Pastry: Geometric Routing in proximity space

The proximity distance traveled by the message in each routing step is exponentially increasing (the entry in row l is chosen from a set of nodes of size $N / 2^{bl}$)

The distance traveled by the message from its source increases monotonically at each step (the message takes larger and larger strides)

[Figure: Route(d46a1c) drawn twice, once in the nodeId space and once in the proximity space, with nodes 65a1fc, d13da3, d4213f, d462ba, d467c4, and d471f1]

Page 22: Pastry: Locality properties

Each routing step is local, but there is no guarantee of a globally shortest path

Nevertheless, simulations show:

Expected distance traveled by a message in the proximity space is within a small constant of the minimum

Among the k nodes with nodeIds closest to the key, the message is likely to first reach the node closest to the source node

Page 23: Pastry: Self-organization

Initializing and maintaining routing tables and leaf sets

Node addition

Node departure (failure)

The goal is to maintain all routing table entries to refer to a near node, among all live nodes with the appropriate prefix

Page 24: Pastry: Node addition

New node X contacts nearby node A

A routes a “join” message with key X, which arrives at Z, the node closest to X

X obtains its leaf set from Z, and the i’th row of its routing table from the i’th node on the path from A to Z

X informs any nodes that need to be aware of its arrival

X also improves its table locality by requesting neighborhood sets from all nodes X knows

In practice: optimistic approach
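
The state initialization step can be sketched as below. This is a simplified illustration under assumptions: `build_join_state` is a hypothetical helper, each routing table is represented as a list of rows (dictionaries from digit to nodeId), and the i’th node on the join path is assumed to share at least i leading digits with X so that its row i is usable by X.

```python
def build_join_state(path_tables, z_leaf_set):
    """Assemble the initial state of a joining node X.

    path_tables: routing tables of the nodes visited by the join message,
                 A first and Z last; each table is a list of row dicts.
    z_leaf_set:  leaf set reported by Z, the node numerically closest to X.
    """
    routing_table = []
    for i, table in enumerate(path_tables):
        # Row i comes from the i-th node on the path (empty if that node has no row i).
        row = table[i] if i < len(table) else {}
        routing_table.append(dict(row))
    return {"leaf_set": list(z_leaf_set), "routing_table": routing_table}
```

The rows are copied so that later updates to X's table do not alter the tables received from other nodes.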

Page 25: Pastry: Node addition

New node: X = d46a1c

A = 65a1fc is X’s neighbor

Z = d467c4

[Figure: Route(d46a1c) on the nodeId ring from A toward Z, with nodes d13da3, d4213f, d462ba, and d471f1]

Page 26: Pastry: Node addition

New node: X = d46a1c

X is close to A, and B is close to B1 (B1 is the first row of B). Why is X close to B1?

The expected distance from B to its row-one entries (B1) is much larger than the expected distance from A to B (entries are chosen from exponentially decreasing set sizes)

[Figure: Route(d46a1c) drawn in the nodeId space and in the proximity space, with nodes 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1, and the new node X]

Page 27: Node departure (failure)

Leaf set repair (eager, all the time): leaf set members exchange keep-alive messages; on a failure, request the leaf set from the furthest live node in the set

Routing table repair (lazy, upon failure): get a table entry from peers in the same row; if not found, from higher rows

Neighborhood set repair (eager)
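
Leaf set repair on one side of the local node might look like the sketch below. It is an illustration only: `repair_side` and its parameters are hypothetical, ids are integers on a circular space, and `fetch_leaf_set` stands in for whatever remote call returns another node's leaf set.

```python
def repair_side(own_id, side, failed, fetch_leaf_set, half,
                id_space=1 << 128, larger=True):
    """Replace a failed leaf on one side using the furthest live leaf's leaf set.

    side: current leaf ids on this side; failed: the id found to be dead.
    fetch_leaf_set(node) -> iterable of ids (assumed remote call).
    """
    def offset(n):
        # Distance from own_id in the direction of this side.
        d = (n - own_id) % id_space
        return d if larger else (id_space - d) % id_space

    live = [n for n in side if n != failed]
    if not live:
        return []
    furthest = max(live, key=offset)
    learned = set(fetch_leaf_set(furthest)) - {own_id, failed}
    # Keep only ids that lie on this side of own_id.
    same_side = {n for n in learned if 0 < offset(n) < id_space // 2}
    return sorted(set(live) | same_side, key=offset)[:half]
```

The returned list is ordered nearest first and holds at most `half` entries, so the side is restored to its configured size when enough live nodes are learned.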

Page 28: Pastry: Summary

Generic p2p overlay network

Scalable, fault resilient, self-organizing, secure

O(log N) routing steps (expected)

O(log N) routing table size

Network locality properties