Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors

Dec 14, 2015

Kaleigh Forton

Pastry

Peter Druschel, Rice University

Antony Rowstron, Microsoft Research UK

Some slides are borrowed from the original presentation by the authors

Outline

• Background

• Pastry

• Pastry proximity routing

• PAST

Background

Peer-to-peer systems

• distribution

• decentralized control

• self-organization

• symmetry (communication, node roles)

Common issues

• Organize, maintain overlay network

• Resource allocation/load balancing

• Resource location

• Network proximity routing

Pastry provides a generic p2p substrate

Architecture

Layers, top to bottom:

• P2p application layer: Network storage, Event notification, ?

• P2p substrate (self-organizing overlay network): Pastry

• Internet: TCP/IP

Structured p2p overlays

The primitive route(M, X) routes message M to the live node whose nodeId is closest to key X

NodeIds and keys are drawn from a large, sparse id space

Distributed Hash Tables (DHT)

[Figure: key/value pairs (k1,v1) through (k6,v6) stored on the nodes of a P2P overlay network]

Operations: insert(k,v), lookup(k)

• p2p overlay maps keys to nodes

• completely decentralized and self-organizing

• robust, scalable
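The insert(k,v) and lookup(k) operations can be illustrated with a toy, single-process sketch. Everything here is an assumption made for illustration: the class name ToyDHT, the helper key_to_id, the use of truncated SHA-256 as the hash, and the fact that a simple scan over all nodes stands in for the overlay.

```python
import hashlib


def key_to_id(key: str) -> int:
    """Hash a key to a 128-bit integer id (truncated SHA-256, an illustrative choice)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:16], "big")


class ToyDHT:
    """All nodes live in one process; the overlay lookup is replaced by a scan."""

    def __init__(self, node_ids):
        # One local key/value store per node id.
        self.stores = {node_id: {} for node_id in node_ids}

    def node_for(self, key: str) -> int:
        # Map the key to the node whose id is numerically closest.
        # Wraparound of the id space is ignored to keep the sketch short.
        k = key_to_id(key)
        return min(self.stores, key=lambda n: (abs(n - k), n))

    def insert(self, key: str, value) -> None:
        self.stores[self.node_for(key)][key] = value

    def lookup(self, key: str):
        return self.stores[self.node_for(key)].get(key)
```

Because insert and lookup hash the key the same way, both reach the same node without any central directory; a real overlay would replace node_for with routed messages.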

Outline

• Background

• Pastry

• Pastry proximity routing

• PAST

• SCRIBE

• Conclusions

Pastry: Object distribution

Consistent hashing [Karger et al. '97]

128-bit circular id space

nodeIds (uniform random)

objIds (uniform random)

Invariant: node with numerically closest nodeId maintains object

[Figure: circular id space from 0 to 2^128 - 1, with nodeIds and an objId marked]
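The invariant can be made concrete with a small sketch of "numerically closest" on a circular id space. The function names are hypothetical, and measuring distance the short way around the ring is an assumption about how closeness is meant here.

```python
ID_BITS = 128
ID_SPACE = 1 << ID_BITS  # ids range from 0 to 2^128 - 1


def circular_distance(a: int, b: int) -> int:
    """Distance between two ids, measured the short way around the ring."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def closest_node(node_ids, obj_id: int) -> int:
    """Return the nodeId responsible for obj_id (ties broken by the smaller id)."""
    return min(node_ids, key=lambda n: (circular_distance(n, obj_id), n))
```

With nodes at 5 and at 2^128 - 100, an object with id 2^128 - 3 belongs to node 5: the distance across the wraparound point is 8, while the distance to the other node is 97.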

Pastry: Object insertion/lookup

Route(X)

Msg with key X is routed to live node with nodeId closest to X

Problem: complete routing table not feasible

[Figure: circular id space from 0 to 2^128 - 1, with key X and Route(X) marked]

Pastry: Routing

Tradeoff

• O(log N) routing table size

• O(log N) message forwarding steps

Routing table of # 65a1fcx

(x denotes an arbitrary suffix; log16 N rows)

Row 0: 0x 1x 2x 3x 4x 5x 7x 8x 9x ax bx cx dx ex fx

Row 1: 60x 61x 62x 63x 64x 66x 67x 68x 69x 6ax 6bx 6cx 6dx 6ex 6fx

Row 2: 650x 651x 652x 653x 654x 655x 656x 657x 658x 659x 65bx 65cx 65dx 65ex 65fx

Row 3: 65a0x 65a2x 65a3x 65a4x 65a5x 65a6x 65a7x 65a8x 65a9x 65aax 65abx 65acx 65adx 65aex 65afx
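The layout of the table for node 65a1fc can be reproduced with a short sketch. The helper names are hypothetical, and ids are assumed to be lowercase hex strings of equal length.

```python
HEX_DIGITS = "0123456789abcdef"


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def row_prefixes(local_id: str, row: int):
    """Prefixes covered by one row: the shared prefix plus every digit but our own."""
    return [local_id[:row] + d for d in HEX_DIGITS if d != local_id[row]]


def table_slot(local_id: str, other_id: str):
    """(row, column) where other_id belongs in local_id's table; None for the node itself."""
    if local_id == other_id:
        return None
    row = shared_prefix_len(local_id, other_id)
    return row, int(other_id[row], 16)


print(row_prefixes("65a1fc", 1))         # the prefixes of row 1
print(table_slot("65a1fc", "d13da3"))    # slot for a node sharing no prefix
```

Each row fixes one more digit of the local nodeId, which is why the number of populated rows grows with log16 N rather than with N.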

Pastry: Routing

Properties

• log16 N steps

• O(log N) state

[Figure: Route(d46a1c); labeled ids: d46a1c, d462ba, d4213f, d13da3, 65a1fc, d467c4, d471f1]

Pastry: Leaf sets

Each node maintains IP addresses of the nodes with the L/2 numerically closest larger and smaller nodeIds, respectively.

• routing efficiency/robustness

• fault detection (keep-alive)

• application-specific local coordination

Pastry: Routing procedure

if (destination D is within range of our leaf set)
    forward to numerically closest member
else
    let l = length of shared prefix
    let d = value of l-th digit in D's address
    if (R[l][d] exists)    (R[l][d] = entry at column d, row l)
        forward to R[l][d]
    else
        forward to a known node that
        (a) shares at least as long a prefix
        (b) is numerically closer than this node
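The routing procedure can be sketched as a single next-hop function. This is a simplified illustration, not the authors' implementation: the function and parameter names are hypothetical, ids are assumed to be equal-length lowercase hex strings, wraparound of the id space is ignored, and in the fallback case the sketch picks the closest qualifying node, which is one possible choice.

```python
def prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def gap(a: str, b: str) -> int:
    """Numerical distance between two hex ids (wraparound ignored)."""
    return abs(int(a, 16) - int(b, 16))


def next_hop(me, key, leaf_set, table):
    """Return the id to forward to, or `me` when this node is the destination.

    leaf_set: list of hex ids; table: dict mapping (row, column) to a hex id.
    """
    members = leaf_set + [me]
    values = [int(m, 16) for m in members]
    if min(values) <= int(key, 16) <= max(values):
        # Key is covered by the leaf set: pick the numerically closest member.
        return min(members, key=lambda m: (gap(m, key), m))

    l = prefix_len(me, key)   # length of the shared prefix
    d = int(key[l], 16)       # value of the l-th digit of the key
    entry = table.get((l, d))
    if entry is not None:
        return entry          # routing table hit: one more digit matches

    # Fallback: any known node with at least as long a shared prefix
    # that is numerically closer to the key than this node.
    known = leaf_set + list(table.values())
    better = [n for n in known
              if prefix_len(n, key) >= l and gap(n, key) < gap(me, key)]
    return min(better, key=lambda n: gap(n, key), default=me)
```

For example, next_hop("65a1fc", "d46a1c", ["65a1f0", "65a20a"], {(0, 13): "d13da3"}) takes the routing-table branch and returns "d13da3", since the key shares no prefix with the local id and its first digit is d.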

Pastry: Performance

Integrity of overlay / message delivery:

• guaranteed unless L/2 simultaneous failures of nodes with adjacent nodeIds

Number of routing hops:

• No failures: < log16 N expected, 128/b + 1 max

• During failure recovery: O(N) worst case, average case much better
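As a rough numerical check of the hop bounds, the sketch below evaluates log16 N and 128/b + 1. It assumes that b is the number of bits per nodeId digit, so b = 4 for the hexadecimal digits used on these slides; the function names are illustrative.

```python
import math


def expected_hop_bound(num_nodes: int, b: int = 4) -> float:
    """Logarithm of N to base 2^b: the expected-hop bound quoted above."""
    return math.log(num_nodes) / math.log(2 ** b)


def max_hops(b: int = 4, id_bits: int = 128) -> int:
    """The quoted maximum, 128/b + 1, for a 128-bit id."""
    return id_bits // b + 1


for n in (1_000, 10_000, 100_000):
    print(f"N={n}: log16 N = {expected_hop_bound(n):.2f}, max = {max_hops()}")
```

With b = 4 the maximum is 33 hops, while the expected bound stays below 5 even for 100000 nodes.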

Self-organization

How are the routing tables and leaf sets initialized and maintained?

• Node addition

• Node departure (failure)

Pastry: Node addition

New node: d46a1c

[Figure: Route(d46a1c); labeled ids: d46a1c, d462ba, d4213f, d13da3, 65a1fc, d467c4, d471f1]

Node departure (failure)

Leaf set members exchange heartbeats

• Leaf set repair (eager): request set from farthest live node in set

• Routing table repair (lazy): get table from peers in the same row, then higher rows
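The eager leaf set repair can be sketched for one side of the leaf set. This is a simplified illustration under several assumptions: ids are plain integers, wraparound is ignored, only the side with larger ids is handled, and fetch_leaf_set is a hypothetical callable that stands in for the network request to the donor node.

```python
def repair_larger_side(me, larger_side, failed, fetch_leaf_set, half):
    """Return the repaired list of the `half` closest live nodes with ids above `me`.

    larger_side: current leaf-set ids greater than `me`, nearest first.
    fetch_leaf_set: callable returning the leaf set (iterable of ids) of a live node.
    """
    live = [n for n in larger_side if n != failed]
    if not live:
        return []
    donor = live[-1]  # farthest live node on this side
    candidates = set(live)
    candidates.update(fetch_leaf_set(donor))
    candidates.discard(failed)  # the donor may not have noticed the failure yet
    # Keep only ids above ours, nearest first, trimmed to half the leaf set size.
    return sorted(n for n in candidates if n > me)[:half]
```

Asking the farthest live node is useful because its leaf set reaches furthest beyond the local one, so it is the member most likely to know a replacement.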

Pastry: Experimental results

Prototype

• implemented in Java

• emulated network

• deployed currently at ~25 sites worldwide

Pastry: Average # of hops

L=16, 100k random queries

[Figure: average number of hops (y-axis, 0 to 4.5) vs. number of nodes (x-axis, 1000 to 100000); curves: Pastry, Log(N)]

Outline

• Background

• Pastry

• Pastry proximity routing

Pastry: Proximity routing

Proximity metric = time delay estimated by ping

A node can probe distance to any other node

Each routing table entry uses a node close to the local node (in the proximity space), among all nodes with the appropriate nodeId prefix.
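Choosing a routing table entry by proximity can be sketched as a filter followed by a minimum. The names pick_entry and probe are hypothetical; probe stands in for whatever delay measurement (for example a ping) the node uses.

```python
def pick_entry(candidates, required_prefix, probe):
    """Among candidates whose id starts with required_prefix, return the one
    with the smallest probed delay, or None if no candidate qualifies."""
    eligible = [n for n in candidates if n.startswith(required_prefix)]
    return min(eligible, key=probe, default=None)


# Made-up delays in milliseconds, as seen from the local node.
delays = {"d13da3": 80.0, "d4213f": 20.0, "d462ba": 35.0, "65a1fc": 5.0}
print(pick_entry(delays, "d", delays.get))
```

The prefix constraint decides which nodes are eligible for a slot, and proximity only breaks the choice among them, so routing correctness does not depend on the delay estimates.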

Pastry: Routes in proximity space

[Figure: Route(d46a1c) shown in two panels, "NodeId space" and "Proximity space"; labeled ids: d46a1c, d462ba, d4213f, d13da3, 65a1fc, d467c4, d471f1]

Pastry: Distance traveled

L=16, 100k random queries, Euclidean proximity space

[Figure: relative distance (y-axis, 0.8 to 1.4) vs. number of nodes (x-axis, 1000 to 100000); curves: Pastry, Complete routing table]

Pastry: Locality properties

Expected distance traveled by a message in the proximity space is within a small constant of the minimum

Among the k nodes with nodeIds closest to the key, a message is likely to first reach the node closest to the source node

Pastry: Node addition

New node: d46a1c

[Figure: Route(d46a1c) shown in two panels, "NodeId space" and "Proximity space"; labeled ids: d46a1c, d462ba, d4213f, d13da3, 65a1fc, d467c4, d471f1]

Pastry delay vs IP delay

[Figure: distance traveled by Pastry message (y-axis, 0 to 2500) vs. distance between source and destination (x-axis, 0 to 1400)]

Mean = 1.59

GATech topology, .5M hosts, 60K nodes, 20K random messages

Pastry: API

• route(M, X): route message M to node with nodeId numerically closest to X

• deliver(M): deliver message M to application

• forwarding(M, X): message M is being forwarded towards key X

• newLeaf(L): report change in leaf set L to application
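The split between the call an application makes (route) and the callbacks it receives (deliver, forwarding, newLeaf) can be sketched as an interface. The class names below are hypothetical, the method names are adapted to Python style, and LoopbackNode is only a stand-in for a real Pastry node: it models an overlay with a single node, which is therefore closest to every key.

```python
from abc import ABC, abstractmethod


class PastryApplication(ABC):
    """Callbacks an application would implement (names adapted from the list above)."""

    @abstractmethod
    def deliver(self, msg):
        """Called on the node whose nodeId is numerically closest to the key."""

    def forwarding(self, msg, key):
        """Called on intermediate nodes while msg travels towards key."""

    def new_leaf(self, leaf_set):
        """Called when the local leaf set changes."""


class LoopbackNode:
    """Stand-in for a Pastry node in a one-node overlay."""

    def __init__(self, app: PastryApplication):
        self.app = app

    def route(self, msg, key):
        # With a single node there are no intermediate hops,
        # so the message is delivered locally right away.
        self.app.deliver(msg)


class Recorder(PastryApplication):
    """Minimal application that remembers what was delivered to it."""

    def __init__(self):
        self.delivered = []

    def deliver(self, msg):
        self.delivered.append(msg)
```

An application such as a storage or notification service would subclass the interface and never deal with routing tables or leaf sets directly.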

Pastry: Security

• Secure nodeId assignment

• Secure node join protocols

• Randomized routing

• Byzantine fault-tolerant leaf set membership protocol

Pastry: Summary

• Generic p2p overlay network

• Scalable, fault resilient, self-organizing, secure

• O(log N) routing steps (expected)

• O(log N) routing table size

• Network proximity routing

PAST: File Retrieval

• file located in log16 N steps (expected)

• usually locates replica nearest client C

[Figure: Lookup of fileId issued by client C; k replicas marked]