Tapestry: Architecture and status, UCB ROC / Sahara Retreat, January 2002, Ben Y. Zhao, ravenben@eecs.berkeley.edu

Mar 30, 2015

Tapestry
Architecture and status

UCB ROC / Sahara Retreat
January 2002

Ben Y. Zhao
ravenben@eecs.berkeley.edu

Why Tapestry

Today’s Internet
  - Route failures not uncommon
    - BGP too slow to recover, redundant routes unexploited
  - IPv4 constrains deployment of new protocols
    - IP multicast, security protocols (DDoS traceback), …
  - Wide-area applications straining existing systems
    - Scalable management of large scale resources
Our goals
  - Wide-area scalable network overlay
    - Highly fault-tolerant routing / location
    - Introspective / self-tuning platform
    - Support application-specific protocols
    - Efficient (b/w, latency) data delivery
  - Pass on wide-area solutions to application layer

What is Tapestry?

A prototype of a decentralized, fault-tolerant, adaptive overlay infrastructure (Zhao, Kubiatowicz, Joseph et al. 2000)
  - Network substrate of OceanStore
Routing: suffix-based hypercube
  - Similar to Plaxton, Rajaraman, Richa (SPAA 97)
Decentralized location:
  - Virtual hierarchy per object with cached location references
Dynamic algorithms using local information
Core API:
  - publishObject(ObjectID)
  - routeMsgToObject(ObjectID)
  - routeMsgToNode(NodeID)
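As a rough illustration of how the three calls fit together, here is a minimal single-process sketch in Python. The class name ToyTapestry, the snake_case method names, the extra server_id and msg parameters, and the dict-based storage are all assumptions made for illustration. The real system routes each call across the overlay instead of consulting one shared table.

```python
from collections import defaultdict


class ToyTapestry:
    """Single-process stand-in for the overlay (hypothetical, not the real API)."""

    def __init__(self):
        self.locations = defaultdict(set)   # ObjectID -> NodeIDs that published it
        self.mailboxes = defaultdict(list)  # NodeID -> messages delivered there

    def publish_object(self, object_id, server_id):
        # Mirrors publishObject(ObjectID): record that server_id holds the object.
        self.locations[object_id].add(server_id)

    def route_msg_to_node(self, node_id, msg):
        # Mirrors routeMsgToNode(NodeID): deliver msg to the named node.
        self.mailboxes[node_id].append(msg)
        return node_id

    def route_msg_to_object(self, object_id, msg):
        # Mirrors routeMsgToObject(ObjectID): deliver msg to some node holding it.
        holders = self.locations.get(object_id)
        if not holders:
            return None  # nobody has published this object
        return self.route_msg_to_node(min(holders), msg)
```

A caller would publish an object from the node that stores it and then address messages by ObjectID alone, without knowing which node answers.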

Routing and Location

Namespace (nodes and objects)
  - 160 bits in length
  - About 2^80 names before a name collision is expected
  - Each object has its own hierarchy rooted at Root
  - f(ObjectID) = RootID, via a dynamic mapping function
Suffix routing from A to B
  - At the h-th hop, arrive at the nearest node hop(h) such that hop(h) shares a suffix with B of length h digits
  - Example: 5324 routes to 0629 via 5324 -> 2349 -> 1429 -> 7629 -> 0629
Object location:
  - Root responsible for storing object’s location
  - Publish / search both route incrementally to root
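The suffix rule can be checked mechanically against the example route on this slide. The sketch below is only an illustration: the helper names shared_suffix_len and is_suffix_route are hypothetical, and node IDs are treated as plain digit strings.

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Number of trailing digits that a and b have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def is_suffix_route(path: list[str]) -> bool:
    """True if hop h (h >= 1) shares at least h trailing digits with the destination."""
    dest = path[-1]
    return all(shared_suffix_len(hop, dest) >= h
               for h, hop in enumerate(path) if h >= 1)


# The example from the slide: each hop resolves one more trailing digit of 0629.
route = ["5324", "2349", "1429", "7629", "0629"]
matches = [shared_suffix_len(hop, route[-1]) for hop in route]
print(matches)                 # [0, 1, 2, 3, 4]
print(is_suffix_route(route))  # True
```

Because every hop fixes one more digit, a route between N-digit IDs needs at most N hops.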

Tapestry Mesh
Incremental suffix-based routing

[Figure: mesh of Tapestry nodes with links labeled 1 to 4. NodeIDs shown: 0x43FE, 0x13FE, 0xABFE, 0x1290, 0x239E, 0x73FE, 0x423E, 0x79FE, 0x23FE, 0x73FF, 0x555E, 0x035E, 0x44FE, 0x9990, 0xF990, 0x993E, 0x04FE]

Object Location
Randomization and Locality

Fault-tolerant Routing

Strategy:
  - Detect failures via soft-state probe packets
  - Route around problematic hop via backup pointers
Handling:
  - 3 forward pointers per outgoing route (2 backups)
  - 2nd chance algorithm for intermittent failures
  - Upgrade backup pointers and replace
Protocols:
  - First Reachable Link Selection (FRLS)
  - Proactive Duplicate Packet Routing
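The slide does not spell out the 2nd chance algorithm, so the following is one possible reading, sketched in Python under stated assumptions: a pointer that misses a single probe is skipped but kept, and only a second consecutive miss removes it so that the backups move up. The class name ForwardPointers and the MAX_MISSES threshold are hypothetical.

```python
class ForwardPointers:
    """One outgoing route: a primary pointer followed by up to two backups."""

    MAX_MISSES = 2  # assumed: one missed probe is forgiven, the second is not

    def __init__(self, pointers):
        self.pointers = list(pointers)[:3]          # primary first, then backups
        self.misses = {p: 0 for p in self.pointers}

    def probe_result(self, pointer, ok):
        """Record one soft-state probe outcome for a pointer."""
        if pointer not in self.misses:
            return
        if ok:
            self.misses[pointer] = 0                # recovered: clear the strike
            return
        self.misses[pointer] += 1
        if self.misses[pointer] >= self.MAX_MISSES:
            self.pointers.remove(pointer)           # backups move up one slot
            del self.misses[pointer]

    def next_hop(self):
        """First pointer with no outstanding miss, else any remaining pointer."""
        for p in self.pointers:
            if self.misses[p] == 0:
                return p
        return self.pointers[0] if self.pointers else None
```

Keeping a once-failed pointer in place means an intermittent failure costs only a temporary detour, not a routing table change.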


Talk Outline

Tapestry overview

Architecture

Evaluation

Brocade

Conclude

Architecture Background

OceanStore implementation
  - Java with asynchronous I/O
  - Event-based, stage driven architecture (Sandstorm – M. Welsh)

[Figure: layered stack listing Operating System, Java Virtual Machine, Sandstorm (async I/O, event arch.), Tapestry, OceanStore, Applications]

Key Stages

StaticTClient / Federation
  - Uses config files to bootstrap initial Tapestry
DynamicTClient
  - Integrates new nodes into static Tapestry
Router
  - Primary handler of routing and location
Patchwork
  - Introspective monitoring and fault-detection

[Figure: stage diagram showing Router, Static TClient, Dynamic TClient, and Patchwork together with Sandstorm (async I/O, event arch.), OceanStore, and Applications]

Static TClient

Federation used as rendezvous point
Pair-wise pings to generate route tables
Federation used as global barrier to begin

[Figure: federation node F and static nodes S]

1. Si says hello to F
2. F informs group of Si
3. Nodes do pair-wise pings
4. Nodes signal readiness
5. Barrier reached at F, signals start
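The five steps can be walked through in a few lines of Python. This is a sequential sketch for illustration only: the Federation class, its method names, and the bootstrap helper are hypothetical, and ping stands in for whatever latency measurement the nodes actually perform.

```python
import itertools


class Federation:
    """Rendezvous point and start barrier for a fixed set of static nodes."""

    def __init__(self, expected):
        self.expected = expected      # number of static nodes named in the config
        self.members = []
        self.ready = set()

    def hello(self, node_id):
        # Steps 1-2: a node announces itself; the group list now includes it.
        self.members.append(node_id)
        return list(self.members)

    def signal_ready(self, node_id):
        # Steps 4-5: returns True once every expected node has reported ready.
        self.ready.add(node_id)
        return len(self.ready) == self.expected


def bootstrap(node_ids, ping):
    """Run the five steps in order and return (latency table, started flag)."""
    fed = Federation(expected=len(node_ids))
    for n in node_ids:
        fed.hello(n)
    # Step 3: pair-wise pings give the latencies used to build route tables.
    latency = {(a, b): ping(a, b)
               for a, b in itertools.permutations(fed.members, 2)}
    started = False
    for n in node_ids:
        started = fed.signal_ready(n)
    return latency, started
```

With n static nodes the ping phase produces n * (n - 1) measurements, which is workable for a small bootstrap set but is one reason later nodes join through the dynamic path.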

Dynamic TClient

Node Integration
1. Hill-climb to find nearest Gateway
2. Route to surrogate / copy routes
3. Move relevant objects to new root
4. Directed multicast notifies nearby nodes

[Figure: nodes G, S, and F with messages labeled Routes Request, Routes Response, Moving Object Pointers, and Directed Multicast?]
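Step 1 of node integration, the hill-climb toward a nearby gateway, could look roughly like the Python sketch below. The slides give no detail on the search, so the function name hill_climb, the neighbors map, and the distance callback are assumptions. The idea shown is plain greedy descent on measured distance.

```python
def hill_climb(start, neighbors, distance):
    """Move to whichever neighbor is closer to the new node until none is.

    neighbors: dict mapping node -> iterable of nodes it knows about.
    distance:  function giving the new node's measured distance to a node.
    """
    current = start
    while True:
        best = min(neighbors.get(current, ()), key=distance, default=current)
        if distance(best) >= distance(current):
            return current            # local minimum: use this node as gateway
        current = best
```

The loop always terminates because the distance strictly decreases on every move, but it can stop at a local minimum, not necessarily the globally nearest gateway.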

Routing / Location

Router class

Maintains:
  - RoutingTable: [ ][ ] of RouteEntries
  - ObjectPointers: Hash(Guid) -> PublishInfo, Hash(Guid) -> LastHop
Handles:
  - Object publication / unpublication / mobile objects
  - Route / location message handling
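To make the state held by the Router concrete, here is a small Python sketch of the two structures. The indexing scheme (row = number of trailing digits already shared, column = next digit), the field names, and the add_neighbor helper are assumptions for illustration; the slide only states that the table is two-dimensional and that object pointers are hashed by GUID. IDs are assumed to be equal-length hex strings.

```python
from dataclasses import dataclass, field

DIGITS = "0123456789ABCDEF"


@dataclass
class RouteEntry:
    pointers: list = field(default_factory=list)   # primary first, then backups


@dataclass
class Router:
    node_id: str
    # routing_table[level][digit]: neighbors sharing `level` trailing digits
    # with node_id whose next digit (reading right to left) is `digit`.
    routing_table: list = field(default_factory=list)
    publish_info: dict = field(default_factory=dict)   # hash(guid) -> publish info
    last_hop: dict = field(default_factory=dict)       # hash(guid) -> last hop

    def __post_init__(self):
        self.routing_table = [[RouteEntry() for _ in DIGITS] for _ in self.node_id]

    def add_neighbor(self, other):
        level = 0
        while (level < len(self.node_id)
               and self.node_id[-1 - level] == other[-1 - level]):
            level += 1
        if level == len(self.node_id):
            return                                  # same ID, nothing to store
        entry = self.routing_table[level][DIGITS.index(other[-1 - level])]
        if len(entry.pointers) < 3:                 # 1 primary + 2 backups
            entry.pointers.append(other)
```

For a node 43FE, a neighbor such as 13FE lands in row 3 (three shared trailing digits) under column 1, which is the slot consulted when the next digit to resolve is 1.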

Patchwork

Fault-handling / introspective stage
  - Granulated periodic beacons measure loss and network latency to entries in routing table
  - Promote/demote routes in single RouteEntry

[Figure: Router and network, with an X marked at route A; routes in one RouteEntry shown in the order A B C and then B C A]
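Promotion and demotion inside one RouteEntry can be pictured as a re-sort driven by beacon measurements. The Python sketch below is illustrative only: the function name reorder_routes, the (loss, latency) tuple layout, and the max_loss threshold are assumptions not taken from the slides.

```python
def reorder_routes(routes, stats, max_loss=0.5):
    """Order one RouteEntry's routes by beacon measurements.

    routes: list of neighbor IDs, primary first.
    stats:  dict mapping neighbor -> (loss_rate, latency_ms) from recent beacons.
    Routes losing more than max_loss of their beacons are demoted to the back;
    the rest are sorted by latency. max_loss is an assumed threshold.
    """
    def rank(route):
        loss, latency = stats[route]
        return (loss > max_loss, latency)

    return sorted(routes, key=rank)


# A has the lowest latency but is dropping every beacon, so it is demoted.
stats = {"A": (1.0, 10.0), "B": (0.0, 20.0), "C": (0.1, 35.0)}
print(reorder_routes(["A", "B", "C"], stats))  # ['B', 'C', 'A']
```

The demoted route stays in the entry as the last backup, so it can be promoted again if later beacons succeed.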

Deployment Status

Object Location
  - Publish / unpublish / route to object
  - Mobile objects (backtracking unpublish)
  - Active deletes, confirmation of non-existence
General Routing
  - Route to node, redundant routes
  - Soft-state fault-detection, limited optimization
  - Advanced policies for fault recovery
Dynamic Integration
  - Integration w/ limited optimizations
  - Best effort fault-resilient integration mechanisms
  - Background threads for optimization / refresh


Generalized Results

Cached object pointers
  - Efficient lookup for nearby objects
  - Reasonable storage overhead
Multiple object roots
  - Improves availability under attack
  - Improves performance and perf. stability
Reliable packet delivery
  - Redundant pointers approximate optimal reachability
  - FRLS, a simple fault-tolerant UDP protocol

First Reachable Link Selection

Use periodic UDP packets to gauge link condition
Packets routed to shortest “good” link
Assumes IP cannot correct routing table in time for packet delivery

[Figure: nodes A, B, C, D, E, comparing IP and Tapestry; label: No path exists to dest.]
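One way to read "shortest good link" is: keep a timestamp of the last probe reply per link, and forward on the first link, in shortest-first order, whose reply is still fresh. The Python sketch below follows that reading. The LinkMonitor class, the timeout_s freshness window, and the explicit now parameter (used to keep the example deterministic) are assumptions, not part of the protocol as presented.

```python
import time


class LinkMonitor:
    """Tracks when each link last answered a periodic UDP probe."""

    def __init__(self, timeout_s=1.0):
        self.timeout_s = timeout_s        # assumed freshness window
        self.last_reply = {}              # link -> timestamp of last probe reply

    def probe_reply(self, link, now=None):
        self.last_reply[link] = time.monotonic() if now is None else now

    def is_good(self, link, now=None):
        now = time.monotonic() if now is None else now
        seen = self.last_reply.get(link)
        return seen is not None and now - seen <= self.timeout_s


def first_reachable_link(links_by_length, monitor, now=None):
    """Return the first good link in shortest-first order, or None if none is good."""
    for link in links_by_length:
        if monitor.is_good(link, now):
            return link
    return None
```

The selection is purely local: it needs no agreement with other nodes, which is what lets it react faster than IP-level route repair.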

Some Numbers

Measurements
  - PIII 800, L2.2.18, IBM JDK 1.3
  - Simulating 6 nodes (4 staticTC, 1 federation, 1 dynamicTC)
  - Publishing / locating ~10 objects
  - PublishMsg, RouteMsg: ~0-2 ms
  - Integration: ~2600 ms (w/ pings)
Integration messages:
  - Assuming latency data available
  - 2 x n (routing and objects)
  - 16M (directed multicast notification) (M 3)


Landmark Routing on P2P

Brocade
  - Exploit non-uniformity
  - Minimize wide-area routing hops / bandwidth
Secondary overlay on top of Tapestry
  - Select super-nodes by admin. domain
    - Divide network into cover sets
  - Super-nodes form secondary Tapestry
    - Advertise cover set as local objects
  - Routing (A -> B) uses brocade to route directly into B’s local network

Brocade Mechanisms

Selective utilization
  - Nodes cache local cover set
  - Only utilize brocade if dest. not in cache
Forwarding messages to supernodes
  1. Super-node does IP-snooping
  2. Direct: cover set caches supernode
Inter-domain routing: A -> B
  1. A -> SN(A) via IP
  2. SN(A) finds SN(B) via Tapestry location
  3. SN(B) -> B via Tapestry/Chord/Pastry/CAN
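The three inter-domain legs, plus the local cover set shortcut, can be written down as a small Python function. This is a sketch under assumptions: the function name brocade_route, the dict-based lookups, and the leg labels are hypothetical, and the super-node lookup is a plain dictionary here, whereas the slide performs it through Tapestry location on the secondary overlay.

```python
def brocade_route(src, dest, domain_of, supernode_of, local_cover_set):
    """Return the sequence of legs a message takes from src to dest.

    domain_of:       dict node -> administrative domain
    supernode_of:    dict domain -> that domain's super-node
    local_cover_set: set of nodes src has cached as local
    """
    if dest in local_cover_set:
        # Selective utilization: local destinations skip brocade entirely.
        return [(src, dest, "tapestry")]
    sn_a = supernode_of[domain_of[src]]
    sn_b = supernode_of[domain_of[dest]]
    return [
        (src, sn_a, "ip"),          # 1. A -> SN(A) via IP
        (sn_a, sn_b, "brocade"),    # 2. SN(A) reaches SN(B) on the secondary overlay
        (sn_b, dest, "tapestry"),   # 3. SN(B) -> B via the local overlay
    ]
```

Only the middle leg crosses the wide area, which is the point of the design: the many small overlay hops happen inside the destination domain.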

Brocade Routing RDP

[Figure: Brocade Latency RDP 3:1. x-axis: Interdomain-adjusted Latency on Optimal Route (2 to 26); y-axis: Relative Delay Penalty (0 to 5); series: Original Tapestry, IP Snooping Brocade, Directed Brocade]

Local cover set cache on; interdomain:intradomain = 3:1
Packet simulator, Transit-stub, 4096 T nodes, 16 SuperN

Brocade Bandwidth Usage

[Figure: Brocade Aggregate Bandwidth Usage. x-axis: Physical Hops in Optimal Route (2 to 14); y-axis: Approx. BW per Message (0 to 60); series: Original Tapestry, IP Snooping Brocade, Directed Brocade]

Local cover set cache on
B/W unit: (sizeof (Msg) * Hops)

Ongoing / Future Work

Fill in full functionality
  - Fault-handling policies, introspection, self-repair
More realistic experiments
  - Artificial topologies on SOSS simulator
  - Larger scale dynamic integration experiments
Code development
  - External deployment / Code release
    - Sprint programmable routers
    - Academic networks
  - Introspective measurement platform
  - Implementing applications (Bayeux, Brocade … )

For More Information

Tapestry and related projects (and these slides):
  http://www.cs.berkeley.edu/~ravenben/tapestry
OceanStore:
  http://oceanstore.cs.berkeley.edu
Related papers:
  http://oceanstore.cs.berkeley.edu/publications
  http://www.cs.berkeley.edu/~ravenben/publications

ravenben@eecs.berkeley.edu