A Scalable Content-Addressable Network (CAN) Seminar “Peer-to-peer Information Systems” Speaker Vladimir Eske Advisor Dr. Ralf Schenkel November 2003

Jan 02, 2016


Mariah Mason

A Scalable Content-Addressable Network (CAN)

Seminar “Peer-to-peer Information Systems”

Speaker Vladimir Eske

Advisor Dr. Ralf Schenkel

November 2003


Content

1. Basic architecture

a. Data Model

b. CAN Routing

c. CAN construction

2. Architecture improvements

3. Summary


What is CAN?

The goal was to build a scalable peer-to-peer file distribution system

Napster problem: centralized File Index

• Single point of failure: low data availability

• Not scalable: no way to decentralize it except by building a new system

Gnutella problem: completely decentralized File Index

• Network flooding: low data availability

• Not scalable: no way to group data

CAN - Content-Addressable Network


What is CAN?

CAN is a distributed, Internet-scale hash table.

CAN provides insertion, lookup, and deletion operations on (Key, Value) pairs (K,V), e.g. (file name, file address)

CAN features

• CAN is completely distributed (it does not require any centralized control)

• CAN is scalable: every part of the system maintains only a small amount of control state, independent of the # of parts

• CAN is fault-tolerant (it provides routing even if some part of the system has crashed)


CAN architecture 1

The hash table operates on a d-dimensional Cartesian coordinate space on a d-torus

d-valued hash function: hash(K) = (x1, …, xd)

Cartesian distance

• Cyclic d-dimensional space

Example: $p_1 = 0.2$, $p_2 = 0.8$

$\mathrm{CartDist}(p_1, p_2) = (p_1 - p_2) \bmod 0.5 = (-0.6) \bmod 0.5 = 0.4$

Zone: a chunk of the entire hash table, a piece of the Cartesian space

Coordinate Zone

In a 1-dimensional cyclic space, 0.5 + 0.7 = 0.2 (coordinates wrap around modulo 1)

A Zone is valid if it has a square shape
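
The cyclic distance on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `torus_distance` and the unit side length `size=1.0` are not from the slides, and the Euclidean form of the distance is assumed for d > 1.

```python
import math

def torus_distance(p, q, size=1.0):
    """Euclidean distance between points p and q on a d-torus of side `size`.

    Hypothetical helper: along each dimension the shorter of the two ways
    around the cycle is used.
    """
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b) % size
        delta = min(delta, size - delta)  # go around the cycle if that is shorter
        total += delta * delta
    return math.sqrt(total)

# 1-dimensional example with p1 = 0.2 and p2 = 0.8: the short way around is 0.4
print(torus_distance((0.2,), (0.8,)))
```

In one dimension this agrees with the slide's example: the direct gap between 0.2 and 0.8 is 0.6, but wrapping around the cycle gives 0.4.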


CAN architecture 2

CAN Nodes

Nodes own Zones

• A Node is a machine in the network

• A Node is not a Peer

• Every Node owns one distinct Zone

• A Node stores a chunk of the index (hash table): all objects ([K,V] pairs) that belong to its Zone

• All Nodes together cover the whole space (hash table)


CAN architecture 3

Neighbors in CAN

Two Nodes are neighbors if their Zones overlap along d-1 dimensions and abut along one dimension

• A Node knows the IP addresses of all its neighbor Nodes

• A Node knows the Zone coordinates of all its neighbors

• A Node can communicate only with its neighbors
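
The neighbor rule above can be checked mechanically. The sketch below assumes a Zone is represented as a tuple of `(lo, hi)` intervals, one per dimension, inside a unit space. That representation and the names `overlaps`, `abuts`, and `are_neighbors` are illustrative and not taken from the slides.

```python
def overlaps(a, b):
    """True if 1-D intervals a=(lo, hi) and b=(lo, hi) share a segment of positive length."""
    return a[0] < b[1] and b[0] < a[1]

def abuts(a, b, size=1.0):
    """True if the intervals touch end to start, including wrap-around at `size`."""
    return a[1] % size == b[0] % size or b[1] % size == a[0] % size

def are_neighbors(zone_a, zone_b):
    """Zones are neighbors if they abut along exactly one dimension
    and overlap along all the others."""
    abutting_dims = 0
    for ia, ib in zip(zone_a, zone_b):
        if overlaps(ia, ib):
            continue
        if abuts(ia, ib):
            abutting_dims += 1
        else:
            return False
    return abutting_dims == 1

lower_left = ((0.0, 0.5), (0.0, 0.5))
lower_right = ((0.5, 1.0), (0.0, 0.5))
upper_right = ((0.5, 1.0), (0.5, 1.0))
print(are_neighbors(lower_left, lower_right))  # share an edge
print(are_neighbors(lower_left, upper_right))  # touch only at a corner
```

The interval end points here are exact binary fractions, so comparing them with `==` is safe. A real implementation would need a more careful comparison.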


CAN architecture: Access

How to access a CAN system

1. CAN has an associated DNS domain

2. The CAN domain name is resolved by DNS to the IP addresses of the Bootstrap servers

3. A Bootstrap is a special CAN Node that holds only a list of several Nodes currently in the system

User scenario

1. A user wants to join the system and sends a request using the CAN domain name

2. DNS redirects the request to one of the Bootstraps

3. The Bootstrap sends a list of Nodes to the user

4. The user chooses one of them and establishes a connection

The 3-level access algorithm reduces the failure probability:

• The DNS domain only redirects requests

• Many Bootstraps

• Many Nodes in each Bootstrap list


CAN: routing algorithm

1. Start from some Node

2. P = hash value of the Key

3. Greedy forwarding

Current Node:

1. Checks whether it or one of its neighbors contains the point P

2. IF NOT

a. Orders the neighbors by the Cartesian distance between them and the point P

b. Forwards the search request to the closest one

c. Repeats from step 1 at that Node

3. OTHERWISE

The answer, the (Key, Value) pair, is sent to the user
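
A minimal sketch of greedy forwarding, under simplifying assumptions that are not in the slides: all Zones are equal cells of an m × … × m grid on the torus, a Node is identified by its integer grid coordinates, and the search stops when the current Node itself owns the target cell. The names `torus_dist`, `grid_neighbors`, and `route` are hypothetical.

```python
import math

def torus_dist(p, q, m):
    """Euclidean distance between grid cells p and q on a torus with m cells per side."""
    return math.sqrt(sum(min(abs(a - b), m - abs(a - b)) ** 2 for a, b in zip(p, q)))

def grid_neighbors(node, m):
    """The 2d neighbors of a cell: one step forward and back in every dimension."""
    for dim in range(len(node)):
        for step in (-1, 1):
            nxt = list(node)
            nxt[dim] = (nxt[dim] + step) % m
            yield tuple(nxt)

def route(start, target, m):
    """Greedy forwarding: always hand the request to the neighbor closest to the target."""
    path = [start]
    current = start
    while current != target:
        current = min(grid_neighbors(current, m), key=lambda nb: torus_dist(nb, target, m))
        path.append(current)
    return path

path = route((0, 0), (3, 2), m=8)
print(len(path) - 1, "hops:", path)
```

In this equal-zone model every hop moves one cell closer along some dimension, so the loop always terminates. With failed or unequal Nodes that guarantee would need extra handling.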


CAN: routing algorithm

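The average path length is the average # of hops needed to reach a destination Node

It is computed for the case when:

1. All Zones have the same volume

2. No Node has crashed

Example with $n^{1/d} = 8$ Zones along each dimension:

Total path length = 0 * 1 + 1 * 2d + 2 * 4d + 3 * 6d + 4 * 7d + 5 * 6d + 6 * 4d + 7 * 2d + 8 * 1

In general, for n Nodes:

$$\mathrm{TPL} = 0 \cdot 1 + \sum_{i=1}^{n^{1/d}/2 - 1} i \cdot 2id + \frac{n^{1/d}}{2} \cdot \left(n^{1/d} - 1\right) d + \sum_{i=n^{1/d}/2 + 1}^{n^{1/d} - 1} i \cdot 2\left(n^{1/d} - i\right) d + n^{1/d} \cdot 1$$

$$\text{Avg. path length} = \frac{\mathrm{TPL}\ (\text{Total path length})}{n\ (\text{\# of Nodes})} = \frac{d}{4} \cdot n^{1/d}$$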
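
The closed form (d/4)·n^(1/d) can be checked by brute force on a small equal-zone torus. This sketch assumes that each hop moves exactly one Zone along one dimension. The function name `path_length_stats` and the parameter `m` (Zones per side, i.e. n^(1/d)) are illustrative.

```python
import itertools

def path_length_stats(m, d):
    """Total and average hop count from one Node to every Node of an m^d torus grid."""
    per_dim = [min(k, m - k) for k in range(m)]  # hops needed along one dimension
    total = sum(
        sum(per_dim[c] for c in node)
        for node in itertools.product(range(m), repeat=d)
    )
    return total, total / m ** d

total, average = path_length_stats(m=8, d=2)
print(total, average)  # 64 Nodes: total 256, average 4.0
print((2 / 4) * 8)     # closed form (d/4) * n^(1/d) for d = 2, n = 64
```

For d = 2 and 8 Zones per side the brute-force total is 256, which equals the slide's example sum (4 + 16 + 36 + 56 + 60 + 48 + 28 + 8).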


CAN: routing algorithm

Fault-tolerant routing

1. Start from some Node

2. P = hash value of the Key

3. Greedy forwarding

a. Before sending the request, the current Node checks the availability of its neighbors

b. The request is sent to the best available Node

The destination Node will be reached if at least one path to it exists


CAN construction: New Node arrival 1

A New Node, a server on the Internet, wants to join the system and host a piece of the hash table.

1. The New Node needs to get access to the CAN

2. The system should allocate a piece of the hash table to the New Node

3. The New Node should start working in the system: provide routing

1. Finding an access point

The New Node uses the basic access algorithm described earlier:

• Sends a request to the CAN domain name

• Gets the IP address of one of the Nodes currently in the system

• Connects to this Node


CAN construction: New Node arrival 2

2. Finding a Zone

1. Randomly choose a point P

2. A JOIN request is sent to the Node that owns P

3. The request is forwarded via CAN routing

4. The owner of P splits its Zone in half

• One half is assigned to the New Node

• The other half stays with the Old Node

5. The Zone is split along only one dimension: the greatest dimension with the lowest order

6. Hash table contents associated with the New Node's Zone are moved from the Old Node to the New Node
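
The split and the hand-over of hash table contents can be sketched as follows. Several details are assumptions made for illustration: a Zone is a list of `(lo, hi)` intervals, the split rule is read as "the longest side, ties broken by the lowest dimension index", and the New Node receives the upper half. The names `split_zone`, `contains`, and `join` are hypothetical.

```python
def split_zone(zone):
    """Split a Zone in half along its longest side; returns (old_half, new_half)."""
    lengths = [hi - lo for lo, hi in zone]
    dim = lengths.index(max(lengths))  # index() returns the first, lowest-order maximum
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    old_half, new_half = list(zone), list(zone)
    old_half[dim] = (lo, mid)
    new_half[dim] = (mid, hi)
    return old_half, new_half

def contains(zone, point):
    """True if the point lies inside the half-open box described by the Zone."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))

def join(old_zone, old_store):
    """Split the Old Node's Zone and move the matching (point -> value) entries."""
    old_half, new_half = split_zone(old_zone)
    new_store = {p: v for p, v in old_store.items() if contains(new_half, p)}
    kept_store = {p: v for p, v in old_store.items() if p not in new_store}
    return (old_half, kept_store), (new_half, new_store)

store = {(0.25, 0.25): "file-a", (0.75, 0.5): "file-b"}
old, new = join([(0.0, 1.0), (0.0, 1.0)], store)
print(old)
print(new)
```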


CAN construction: New Node arrival 3

3. Joining the routing

1. The New Node gets a list of neighbors from the Old Node (the previous owner of the split Zone)

2. The Old Node refreshes its list of neighbors:

• Removes the lost neighbors

• Adds the New Node

3. All neighbors get a message to update their neighbor lists:

• Remove the Old Node

• Add the New Node


CAN construction: Node departure 1

1. Node departure

a. If the Zone of one of the neighbors can be merged with the departing Node's Zone to produce a valid Zone, this neighbor handles the merged Zone

b. Otherwise, one of the neighbors handles two different Zones

In both cases (a and b):

1. Data from the departing Node is moved to the receiving Node

2. The receiving Node updates its neighbor list

3. All their neighbors are notified about the changes and update their neighbor lists


CAN construction: Node departure 2

Node crash

1. Every Node periodically sends a message to all its neighbors

2. If a Node does not receive a message from one of its neighbors for a period of time t, it starts the TAKEOVER mechanism

3. It sends a TAKEOVER message to each neighbor of the crashed Node (the neighbor that did not send its periodic message)

4. A neighbor that receives the message compares its own Zone with the Zone of the sender. If its own Zone is smaller, it sends a new TAKEOVER message to all neighbors of the crashed Node

5. The crashed Node's Zone is taken over by the Node that does not get an answer to its message for a period of time t

Data stored on the crashed Node is unavailable until the source owner refreshes the CAN state.


CAN problems

Main problems:

1. Routing latency

a. Path latency: average # of hops per path

b. Hop latency: average real duration of a hop

2. Increasing fault tolerance

3. Increasing data availability

The basic CAN architecture achieves:

1. Scalability and distributed state

2. Increased data availability (compared with Napster and Gnutella)


Content

1. Basic architecture

a. Data Model

b. CAN Routing

c. CAN construction

2. Architecture improvements

a. Path Latency Improvement

b. Hop Latency Improvement

c. Mixed approaches

d. Construction Improvement

3. Summary


Path latency Improvements 1

Realities: multiple coordinate spaces

• Each Node maintains multiple (R) coordinate spaces

• Each coordinate space is called a Reality

• Every Node owns a different Zone in each Reality; all Zones are chosen randomly

• The contents of the hash table are replicated in every Reality

• All Realities have the same # of Zones, the same data, and the same hash function


Path latency Improvements 2

The extended routing algorithm for Realities

1. The destination Zone is the same in all Realities

2. Each Zone can be owned by many Nodes

3. Routing applies the basic algorithm with the following extensions:

a. Every Node on the path checks in which of its Realities the distance to the destination is the smallest

b. The request is forwarded in the best Reality


Path latency Improvements 3

Multi-dimensional coordinate spaces

• The average path length is $O(d \cdot n^{1/d})$

• As the # of dimensions d increases, the average path length decreases

n = 1000, equal Zones

d | Avg. path length
2 | 15
3 | 7.5
5 | 5
10 | 4.95


Path latency Improvements 4

Multiple Dimensions vs. Multiple Realities

Property | Multiple Dimensions | Multiple Realities
Average # of neighbors | O(d) | O(r*d)
Increase in data store size | none | r times
Increase in data availability | none | O(r) times
Total path latency reduction | stronger | strong


Hop latency improvement

RTT CAN routing metrics

1. RTT is the round-trip time (ping)

2. New metric: Cartesian distance + RTT

• The Node chosen as the next hop is the closest to the destination by Cartesian distance

• The RTT between the current Node and the next-hop Node is the smallest among all such optimal Nodes

Number of dimensions | Routing without RTT (ms per hop) | Routing with RTT (ms per hop)
2 | 116.8 | 88.3
3 | 116.7 | 76.1
4 | 115.8 | 71.2
5 | 115.4 | 70.9
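
One possible reading of the combined metric is "closest by Cartesian distance, with ties broken by the smaller RTT". The sketch below implements only that reading. The names `pick_next_hop`, `neighbors`, and `rtt_ms` are hypothetical, and the RTT values are made-up inputs, not measurements from the slides.

```python
import math

def torus_dist(p, q, size=1.0):
    """Euclidean distance on a d-torus of side `size`."""
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b) % size
        delta = min(delta, size - delta)
        total += delta * delta
    return math.sqrt(total)

def pick_next_hop(neighbors, target, rtt_ms):
    """neighbors: name -> Zone centre; rtt_ms: name -> measured round-trip time.

    Sort key: Cartesian distance first, RTT second, so RTT only decides
    between neighbors that are equally close to the target.
    """
    def key(name):
        return (round(torus_dist(neighbors[name], target), 9), rtt_ms[name])
    return min(neighbors, key=key)

neighbors = {"a": (0.25, 0.5), "b": (0.75, 0.5), "c": (0.5, 0.9)}
rtt_ms = {"a": 120.0, "b": 40.0, "c": 10.0}
print(pick_next_hop(neighbors, (0.5, 0.5), rtt_ms))  # "a" and "b" tie on distance; "b" has the lower RTT
```

Other weightings of distance against RTT are possible. This sketch does not claim to reproduce the latencies in the table.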


Mixed Improvement: Overloading Zones 1

Overloading coordinate Zones

• One Zone, many Nodes

• MAXPEERS: the maximum # of Nodes per Zone

• Every Node keeps a list of its Peers (the other Nodes in its Zone)

• The number of neighbors stays the same (O(1) in each direction)

• The general routing algorithm is used (from neighbor to neighbor)


Mixed Improvement: Overloading Zones 2

Extended construction algorithm

New Node A joins the system:

1. It discovers a Zone (owned by Node B)

2. B checks how many Peers it has

3. If it has fewer than MAXPEERS:

a. A is added as a new Peer

b. A gets the lists of Peers and neighbors from B

4. Otherwise:

a. The Zone is split in half

b. The Peer list is split in half too

c. The Peer and neighbor lists are refreshed
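
The join procedure for overloaded Zones can be sketched as below. The data layout (a Zone as a list of `(lo, hi)` intervals, Peers as a list of names) and the choice to place the New Node in the second half after a split are assumptions. The slides do not say which half it joins. `split_longest` and `join_overloaded` are hypothetical names.

```python
def split_longest(zone):
    """Split a Zone in half along its longest side (first such side on ties)."""
    lengths = [hi - lo for lo, hi in zone]
    dim = lengths.index(max(lengths))
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    first, second = list(zone), list(zone)
    first[dim], second[dim] = (lo, mid), (mid, hi)
    return first, second

def join_overloaded(zone, peers, new_node, maxpeers=4):
    """Return the resulting list of (Zone, Peer list) pairs after new_node joins."""
    if len(peers) < maxpeers:
        # Room left: the New Node simply becomes one more Peer of this Zone.
        return [(zone, peers + [new_node])]
    # Zone is full: split the Zone and the Peer list in half.
    first, second = split_longest(zone)
    half = len(peers) // 2
    return [(first, peers[:half]), (second, peers[half:] + [new_node])]

whole = [(0.0, 1.0), (0.0, 1.0)]
print(join_overloaded(whole, ["b", "c"], "a"))
print(join_overloaded(whole, ["b", "c", "d", "e"], "a"))
```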


Mixed Improvement: Overloading Zones 2

Periodic self-updating

1. Periodically, a Node gets the Peer list of each of its neighbors

2. The Node estimates the RTT to every Node in the Peer list

3. The Node chooses the closest Peer Node as its new neighbor in this direction

Approach benefits

• Reduced path latency (reduced # of Zones)

• Reduced hop latency (periodic self-updating)

• Improved fault tolerance and data availability (hash table contents are replicated among several Nodes)

MAXPEERS | Per-hop latency (ms)
1 | 116.4
2 | 92.8
3 | 72.9
4 | 64.4


CAN construction improvements

Uniform partitioning

1. The Node whose Zone is about to be split compares the volume of its Zone with the volumes of its neighbors' Zones

2. The Zone with the largest volume is the one that is split
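
The volume comparison can be sketched in a few lines, assuming Zones are lists of `(lo, hi)` intervals. The names `volume` and `choose_zone_to_split` are illustrative, and preferring the Node's own Zone on ties is an assumption.

```python
import math

def volume(zone):
    """Volume of a box-shaped Zone given as (lo, hi) intervals."""
    return math.prod(hi - lo for lo, hi in zone)

def choose_zone_to_split(own_zone, neighbor_zones):
    """Pick the largest Zone among the Node's own Zone and its neighbors' Zones.

    max() returns the first maximum, so the Node's own Zone wins ties.
    """
    return max([own_zone] + list(neighbor_zones), key=volume)

own = [(0.0, 0.25), (0.0, 0.5)]
neighbors = [[(0.25, 0.5), (0.0, 0.5)], [(0.0, 0.5), (0.5, 1.0)]]
print(choose_zone_to_split(own, neighbors))  # the 0.5 x 0.5 neighbor Zone is the largest
```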


CAN: Summary 1

Total improvement

“bare bones” CAN uses only the basic CAN architecture

“knobs on full” CAN uses most of the additional design features

Parameter | “bare bones” CAN | “knobs on full” CAN
# of dimensions | 2 | 10
MAXPEERS | 0 | 4
RTT-weighted routing metric | OFF | ON
Uniform partitioning | OFF | ON


CAN: Summary 2

Metric | “bare bones” | “knobs on full”
Avg. path length | 142.0 | 4.899
# of neighbors | 4.2 | 24.4
# of peers | 0 | 2.95
Increase in data availability | none | 2.95 times (Zone overloading)
Avg. path latency | 19671 ms | 135 ms


CAN: Summary 3

CAN is a scalable, distributed hash table

CAN provides:

• Dynamic Zone allocation

• A fault-tolerant access algorithm

• A stable, fault-tolerant routing algorithm

There are many improvement techniques which:

• Reduce routing latency

• Increase data availability

• Increase fault tolerance

A scalable, distributed, efficient P2P system was designed and developed


THANK YOU
