FOUNDATION OF PEER-TO-PEER SYSTEMS
LECT-10, S-39FP2P08, [email protected]
Javed I. Khan@2008
Pastry Advanced Idea: Proximity Routing
Teaching Note
• [Euclidean distance]
• [notion of progressive distance: the rarer the address, the
farther we have to travel]
• [village, earth, orbit, solar system, constellation analogy]
Pastry: Proximity Routing
• The basic routing of Pastry is based on the numerical proximity
of source and destination nodeIds, and its cost is counted in
Pastry routing hops.
• The routing shown so far is complete: it always finds the node
whose nodeId is numerically closest to the key. The expected
number of Pastry hops is O(log n).
• It may still be far from optimal in terms of physical routing
hops and distance.
• However, Pastry’s routing efficiency can be improved by taking a
second notion of proximity into account.
Internet Distance vs. Identifier Distance
[Figure: nodes labeled by nodeId (labels include 65a1fc, d46a1c,
d467c4), contrasting Internet distance with identifier distance.]
Scalar Proximity Metric
• Various measures of distance can serve as a scalar proximity
metric, provided the metric is Euclidean or obeys the triangle
inequality.
• A useful metric at hand is Internet distance, which can be
approximated by quantities such as ping delay, number of IP hops,
etc.
• A node can probe its Internet distance to any other node.
• Note: Internet distance does not strictly obey the triangle
inequality, yet it comes close enough for proximity routing to
take advantage of it.
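A minimal sketch of how one might check the triangle inequality on
probed distances. The node names and ping delays are made up for
illustration, and triangle_violations is a hypothetical helper, not
part of Pastry.

```python
from itertools import permutations

def triangle_violations(dist):
    """Return triples (a, b, c) where the direct distance a->c exceeds a->b->c."""
    nodes = list(dist)
    return [
        (a, b, c)
        for a, b, c in permutations(nodes, 3)
        if dist[a][c] > dist[a][b] + dist[b][c]
    ]

# Hypothetical symmetric ping delays in milliseconds (assumed values).
delays = {
    "n1": {"n1": 0, "n2": 10, "n3": 50},
    "n2": {"n1": 10, "n2": 0, "n3": 20},
    "n3": {"n1": 50, "n2": 20, "n3": 0},
}

# The direct n1-n3 delay (50) is larger than the detour via n2 (10 + 20),
# so this triple is reported in both directions.
violations = triangle_violations(delays)
print(violations)
```

With real measurements, a small number of such violations is expected;
the point of the slide is that they are rare enough not to spoil
proximity routing.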
Analysis of Proximity Routing
• Generally, multiple nodes share the same prefix with a given
node.
• Thus, for each routing table entry there are multiple candidate
nodes to choose from.
• A node cannot be guaranteed to always find the closest node for
a particular prefix, but its choice can improve over time: as it
comes into contact with more nodes, it keeps replacing entries
with closer nodes.
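The replacement rule above can be sketched as follows. This is an
illustration under assumptions: the table layout (a dict keyed by row
and next digit), the helper names, and the probe delays are all
hypothetical.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that two equal-length hex nodeIds share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def offer_candidate(table, local_id, candidate_id, distance):
    """Store candidate_id in its slot if the slot is empty or the candidate is closer.

    table maps (row, next_digit) -> nodeId; distance(node_id) stands in for
    a proximity probe such as a ping (an assumption of this sketch).
    """
    row = shared_prefix_len(local_id, candidate_id)
    if row == len(local_id):
        return False  # the candidate is the local node itself
    slot = (row, candidate_id[row])
    current = table.get(slot)
    if current is None or distance(candidate_id) < distance(current):
        table[slot] = candidate_id
        return True
    return False

# Made-up probe results; all three candidates compete for row 0, column 'd'.
probe = {"d13da3": 40.0, "d4213f": 15.0, "d462ba": 22.0}
table = {}
for node in ["d13da3", "d4213f", "d462ba"]:
    offer_candidate(table, "65a1fc", node, probe.__getitem__)
print(table)
```

Each newly met node either fills an empty slot or displaces a farther
node, so the table can only get closer to the invariant over time.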
Proximity Invariant
• Ideally, let us assume that the following proximity property
holds for each routing table:
• Proximity Invariant: Each routing table entry refers to a node
that is close to the local node in the proximity space, among all
nodes with the appropriate nodeId prefix.
Two Questions
• Question 1: Can this invariant be preserved by any practical
incremental routing table construction and maintenance process?
• Question 2: If this invariant is maintained in each routing
table, can a packet be forwarded all the way to the right nodeId,
yet efficiently in the scalar proximity metric space?
Concept: Prefix Length & Euclidean Distance
• A node that shares a longer prefix with the current node tends
to be farther away from the current node.
• With each additional matching digit, the matching nodes become
rarer. Thus, assuming nodeIds are uniformly distributed in the
proximity space, the expected distance increases exponentially
with prefix length.
[Figure: routing table of a node whose nodeId starts with 65a1,
showing rows 0 to 3 of log16 N rows. Row 0 holds entries 0x to fx,
row 1 holds 60x to 6fx, row 2 holds 650x to 65fx, and row 3 holds
65a0x to 65afx.]
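A rough numeric sketch of why the distance grows exponentially with
prefix length. It assumes nodes are scattered as a uniform (Poisson)
field in a two-dimensional proximity space, where the expected distance
to the nearest point of density lambda is 1 / (2 * sqrt(lambda)). The
function name, the unit area, and the two-dimensional model are
assumptions of this sketch, not statements from the slides.

```python
import math

def expected_nearest_distance(n_nodes, prefix_len, b=4, area=1.0):
    """Expected distance to the nearest node matching a prefix of prefix_len digits.

    Only about n_nodes / (2**b)**prefix_len nodes match such a prefix, so
    their density drops by a factor of 2**b per digit (2-D Poisson model).
    """
    density = n_nodes / (2 ** b) ** prefix_len / area
    return 1.0 / (2.0 * math.sqrt(density))

distances = [expected_nearest_distance(100_000, k) for k in range(1, 5)]
ratios = [distances[i + 1] / distances[i] for i in range(len(distances) - 1)]
print(ratios)  # each extra digit multiplies the expected distance by sqrt(16) = 4
```

Under these assumptions each additional matching digit makes candidates
16 times rarer and therefore about 4 times farther away.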
Flashback: Route Table Construction
• X borrows A’s neighborhood set
• X’s leaf set is derived from Z’s leaf set
• X0 is set to A0 (Xi denotes row i of X’s routing table)
• X1 is set to B1, X2 is set to C2, …
[Figure: join message path from X through A, B, and C to Z.]
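The row-copying step (X0 from A0, X1 from B1, X2 from C2) can be
sketched as below. The list-of-rows table layout, the function name,
and the short example nodeIds are hypothetical choices for this
illustration.

```python
def initial_routing_table(path_tables):
    """Build the joining node's first routing table from the join path.

    path_tables[i] is the routing table (a list of rows, each a dict from
    next digit to nodeId) of the i-th node on the path. That node shares
    at least i digits with the joining node, so its row i is reused as row i.
    """
    table = []
    for i, node_table in enumerate(path_tables):
        row = dict(node_table[i]) if i < len(node_table) else {}
        table.append(row)
    return table

# Made-up tables for A, B, C on the join path of X = "65a1".
a_table = [{"6": "6b20", "d": "d13d"}]
b_table = [{"d": "d467"}, {"0": "60f1", "5": "6533"}]
c_table = [{"d": "d421"}, {"0": "6011"}, {"0": "6502", "a": "65a9"}]

x_table = initial_routing_table([a_table, b_table, c_table])
print(x_table)
```

Here X ends up with A's row 0, B's row 1, and C's row 2; the leaf set
and neighborhood set are handled separately and are not modeled.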
Node Join Message Path in Euclidean Space
[Figure: join message path of new node X through nodes A, B, C,
and D, drawn in Euclidean space.]
Routes in NodeId Space vs. Proximity Space
[Figure: Route(d46a1c) drawn twice. NodeId space panel: nodes
65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1 and key d46a1c.
Proximity space panel: nodes 65a1fc, d13da3, d4213f, d462ba,
d467c4.]
Q1: Preservation of Invariant (approximately)
• Consider row one of X’s routing table, which is obtained from
node B. The entries in this row are near B; however, it is not
clear how close B is to X.
• Intuitively, taking row one of X’s routing table from node B
would appear not to preserve the desired property, since the
entries are close to B but not necessarily to X. Right? Not
exactly!
• In reality, the entries tend to be reasonably close to X.
Recall that the entries in each successive row are chosen from a
set whose size decreases exponentially. Therefore, the expected
distance from B to its row one entries (B1) is much larger than
the expected distance traveled from node A to B. As a result, B1
is a reasonable choice for X1.
• The same argument applies to each successive row and routing
step.
Q1: Preservation of Invariant (Refinement)
• After X has initialized its state in this fashion, its routing
table and neighborhood set approximate the desired locality
property.
• However, the quality of this approximation must be improved to
avoid cascading errors that could eventually lead to poor route
locality.
• For this purpose, there is a second stage in which X requests
the state from each of the nodes in its routing table and
neighborhood set.
• It then compares the distance of corresponding entries found
in those nodes’ routing tables and neighborhood sets, respectively,
and updates its own state with any closer nodes it finds.
• Also note, the neighborhood set contributes valuable
information in this process, because it maintains and propagates
information about nearby nodes regardless of their nodeId
prefix.
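A minimal sketch of one pass of this second stage, under assumptions:
fetch_state stands in for requesting a node's routing table and
neighborhood set over the network, distance stands in for a proximity
probe, and the table layout and example values are hypothetical.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that two equal-length hex nodeIds share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def refine(local_id, table, fetch_state, distance):
    """Ask every known node for its state and keep any closer node per slot.

    table maps (row, next_digit) -> nodeId. Returns the number of slots
    that were filled or improved during this pass.
    """
    improved = 0
    for peer in list(table.values()):  # snapshot: the table changes below
        for cand in fetch_state(peer):
            if cand == local_id:
                continue
            row = shared_prefix_len(local_id, cand)
            slot = (row, cand[row])
            current = table.get(slot)
            if current is None or distance(cand) < distance(current):
                table[slot] = cand
                improved += 1
    return improved

# Made-up state and distances for a joining node X = "65a1".
remote_state = {"d13d": ["d421", "6712", "65a1"]}
probe = {"d13d": 40.0, "d421": 15.0, "6712": 8.0}
x_table = {(0, "d"): "d13d"}
changed = refine("65a1", x_table, lambda n: remote_state.get(n, []), probe.__getitem__)
print(changed, x_table)
```

In this toy run the nearby node 6712 enters X's table even though it was
never on the join path, which is the role the slide assigns to the
neighborhood set.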
Which Rows to Use?
• [All nodes down the path have a longer and longer prefix match
with X. Thus, X can use not only the i-th row but rows i, i-1,
i-2, ..., 0 from the node at the i-th hop; from the first hop
onward, those nodes have at least a one-digit prefix match with
X. –javed]
• Intuitively, why does incorporating the state of the nodes
listed in the routing and neighborhood tables from stage one
provide good representatives for X?
• The circles show the average distance of the entries from each
node along the route, corresponding to the rows in the routing
table. Observe that X lies within each circle, albeit off-center.
In the second stage, X obtains the state from the entries
discovered in stage one, which lie, on average, on the perimeter
of each respective circle.
• These states must include entries that are appropriate for X,
but were not discovered by X in stage one, due to its off-center
location.
Q2(a): Is a Route Optimum in Proximity Space?
• The entries in the routing table of each Pastry node are
chosen to be close to the present node, according to the proximity
metric, among all nodes with the desired nodeId prefix.
• As a result, in each routing step, a message is forwarded to a
relatively close node with a nodeId that shares a longer common
prefix or is numerically closer to the key than the local node.
• That is, each step moves the message closer to the destination
in the nodeId space, while traveling the least possible distance in
the proximity space.
• Since only local information is used, Pastry minimizes the
distance of the next routing step with no sense of global
direction. This procedure clearly does not guarantee that the
shortest path from source to destination is chosen.
• However, it does give rise to relatively good routes.
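The local routing step described above can be sketched as follows. It
assumes the routing table already satisfies the proximity invariant, so
looking up the slot for the next digit of the key yields a nearby node.
The table layout, the fallback list known_nodes, and the function names
are assumptions of this sketch.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that two equal-length hex nodeIds share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(local_id, key, table, known_nodes=()):
    """Pick the next hop using only local state; None means deliver locally."""
    l = shared_prefix_len(local_id, key)
    if l == len(key):
        return None  # the local nodeId equals the key
    entry = table.get((l, key[l]))
    if entry is not None:
        return entry  # shares at least one more digit with the key
    # Fallback: any known node with an equally long prefix match that is
    # numerically closer to the key than the local node.
    local_gap = abs(int(local_id, 16) - int(key, 16))
    for node in known_nodes:
        if shared_prefix_len(node, key) >= l and abs(int(node, 16) - int(key, 16)) < local_gap:
            return node
    return None

# NodeIds taken from the route figure; the table contents are assumed.
hop1 = next_hop("65a1fc", "d46a1c", {(0, "d"): "d13da3"})
hop2 = next_hop("d13da3", "d46a1c", {(1, "4"): "d4213f"})
hop3 = next_hop("d462ba", "d46a1c", {}, known_nodes=["d467c4"])
print(hop1, hop2, hop3)
```

No step looks beyond the local table, which is why the resulting route
is good but not guaranteed to be the shortest path.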
Q2(a): Is a Route Good in Proximity Space?
• Fact #1: Given that a message was routed from node A to node B
at distance d from A, the message cannot subsequently be routed
to a node at a distance of less than d from A.
• Fact #2: The expected distance traveled by a message during
each successive routing step increases exponentially.
• Implication (no return to a past node): Jointly, these two
facts imply that, although it cannot be guaranteed that the
distance of a message from its source increases monotonically at
each step, a message tends to make larger and larger strides,
with no possibility of returning to a node within d_i of any node
i encountered on the route, where d_i is the distance of the
routing step taken away from node i. (diagram)
• Therefore, the message has nowhere to go but towards its
destination.
• Sample trajectory of a typical message in the Pastry network,
based on experimental data. The message cannot re-enter the circles
representing the distance of each of its routing steps away from
intermediate nodes. Although the message may partly “turn back”
during its initial steps, the exponentially increasing distances
traveled in each step cause it to move toward its destination
quickly.
Q2(b): Is the Route Complete?
• Pastry routes messages towards the node with the nodeId
closest to the key, while attempting to travel the smallest
possible distance in each step. Therefore, among the k numerically
closest nodes to a key, a message tends to first reach a node near
the client.
• But there are two approximations.
– Firstly, Pastry makes only local routing decisions, minimizing
the distance traveled on the next step with no sense of global
direction.
– Secondly, since Pastry routes primarily based on nodeId
prefixes, it may miss nearby nodes with a different prefix than
the key.
• A heuristic detects when a message approaches the set of k
numerically closest nodes, and then switches to routing by
numerically nearest address to locate the nearest replica
(target).
Pastry Performance
Pastry: Average # of Hops
[Figure: average number of hops (y-axis, 0 to 4.5) vs. number of
nodes (x-axis, 1000 to 100000); curves: Pastry and Log(N).
L=16, 100k random queries.]
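For orientation, the Log(N) reference curve can be evaluated at the
three network sizes on the x-axis. This sketch assumes the logarithm is
taken to base 2^b with b = 4 (base 16), matching the b=4 setting quoted
on later slides; the function name is hypothetical.

```python
import math

def log_bound_hops(n_nodes, b=4):
    """Logarithm of n_nodes to base 2**b, the reference for expected hops."""
    return math.log(n_nodes) / math.log(2 ** b)

bounds = {n: round(log_bound_hops(n), 2) for n in (1_000, 10_000, 100_000)}
print(bounds)  # {1000: 2.49, 10000: 3.32, 100000: 4.15}
```

These values fit within the 0 to 4.5 range of the plotted y-axis.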
Pastry: # of Hops (100k nodes)
[Figure: probability (y-axis, 0 to 0.7) vs. number of hops
(x-axis, 0 to 6). Bar labels in order of appearance: 0.0000,
0.0006, 0.0156, 0.1643, 0.6449, 0.1745, 0.0000.
L=16, 100k random queries.]
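As a worked check, the mean hop count can be computed from the bar
labels. This assumes the seven labels belong to hop counts 0 through 6
in the order they appear, which the extracted text suggests but does
not state.

```python
# Assumed mapping of hop count -> probability, read off the bar labels.
hop_probability = {
    0: 0.0000,
    1: 0.0006,
    2: 0.0156,
    3: 0.1643,
    4: 0.6449,
    5: 0.1745,
    6: 0.0000,
}

total = sum(hop_probability.values())
mean_hops = sum(hops * p for hops, p in hop_probability.items())
print(round(total, 4), round(mean_hops, 4))  # 0.9999 3.9768
```

Under that assumption the labels sum to 0.9999 (rounding in the figure)
and the mean is about 3.98 hops, slightly below log16(100000), which is
about 4.15.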
Pastry: Distance traveled
[Figure: relative distance (y-axis, 0.8 to 1.4) vs. number of
nodes (x-axis, 1000 to 100000); curves: Pastry and Complete
routing table. L=16, 100k random queries, Euclidean proximity
space. b=4; |L|=16; |M|=32; 200,000 lookups; random end points.]
Pastry: Locality properties
• 1) The expected distance traveled by a message in the
proximity space is within a small constant factor of the
minimum.
• 2) Routes of messages sent by nearby nodes with the same key
converge at a node near the source nodes.
• 3) Among the k nodes with nodeIds closest to the key, a
message is likely to reach the node closest to the source node
first.
Pastry Delay vs IP Delay
[Figure: distance traveled by Pastry message (y-axis, 0 to 2500)
vs. distance between source and destination (x-axis, 0 to 1400).
Mean = 1.59. GATech top., .5M hosts, 60K nodes, 20K random
messages.]
Quality of Routing Entries
• Routing Effort
– “SL” is a hypothetical method where the joining node considers
only the appropriate row from each node along the route from
itself to the node with the closest existing nodeId (see Section
2.4).
– With “WT”, the joining node fetches the entire state of each
node along the path, but does not fetch state from the resulting
entries. This is equivalent to omitting the second stage.
– “WTF” is the actual method used in Pastry.
• Quality
– Empty: the node found no node (IP address) for that prefix.
– Optimum: the node found the closest node for that prefix.
– Sub-Optimum: the node found a node for that prefix, but not the
closest one.
Quality of Routing Tables
b=4; |L|=16; |M|=32; 5000 New Nodes
Quiz: Can you compare the messaging complexity of the three
schemes: SL, WT, and WTF?
Node Failure
• A 5000-node Pastry network.
• 10% of the nodes fail silently.
• A key is chosen and routing is performed.
Impact of Node Failure
Pastry: # Routing Hops (failures)
[Figure: average hops per lookup (y-axis, 2.6 to 3) for three
cases: No Failure, Failure, After routing table repair. Bar
labels in order of appearance: 2.73, 2.96, 2.74.
L=16, 100k random queries, 5k nodes, 500 failures.]
Next Class: Presentations