Formalizing Peer-to-Peer Systems based on Content Addressable Network

Adrian Iftene, Gabriel Ciobanu

"Al. I. Cuza" University, Faculty of Computer Science

General Berthelot, 16, 700483, Iasi

Proceedings of ICCCC 2006, Băile Felix Spa-Oradea, Romania 1-3 June 2006

Abstract

Peer-to-peer systems architecture facilitates fault tolerance, availability, scalability and performance.

We present a formal description of a peer-to-peer system implemented with the Content Addressable Network (CAN) architecture.

We describe this system using a variant of the distributed π-calculus called P2P π-calculus.

We define a new bisimulation for well-located systems, called go-bisimulation, and give a few related results.

Peer-to-Peer Systems

We consider CAN as part of a virtual d-dimensional Cartesian coordinate space. This is a logical space, and it is not related to a specific physical system.

Figure: a 2-dimensional space [0, 100] x [0, 100] with corners (0,0), (100,0), (0,100) and (100,100), partitioned among 5 nodes:

A: (0-50, 0-50)
B: (50-100, 0-50)
C: (0-50, 50-100)
D: (50-75, 50-100)
E: (75-100, 50-100)

CAN - Movement

Two nodes are neighbours if their coordinates coincide along (d-1) dimensions and differ in only one dimension. A route from the source node to the destination node follows the straight path through the Cartesian space.

Figure: example layout of zones 1 to 6, with points k and j. Neighbours for 1 = {2, 3, 4, 5}; neighbours for 7 = {}.
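The neighbour relation can be sketched in Python for zones stored as per-dimension (min, max) intervals. This is only an illustration under an assumption: it reads the definition in the usual zone-based way, where two zones are neighbours when their intervals overlap along d-1 dimensions and abut along exactly one. The names `Zone`, `ZONES`, `are_neighbours` and `neighbours_of` are hypothetical; the zone table reuses the five zones A to E of the 2-dimensional example.

```python
from typing import Dict, List, Tuple

# A zone is a list of (min, max) intervals, one per dimension (hypothetical encoding).
Zone = List[Tuple[float, float]]

# The five zones of the 2-dimensional example space [0, 100] x [0, 100].
ZONES: Dict[str, Zone] = {
    "A": [(0, 50), (0, 50)],
    "B": [(50, 100), (0, 50)],
    "C": [(0, 50), (50, 100)],
    "D": [(50, 75), (50, 100)],
    "E": [(75, 100), (50, 100)],
}


def are_neighbours(z1: Zone, z2: Zone) -> bool:
    """Assumed rule: overlap along d-1 dimensions, abut along exactly one."""
    abutting = 0
    for (lo1, hi1), (lo2, hi2) in zip(z1, z2):
        if hi1 == lo2 or hi2 == lo1:
            abutting += 1          # the zones touch along this dimension
        elif min(hi1, hi2) - max(lo1, lo2) > 0:
            continue               # the intervals share a segment of positive length
        else:
            return False           # the zones are disjoint along this dimension
    return abutting == 1


def neighbours_of(name: str) -> List[str]:
    """Names of all zones in ZONES that are neighbours of the given zone."""
    return sorted(n for n, z in ZONES.items()
                  if n != name and are_neighbours(ZONES[name], z))
```

With this rule, A and D are not neighbours because they only meet at the corner point (50, 50), while B is a neighbour of both D and E. The sketch ignores any wrap-around of the coordinate space.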

CAN - Insertion

CAN Insertion starts when a new node looks for a source node which is already in CAN.

After that, the new node learns the number of CAN dimensions and generates a d-point in this space (this point is called the destination node). A route from the source node to the destination node follows the straight path through the Cartesian space.

The destination node splits its zone in half, and assigns one half to the new joining node.

Finally, the neighbours of the split zone must be notified about the new node.
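The split step can be sketched as follows. The source only says that the zone is split in half; choosing the longest dimension as the one to cut is an assumption made here for illustration, and `Zone` and `split_zone` are hypothetical names.

```python
from typing import List, Tuple

# A zone is a list of (min, max) intervals, one per dimension (hypothetical encoding).
Zone = List[Tuple[float, float]]


def split_zone(zone: Zone) -> Tuple[Zone, Zone]:
    """Split a zone in half along its longest dimension (assumed policy)."""
    # Pick the dimension with the largest extent; ties go to the lowest index.
    k = max(range(len(zone)), key=lambda i: zone[i][1] - zone[i][0])
    lo, hi = zone[k]
    mid = (lo + hi) / 2
    lower, upper = list(zone), list(zone)
    lower[k] = (lo, mid)   # half kept by the node that owned the zone
    upper[k] = (mid, hi)   # half that can be handed to the joining node
    return lower, upper
```

Splitting the whole space [0, 100] x [0, 100] and then splitting its lower half again yields the zones (0-50, 0-50) and (0-50, 50-100), which match zones A and C of the 2-dimensional example. Notifying the neighbours of the split zone is not modelled here.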

CAN – Insertion – Sample

Figure sequence: panels of a 2-dimensional space with both axes running from 0 to 7, showing the successive insertion of Node 1 through Node 6.

A node can refuse to split if it thinks its neighbours should be split instead; this keeps the load balanced.

Representation

A CAN region $HP$ can be uniquely identified by its associated d-dimensional hyper-parallelepiped and by its direct neighbours $HP_1, HP_2, \ldots, HP_{nv}$. A hyper-parallelepiped $HP \subseteq S \subseteq \mathbb{R}^d$ has the shape

$$[min_1, max_1] \times [min_2, max_2] \times \cdots \times [min_d, max_d],$$

where
- $S$ denotes the initial region associated with the whole network,
- $nv$ represents the number of direct neighbours of the current region $HP$.

In this d-dimensional space a hyper-point has the form $p = (p^1, p^2, \ldots, p^d)$. From now on, region denotes the associated hyper-parallelepiped, and point denotes the hyper-point given by the center of a region.

In this space $S$, the movement from a region $HP_c$ to a point $p_{dn}$ requires the calculation of a minimal distance from a point to a region. This distance is calculated as the distance from the point to the center $cHP_c$ of the corresponding region. The center and the distance are given by

$$cHP_c = \left(\frac{min_1^c + max_1^c}{2}, \frac{min_2^c + max_2^c}{2}, \ldots, \frac{min_d^c + max_d^c}{2}\right)$$

$$d(p_{dn}, cHP_c) = \sqrt{(p_{dn}^1 - cHP_c^1)^2 + (p_{dn}^2 - cHP_c^2)^2 + \cdots + (p_{dn}^d - cHP_c^d)^2}$$
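The center of a region and the distance from a point to that center can be computed as in the short sketch below. It assumes that the distance is the Euclidean one and that a region is stored as a list of (min, max) intervals; the names `Zone`, `center` and `distance` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# A region is a list of (min, max) intervals, one per dimension (hypothetical encoding).
Zone = List[Tuple[float, float]]


def center(zone: Zone) -> List[float]:
    """Center of a region: the midpoint of each of its d intervals."""
    return [(lo + hi) / 2 for lo, hi in zone]


def distance(point: Sequence[float], zone: Zone) -> float:
    """Euclidean distance (assumed metric) from a point to the center of a region."""
    c = center(zone)
    return math.sqrt(sum((pk - ck) ** 2 for pk, ck in zip(point, c)))
```

For the region D = (50-75, 50-100) of the 2-dimensional example, `center` returns `[62.5, 75.0]`.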

Problem

At a given moment we are in a region $HP_c$ (which contains our source node $p_s$), and we want to move to a region $HP_{dn}$ (which contains our destination node $p_{dn}$). We take all the neighbours $HP_1, HP_2, \ldots, HP_{nv}$ of $HP_c$, and we calculate $d(p_{dn}, cHP_i)$ for each of them. We move from the current region $HP_c$ to the region $HP_{min}$ which satisfies:

$$d(p_{dn}, cHP_{min}) = \min\{\, d(p_{dn}, cHP_i) \mid 1 \le i \le nv \,\}$$

This is the ideal situation, in which there are no dead or overloaded regions on the path from the source node to the destination node.

Such failure regions do not respond to the identification messages, and they should be by-passed on the way to the destination.
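A single routing step, including a simple by-pass of failure regions, might look like the sketch below. The source does not specify how failure regions are by-passed; here they are simply excluded from the candidate neighbours, which is an assumption. The names `center`, `next_hop` and the `failed` parameter are hypothetical.

```python
import math
from typing import AbstractSet, Dict, List, Sequence, Tuple

# A region is a list of (min, max) intervals, one per dimension (hypothetical encoding).
Zone = List[Tuple[float, float]]


def center(zone: Zone) -> List[float]:
    """Center of a region: the midpoint of each interval."""
    return [(lo + hi) / 2 for lo, hi in zone]


def next_hop(dest: Sequence[float],
             neighbours: Dict[str, Zone],
             failed: AbstractSet[str] = frozenset()) -> str:
    """Return the responsive neighbour whose center is closest to dest."""
    # Assumed by-pass policy: regions that do not respond are not candidates.
    alive = {name: zone for name, zone in neighbours.items() if name not in failed}
    if not alive:
        raise ValueError("no responsive neighbour")
    return min(alive, key=lambda name: math.dist(dest, center(alive[name])))
```

From region A of the 2-dimensional example, with neighbours B and C and destination point (90, 60), the closest center is that of B; if B is marked as failed, the step falls back to C.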

Syntax and semantics of P2P π-calculus

Distributed π-calculus (Dπ) is an extension of the π-calculus with explicit notions for localization and code migration [5].

In Dπ, communication is local, and messages to remote resources have a clear itinerary. We adapt the distributed π-calculus [6] and modify it by including elements specific to P2P; the resulting calculus is called P2P π-calculus.

Some elements of this new calculus are taken from a process algebra used in the PEPITO project (http://www.sics.se/pepito) [2]; this part is used to describe the static aspects of the network.

We use additional elements from the distributed π-calculus to describe the dynamic part of the network.

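Location confluence

For every current node $HP_c$ (initially we have $HP_c = HP_s$) we build the process expressions which describe the following steps:

1. $P_0(HP_c) : min := c$
2. $P_i(HP_c) : HP_c :: go\ HP_i \,.\, \mathbf{if}\ (d(p_{dn}, HP_i) < d(p_{dn}, HP_{min}))\ \mathbf{then}\ min := i \,.\, Q\ \mathbf{else}\ 0$
3. $Q : \mathbf{if}\ (p_{dn} \in HP_{min})\ \mathbf{then}\ 0\ \mathbf{else}\ P_0(HP_{min})$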

We denote this movement by $HP \xrightarrow{go,p} HP'$, and a sequence of movements by $HP \xrightarrow{go,p*} HP'$.

Proposition 1 (Location confluence). If $HP_c \xrightarrow{go,p*} HP_1$ and $HP_c \xrightarrow{go,p*} HP_2$, and there is an optimal path to the destination node $p$ belonging to a region $HP_d$, then we have $HP_1 \xrightarrow{go,p*} HP_d$ and $HP_2 \xrightarrow{go,p*} HP_d$.
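One way to read the three steps is as a greedy loop: start with the current region as the best candidate, compare every neighbour against it, then stop if the destination lies in the chosen region or repeat from there. The sketch below follows that reading, which is an interpretation and not the authors' process semantics. The zone and neighbour tables come from the 2-dimensional example with nodes A to E; the names `route` and `contains`, and the guard that stops when no neighbour improves the distance, are additions made here.

```python
import math
from typing import Dict, List, Sequence, Tuple

Zone = List[Tuple[float, float]]  # (min, max) per dimension, hypothetical encoding

ZONES: Dict[str, Zone] = {
    "A": [(0, 50), (0, 50)],
    "B": [(50, 100), (0, 50)],
    "C": [(0, 50), (50, 100)],
    "D": [(50, 75), (50, 100)],
    "E": [(75, 100), (50, 100)],
}
NEIGHBOURS: Dict[str, List[str]] = {
    "A": ["B", "C"], "B": ["A", "D", "E"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["B", "D"],
}


def center(zone: Zone) -> List[float]:
    return [(lo + hi) / 2 for lo, hi in zone]


def contains(zone: Zone, point: Sequence[float]) -> bool:
    return all(lo <= x <= hi for (lo, hi), x in zip(zone, point))


def route(start: str, dest: Sequence[float]) -> List[str]:
    """Greedy sequence of go-movements from region `start` towards point `dest`."""
    path, current = [start], start
    while not contains(ZONES[current], dest):       # step 3: stop once dest is reached
        best = current                              # step 1: min := c
        for n in NEIGHBOURS[current]:               # step 2: visit each neighbour
            if (math.dist(dest, center(ZONES[n]))
                    < math.dist(dest, center(ZONES[best]))):
                best = n
        if best == current:
            break                                   # guard (not in the source): no progress
        current = best
        path.append(current)
    return path
```

For example, `route("A", (90, 60))` visits A, then B, then E, and `route("C", (60, 90))` visits C, then D.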

Go-bisimulation - Definitions

Definition 1 (go-LTS). A go-Labelled Transition System (go-LTS) is a pair $(Q, T)$ composed of a set of regions $Q = \{HP_0, HP_1, \ldots\}$ and a transition relation $T \subseteq Q \times Points \times Q$. A transition $(HP, p, HP') \in T$ is denoted by $HP \xrightarrow{go,p} HP'$.

Definition 2 (go-Bisimulation).

1. Let $(Q, T)$ be a go-LTS and $S \subseteq Q \times Q$. Then $S$ is a go-simulation over $(Q, T)$ related to $p$ if for every $HP\ S\ HP'$: if $HP \xrightarrow{go,p} HP_1$, then there exists $HP'_1$ such that $HP' \xrightarrow{go,p} HP'_1$ and $HP_1\ S\ HP'_1$. We say that $HP'$ is a go-simulation for $HP$ related to $p$.

2. Let $B$ be a binary relation over $Q$. $B$ is a go-bisimulation related to $p$ if both $B$ and $B^{-1}$ are go-simulations related to $p$. We say that $HP$ and $HP'$ are go-bisimilar related to $p$, and write $HP \sim_p HP'$, if there exists a go-bisimulation $B$ related to $p$ such that $HP\ B\ HP'$.
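On a finite go-LTS the two definitions can be checked mechanically. The sketch below assumes regions are represented by strings and transitions by triples (HP, p, HP'); it uses the standard simulation clause, in which the two successor regions must again be related. All function names are hypothetical.

```python
from typing import Set, Tuple

Transition = Tuple[str, str, str]   # (HP, p, HP'), hypothetical encoding
Relation = Set[Tuple[str, str]]


def successors(T: Set[Transition], hp: str, p: str) -> Set[str]:
    """All regions reachable from hp by one go-movement towards p."""
    return {target for (source, point, target) in T if source == hp and point == p}


def is_go_simulation(S: Relation, T: Set[Transition], p: str) -> bool:
    """Every move of the left region must be matched by the right region, within S."""
    for hp, hp_prime in S:
        for hp1 in successors(T, hp, p):
            if not any((hp1, hp1_prime) in S
                       for hp1_prime in successors(T, hp_prime, p)):
                return False
    return True


def is_go_bisimulation(B: Relation, T: Set[Transition], p: str) -> bool:
    """B is a go-bisimulation if both B and its inverse are go-simulations."""
    inverse = {(b, a) for (a, b) in B}
    return is_go_simulation(B, T, p) and is_go_simulation(inverse, T, p)
```

As a small usage example, with transitions from HP1 and from HP2 to HP3 for the same point p, the relation {(HP1, HP2), (HP3, HP3)} passes the check, while {(HP1, HP3)} fails because HP3 has no movement matching the one from HP1.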

Go-bisimulation - 1

Proposition 2.

1. $\sim_p$ is an equivalence relation.
2. $\sim_p$ is itself a go-bisimulation related to $p$.

Proof.

(1) For reflexivity, it is enough to prove that the identity $Id_Q = \{(HP, HP) \mid HP \in Q\}$ is a go-bisimulation related to $p$. Because $Id_Q = Id_Q^{-1}$, it is enough to prove that $Id_Q$ is a go-simulation related to $p$. This means that for every $HP$ with $HP\ Id_Q\ HP$, if $HP \xrightarrow{go,p} HP_1$ then there exists $HP'_1$ such that $HP \xrightarrow{go,p} HP'_1$ and $HP_1\ Id_Q\ HP'_1$. ($HP'_1 = HP_1$ satisfies both conditions.)

For symmetry, we must prove that if $B$ is a go-bisimulation, then $B^{-1}$ is also a go-bisimulation. Since $B$ is a go-bisimulation, both $B$ and $B^{-1}$ are go-simulations. This is equivalent to the fact that $B^{-1}$ and $(B^{-1})^{-1}$ are go-simulations. Therefore $B^{-1}$ is a go-bisimulation.

For transitivity, we must prove that if $B_1$ and $B_2$ are go-bisimulations, then their composition $B_1 B_2 = \{(HP, HP'') \mid \exists HP' \text{ such that } HP\ B_1\ HP' \text{ and } HP'\ B_2\ HP''\}$ is also a go-bisimulation. Let $(HP, HP'') \in B_1 B_2$ and $HP \xrightarrow{go,p} HP^{\#}$. Since there exists $HP'$ such that $HP\ B_1\ HP'$ and $HP'\ B_2\ HP''$, there exists ${HP'}^{\#}$ such that $HP' \xrightarrow{go,p} {HP'}^{\#}$ and $HP^{\#}\ B_1\ {HP'}^{\#}$, and also ${HP''}^{\#}$ such that $HP'' \xrightarrow{go,p} {HP''}^{\#}$ and ${HP'}^{\#}\ B_2\ {HP''}^{\#}$. Thus $(HP^{\#}, {HP''}^{\#}) \in B_1 B_2$, which means that $B_1 B_2$ is a go-simulation. The same argument applied to $(B_1 B_2)^{-1} = B_2^{-1} B_1^{-1}$ shows that the inverse is a go-simulation as well, so $B_1 B_2$ is a go-bisimulation.
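For a finite set of regions, the relation ~p can be computed by the standard greatest-fixpoint refinement: start from all pairs of regions and repeatedly discard pairs in which one region has a go-movement the other cannot match. This is a generic technique sketched under the assumption of a finite go-LTS, not a procedure given in the presentation; the names `go_bisimilarity` and `succ` are hypothetical.

```python
from typing import Set, Tuple

Transition = Tuple[str, str, str]   # (HP, p, HP'), hypothetical encoding


def go_bisimilarity(Q: Set[str], T: Set[Transition], p: str) -> Set[Tuple[str, str]]:
    """Largest go-bisimulation related to p on a finite go-LTS (Q, T)."""

    def succ(hp: str) -> Set[str]:
        return {t for (s, point, t) in T if s == hp and point == p}

    R = {(a, b) for a in Q for b in Q}      # start from all pairs

    def matched(a: str, b: str) -> bool:
        # Each move of a is answered by b, and each move of b by a, within R.
        forth = all(any((a1, b1) in R for b1 in succ(b)) for a1 in succ(a))
        back = all(any((a1, b1) in R for a1 in succ(a)) for b1 in succ(b))
        return forth and back

    while True:
        bad = {(a, b) for (a, b) in R if not matched(a, b)}
        if not bad:
            return R                        # stable: R is a go-bisimulation
        R.difference_update(bad)
```

The result is reflexive, symmetric and transitive on the examples tried, in line with Proposition 2. Regions without any go-movement towards p are all related to each other.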

Go-bisimulation - 2

(2) Let $HP \sim_p HP'$. Then, by definition, we have $HP\ B\ HP'$ for a go-bisimulation $B$ related to $p$. Accordingly, if $HP \xrightarrow{go,p} HP_1$, then there exists $HP'_1$ for which we have $HP' \xrightarrow{go,p} HP'_1$ and $HP_1\ B\ HP'_1$. From this we infer that $HP_1 \sim_p HP'_1$. Thus $\sim_p$ is a go-simulation, and its inverse is also a go-simulation.

The failure regions can evolve in time, so we can speak about static and dynamic failure regions. Static failure regions which can be by-passed preserve $\sim_p$ as an equivalence relation. Dynamic failure regions require a re-verification of the go-bisimulation. One situation is given by nodes which previously satisfied the go-bisimulation, but are now down and induce failure regions; we should then find other regions able to preserve the go-bisimulation relation. Another situation is when a failure region becomes active. The go-bisimulation relation can remain the same, but it is also possible that the former paths are no longer optimal, and we could define new go-bisimulation relations.

Conclusions and Future Work

In this article we present a location confluence result for a specific P2P architecture called CAN, with respect to the insertion of a new P2P node, and we define go-bisimulation using a formalism called P2P π-calculus. We formalize the movement in an ideal P2P system.

We then describe what should be done to avoid failure regions. This presentation is a first step in formalizing P2P systems based on the CAN architecture. We intend to extend it by adding timers in order to catch the failure situations in which a node does not respond for a certain period of time.

Bibliography

1. S. Androutsellis-Theotokis and S. Spinellis, "A Survey of Peer-to-Peer File Sharing Technologies," Athens University of Economics and Business, White Paper (WHP-02-03), 2002.

2. J. Borgstrom and U. Nestmann, "Verifying a Structured Peer-to-Peer Overlay Network: The Static Case," Technical Report, EU Project PEPITO, IC, 2004.

3. M. Hennessy and J. Riely, "Information Flow vs. Resource Access in the Asynchronous π-calculus," ACM Transactions on Programming Languages and Systems, Vol. 24, No. 5, pp. 566-591, 2002.

4. A. Ingolfsdottir, "Semantic Models for Communicating Processes with Value Passing," University of Sussex, Technical Report 8/94, 1994.

5. R. Milner, "Communicating and Mobile Systems: the π-calculus," Cambridge University Press, 1999.

6. R. Milner, J. Parrow, and D. Walker, "A Calculus of Mobile Processes, Part I/II," Information and Computation, 100, pp. 1-77, 1992.

7. S. Ratnasamy et al., "A Scalable Content Addressable Network," Proceedings SIGCOMM, ACM Press, pp. 161-172, 2001.

Thank You!
