Top Banner
Peer to Peer File Sharing Huseyin Ozgur TAN
26

Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer? Every node is designed to(but may not by user choice) provide some service that helps.

Dec 19, 2015

Download

Documents

Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Peer to Peer File Sharing

Huseyin Ozgur TAN

Page 2: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

What is Peer-to-Peer? Every node is designed to(but may not by

user choice) provide some service that helps other nodes in the network to get service

Each node potentially has the same responsibility

Sharing can be in different ways: CPU cycles: SETI@Home Storage space: Napster, Gnutella, Freenet…

Page 3: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

P2P: Why so attractive? Peer-to-peer applications fostered

explosive growth in recent years. Low cost and high availability of large

numbers of computing and storage resources,

Increased network connectivity As long as these issues keep their importance,

peer-to-peer applications will continue to gain importance

Page 4: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Main Design Goals of P2P Ability to operate in a dynamic

environment Performance and scalability Reliability Anonymity: Freenet, Freehaven, Publius Accountability: Freehaven, Farsite

Page 5: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

First generation P2P routing and location schemes Napster, Gnutella, Freenet… Intended for large scale sharing of data

files Reliable content location was not

guaranteed Self-organization and scalability: two

issues to be addressed

Page 6: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Second generation P2P systems Pastry, Tapestry, Chord, CAN… They guarantee a definite answer to a

query in a bounded number of network hops.

They form a self-organizing overlay network.

They provide a load balanced, fault-tolerant distributed hash table, in which items can be inserted and looked up in a bounded number of forwarding hops.

Page 7: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN Content Addressable Network

distributed infrastructure based on hash tables serves for Internet scale networks

Advantages Scalable Fault-tolerant Self orginazing

Page 8: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN Current P2P systems at that time

Napster Gnutella Both not scalable

Napster Central server Expensive (for server) Vulnerable (single point of failure)

Page 9: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN Gnutella

De-centralized file location as well File location is based on location Not scalable

Aim Building a scalable distributed infrastructure

for p2p networks

Page 10: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN Basic operations

Insertion Lookup Deletion

Each node stores a zone of the entire hash table stores information about its neighbors

Page 11: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN - Design Virtual d-dimensional

Cartesian coordinate space on a d-torus

The entire space is partitioned among all nodes

The space is used to store key,value pairs K -> P -> zone

Each node holds info about each of its neigbors for efficient routing

Page 12: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN - Routing Works by following the straight line

path between source and destination Local neighbor state is sufficient for

routing CAN message includes dest. Coordinates A node routes it to the neighbor with

closest to the destination coordinates Complexity

(d/4)(n1/d) avg path length 2d neighbors for a node Good for scalability

Many paths between source and destination

In case of crashes the node routes the message to the next best available path

Page 13: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN - Construction A new node that joins the system must be

allocated its own zone Achieved by an existing node splitting its zone

in half, retaining half and handing the other half to the new node

3 Steps New node must find a node already in CAN It must find a node whose zone will be split The neighbors of split zone must be notified

Page 14: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN - Construction Bootstrap

New node discovers an existing node CAN DNS name resolves to one or more nodes IP addresses

which are believed to be in the system By querying the DNS new node gets a bootstrap node and from

this node gets other nodes IP addresses Finding a zone

Chooses randomly a point P in space and sends JOIN request destined P

Destination splits its zone and key,value pairs are transferred Joining the routing

The new node learns the IP addresses of the neighbors Both new and existing nodes update their neighbor set To inform others two nodes send an immediate update,

followed by periodic refreshes Complexity

O(d)

Page 15: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

CAN – Maintenance Node departure

Explicitly hand over its zone to one of its neighbors If node’s zones can be merged it is done If not, the responsibility is handed to the neighbor whose

current zone is smallest Takeover (a node becomes unreachable)

One of its neighbors takes over the zone But the pairs are lost Nodes send periodic advertisements Absence of such adv. signals its failure

Each neighbor starts a timer When it expires node sends TAKEOVER message to all

neighbors of failed node

Page 16: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord Distributed lookup protocol

One operation : lookup = key -> node Advantages

Load balance Decentralization Scalability Availability Flexible naming

An infrastructure for p2p system

Page 17: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord - Overview Specifies

How to find locations of keys How new nodes join the system How to recover from failure of existing nodes

It uses consistent hashing Improves the scalability of it by avoiding the

requirement that every node know about every other node.

In N-node network each node maintains information about only O (log N)

nodes a lookup requires O (log N) messages Updating info on join and leave requires O (log2 N)

Page 18: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord – Consistent Hashing Consistent hash function each node and key an

m-bit identifier using a base hash function Node’s identifier = Hash of its IP Key’s identifier = Hash of key M must be big enough to make probability of collisions

negligible Assignment of keys to nodes

Identifiers are ordered in an identifier circle Key k is assigned to the first node whose identifier is

equal to or follows k in the identifier space This node is called successor (k)

Page 19: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord – Consistent Hashing Scalable key location

Very small amount of routing information suffices Each node need only beware of its successor

Routing can be done by following successors until the node is found

Not scalable: O(N) time lookup To accelerate Chord maintains additional routing

information Each node, n, maintains a routing table with at most m

entries (finger table) ith entry contains the identity of first node, s, that succeeds

n by at least 2i-1 on the identifier circle i.e. s= successor(n+2i-1)

Page 20: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord

Page 21: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord – Node Joins Main challenge is preserving the ability to

locate every key in the network 2 invariants

Each nodes successor is correctly maintained For every key k, node successor(k) is

responsible for k Also in order to be the lookup fast the

finger tables must be correct

Page 22: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord – Node joins To simplify the join and leave mechanism

each node in Chord maintains a predecessor pointer

3 tasks Init predecessor and fingers of node n Update the finger and predecessors of existing

nodes Notify higher layer so that it can transfer state

associated with keys that node n is now responsible for

Page 23: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord – Node joins Initializing fingers and predecessor

Learns its predecessor and fingers by asking n’ to look them up Updating fingers of existing nodes

Node n will become the ith finger of node p iff P precedes n by at least 2i-1

The ith finger of p succeeds n The first node p that can meet these 2 conditions is the

immediate predecessor of n- 2i-1

The algorithm starts with the ith finger of node n and then continues to walk in the counter clockwise direction on identifier circle until it encounters a node whose ith finger precedes n

Transferring keys New node must contact the successor of it and transfer

responsibility

Page 24: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord – Concurrent Operations Stabilization

The invariants are difficult to maintain in the face of concurrent joins in large networks

Separate correctness and performance goals Stabilization protocol

Keep nodes’ successor pointers up to date, which is sufficient to correctness of lookups

Those successors are then used to verify and correct finger table entries

Page 25: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Stabilization

Page 26: Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.

Chord