Richard T. B. Ma School of Computing
National University of Singapore
Peer-to-Peer Networks
CS 4226: Internet Architecture
Outline
P2P vs. traditional paradigm: properties, advantages, and challenges
Practical P2P systems: Napster, Gnutella, KaZaA, Skype, BitTorrent
Key technologies for P2P lookup services: Distributed Hash Tables (DHTs); two example architectures: Chord and CAN
The client/server model and extension
Client/server model: the traditional asymmetric communication model
roles: ad-hoc clients vs. dedicated servers
Extended model: delegation, a new role for the server (the client remains the same)
delegation can be recursive or iterative
[Diagram: the client sends a request to the server, which delegates it to a secondary server; the response flows back to the client.]
[Diagram: DNS hierarchy: root DNS servers on top; com, org, and edu TLD DNS servers below; authoritative servers (yahoo.com, amazon.com, poly.edu, umass.edu, pbs.org DNS servers) at the bottom.]
An example: Domain Name System (DNS)
client wants the IP for www.amazon.com (1st approximation):
client queries a root server to find a com DNS server
client queries the com DNS server to get the amazon.com DNS server
client queries the amazon.com DNS server to get the IP address for www.amazon.com
[Diagram: iterated resolution: requesting host cis.poly.edu asks local DNS server dns.poly.edu, which contacts the root DNS server, a TLD DNS server, and the authoritative DNS server dns.cs.umass.edu in turn (steps 1–8) to resolve gaia.cs.umass.edu.]
Domain name resolution: iterative vs. recursive
[Diagram: recursive resolution: the same lookup, but each server forwards the query onward on the requester's behalf: cis.poly.edu to dns.poly.edu, then root, TLD, and dns.cs.umass.edu (steps 1–8).]
Server-based vs. peer-to-peer
Peer-to-peer: properties and problems
Properties of (pure) P2P:
no always-on server or central entity
arbitrary end systems communicate directly
no a-priori knowledge/structure
flat architecture/namespace
Problems:
peers are intermittently connected and change IP addresses: unreliable service providers
how to stay connected?
how to do resource lookup?
File Distribution: Server-Client vs P2P
Question: how much time does it take to distribute a file from one server to N peers?
[Diagram: a server with upload bandwidth us, N peers with upload bandwidths u1 … uN and download bandwidths d1 … dN, connected by a network with abundant bandwidth; the server holds a file of size F.]
us: server upload bandwidth
ui: peer i upload bandwidth
di: peer i download bandwidth
File distribution time: server-client
[Same setup as above.]
server sequentially sends N copies: NF/us time
client i takes F/di time to download
Time to distribute F to N clients using the client/server approach:
d_cs = max{ NF/us, F/min_i di }
increases linearly in N (for large N)
File distribution time: P2P
[Same setup as above.]
server must send one copy: F/us time
client i takes F/di time to download
NF bits must be downloaded in aggregate; fastest possible upload rate: us + Σi ui
d_P2P = max{ F/us, F/min_i di, NF/(us + Σi ui) }
[Plot: minimum distribution time vs. N (up to 35) for client-server and P2P; the client-server time grows linearly with N, while the P2P time stays bounded.]
Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us
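These two formulas can be checked numerically for the example above. A minimal sketch in Python, using the slide's units (u = 1 and F = 1, so F/u = 1 hour; us = 10u; dmin = us, so downloads never bottleneck):

```python
def d_cs(N, F, u_s, d_min):
    # client/server: server pushes N copies; the slowest client also limits
    return max(N * F / u_s, F / d_min)

def d_p2p(N, F, u_s, d_min, u):
    # P2P: server sends one copy; aggregate upload u_s + N*u serves N*F bits
    return max(F / u_s, F / d_min, N * F / (u_s + N * u))

u, F = 1.0, 1.0              # F/u = 1 hour
u_s, d_min = 10 * u, 10 * u
for N in (5, 10, 30):
    print(N, d_cs(N, F, u_s, d_min), d_p2p(N, F, u_s, d_min, u))
```

For N = 30 this gives 3.0 hours for client/server but only 0.75 hours for P2P, matching the linear vs. bounded curves on the plot.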
When and when not P2P?
When is P2P the right/wrong solution?
Claim: the P2P vision is technically feasible; in other words, it is possible to build everything on the Internet without any dedicated servers
But just because it is technically feasible doesn't necessarily make it sensible; just because we can do it P2P doesn't mean that we should
So, when is P2P the right solution?
Some Criteria
Budget How much money do we have?
Resource relevance How widely are resources interesting to users?
Trust How much trust is there between users?
Rate of system change How fast do things change in the system?
Criticality How critical is the service to the users?
P2P Applications and Systems
File sharing: Napster (99–01), KaZaA (01–12), Gnutella
Content distribution: BitTorrent
VoIP and messaging: Skype
Video streaming: PPLive, PPStream
Other applications: P2P computation, P2P storage, …
Napster: How does it work?
Based on a central index server
a user registers with the central server
the server sends the list of files to be shared
the server knows all the peers and files in the network
Searching based on keywords
search results: a list of files with information about each file and the peer sharing it
e.g., encoding rate, file size, peer's bandwidth
some information is entered by users and is therefore unreliable
Napster: How does it work?
Pretty much like the use of delegation
However, it swaps the client/server roles per transfer, making it peer-to-peer
Napster: Pros and Cons
Weaknesses:
downloading from a single peer only
single point of failure at the server
large computation needed to handle queries
unreliable content
vulnerable to attacks
lawsuits
Strengths:
a consistent view of the network
fast and efficient searching
guarantees correct search answers
Gnutella: How does it work?
Has only peers, all of which are fully equal; conceptually an overlay network
To join the network, a peer needs the address of another active peer
obtained via an out-of-band channel, e.g., from a website
Once joined, a peer learns about others and about the topology of the network
Queries are flooded into the network
Downloads directly between peers
Gnutella: How does it work?
[Diagram: a query floods through the overlay; query hits travel back along the reverse path; the file transfer itself happens directly between peers over HTTP.]
Gnutella: Pros and Cons
Weaknesses:
inefficient query flooding
• wastes a lot of network and peer resources
• how to deal with it?
inefficient network management
• constant probing is needed
Strengths:
fully distributed
open protocol
• easy to write clients, e.g., no KaZaA for Linux
robust against node failures
• only true for random failures, as it forms a power-law network
less susceptible to denial-of-service attacks
KaZaA: How does it work?
Two kinds of nodes
Ordinary Nodes (ON): a normal user peer
Supernodes (SN): a user peer with more resources/responsibilities than an ON
Forms a two-tier hierarchy
top level has only SNs, lower level only ONs
an ON belongs to one SN: it can change SN at will, but has only one SN at a time
An SN acts as a "hub" for all its ON children
keeps track of the files in those ON children's peers
KaZaA: Supernodes
exchange information among themselves
do not form a complete mesh
Ordinary nodes
obtain the address of an SN, send a request, and give a list of files to share
the SN starts keeping track of this ON
the ON is not visible to other SNs
KaZaA: Ordinary vs. Super Nodes
An ON can be promoted to SN if it has sufficient resources (bandwidth, uptime)
a user can typically refuse to become an SN
typical bandwidth requirement: 160–200 kbps
13% of ONs are responsible for 80% of uploads
SNs change their connections to other SNs on a time scale of tens of minutes
allows a larger range of the network to be explored
average SN lifetime is 2.5 hours, but with high variance
SNs don't cache info from disconnected ONs
an estimated 30,000 SNs exist at any given time; one SN has connections to 30–50 other SNs
Skype
Allows the user to make calls to other computers on the Internet
calls to/from the real phone network and real phone numbers are forwarded to Skype (costs money)
very popular: ~300 million downloads, ~15 million concurrent users online
Similar architecture to that of KaZaA
supernodes and ordinary nodes
but Skype is perfectly legal (the affected industry is "only" the telcos, and they sell the DSL lines…)
Skype: How does it work?
inherently P2P: pairs of users communicate.
proprietary, encrypted application-layer protocol (inferred via reverse engineering)
hierarchical overlay with SNs
index maps usernames to IP addresses; distributed over SNs
[Diagram: Skype overlay: Skype clients (SC) attach to supernodes (SN); a central Skype login server handles sign-in.]
Peers (supernodes) as relays
problem: both Alice and Bob are behind NATs; a NAT prevents an outside peer from initiating a call to a peer inside
solution: using Alice's and Bob's SNs, a relay is chosen
each peer initiates a session with the relay
the peers can now communicate through their NATs via the relay
BitTorrent: P2P Content Distribution
BitTorrent builds a network (swarm) for every file that is being distributed
Big advantage: you can send a "link" (.torrent) to a friend
the "link" always refers to the same file
not feasible in search-based Napster, Gnutella, or KaZaA (hard to identify particular files)
Downside: no searching
websites with "link collections" and search capabilities exist, but there is no name service
BitTorrent: How does it work?
For each shared file, there is (initially) one server (seed) which hosts the original copy; the file is broken into chunks
"torrent" file: metadata about the content, typically hosted on a web server
The client downloads the torrent file; the metadata indicates the sizes and checksums of the chunks and identifies a tracker
BitTorrent: To start with …
[Diagram: a web server, a tracker, a seed, and a new client; the .torrent file lists the tracker (137.89.211.1) and the 42 chunks with their descriptions.]
1. the seed starts a tracker
2. the seed creates the torrent file and hosts it somewhere (e.g., a web server)
3. a new client obtains the torrent file
4. the new client contacts the tracker and obtains a list of peers
5. the new client downloads/exchanges chunks with those peers
BitTorrent: file distribution
tracker: a server that keeps track of which seeds and peers are in the swarm; it doesn't participate in the actual file distribution
torrent: a group of peers exchanging chunks of a file
swarm: seeds + peers
[Diagram: a joining peer obtains a list of peers from the tracker, then trades chunks with them.]
file divided into 256KB chunks.
peer joining a torrent: has no chunks, but will accumulate them over time
registers with the tracker to get a list of peers, connects to a subset of peers ("neighbors")
when downloading, peer uploads chunks to others
peers may come and go
once peer has entire file, it may (selfishly) leave or (altruistically) remain
BitTorrent: a bit more detail
Pulling chunks:
at any given time, different peers have different subsets of the file's chunks
periodically, a peer (Alice) asks each neighbor for the list of chunks that they have
Alice sends requests for her missing chunks, rarest first
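Rarest-first selection can be sketched as follows (function and variable names are illustrative, not from any real client): count how many neighbors hold each chunk, then request the missing chunk held by the fewest of them.

```python
from collections import Counter

def rarest_first(my_chunks, neighbor_chunk_sets):
    # count how many neighbors hold each chunk
    counts = Counter(c for s in neighbor_chunk_sets for c in s)
    missing = [c for c in counts if c not in my_chunks]
    # request the missing chunk held by the fewest neighbors
    return min(missing, key=lambda c: counts[c]) if missing else None

neighbors = [{0, 1, 2}, {0, 1}, {0, 3}]
print(rarest_first({0}, neighbors))   # 2 (held by only one neighbor)
```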
Pushing chunks: tit-for-tat
Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
re-evaluates the top 4 every 10 secs
every 30 secs: randomly selects another peer and starts sending it chunks ("optimistic unchoke")
the newly chosen peer may join the top 4
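A toy model of this choking logic (names hypothetical; a real client tracks rolling rates): keep the four fastest uploaders unchoked, plus one random peer as the optimistic unchoke.

```python
import random

def unchoked_peers(upload_rates, all_peers, seed=None):
    # top 4 neighbors by the rate at which they send us chunks (re-evaluated every 10s)
    top4 = sorted(upload_rates, key=upload_rates.get, reverse=True)[:4]
    # every 30s: optimistically unchoke one randomly chosen other peer
    rng = random.Random(seed)
    others = [p for p in all_peers if p not in top4]
    optimistic = [rng.choice(others)] if others else []
    return top4 + optimistic

rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10}
print(unchoked_peers(rates, list(rates), seed=1))   # ['a', 'b', 'c', 'd', 'e']
```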
BitTorrent: more details
BitTorrent: Tit-for-tat
(1) Alice "optimistically unchokes" Bob
(2) Alice becomes one of Bob's top-four providers; Bob reciprocates
(3) Bob becomes one of Alice's top-four providers
With a higher upload rate, a peer can find better trading partners and get the file faster!
BitTorrent: Open Issues
Everyone must contribute
what about clients behind a firewall?
do low-bandwidth clients have a disadvantage?
BT's impact on the network
fast download != nearby in the network
Optimal chunk selection algorithm
rarest-first seems to work well in practice
is it optimal? fastest for a single peer or overall?
Is tit-for-tat really necessary?
are there situations where free-riding should be allowed or even encouraged?
Related issues
Dealing with today's users
usenet/email worked when users behaved well; now, spam is everywhere!
need accountability: identify individuals, even if "pseudonymously"
Preserve privacy (a somewhat conflicting goal)
Prevent "freeriding"
reputation tracking mechanisms help
voting mechanisms and payment schemes
much effort went into accountability in P2P systems, e.g., the tit-for-tat scheme in BitTorrent
Outline
P2P vs. traditional paradigm: properties, advantages, and challenges
Practical P2P systems: Napster, Gnutella, KaZaA, Skype, BitTorrent
Key technologies for P2P lookup services: Distributed Hash Tables (DHTs); two example architectures: Chord and CAN
Searching and Addressing
Two ways to find objects, which determine:
how the network is constructed
how objects are placed
how efficiently objects can be found
Examples (search or addressing?)
Google
DNS, IP routing
Napster, Gnutella, KaZaA, BitTorrent
Searching vs. Addressing
Searching:
no need to know unique names (more user friendly)
hard to make efficient (can be solved with $$, see Google)
need to compare actual objects to know if they are the same
Addressing:
object location can be made efficient
each object is uniquely identifiable
need to know unique names
need to maintain the structure required for addressing
Two types of P2P
Unstructured networks/systems
cause the need for searching
does not mean a complete lack of structure
• they have graph structure, e.g., power-law, hierarchy
… but peers are free to join anywhere, choose neighbors freely, and objects are stored anywhere
Structured networks/systems
allow for addressing and deterministic routing
the network structure determines where peers belong in the net and where objects are stored
how can we build such structured networks?
Key Value Store
Database contains entries in the form of (key, value) pairs
key: SS number; value: human name
key: content type; value: IP address
Operations/interface:
Put(key, value)
Get(key) → value
Looks like a table
finding an object takes O(N); how can we locate an object efficiently?
key     value
John    8732-7436
Adam    2349-5763
Mary    8734-7263
Linda   3682-8923
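A naive list-backed implementation of this interface makes the O(N) lookup cost concrete (a sketch, not any particular system):

```python
class NaiveKVStore:
    def __init__(self):
        self.pairs = []            # list of (key, value) tuples

    def put(self, key, value):
        self.pairs.append((key, value))

    def get(self, key):
        # linear scan: O(N) in the number of entries
        for k, v in self.pairs:
            if k == key:
                return v
        return None

store = NaiveKVStore()
store.put("John", "8732-7436")
store.put("Mary", "8734-7263")
print(store.get("Mary"))   # 8734-7263
```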
Recall: Hash Tables
Data structure
fixed-size array of hash buckets
allows insertions, deletions, and lookups in O(1)
Hash function maps keys to hash buckets, with desirable properties:
fast to compute
even distribution of keys
[Figure: eight hash buckets, indexed 0–7; hash(x) = x mod 8 maps the keys 16, 26, 45, 84, 31, and 42 to their buckets.]
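With hash(x) = x mod 8, the keys from the figure land in buckets like this:

```python
def hash_bucket(x, num_buckets=8):
    return x % num_buckets        # fast, spreads random keys evenly

buckets = [[] for _ in range(8)]
for key in (16, 26, 45, 84, 31, 42):
    buckets[hash_bucket(key)].append(key)

print(buckets)
# 16 -> bucket 0; 26 and 42 -> bucket 2; 84 -> 4; 45 -> 5; 31 -> 7
```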
Distributed Hash Table (DHT)
Idea: distribute the hash buckets to peers
Core question: how to design and implement an efficient mechanism to find which peer is responsible for which hash bucket, and how to route between them?
[Figure: the same eight buckets, now distributed across several peers.]
DHT: Principles
Each node is responsible for one or more buckets
as nodes join and leave, the responsibilities change
Nodes communicate among themselves to find the responsible node
scalable communication makes the DHT efficient
DHTs support all the hash table operations
DHT: Examples
We'll study Chord (2001) and CAN (2001)
Other examples
Pastry/Tapestry (2001): based on Plaxton routing
Kademlia (2002): based on the XOR metric
All provide the same abstraction
store key-value pairs; given a key, retrieve/store the value
no semantics associated with the key or the value
Major differences
design of the namespace and of the routing in the overlay
References
I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord: A scalable peer-to-peer lookup service for internet applications,” in Proc. SIGCOMM, San Diego, CA, Aug. 2001, pp. 149–160.
S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Schenker, “A scalable content-addressable network,” in Proc. SIGCOMM, San Diego, CA, Aug. 2001, pp. 161–172.
Chord: Basics
From MIT; used in P2P storage systems
Uses the SHA-1 hash function in practice
results in 160-bit object/node identifiers
same hash function for both objects and nodes
• node IDs are hashed from IP addresses
• object IDs are hashed from object names
Organized in a ring which wraps around
nodes keep track of their predecessor and successor
Chord
example: namespace [0, 2^3 − 1]
an overlay network
who are the successor and predecessor of node 3?
Chord: how to assign indices?
In general, assign an identifier to each node/object in the range [0, 2^m − 1]
each identifier can be represented by m bits
Central issue: assigning (key, value) pairs to nodes/peers
Rule: assign indices to the node that has the closest ID
convention: closest is the immediate successor
successor(1) = 1, successor(2) = 3, successor(6) = 0
who is taking care of indices 1, 2, and 6?
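The successor rule can be sketched directly. Assuming the nodes present are {0, 1, 3} in the 2^3 namespace, as in this example:

```python
def successor(idx, nodes, m=3):
    space = 2 ** m
    # walk clockwise around the identifier circle until we hit a live node
    for step in range(space):
        candidate = (idx + step) % space
        if candidate in nodes:
            return candidate

nodes = {0, 1, 3}
print(successor(1, nodes), successor(2, nodes), successor(6, nodes))  # 1 3 0
```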
Chord: find a particular node
If we look for index 7 starting at node 2, how many steps does it take?
successor(7) = 0: 2 ⇒ 3 ⇒ 4 ⇒ 6 ⇒ 0
In general, it takes O(N) steps
N is the number of nodes; too slow for large N
Chord: adding shortcuts
Notation and definitions:
finger[k].start = (n + 2^(k−1)) mod 2^m, for 1 ≤ k ≤ m
finger[k].interval = [finger[k].start, finger[k+1].start)
finger[k].node = first node ≥ n.finger[k].start
successor = the next node on the identifier circle, i.e., finger[1].node
predecessor = the previous node on the identifier circle
Each node n maintains a finger table that includes at most m shortcuts
the kth finger/shortcut is at least 2^(k−1) away
Finger table of node n
Fingers for nodes 3 and 6:
Node 3:
start  int.    succ.
4      [4,5)   4
5      [5,7)   6
7      [7,3)   0
Node 6:
start  int.    succ.
7      [7,0)   0
0      [0,2)   0
2      [2,6)   2
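The finger tables follow mechanically from the definitions above. The sketch below assumes the nodes present are {0, 2, 3, 4, 6}, which is consistent with the successor entries shown:

```python
def successor(idx, nodes, space):
    # first live node at or clockwise after idx
    return next((idx + s) % space for s in range(space) if (idx + s) % space in nodes)

def finger_table(n, nodes, m=3):
    space = 2 ** m
    table = []
    for k in range(1, m + 1):
        start = (n + 2 ** (k - 1)) % space               # finger[k].start
        table.append((start, successor(start, nodes, space)))  # finger[k].node
    return table

print(finger_table(3, {0, 2, 3, 4, 6}))   # [(4, 4), (5, 6), (7, 0)]
print(finger_table(6, {0, 2, 3, 4, 6}))   # [(7, 0), (0, 0), (2, 2)]
```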
Node Join (node 1 starts alone; then nodes 2, 6, and 0 join)
Node 1 alone:
start  int.    succ.
2      [2,3)   1
3      [3,5)   1
5      [5,1)   1
After node 2 joins, node 1:
2      [2,3)   2
3      [3,5)   1
5      [5,1)   1
node 2:
3      [3,4)   1
4      [4,6)   1
6      [6,2)   1
After nodes 6 and 0 join, node 1:
2      [2,3)   2
3      [3,5)   6
5      [5,1)   6
node 2:
3      [3,4)   6
4      [4,6)   6
6      [6,2)   6
node 6:
7      [7,0)   0
0      [0,2)   0
2      [2,6)   2
node 0:
1      [1,2)   1
2      [2,4)   2
4      [4,0)   6
Routing (finger tables as on the previous slide)
query at node 1: hash(key) = 7
where is it located?
Node Leave
peer 1 abruptly leaves
peer 0 detects this; makes 2 its immediate successor
asks 2 who its immediate successor is; makes 2's immediate successor its own second successor
To handle node departures, each node must know the IP addresses of its two successors
Each node periodically pings its two successors to see if they are still alive
Chord: Performance
Finding an object takes O(log N) steps
For N nodes and K objects
each node is responsible for O(K/N) objects
when an (N+1)st node joins or leaves, responsibility for O(K/N) indices changes hands
Any node joining or leaving an N-node network uses O(log² N) messages to re-establish the routing and finger tables
initialize the finger table and predecessor (for a join)
From a ring to …
Two-dimensional torus
CAN: Basics
Scalable Content-Addressable Network (CAN)
From Berkeley; published in 2001 in the same conference as Chord
The namespace is a d-dimensional torus
Nodes keep track of their neighbors only
no need to store shortcuts
routing in a d-dimensional Euclidean space
CAN: node join
a new node A joins via an existing node I
A randomly chooses a coordinate (x, y)
the join request is routed from node I toward (x, y), discovering that node J owns it
node J's zone is split in half; node A now owns one half
Splitting/merging the namespace
Splitting a zone when a new node joins
in a sequential order of the coordinates: split along the X dimension first, and then Y
for the 2-dimensional space, each zone is a square or a 1:2 rectangle
When an existing node departs
merge back into a neighbor, if that can be done
otherwise, a neighbor node may temporarily handle multiple zones
CAN: routing
routing is easy: the routing table contains only the 4 neighbors (in 2D)
[Diagram: a message is forwarded zone by zone from node A toward node J.]
CAN: inserting a key
node B inserts (K, V):
1. a = h_x(K), b = h_y(K)
2. route (K, V) to coordinate (a, b)
3. the node that owns the zone containing (a, b) stores (K, V)
CAN: retrieving a key
node C retrieves (K, V):
1. a = h_x(K), b = h_y(K)
2. route "retrieve(K)" to the node that owns (a, b)
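The pair of hashes h_x, h_y can be sketched with one salted hash function (an illustrative choice; CAN does not mandate a particular hash):

```python
import hashlib

def can_point(key, side=1.0):
    # two independent uniform hashes give the (a, b) coordinate
    def h(salt):
        digest = hashlib.sha1((salt + key).encode()).hexdigest()
        return (int(digest, 16) % 10**6) / 10**6 * side
    return h("x:"), h("y:")

a, b = can_point("some-file.mp3")
print(a, b)   # a deterministic point in the unit square; route (K, V) toward it
```

Because the coordinate depends only on the key, an inserting node and a retrieving node route toward the same point.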
CAN: Extension and Performance
Increasing the dimension d > 2
increases routing table size and the number of hash functions
but gives shorter paths
State information is O(d)
maintain information about 2d neighbors
Routing takes O(d · n^(1/d)) hops with n nodes
the average path length is (d/4) · n^(1/d)
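The trade-off between per-node state (2d neighbors) and average path length (d/4)·n^(1/d) is easy to tabulate; the numbers below are just this formula evaluated for a few dimensions:

```python
def can_state(d):
    return 2 * d                      # neighbors maintained per node

def can_avg_path(d, n):
    return (d / 4) * n ** (1 / d)     # average hops for n nodes

n = 4096
for d in (2, 3, 6):
    print(d, can_state(d), round(can_avg_path(d, n), 1))
```

For n = 4096, going from d = 2 to d = 6 triples the per-node state (4 to 12 neighbors) but cuts the average path from 32 hops to 6.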
From 2D to 3D
CAN: Extension and Performance
Multiple realities
multiple independent coordinate spaces
each node gets a different zone in each space
multiple copies of the data are stored at multiple nodes
also gives shorter paths
Routing weighted by round-trip times
takes network topology into consideration
forward to the "best" neighbor
Dimensions vs. realities
increasing the dimension reduces the hop count more, but multiple realities have other benefits
More References
A. Rowstron and P. Druschel, "Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems," in Proc. IFIP/ACM International Conference on Distributed Systems Platforms (Middleware '01), pp. 329–350.
B. Zhao, L. Huang, J. Stribling, S. Rhea, A. Joseph, and J. Kubiatowicz, "Tapestry: A Resilient Global-scale Overlay for Service Deployment," IEEE Journal on Selected Areas in Communications, 22(1), 2004.