
File Distribution: Server-Client vs P2P · TDDD66/timetable/2013/TDDD66_2013_11_p…

Apr 17, 2020


dariahiddleston
Page 1: File Distribution: Server-Client vs P2P

Question: How much time to distribute file from one server to N peers?

$u_s$: server upload bandwidth
$u_i$: peer i upload bandwidth
$d_i$: peer i download bandwidth

[Figure: a server and peers 1, 2, ..., N attached to a network with abundant bandwidth; the server holds a file of size F.]

Page 2: File distribution time: server-client

server sequentially sends N copies: $NF/u_s$ time
client i takes $F/d_i$ time to download

Time to distribute F to N clients using client/server approach:

$d_{cs} = \max\{ NF/u_s,\; F/\min_i d_i \}$

$d_{cs}$ increases linearly in N (for large N)

[Figure: server and N peers attached to a network with abundant bandwidth.]
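
The client/server bound can be evaluated directly. A minimal sketch, assuming rates in bits per second and a file size in bits; the function name and the numbers in the example are illustrative, not from the slides.

```python
def client_server_time(F, u_s, d):
    """Distribution time d_cs = max{ N*F/u_s, F/min(d_i) }.

    F   : file size in bits
    u_s : server upload rate in bits/s
    d   : list of peer download rates in bits/s (one entry per peer)
    """
    N = len(d)
    return max(N * F / u_s, F / min(d))


# Illustrative numbers (assumed): 8 Gbit file, 100 Mbit/s server, 10 peers at 20 Mbit/s.
example_cs = client_server_time(8e9, 1e8, [2e7] * 10)
```

Here the server term (10 copies over a 100 Mbit/s link) exceeds the slowest-download term, so the server is the bottleneck.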

Page 3: File distribution time: P2P

server must send one copy: $F/u_s$ time
client i takes $F/d_i$ time to download
NF bits must be downloaded (aggregate)
fastest possible upload rate: $u_s + \sum_i u_i$

$d_{P2P} = \max\{ F/u_s,\; F/\min_i d_i,\; NF/(u_s + \sum_i u_i) \}$

[Figure: server and N peers attached to a network with abundant bandwidth.]
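
The P2P bound adds a third term for the aggregate upload capacity. A minimal sketch, assuming rates in bits per second; the function name and example values are illustrative, not from the slides.

```python
def p2p_time(F, u_s, u, d):
    """Distribution time d_P2P = max{ F/u_s, F/min(d_i), N*F/(u_s + sum(u_i)) }.

    F   : file size in bits
    u_s : server upload rate in bits/s
    u   : list of peer upload rates in bits/s
    d   : list of peer download rates in bits/s
    """
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


# Illustrative numbers (assumed): 8 Gbit file, 100 Mbit/s server,
# 10 peers uploading at 10 Mbit/s and downloading at 20 Mbit/s.
example_p2p = p2p_time(8e9, 1e8, [1e7] * 10, [2e7] * 10)
```

Each peer that joins adds NF/N more bits of demand but also adds its own upload rate to the denominator, which is why the third term grows more slowly than NF/u_s.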

Page 4: Server-client vs. P2P: example

Client upload rate = u, F/u = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$

[Figure: minimum distribution time (y-axis, 0 to 3.5) versus N (x-axis, 0 to 35), with one curve for P2P and one for Client-Server.]
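
The two curves follow from substituting the example parameters into the formulas $d_{cs} = \max\{NF/u_s, F/\min_i d_i\}$ and $d_{P2P} = \max\{F/u_s, F/\min_i d_i, NF/(u_s + \sum_i u_i)\}$. A worked sketch, assuming every peer uploads at the same rate u and times are in hours:

```latex
\begin{align*}
\frac{F}{d_{\min}} &\le \frac{F}{u_s} = \frac{F}{10u} = 0.1 \text{ h} \\
d_{cs} &= \max\left\{ \frac{NF}{10u},\; \frac{F}{d_{\min}} \right\}
        = \frac{N}{10} \text{ h} \qquad (N \ge 1) \\
d_{P2P} &= \max\left\{ \frac{F}{10u},\; \frac{F}{d_{\min}},\; \frac{NF}{10u + Nu} \right\}
         = \max\left\{ 0.1,\; \frac{N}{10 + N} \right\} \text{ h} \\
N = 30: \quad d_{cs} &= 3 \text{ h}, \qquad d_{P2P} = \frac{30}{40} = 0.75 \text{ h}
\end{align*}
```

Since N/(10 + N) < 1 for every N, the P2P time stays below one hour while the client-server time grows without bound.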

Page 5: BitTorrent-like systems

File split into many smaller pieces
Pieces are downloaded from both seeds and downloaders
Distribution paths are dynamically determined
  Based on data availability

[Figure: torrent (x downloaders; y seeds). Labels: arrivals, downloaders, download time, seeds, seed residence time, departures.]

Page 6: File distribution: BitTorrent

P2P file distribution

tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file

[Figure: a peer obtains a list of peers from the tracker, then trades chunks with peers in the torrent.]

Page 7: Background: Multi-tracked torrents

Torrent file
  “announce-list” URLs
Trackers
  Register torrent file
  Maintain state information
Peers
  Obtain torrent file
  Choose one tracker at random
  Announce
  Report status
  Peer exchange (PEX)
Issue
  Multiple smaller swarms

[Figure labels: Swarm, Torrent (shown twice).]

Page 8: Download using BitTorrent

Background: Incentive mechanism

Establish connections to large set of peers
At each time, only upload to a small (changing) set of peers
Rate-based tit-for-tat policy
  Downloaders give upload preference to the downloaders that provide the highest download rates

[Figure: highest download rates: pick top four; optimistic unchoke: pick one at random.]
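
The "pick top four, pick one at random" rule can be sketched as follows. This is an illustration under assumptions, not a client implementation: the function name, the `download_rates` mapping, and the peer names are hypothetical, and real clients re-run the choice periodically.

```python
import random


def choose_unchoked(download_rates, top_n=4, rng=random):
    """Select peers to upload to under rate-based tit-for-tat.

    download_rates : dict mapping peer -> rate at which we download from it
    top_n          : number of regular upload slots (four on the slide)
    Returns (regular, optimistic): the top_n best providers, plus one peer
    picked at random from the rest (the optimistic unchoke), or None.
    """
    by_rate = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = by_rate[:top_n]
    others = by_rate[top_n:]
    optimistic = rng.choice(others) if others else None
    return regular, optimistic


# Hypothetical neighbor set with measured download rates (arbitrary units).
rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 5}
regular, optimistic = choose_unchoked(rates)
```

The random slot lets a peer discover neighbors that might offer better rates than the current top four, and gives newly arrived peers a chance to receive their first pieces.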

Page 9: Download using BitTorrent

Background: Piece selection

Rarest first piece selection policy
  Achieves high piece diversity
  Request a piece that the uploader has, that the downloader is interested in (wants), and that is the rarest among this set of pieces

[Figure: pieces 1, 2, 3, ..., k, ..., K held by Peer 1, Peer 2, ..., Peer N; the count of each piece in the neighbor set is shown in parentheses, e.g. (1) (2) (1) (2) (2) (3) (2); arrows labeled "from" and "to".]
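
Rarest-first selection can be sketched with a piece counter over the neighbor set. The function name and the tie-break (lowest piece index) are assumptions made for a deterministic example; the slide does not say how ties are broken.

```python
from collections import Counter


def rarest_first(uploader_has, downloader_has, neighbor_piece_sets):
    """Pick the piece to request from one uploader.

    uploader_has        : set of piece indices the uploader holds
    downloader_has      : set of piece indices the downloader already holds
    neighbor_piece_sets : iterable of sets, one per peer in the neighbor set
    Returns the wanted piece with the fewest copies among neighbors, or None.
    """
    counts = Counter(p for pieces in neighbor_piece_sets for p in pieces)
    candidates = set(uploader_has) - set(downloader_has)
    if not candidates:
        return None  # nothing this uploader has is of interest
    # Assumed tie-break: lowest piece index among equally rare pieces.
    return min(candidates, key=lambda p: (counts[p], p))


# Hypothetical neighbor set: piece 3 is common, pieces 1 and 4 are rare.
neighbors = [{1, 2, 3}, {2, 3}, {3, 4}]
choice = rarest_first({2, 3, 4}, {3}, neighbors)
```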

Page 10: Peer-assisted VoD streaming

Some research questions ...

Can BitTorrent-like protocols provide scalable on-demand streaming?
How sensitive is the performance to the application configuration parameters?
  Piece selection policy
  Peer selection policy
  Upload/download bandwidth
What is the user-perceived performance?
  Start-up delay
  Probability of disrupted playback

ACM SIGMETRICS 2008; IFIP Networking 2007/2009; IEEE/ACM ToN 2012

Page 11: Live Streaming using BT-like systems

Live streaming (e.g., CoolStreaming)
  All peers at roughly the same play/download position
    • High bandwidth peers can easily contribute more …
  (Relatively) small buffer window
    • Within which pieces are exchanged

[Figure labels: Internet, piece upload/downloads, buffer window, media player queue/buffer.]


Page 13: More p2p slides …

Page 14: Client-server architecture

server:
  always-on host
  permanent IP address
  server farms for scaling
clients:
  communicate with server
  may be intermittently connected
  may have dynamic IP addresses
  do not communicate directly with each other

[Figure: client/server]

Page 15: Pure P2P architecture

no always-on server
arbitrary end systems directly communicate
peers are intermittently connected and change IP addresses

Three topics:
  File sharing
  File distribution
  Searching for information
Case studies: BitTorrent and Skype

[Figure: peer-peer]

Page 16: Hybrid of client-server and P2P

Skype
  voice-over-IP P2P application
  centralized server: finding address of remote party
  client-client connection: direct (not through server)
Instant messaging
  chatting between two users is P2P
  centralized service: client presence detection/location
    • user registers its IP address with central server when it comes online
    • user contacts central server to find IP addresses of buddies

Page 17: P2P file sharing

Example

Alice runs P2P client application on her notebook computer
intermittently connects to Internet; gets new IP address for each connection
asks for “Hey Jude”
application displays other peers that have a copy of “Hey Jude”
Alice chooses one of the peers, Bob
file is copied from Bob’s PC to Alice’s notebook: HTTP
while Alice downloads, other users are uploading from Alice
Alice’s peer is both a Web client and a transient Web server

All peers are servers = highly scalable!

Page 18: P2P: centralized directory

original “Napster” design

1) when peer connects, it informs central server:
  IP address
  content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob

[Figure: peers, including Alice and Bob, and a centralized directory server; arrows numbered 1, 2 and 3 correspond to the steps above.]
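
The three steps can be sketched with an in-memory index. This is a toy model under assumptions: the class and method names and the IP addresses are hypothetical, and the actual file transfer (step 3) happens directly between peers, outside the directory.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy centralized directory: maps content titles to peer addresses."""

    def __init__(self):
        self._index = defaultdict(set)  # title -> set of peer IP addresses

    def register(self, ip, titles):
        """Step 1: a connecting peer reports its IP address and content."""
        for title in titles:
            self._index[title].add(ip)

    def unregister(self, ip):
        """Remove a peer that went offline from every entry."""
        for peers in self._index.values():
            peers.discard(ip)

    def query(self, title):
        """Step 2: return the addresses of peers that hold the title."""
        return sorted(self._index.get(title, ()))


directory = CentralDirectory()
directory.register("10.0.0.2", ["Hey Jude"])         # Bob (hypothetical address)
directory.register("10.0.0.3", ["Let It Be"])        # another peer
holders = directory.query("Hey Jude")                # Alice's query
# Step 3 (not modeled): Alice contacts one of `holders` directly for the file.
```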

Page 19: P2P: problems with centralized directory

single point of failure
performance bottleneck
copyright infringement: “target” of lawsuit is obvious

file transfer is decentralized, but locating content is highly centralized

Page 20: Query flooding: Gnutella

fully distributed
  no central server
public domain protocol
many Gnutella clients implementing protocol

overlay network: graph
edge between peer X and Y if there’s a TCP connection
all active peers and edges form overlay net
edge: virtual (not physical) link
given peer typically connected with < 10 overlay neighbors

Page 21: Gnutella: protocol

Query message sent over existing TCP connections
peers forward Query message
QueryHit sent over reverse path

Scalability: limited scope flooding

File transfer: HTTP

[Figure: Query messages forwarded between overlay neighbors, QueryHit messages returned, then file transfer over HTTP.]
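
Limited-scope flooding can be sketched as a breadth-first traversal of the overlay with a hop limit. This is a simulation under assumptions, not the Gnutella wire protocol: the function name, the `ttl` parameter, and the example overlay are hypothetical.

```python
from collections import deque


def flood_query(overlay, content, origin, wanted, ttl):
    """Flood a query through the overlay, at most `ttl` hops from origin.

    overlay : dict peer -> list of overlay neighbors (TCP connections)
    content : dict peer -> set of titles held by that peer
    Returns dict: peer with a hit -> reverse path the QueryHit travels.
    """
    hits = {}
    seen = {origin}                        # each peer handles the query once
    queue = deque([(origin, [origin], ttl)])
    while queue:
        peer, path, remaining = queue.popleft()
        if peer != origin and wanted in content.get(peer, set()):
            hits[peer] = path[::-1]        # QueryHit retraces the query path
        if remaining == 0:
            continue                       # scope exhausted: do not forward
        for neighbor in overlay[peer]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [neighbor], remaining - 1))
    return hits


# Hypothetical overlay: alice - bob - dave - erin, plus alice - carol.
overlay = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob", "erin"],
    "erin": ["dave"],
}
content = {"dave": {"Hey Jude"}, "erin": {"Hey Jude"}}
hits_ttl2 = flood_query(overlay, content, "alice", "Hey Jude", ttl=2)
hits_ttl3 = flood_query(overlay, content, "alice", "Hey Jude", ttl=3)
```

With a hop limit of 2 the query never reaches erin, which shows the trade-off: a smaller scope limits traffic but can miss content that exists in the network.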

Page 22: Gnutella: Peer joining

1. joining peer Alice must find another peer in Gnutella network: use list of candidate peers
2. Alice sequentially attempts TCP connections with candidate peers until connection setup with Bob
3. Flooding: Alice sends Ping message to Bob; Bob forwards Ping message to his overlay neighbors (who then forward to their neighbors…); peers receiving Ping message respond to Alice with Pong message
4. Alice receives many Pong messages, and can then set up additional TCP connections

Page 23: Hierarchical Overlay

between centralized index, query flooding approaches
each peer is either a group leader or assigned to a group leader
  TCP connection between peer and its group leader
  TCP connections between some pairs of group leaders
group leader tracks content in its children

[Figure legend: ordinary peer; group-leader peer; neighboring relationships in overlay network.]

Page 24: Distributed Hash Table (DHT)

DHT = distributed P2P database
Database has (key, value) pairs;
  key: social security number; value: human name
  key: content type; value: IP address
Peers query DB with key
  DB returns values that match the key
Peers can also insert (key, value) pairs

Page 25: DHT Identifiers

Assign integer identifier to each peer in range $[0, 2^n - 1]$.
  Each identifier can be represented by n bits.
Require each key to be an integer in same range.
To get integer keys, hash original key.
  e.g., key = h(“Led Zeppelin IV”)
  This is why they call it a distributed “hash” table
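
Hashing a key into the n-bit identifier space can be sketched as below. The slide only names a hash function h; the use of SHA-1 and the reduction modulo 2^n are assumptions made for illustration, as is the function name.

```python
import hashlib


def key_to_id(key, n_bits):
    """Map an arbitrary string key to an integer in [0, 2**n_bits - 1].

    SHA-1 stands in for the unspecified hash function h (an assumption).
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** n_bits)


# With n = 4 the identifier always lands in 0..15.
ident = key_to_id("Led Zeppelin IV", n_bits=4)
```

The same key always hashes to the same identifier, so any peer can compute where a key belongs without consulting a central index.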

Page 26: How to assign keys to peers?

Central issue:
  Assigning (key, value) pairs to peers.
Rule: assign key to the peer that has the closest ID.
Convention in lecture: closest is the immediate successor of the key.
Ex: n=4; peers: 1,3,4,5,8,10,12,14;
  key = 13, then successor peer = 14
  key = 15, then successor peer = 1
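
The successor rule is a sorted search with wrap-around. A minimal sketch; the function name is hypothetical, and treating a key equal to a peer ID as belonging to that peer is an assumption.

```python
from bisect import bisect_left


def successor_peer(peers, key):
    """Return the peer responsible for `key`: the first peer ID >= key,
    wrapping around to the smallest ID when no such peer exists."""
    ring = sorted(peers)
    i = bisect_left(ring, key)
    return ring[i % len(ring)]


# The example from the slide: n = 4, peers 1, 3, 4, 5, 8, 10, 12, 14.
PEERS = [1, 3, 4, 5, 8, 10, 12, 14]
owner_of_13 = successor_peer(PEERS, 13)
owner_of_15 = successor_peer(PEERS, 15)
```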

Page 27: Circular DHT (1)

Each peer only aware of immediate successor and predecessor.
“Overlay network”

[Figure: peers 1, 3, 4, 5, 8, 10, 12, 15 arranged in a ring.]

Page 28: Circular DHT (2)

O(N) messages on avg to resolve query, when there are N peers
Define closest as closest successor

[Figure: ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111. The query “Who’s responsible for key 1110?” is passed from peer to peer around the ring (the key 1110 appears on six links) until the responsible peer answers “I am”.]
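
A lookup that only uses successor links can be simulated by walking the ring and counting forwarded messages. This is a sketch under assumptions: the function names are hypothetical, and the starting peer 0011 is assumed (the transcript does not say which peer asks); it is chosen because it produces six forwarded messages.

```python
def in_interval(x, a, b, size):
    """True if x lies in the ring interval (a, b], going clockwise from a."""
    if a == b:
        return True  # a single peer owns the whole ring
    return 0 < (x - a) % size <= (b - a) % size


def ring_lookup(peers, start, key, n_bits):
    """Forward a query successor by successor until the responsible peer
    (the key's immediate successor) is reached.
    Returns (responsible_peer, number_of_messages)."""
    size = 2 ** n_bits
    ring = sorted(peers)
    cur, messages = start, 0
    while True:
        succ = ring[(ring.index(cur) + 1) % len(ring)]
        messages += 1                      # cur forwards the query to succ
        if in_interval(key, cur, succ, size):
            return succ, messages          # succ answers "I am"
        cur = succ


PEERS = [1, 3, 4, 5, 8, 10, 12, 15]
responsible, messages = ring_lookup(PEERS, start=3, key=0b1110, n_bits=4)
```

In the worst case the query visits every peer, which is where the O(N) message count comes from.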

Page 29: Circular DHT with Shortcuts

Each peer keeps track of IP addresses of predecessor, successor, short cuts.
Reduced from 6 to 2 messages.
Possible to design shortcuts so O(log N) neighbors, O(log N) messages in query

[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15 with shortcut links; query “Who’s responsible for key 1110?”]
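
One possible shortcut design places shortcuts at power-of-two distances around the ring, in the style of Chord finger tables. This is an assumption for illustration: the slide does not specify which shortcuts the figure uses, and the function names and the starting peer 3 are hypothetical.

```python
from bisect import bisect_left


def successor(ring, ident):
    """First peer ID >= ident in the sorted list `ring`, with wrap-around."""
    return ring[bisect_left(ring, ident) % len(ring)]


def in_interval(x, a, b, size):
    """True if x lies in the ring interval (a, b], going clockwise from a."""
    if a == b:
        return True
    return 0 < (x - a) % size <= (b - a) % size


def build_shortcuts(peers, n_bits):
    """Assumed design: peer p links to successor(p + 2**i) for i < n_bits."""
    size, ring = 2 ** n_bits, sorted(peers)
    return {p: sorted({successor(ring, (p + 2 ** i) % size)
                       for i in range(n_bits)} - {p})
            for p in ring}


def lookup(peers, shortcuts, start, key, n_bits):
    """Greedy routing: jump to the known peer closest to, but not past, key.
    Returns (responsible_peer, number_of_messages)."""
    size, ring = 2 ** n_bits, sorted(peers)
    if key == start:
        return start, 0
    cur, messages = start, 0
    while True:
        succ = successor(ring, (cur + 1) % size)
        if in_interval(key, cur, succ, size):
            return succ, messages + 1      # final hop to the responsible peer
        candidates = [s for s in shortcuts[cur]
                      if in_interval(s, cur, key, size)]
        cur = max(candidates, key=lambda s: (s - cur) % size)
        messages += 1
        if cur == key:
            return cur, messages           # landed on the peer whose ID is key


PEERS = [1, 3, 4, 5, 8, 10, 12, 15]
SHORTCUTS = build_shortcuts(PEERS, n_bits=4)
result = lookup(PEERS, SHORTCUTS, start=3, key=0b1110, n_bits=4)
```

With these assumed shortcuts, peer 3 jumps to peer 12, which hands the query to peer 15: two messages instead of six. Each peer stores at most n links, and each jump covers at least half of the remaining distance in ID space.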

Page 30: Peer Churn

• To handle peer churn, require each peer to know the IP address of its two successors.
• Each peer periodically pings its two successors to see if they are still alive.

Peer 5 abruptly leaves
Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8’s immediate successor its second successor.
What if peer 13 wants to join?

[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15.]
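
The repair step for a departed successor can be sketched with a two-entry successor list per peer. This is a simulation under assumptions: the function names are hypothetical, a ping is modeled as membership in an `alive` set, and at most one of a peer's two successors is assumed to fail at a time.

```python
def build_successors(peers):
    """Give every peer its first and second successor on the ring."""
    ring = sorted(peers)
    n = len(ring)
    return {p: [ring[(i + 1) % n], ring[(i + 2) % n]]
            for i, p in enumerate(ring)}


def repair(succs, alive, peer):
    """Refresh one peer's successor list after pinging its successors."""
    first, second = succs[peer]
    if first not in alive:
        first = second                     # promote the second successor
    # Ask the immediate successor who its own immediate live successor is.
    second = next(p for p in succs[first] if p in alive)
    succs[peer] = [first, second]


PEERS = [1, 3, 4, 5, 8, 10, 12, 15]
succs = build_successors(PEERS)
alive = set(PEERS) - {5}                   # peer 5 abruptly leaves
for p in sorted(alive):
    repair(succs, alive, p)
```

After the repair, peer 4 lists 8 and 10 as its successors, as described on the slide, and peer 3 replaces the departed peer 5 with peer 8 as its second successor.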

Page 31: P2P Case study: Skype

inherently P2P: pairs of users communicate.
proprietary application-layer protocol (inferred via reverse engineering)
hierarchical overlay with Supernodes (SNs)
Index maps usernames to IP addresses; distributed over SNs

[Figure labels: Skype clients (SC), Supernode (SN), Skype login server.]

Page 32: Peers as relays

Problem when both Alice and Bob are behind “NATs”.
  NAT prevents an outside peer from initiating a call to insider peer
Solution:
  Using Alice’s and Bob’s SNs, Relay is chosen
  Each peer initiates session with relay.
  Peers can now communicate through NATs via relay