The BitTorrent content distribution system CS217 Advanced Topics in Internet Research Guest Lecture Nikitas Liogkas, 5/11/2006.

Post on 17-Dec-2015



Motivation

● flash crowd (aka slashdot) effect: many clients, few servers
● Problem: servers cannot handle load
● Solution: swarming
  • clients download pieces of the file from each other
  • has been proven to have good scaling and performance properties

Presentation outline

● Joining the system
● Encoding / metadata file
● Tracker protocol
● Peer wire protocol
● Piece selection
● Peer selection
● Client implementations
● Resources

Joining a torrent

Peers divided into:
● seeds: have the entire file
● leechers: still downloading

[Figure: a new leecher (1) fetches the metadata file from a website, (2) joins via the tracker, (3) receives a peer list, and (4) sends data requests to a seed/leecher]

1. obtain the metadata file (out of band)
2. contact the tracker
3. obtain a peer list (contains seeds & leechers)
4. contact peers from that list for data

Exchanging data

● verify pieces using hashes
● download sub-pieces (blocks) in parallel
● advertise received pieces to the entire peer list
● interested: need pieces that a given peer has

[Figure: leecher A ("I have"), seed, leecher B, leecher C]
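The hash check and the split of a piece into blocks can be illustrated with a minimal Python sketch. The 16 KB block size and the helper names `block_requests` and `verify_piece` are assumptions made for illustration; the slides do not specify them.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # assumed sub-piece (block) size; not given in the slides


def block_requests(piece_length, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that cover one piece, one pair per block."""
    return [
        (offset, min(block_size, piece_length - offset))
        for offset in range(0, piece_length, block_size)
    ]


def verify_piece(data, expected_sha1):
    """Compare a fully assembled piece with its 20-byte SHA-1 digest."""
    return hashlib.sha1(data).digest() == expected_sha1


piece = bytes(range(256)) * 1024        # 256 KB of dummy piece data
digest = hashlib.sha1(piece).digest()   # stands in for the hash from the metadata file
requests = block_requests(len(piece))   # blocks that could be fetched in parallel
```

Each `(offset, length)` pair can be requested from a different peer; only once all blocks have arrived is the whole piece hashed and, if it matches, advertised.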

Bencoding

● encoding format of the metadata file and of tracker responses
● four types
  • byte strings
  • integers
  • lists
  • dictionaries (mapping keys to values)
● examples
  • 4:spam represents the string “spam”
  • i10e represents the integer 10
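The four types can be made concrete with a small encoder and decoder. This is a sketch under the usual bencoding rules (lists as `l…e`, dictionaries as `d…e` with sorted byte-string keys); the function names `bencode` and `bdecode` are chosen here for illustration and it does no validation of malformed input.

```python
def bencode(value):
    """Encode ints, bytes/str, lists and dicts into bencoded bytes."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def bdecode(data, pos=0):
    """Decode one value starting at pos; return (value, next_position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    # Otherwise a byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length
```

Decoded strings come back as `bytes`, since bencoded strings are byte strings rather than text.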

Metadata file structure

● contains information necessary to contact the tracker and describes the files in the torrent
  • announce URL of tracker
  • file name
  • file length
  • piece length (typically 256 KB)
  • SHA-1 hashes of pieces for verification
  • also creation date, comment, creator, …
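As a rough illustration of these fields, the sketch below builds the metadata for a single in-memory file. The key names (`announce`, `info`, `name`, `length`, `piece length`, `pieces`) follow the public BitTorrent specification rather than the slides, and the tracker URL, file name, and helper names are hypothetical.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # the typical piece length quoted above


def piece_hashes(data, piece_length=PIECE_LENGTH):
    """Concatenate the 20-byte SHA-1 digest of every piece of data."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def make_metainfo(name, data, announce_url, piece_length=PIECE_LENGTH):
    """Build a single-file metadata dictionary (before bencoding)."""
    return {
        "announce": announce_url,
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": piece_hashes(data, piece_length),
        },
    }


payload = b"x" * (600 * 1024)  # a 600 KB dummy file: two full pieces plus a short one
meta = make_metainfo("example.bin", payload,
                     "http://tracker.example.org/announce")  # hypothetical tracker URL
num_pieces = len(meta["info"]["pieces"]) // 20
```

The last piece is usually shorter than the piece length; its hash covers only the bytes that exist.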

Tracker protocol

● communicates with clients via HTTP/HTTPS
● client GET request
  • info_hash: uniquely identifies the file
  • peer_id: chosen by and uniquely identifies the client
  • client IP and port
  • numwant: how many peers to return (defaults to 50)
  • stats: bytes uploaded, downloaded, left
● tracker GET response
  • interval: how often to contact the tracker
  • list of peers, containing peer id, IP and port
  • stats: complete, incomplete
● tracker-less mode; based on the Kademlia DHT
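The client GET request can be sketched by assembling the query string with Python's standard library. The parameter names are the ones listed above plus `port`, `uploaded`, `downloaded`, and `left` as used by the public specification; the tracker URL, peer id, and the function name `announce_url` are placeholders assumed for this example, and no network request is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def announce_url(base, info_hash, peer_id, port,
                 uploaded, downloaded, left, numwant=50):
    """Build the tracker GET URL; raw bytes are percent-encoded by urlencode."""
    query = urlencode({
        "info_hash": info_hash,    # 20 raw bytes identifying the file
        "peer_id": peer_id,        # 20 bytes chosen by the client
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "numwant": numwant,
    })
    return f"{base}?{query}"


url = announce_url(
    "http://tracker.example.org/announce",   # hypothetical tracker
    info_hash=bytes(range(20)),               # dummy hash value
    peer_id=b"-XX0001-123456789012",          # made-up peer id
    port=6881, uploaded=0, downloaded=0, left=1000,
)

# Round trip: decode the query the way a tracker might, keeping bytes intact.
params = parse_qs(urlsplit(url).query, encoding="latin-1")
```

Decoding with `latin-1` maps each percent-encoded byte to one character, so the original 20 bytes of `info_hash` can be recovered exactly.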


Peer wire protocol

● implemented directly on top of TCP
● messages
  • handshake (maybe with bitfield)
  • keep-alive
  • choke / unchoke
  • interested / not interested
  • have (advertisement of a newly acquired piece)
  • request / piece
  • cancel (only used in “endgame mode”)
  • port (used in tracker-less mode)
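To show what these messages look like on the wire, here is a minimal framing sketch. The layout (a 68-byte handshake, then messages with a 4-byte big-endian length prefix and a 1-byte id) and the numeric ids are taken from the public BitTorrent specification, not from the slides; the helper names are illustrative.

```python
import struct

PSTR = b"BitTorrent protocol"
MESSAGE_IDS = {
    "choke": 0, "unchoke": 1, "interested": 2, "not interested": 3,
    "have": 4, "bitfield": 5, "request": 6, "piece": 7, "cancel": 8, "port": 9,
}
KEEP_ALIVE = struct.pack(">I", 0)  # a zero-length message with no id


def handshake(info_hash, peer_id):
    """Protocol name length, protocol name, 8 reserved bytes, info_hash, peer_id."""
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def message(name, payload=b""):
    """Frame a message as <length><id><payload>."""
    body = bytes([MESSAGE_IDS[name]]) + payload
    return struct.pack(">I", len(body)) + body


def have(index):
    return message("have", struct.pack(">I", index))


def request(index, begin, length):
    return message("request", struct.pack(">III", index, begin, length))


def parse(frame):
    """Split one complete frame into (message name, payload)."""
    (length,) = struct.unpack(">I", frame[:4])
    if length == 0:
        return "keep-alive", b""
    names = {number: name for name, number in MESSAGE_IDS.items()}
    return names[frame[4]], frame[5:4 + length]
```

For example, `parse(request(1, 16384, 16384))` returns the name `"request"` and a 12-byte payload holding the piece index, the block offset, and the block length.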

Piece selection

● when downloading starts: choose at random
  • get complete pieces as quickly as possible
  • obtain something to offer to others
● after we have 4 pieces: pick (local) rarest first
  • achieves the fastest replication of rare pieces
  • obtain something of value
  • only get unique pieces from the seed
● endgame mode
  • defense against the “last-block problem”
  • send requests for missing sub-pieces to all peers in our peer list
  • send cancel messages upon receipt of a sub-piece
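The three phases can be sketched as follows. This is a simplified illustration, not the mainline client's code: pieces are represented as sets of indices, ties among equally rare pieces are broken at random (an assumption), and all function names are hypothetical.

```python
import random
from collections import Counter

RANDOM_FIRST_PIECES = 4  # threshold quoted on the slide


def pick_piece(have, peer_pieces, rng=random):
    """Choose the next piece to download.

    have: set of piece indices we already own.
    peer_pieces: list of sets, one per peer, of the pieces that peer has.
    """
    availability = Counter()
    for pieces in peer_pieces:
        availability.update(pieces - have)   # count only pieces we still need
    if not availability:
        return None
    candidates = sorted(availability)
    if len(have) < RANDOM_FIRST_PIECES:
        return rng.choice(candidates)        # bootstrap: random piece
    rarest = min(availability.values())      # local rarest first
    return rng.choice([p for p in candidates if availability[p] == rarest])


def endgame_requests(missing_blocks, peers):
    """Endgame mode: request every missing block from every peer."""
    return [(peer, block) for block in missing_blocks for peer in peers]


def cancels_after_receipt(block, sender, peers):
    """Once a block arrives, cancel the duplicate requests sent to other peers."""
    return [(peer, block) for peer in peers if peer != sender]
```

"Local" rarest first means rarity is computed only over the peers in our own peer list, which is all the information a client has.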

Last-block problem

● at the end of the download, a peer may have trouble finding the few missing pieces
● based on anecdotal evidence
● other proposals
  • network coding [Gkantsidis et al., Infocom’05]
  • prefer to upload to peers with similar file completeness; unfair for the peers having most of the pieces [Tian et al., Infocom’06]

Last-block problem – a myth?

● is it a problem after all?

[Figure from [Legout et al., INRIA-TR-2006], with permission]

Peer selection - unchoking

[Figure: leecher A, seed, leecher B, leecher C]

• periodically (typically every 10 seconds) calculate data-receiving rates

• upload to (unchoke) the fastest

• constant number of unchoking slots

• based on the “tit-for-tat” strategy
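One unchoking round as described above might look like the sketch below. The slot count of 4 is an assumption (the slide only says the number is constant), as are the function and parameter names.

```python
UNCHOKE_SLOTS = 4  # assumed value; the slide only says the number is constant


def choose_unchoked(download_rates, interested, slots=UNCHOKE_SLOTS):
    """Pick the peers to upload to for the next round.

    download_rates: peer -> bytes/s we received from that peer last round.
    interested: peers that currently want data from us.
    """
    ranked = sorted(
        interested,
        key=lambda peer: download_rates.get(peer, 0.0),
        reverse=True,
    )
    return set(ranked[:slots])


rates = {"A": 50.0, "B": 10.0, "C": 80.0, "D": 0.0, "E": 30.0}
unchoked = choose_unchoked(rates, ["A", "B", "C", "D", "E"], slots=3)
```

Because a peer is ranked by how fast it uploads to us, uploading more is the way to earn a slot, which is the tit-for-tat incentive.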

Optimistic unchoking

● periodically select a peer at random and upload to it
  • typically every 3 unchoking rounds (30 seconds)
● multi-purpose mechanism
  • allow bootstrapping of new clients
  • continuously look for the fastest partners
  • robustness: every peer has a non-zero chance of interacting with any other peer
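A possible way to rotate the optimistic slot is sketched below. The function name and the choice to keep the current optimistic peer between rotations are assumptions for illustration.

```python
import random

OPTIMISTIC_EVERY = 3  # unchoking rounds, as quoted on the slide


def optimistic_unchoke(round_number, choked_interested, current, rng=random):
    """Return the peer holding the optimistic slot for this round.

    A new peer is drawn uniformly at random every OPTIMISTIC_EVERY rounds;
    in between, the current choice is kept.
    """
    if round_number % OPTIMISTIC_EVERY != 0 or not choked_interested:
        return current
    return rng.choice(sorted(choked_interested))
```

Drawing uniformly from the choked, interested peers is what gives a newly joined client with nothing to offer a chance to receive its first pieces.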

Seed unchoking

● old algorithm
  • unchoke the fastest leechers
  • problem: fastest peers may monopolize seeds
● new algorithm
  • periodically sort all leechers according to their last unchoke time
  • prefer the most recently unchoked leechers; on a tie, prefer the fastest
  • (presumably) achieves equal spread of seed bandwidth
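The ordering used by the new algorithm, as worded above, can be sketched with a two-part sort key. This follows only the slide's description; the actual mainline implementation may include details not covered here, and the tuple layout and function name are assumptions.

```python
def seed_unchoke_order(leechers):
    """Order leechers for a seed's unchoke slots.

    leechers: list of (peer, last_unchoke_time, upload_rate_to_peer) tuples.
    Most recently unchoked first; ties broken by the higher rate.
    """
    ranked = sorted(leechers, key=lambda entry: (-entry[1], -entry[2]))
    return [peer for peer, _, _ in ranked]


order = seed_unchoke_order([
    ("A", 100, 5.0),
    ("B", 120, 1.0),
    ("C", 100, 9.0),
    ("D", 0, 50.0),   # fast, but never unchoked so far
])
```

Note that peer D, although the fastest, is ranked last here: speed alone no longer decides who gets a seed's bandwidth.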

Downloading only from seeds

[Figure: leecher A sends new list requests to the tracker and receives peer lists; also shown: seed, leecher B, leecher C]

● repeatedly query the tracker for peer lists

● distinguish the seeds, and receive data from them

● violates fairness model; may be harmful to honest peers

Rate- vs. volume-based selection

Proponents of rate-based decisions: [Cohen, P2PECON’03], and [INRIA TR’2006]

Proponents of volume-based decisions: [Bharambe et al., MSR-TR-2005], [Gkantsidis et al., Infocom’05], [Jun et al., P2PECON’05], and the eDonkey file-sharing system

No clear winner yet!

Client implementations

mainline: written in Python; right now, the only one employing the new seed unchoking algorithm

Azureus: the most popular, written in Java; implements a special protocol between clients (e.g. peers can exchange peer lists)

other popular clients: ABC, BitComet, BitLord, BitTornado, μTorrent, Opera browser

various non-standard extensions
  • retaliation mode: detect compromised/malicious peers
  • anti-snubbing: ignore a peer who ignores us
  • super seeding: seed masquerading as a leecher

Resources #1

Basic BitTorrent mechanisms [Cohen, P2PECON’03]

BitTorrent specification Wiki: http://wiki.theory.org/BitTorrentSpecification

Measurement studies [Izal et al., PAM’04], [Pouwelse et al., Delft TR 2004 and IPTPS’05], [Guo et al., IMC’05], and [Legout et al., INRIA-TR-2006]

Resources #2

Theoretical analysis and modeling [Qiu et al., SIGCOMM’04], and [Tian et al., Infocom’06]

Simulations [Bharambe et al., MSR-TR-2005]

Sharing incentives and exploiting them [Shneidman et al., PINS’04], [Jun et al., P2PECON’05], and [Liogkas et al., IPTPS’06]

Conclusion and food for thought

BitTorrent is fast and robust

Yet, many parameters are arbitrarily set
  • number of unchoking slots
  • unchoking round duration
  • size of pieces / sub-pieces

What can we learn from BitTorrent for the design of future P2P content distribution protocols?
