winter 2008 P2P 1
Peer-to-Peer Networks:Unstructured and Structured
• What is a peer-to-peer network?• Unstructured Peer-to-Peer Networks
– Napster– Gnutella– KaZaA– BitTorrent
• Distributed Hash Tables (DHT) and Structured Networks– Chord – Pros and Cons
Readings: do required and optional readings if interested
winter 2008 P2P 2
Peer-to-Peer Networks: How Did it Start?
• A killer application: Napster
  – Free music over the Internet
• Key idea: share the content, storage, and bandwidth of individual (home) users
[Figure: individual home users sharing content over the Internet]
winter 2008 P2P 3
Model
• Each user stores a subset of files
• Each user has access to (can download) files from all users in the system
winter 2008 P2P 4
Main Challenge
• Find where a particular file is stored
[Figure: peers A through F; a peer asks which peer stores file E]
winter 2008 P2P 5
Other Challenges
• Scale: up to hundreds of thousands or millions of machines
• Dynamicity: machines can come and go at any time
winter 2008 P2P 6
Peer-to-Peer Networks: Napster
• Napster history: the rise
  – January 1999: Napster version 1.0
  – May 1999: company founded
  – September 1999: first lawsuits
  – 2000: 80 million users
• Napster history: the fall
  – Mid 2001: out of business due to lawsuits
  – Mid 2001: dozens of P2P alternatives that were harder to touch, though these have gradually been constrained
  – 2003: growth of pay services like iTunes
• Napster history: the resurrection
  – 2003: Napster reconstituted as a pay service
  – 2006: still lots of file sharing going on

Shawn Fanning, Northeastern freshman
winter 2008 P2P 7
Napster Technology: Directory Service
• User installs the software
  – Download the client program
  – Register name, password, local directory, etc.
• Client contacts Napster (via TCP)
  – Provides a list of music files it will share
  – … and Napster's central server updates the directory
• Client searches on a title or performer
  – Napster identifies online clients with the file
  – … and provides IP addresses
• Client requests the file from the chosen supplier
  – Supplier transmits the file to the client
  – Both client and supplier report status to Napster
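The central directory described above can be sketched as a simple in-memory index mapping titles to the peers that share them. All names here (DirectoryServer, register, search) are illustrative, not Napster's actual protocol.

```python
# Sketch of a Napster-style central directory: titles map to the set of
# online peers that currently share them.

class DirectoryServer:
    def __init__(self):
        self.index = {}  # title -> set of (peer_ip, peer_port)

    def register(self, peer, titles):
        """A client reports the files it is willing to share."""
        for title in titles:
            self.index.setdefault(title, set()).add(peer)

    def unregister(self, peer):
        """Remove a departed client from every entry."""
        for peers in self.index.values():
            peers.discard(peer)

    def search(self, title):
        """Return addresses of online clients holding the file."""
        return sorted(self.index.get(title, set()))

server = DirectoryServer()
server.register(("10.0.0.1", 6699), ["song_a", "song_b"])
server.register(("10.0.0.2", 6699), ["song_b"])
print(server.search("song_b"))  # both peers are returned for song_b
```

The actual file transfer then happens directly between the two clients; the directory only brokers the rendezvous.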
winter 2008 P2P 8
Napster
• Assume a centralized index system that maps files (songs) to machines that are alive
• How to find a file (song)
  – Query the index system, which returns a machine that stores the required file
    • Ideally this is the closest/least-loaded machine
  – FTP the file
• Advantages:
  – Simplicity; easy to implement sophisticated search
• Disadvantages:
  – Search scope may be quite large
  – Search time may be quite long
  – High overhead, and nodes come and go often
winter 2008 P2P 19
Peer-to-Peer Networks: KaZaA
• KaZaA history
  – 2001: created by Dutch company Kazaa BV
  – Single network called FastTrack, used by other clients as well
  – Eventually the protocol changed so other clients could no longer talk to it
• Smart query flooding
  – Join: on start, the client contacts a super-node (and may later become one)
  – Publish: client sends list of files to its super-node
  – Search: send query to super-node, and the super-nodes flood queries among themselves
  – Fetch: get file directly from peer(s); can fetch from multiple peers at once
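The publish/search steps above can be sketched as a two-tier index: ordinary peers publish file lists to their super-node, and queries flood only among super-nodes. This is an illustrative sketch, not the FastTrack protocol; all names and the TTL value are assumptions.

```python
# KaZaA-style smart query flooding: queries travel only in the
# super-node tier, bounded by a time-to-live (TTL).

class SuperNode:
    def __init__(self, name):
        self.name = name
        self.files = {}       # filename -> set of child peer names
        self.neighbors = []   # other super-nodes

    def publish(self, peer, filenames):
        """An ordinary peer reports its shared files to this super-node."""
        for f in filenames:
            self.files.setdefault(f, set()).add(peer)

    def search(self, filename, ttl=2, seen=None):
        """Answer locally, then flood to neighboring super-nodes."""
        seen = seen if seen is not None else set()
        seen.add(self.name)
        hits = set(self.files.get(filename, set()))
        if ttl > 0:
            for nb in self.neighbors:
                if nb.name not in seen:
                    hits |= nb.search(filename, ttl - 1, seen)
        return hits

s1, s2 = SuperNode("s1"), SuperNode("s2")
s1.neighbors = [s2]
s2.publish("peerX", ["movie.avi"])
print(s1.search("movie.avi"))  # found via flooding to s2
```

Because only super-nodes see queries, the many weakly connected ordinary peers never carry flooding traffic, which is the point of the design.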
winter 2008 P2P 20
KaZaA: Exploiting Heterogeneity
• Each peer is either a group leader or assigned to a group leader
  – TCP connection between peer and its group leader
  – TCP connections between some pairs of group leaders
• Group leader tracks the content in all its children

[Figure: overlay network showing ordinary peers, group-leader peers, and neighboring relationships between group leaders]
winter 2008 P2P 21
KaZaA: Motivation for Super-Nodes
• Query consolidation
  – Many connected nodes may have only a few files
  – Propagating a query to a sub-node may take more time than for the super-node to answer itself
• Stability
  – Super-node selection favors nodes with high up-time
  – How long you've been on is a good predictor of how long you'll be around in the future
winter 2008 P2P 22
Peer-to-Peer Networks: BitTorrent
• BitTorrent history and motivation
  – 2002: B. Cohen debuted BitTorrent
  – Key motivation: popular content
    • Popularity exhibits temporal locality (flash crowds)
    • E.g., Slashdot effect, CNN web site on 9/11, release of a new movie or game
  – Focused on efficient fetching, not searching
    • Distribute the same file to many peers
    • Single publisher, many downloaders
  – Preventing free-loading
winter 2008 P2P 23
BitTorrent: Simultaneous Downloading
• Divide a large file into many pieces
  – Replicate different pieces on different peers
  – A peer with a complete piece can trade with other peers
  – Peer can (hopefully) assemble the entire file
• Allows simultaneous downloading
  – Retrieving different parts of the file from different peers at the same time
winter 2008 P2P 24
BitTorrent Components
• Seed
  – Peer with the entire file
  – Fragmented into pieces
• Leecher
  – Peer with an incomplete copy of the file
• Torrent file
  – Passive component
  – Stores summaries of the pieces to allow peers to verify their integrity
• Tracker
  – Allows peers to find each other
  – Returns a list of random peers
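The "summaries of the pieces" in the torrent file are per-piece hashes (real BitTorrent uses SHA-1 over fixed-size pieces). A minimal sketch of how a leecher verifies a downloaded piece, with a toy piece size chosen for the example:

```python
# Per-piece integrity checking: the .torrent file carries one hash per
# piece; a downloaded piece is accepted only if its hash matches.
import hashlib

PIECE_SIZE = 4  # tiny for the example; real torrents use e.g. 256 KiB

def make_piece_hashes(data):
    """Hashes the publisher stores in the .torrent file."""
    pieces = [data[i:i + PIECE_SIZE] for i in range(0, len(data), PIECE_SIZE)]
    return [hashlib.sha1(p).digest() for p in pieces]

def verify_piece(index, piece, piece_hashes):
    """A leecher checks a received piece against the stored hash."""
    return hashlib.sha1(piece).digest() == piece_hashes[index]

data = b"hello world!"
hashes = make_piece_hashes(data)          # distributed via the .torrent file
print(verify_piece(0, b"hell", hashes))   # True: piece is intact
print(verify_piece(1, b"XXXX", hashes))   # False: corrupted piece rejected
```

This is why the torrent file can stay passive: any peer can serve pieces, and the downloader rejects bad data without trusting the sender.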
winter 2008 P2P 25
BitTorrent: Overall Architecture
[Figure: overall architecture: a web server hosts a page linking to the .torrent file; a tracker; the downloader "US"; peer A is a seed, peers B and C are leeches]
winter 2008 P2P 26
BitTorrent: Overall Architecture
[Figure: the downloader "US" fetches the .torrent from the web server and sends a get-announce request to the tracker]
winter 2008 P2P 27
BitTorrent: Overall Architecture
[Figure: the tracker answers the downloader with a response-peer list]
winter 2008 P2P 28
BitTorrent: Overall Architecture
[Figure: the downloader shakes hands with peers from the list, including the seed and a leech]
winter 2008 P2P 29
BitTorrent: Overall Architecture
[Figure: the downloader begins exchanging pieces with the seed and a leech]
winter 2008 P2P 30
BitTorrent: Overall Architecture
[Figure: pieces flow among the downloader, the seed, and both leeches]
winter 2008 P2P 31
BitTorrent: Overall Architecture
[Figure: the complete exchange: get-announce and response-peer list with the tracker, plus piece exchange among the downloader, the seed, and the leeches]
winter 2008 P2P 32
Free-Riding Problem in P2P Networks
• The vast majority of users are free-riders
  – Most share no files and answer no queries
  – Others limit the number of connections or the upload speed
• A few "peers" essentially act as servers
  – A few individuals contribute to the public good
  – Making them hubs that basically act as servers
• BitTorrent prevents free-riding
  – Allow the fastest peers to download from you
  – Occasionally let some free-loaders download
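The last two bullets describe tit-for-tat unchoking: reciprocate with the peers that upload fastest to you, plus one random "optimistic" unchoke so newcomers and free-riders occasionally get data. A sketch under assumed parameters (the function name and the count of regular unchoke slots are illustrative):

```python
# Tit-for-tat unchoking sketch: reward the fastest uploaders, plus one
# random optimistic unchoke.
import random

def choose_unchoked(upload_rates, n_regular=3, rng=random):
    """upload_rates: dict peer -> observed download rate from that peer."""
    by_rate = sorted(upload_rates, key=upload_rates.get, reverse=True)
    unchoked = by_rate[:n_regular]        # reciprocate the fastest peers
    choked = by_rate[n_regular:]
    if choked:                            # occasionally let others download
        unchoked.append(rng.choice(choked))
    return unchoked

rates = {"a": 50, "b": 40, "c": 30, "d": 0, "e": 0}
peers = choose_unchoked(rates, rng=random.Random(1))
print(peers[:3])  # the three fastest uploaders: a, b, c
```

The optimistic slot is what lets a brand-new peer with nothing to trade bootstrap its first pieces.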
winter 2008 P2P 33
Distributed Hash Tables (DHTs)
• Abstraction: a distributed hash-table data structure
  – insert(id, item);
  – item = query(id); (or lookup(id);)
  – Note: item can be anything: a data object, a file, a pointer to a file, etc.
• Route packet (ID, data) to the node responsible for ID using successor pointers
[Figure: Chord ring with nodes 4, 8, 15, 20, 32, 35, 44, 58; lookup(37) is forwarded along successor pointers and resolves at node 44]
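Routing along successor pointers can be sketched directly on the ring from the figure (nodes 4, 8, 15, 20, 32, 35, 44, 58). The helper names are illustrative; only the successor-hopping rule comes from the slide.

```python
# Chord-style lookup using only successor pointers: hop clockwise until
# the key falls between the current node and its successor.
NODES = [4, 8, 15, 20, 32, 35, 44, 58]

def successor(node):
    """Next node clockwise on the ring."""
    i = NODES.index(node)
    return NODES[(i + 1) % len(NODES)]

def between(x, a, b):
    """True if x lies in the half-open ring interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(key, start):
    node = start
    while not between(key, node, successor(node)):
        node = successor(node)   # forward the query one hop clockwise
    return successor(node)       # the node responsible for the key

print(lookup(37, 4))  # node 44 is the first node with id >= 37
```

With only successor pointers each lookup costs O(n) hops; the finger tables later in the deck cut this to O(log n).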
winter 2008 P2P 39
Joining Operation
• Each node A periodically sends a stabilize() message to its successor B
• Upon receiving a stabilize() message, node B
  – returns its predecessor B' = pred(B) to A by sending a notify(B') message
• Upon receiving notify(B') from B,
  – if B' is between A and B, A updates its successor to B'
  – otherwise, A does nothing
winter 2008 P2P 40
Joining Operation
• Node with id=50 joins the ring
• Node 50 needs to know at least one node already in the system
  – Assume the known node is 15

[Figure: ring with nodes 4, 8, 15, 20, 32, 35, 44, 58; joining node 50 has succ=nil, pred=nil; node 58 has succ=4, pred=44; node 44 has succ=58, pred=35]
winter 2008 P2P 41
Joining Operation
• Node 50: sends join(50) to node 15
• Node 44: returns node 58
• Node 50 updates its successor to 58

[Figure: join(50) is routed from node 15 around the ring to node 44, which returns 58; node 50 sets succ=58, pred=nil]
winter 2008 P2P 42
Joining Operation
• Node 50: sends stabilize() to node 58
• Node 58:
  – updates its predecessor to 50
  – sends notify() back

[Figure: node 50 (succ=58, pred=nil) sends stabilize() to node 58, which sets pred=50 and replies with notify(pred=50)]
winter 2008 P2P 43
Joining Operation (cont'd)
• Node 44 sends a stabilize message to its successor, node 58
• Node 58 replies with a notify message
• Node 44 updates its successor to 50

[Figure: node 44 sends stabilize() to node 58, receives notify(pred=50), and sets succ=50]
winter 2008 P2P 44
Joining Operation (cont'd)
• Node 44 sends a stabilize message to its new successor, node 50
• Node 50 sets its predecessor to node 44

[Figure: node 44 sends stabilize() to node 50; node 50 sets pred=44]
winter 2008 P2P 45
Joining Operation (cont’d)
• This completes the joining operation!

[Figure: final state of the ring: node 44 has succ=50; node 50 has succ=58, pred=44; node 58 has pred=50]
winter 2008 P2P 46
Achieving Efficiency: finger tables
• The i-th entry at the peer with id n is the first peer with id >= (n + 2^i) mod 2^m
• Say m = 7 (identifier space 0..127)

Finger Table at 80:
  i | ft[i]
  0 | 96
  1 | 96
  2 | 96
  3 | 96
  4 | 96
  5 | 112
  6 | 20

[Figure: ring with nodes 20, 32, 45, 80, 96, 112; node 80's fingers target 80 + 2^0, 80 + 2^1, …, 80 + 2^5, and (80 + 2^6) mod 2^7 = 16]
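The table above can be reproduced directly from the rule. A short sketch using the ring from the figure (helper names are illustrative):

```python
# Recompute node 80's finger table: entry i points to the first live node
# with id >= (n + 2**i) mod 2**m, wrapping around the ring.
M = 7                                    # ids live in 0 .. 2**7 - 1
NODES = sorted([20, 32, 45, 80, 96, 112])

def first_node_at_or_after(target):
    for nid in NODES:
        if nid >= target:
            return nid
    return NODES[0]                      # wrap around past the top of the ring

def finger_table(n):
    return [first_node_at_or_after((n + 2**i) % 2**M) for i in range(M)]

print(finger_table(80))  # [96, 96, 96, 96, 96, 112, 20]
```

Note how entry 6 wraps: (80 + 64) mod 128 = 16, and the first node at or after 16 is 20, so the last finger jumps halfway around the ring, which is what gives Chord its O(log n) lookups.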
winter 2008 P2P 47
Achieving Robustness
• To improve robustness, each node maintains the k (k > 1) immediate successors instead of only one successor
• In the notify() message, node A can send its k-1 successors to its predecessor B
• Upon receiving a notify() message, B can update its successor list by concatenating the successor list received from A with A itself
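The list-update rule in the last bullet is a one-liner: prepend A to A's own successor list and truncate to k entries. A minimal sketch (function name and k value are illustrative):

```python
# Successor-list maintenance for robustness: B's new list is A itself
# followed by the first k-1 entries of A's successor list.
K = 3

def updated_successor_list(a_id, a_succ_list):
    return ([a_id] + a_succ_list)[:K]

# Node 44's successor 50 reports its own list [58, 4, 8]:
print(updated_successor_list(50, [58, 4, 8]))  # [50, 58, 4]
```

If node 50 then fails, node 44 can fall back to 58 from its list instead of losing its place in the ring.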
winter 2008 P2P 48
Chord Optimizations
• Reduce latency
  – Choose the finger that reduces the expected time to reach the destination
  – Choose the closest node from the range [N+2^(i-1), N+2^i) as successor
• Accommodate heterogeneous systems
  – Multiple virtual nodes per physical node
winter 2008 P2P 49
DHT Conclusions
• Distributed Hash Tables are a key component of scalable and robust overlay networks
• Chord: O(log n) state, O(log n) distance
• Both can achieve stretch < 2
• Simplicity is key
• Services built on top of distributed hash tables