What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion
Introduction on Peer to Peer systems
Georges Da Costa
Yerevan, Armenian National Academy of Sciences
[email protected] 1/55
Goal of this Lecture
What P2P can do, beyond the buzzword
What it can't do
Show some examples and algorithms
A Survey and Comparison of Peer-to-Peer Overlay Network Schemes, by Eng Keong Lua et al.
in IEEE Communications Surveys and Tutorials, March 2004
Harnessing the Power of Disruptive Technologies
published by O'Reilly, 2001
1 What is P2P
2 First generation systems
3 Self-organized systems
4 Structured systems
5 Distributed Hash Table
6 Conclusion
1 What is P2P
Universal
What do these have in common?
NetMeeting, Skype, Ekiga
IRC, MSN, ICQ, Jabber
Kazaa, Freenet, Napster, Gnutella
Seti@Home, Folding@Home
eBay, Flickr, Facebook
Definition
Philosophical one
Participants gathering their resources in order to achieve a common goal
Why?
Available resources
Large Hard Drives
Powerful CPUs
Decent Internet connection
Users want
More freedom
No ties to commercial companies
No infrastructure cost
A new (?) solution: Peer to Peer systems
Definition
Participants gathering their resources in order to achieve a common goal
Computers are running the same code
There is no global view of the system
View is limited to neighbors
Everyone has the same rights and duties
Peer-to-Peer: New name, old concept
An architecture already there
The Internet connects most existing computers
Most computers are not fully used
Idle time > 75% on personal computers
Storage systems are mostly empty
Already used between servers
Usenet
DNS
IP Routing
Comparison with Client/Server
In Client/Server, each node is either a client or a server. Usually there are a few servers and many clients.
Client/Server systems suffer from a single point of failure.
Client/Server systems are mostly static, at least on the server side. Peer to Peer systems are dynamic.
Client/Server systems need human administrators.
Client/Server scales poorly: each new client adds load on the servers.
Comparison with Client/Server II
[Figure: a Client/Server layout (six clients and one server) next to a Peer to Peer layout (five nodes)]
Comparison with Client/Server II
When a new participant joins a service, the resource consumption of the service increases
Client/Server: the power and connectivity of the server must be increased
Peer to Peer: the service uses the resources contributed by the new participant
Not so easy
Wanted
Scalability (1K,100K,1M nodes)
Dynamicity
Security (user, task)
Transparency
For the user (CPU, memory, disk)
For the network
Heterogeneity
Self-organization
Participation (66% of users are free riders)
Traversal of NATs and firewalls
Self-organization
Participants
High volatility & voluntary
No central administration
Resource discovery
Heterogeneity
Hardware
Users (15% of users have 94% of files)
Distribution of the resources
Trust
What's not new
Partial solutions
Scalability: web server farms
Dynamism: cell phones
Fault tolerance: redundant servers
Current Peer to Peer systems
Available applications
File sharing
Distributed storage
Content delivery
Distributed computing
Telephony/Chat
Games
Current Peer to Peer systems (cont)
Widely used
2004: According to British Web analysis firm CacheLogic, BitTorrent accounts for an astounding 35 percent of all the traffic on the Internet, more than all other peer-to-peer programs combined, and dwarfs mainstream traffic like Web pages
Start-ups
Skype (admittedly no longer a small start-up)
BitTorrent
UbiStorage
Two worlds
Internet Users
Security problems
Large scale
No control
Motivation needed
Private Area (Corp., Univ.)
Other means of security
Medium to large scale
Total control
2 First generation systems
Index Method
[Figure: clients connected to a central server. Legend: static connection, sending the file list, sending a request for a file, file transfer]
Users send the list of their files to a server
To find a file, you send a request to the server
It answers with the list of clients owning the file
You directly contact the owners for the transfer
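The four steps above can be sketched as a toy central index. This is only an illustration under assumptions: the class name `IndexServer`, its method names and the peer names are hypothetical, and no real Napster protocol detail is reproduced.

```python
from collections import defaultdict


class IndexServer:
    """Central index: maps a file name to the peers that announced it."""

    def __init__(self):
        self.owners = defaultdict(set)

    def publish(self, peer, filenames):
        # A peer sends the list of its files to the server.
        for name in filenames:
            self.owners[name].add(peer)

    def lookup(self, filename):
        # The server only answers with the owners; the transfer itself
        # then happens directly between peers, not through the server.
        return sorted(self.owners.get(filename, set()))


server = IndexServer()
server.publish("alice", ["song.mp3", "paper.pdf"])
server.publish("bob", ["song.mp3"])
print(server.lookup("song.mp3"))  # ['alice', 'bob']
```

Every lookup goes through the single `server` object, which is where the scaling and single-point-of-failure concerns of this approach come from.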
Index Method II
Systems
Napster, Mojonation, Yaga, Filetopia, Seti@Home
Problems
Scaling
Cost
Hot spots
Attacks
Single point of failure
Useful when...
Small number of clients
Total control of transfers is needed (video game industry)
Performance is more important than cost
BitTorrent
Same approach as Napster, but:
Downloads are done in parallel
One server per file
The server manages all the details of transfers
The server enforces the rule "The more you share, the more you get"
Differences
Specialized for large files
Distributed, thanks to the "one server per file" rule
Privacy
No privacy
Napster: the server knows all transfers
BitTorrent: for each file, a server knows all transfers
Flooding
[Figure: seven clients connected to each other, with no server]
You send your request to your neighbors
They forward it to their neighbors, and so on until reaching the Time To Live depth
Users with files corresponding to the request answer
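A minimal sketch of this flooding search, assuming a small hypothetical overlay (`graph`) and file placement (`files`). It counts forwarded messages and shows that a short Time To Live can miss existing copies. Real Gnutella messages, identifiers and duplicate handling are not modeled.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a query from `start`, stopping after `ttl` hops.

    graph: node -> list of neighbor nodes
    files: node -> set of file names stored on that node
    Returns (nodes that answer, number of messages sent).
    """
    seen = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, hops_left = queue.popleft()
        if node != start and wanted in files.get(node, set()):
            hits.append(node)
        if hops_left == 0:
            continue  # Time To Live exhausted: do not forward further
        for neighbor in graph[node]:
            messages += 1  # every forward costs one message
            if neighbor not in seen:  # a node handles a query only once
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return hits, messages


graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"d": {"x.iso"}, "e": {"x.iso"}}
print(flood_search(graph, files, "a", "x.iso", ttl=2))  # (['d'], 6)
print(flood_search(graph, files, "a", "x.iso", ttl=3))  # (['d', 'e'], 9)
```

With `ttl=2` the copy on node `e` is never found, and raising the Time To Live finds it at the price of more messages.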
Flooding II
Systems
Gnutella, Direct Connect
Characteristics
Distributed structure
No single point of failure
Denial of service difficult (but possible)
Not scalable
Resource consumption (network)
Answers are not complete
Privacy
Average to good privacy
Onion routing (good privacy)
No global view of the system
Usually easy to obtain the shared list of a node
Difficult to have a global impact
Super Peers
[Figure: six super peers connected to each other, with ordinary peers attached to a super peer]
Super Peers act as local servers
Some reliable nodes act as super peers
Super peers are connected to each other with a Gnutella-style protocol
Each super peer acts as a local server for several peers
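The two-level organization can be sketched as follows. This is a simplified illustration, not the Gnutella2 or Kazaa protocol: the `SuperPeer` class, its methods and the peer names are hypothetical, and the forwarding between super peers is a plain depth-limited walk.

```python
class SuperPeer:
    """Local server for its attached peers, linked to other super peers."""

    def __init__(self, name):
        self.name = name
        self.index = {}   # file name -> set of attached peers owning it
        self.links = []   # neighboring super peers

    def register(self, peer, filenames):
        # An ordinary peer announces its files to its super peer only.
        for filename in filenames:
            self.index.setdefault(filename, set()).add(peer)

    def search(self, filename, ttl=1, seen=None):
        # Answer from the local index, then forward among super peers.
        seen = set() if seen is None else seen
        seen.add(self.name)
        found = set(self.index.get(filename, ()))
        if ttl > 0:
            for other in self.links:
                if other.name not in seen:
                    found |= other.search(filename, ttl - 1, seen)
        return found


sp1, sp2, sp3 = SuperPeer("sp1"), SuperPeer("sp2"), SuperPeer("sp3")
sp1.links, sp2.links, sp3.links = [sp2], [sp1, sp3], [sp2]
sp1.register("alice", ["a.txt"])
sp3.register("carol", ["x.iso"])
print(sorted(sp1.search("x.iso", ttl=1)))  # []
print(sorted(sp1.search("x.iso", ttl=2)))  # ['carol']
```

Only the super peers take part in the flooding, so ordinary peers such as `alice` and `carol` never forward queries.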
Super Peers II
Systems
Gnutella2, Kazaa
Characteristics
Less distributed structure
Some nodes are more loaded
Some nodes are more important
Scalable
Lower resource consumption, because the number of answers is limited
3 Self-organized systems
A case study : Freenet
Ian Clarke, University of Edinburgh (1999)
Keywords
A peer-to-peer file sharing system
Provides anonymity for authors and readers
A web of Freedom
Principle
Files are referenced by key
The key is obtained by applying SHA-1 to the file
The key is routed through the network to locate the file
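As a small illustration of the key derivation, the sketch below hashes the file content with SHA-1 using Python's standard `hashlib`. The helper name `file_key` is hypothetical, and the actual Freenet key formats are richer than a bare content hash.

```python
import hashlib


def file_key(content: bytes) -> str:
    """Return the SHA-1 digest of the file content as 40 hex characters."""
    return hashlib.sha1(content).hexdigest()


key = file_key(b"some file content")
print(key, len(key))
```

Two files with the same content get the same key, whatever their names, so the key identifies the content rather than the owner.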
Content Driven routing algorithm
Routing table contains a set of key/node pairs
Take the nearest key in the routing table to obtain the next node to consult.
Nearest key: determined by lexical comparison
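A minimal sketch of the next-hop choice. The slides do not define "nearest" precisely, so this example assumes it means the longest common prefix, with ties broken by lexical order. The routing table contents and node names are hypothetical.

```python
import os


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    return len(os.path.commonprefix([a, b]))


def next_hop(routing_table, key):
    """routing_table maps a known key to the node it was learned from.

    Assumption: the nearest key is the one sharing the longest prefix
    with the requested key (first in lexical order on ties).
    """
    nearest = max(sorted(routing_table),
                  key=lambda known: common_prefix_len(known, key))
    return routing_table[nearest]


table = {"abc": "node c", "acd": "node d", "abb": "node b"}
print(next_hop(table, "acf"))  # node d
```

The request for `acf` is sent to the node associated with `acd`, the closest known key, and that node repeats the same choice with its own table.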
[Figure: request and data paths between nodes a to e, with steps numbered 1 to 8 and routing-table entries for keys abc, acd and abb]
On the path of the answer
The file is replicated in the cache of each node on the path
Cache: variant of Least Recently Used
Routing tables are updated
→ the graph evolves (new links = new entries)
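The cache policy can be illustrated with a plain Least Recently Used store. Freenet uses a variant of it, which is not reproduced here. The class `LRUStore` and its capacity are hypothetical.

```python
from collections import OrderedDict


class LRUStore:
    """Fixed-size store that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a hit makes the entry recent again
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry


store = LRUStore(capacity=2)
store.put("abc", b"file 1")
store.put("acd", b"file 2")
store.get("abc")              # "abc" becomes the most recent entry
store.put("abb", b"file 3")   # evicts "acd"
print(list(store.items))      # ['abc', 'abb']
```

Popular files keep being refreshed and stay cached, while files nobody requests are eventually evicted from every node.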
[Figure: nodes a to e after the answer. Legend: old links, new links (entries); routing-table keys abc, acd and abb on nodes b, c and d]
Anonymity
Reader
Impossible to know if a user is forwarding or initiating the request
Impossible to know if a user is the last to receive a file
Writer
Once the file is in the system, the writer can disconnect
Impossible to know if someone inserted a file or merely forwarded it
Some properties
Self-organization of the graph
Nodes specialize in files with close keys (learning process)
Good properties (Small World)
Files are automatically replicated according to their popularity
Hot spots are limited
Tolerant to attacks
Drawbacks
Trade-offs
Files might disappear (LRU cache)
The network is heavily loaded
Difficult to update a value
Impossible to know what is hosted locally
4 Structured systems
Pastry
Principle
Each file has a key
Each node has an identifier
The node with identifier Id manages the keys whose values are near Id
Queries
Content driven queries
Suffix forwarding
Pastry II
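Table of the node 4598 (x stands for any digit):
0598 1598 2598 3598 4598 5598 6598 7598
x098 x198 x298 x398 x498 x598 x698 x798
xx08 xx18 xx28 xx38 xx48 xx58 xx68 xx78
xxx0 xxx1 xxx2 xxx3 xxx4 xxx5 xxx6 xxx7
[Figure: overlay graph showing the links from node 4598 to its neighbors; node identifiers shown include 87CA, D598, 1598, 2118, 0325, 3E98, 0098 and 2BB8]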
The neighbors of Id are chosen so that their identifiers share a suffix with Id
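Suffix forwarding can be sketched as follows: at each hop, the query moves to a neighbor whose identifier shares a longer suffix with the key than the current node does. The sketch follows the suffix convention of these slides. The `overlay` links are hypothetical and only reuse identifiers that appear in the figure.

```python
def common_suffix_len(a: str, b: str) -> int:
    """Number of trailing characters shared by a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def route(overlay, start, key):
    """Follow neighbors that match a strictly longer suffix of the key."""
    path = [start]
    current = start
    while True:
        better = [n for n in overlay[current]
                  if common_suffix_len(n, key) > common_suffix_len(current, key)]
        if not better:
            return path  # no neighbor improves the match: stop here
        current = max(better, key=lambda n: common_suffix_len(n, key))
        path.append(current)


# Hypothetical neighbor lists (node identifier -> known neighbors).
overlay = {
    "87CA": ["2118", "0325"],
    "2118": ["3E98"],
    "3E98": ["D598"],
    "D598": ["4598"],
    "4598": [],
    "0325": [],
}
print(route(overlay, "87CA", "4598"))
# ['87CA', '2118', '3E98', 'D598', '4598']
```

Each hop fixes at least one more digit of the key, which is why the number of hops grows with the number of digits, that is, logarithmically with the number of nodes.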
Pastry III
Pros
Guaranteed O(log n) messages per query
Good path redundancy
Cons
Difficult to keep the neighbor table synchronized
Problem of data redundancy
No adaptation to data dynamicity
5 Distributed Hash Table
Current state of Peer to Peer systems
A lot of redundant systems
Typically File Sharing
Common basic component
Distributed index (Key, Value)
Key is typically the filename
Value is typically the file content or where to obtain it
Each Key is associated with a node
Generic Interface
Node Id: k-bit identifier (unique)
Key: k-bit identifier (unique)
Value: bytes (can be a file, an IP, ...)
Generic DHT (Distributed Hash Table)
put(key, value)
Stores (key, value) on the node responsible for key
value = get(key)
Retrieves the data associated with key
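The interface above can be sketched with a toy, single-process table. Several choices here are assumptions made for the example only: names are mapped to k-bit identifiers with SHA-1, `K` is set to 32, and the responsible node is the one whose identifier is numerically closest to the key. Real DHTs such as Chord or CAN each define their own rule.

```python
import hashlib

K = 32  # identifier size in bits (assumption for this example)


def to_id(name: str) -> int:
    """Map a name to a K-bit identifier."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** K)


class ToyDHT:
    def __init__(self, node_names):
        # One local store per node, indexed by node identifier.
        self.stores = {to_id(name): {} for name in node_names}

    def responsible(self, key_id):
        # Assumption: the closest node identifier is responsible.
        return min(self.stores, key=lambda node_id: abs(node_id - key_id))

    def put(self, key, value):
        self.stores[self.responsible(to_id(key))][key] = value

    def get(self, key):
        return self.stores[self.responsible(to_id(key))].get(key)


dht = ToyDHT(["node-1", "node-2", "node-3"])
dht.put("song.mp3", "10.0.0.7")
print(dht.get("song.mp3"))  # 10.0.0.7
print(dht.get("missing"))   # None
```

The caller never chooses a node: `put` and `get` both derive the responsible node from the key alone, which is the property every DHT in this section provides.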
Current implementations
Software
Kademlia
Chord
CAN
Usage
File sharing
Naming
Chat service
Databases
Still limited
Fundamental Problems
Complex requests
Data coherence
Requests with several answers
Implementation difficulties
Distribute workload evenly
Keys
Requests
Only local information
Dynamic information
Chord structure
Nodes are distributed on a circle
Keys are assigned to the node whose Id comes just before their value
[Figure: Chord ring with marks at 0, 64, 128 and 192, showing nodes 61 and 75; keys 62, 66 and 74 are in the store of node 61]
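The assignment rule can be sketched on a ring of 2^8 identifiers, as suggested by the marks 0 to 192 in the figure. Nodes 61 and 75 come from the figure, while nodes 130 and 200 are hypothetical. The sketch follows the rule stated on this slide (the node just before the key), whereas the original Chord paper assigns a key to the first node after it.

```python
from bisect import bisect_right


def responsible_node(node_ids, key, bits=8):
    """Return the node whose Id is the last one at or before the key."""
    ring = sorted(node_ids)
    key %= 2 ** bits
    position = bisect_right(ring, key)  # number of node Ids <= key
    # position == 0 means the key is before every node: ring[-1] wraps
    # around the circle to the node with the largest Id.
    return ring[position - 1]


nodes = [61, 75, 130, 200]
for key in (62, 66, 74, 75, 10):
    print(key, "->", responsible_node(nodes, key))
# 62 -> 61, 66 -> 61, 74 -> 61, 75 -> 75, 10 -> 200
```

Keys 62, 66 and 74 all land on node 61, as in the figure, and key 10 wraps around the circle to node 200.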
Neighbors
Log(N) neighbors
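Neighbors are the nodes at $Id + 1$, $Id + 2$, $Id + 4$, ..., $Id + 2^i$, ..., $Id + 2^{k-1}$ (modulo $2^k$).
[Figure: Chord ring with marks at 0, 64, 128 and 192, showing the neighbor links of one node]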
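The neighbor positions can be computed directly from the formula. The sketch assumes k = 8 bits, to match the ring marks of the figure, and the function name `finger_targets` is hypothetical.

```python
def finger_targets(node_id: int, bits: int) -> list:
    """Positions Id + 2^i (modulo 2^bits) for i = 0 .. bits - 1."""
    ring = 2 ** bits
    return [(node_id + 2 ** i) % ring for i in range(bits)]


print(finger_targets(61, 8))   # [62, 63, 65, 69, 77, 93, 125, 189]
print(finger_targets(200, 8))  # [201, 202, 204, 208, 216, 232, 8, 72]
```

A node keeps k entries, one per power of two: half of them point into the nearest part of the ring, and the last one points halfway around it.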
Routing algorithm
Forward to the neighbor that comes just before the key
A query needs at most Log(N) messages
[Figure: Chord ring with marks at 0, 64, 128 and 192; a query for key 40 is forwarded to the node responsible for Ids between 35 and 50]
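A minimal sketch of this greedy routing on a ring of 2^8 identifiers. Two assumptions are made: the neighbor for position Id + 2^i is the first existing node at or after that position, as in the original Chord paper, and the node list is hypothetical apart from 35, 50, 61 and 75, which appear in the figures.

```python
BITS = 8
RING = 2 ** BITS


def clockwise(a, b):
    """Distance from a to b when walking clockwise on the ring."""
    return (b - a) % RING


def neighbors(node, nodes):
    """Existing nodes found at or after each position node + 2^i."""
    result = set()
    for i in range(BITS):
        target = (node + 2 ** i) % RING
        result.add(min(nodes, key=lambda n: clockwise(target, n)))
    result.discard(node)
    return result


def route(start, key, nodes):
    """Forward to the neighbor closest to the key without passing it."""
    path = [start]
    current = start
    while True:
        limit = clockwise(current, key)
        candidates = [n for n in neighbors(current, nodes)
                      if clockwise(current, n) <= limit]
        if not candidates:
            return path  # no node lies between current and the key
        here = current
        current = max(candidates, key=lambda n: clockwise(here, n))
        path.append(current)


nodes = [5, 20, 35, 50, 61, 75, 130, 200]
print(route(200, 40, nodes))  # [200, 20, 35]
```

The query for key 40 stops at node 35, the node responsible for Ids between 35 and 50, after two hops among eight nodes. Each hop strictly reduces the remaining clockwise distance to the key, so the walk always terminates.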
Chord characteristics
Efficient
If a (key, value) pair exists, the query will find it
Fast: Log2(1,000,000) ≈ 20
Small neighbor table: Log2(N) entries
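As a quick check of the orders of magnitude, the snippet below computes the base-2 logarithm, rounded up, for the network sizes mentioned earlier in the lecture (1K, 100K and 1M nodes). It bounds the number of messages only and says nothing about real latency.

```python
import math

for n in (1_000, 100_000, 1_000_000):
    print(n, math.ceil(math.log2(n)))
# 1000 10
# 100000 17
# 1000000 20
```

Multiplying the number of nodes by 1000 only doubles the number of hops, which is what makes this family of systems scalable.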
Chord characteristics
Some problems
Security and privacy
Attacks
How to test and evaluate such a system?
Real performance (instead of number of messages)
Physical overlay
The logical topology is mapped onto the physical network:
[Figure: a query and its answer between nodes N1 and N2, drawn over the physical network]
Conclusion
Peer to Peer systems are efficient for several uses (using resources at the edge of the network)
Recent systems are scalable
Low cost alternative to Client/Server
Field old enough to be used in real cases
Still not perfect
Trust & certification
Anonymity
Security
Performance
Lawyers' fees
When to use Peer to Peer systems
Limited budget
Large audience
Trusted users
Dynamic system, but not too much
No guarantees needed
No control needed
Vision of the future
User centered
No more servers
All content provided and served by users
Only cooperation of peers
Wikipedia
Social networks
Youtube
Good Ol' Time web-pages