P2P Search COP5711

P2P Search COP5711. 2 P2P Search Techniques Centralized P2P systems e.g. Napster, Decentralized unstructured P2P systems e.g. Gnutella.

Jan 18, 2018

Transcript
Page 1

P2P Search
COP5711

Page 2

P2P Search Techniques

Centralized P2P systems, e.g., Napster, SETI@home
Decentralized and unstructured P2P systems, e.g., Gnutella
Hybrid (partially decentralized), e.g., Freenet
Structured P2P systems: DHT, CAN

Page 3

P2P Network

A P2P network is an overlay network built on top of a real physical network (e.g., the Internet).
In a P2P network, peers are network nodes connected by virtual or logical links.
A logical link is a path that may traverse many physical links in the underlying network.

Page 4

Napster: Publish a File

Users upload their IP address and the titles of the music files they wish to share.

[Figure: peer 192.1.2.3 sends the entry (xyz.mp3, 192.1.2.3) to the Napster server (central catalog).]

Page 5

Napster: Query for a File

Users search the central Napster server for peers from which to download the desired files.

[Figure: a user asks the central Napster server "xyz.mp3?" and receives the address 192.1.2.3.]

Page 6

Napster: Transfer Requested File

File transfer is P2P, using a proprietary protocol.

[Figure: the user requests xyz.mp3 directly from peer 192.1.2.3; the central Napster server is shown alongside.]
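
The publish and query steps of a centralized directory can be sketched in a few lines of Python. This is a minimal in-memory illustration, not Napster's actual protocol: the class name `CentralCatalog` and its methods are hypothetical, and the file transfer itself (which happens peer to peer) is left out.

```python
from collections import defaultdict


class CentralCatalog:
    """Hypothetical stand-in for the central Napster server."""

    def __init__(self):
        # title -> set of peer IP addresses that share the title
        self.index = defaultdict(set)

    def publish(self, peer_ip, titles):
        """A peer uploads its IP address and the titles it shares."""
        for title in titles:
            self.index[title].add(peer_ip)

    def query(self, title):
        """Return the peers that advertise the title (may be empty)."""
        return sorted(self.index.get(title, set()))


catalog = CentralCatalog()
catalog.publish("192.1.2.3", ["xyz.mp3"])
print(catalog.query("xyz.mp3"))  # ['192.1.2.3']
```

Every publish and every query passes through the single `catalog` object, which is why the central directory is both a bottleneck and a single point of failure.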

Page 7

Disadvantages of a Centralized Directory

Performance bottleneck
Single point of failure

Can we do it without a directory?

Page 8

Decentralized P2P - Gnutella

No catalog
Pings the network to locate Gnutella peers
File requests are broadcast to peers (flooding, or breadth-first search)
When a provider is located, the file is transferred via HTTP

Page 9

Gnutella: Join the Network

Who are my neighbors?
A joining peer pings the network to locate peers.

[Figure: peers are Internet edges; a special peer is maintained by Gnutella.]

Page 10

Gnutella: Broadcast Request to Peers

[Figure: a peer broadcasts the request "xyz.mp3?" to its peers.]

Page 11

Gnutella: Flood the Request (Breadth-First Search)

[Figure: the request floods from peer to peer until one peer answers "I have it."]

Page 12

Gnutella: Reply with the File (via HTTP)

[Figure: the peer that answered "I have it." sends xyz.mp3 to the requester.]

Page 13

Gnutella - Disadvantages

Network flooding generates unnecessary network traffic
Using a TTL to limit the flood means some files might not be found

Alternatives: use ultranodes (or supernodes), or use depth-first search, e.g., Freenet
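
The trade-off between flooding traffic and TTL can be seen in a small simulation. The sketch below is a simplified model rather than the Gnutella wire protocol: the overlay, the peer names, and the function `flood_search` are hypothetical, and a peer forwards the query to all of its neighbors, including the one it heard the query from.

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl):
    """Breadth-first flood from `start`; return (providers, messages sent)."""
    visited = {start}
    frontier = deque([(start, ttl)])
    providers, messages = [], 0
    while frontier:
        peer, hops_left = frontier.popleft()
        if wanted in files.get(peer, ()):
            providers.append(peer)
        if hops_left == 0:
            continue  # TTL expired: do not forward any further
        for nxt in neighbors[peer]:
            messages += 1  # every forwarded copy costs one message
            if nxt not in visited:  # duplicate copies are received but dropped
                visited.add(nxt)
                frontier.append((nxt, hops_left - 1))
    return providers, messages


# Hypothetical overlay: a chain A-B-C-D plus an extra link A-C.
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
files = {"D": {"xyz.mp3"}}

print(flood_search(neighbors, files, "A", "xyz.mp3", ttl=1))  # ([], 2)
print(flood_search(neighbors, files, "A", "xyz.mp3", ttl=2))  # (['D'], 7)
```

With a TTL of 1 the query never reaches peer D, so the file is not found even though it exists; raising the TTL to 2 finds it, but at more than three times the message cost.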

Page 14

Morpheus, Kazaa: Flooding Only the Supernodes

[Figure: peers A through I are grouped into three clusters. Each cluster has a supernode that acts as the center index for its cluster, and the supernodes together form the supernode layer. Query: "Who has file X?" Reply: "Peer H has file X." The requester then downloads file X from Peer H.]

Page 15

Using Ultranodes

Queries flood only the network of ultranodes
Other peer nodes are shielded from query traffic
Combines the benefits of centralized and decentralized search
Takes advantage of the heterogeneity in peer capabilities
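
A rough sketch of the ultranode idea follows: each supernode keeps a small central index for its own cluster, and a query is flooded over the supernodes only. The `Supernode` class, the cluster membership, and the file names are hypothetical and chosen only to mirror the "Peer H has file X" example.

```python
from collections import deque


class Supernode:
    """Hypothetical supernode holding the index for its own cluster."""

    def __init__(self, name):
        self.name = name
        self.index = {}      # file name -> peer in this cluster that has it
        self.neighbors = []  # other supernodes in the supernode layer

    def register(self, peer, file_names):
        """An ordinary peer reports its files to its supernode."""
        for file_name in file_names:
            self.index[file_name] = peer


def supernode_search(entry, wanted):
    """Flood the query over supernodes only; ordinary peers see no traffic."""
    seen = {entry.name}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        if wanted in node.index:
            return node.index[wanted]
        for other in node.neighbors:
            if other.name not in seen:
                seen.add(other.name)
                queue.append(other)
    return None


s1, s2, s3 = Supernode("S1"), Supernode("S2"), Supernode("S3")
s1.neighbors, s2.neighbors, s3.neighbors = [s2, s3], [s1, s3], [s1, s2]
s3.register("H", ["X"])  # assumed: peer H belongs to the third cluster

print(supernode_search(s1, "X"))  # H
```

Inside a cluster the lookup behaves like a centralized directory, while the search across clusters behaves like flooding, which is the combination of benefits the slide refers to.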

Page 16

Freenet - Depth-First Search

[Figure: peers A, B, C, D, and E. The query "Who has file X?" is forwarded one peer at a time, guided by routing hints ("Peer C might have file X", "Peer D might have file X", "Peer E might have file X"). Peer E replies "I have file X", the reply "Peer E has file X" is relayed back toward the requester, and the requester downloads file X from Peer E.]

Page 17

Freenet - File Not Found

[Figure: peers A, B, C, D, E, and F with the routing hints "Peer C might have file X", "Peer D might have file X", and "Peer E might have file X". The query ends in NOT FOUND! while another peer states "I have file X".]

The requested file is not found because of a poor routing decision made at peer D.
In this case, the query backs out of the dead end and tries another peer in a depth-first manner.
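
Backing out of a dead end is ordinary depth-first search with backtracking. The sketch below is a simplification of Freenet's routing: the topology and the order of each neighbor list (which stands in for the "might have file X" hints) are hypothetical, and peer D is assumed to try the dead-end peer F before peer E.

```python
def depth_first_search(neighbors, files, peer, wanted, visited=None):
    """Return the path from `peer` to a holder of `wanted`, or None."""
    if visited is None:
        visited = set()
    visited.add(peer)
    if wanted in files.get(peer, ()):
        return [peer]
    # Neighbors are listed in preference order (best guess first).
    for nxt in neighbors.get(peer, []):
        if nxt not in visited:
            path = depth_first_search(neighbors, files, nxt, wanted, visited)
            if path is not None:
                return [peer] + path
    return None  # dead end: the caller backs out and tries its next neighbor


# Hypothetical overlay; D prefers F (a dead end) over E.
neighbors = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["F", "E"], "F": [], "E": []}
files = {"E": {"X"}}

visited = set()
print(depth_first_search(neighbors, files, "A", "X", visited))
# ['A', 'B', 'C', 'D', 'E']
print("F" in visited)  # True: F was tried first and abandoned
```

Only one copy of the query is in flight at any time, so the traffic is far lower than with flooding, at the price of longer search latency when the hints are poor.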

Page 18

Using a Distributed Directory

Data objects are everywhere
Distribute subsets of the data directory among the peers
If we can find the relevant sub-directory, we can locate the data object

[Figure: a directory divided into sub-directories that point to data objects.]

Page 19

How to Bound the Search Space? Basic Idea - Hashing

Objects have hash keys: object "y" has hash key H(y)
Peer nodes also have hash keys in the same hash space: peer "x" has hash key H(x)
A peer joins the P2P network with Join(H(x)); an object is published with Publish(H(y))
Place location information about an object at the peer with the closest hash key (i.e., a distributed directory)
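
As an illustration of hashing peers and objects into one space, the sketch below uses MD5 as H and plain numeric distance as the notion of "closest". Both choices are assumptions made for the example; the slides do not name a hash function or a distance metric, and the peer names are hypothetical.

```python
import hashlib

SPACE = 2 ** 128  # MD5 yields 128-bit values, so keys fall in [0, 2^128 - 1]


def H(name):
    """Map a peer or object name into the shared hash space."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def closest_peer(peer_keys, object_key):
    """Return the peer whose hash key is numerically closest to the object's."""
    return min(peer_keys, key=lambda peer: abs(peer_keys[peer] - object_key))


# Join: each peer derives its own key. Publish: the object's key picks a peer.
peer_keys = {name: H(name) for name in ["peer-1", "peer-2", "peer-3", "peer-4"]}
home = closest_peer(peer_keys, H("xyz.mp3"))
print(home, "keeps the location information for xyz.mp3")
```

Because every peer computes the same H, any peer can work out which key to look for without consulting a central catalog.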

Page 20

Viewed as a Distributed Hash Table

[Figure: peer nodes mapped onto a hash table spanning 0 to 2^128 - 1.]

• Each peer node is responsible for a range of the hash table, according to the peer's hash key
• Location information about objects is placed at the peer with the closest key (information redundancy)

Page 21

How to Find an Object?

Look for the peer with the corresponding peer hash key
A peer knows its logical neighbors
Find peer X based on multihop routing
X knows who has the object

[Figure: a hash table spanning 0 to 2^128 - 1; peer node X holds the entry "Peer Y has the file".]
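
One simple way to picture multihop routing is greedy forwarding: each peer hands the query to whichever logical neighbor has the key closest to the target. The sketch below assumes small integer keys and a chain-shaped overlay so that the hops are easy to follow; the peer names and the function `route` are hypothetical, and on an arbitrary overlay greedy forwarding is not guaranteed to reach the globally closest peer.

```python
def route(neighbors, keys, start, target_key):
    """Greedy multihop routing; return the list of peers visited."""
    path = [start]
    current = start
    while True:
        # Listing `current` first means a tie keeps the query where it is.
        candidates = [current] + neighbors[current]
        best = min(candidates, key=lambda p: abs(keys[p] - target_key))
        if best == current:
            return path  # no neighbor is closer: `current` plays the role of X
        current = best
        path.append(current)


# Hypothetical peers with small integer hash keys, linked in key order.
keys = {"P1": 10, "P2": 40, "P3": 70, "P4": 95}
neighbors = {"P1": ["P2"], "P2": ["P1", "P3"], "P3": ["P2", "P4"], "P4": ["P3"]}

print(route(neighbors, keys, "P1", 90))  # ['P1', 'P2', 'P3', 'P4']
print(route(neighbors, keys, "P3", 42))  # ['P3', 'P2']
```

The last peer on the path is the one that would hold the directory entry, for example "Peer Y has the file".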

Page 22

Distributed Hash Table (DHT) in action

[Figure: a network of peer nodes, each holding a table of (K, V) pairs.]

Page 23

DHT in action

[Figure: the same network of peer nodes, each holding a table of (K, V) pairs.]

Page 24

DHT in action: put()

A peer wants to share a file and calls insert(K1, V1).
Operation: route the message "I have the file" to the node holding key K1.

[Figure: the network of peer nodes, each holding a table of (K, V) pairs.]

Page 25

DHT in action: put()

Operation: take the key as input; route messages to the node holding the key.

[Figure: the pair (K1, V1) shown in the network of peer nodes.]

Page 26

DHT in action: get()

A peer calls retrieve(K1).
Operation: retrieve the message V1 at the node holding key K1.

[Figure: the network of peer nodes, each holding a table of (K, V) pairs.]

Page 27

DHT in action

Retrieve the file according to V1.

[Figure: the network of peer nodes, each holding a table of (K, V) pairs.]
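
The put() and get() steps can be condensed into a toy, single-process model. In this sketch the class `ToyDHT`, the node names, and the integer keys are hypothetical, the keys are assumed to be already hashed, and message routing between nodes is replaced by a direct lookup of the responsible node.

```python
class ToyDHT:
    """Hypothetical in-process DHT: every node holds one slice of the table."""

    def __init__(self, node_keys):
        self.node_keys = dict(node_keys)          # node name -> node hash key
        self.tables = {n: {} for n in node_keys}  # node name -> {K: V}

    def home(self, key):
        """Node whose hash key is closest to `key` (the node holding the key)."""
        return min(self.node_keys, key=lambda n: abs(self.node_keys[n] - key))

    def insert(self, key, value):
        """put(): store (K, V) at the node holding the key."""
        self.tables[self.home(key)][key] = value

    def retrieve(self, key):
        """get(): read V from the node holding the key, or None if absent."""
        return self.tables[self.home(key)].get(key)


dht = ToyDHT({"n1": 10, "n2": 40, "n3": 70})
K1, V1 = 44, "192.1.2.3"  # V1 says where the shared file lives

dht.insert(K1, V1)        # "I have the file" goes to the node holding K1
print(dht.home(K1))       # n2
print(dht.retrieve(K1))   # 192.1.2.3
```

The value returned by `retrieve` is only location information; the requester would then contact that address to download the file itself.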

Page 28

Still Flooding

The network is still flooded, although intermediate nodes do not need to search.

Can we avoid flooding?

Page 29

CAN - Content Addressable Network

Each peer is responsible for one zone, i.e., it stores all (key, value) pairs of the zone
Each peer knows the neighbors of its zone
Peers are assigned to zones at random at startup; a zone is split if it is not empty
Dimension-ordered multihop routing

Page 30

CAN: Object Publishing

node I::publish(K,V)

[Figure: node I in the CAN coordinate space.]

Page 31

CAN: Object Publishing

node I::publish(K,V)
(1) a = h_x(K)

[Figure: node I in the coordinate space, with the line x = a marked.]

Page 32

CAN: Object Publishing

node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)

[Figure: node I in the coordinate space, with the lines x = a and y = b marked.]

Page 33

CAN: Object Publishing

node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J

[Figure: nodes I and J in the coordinate space.]

Page 34

CAN: Object Publishing

node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J
(3) J stores (K,V)

[Figure: nodes I and J in the coordinate space; (K,V) is stored at J.]

Page 35

CAN: Object Retrieval

node I::retrieve(K)
(1) a = h_x(K), b = h_y(K)
(2) route "retrieve(K)" to J, which is in charge of (a,b)

[Figure: nodes I and J in the coordinate space; J holds (K,V).]
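
The publish and retrieve steps can be sketched on a two-dimensional grid. This is a simplified model under stated assumptions: the unit square is pre-split into equal zones instead of being split as peers join, zones are identified by (column, row), the hash functions `h_x` and `h_y` are derived from SHA-1 with a salt, and wrap-around of the coordinate space is not modeled. All names are hypothetical.

```python
import hashlib

GRID = 4  # assumed: the unit square is pre-split into GRID x GRID equal zones


def h(key, salt):
    """Hash a key to a coordinate in [0, 1); the salt selects h_x or h_y."""
    digest = hashlib.sha1((salt + key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") / 2 ** 32


def zone_of(a, b):
    """Zone (column, row) in charge of the point (a, b)."""
    return int(a * GRID), int(b * GRID)


def route(src, dst):
    """Dimension-ordered routing: fix x first, then y, one neighbor per hop."""
    (x, y), path = src, [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


store = {}  # zone -> {K: V}, i.e., the pairs held by the peer owning that zone


def publish(node_i, key, value):
    a, b = h(key, "x"), h(key, "y")              # (1) a = h_x(K), b = h_y(K)
    path = route(node_i, zone_of(a, b))          # (2) route (K,V) -> J
    store.setdefault(path[-1], {})[key] = value  # (3) J stores (K,V)
    return path


def retrieve(node_i, key):
    a, b = h(key, "x"), h(key, "y")              # (1) same hashes, same point
    node_j = route(node_i, zone_of(a, b))[-1]    # (2) route to J
    return store.get(node_j, {}).get(key)


publish((0, 0), "xyz.mp3", "192.1.2.3")
print(retrieve((3, 3), "xyz.mp3"))  # 192.1.2.3
```

Each hop moves to an adjacent zone, so no peer needs to know more than its own neighbors, and no query is flooded.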

Page 36

Maintenance

Inform your neighbors that you are alive at every discrete time interval t
If a neighbor does not send an alive message within time t, take over its zone
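
The maintenance rule amounts to a timeout check followed by a zone merge. The sketch below assumes each peer records the time of the last alive message per neighbor; the function names, the timestamps, and the zone layout are hypothetical, and contention between several neighbors that detect the same failure is not modeled.

```python
def detect_failed_neighbors(last_alive, now, t):
    """Return the neighbors whose last alive message is older than t."""
    return sorted(n for n, seen in last_alive.items() if now - seen > t)


def take_over(zones, me, failed):
    """Merge the zones of failed neighbors into our own list of zones."""
    for neighbor in failed:
        zones[me].extend(zones.pop(neighbor))


# Peer I last heard from J at time 9.0 and from K at time 4.0.
last_alive = {"J": 9.0, "K": 4.0}
zones = {"I": [(0, 0)], "J": [(1, 0)], "K": [(0, 1)]}

failed = detect_failed_neighbors(last_alive, now=10.0, t=3.0)
take_over(zones, "I", failed)

print(failed)      # ['K']
print(zones["I"])  # [(0, 0), (0, 1)]
```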

Page 37

P2P Benefits

Efficient use of resources
  Use unused bandwidth, storage, and processing power at the edge of the network
Scalability
  Consumers of resources also donate resources
Reliability
  Replicas, geographic distribution
  No single point of failure
Ease of administration
  Self-organized nodes
  Built-in reliability and load balancing

Page 38

Some Prototypes at UCF

iSEE (Internet-scale Sensor Exploration Environment)
  Publishing real-time sensor data
  Browsing and querying real-time sensor data
P2P Video Streaming for VoD and Live Broadcast Applications