Network Computing Laboratory Scalable File Sharing System Using Distributed Hash Table Idea Proposal April 14, 2005 Presentation by Jaesun Han

Jan 13, 2016


Alison Douglas
Transcript
Page 1

Network Computing Laboratory

Scalable File Sharing System Using Distributed Hash Table

Idea Proposal

April 14, 2005

Presentation by

Jaesun Han

Page 2

Network Computing Laboratory | 2

Korea Advanced Institute of Science and Technology

Contents

One Line Comment
Motivation & Problems
My Idea
  Key Idea
  Distributed Hash Table
  P2P file sharing system using DHT
Technical challenges
Conclusion

Page 3

One-line comment

Achieve a fully decentralized P2P file sharing system by distributing the file indexing structure as a distributed hash table (DHT).

Page 4

Scalability in file sharing is a key practical issue!
Even worse:
  Requests for infamous files
  Network attacks such as DDoS

[Figure: "Internet Explosion". Korean users and US users reach a "Hot!!" file sharing infrastructure over the Internet.]

Page 5

Solution Approach

Scalable solution for file sharing
  Investigate currently existing file sharing solutions
  Currently, P2P-based file sharing seems the most appropriate
  Investigate methods to provide scalability to P2P-based approaches

Fully decentralized architecture for P2P based file sharing

Page 6

Key Idea

Decentralized indexing
  Existing schemes are either centralized or self-indexing
    E.g., …
  Self-indexing is not an indexing scheme; such systems have no indexing scheme at all
  The absence of an indexing scheme is worked around by a flooding-based search mechanism
  Result: high search overhead

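[Figure: Search(k.mp3) issued against a central index table versus a distributed index table spread over nodes 1 to 4]

Central index table
node    file
Node1   k.mp3
Node2   s.mp3
Node3   n.mp3
Node4   b.mp3

Distributed index table: split by file-name range (a-e, f-m, n-r, s-z) across nodes 1 to 4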
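
A minimal Python sketch of the distributed index table pictured on this slide, assuming the index is split by the first letter of the file name into the four ranges a-e, f-m, n-r, s-z. The function names, the one-dictionary-per-partition layout, and the restriction to names starting with a letter a-z are illustrative assumptions, not part of the original slides.

```python
import bisect

# Inclusive upper bound of each first-letter range from the slide: a-e, f-m, n-r, s-z.
RANGE_UPPER = ["e", "m", "r", "z"]

# One small index table per partition. Which node hosts which partition is
# not readable from the slide, so partitions are simply numbered 0..3 here.
partitions = [dict() for _ in RANGE_UPPER]

def partition_for(file_name):
    """Return the partition whose letter range covers the file name.

    Assumes the name starts with a letter in a-z (hypothetical restriction).
    """
    first = file_name[0].lower()
    return bisect.bisect_left(RANGE_UPPER, first)

def publish(file_name, node):
    """Record that `node` holds `file_name` in the responsible partition."""
    partitions[partition_for(file_name)].setdefault(file_name, []).append(node)

def search(file_name):
    """Ask only the responsible partition; no other partition is contacted."""
    return partitions[partition_for(file_name)].get(file_name, [])

# The four entries of the central index table, now spread over partitions.
for name, node in [("k.mp3", "Node1"), ("s.mp3", "Node2"),
                   ("n.mp3", "Node3"), ("b.mp3", "Node4")]:
    publish(name, node)

print(partition_for("k.mp3"), search("k.mp3"))  # 1 ['Node1']
```

Splitting by letter range keeps the example readable, but real file names are not evenly spread over the alphabet, which is one reason the later slides hash names instead.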

Page 7

Key Idea

Distributed indexing
  Split the index table and distribute each part to a node

Hash table for distributed indexing
  Enables fast lookup
  Input to hash table: file name
  Output from hash table: node address

Distributed hash table for distributed indexing
  Split the hash table and distribute it across the nodes
  Lookup through shortcut paths

P2P file sharing with DHT

Page 8

DHT based File sharing: Technical Challenges

[Figure: Search(k.mp3) over the distributed index table (a-e, f-m, n-r, s-z) held by nodes 1 to 4]

CHALLENGE: Routing?!
CHALLENGE: Nodes often join and leave!

Page 9

Related Works

Peer-to-peer file sharing systems
  Sharing files among personal computers
  E.g., Soribada, eDonkey, KaZaa, Gnutella
  33.4% of Internet traffic in a KT investigation (2004.2)
  Millions of simultaneous users

Key technical issue: file indexing in existing P2P file sharing systems
  Evolution of the indexing scheme to improve scalability
    1st generation: centralized indexing
    2nd generation: fully decentralized self-indexing
    3rd generation: semi-centralized indexing

Page 10

Related Works

First generation file sharing systems
  Centralized indexing (e.g., Soribada, Napster)
  Problems: not scalable, single point of failure

[Figure: centralized directory server (napster.com) with peers N1 to N5; the server's index table has columns file and node, with an entry a.mp3 -> N5]

Message flow: Search(a.mp3) -> N5 IP addr. -> Request(a.mp3) -> File(a.mp3)

Page 11

Related Works

Second generation file sharing systems
  Fully decentralized self-indexing (e.g., Gnutella)
  Problems: flooding overhead, partial searching

[Figure: unstructured overlay of peers N1 to N9; Search(a.mp3) is flooded from peer to peer]

Search result: N3, N5, N8
Selected node: N5
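
The two problems named on this slide, flooding overhead and partial searching, can be illustrated with a small Python sketch of TTL-limited flooding. The overlay links in `GRAPH` are hypothetical (the slide's topology is not recoverable from the transcript); only the holders N3, N5 and N8 are taken from the slide's search result. The TTL mechanism itself is an assumption about how the flood is bounded.

```python
from collections import deque

# Hypothetical overlay links between the nine peers (assumption).
GRAPH = {
    "N1": ["N2", "N3"], "N2": ["N1", "N4", "N5"], "N3": ["N1", "N6"],
    "N4": ["N2", "N7"], "N5": ["N2", "N8"], "N6": ["N3", "N9"],
    "N7": ["N4"], "N8": ["N5"], "N9": ["N6"],
}
# Peers holding a.mp3, taken from the search result on the slide.
FILES = {"N3": {"a.mp3"}, "N5": {"a.mp3"}, "N8": {"a.mp3"}}

def flood_search(graph, files, start, file_name, ttl):
    """Breadth-first flooding: every peer forwards the query to all of its
    neighbours until the hop budget (ttl) is used up."""
    hits, messages = [], 0
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        if file_name in files.get(node, ()):
            hits.append(node)
        if hops_left == 0:
            continue
        for neighbour in graph[node]:
            messages += 1  # each forwarded copy costs one message
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return hits, messages

print(flood_search(GRAPH, FILES, "N1", "a.mp3", ttl=1))  # (['N3'], 2)
print(flood_search(GRAPH, FILES, "N1", "a.mp3", ttl=3))  # (['N3', 'N5', 'N8'], 13)
```

With a small TTL the search is cheap but misses N5 and N8 (partial searching); with a TTL large enough to find every holder, the message count grows with the number of links rather than with the number of results (flooding overhead).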

Page 12

Related Works

Third generation file sharing systems
  Semi-centralized indexing (e.g., eDonkey, KaZaa)
  Problems: partial searching, vulnerable to DoS attacks

[Figure: peers grouped around two supernodes; messages shown: Search(a.mp3), File(a.mp3)]

Page 13

Distributed Hash Table, Basic (1)

Distributed Hash Table
  File name -> H(x) -> File ID
  Node address -> H(x) -> Node ID
  Mapping File ID to Node ID

[Figure: a hash table with columns "hash key" and "node"; file names (a.mp3, k.txt, x.mpg, g.doc) and node addresses (k, b, w, n) are both passed through H(x) into the same ID space, 0 to 127]

Node ID   File ID range
30        0-30
71        31-71
89        72-89
127       90-127
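
A short Python sketch of the mapping on this slide: file names and node addresses are hashed into one ID space, and a File ID is assigned to the first Node ID that is not smaller than it. The slide only writes H(x), so the use of SHA-1 reduced modulo 128 is an assumption, as are the function names. The node IDs 30, 71, 89 and 127 are the ones shown on the slide.

```python
import bisect
import hashlib

ID_SPACE = 128  # IDs 0..127, matching the ranges on the slide

def H(text):
    """Map a file name or node address to an ID.

    SHA-1 is an assumption; the slide does not name a hash function.
    """
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ID_SPACE

NODE_IDS = [30, 71, 89, 127]  # Node IDs from the slide, kept sorted

def responsible_node(file_id):
    """Return the first Node ID >= file_id (0-30 -> 30, 31-71 -> 71, ...)."""
    return NODE_IDS[bisect.bisect_left(NODE_IDS, file_id)]

for name in ["a.mp3", "k.txt", "x.mpg", "g.doc"]:
    fid = H(name)
    print(name, "-> File ID", fid, "-> Node ID", responsible_node(fid))
```

Because the highest Node ID here equals the top of the ID space, no wrap-around is needed; on a ring with arbitrary node IDs the lookup would wrap to the smallest Node ID.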

Page 14

Distributed Hash Table, Basic (2)

Keys and nodes are uniformly distributed and exist in the same ID space
Each node is responsible for the keys between its predecessor node and itself

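[Figure: 3-bit ID ring with positions 000 to 111. H(g.txt), H(a.mp3), H(x.doc), H(s.mpg) and H(k.mp3) place each file's index entry on the ring, and each node stores the entries for its own key range. Index entries shown: g.txt(2,8), a.mp3(1), x.doc(4), s.mpg(1,4), k.mp3(2); empty slots are marked "-".]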

Page 15

Distributed Hash Table, Routing (1)

Naïve approach
  Each node knows its successor node
  A lookup request is forwarded to the successor until Node ID < File ID ≤ Successor Node ID
  Worst-case performance: O(N)

[Figure: 3-bit ID ring (000 to 111) with successor pointers successor=010, successor=100, successor=111, successor=001; index entries shown: s.mpg(1,8), k.mp3(2)]

Lookup(H(k.mp3)) -> Lookup(101)
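
The naïve approach can be sketched in Python as a walk along successor pointers. The successor values are the ones printed on the slide; assigning them to nodes 001, 010, 100 and 111 in ring order is an assumption, as are the function names. The interval test treats the successor as responsible when the key equals its ID.

```python
SIZE = 8  # 3-bit ID space

# Successor pointers from the slide; the node each pointer belongs to is
# inferred from ring order (assumption).
SUCCESSOR = {0b001: 0b010, 0b010: 0b100, 0b100: 0b111, 0b111: 0b001}

def in_interval(key, start, end, size=SIZE):
    """True if key lies in the ring interval (start, end], with wrap-around."""
    if start == end:  # a single node owns the whole ring
        return True
    return 0 < (key - start) % size <= (end - start) % size

def naive_lookup(start_node, key):
    """Forward the request node by node until the key falls between the
    current node and its successor; return (responsible node, forwards)."""
    node, hops = start_node, 0
    while not in_interval(key, node, SUCCESSOR[node]):
        node = SUCCESSOR[node]
        hops += 1
    return SUCCESSOR[node], hops

owner, hops = naive_lookup(0b001, 0b101)
print(format(owner, "03b"), hops)  # 111 2
```

Each forward moves exactly one node along the ring, so a lookup may visit every node before it terminates, which is the O(N) worst case stated on the slide.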

Page 16

Distributed Hash Table, Routing (2)

Tree-based routing table
  Shortcuts to nodes whose node IDs differ from one's own at each bit position
  2^m ID space -> m entries
  Lookup performance: O(log N)

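[Figure: binary tree over the 3-bit ID space (000 to 111) with nodes a, b, c, d, and two shortcut tables used to route Lookup(101)]

Shortcut table
prefix   next node
011      c
00x      a
1xx      c

Shortcut table
prefix   next node
101      d
11x      d
0xx      a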

Page 17

Distributed Hash Table, Routing (3)

Complete example of routing table & routing algorithm

Lookup from node 65a1fc with key d46a1c

[Figure: routing tables of the nodes along the lookup path]
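
A hedged Python sketch of prefix-based routing for the lookup named on this slide. Only the start node 65a1fc and the key d46a1c come from the slide; the intermediate node IDs are hypothetical, and building every routing table from a global node list is a simplification made for the example. Each table entry is keyed by (number of shared leading digits, next digit).

```python
def shared_prefix_len(a, b):
    """Number of leading hex digits that two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def build_table(node, all_nodes):
    """Entry (r, d) -> a node that shares r leading digits with `node`
    and has digit d at position r."""
    table = {}
    for other in sorted(all_nodes):
        if other == node:
            continue
        r = shared_prefix_len(node, other)
        table.setdefault((r, other[r]), other)  # keep the first candidate
    return table

def route(start, key, tables):
    """Forward to a node matching at least one more digit of the key."""
    path, node = [start], start
    while node != key:
        r = shared_prefix_len(node, key)
        nxt = tables[node].get((r, key[r]))
        if nxt is None:  # no known node with a longer matching prefix
            break
        node = nxt
        path.append(node)
    return path

# 65a1fc and d46a1c are from the slide; the other IDs are hypothetical.
NODES = ["65a1fc", "d13da3", "d4213f", "d462ba", "d467c4", "d46a1c"]
TABLES = {n: build_table(n, NODES) for n in NODES}

print(route("65a1fc", "d46a1c", TABLES))
# ['65a1fc', 'd13da3', 'd4213f', 'd462ba', 'd46a1c']
```

Every hop fixes at least one more digit of the key, so the path length is bounded by the number of digits in an ID. Real systems also need a rule for keys that match no node ID exactly, which this sketch leaves out.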

Page 18

Distributed Hash Table, Join

Join process
  Look up one's own node ID as the lookup key
  Gather routing table entries from the nodes visited while routing

[Figure: routing table creation for the joining node d46a1c]

Lookup from node d46a1c with key d46a1c

Routing table rows:
0- 1- 2- 3- 4- 5- 6- 7- 8- 9- a- b- c- e- f-
d0- d1- d2- d3- d5- ….. dc- dd- de- df-
d40- d41- d42- d43- d44- d45- ….. d4f-
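
One way to read the join process is sketched below in Python: the new node routes a lookup for its own ID and copies one routing table row from each node the lookup visits. The existing node IDs, the bootstrap node 65a1fc, and the helper names are hypothetical; deriving each visited node's row from a global membership set is a simplification, and the slides do not specify these details.

```python
def shared_prefix_len(a, b):
    """Number of leading hex digits that two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def row(node, r, known):
    """Row r of `node`'s table: next digit -> a node sharing exactly r digits."""
    entries = {}
    for other in sorted(known):
        if other != node and shared_prefix_len(node, other) == r:
            entries.setdefault(other[r], other)
    return entries

def join(new_id, bootstrap, known):
    """Route towards new_id (which must not be in `known`) and collect
    row r from the hop that shares r leading digits with new_id."""
    table, node = {}, bootstrap
    while True:
        r = shared_prefix_len(node, new_id)
        entries = row(node, r, known)
        nxt = entries.pop(new_id[r], None)  # that entry is the next hop
        entries[node[r]] = node             # the hop itself fits row r
        table[r] = entries
        if nxt is None:
            return table
        node = nxt

# Hypothetical existing nodes; d46a1c is the joining node from the slide.
KNOWN = {"65a1fc", "d13da3", "d4213f", "d462ba", "d467c4"}
for r, entries in sorted(join("d46a1c", "65a1fc", KNOWN).items()):
    print(r, entries)
```

Row r never contains the new node's own digit at position r, which matches the rows on the slide (no "d-" in the first row, no "d4-" in the second).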

Page 19

P2P File Sharing with DHT

Storing the file index in the DHT
  Example: node a shares a new file g.txt; node b looks up g.txt

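[Figure: 3-bit ID space (000 to 111) with nodes a, b, c, d and their shortcut tables]

Shortcut tables (prefix -> next node)
  011 -> c, 00x -> a, 1xx -> c
  101 -> d, 11x -> d, 0xx -> a
  000 -> a, 01x -> c, 1xx -> c
  110 -> d, 10x -> c, 0xx -> a

Steps
  1. Hash g.txt: File ID = 101
  2. Insert file info with ID = 101
  3. Hash g.txt: File ID = 101
  4. Lookup with ID = 101 (answer: addr(a))
  5. Download file g.txt

File index table
ID    Filename   Node addr
101   g.txt      a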
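
The five steps of this example can be sketched in Python as a tiny in-memory DHT. The class and method names, the SHA-1-based `file_id`, and the node IDs 001, 010, 100, 111 are assumptions made for illustration; with this hash, g.txt will generally not land on ID 101 as it does on the slide. Routing is collapsed into a direct `owner` computation to keep the sketch short.

```python
import bisect
import hashlib

class TinyDHT:
    """All nodes simulated in one process; each node keeps its own index table."""

    def __init__(self, node_ids, bits=3):
        self.size = 2 ** bits
        self.ring = sorted(node_ids)
        self.index = {n: {} for n in self.ring}  # node ID -> file index table

    def file_id(self, name):
        """Steps 1 and 3: hash the file name (SHA-1 is an assumption)."""
        digest = hashlib.sha1(name.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.size

    def owner(self, key):
        """First node ID >= key, wrapping around to the smallest node ID."""
        i = bisect.bisect_left(self.ring, key)
        return self.ring[i % len(self.ring)]

    def publish(self, name, holder_addr):
        """Step 2: insert (file name, holder address) at the responsible node."""
        key = self.file_id(name)
        self.index[self.owner(key)].setdefault(key, []).append((name, holder_addr))

    def lookup(self, name):
        """Step 4: ask the responsible node which addresses hold the file."""
        key = self.file_id(name)
        entries = self.index[self.owner(key)].get(key, [])
        return [addr for fname, addr in entries if fname == name]

dht = TinyDHT([0b001, 0b010, 0b100, 0b111])
dht.publish("g.txt", "addr(a)")   # node a shares g.txt
print(dht.lookup("g.txt"))        # node b learns where to download it (step 5)
```

Only the index entry travels through the DHT; the file itself stays on node a and is downloaded directly from the returned address.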

Page 20

File Sharing with DHT: Technical Challenges

Frequent node join & leave
  -> Index replication & fast routing table adaptation

Only exact-match search is possible when hashing the file name
  -> Keyword search scheme

Hotspot problem at the node that indexes a popular file
  -> Load balancing mechanism

Page 21

Conclusion

New approach for P2P file sharing systems
  Uses a new distributed data structure, the Distributed Hash Table (DHT)
  Fully decentralized indexing
  Guaranteed lookup performance of O(log N)
  Full search is possible
  Robust to node failure and to network attacks such as DoS