Squirrel: A peer-to-peer web cache

Sitaram Iyer
Joint work with Ant Rowstron (MSRC) and Peter Druschel


Peer-to-peer Computing

Decentralize a distributed protocol:

– Scalable

– Self-organizing

– Fault tolerant

– Load balanced

Not automatic!!

Web Caching

Web caches reduce: 1. latency, 2. external bandwidth, 3. server load.

Deployed at ISPs, corporate network boundaries, etc.

Cooperative web caching: a group of web caches tied together, acting as one web cache.

Centralized Web Cache

[Figure: browsers on the LAN share one centralized web cache ("Sharing!"), which fetches objects from web servers across the Internet.]

Decentralized Web Cache

[Figure: each browser on the LAN runs its own cache; the per-node caches cooperate, fetching objects from web servers across the Internet.]

• Why?
• How?

Why peer-to-peer ?

1. Cost of a dedicated web cache → no additional hardware

2. Administrative costs → self-organizing

3. Scaling needs upgrading → resources grow with clients

4. Single point of failure → fault-tolerant by design

Setting

• Corporate LAN

• 100 - 100,000 desktop machines

• Single physical location

• Each node runs an instance of Squirrel

• The browser on each node uses the local Squirrel instance as its proxy

Pastry

Peer-to-peer object location and routing substrate.

Distributed hash table: reliably maps an object key to a live node.

Routes in log_{2^b}(N) steps

(e.g. 3–4 steps for 100,000 nodes, with 2^b = 16)
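The routing bound above can be checked with a quick calculation (a hypothetical helper for illustration, not part of Squirrel or Pastry):

```python
import math

# Quick check of the Pastry routing bound: the expected number of routing
# steps grows as log base 2^b of N, here with 2^b = 16 digits per level.
# (Hypothetical helper, not from the Squirrel or Pastry code.)
def expected_hops(num_nodes: int, digit_base: int = 16) -> float:
    """Expected routing steps to reach any key's home node."""
    return math.log(num_nodes, digit_base)

for n in (100, 10_000, 100_000):
    print(f"{n:>7} nodes -> about {expected_hops(n):.1f} hops")
```

For 100,000 nodes this gives a little over 4, matching the slide's "3–4 steps" figure.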

Home-store model

[Figure: the client hashes the URL to get an object key; Pastry routes the request across the LAN to that key's home node, which caches the object and fetches it from the Internet on a miss.]

…that's how it works!
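The key-to-node mapping can be sketched as follows (a simplified stand-in using absolute numeric distance instead of Pastry's prefix routing; all names are invented for illustration):

```python
import hashlib

# Simplified sketch of the home-store mapping: hash the URL to a 160-bit
# key, then pick the live node whose ID is numerically closest to it.
# (Pastry does this with prefix-based routing; absolute distance is a
# stand-in used here only to illustrate the idea.)
def url_key(url: str) -> int:
    """Map a URL to a 160-bit object key via SHA-1."""
    return int(hashlib.sha1(url.encode()).hexdigest(), 16)

def home_node(url: str, node_ids: list) -> int:
    """Return the node ID numerically closest to the URL's key."""
    key = url_key(url)
    return min(node_ids, key=lambda nid: abs(nid - key))

# Invented node IDs, derived from hashes so they look random in key space.
nodes = [url_key(f"node-{i}") for i in range(8)]
print(home_node("http://example.com/logo.png", nodes))
```

Because every node hashes the same URL to the same key, all clients agree on the home node without any coordination.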

Directory model

Client nodes always store objects in their local caches.

The main difference between the two schemes is whether the home node also stores the object.

In the directory model, it stores only pointers to recent clients, and forwards requests to them.
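The directory idea can be sketched as a small bounded table of recent clients (class and method names are invented for illustration; the real protocol has more machinery for revalidation and failures):

```python
import random
from collections import deque
from typing import Optional

# Hypothetical sketch of the directory model: the home node stores no
# object data, only pointers to the last few clients that fetched the
# object, and forwards each new request to a randomly chosen one of them
# (the "delegate"). Names invented for illustration.
class Directory:
    def __init__(self, max_entries: int = 4):
        # Bounded list of recent clients; oldest entries fall off.
        self.delegates = deque(maxlen=max_entries)

    def record_client(self, client: str) -> None:
        """Remember that this client now holds a copy of the object."""
        self.delegates.append(client)

    def pick_delegate(self) -> Optional[str]:
        """Node to forward the request to; None means fetch from the origin."""
        return random.choice(list(self.delegates)) if self.delegates else None

home = Directory()
assert home.pick_delegate() is None          # empty directory: go to origin
for client in ("nodeA", "nodeB", "nodeC"):
    home.record_client(client)
print(home.pick_delegate())                  # one of nodeA/nodeB/nodeC
```

The random choice spreads requests across recent clients, but as the "Quirk" slide below shows, many home nodes can still converge on one heavily browsing delegate.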

Directory model

[Figure: the client's request is routed to the home node, which keeps a directory of recent clients; the home forwards the request to a random entry in the directory (the delegate), which serves the object from its local cache.]

(skip) Full directory protocol

[Figure: message flow between client, home node, delegate, and origin server. Labels from the original diagram: 1a: no dir, go to origin (also d); 2a, d: req forwarded to a delegate; b: not-modified; c, e: req / object; e: cGET req to the origin server; the reply to the client is the object or not-modified, served via the delegate.]

Recap

• Two endpoints of the design space, based on the choice of storage location.

• At first sight, both seem to do about equally well (e.g. hit ratio, latency).

Quirk

Consider a– Web page with many images, or– Heavily browsing node

In the Directory scheme,Many home nodes pointing to one

delegate

Home-store: natural load balancing.. evaluation on trace-based workloads ..

Trace characteristics

                              Redmond         Cambridge
Total duration                1 day           31 days
Number of clients             36,782          105
Number of HTTP requests       16.41 million   0.971 million
Peak request rate             606 req/sec     186 req/sec
Number of objects             5.13 million    0.469 million
Number of cacheable objects   2.56 million    0.226 million
Mean cacheable object reuse   5.4 times       3.22 times

Total external bandwidth

[Graph: total external bandwidth in GB (lower is better) vs. per-node cache size in MB (0.001–100), Redmond trace. Curves: Directory, Home-store, no web cache, centralized cache; y-axis spans 85–105 GB.]

Total external bandwidth

[Graph: total external bandwidth in GB (lower is better) vs. per-node cache size in MB (0.001–100), Cambridge trace. Curves: Directory, Home-store, no web cache, centralized cache; y-axis spans 5.5–6.1 GB.]

LAN Hops

[Graph: fraction of cacheable requests (0–100%) vs. total hops within the LAN (0–6), Redmond trace. Curves: Centralized, Home-store, Directory.]

LAN Hops

[Graph: fraction of cacheable requests (0–100%) vs. total hops within the LAN (0–5), Cambridge trace. Curves: Centralized, Home-store, Directory.]

Load in requests per sec

[Graph: number of such seconds (log scale, 1–100,000) vs. max objects served per node per second (0–50), Redmond trace. Curves: Home-store, Directory.]

Load in requests per sec

[Graph: number of such seconds (log scale, 1–10^7) vs. max objects served per node per second (0–50), Cambridge trace. Curves: Home-store, Directory.]

Load in requests per min

[Graph: number of such minutes (log scale, 1–100) vs. max objects served per node per minute (0–350), Redmond trace. Curves: Home-store, Directory.]

Load in requests per min

[Graph: number of such minutes (log scale, 1–10,000) vs. max objects served per node per minute (0–120), Cambridge trace. Curves: Home-store, Directory.]

Conclusion

It is possible to decentralize web caching.

Performance is comparable to a centralized web cache,

and it is better in terms of cost, administration, scalability and fault tolerance.

(backup) Storage utilization

Redmond          Home-store   Directory
Total            97641 MB     61652 MB
Mean per-node    2.6 MB       1.6 MB
Max per-node     1664 MB      1664 MB

(backup) Fault tolerance

                  Home-store   Directory
Equations
  Mean            H / O        (H + S) / O
  Max             Hmax / O     max(Hmax, Smax) / O
Redmond
  Mean            0.0027%      0.198%
  Max             0.0048%      1.5%
Cambridge
  Mean            0.95%        1.68%
  Max             3.34%        12.4%
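The equations above can be evaluated directly; here is a hypothetical worked example (the values of H, S, and O are invented for illustration — only the formulas come from the slide):

```python
# Hypothetical worked example of the fault-tolerance equations above.
# H / H_max: mean / max per-node home-node state; S / S_max: mean / max
# per-node directory state; O: total number of objects. Values invented.
H, H_max = 100, 500
S, S_max = 300, 2_000
O = 1_000_000

home_store = {"mean": H / O, "max": H_max / O}
directory = {"mean": (H + S) / O, "max": max(H_max, S_max) / O}

print(f"home-store: mean {home_store['mean']:.3%}, max {home_store['max']:.3%}")
print(f"directory:  mean {directory['mean']:.3%}, max {directory['max']:.3%}")
```

The directory scheme loses more per failure because a failed node takes both its directory entries and its delegate state with it, which matches the measured gap between the two columns above.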

(backup) Full home-store protocol

[Figure: message flow between client, home node, and origin server. 1: the client's req is routed through other nodes in the LAN to the home node; 2a: the home returns the object or not-modified; 2b: on a miss, the home forwards the req to the origin server over the WAN; 3b: the origin returns the object or not-modified.]

