Abstract: Conventional web caching systems based on the client-server model often suffer from limited cache space and a single point of failure. In this paper, we present a novel peer-to-peer client web caching system, in which end-hosts collectively share their web cache contents. Aggregating these individual web caches forms a huge virtual cache space and greatly lightens the burden on web servers. We design an efficient algorithm for managing and searching the aggregated cache. We also implement consistency control to prevent the sharing of stale web objects in peers’ caches. Finally and most importantly, considering that end-hosts are generally not as trustworthy as servers or proxies, we employ an opinion-based sampling technique to minimize the chance of distributing forged copies from malicious nodes. We have built a prototype of the proposed system, and our experimental results demonstrate that it has fast response time with low overhead, and can effectively identify and block malicious peers.

I. INTRODUCTION

Over the past decade, the web has grown at tremendous speed and its content has become enormously rich. To reduce network traffic and user latency, web caching systems have been widely deployed [14,15]. However, existing caching systems often suffer from limited cache space and the risk of a single point of failure. There have been many proposals on cooperative caching among proxies [2]-[6], yet they may still suffer from problems similar to those of the traditional client-server model.

In this paper, we present a novel peer-to-peer (P2P) client web caching system, in which end-hosts in the network collectively share their web cache contents. Aggregating these individual web caches forms a huge virtual cache space and greatly lightens the burden on the web server. Yet there are several critical issues to solve: First, how shall we manage the client caches for ease of search?
Second, how shall we control consistency when nodes are dynamic? Third, how shall we maintain a reasonable trust level in the system, especially considering that the clients are generally not as trustworthy as servers or proxies?

To this end, we design an efficient algorithm for managing and searching the aggregated cache. We also implement consistency control to prevent the sharing of stale web objects in peers’ web caches. Finally and most importantly, we propose an opinion-based sampling technique to minimize the chance of distributing forged copies from malicious nodes. We have built a prototype of the proposed system, and our experimental results demonstrate that it has fast response time with low overhead, and can effectively identify and block malicious nodes.

The remainder of this paper is organized as follows. In Section II, we present our trustable P2P web caching system in detail. The performance evaluation of the system is presented in Section III. Finally, Section IV concludes the paper.

1 J. Liu’s work is partially supported by a Canadian NSERC Discovery Grant and an SFU President’s Research Grant.
2 X. Chu’s work is partially supported by the Research Grants Council, Hong Kong, China, under Grant RGC/HKBU2159/04E, and a Faculty Research Grant of Hong Kong Baptist University (FRG/03-04/II-22).

II. THE P2P CLIENT WEB CACHING SYSTEM

Fig. 1 depicts a generic P2P web caching system. With a P2P network, the storage spaces of several machines are virtually combined to form a huge web cache space that serves all the peers (clients). We now detail the operations of the system, including discovering neighboring nodes, searching for desired web objects, and maintaining the trust level.

Fig. 1. Overview of the P2P web caching system

A. Neighbor Discovery

Discovering other online nodes is an important issue in a decentralized P2P network. A carelessly designed discovery protocol, such as one that pings a range of IP addresses and ports, would cause heavy network traffic overhead.
Motivated by the JXTA Peer Discovery Protocol (PDP) [11], we have implemented two ways of discovering peers. One is active: a peer may request new peer information from its existing neighbors. The other is passive: a peer advertises itself to other peers periodically.

There is no single dedicated bootstrap server in our system. Every peer keeps a list of cached addresses for startup. We assume that every peer in the network knows at least one other

On Peer-to-Peer Client Web Cache Sharing
Jiangchuan Liu 1, Xiaowen Chu 2, Ke Xu 3
1 School of Computing Science, Simon Fraser University, BC, Canada
2 Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
3 Department of Computer Science and Technology, Tsinghua University, Beijing, China
0-7803-8938-7/05/$20.00 (C) 2005 IEEE
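
The active and passive discovery modes of Section II-A can be illustrated with a small in-memory sketch. It is not the prototype's code: the Peer class, its method names, and the dictionary standing in for the network are all hypothetical, and real peers would exchange these messages over sockets and run the advertisement on a timer.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical in-memory peer (assumption: no real networking is modeled)."""
    address: str
    # Cached neighbor addresses; this list doubles as the startup list.
    neighbors: set = field(default_factory=set)

    def learn(self, addresses):
        """Cache newly learned addresses, never listing the peer itself."""
        self.neighbors |= set(addresses) - {self.address}

    def handle_peer_request(self, requester):
        """Answer an active request: return known peers and remember the requester."""
        known = set(self.neighbors) - {requester}
        self.learn({requester})
        return known

    def discover_actively(self, network):
        """Active mode: ask each existing neighbor for the peers it knows."""
        for addr in sorted(self.neighbors):
            remote = network.get(addr)
            if remote is not None:  # skip neighbors that are offline
                self.learn(remote.handle_peer_request(self.address))

    def advertise(self, network):
        """Passive mode: announce this peer to its neighbors (would run periodically)."""
        for addr in sorted(self.neighbors):
            remote = network.get(addr)
            if remote is not None:
                remote.learn({self.address})


def demo():
    # Peer a starts with only b in its cached list; b already knows c.
    a, b, c = Peer("a"), Peer("b", {"c"}), Peer("c")
    network = {p.address: p for p in (a, b, c)}
    a.learn({"b"})
    a.discover_actively(network)  # a learns c from b, and b learns a
    a.advertise(network)          # c learns a from the advertisement
    return a.neighbors, b.neighbors, c.neighbors
```

In this sketch a single cached address is enough for peer a to reach the rest of the overlay, which matches the stated design of having no dedicated bootstrap server.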
IV. CONCLUSION

In this paper, we have presented a trustable peer-to-peer web
caching system, in which peers in the network share their web
cache contents. To increase the trust-level of the system, we
propose to use a sampling technique to minimize the chance of
distributing fake web file copies among the peers. We further
introduce the concept of opinion to represent the trustworthiness
of each individual peer. We have built a prototype of the proposed
system, and our experimental results demonstrate that it has fast
response time with low overhead, and can effectively identify
and block malicious peers.
Fig. 9. Disbelief level against time for 16 nodes (x-axis: time in minutes; y-axis: disbelief; curves shown for Node0, Node1, Node5, Node7, Node8, Node9, and Node15)
ACKNOWLEDGEMENT
The authors would like to thank C. Ng and P. Choi for
their efforts in building the prototype.
REFERENCES
[1] A. Luotonen, Web Proxy Servers, Prentice Hall, Englewood Cliffs, New Jersey, 1998.
[2] D. Wessels and K. Claffy, “Internet Cache Protocol. (v2)”, RFC 2186, September 1997.
[3] D. Wessels and K. Claffy, “Application of Internet Cache Protocol. (v2)”, RFC 2187, September 1997.
[4] K. W. Ross, “Hash Routing for Collections of Shared Web Caches”, IEEE Network, vol. 11, pp. 37-44, November/December 1997.
[5] T. Asaka, H. Miwa, and Y. Tanaka, “Distributed Web Caching using Hash-based Query Caching Method”, in Proceedings of IEEE International Conference on Control Applications, vol. 2, pp. 1620-1625, 1999.
[6] I. Clarke et al., “Protecting Free Expression Online with Freenet”, IEEE Internet Computing, vol. 5, no. 1, pp. 40-49, 2002.
[7] S. Iyer, A. Rowstron, and P. Druschel, “Squirrel: A Decentralized Peer-to-peer Web Cache”, in Proceedings of ACM Symposium on Principles of Distributed Computing, 2002.
[8] National Institute of Standards and Technology, “Secure Hash Standard”, April 17, 1995.
[9] D. Eastlake 3rd and P. Jones, “US Secure Hash Algorithm 1 (SHA1)”, RFC 3174, September 2001.
[10] D. A. Menascé, “Scalable P2P Search”, IEEE Internet Computing, vol. 7, pp. 83-87, March/April 2003.
[11] R. Flenner, Java P2P Unleashed, Indianapolis: Sams, pp. 122-135, 2003.
[12] X. Li, M. R. Lyu, and J. Liu, “A Trust Model Based Routing Protocol for Secure Ad Hoc Network”, in Proceedings of IEEE Aerospace Conference, Big Sky, MT, March 6-13, 2004.
[13] J. Gwertzman and M. Seltzer, “World-Wide Web Cache Consistency”, in Proceedings of the USENIX Annual Technical Conference, pp. 141-152, San Diego, CA, January 1996.
[14] J. Liu and J. Xu, “Proxy Caching for Media Streaming over the Internet”, IEEE Communications, Feature Topic on Proxy Support for Streaming on the Internet, August 2004.
[15] J. Xu, J. Liu, B. Li, and X. Jia, “Caching and Prefetching for Web Content Distribution”, IEEE Computing in Science and Engineering, Special Issue on Web Engineering, August 2004.