YEE VANG WEB CACHE
Dec 25, 2015
INTRODUCTION
• The Internet has many users
  • Issues with access latency (lag)
  • Servers crashing
• How to solve these problems?
  • One solution: web cache
WEB CACHE
• What is web cache?
• Cache – “a place of storage”
• Web cache – “a place to store websites or web objects”
WEB CACHING
• Web caching
• A technique that can:
  • Reduce access latency: “the time it takes for a request to be completed”
  • Reduce network congestion, which “occurs when a link or node is carrying so much data that its quality of service deteriorates”
WEB CACHING
• How does it reduce user access latency and network congestion?
• No-cache example
  • Movie storage room in the next building
    • Contains one copy of every movie
  • One worker
WEB CACHING
• Cache example
• The same as the previous example
  • Movie storage room in the next building
    • Contains one copy of every movie
  • One worker
• Plus a movie rack that can hold five movies at a time, to simulate a movie cache
WEB CACHING
• In the example
• Customer -> User
• Movie -> Web Pages
• Worker -> ISP
• Movie Storage Room -> Origin Server
• Movie Rack -> Web Cache
CACHE HIT/CACHE HIT RATE
• Cache hit
  • Occurs when a request can be satisfied by the web cache.
  • In the movie store example: hit?
• Cache hit rate
  • The percentage of requests that are satisfied by the web cache (cache hits)
CACHE MISS
• Cache miss
  • Occurs when a request cannot be satisfied by the web cache.
  • In the movie store example: miss?
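A minimal sketch of how hits, misses, and the hit rate could be counted in the movie store example. The movie titles and the `hit_rate` helper are hypothetical, and the rack contents are assumed to stay fixed (no replacement) to keep the arithmetic easy to follow.

```python
def hit_rate(rack, requests):
    """Return (hits, misses, hit rate in percent) for a rack with fixed contents."""
    hits = sum(1 for movie in requests if movie in rack)
    misses = len(requests) - hits
    rate = 100.0 * hits / len(requests) if requests else 0.0
    return hits, misses, rate


# Hypothetical rack holding five movies (the "web cache").
rack = {"Alien", "Big", "Casablanca", "Dune", "Fargo"}
# Hypothetical customer requests; "Jaws" must be fetched from the storage room.
requests = ["Alien", "Jaws", "Dune", "Alien"]

print(hit_rate(rack, requests))  # (3, 1, 75.0)
```

Three of the four requests are served from the rack, so the hit rate is 75%.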
WEB CACHING
• Pros
• Can reduce Internet bandwidth usage
  • If a request can be satisfied by the web cache
• Reduces the workload of the origin server
  • By storing previously requested web objects in a web cache
• Reduces user access latency
  • When a cache hit occurs
WEB CACHING
• Cons
• Not every web object is cacheable
  • Websites that generate dynamic data
  • Content that requires an active connection
  • Secure (https://) pages
• Stale cache
  • Cached objects that are out of date
• Bottleneck at the proxy server (in proxy caching)
TYPES OF WEB CACHE
• Browser Cache
• Proxy Cache
• Reverse Proxy Cache
BROWSER CACHE
• Cache stored at the client level
  • Meaning the cache is actually stored on the user’s computer
• e.g., Temporary Internet Files,
  Mozilla/Netscape: C:\Users\Profile\AppData\Roaming\Mozilla\Profiles\[random.string].slt\
  Firefox: C:\Users\Profile\AppData\Roaming\Mozilla\Firefox\Profiles\[random.string]\
  Thunderbird: C:\Users\Profile\AppData\Roaming\Thunderbird\Profiles\[random.string]\
[http://www.holgermetzger.de/pdl.html]
BROWSER CACHE
• Advantages of Browser Cache
• Stored locally
  • On a cache hit it saves bandwidth
  • Decrease in access latency
• User pattern
  • The same user has a higher probability of browsing the same website each day.
BROWSER CACHE
• Disadvantages of Browser Cache
• Takes up hard drive space
• Stale objects
  • Caching always carries the risk of serving a stale object.
• Stored locally
  • Serves only one computer.
PROXY CACHE
• The cache is stored on a proxy server
• The proxy server usually serves more than one user
• Acts as a gateway to the Internet for a large company or institution
http://www.codeproject.com/KB/web-cache/ExploringCaching/cache_array.jpg
PROXY CACHE
• Requests are directed to the proxy server instead of the origin server.
• On a cache hit
  • The proxy returns the requested object to the user.
• On a cache miss
  • The request is forwarded to the origin server.
PROXY CACHE
• Advantages
  • Serves more than one client
    • A cache hit can occur even when different users make the same request.
  • Gateway
    • Companies can limit what users can access.
• Disadvantages
  • Serves more than one client
    • Can be overloaded.
  • Gateway
    • When the proxy server is down, all users are disconnected from the Internet.
REVERSE PROXY CACHE
• Serves the origin server
• Essentially a proxy server that sits in front of the origin server.
http://odino.org/images/proxy-cache.jpg
REVERSE PROXY CACHE
• When a request is made
  • It is directed to the reverse proxy cache server
  • On a cache hit
    • The object is returned to the user
  • On a cache miss
    • The request is forwarded to the origin server
    • A copy is stored on the reverse proxy server
    • A copy is sent back to the user
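The hit and miss paths above can be sketched as a small read-through cache. This is only an illustration: `ReverseProxyCache`, `origin`, and the URL are hypothetical names, and the origin server is simulated by a plain function.

```python
class ReverseProxyCache:
    """Toy reverse proxy cache sitting in front of an origin server."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # stand-in for the origin server
        self.store = {}                             # url -> cached object
        self.origin_requests = 0                    # how often the origin was contacted

    def get(self, url):
        if url in self.store:               # cache hit: return the object directly
            return self.store[url]
        self.origin_requests += 1           # cache miss: forward to the origin
        obj = self.fetch_from_origin(url)
        self.store[url] = obj               # keep one copy on the proxy
        return obj                          # send another copy back to the user


def origin(url):
    # Hypothetical origin server that just echoes the URL.
    return f"<content of {url}>"


proxy = ReverseProxyCache(origin)
proxy.get("/logo.png")   # miss: fetched from the origin and cached
proxy.get("/logo.png")   # hit: served from the proxy
print(proxy.origin_requests)  # 1
```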
REVERSE PROXY CACHE
• Advantages
• Reduces the workload of the origin server
  • An object can be requested once, cached on the reverse proxy server, and served to many clients without contacting the origin server again
• Static files can be cached
  • e.g., CSS files, JavaScript files, logos
  • Allows the origin server to focus on processing dynamic content
REVERSE PROXY CACHE
• Disadvantages
• Bottleneck
  • Many users making requests at the same time
• Stale cache/old files
  • Risk of cache hits on stale objects; static files can also be outdated
WEB CACHING ARCHITECTURE
• Two main web caching architectures
  • Hierarchical
  • Distributed
• Both use the network shown below
[3]
HIERARCHICAL CACHING ARCHITECTURE
• There is more than one level of cache between the users and the origin server
• Typically employs more than one type of cache
• There are parent, child, and sibling relationships between caches.
HIERARCHICAL CACHING ARCHITECTURE
• First level of cache – Institutional Network
• Second level of cache – Regional Network
• Third level of cache – National Network
• Parents? Children? Siblings?
[3]
HIERARCHICAL CACHING ARCHITECTURE
• When a request is made
  • It is sent to the level one cache
  • If the level one cache cannot satisfy the request
    • It is forwarded to the level two cache
  • If the level two cache cannot satisfy the request
    • It is forwarded to the next level
  • If the request reaches the last level and still cannot be satisfied, it is forwarded to the origin server
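The level-by-level lookup can be sketched as follows. Each cache level is modeled as a plain dictionary, and `hierarchical_get` and `origin` are hypothetical names. Leaving a copy at every lower level on the way back is an assumption of this sketch; the slides do not state it.

```python
def hierarchical_get(url, levels, fetch_from_origin):
    """Look up url in caches ordered from level one upward, then the origin."""
    for depth, cache in enumerate(levels):
        if url in cache:
            obj, source = cache[url], f"level {depth + 1}"
            break
    else:
        # No level could satisfy the request: go to the origin server.
        depth = len(levels)
        obj, source = fetch_from_origin(url), "origin"
    # Assumption: every level below the one that answered keeps a copy.
    for lower in levels[:depth]:
        lower[url] = obj
    return obj, source


def origin(url):
    # Hypothetical origin server.
    return f"<origin copy of {url}>"


institutional, regional, national = {}, {}, {"/news": "cached news"}
levels = [institutional, regional, national]

print(hierarchical_get("/news", levels, origin))  # ('cached news', 'level 3')
print(hierarchical_get("/news", levels, origin))  # ('cached news', 'level 1')
```

The second request is answered by the level one cache because the first request left a copy there.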
HIERARCHICAL CACHING ARCHITECTURE
• Advantages
  • Multiple levels of cache offer more chances for a cache hit
    • Leads to decreased access latency
    • Also reduces the workload on the origin servers
• Disadvantages
  • Every level added to the hierarchy adds delay
    • On a cache miss there is a slight increase in latency
  • Higher-level cache servers are expensive
DISTRIBUTED CACHING ARCHITECTURE
• Caches are stored at the institutional level
  • Regional and national levels are eliminated
• The institutional caches in the distributed system are siblings of each other.
[3]
DISTRIBUTED CACHING ARCHITECTURE
• What is special in the distributed caching architecture?
• Each institutional cache can contact its sibling caches
• So each cache can know what is in the other caches
• They can receive objects from their siblings
DISTRIBUTED CACHING ARCHITECTURE
• When a request is made
• Query-Based Approach – Internet Cache Protocol (ICP)
  • The request is sent to the configured institutional cache server
  • On a cache miss, the request is broadcast to the institutional cache’s sibling caches.
  • If a sibling cache contains the requested object, it sends the object to the immediate institutional cache. The immediate institutional cache then stores a copy and sends another copy to the client.
  • If no sibling contains the requested object, a timeout occurs.
    • At that point the immediate institutional cache forwards the request to the origin server.
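A simplified sketch of the query-based flow. It does not implement the real ICP wire protocol: the sibling query is a direct method call, and the timeout is represented by the loop finishing without an answer. `InstitutionalCache`, `has`, and `origin` are hypothetical names, and caching the origin's reply locally is an assumption of the sketch.

```python
class InstitutionalCache:
    """Toy institutional cache that queries its siblings on a miss."""

    def __init__(self, name):
        self.name = name
        self.store = {}      # url -> cached object
        self.siblings = []   # other InstitutionalCache instances

    def has(self, url):
        # Stand-in for answering a sibling's query.
        return url in self.store

    def get(self, url, fetch_from_origin):
        if url in self.store:
            return self.store[url], "local hit"
        for sibling in self.siblings:           # "broadcast" the query
            if sibling.has(url):
                self.store[url] = sibling.store[url]   # keep a copy locally
                return self.store[url], f"sibling {sibling.name}"
        # No sibling answered (a timeout in a real deployment).
        self.store[url] = fetch_from_origin(url)
        return self.store[url], "origin"


def origin(url):
    # Hypothetical origin server.
    return f"<origin copy of {url}>"


a, b = InstitutionalCache("A"), InstitutionalCache("B")
a.siblings, b.siblings = [b], [a]
b.store["/paper.pdf"] = "paper"

print(a.get("/paper.pdf", origin))  # ('paper', 'sibling B')
print(a.get("/paper.pdf", origin))  # ('paper', 'local hit')
```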
DISTRIBUTED CACHING ARCHITECTURE
• When a request is made
• Directory-Based Approach – Cache Digest (Squid)
  • In this approach metadata is used.
  • Each cache is aware of its siblings’ content.
  • When a request is made, it is sent to the immediate institutional cache.
  • On a cache miss, the institutional cache checks its metadata to see whether any of its sibling caches contains the requested object.
  • If one does, the object is requested from that sibling; if not, the request is forwarded to the origin server.
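A rough sketch of the directory-based flow, assuming each cache keeps a local summary of what its siblings hold. Squid's Cache Digests use a compact Bloom-filter summary; a plain Python set stands in for it here. `DigestCache`, `learn_digest`, and `origin` are hypothetical names.

```python
class DigestCache:
    """Toy cache that consults stored sibling summaries instead of querying."""

    def __init__(self, name):
        self.name = name
        self.store = {}     # url -> cached object
        self.digests = {}   # sibling name -> (sibling, set of urls it held)

    def learn_digest(self, sibling):
        # Snapshot of the sibling's contents; it can go out of date.
        self.digests[sibling.name] = (sibling, set(sibling.store))

    def get(self, url, fetch_from_origin):
        if url in self.store:
            return self.store[url], "local hit"
        for sibling, urls in self.digests.values():
            # Check the local metadata first, then confirm with the sibling.
            if url in urls and url in sibling.store:
                self.store[url] = sibling.store[url]
                return self.store[url], f"sibling {sibling.name}"
        self.store[url] = fetch_from_origin(url)
        return self.store[url], "origin"


def origin(url):
    # Hypothetical origin server.
    return f"<origin copy of {url}>"


a, b = DigestCache("A"), DigestCache("B")
b.store["/x"] = "X"
a.learn_digest(b)
print(a.get("/x", origin))  # ('X', 'sibling B')
```

Unlike the query-based approach, no message is sent to a sibling unless the local metadata says it holds the object, so a miss everywhere goes to the origin without waiting for a timeout.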
DISTRIBUTED CACHING ARCHITECTURE
• Advantage
• Sibling cache servers share common interests
  • More chance of a cache hit
• Sibling cache servers are assigned based on proximity
  • Faster response time
DISTRIBUTED CACHING ARCHITECTURE
• Disadvantage
• Sibling cache servers share common interests
  • If the servers are too far apart
    • Increase in access latency
• Sibling cache servers are assigned based on proximity
  • Servers may not share common interests
    • Less chance of a cache hit
WEB CACHE COHERENCY
• Web cache coherency
  • Is the cache up to date?
• Web cache coherency mechanism: validation check
  • When a web object is first received
    • It is time-stamped
  • When the cached object is used, the cache server makes a validation check by sending the time stamp to the origin server
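A minimal sketch of the validation check. The origin is simulated in memory, and the time stamp is assumed to be the object's last-modified time expressed as a plain integer. `Origin`, `ValidatingCache`, and their methods are hypothetical names.

```python
class Origin:
    """Toy origin server that remembers when each object last changed."""

    def __init__(self):
        self.objects = {}   # url -> (body, last_modified)

    def put(self, url, body, now):
        self.objects[url] = (body, now)

    def fetch(self, url):
        return self.objects[url]

    def validate(self, url, cached_stamp):
        # None means "your copy is still current"; otherwise send the new version.
        body, modified = self.objects[url]
        return None if modified <= cached_stamp else (body, modified)


class ValidatingCache:
    def __init__(self, origin):
        self.origin = origin
        self.store = {}     # url -> (body, time stamp)

    def get(self, url):
        if url not in self.store:
            self.store[url] = self.origin.fetch(url)   # first receipt: time-stamped
        else:
            fresh = self.origin.validate(url, self.store[url][1])
            if fresh is not None:                       # cached copy was stale
                self.store[url] = fresh
        return self.store[url][0]


server = Origin()
server.put("/index.html", "version 1", now=1)
cache = ValidatingCache(server)
print(cache.get("/index.html"))   # version 1
server.put("/index.html", "version 2", now=2)
print(cache.get("/index.html"))   # version 2
```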
WEB CACHE COHERENCY
• Web cache coherency mechanism: callback
  • When a web object is cached, the cache server receives a callback promise for the object from the origin server.
  • Callback promise – a promise that the origin server will notify the cache server if the object has been updated
  • So the cached object is up to date as long as the cache server has not received a notification from the origin server
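A small sketch of the callback mechanism, with the notification modeled as a direct method call from the origin to the cache. `CallbackOrigin`, `CallbackCache`, and `invalidate` are hypothetical names. Dropping the cached copy on notification is one possible reaction and an assumption of this sketch.

```python
class CallbackOrigin:
    """Toy origin server that remembers which caches hold a callback promise."""

    def __init__(self):
        self.objects = {}    # url -> body
        self.promises = {}   # url -> set of caches to notify

    def fetch(self, url, cache):
        self.promises.setdefault(url, set()).add(cache)   # issue the promise
        return self.objects[url]

    def update(self, url, body):
        self.objects[url] = body
        for cache in self.promises.pop(url, set()):       # keep the promise
            cache.invalidate(url)


class CallbackCache:
    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def invalidate(self, url):
        self.store.pop(url, None)    # drop the out-of-date copy

    def get(self, url):
        if url not in self.store:    # no copy, or it was invalidated
            self.store[url] = self.origin.fetch(url, self)
        return self.store[url]       # no notification received: still up to date


server = CallbackOrigin()
server.objects["/price"] = "10"
cache = CallbackCache(server)
print(cache.get("/price"))   # 10
server.update("/price", "12")
print(cache.get("/price"))   # 12
```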
WEB CACHE COHERENCY
• Web cache coherency mechanism: expiration
  • When an object is cached, an expiration date is assigned to it
  • The object is valid until its expiration date
  • The first request for the object after its expiration date is sent to the origin server again.
  • At that time a new expiration date is assigned to the object
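A minimal sketch of expiration-based coherency. Time is passed in explicitly as a number so the behavior is easy to trace; the fixed lifetime `ttl`, the class name `ExpiringCache`, and the `origin` function are assumptions of the sketch.

```python
class ExpiringCache:
    """Toy cache in which every object carries an expiration time."""

    def __init__(self, fetch_from_origin, ttl):
        self.fetch_from_origin = fetch_from_origin
        self.ttl = ttl       # lifetime assigned to each cached object
        self.store = {}      # url -> (body, expires_at)

    def get(self, url, now):
        entry = self.store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]                       # still valid: serve from cache
        body = self.fetch_from_origin(url)        # expired or missing: ask the origin
        self.store[url] = (body, now + self.ttl)  # assign a new expiration time
        return body


origin_calls = []


def origin(url):
    # Hypothetical origin server that records each request it receives.
    origin_calls.append(url)
    return f"copy {len(origin_calls)} of {url}"


cache = ExpiringCache(origin, ttl=10)
print(cache.get("/a", now=0))    # copy 1 of /a
print(cache.get("/a", now=5))    # copy 1 of /a
print(cache.get("/a", now=10))   # copy 2 of /a
```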
CACHE PLACEMENT AND REPLACEMENT POLICIES
• How cached objects are replaced
• Random
  • A randomly chosen cached object is replaced.
• Size
  • The largest cached object is replaced first
• FIFO – First In First Out
  • The oldest cached object is eliminated first
CACHE PLACEMENT AND REPLACEMENT POLICIES
• LRU – Least Recently Used
  • The cached object that has not been requested for the longest time is eliminated first
• LRU/MIN – Least Recently Used Minimum
  • The first document whose size is larger than or equal to the size of the new document is removed
• HLRU – History Least Recently Used
  • Records how many times each cached object is used
  • Elimination is based on
    • LRU
    • Least used
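A short sketch of plain LRU replacement, built on `collections.OrderedDict` from the Python standard library. The class name `LRUCache` and the capacity of two are illustrative choices; the LRU/MIN and HLRU variants are not shown.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the object that has gone unrequested for the longest time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # least recently used first, most recent last

    def get(self, key):
        if key not in self.items:
            return None                     # cache miss
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" is now more recently used than "b"
cache.put("c", 3)     # evicts "b"
print(list(cache.items))  # ['a', 'c']
```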
CACHE PLACEMENT AND REPLACEMENT POLICIES
• LFU – Least Frequently Used
  • Cached objects are sorted by how frequently they are used
  • On a cache hit, the counter for the hit object is incremented by one.
  • The list is then re-ordered
  • The web object with the lowest count is replaced first
• LFU – Aging
  • Same as LFU
  • The average count of all cached objects is monitored
  • When the average count reaches a threshold, all counts are reset back to zero
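A sketch of LFU with aging that follows the description above: counts are reset to zero once their average reaches a threshold. Other formulations of aging reduce the counts instead of clearing them, so treat the reset rule as this deck's version. `LFUAgingCache` and its parameters are hypothetical names; instead of keeping a sorted list, the sketch searches for the lowest count at eviction time.

```python
class LFUAgingCache:
    """Evicts the least frequently used object; resets counts when they grow old."""

    def __init__(self, capacity, threshold):
        self.capacity = capacity
        self.threshold = threshold   # average count that triggers aging
        self.items = {}              # key -> value
        self.counts = {}             # key -> hit counter

    def get(self, key):
        if key not in self.items:
            return None
        self.counts[key] += 1        # cache hit: increment the counter
        average = sum(self.counts.values()) / len(self.counts)
        if average >= self.threshold:
            for k in self.counts:    # aging step: reset every counter
                self.counts[k] = 0
        return self.items[key]

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # lowest count
            del self.items[victim]
            del self.counts[victim]
        self.items[key] = value
        self.counts.setdefault(key, 0)


cache = LFUAgingCache(capacity=2, threshold=3)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")
cache.get("a")
cache.put("c", 3)            # "b" has the lowest count and is replaced
print(sorted(cache.items))   # ['a', 'c']
```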
CACHE PLACEMENT AND REPLACEMENT POLICIES
• LRV – Lowest Relative Value
  • Each cached object is assigned a cost value
  • Objects with the lowest cost value are replaced first
• GD – Greedy Dual
  • Each cached object is assigned a cost value
  • The lowest-cost object is replaced first
  • Then all remaining cached objects have their cost lowered by the replaced object’s cost
  • Each time a cached object is accessed, its cost is reset back to its original cost
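A sketch of the Greedy Dual steps listed above, with the cost values supplied by the caller. `GreedyDualCache` and the example costs are hypothetical, and subtracting the victim's cost in a simple loop is chosen for readability rather than efficiency.

```python
class GreedyDualCache:
    """Evicts the object with the lowest current cost, then ages the rest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}    # key -> (value, original cost)
        self.credit = {}   # key -> current cost

    def get(self, key):
        if key not in self.items:
            return None
        value, cost = self.items[key]
        self.credit[key] = cost          # access restores the original cost
        return value

    def put(self, key, value, cost):
        if key not in self.items and len(self.items) >= self.capacity:
            victim = min(self.credit, key=self.credit.get)   # lowest current cost
            paid = self.credit.pop(victim)
            del self.items[victim]
            for k in self.credit:        # lower every remaining cost
                self.credit[k] -= paid
        self.items[key] = (value, cost)
        self.credit[key] = cost


cache = GreedyDualCache(capacity=2)
cache.put("a", "A", cost=5)
cache.put("b", "B", cost=2)
cache.put("c", "C", cost=3)   # evicts "b"; the cost of "a" drops from 5 to 3
print(cache.credit)           # {'a': 3, 'c': 3}
```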
CONCLUSION
• Web caching helps reduce:
  • Network congestion
  • User access latency
  • Workload on the origin server
QUESTIONS?
•Questions?
REFERENCE
• [1] Barish, G., & Obraczka, K. (2000). World Wide Web caching: trends and techniques. Communications Magazine, IEEE, 38(5), 178-184. doi:10.1109/35.841844 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=841844&isnumber=18201
• [2] Bakiras, S., Loukopoulos, T., Papadias, D., & Ahmad, I. (2005). Adaptive schemes for distributed web caching. Journal of Parallel and Distributed Computing. Retrieved from http://www.cs.ust.hk/~dimitris/PAPERS/JPDC05-DWC.pdf
• [3] Biersack, E. W., Rodriguez, P., & Spanner, C. (2001). Analysis of Web caching architectures: hierarchical and distributed caching. Networking, IEEE/ACM Transactions on, 9(4), 404-418. doi:10.1109/90.944339 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=944339&isnumber=20434
• [4] Das, S., Dykes, S. G., & Jeffery, C. L. (1999). Taxonomy and design analysis for distributed Web caching. System Sciences, 1999. HICSS-32. Proceedings of the 32nd Annual Hawaii International Conference on, 8, 10. doi:10.1109/HICSS.1999.773040 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=773040&isnumber=16788
• [5] Davison, B. D. (2001). A Web caching primer. Internet Computing, IEEE, 5(4), 38-45. doi:10.1109/4236.939449 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=939449&isnumber=20329
• [6] Dubois, M., & Jeong, J. (2002, June). In R. Bianchini (Chair). Cost-sensitive cache replacement algorithms. Paper presented at Second workshop on caching, coherence, and consistency, New York, NY, USA. Retrieved from http://www.research.rutgers.edu/~wc3/papers/dubois.pdf.gz
• [7] Geetha, K., Gounden, N. A., & Monikandan, S. (2009). SEMALRU: An Implementation of modified web cache replacement algorithm. Nature & Biologically Inspired Computing, 1406-1410. doi:10.1109/NABIC.2009.5393711 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5393711&isnumber=5393306
• [8] Hassanein, H., Liang, Z., & Liang, P. (2002). Performance comparison of alternative Web caching techniques. Computers and Communications, 2002. Proceedings. ISCC 2002. Seventh International Symposium on, 213-218. doi:10.1109/ISCC.2002.1021681 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1021681&isnumber=21983
• [9] (n.d.). Reverse Proxy Caching. In Cisco ACNS Caching and Streaming Configuration Guide (5th ed.) (pp. 6-1). San Jose, CA: Cisco Systems, Inc. doi:OL-4070-01 Retrieved from http://www.cisco.com/en/US/docs/app_ntwk_services/waas/acns/v51/configuration/local/guide/a51cag.pdf
• [10] Tay, T. T., & Wijesundara, M. N. (2002). Distributed Web caching. Communication Systems, 2002. ICCS 2002. The 8th International Conference on, 2(25-28), 1142-1146 vol.2. doi:10.1109/ICCS.2002.1183311 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1183311&isnumber=26554
• [11] Vakali, A. (2000). LRU-based algorithms for web cache replacement. In K. Bauknecht, S. Kumar Madria & G. Pernul (Eds.), Electronic Commerce and Web Technologies, First International Conference (pp. 409-418). Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.5504&rep=rep1&type=pdf