Cache in Chromium: Disk Cache
Post on 24-Jun-2015
Cache in Chromium: Disk Cache & Overall cache flow
Chang W. Doh
GDG Korea WebTech Organizer, HTML5Rocks/KO Contributor/Coordinator
Before stepping into Cache: Network stack
● A mostly single-threaded, cross-platform library, primarily for resource fetching
  ○ URLRequest
    ■ represents the request for a URL
  ○ URLRequestContext
    ■ contains all the associated context needed to fulfill the ‘URL request’
      ● e.g. cookies, host resolver, cache
Network Stack: What’s ‘Network Stack’?
● Some code layouts
  ○ /net/base
    ■ shared utilities for /net modules
  ○ /net/disk_cache
    ■ Cache for web resources
  ○ /net/url_request
    ■ URLRequest, URLRequestContext, ...
Before stepping into Cache: Network stack
Typical request-flow
[Diagram: request flow between HttpCache::Transaction, HttpCache, and the Cache (a.k.a. Disk Cache). Labels in the diagram: “check”, “not exist”, “notify”, “notify”, “Cache hit!”]
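The labels of the request-flow diagram (check, "not exist", "Cache hit!") suggest a check-then-fetch pattern. The following is a minimal Python sketch of that pattern only, not Chromium's C++ code. `ToyHttpCache`, `start_transaction`, and the `events` list are hypothetical names introduced for illustration, and the in-memory dict merely stands in for the disk cache.

```python
from typing import Callable, Dict, List


class ToyHttpCache:
    """Hypothetical stand-in for the cache layer; not Chromium's HttpCache."""

    def __init__(self, fetch_from_network: Callable[[str], bytes]) -> None:
        self._disk_cache: Dict[str, bytes] = {}  # stands in for the disk cache
        self._fetch = fetch_from_network
        self.events: List[str] = []  # records which path each request took

    def start_transaction(self, url: str) -> bytes:
        # Step 1: check the cache before touching the network.
        if url in self._disk_cache:
            self.events.append("cache hit")
            return self._disk_cache[url]
        # Step 2: the entry does not exist, so fetch and store the result.
        self.events.append("not exist")
        body = self._fetch(url)
        self._disk_cache[url] = body
        return body
```

With a fake fetcher passed in, the second request for the same URL is served from the dict and the fetcher is not called again.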
Disk Cache: (a.k.a. Cache)
● Cache
  ○ Stores resources fetched from the web
  ○ A part of ‘/net’
    ■ location: /net/disk_cache
    ■ This means ‘DiskCache’ controls the cache flow for network fetches.
● NOTE:
  ○ Android uses ‘Simple cache’.
    ■ location: /net/disk_cache/simple
DiskCache: What’s DiskCache?
● Main characteristics:
  ○ The cache should not grow unbounded
    ■ An algorithm decides when to remove old entries
  ○ Losing some data is not critical
    ■ But discarding the whole cache should be minimized
  ○ Access should be possible through sync or async operations
  ○ Design should avoid ‘cache thrashing’
DiskCache: Characteristics
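To make the "should not grow unbounded" point concrete, here is a small Python sketch of a size-bounded cache. The slides do not name the eviction algorithm, so least-recently-used (LRU) eviction is an assumption made for illustration, and `BoundedCache` is a hypothetical name.

```python
from collections import OrderedDict
from typing import Optional


class BoundedCache:
    """Toy cache with a byte budget; LRU eviction is an assumption."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        data = self._entries.get(key)
        if data is not None:
            self._entries.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key: str, data: bytes) -> None:
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = data
        self.used_bytes += len(data)
        # Evict the oldest entries until the cache fits its budget again.
        while self.used_bytes > self.max_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
```

Only individual old entries are dropped here, which matches the stated goal that discarding the whole cache should be rare.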
● Main characteristics:
  ○ It should be possible to remove an entry from the cache
    ■ and keep working with that entry while it is inaccessible to other requests
  ○ Shouldn’t use explicit multithread synchronization
    ■ Always called from the same thread
    ■ However, callbacks must be issued through the message loop to avoid reentrancy
DiskCache: Characteristics
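The two characteristics above can be sketched together in Python. This is an illustration under stated assumptions, not the real backend: `ToyBackend`, `ToyEntry`, `doom_entry`, and `run_pending` are hypothetical names, and a plain deque stands in for the message loop's task queue.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional


class ToyEntry:
    def __init__(self, key: str) -> None:
        self.key = key
        self.data = b""


class ToyBackend:
    """Single-threaded toy backend; a deque stands in for the message loop."""

    def __init__(self) -> None:
        self._entries: Dict[str, ToyEntry] = {}
        self._pending: Deque[Callable[[], None]] = deque()

    def create_entry(self, key: str, callback: Callable[[ToyEntry], None]) -> None:
        entry = ToyEntry(key)
        self._entries[key] = entry
        # Never invoke the callback inline: post it, so the caller cannot be
        # re-entered from inside its own call into the cache.
        self._pending.append(lambda: callback(entry))

    def open_entry(self, key: str) -> Optional[ToyEntry]:
        return self._entries.get(key)

    def doom_entry(self, key: str) -> None:
        # The entry disappears from lookups, but existing holders keep it.
        self._entries.pop(key, None)

    def run_pending(self) -> None:
        while self._pending:
            self._pending.popleft()()
```

A caller that already holds a `ToyEntry` can keep reading and writing it after `doom_entry`, while `open_entry` returns `None` to everyone else.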
● /net/disk_cache/disk_cache.h
● 2 Interfaces
  ○ disk_cache::Backend
    ■ manages entries on the cache
  ○ disk_cache::Entry
    ■ handles operations specific to a given resource
DiskCache: External interfaces
● An entry is identified by its key
○ e.g. http://www.google.com/favicon.ico
● Once an entry is created, the data is stored in separate chunks or data streams:
○ HTTP headers
○ Actual resource data
● Index for the required stream is an argument to methods:
○ Entry::ReadData
○ Entry::WriteData
External interfaces: Backend
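The stream-index idea can be illustrated with a short Python sketch. It mirrors the shape of Entry::ReadData and Entry::WriteData but is not the C++ interface: the method signatures, the `StreamEntry` class, and the constants `HEADERS_STREAM` and `DATA_STREAM` (with their values 0 and 1) are assumptions chosen for illustration.

```python
from typing import List

HEADERS_STREAM = 0  # hypothetical index for the HTTP headers stream
DATA_STREAM = 1     # hypothetical index for the resource data stream


class StreamEntry:
    """Toy entry holding one byte buffer per data stream."""

    def __init__(self, key: str, num_streams: int = 2) -> None:
        self.key = key
        self._streams: List[bytearray] = [bytearray() for _ in range(num_streams)]

    def write_data(self, index: int, offset: int, data: bytes) -> int:
        stream = self._streams[index]
        if len(stream) < offset:
            # Zero-fill the gap when writing past the current end.
            stream.extend(b"\x00" * (offset - len(stream)))
        stream[offset:offset + len(data)] = data
        return len(data)

    def read_data(self, index: int, offset: int, length: int) -> bytes:
        return bytes(self._streams[index][offset:offset + length])
```

The key identifies the entry, and the stream index passed to each call selects headers or body within that entry.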
Very Simple Cache (a.k.a. Simple Cache)
● Proposed as a new backend for the disk cache
  ○ Conforms to the Disk Cache interface
  ○ Very simple
    ■ Uses 1 file per cache entry + an index file
    ■ Deals with I/O bottlenecks
Simple cache: What is “Simple Cache”?
Comparison to Blockfile Backend?
● Comparison to blockfile cache
  ○ More resilient to corruption from a system crash
    ■ Periodically flushes its entire index
    ■ Swaps the index in atomically
    ■ After a system crash, it starts with the stale cache
● NOTE: With the blockfile cache, Chrome drops the whole cache by default
Simple cache: Benefits and goals
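"Swaps index in atomically" is commonly done by writing a temporary file and renaming it over the old one. The Python sketch below shows that general technique. It is an assumption about how such a swap can be implemented, not a description of Chromium's code, and the JSON encoding and the function name `flush_index_atomically` are purely illustrative.

```python
import json
import os
import tempfile


def flush_index_atomically(index: dict, index_path: str) -> None:
    """Write the whole index to a temp file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(index_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="index.tmp.")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(index, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        # os.replace swaps the file in one step: readers see either the old
        # index or the new one, never a half-written file.
        os.replace(tmp_path, index_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the system crashes before the rename, the previous index file is still intact, which is why the cache can restart from a stale but consistent state.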
● Comparison to blockfile cache (cont’d)
  ○ Doesn’t delay launching network requests
    ■ Elimination of delay factors
      ● No context switching
      ● Doesn’t block on disk I/O before using the network
    ■ Blockfile adds an average delay of 14~25 ms to requests
      ● On Android, slower flash controllers make these delays significantly longer.
Simple cache: Benefits and goals
● Comparison to blockfile cache (cont’d)
  ○ Lower resident set pressure & fewer I/O ops.
    ■ Blockfile disk format has
      ● 256~512 B of records per entry
      ● + rankings & index information (~100 B) per entry
      ● Heavily used entries are not all stored contiguously
    ■ Simple cache
      ● stores only SMALLNUM bytes per entry in memory
      ● doesn’t access the disk when not required
Simple cache: Benefits and goals
● Comparison to blockfile cache (cont’d)
  ○ Simpler
    ■ Shorter and easier than blockfile’s implementation, because it explicitly avoids implementing a filesystem
Simple cache: Benefits and goals
● Not a log-structured cache system
  ○ I/O performed by Simple cache is mostly sequential, but it is NOT log structured
    ■ If it were, that would mean a “filesystem that is itself log structured.”
● Not a filesystem
  ○ Disk cache delegates to the filesystem.
    ■ means “Simple cache uses the abstract interface of Disk Cache instead of implementing its own filesystem”.
Simple cache: Non-goals; Simple cache is
● Entry hash (EH)
  ○ A 40-bit SHA-2 hash of the URL
  ○ Two entries with the same EH can’t both be stored
● Stored in a single directory
  ○ ONE index file
  ○ Each entry stored in a single file
    ■ named HexEntryHash_StreamNumber
Simple cache: Structure on Disk
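The naming scheme above can be sketched in a few lines of Python. The slide only says "40 bit SHA-2 of url", so the choice of SHA-256, the use of the first five digest bytes, the big-endian conversion, and the zero-padded lowercase hex formatting are all assumptions made for illustration. `entry_hash` and `entry_file_name` are hypothetical helper names.

```python
import hashlib


def entry_hash(url: str) -> int:
    """40-bit hash of the URL (assumes SHA-256 and its first 5 bytes)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:5], "big")  # 5 bytes = 40 bits


def entry_file_name(url: str, stream_number: int) -> str:
    """Build a name shaped like HexEntryHash_StreamNumber."""
    return f"{entry_hash(url):010x}_{stream_number}"
```

Because the file name is derived from the hash alone, two URLs that collide on the hash would map to the same file, which is consistent with the rule that two entries with the same entry hash can't both be stored.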
● A file ‘00index’
  ○ contains data for initializing the in-memory index
● Index (in memory)
  ○ used for faster cache performance
  ○ consists of entry hashes for records & simple eviction information
Simple cache: Structure on Disk
● Formats of entry file
  ○ Simple file header
    ■ magic_number, version, key_length, key_hash
  ○ Simple file EOF
    ■ final_magic_number, flags, data_crc32, stream_size
  ○ Simple file sparse range header
    ■ sparse_magic_number, offset, length, data_crc32
Simple cache: Structure on Disk
Term: Sparse file
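To show how records with these fields might frame a stream, here is a Python sketch that packs a header and an EOF record around a key and its data. Only the field names come from the slides. The field widths, the little-endian byte order, the magic values, the flag bit, and the header + key + data + EOF layout are assumptions for illustration, and the sparse range header is left out.

```python
import struct
import zlib
from typing import Tuple

HEADER_FORMAT = "<QIII"  # magic_number, version, key_length, key_hash
EOF_FORMAT = "<QIII"     # final_magic_number, flags, data_crc32, stream_size
MAGIC = 0x1122334455667788        # hypothetical value
FINAL_MAGIC = 0x8877665544332211  # hypothetical value
FLAG_HAS_CRC32 = 1                # hypothetical flag bit


def pack_stream(key: bytes, key_hash: int, data: bytes) -> bytes:
    header = struct.pack(HEADER_FORMAT, MAGIC, 1, len(key), key_hash)
    eof = struct.pack(EOF_FORMAT, FINAL_MAGIC, FLAG_HAS_CRC32,
                      zlib.crc32(data), len(data))
    return header + key + data + eof


def unpack_stream(blob: bytes) -> Tuple[bytes, bytes]:
    header_size = struct.calcsize(HEADER_FORMAT)
    eof_start = len(blob) - struct.calcsize(EOF_FORMAT)
    magic, _version, key_length, _key_hash = struct.unpack_from(HEADER_FORMAT, blob, 0)
    final_magic, flags, crc, stream_size = struct.unpack_from(EOF_FORMAT, blob, eof_start)
    key = blob[header_size:header_size + key_length]
    data = blob[header_size + key_length:eof_start]
    if magic != MAGIC or final_magic != FINAL_MAGIC or len(data) != stream_size:
        raise ValueError("malformed entry file")
    if flags & FLAG_HAS_CRC32 and zlib.crc32(data) != crc:
        raise ValueError("crc mismatch")
    return key, data
```

The magic numbers and the CRC let a reader detect a truncated or corrupted file and discard just that entry.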
● I/O thread operations
  ○ The public API is called on the I/O thread
  ○ The index is updated on the I/O thread
● Worker pool operations
  ○ All I/O operations are performed asynchronously on the worker pool.
  ○ The cache will keep a pool of new entries ready to move into their final place.
Simple cache: Implementation
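The split between the I/O thread and the worker pool can be approximated in Python with a thread pool. In this sketch the calling thread plays the role of the I/O thread. `ToySimpleCache` and its methods are hypothetical names, and updating the index before the write completes is an assumption about ordering, not something the slides state.

```python
import os
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict


class ToySimpleCache:
    """Index lives on the calling thread; file I/O runs on a worker pool."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        self.index: Dict[str, int] = {}  # touched only by the calling thread
        self._pool = ThreadPoolExecutor(max_workers=2)

    def write_entry(self, name: str, data: bytes) -> Future:
        # Index update happens here, on the thread that called the API.
        self.index[name] = len(data)
        # The blocking file write is handed to the worker pool.
        return self._pool.submit(self._write_file, name, data)

    def _write_file(self, name: str, data: bytes) -> int:
        with open(os.path.join(self.directory, name), "wb") as f:
            f.write(data)
        return len(data)

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

The caller gets a future back immediately and never blocks on the disk, which is the property the slides attribute to the worker-pool design.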
● Index flushing & consistency checking
  ○ The index is flushed
    ■ on shutdown
    ■ periodically
● Operation without index
  ○ can operate without the I/O thread index by directly opening files in the directory.
  ○ to avoid startup delays & I/O costs
Simple cache: Implementation
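Because each entry file is named after its hash, the directory itself carries enough information to find entries without the index. The Python sketch below illustrates that idea by rebuilding a simple in-memory map from file names. The regular expression for the HexEntryHash_StreamNumber shape, the use of file sizes as the stored value, and the name `rebuild_index` are assumptions for illustration.

```python
import os
import re
from typing import Dict

# Assumed shape of entry file names: lowercase hex hash, underscore, stream number.
ENTRY_FILE_RE = re.compile(r"^([0-9a-f]+)_(\d+)$")


def rebuild_index(directory: str) -> Dict[str, Dict[int, int]]:
    """Map entry hash -> {stream number: file size} by scanning the directory."""
    index: Dict[str, Dict[int, int]] = {}
    for name in os.listdir(directory):
        match = ENTRY_FILE_RE.match(name)
        if match is None:
            continue  # skips the index file and anything unrelated
        size = os.path.getsize(os.path.join(directory, name))
        index.setdefault(match.group(1), {})[int(match.group(2))] = size
    return index
```

A single lookup does not even need the scan: opening the file whose name matches the key's hash is enough to tell whether the entry exists.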