Paging Algorithms Vivek Pai / Kai Li Princeton University
Dec 20, 2015
Virtual Memory Gedankenexperiment
• Assume memory costs $20 per 256MB
• What does it cost to fill a 32-bit system?
• What does it cost to fill a 64-bit system?
– What about at $1 per 256MB?
• What implications does this have for the design of virtual-to-physical translation when using 64-bit address spaces?
– (hint: think hierarchical page tables)
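The arithmetic behind the experiment can be checked with a short sketch. It assumes "filling" a system means buying one byte of memory for every address, so 2^32 or 2^64 bytes; the names `UNIT_BYTES` and `fill_cost` are made up for this illustration.

```python
# Cost of fully populating an address space, priced per 256MB unit.
UNIT_BYTES = 256 * 2**20  # 256MB = 2**28 bytes

def fill_cost(address_bits, dollars_per_unit):
    """Dollars needed to buy 2**address_bits bytes of memory."""
    units = 2**address_bits // UNIT_BYTES
    return units * dollars_per_unit

cost_32_at_20 = fill_cost(32, 20)  # 16 units, so $320
cost_64_at_20 = fill_cost(64, 20)  # 2**36 units, about $1.37 trillion
cost_64_at_1 = fill_cost(64, 1)    # 2**36 units, about $68.7 billion
print(cost_32_at_20, cost_64_at_20, cost_64_at_1)
```

Even at $1 per 256MB, a fully populated 64-bit space is out of reach, which suggests the translation structures should only cover the parts of the address space actually in use.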
Memory Hierarchy Revisited
[Figure: memory hierarchy: CPU, then TLB, then L1, then L2, then Main Memory, then Devices]
• What does this imply about L1 addresses?
• Where do we hope requests get satisfied?
Memory Hierarchy Re-Revisited
[Figure: memory hierarchy: CPU, then TLB and L1 (drawn together), then L2, then Main Memory, then Devices]
• Now what does this imply about L1 addresses?
• Any speed benefits? Any drawbacks?
Definitions
• Paging – moving pages to (from) disk
  ex: paging begins five minutes into the test
• Pressure – the demand for some resource (often used when demand exceeds supply)
  ex: the system experienced memory pressure
• Optimal – the best (theoretical) strategy
• Eviction – throwing something out
  ex: cache lines and memory pages got evicted
• Pollution – bringing in useless pages/lines
  ex: this strategy causes high cache pollution
Big Picture
[Figure: labels "Load M i", "Free frame", "Page table", "VM", "ref", "fault"]
Really Big Picture
• Every “page-in” requires an eviction
• Hopefully, kick out a less-useful page
– Dirty pages require writing, clean pages don’t
– Where do you write? To “swap space”
• Goal: kick out the page that’s least useful
• Problem: how do you determine utility?
– Heuristic: temporal locality exists
– Kick out pages that aren’t likely to be used again
More definitions
• Thrashing / Flailing – extremely high rate of paging, usually induced by other decisions
• Dirty/Clean – indicates whether modifications have been made versus copy on stable storage
• Heuristic – set of rules to use when no good rigorous answer exists
• Temporal – in time
• Spatial – in space (location)
• Locality – re-use – it makes the world go round
What Makes This Hard?
• Perfect reference stream hard to get
– Every memory access would need bookkeeping
• Imperfect information available, cheaply
– Play around with PTE permissions, info
• Overhead is a bad idea
– If no memory pressure, ideally no bookkeeping
– In other words, make the common case fast
Steps in Paging
• Data structures
– A list of unused page frames
– Data structure to map a frame to its pid/virtual address
• On a page fault
– Get an unused frame or a used frame
– If the frame is used
  • If it has been modified, write it to disk
  • Invalidate its current PTE and TLB entry
– Load the new page from disk
– Update the faulting PTE and invalidate its TLB entry
– Restart the faulting instruction
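The steps above can be sketched as a toy model. Everything here is hypothetical and simplified: `ToyPager`, `PTE`, and `handle_fault` are illustrative names, the victim is chosen oldest-first as a stand-in for a real policy, and disk I/O is reduced to a log of written-back pages.

```python
from collections import deque

class PTE:
    """Hypothetical page-table entry: frame number plus valid/modified bits."""
    def __init__(self):
        self.frame = None
        self.valid = False
        self.modified = False

class ToyPager:
    def __init__(self, num_frames):
        self.free_frames = deque(range(num_frames))  # list of unused frames
        self.frame_owner = {}      # frame -> (pid, vpn): the reverse map
        self.page_table = {}       # (pid, vpn) -> PTE
        self.tlb = {}              # (pid, vpn) -> frame, cached translations
        self.load_order = deque()  # used frames, oldest first (policy stand-in)
        self.disk_writes = []      # log of dirty pages written back to swap

    def handle_fault(self, pid, vpn):
        # Get an unused frame, or else a used one chosen by the policy.
        if self.free_frames:
            frame = self.free_frames.popleft()
        else:
            frame = self.load_order.popleft()
            old_key = self.frame_owner[frame]
            old_pte = self.page_table[old_key]
            if old_pte.modified:
                self.disk_writes.append(old_key)  # write the dirty page out
            old_pte.valid = False                 # invalidate the old PTE
            old_pte.frame = None
            old_pte.modified = False
            self.tlb.pop(old_key, None)           # and its TLB entry
        # Loading the new page from disk is implicit in this toy model.
        pte = self.page_table.setdefault((pid, vpn), PTE())
        pte.frame, pte.valid, pte.modified = frame, True, False
        self.tlb.pop((pid, vpn), None)            # drop any stale translation
        self.frame_owner[frame] = (pid, vpn)
        self.load_order.append(frame)
        return frame  # the caller would now restart the faulting instruction
```

The reverse map `frame_owner` is what lets the handler find and invalidate the PTE of whichever process currently owns the victim frame.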
Optimal or MIN
• Algorithm
– Replace the page that won’t be used for the longest time
• Pros
– Minimal page faults
– This is an off-line algorithm for performance analysis
• Cons
– No on-line implementation
• Also called Belady’s Algorithm
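As an off-line baseline, MIN can be simulated when the whole reference string is known in advance. The sketch below is one straightforward way to do it; `min_faults` is a name chosen for this example, and `refs` is assumed to be a Python list.

```python
def min_faults(refs, num_frames):
    """Count page faults under MIN for a fully known reference string."""
    frames = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == num_frames:
            def next_use(p):
                # Index of the next reference to p, or infinity if none.
                try:
                    return refs.index(p, i + 1)
                except ValueError:
                    return float("inf")
            # Evict the resident page whose next use is farthest away.
            frames.remove(max(frames, key=next_use))
        frames.add(page)
    return faults
```

The look-ahead in `next_use` is exactly what an on-line system cannot do, which is why MIN serves only as a yardstick for other policies.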
Not Recently Used (NRU)
• Algorithm
– Randomly pick a page from the following (in order)
  • Not referenced and not modified
  • Not referenced and modified
  • Referenced and not modified
  • Referenced and modified
• Pros
– Easy to implement
• Cons
– Not very good performance, takes time to classify
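A minimal sketch of the NRU choice, assuming each page's referenced and modified bits are available as a pair. The function name `nru_victim` and the dictionary layout are illustrative, not taken from any real kernel.

```python
import random

def nru_victim(pages, rng=random):
    """pages maps page id -> (referenced, modified); returns a victim page."""
    classes = {0: [], 1: [], 2: [], 3: []}
    for page, (referenced, modified) in pages.items():
        # Class number: 0 = (0,0), 1 = (0,1), 2 = (1,0), 3 = (1,1).
        classes[2 * int(referenced) + int(modified)].append(page)
    for cls in range(4):
        if classes[cls]:
            return rng.choice(classes[cls])  # random pick in the lowest class
    raise ValueError("no pages to evict")
```

The full pass over `pages` is the "takes time to classify" cost listed under Cons.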
First-In-First-Out (FIFO)
• Algorithm
– Throw out the oldest page
• Pros
– Low-overhead implementation
• Cons
– May replace the heavily used pages
[Figure: page queue 5 3 4 7 9 11 2 1 15, with labels "Recently loaded" and "Page out"]
FIFO with Second Chance
• Algorithm
– Check the reference-bit of the oldest page
– If it is 0, then replace it
– If it is 1, clear the reference-bit, move it to end of list, and continue searching
• Pros
– Fast and does not replace a heavily used page
• Cons
– The worst case may take a long time
[Figure: page queue 5 3 4 7 9 11 2 1 15, with labels "Recently loaded", "Page out", and "If reference bit is 1"]
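The search loop can be sketched with a queue, assuming the oldest page sits at the left end and the reference bits live in a separate dictionary. `second_chance_evict` and both parameter names are made up for this example.

```python
from collections import deque

def second_chance_evict(queue, ref_bit):
    """queue: deque of pages, oldest at the left; ref_bit: page -> 0 or 1.
    Removes and returns the victim page."""
    while True:
        page = queue.popleft()
        if ref_bit[page] == 0:
            return page          # not referenced recently: replace it
        ref_bit[page] = 0        # referenced: clear the bit ...
        queue.append(page)       # ... and move the page to the end of the list
```

If every reference bit is 1, the loop clears all of them and comes back to the original oldest page, which is the long worst case noted under Cons.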
Clock: A Simple FIFO with 2nd Chance
• FIFO clock algorithm
– Hand points to the oldest page
– On a page fault, follow the hand to inspect pages
• Second chance
– If the reference bit is 1, set it to 0 and advance the hand
– If the reference bit is 0, use it for replacement
• What is the difference between Clock and the previous one?
[Figure: pages arranged in a circle, with the clock hand pointing at the oldest page]
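A sketch of Clock over a fixed array of frames. The class name `Clock` is illustrative, and one detail is an assumption the slide does not settle: a newly loaded page starts with its reference bit set to 1.

```python
class Clock:
    def __init__(self, num_frames):
        self.pages = [None] * num_frames  # page held by each frame
        self.ref = [0] * num_frames       # reference bit per frame
        self.hand = 0                     # index of the oldest page

    def access(self, page):
        """Reference a page; returns True if this caused a page fault."""
        if page in self.pages:
            self.ref[self.pages.index(page)] = 1
            return False
        # Follow the hand, clearing reference bits, until a 0 bit is found.
        while self.ref[self.hand] == 1:
            self.ref[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.pages)
        self.pages[self.hand] = page
        self.ref[self.hand] = 1  # assumption: loading counts as a reference
        self.hand = (self.hand + 1) % len(self.pages)
        return True
```

Compared with the list-based version, nothing is moved between positions: only the hand advances, so giving a page its second chance costs a bit clear and an index increment.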
Enhanced FIFO with 2nd-Chance Algorithm
• Same as the basic FIFO with 2nd chance, except that this method considers both reference bit and modified bit
– (0,0): neither recently used nor modified
– (0,1): not recently used but modified
– (1,0): recently used but clean
– (1,1): recently used and modified
• Pros
– Avoid write back
• Cons
– More complicated
More Page Frames Fewer Page Faults?
• Consider the following reference string with 4 page frames
– FIFO replacement
– 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
– 10 page faults
• Consider the same reference string with 3 page frames
– FIFO replacement
– 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
– 9 page faults!
• This is called Belady’s anomaly
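The two fault counts can be reproduced with a small FIFO simulation. The helper name `fifo_faults` is an assumption made for this sketch.

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement."""
    frames = deque()  # resident pages, oldest at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue              # hit: FIFO order is not updated
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()      # throw out the oldest page
        frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # prints "9 10"
```

Adding a frame changes which page is oldest at each fault, and on this string the 4-frame run ends up evicting pages just before they are needed again.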
Least Recently Used (LRU)
• Algorithm
– Replace the page that hasn’t been used for the longest time
• Question
– What hardware mechanisms are required to implement LRU?
Implement LRU
• Perfect
– Use a timestamp on each reference
– Keep a list of pages ordered by time of reference
[Figure: page list 5 3 4 7 9 11 2 1 15, with labels "Most recently used" and "Least recently used"]
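In software, the "list of pages ordered by time of reference" can be sketched with an ordered dictionary. This is a simulation only: it updates the ordering on every reference, which is the per-access bookkeeping that real hardware does not provide cheaply. The name `lru_faults` is illustrative.

```python
from collections import OrderedDict

def lru_faults(refs, num_frames):
    """Count page faults under perfect LRU replacement."""
    frames = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)    # every reference updates the order
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)  # evict the least recently used page
        frames[page] = True
    return faults
```

On the reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 this gives 10 faults with 3 frames and 8 with 4.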
Approximate LRU
[Figure: pages ordered from most recently used to least recently used, divided into categories]
• LRU: N categories
– pages in order of last reference
• Crude LRU: 2 categories
– pages referenced since the last page fault
– pages not referenced since the last page fault
• 8-bit count: 256 categories
– 0, 1, 2, 3, . . . , 254, 255
Aging: Not Frequently Used (NFU)
• Algorithm
– Shift reference bits into counters
– Pick the page with the smallest counter
• Main difference between NFU and LRU?
– NFU has a short history (counter length)
• How many bits are enough?
– In practice 8 bits are quite good
• Pros: Requires one reference bit
• Cons: Requires looking at all counters
[Figure: 8-bit counters for four pages (rows) at five successive clock ticks (columns)]
00000000  10000000  01000000  10100000  01010000
00000000  00000000  10000000  01000000  10100000
10000000  11000000  11100000  01110000  00111000
00000000  00000000  00000000  10000000  01000000
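The counter update can be sketched as a right shift with the reference bit inserted at the most significant position. The names `age` and `victim` and the dictionary layout are assumptions of this example.

```python
def age(counters, ref_bits, width=8):
    """One tick: shift each counter right, insert the reference bit on top."""
    top = 1 << (width - 1)
    for page in counters:
        counters[page] = (counters[page] >> 1) | (top if ref_bits[page] else 0)
        ref_bits[page] = 0  # reference bits are cleared after being sampled

def victim(counters):
    """Pick the page with the smallest counter."""
    return min(counters, key=counters.get)

counters = {"p": 0}
for bit in [1, 1, 1, 0, 0]:
    ref_bits = {"p": bit}
    age(counters, ref_bits)
print(format(counters["p"], "08b"))  # prints "00111000"
```

Recent references land in the high bits, so they outweigh older ones; anything older than `width` ticks is shifted out, which is the short history the slide mentions.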
Where Do We Get Storage?
• 32 bit VA to 32 bit PA – no space, right?
– Offset within page is the same
• No need to store offset
– 4KB page = 12 bits of offset
– Those 12 bits are “free” in PTE
• Page # + other info <= 32 bits
– Makes storing info easy
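The bit budget can be made concrete with a sketch of a 32-bit PTE: a 20-bit frame number in the high bits and 12 low bits left for bookkeeping. The flag positions below are hypothetical and do not follow any particular architecture's layout.

```python
PAGE_SHIFT = 12                      # 4KB page, so 12 offset bits
OFFSET_MASK = (1 << PAGE_SHIFT) - 1

# Hypothetical flag positions inside the low 12 bits of a PTE.
VALID, REFERENCED, MODIFIED = 1 << 0, 1 << 1, 1 << 2

def make_pte(frame_number, flags):
    """Pack a 20-bit frame number and up to 12 flag bits into 32 bits."""
    return (frame_number << PAGE_SHIFT) | flags

def translate(pte, virtual_address):
    """Physical address: frame number from the PTE, offset from the VA."""
    frame_number = pte >> PAGE_SHIFT
    return (frame_number << PAGE_SHIFT) | (virtual_address & OFFSET_MASK)

pte = make_pte(0xABCDE, VALID | MODIFIED)
print(hex(pte), hex(translate(pte, 0x12345678)))  # prints "0xabcde005 0xabcde678"
```

The offset never passes through the PTE, so the reference and modified bits used by the replacement policies fit in the entry at no extra cost.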