Lecture Topics: 11/17 • Page tables – flat page tables – paged page tables – inverted page tables • TLBs • Virtual memory
Jan 02, 2016
Virtual Addresses
• Programs use virtual addresses which don't correlate to physical addresses
• CPU translates all memory references from virtual addresses to physical addresses
• OS still uses physical addresses
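[Figure: two diagrams, labeled User mode and Kernel mode, each showing the CPU, the Translation Box, and Main Memory. In one, the CPU issues a virtual address that the Translation Box converts to a physical address; in the other, only a physical address is shown.]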
Paging
• Divide a process's virtual address space into fixed-size chunks (called pages)
• Divide physical memory into pages of the same size
• Any virtual page can be located at any physical page
• Translation box converts from virtual pages to physical pages
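[Figure: the virtual address spaces of two processes (Word, IE5), one with pages 0-5 (0x0000-0x6000) and one with pages 0-3 (0x0000-0x4000), mapped through the Translation box to physical pages 0-13 (0x0000-0xE000). Labels: Virtual Page #, Physical Page #.]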
Page Tables
• A page table maps virtual page numbers to physical page numbers
• Lots of different types of page tables
  – arrays, lists, hashes
[Figure: the Page Table takes a virtual address (VPN, Offset) together with the Process ID and produces a physical address (PPN, Offset). Labels: Virtual Page #, Physical page #.]
Flat Page Table
• A flat page table uses the VPN to index into an array
• What's the problem? (Hint: how many entries are in the table?)
[Figure: a flat Page Table indexed by VPN and holding PPNs; an example virtual address with VPN 4, Offset 100 is mapped into Memory.]
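A minimal sketch of a flat page table lookup, using the VPN 4, Offset 100 example. The 8K page size and the table contents are assumptions made for illustration; they are not read off the figure.

```python
PAGE_SIZE = 8 * 1024          # assumption: 8K pages
OFFSET_BITS = 13              # log2(PAGE_SIZE)

# Hypothetical flat page table: index = VPN, value = PPN (None = unmapped).
page_table = [5, 6, None, 13, 10, 9]

def translate(vaddr):
    """Translate a virtual address with a single array lookup."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    if vpn >= len(page_table) or page_table[vpn] is None:
        raise LookupError(f"no mapping for VPN {vpn}")
    # The offset is carried over unchanged; only the page number is replaced.
    return (page_table[vpn] << OFFSET_BITS) | offset

vaddr = (4 << OFFSET_BITS) | 100      # VPN 4, offset 100
print(hex(vaddr), "->", hex(translate(vaddr)))   # 0x8064 -> 0x14064
```

The lookup is one array index, which is why the table needs an entry for every possible VPN, mapped or not.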
Flat Page Table Evaluation
• Very simple to implement
• Flat page tables don't work for sparse address spaces
  – code starts at 0x00400000
  – stack starts at 0x7FFFFFFF
• With 8K pages and 32-bit addresses, this requires 1MB of memory per page table (2^19 entries, assuming 2 bytes each)
  – 1MB per process
  – must be kept in main memory (can't be put on disk)
• 64-bit addresses are a nightmare (4 PB under the same assumptions)
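The sizes can be checked with a few lines of arithmetic. This is a sketch under one assumption that the slide does not state: each page table entry takes 2 bytes (the value that makes the 32-bit table come out to 1MB).

```python
def flat_table_bytes(address_bits, page_size, entry_bytes):
    """Size in bytes of a flat page table covering the whole address space."""
    entries = 2 ** address_bits // page_size     # one entry per virtual page
    return entries * entry_bytes

PAGE_SIZE = 8 * 1024     # 8K pages
ENTRY_BYTES = 2          # assumption: 2-byte entries

print(flat_table_bytes(32, PAGE_SIZE, ENTRY_BYTES) // 2**20, "MB")   # 1 MB
print(flat_table_bytes(64, PAGE_SIZE, ENTRY_BYTES) // 2**50, "PB")   # 4 PB
```

With 64-bit addresses the table has 2^51 entries, so no realistic entry size makes it fit in memory.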
Multi-level Page Tables
• Use multiple levels of page tables
  – each page table points to another page table
  – the last page table points to the PPN
• The VPN is divided into
  – Index into level 1 page
  – Index into level 2 page
  – …
Multi-level Page Tables
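[Figure: a two-level lookup. An example address (VPN, Offset: 3 1002) indexes the L1 Page Table, which selects one of the L2 Page Tables, which in turn points to a page in Memory.]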
Multi-Level Evaluation
• Only allocate as many page tables as we need, which works with sparse address spaces
• Only the top page table must be pinned in physical memory
• Each page table usually fills exactly 1 page so it can be easily moved to/from disk
• Requires multiple physical memory references for each virtual memory reference
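A minimal sketch of a two-level table, assuming 32-bit addresses and 8K pages (a 19-bit VPN). The 9/10-bit split between the levels and the PPN values are hypothetical choices for illustration.

```python
OFFSET_BITS = 13                 # 8K pages
L2_BITS = 10                     # assumption: low 10 bits of the VPN index level 2
L1_BITS = 9                      # assumption: high 9 bits of the VPN index level 1

class TwoLevelPageTable:
    def __init__(self):
        # Level 1 has one slot per possible level-2 table; None = not allocated.
        self.l1 = [None] * (1 << L1_BITS)

    def map(self, vpn, ppn):
        i1, i2 = vpn >> L2_BITS, vpn & ((1 << L2_BITS) - 1)
        if self.l1[i1] is None:                     # allocate level 2 on demand
            self.l1[i1] = [None] * (1 << L2_BITS)
        self.l1[i1][i2] = ppn

    def translate(self, vaddr):
        vpn, offset = vaddr >> OFFSET_BITS, vaddr & ((1 << OFFSET_BITS) - 1)
        i1, i2 = vpn >> L2_BITS, vpn & ((1 << L2_BITS) - 1)
        l2 = self.l1[i1]                            # first table reference
        if l2 is None or l2[i2] is None:            # second table reference
            raise LookupError(f"no mapping for VPN {vpn}")
        return (l2[i2] << OFFSET_BITS) | offset

    def tables_allocated(self):
        return sum(t is not None for t in self.l1)

pt = TwoLevelPageTable()
pt.map(0x00400000 >> OFFSET_BITS, 7)     # code page (hypothetical PPN 7)
pt.map(0x7FFFFFFF >> OFFSET_BITS, 3)     # stack page (hypothetical PPN 3)
print(hex(pt.translate(0x00400010)))     # 0xe010
print(pt.tables_allocated(), "of", 1 << L1_BITS, "level-2 tables allocated")   # 2 of 512
```

Code and stack sit far apart, yet only two level-2 tables exist. The price is the extra table reference in translate.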
Inverted Page Tables
• Inverted page tables hash the VPN to get the PPN
• Lookup takes O(1) expected time
• Storage is proportional to the number of physical pages being used, not the size of the address space
[Figure: the VPN of a virtual address (VPN, Offset) is hashed into the Inverted Page Table (a hash table), which gives the page in Memory.]
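A rough sketch of the idea, using a Python dict as the hash table. Keying on (process ID, VPN), the 8K page size, and the first-free allocation policy are assumptions made for the example.

```python
OFFSET_BITS = 13      # assumption: 8K pages

class InvertedPageTable:
    """One entry per physical page in use, found by hashing (pid, VPN)."""

    def __init__(self, num_physical_pages):
        self.free = list(range(num_physical_pages))   # unused PPNs
        self.table = {}                                # (pid, vpn) -> ppn

    def map(self, pid, vpn):
        if not self.free:
            raise MemoryError("no free physical pages")
        ppn = self.free.pop(0)
        self.table[(pid, vpn)] = ppn
        return ppn

    def translate(self, pid, vaddr):
        vpn, offset = vaddr >> OFFSET_BITS, vaddr & ((1 << OFFSET_BITS) - 1)
        ppn = self.table[(pid, vpn)]      # hash lookup; KeyError if unmapped
        return (ppn << OFFSET_BITS) | offset

ipt = InvertedPageTable(num_physical_pages=4)
ipt.map(pid=1, vpn=0x200)                   # gets PPN 0
ipt.map(pid=2, vpn=0x200)                   # same VPN, other process: PPN 1
print(hex(ipt.translate(2, 0x00400010)))    # 0x2010
print(len(ipt.table))                       # 2 entries, one per page in use
```

The table can never hold more entries than there are physical pages, no matter how large or sparse the virtual address spaces are.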
Translation Problem
• Each virtual address reference requires multiple accesses to physical memory
• Accessing physical memory is about 50 times slower than accessing the on-chip cache
• If the VPN->PPN translation were made through the page tables for each reference, the computer would run as fast as a Commodore-64
• Fortunately, locality allows us to cache translations on chip
Translation Lookaside Buffer
• The translation lookaside buffer (TLB) is a small on-chip cache of VPN->PPN translations
• In the common case, the translation is in the TLB and there is no need to go through the page tables
• Common TLB parameters
  – 64 entries
  – fully associative
  – separate data and instruction TLBs (why?)
Virtual Page #   Physical Page #   Control Info
11               6                 valid, read/write
200              13                valid, read only
--               --                invalid
0                14                valid, read/write
TLB
• On a TLB miss, the CPU asks the OS to add the translation to the TLB
  – OS replacement policies are usually approximations of LRU
• On a context switch all TLB entries are invalidated because the next process has different translations
• A TLB usually has a high hit rate (99-99.9%)
  – so virtual address translation costs almost nothing
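A small software model of a fully associative, 64-entry TLB, to show why locality gives a high hit rate. Exact LRU, the class and method names, and the page table contents are assumptions for illustration; a real TLB is hardware and usually only approximates LRU.

```python
from collections import OrderedDict

class TLB:
    """Fully associative TLB with LRU replacement (a software model)."""

    def __init__(self, entries=64):
        self.entries = entries
        self.map = OrderedDict()      # vpn -> ppn, least recently used first
        self.hits = self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.map:
            self.hits += 1
            self.map.move_to_end(vpn)             # mark most recently used
        else:
            self.misses += 1                      # miss: consult the page table
            if len(self.map) >= self.entries:
                self.map.popitem(last=False)      # evict least recently used
            self.map[vpn] = page_table[vpn]
        return self.map[vpn]

    def flush(self):
        """Invalidate every entry, as on a context switch."""
        self.map.clear()

page_table = {vpn: vpn + 100 for vpn in range(1000)}   # hypothetical mappings
tlb = TLB(entries=64)
for _ in range(100):                 # loop over the same 8 pages: strong locality
    for vpn in range(8):
        tlb.lookup(vpn, page_table)
print(tlb.hits, tlb.misses)          # 792 8
```

Here 792 of 800 lookups hit (99%). After flush(), the next reference to each page misses again, which is the cost of a context switch.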
Virtual Memory
• Virtual memory spills unused memory to disk
  – abstraction: infinite memory
  – reality: finite physical memory
• In computer science, virtual means slow
  – think Java Virtual Machine
• VM was invented when memory was small and expensive
  – needed VM because memories were too small
  – 1965-75: CPU=1 MIPS, 1MB=$1000, disk=30ms
• Now the cost of accessing disk is much more expensive relative to CPU speed
  – 2000: CPU=1000 MIPS, 1MB=$1, disk=10ms
  – VM is still convenient for massive multitasking, but few programs need more than 128MB
Virtual Memory
• Simple idea: page table entry can point to a PPN or a location on disk (offset into page file)
• A page on disk is swapped back in when it is referenced
  – page fault
[Figure: a page table (VPN 0-10) whose entries point either into memory (pages 0-6) or into the page file (slots 0-6).]
Page Fault Example
[Figure: three snapshots of the VPN table (0-10), memory (0-6), and the page file (0-6), one for each step below.]
Reference to VPN 10 causes a page fault because it is on disk.
VPN 5 has not been used recently. Write it to the page file.
Read VPN 10 from the page file into physical memory.
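The three steps can be sketched with a page table whose entries point either into memory or into the page file. The frame number, the page-file slots, and the function name are hypothetical, and the victim is passed in rather than chosen by a replacement policy.

```python
# Each page table entry is ("mem", ppn) or ("disk", slot in the page file).
# Hypothetical starting state: VPN 5 is resident, VPN 10 is in the page file.
page_table = {5: ("mem", 2), 10: ("disk", 4)}
free_slots = [6]                     # hypothetical free page-file slot

def reference(vpn, victim_vpn):
    """Return the PPN for vpn, swapping out victim_vpn on a page fault.

    Assumes victim_vpn is currently resident in memory.
    """
    where, index = page_table[vpn]
    if where == "mem":
        return index                               # no fault
    # Page fault: write the victim to the page file, freeing its frame.
    _, frame = page_table[victim_vpn]
    page_table[victim_vpn] = ("disk", free_slots.pop())
    # Read the faulting page from the page file into the freed frame.
    free_slots.append(index)
    page_table[vpn] = ("mem", frame)
    return frame

print(reference(10, victim_vpn=5))   # 2
print(page_table)                    # {5: ('disk', 6), 10: ('mem', 2)}
```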
Virtual Memory vs. Caches
• Physical memory is a cache of the page file
• Many of the same concepts we learned with caches apply to virtual memory
  – both work because of locality
  – dirty bits prevent pages from always being written back
• Some concepts don't apply
  – VM is usually fully associative with complex replacement algorithms because a page fault is so expensive
Replacement Algorithms
• How do we decide which virtual page to replace in memory?
• FIFO--throw out the oldest page
  – very bad because it throws out frequently used pages
• RANDOM--pick a random page
  – works better than you would guess, but not good enough
• MIN--pick the page that won't be used for the longest time
  – provably optimal, but impossible because it requires knowledge of the future
• LRU--approximation of MIN, still impractical
• CLOCK--practical approximation of LRU
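To compare the policies, here is a small fault counter for FIFO, LRU, and MIN. The reference string and the frame count are made up for illustration; pages 1 and 2 are the frequently used ones.

```python
def count_faults(refs, frames, policy):
    """Count page faults for a reference string under FIFO, LRU, or MIN."""
    resident, faults = [], 0          # resident is ordered oldest-first
    for i, page in enumerate(refs):
        if page in resident:
            if policy == "LRU":       # a hit refreshes recency under LRU only
                resident.remove(page)
                resident.append(page)
            continue
        faults += 1
        if len(resident) == frames:
            if policy == "MIN":       # evict the page whose next use is farthest away
                future = refs[i + 1:]
                victim = max(resident,
                             key=lambda p: future.index(p) if p in future else len(future))
            else:                     # FIFO and LRU both evict the front of the list
                victim = resident[0]
            resident.remove(victim)
        resident.append(page)
    return faults

refs = [1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 3]    # hypothetical reference string
for policy in ("FIFO", "LRU", "MIN"):
    print(policy, count_faults(refs, frames=3, policy=policy))
# FIFO 8
# LRU 6
# MIN 6
```

FIFO evicts pages 1 and 2 simply because they were loaded first, even though they are referenced constantly. On this particular string LRU happens to match MIN; in general it only approximates it.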
Perfect LRU
• Perfect LRU
  – timestamp each page when it is referenced
  – on page fault, find oldest page
  – too much work per memory reference
LRU Approximation: Clock
• Clock algorithm
  – arrange physical pages in a circle, with a clock hand
  – keep a use bit per physical page
  – bit is set on each reference
    • bit isn't set => page not used in a long time
  – On page fault
    • Advance clock hand to next page & check use bit
      – If used, clear the bit and go to next page
      – If not used, replace this page
Clock Example
[Figure: five snapshots of a clock of eight physical pages (PPN 0-7), each with its use bit, showing the hand advancing through the steps below.]
PPN 0 has been used; clear and advance
PPN 1 has been used; clear and advance
PPN 2 has been used; clear and advance
PPN 3 has not been used; replace and set use bit
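A minimal sketch of the clock hand, following the same four steps. The use bits for PPNs 0-3 match the example (used, used, used, not used); the bits for PPNs 4-7 and the class name are made up.

```python
class Clock:
    """Clock replacement over a fixed set of physical pages."""

    def __init__(self, use_bits):
        self.use = list(use_bits)   # one use bit per physical page
        self.hand = 0               # index of the next page to inspect

    def touch(self, ppn):
        self.use[ppn] = 1           # hardware would set this on each reference

    def pick_victim(self):
        """Advance the hand until a page with a clear use bit is found."""
        while True:
            ppn = self.hand
            self.hand = (self.hand + 1) % len(self.use)
            if self.use[ppn]:
                self.use[ppn] = 0   # used recently: clear and move on
            else:
                self.use[ppn] = 1   # replace; the new page counts as used
                return ppn

clock = Clock([1, 1, 1, 0, 0, 1, 0, 0])
print(clock.pick_victim())   # 3
print(clock.use)             # [0, 0, 0, 1, 0, 1, 0, 0]
```

Even if every use bit is set, the hand clears them all on its first lap and replaces the page it started at on the second, so the loop always terminates.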
Clock Questions
• Will Clock always find a page to replace?
• What does it mean if the hand is moving slowly?
• What does it mean if the hand is moving quickly?
Thrashing
• Thrashing occurs when pages are tossed out, but are still needed
  – listen to the hard drive crunch
• Example: a program touches 50 pages often but there are only 40 physical pages
• What happens to performance?
  – enough memory: 2 ns/ref (most refs hit in cache)
  – not enough memory: 2 ms/ref (page faults every few instructions)
• Very common with shared machines
[Figure: jobs/sec versus # users, with a region labeled thrashing.]
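The jump from 2 ns to 2 ms per reference follows from a weighted average. This is a back-of-the-envelope sketch assuming a 10 ms disk access per page fault and, for the thrashing case, one fault every five references; both numbers are assumptions chosen to be consistent with the figures above.

```python
def avg_ns_per_ref(fault_rate, hit_ns=2, fault_ns=10_000_000):
    """Average cost of a reference when a fraction fault_rate of them page-fault."""
    return (1 - fault_rate) * hit_ns + fault_rate * fault_ns

print(avg_ns_per_ref(0.0))    # enough memory: 2 ns per reference
print(avg_ns_per_ref(0.2))    # thrashing: about 2,000,000 ns, i.e. 2 ms
```

Because a fault costs millions of times more than a hit, even a small fault rate dominates the average.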
Thrashing Solutions
• If one job causes thrashing
  – rewrite program to have better locality
• If multiple jobs cause thrashing
  – only run processes that fit in memory
• Big red button
Working Set
• The working set of a process is the set of pages that it is actually using
  – usually much smaller than the amount of memory that is allocated
• As long as a process's working set fits in memory it won't thrash
• Formally: the set of pages a job has referenced in the last T seconds
• How do we pick T?
  – too big => could run more programs
  – too small => thrashing
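The formal definition translates directly into code. A minimal sketch, assuming references are logged as (time, VPN) pairs; the trace is made up.

```python
def working_set(trace, now, T):
    """Pages referenced in the window (now - T, now]."""
    return {page for time, page in trace if now - T < time <= now}

# Hypothetical trace of (time in seconds, VPN) pairs.
trace = [(0.1, 1), (0.2, 2), (0.9, 1), (1.5, 3), (1.8, 1), (2.0, 3)]
print(working_set(trace, now=2.0, T=1.0))   # {1, 3}
print(working_set(trace, now=2.0, T=2.0))   # {1, 2, 3}
```

A larger T pulls in page 2, which has not been touched for a long time, so memory is reserved for a page the job may no longer need.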
What happens on a memory reference?
• An instruction refers to memory location X:
• Is X's VPN in the TLB?
  – Yes: get data from cache or memory. Done.
  – (Often don't look in TLB if data is in the L1 cache)
• Trap to OS to load X's VPN into the TLB
• OS: Is X's VP located in physical memory?
  – Yes: replace TLB entry with X's VPN. Return control to CPU, which restarts the instruction. Done.
• Must load X's VP from disk
  – pick a page to replace, write it back to disk if dirty
  – load X's VP from disk into physical memory
  – replace the TLB entry with X's VPN. Return control to CPU, which restarts the instruction.
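The whole path can be sketched as one function. This is a simplified model: the TLB and page table are plain dicts, the names are hypothetical, the victim's frame is reused directly, and the dirty-bit write-back is left out.

```python
def access(vpn, tlb, page_table, pick_victim):
    """Return (ppn, events) for one memory reference."""
    events = []
    if vpn in tlb:                                   # TLB hit: done
        return tlb[vpn], events + ["tlb hit"]
    events.append("trap to OS")                      # TLB miss
    if page_table[vpn] is None:                      # page is on disk
        victim = pick_victim(page_table)
        # Hand the victim's frame to the faulting page (write-back omitted).
        page_table[vpn], page_table[victim] = page_table[victim], None
        tlb.pop(victim, None)                        # victim's translation is stale
        events.append(f"page fault: evict VPN {victim}")
    tlb[vpn] = page_table[vpn]                       # load translation into the TLB
    events.append("tlb fill, restart instruction")
    return tlb[vpn], events

# Hypothetical state: VPN -> PPN, with None meaning "in the page file".
page_table = {0: 0, 1: 1, 2: None}
tlb = {0: 0}
first_resident = lambda pt: next(v for v, p in pt.items() if p is not None)

print(access(0, tlb, page_table, first_resident))   # TLB hit
print(access(1, tlb, page_table, first_resident))   # TLB miss, page in memory
print(access(2, tlb, page_table, first_resident))   # TLB miss and page fault
```

The three calls exercise the three outcomes in order of increasing cost: TLB hit, TLB miss with the page resident, and page fault.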
What is a Trap?
• http://www.cs.wayne.edu/~tom/guide/os2.html
• http://www.cs.nyu.edu/courses/fall99/G22.2250-001/class-notes.html