15-447 Computer Architecture Fall 2008 © November 10, 2007 Nael Abu-Ghazaleh [email protected] http://www.qatar.cmu.edu/~msakr/15447-f08 Lecture 23: Virtual Memory (2) CS 15-447: Computer Architecture
Dec 20, 2015
Last Lecture
• Virtual memory lets the programmer “see” a memory array larger than the DRAM available on a particular computer system.
• Virtual memory enables multiple programs to share the physical memory without:
  – Knowing other programs exist (transparency).
  – Worrying about one program modifying the data contents of another (protection).
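Page Table Components
[Figure: the page table register locates the page table, which is indexed by the virtual address (example values 0x00002 and 0x082). Each entry holds a valid bit and a physical page number; an entry with valid = 0 holds a disk address.]
Exception: page fault
1. Stop this process
2. Pick page to replace
3. Write back data
4. Get referenced page
5. Update page table
6. Reschedule process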
Other VM Functions
• Page data location
  – Physical memory, disk, uninitialized data
• Access permissions
  – Read only pages for instructions
• Gathering access information
  – Identifying dirty pages by tracking stores
  – Identifying accesses to help determine LRU candidate
Page Replacement Strategies
• Page table indirection enables a fully associative mapping between virtual and physical pages.
• How do we implement LRU?
  – True LRU is expensive, but LRU is a heuristic anyway, so approximating LRU is fine.
  – Reference bit on page, cleared occasionally by operating system. Then pick any “unreferenced” page to evict.
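The reference-bit approximation described above can be sketched in a few lines. This is only an illustration: the class name `ReferenceBitPager` and its methods are hypothetical, and a real system sets the bit in hardware and clears it from an OS timer or page daemon.

```python
class ReferenceBitPager:
    """Approximate LRU with one reference bit per resident page (hypothetical sketch)."""

    def __init__(self, pages):
        # Every resident page starts out unreferenced.
        self.referenced = {page: False for page in pages}

    def touch(self, page):
        # Stands in for hardware setting the bit on an access to the page.
        self.referenced[page] = True

    def clear_all(self):
        # Stands in for the OS occasionally clearing all reference bits.
        for page in self.referenced:
            self.referenced[page] = False

    def pick_victim(self):
        # Any page that has not been referenced since the last clear will do.
        for page, bit in self.referenced.items():
            if not bit:
                return page
        # Everything was referenced: fall back to an arbitrary page.
        return next(iter(self.referenced))


pager = ReferenceBitPager(["A", "B", "C"])
pager.touch("A")
pager.touch("C")
victim = pager.pick_victim()  # "B" is the only unreferenced page
```

The victim is not necessarily the least recently used page, only one that has not been touched recently, which is the trade-off the slide accepts.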
Two problems
• Size of page table: can be big
  – Do we have one page table per machine or one page table per process?
• Performance:
  – How many times do we access memory per instruction?
  – Can we afford to access a page table to do translation with every memory access?
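Hierarchical Page Table Example
[Figure: the virtual address is split into virtual superpage, virtual page, and page offset. The page table register points to the super page table (1 page), whose entries hold a valid bit and a pointer to a 2nd level page table (1 page). Each 2nd level entry holds a valid bit and a physical page number, which is combined with the page offset to form the physical address.]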
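A two-level lookup of this kind might look like the sketch below. The field widths (10-bit superpage, 10-bit page, 12-bit offset) and the names `translate` and `PageFault` are assumptions chosen for illustration; the slide does not give them. A missing dictionary entry stands in for an entry whose valid bit is 0.

```python
OFFSET_BITS = 12  # assumed: 4 KB pages
PAGE_BITS = 10    # assumed width of the "virtual page" field
SUPER_BITS = 10   # assumed width of the "virtual superpage" field


class PageFault(Exception):
    """Raised when either level has no valid entry (hypothetical name)."""


def translate(super_table, vaddr):
    # Split the virtual address into its three fields.
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    page = (vaddr >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
    superpage = (vaddr >> (OFFSET_BITS + PAGE_BITS)) & ((1 << SUPER_BITS) - 1)

    # First level: the super page table selects a 2nd level page table.
    second_level = super_table.get(superpage)
    if second_level is None:
        raise PageFault(f"no 2nd level table for superpage {superpage}")

    # Second level: the 2nd level table supplies the physical page number.
    ppn = second_level.get(page)
    if ppn is None:
        raise PageFault(f"no mapping for page {page}")

    # Physical address = physical page number followed by the page offset.
    return (ppn << OFFSET_BITS) | offset


# Superpage 1, page 2 maps to physical page 0x1A (made-up values).
super_table = {1: {2: 0x1A}}
vaddr = (1 << 22) | (2 << 12) | 0x345
paddr = translate(super_table, vaddr)  # 0x1A345
```

Only the 2nd level tables that are actually in use need to exist, which is how the hierarchy reduces page table size.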
Problem 2: Performance
• We must access physical memory to read the page table and translate a virtual address into a physical one.
• Then we access physical memory again to get (or store) the data.
• A load instruction performs at least 2 memory reads.
• A store instruction performs at least 1 read and then a write.
Translation Lookaside Buffer
• We fix this performance problem by avoiding a memory access when translating virtual pages to physical pages.
• We buffer the common translations in a translation lookaside buffer (TLB).
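As a rough sketch of the idea, the following models a small TLB in front of a page table and counts how often the page table in memory must be read. The class `TLB`, the function `translate_with_tlb`, the LRU replacement choice, and the sample mappings are all assumptions for illustration.

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB with LRU replacement (assumed policy)."""

    def __init__(self, entries):
        self.entries = entries
        self.map = OrderedDict()  # vpn -> ppn, least recently used first

    def lookup(self, vpn):
        if vpn in self.map:
            self.map.move_to_end(vpn)  # mark as most recently used
            return self.map[vpn]
        return None

    def insert(self, vpn, ppn):
        if vpn not in self.map and len(self.map) >= self.entries:
            self.map.popitem(last=False)  # evict the least recently used entry
        self.map[vpn] = ppn
        self.map.move_to_end(vpn)


def translate_with_tlb(tlb, page_table, vpn, stats):
    ppn = tlb.lookup(vpn)
    if ppn is None:
        # TLB miss: only now do we pay for a page table read in memory.
        stats["page_table_reads"] += 1
        ppn = page_table[vpn]
        tlb.insert(vpn, ppn)
    return ppn


page_table = {0: 5, 1: 9}  # made-up mappings
stats = {"page_table_reads": 0}
tlb = TLB(entries=2)
results = [translate_with_tlb(tlb, page_table, vpn, stats) for vpn in (0, 0, 1, 0)]
```

Four translations cost only two page table reads here, because repeated references to the same page hit in the TLB.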
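TLB
[Figure: the virtual page number is compared against the tag of each valid (v) TLB entry; the matching entry supplies the physical page, which is combined with the page offset.]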
Where is the TLB Lookup?
• We put the TLB lookup in the pipeline after the virtual address is calculated and before the memory reference is performed.
  – This may be before or during the data cache access.
  – Without a TLB we need to perform the translation during the memory stage of the pipeline.
Placing Caches in a VM System
• VM systems give us two different addresses: virtual and physical
• Which address should we use to access the data cache?
  – Virtual address (before VM translation)
    • Faster access? More complex?
  – Physical address (after VM translation)
    • Delayed access?
Physically-Addressed Caches
• Perform TLB lookup before cache tag comparison.
  – Use bits from physical address to index set
  – Use bits from physical address to compare tag
• Slower access?
  – Tag lookup takes place after the TLB lookup.
• Simplifies some VM management
  – When switching processes, TLB must be invalidated, but cache OK to stay as is.
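The ordering above (translate first, then index and tag with physical bits) can be sketched as follows. The cache geometry (32-byte blocks, 128 sets) and the function names are assumptions for illustration, and the TLB is reduced to a plain dictionary.

```python
PAGE_BITS = 12   # 4 KB pages
BLOCK_BITS = 5   # assumed: 32-byte cache blocks
INDEX_BITS = 7   # assumed: 128 sets


def split_physical_address(paddr):
    # Tag, set index, and block offset all come from the physical address.
    block_offset = paddr & ((1 << BLOCK_BITS) - 1)
    index = (paddr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
    tag = paddr >> (BLOCK_BITS + INDEX_BITS)
    return tag, index, block_offset


def physical_cache_lookup(tlb, vaddr):
    # Step 1: the TLB translates the virtual page to a physical page.
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    paddr = (tlb[vpn] << PAGE_BITS) | offset
    # Step 2: only then can the cache be indexed and the tag compared.
    return split_physical_address(paddr)


tlb = {0x7: 0x2}  # made-up mapping: virtual page 7 -> physical page 2
fields = physical_cache_lookup(tlb, 0x7FFC)  # physical address 0x2FFC
```

Because the cache only ever sees physical addresses, a context switch changes the TLB contents but leaves the cached blocks valid.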
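Picture of Physical Caches
[Figure: the virtual address is split into virtual page and page offset. The TLB (entries of tag and PPN) translates the virtual page into a PPN. The physical address (PPN and page offset) is then split into tag, index, and block offset; the index selects a cache set (Set0, Set1, Set2, ...) and the tag is compared (tag cmp) against the tags stored in that set.]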
Virtually-Addressed Caches
• Perform the TLB lookup at the same time as the cache tag compare.
  – Uses bits from the virtual address to index the cache set
  – Uses bits from the virtual address for tag match.
• Problems:
  – Aliasing: Two processes may refer to the same physical location with different virtual addresses.
  – When switching processes, TLB must be invalidated, and dirty cache blocks must be written back to memory.
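The aliasing problem can be illustrated with a toy model in which the cache is keyed purely by virtual address. All addresses and names here are made up, and the write-back behavior is an assumption.

```python
memory = {0x3000: 10}                        # one physical location
page_map = {0x1000: 0x3000, 0x8000: 0x3000}  # two virtual aliases for it
vcache = {}                                  # virtual address -> cached value


def load(vaddr):
    # Miss: fetch from memory through the translation. Hit: use the cached copy.
    if vaddr not in vcache:
        vcache[vaddr] = memory[page_map[vaddr]]
    return vcache[vaddr]


def store(vaddr, value):
    # Assumed write-back cache: only the cached copy is updated for now.
    vcache[vaddr] = value


first = load(0x1000)    # brings the data in under the first alias
second = load(0x8000)   # a second, independent cached copy of the same data
store(0x1000, 99)       # update through the first alias
stale = load(0x8000)    # still the old value: the two copies have diverged
```

The cache now holds two different values for a single physical location, which is exactly what a physically addressed cache avoids.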
Picture of Virtual Caches
[Figure: the virtual address is split into tag, index, and block offset. The index selects a cache set (Set0, Set1, Set2, ...) and the virtual tag is compared (tag cmp) against the tags stored in that set.]
• TLB is accessed in parallel with cache lookup.
• Physical address is used to access main memory in case of a cache miss.
OS Support for Virtual Memory
• The OS must be able to modify the page table register, update page table values, etc.
  – To let the OS do this, and not the user program, a process has different execution modes: one with executive (or supervisor, or kernel level) permissions and one with user level permissions.
Extended Example: Loading a Program into Memory
[Figure: disk pages 1000-1003 holding text1, text2, and Istatic; physical memory pages 0-3; a 2 entry TLB; a page table with entries 0-7.]
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: none yet
Additional Information
• Page size = 4KB
• Page table entry size = 4B
• Page table register points to physical address 0000
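With these parameters, each reference in the example splits into a virtual page number and an offset, and the address of its page table entry follows directly. The sketch below works through that arithmetic; the function name `locate` is hypothetical.

```python
PAGE_SIZE = 4 * 1024      # 4KB pages -> 12 offset bits
PTE_SIZE = 4              # 4B per page table entry
PAGE_TABLE_BASE = 0x0000  # value of the page table register


def locate(vaddr):
    # Virtual page number and offset within the page.
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    # Physical address of the page table entry for that virtual page.
    pte_addr = PAGE_TABLE_BASE + vpn * PTE_SIZE
    return vpn, offset, pte_addr


references = (0x0000, 0x0004, 0x7FFC, 0x0008, 0x2134)
rows = [locate(addr) for addr in references]
```

For instance, reference 7FFC falls in virtual page 7 at offset FFC, and its page table entry sits at physical address 001C; reference 2134 falls in virtual page 2 at offset 134, with its entry at 0008.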
Step 1: Read Executable Header and Initialize Page Table
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: empty
• Memory (pages 0-3): reserved
• Page Table (entries 0-7): D1000, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: none yet
Step 2: Load PC from Header and Start Execution
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: empty
• Memory (pages 0-3): reserved
• Page Table (entries 0-7): D1000, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: none yet
• TLB: MISS!
Fetching instr 0000
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: empty
• Memory (pages 0-3): reserved
• Page Table (entries 0-7): D1000, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault
Fetching instr 0000
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000
• Memory (pages 0-3): reserved, text1
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault
Fetching instr 0000
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000
• Memory (pages 0-3): reserved, text1
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000
Fetching instr 0004
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000
• Memory (pages 0-3): reserved, text1
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004
• TLB: HIT!
Reference 7FFC
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000
• Memory (pages 0-3): reserved, text1
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004
• TLB: MISS!
Reference 7FFC
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000
• Memory (pages 0-3): reserved, text1
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, no map
• Permission flags: ro; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004, No map page fault
Reference 7FFC
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000, M2 7000
• Memory (pages 0-3): reserved, text1, Set to 0s
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, M2
• Permission flags: ro, rw; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004, No map page fault, 2FFC
Fetching instr 0008
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000, M2 7000
• Memory (pages 0-3): reserved, text1, Set to 0s
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, M2
• Permission flags: ro, rw; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004, No map page fault, 2FFC, 1008
• TLB: HIT!
Reference 2134
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000, M2 7000
• Memory (pages 0-3): reserved, text1, Set to 0s
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, M2
• Permission flags: ro, rw; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004, No map page fault, 2FFC, 1008
• TLB: MISS!
Reference 2134
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000, M2 7000
• Memory (pages 0-3): reserved, text1, Set to 0s
• Page Table (entries 0-7): M1, D1001, D1002, no map, no map, no map, no map, M2
• Permission flags: ro, rw; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004, No map page fault, 2FFC, 1008, Page fault
Reference 2134
• Disk Pages (1000-1003): text1, text2, Istatic
• 2 entry TLB: M1 0000, M3 2000
• Memory (pages 0-3): reserved, text1, Set to 0s, Istatic
• Page Table (entries 0-7): M1, D1001, M3, no map, no map, no map, no map, M2
• Permission flags: ro, rw; ro, ro
• References: 0000, 0004, 7FFC, 0008, 2134
• Physical Refs: 0000, Page fault, 1000, 1004, No map page fault, 2FFC, 1008, Page fault, 3134
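The whole walkthrough can be replayed with a small simulation. This is a sketch under stated assumptions: free memory pages are handed out in order (1, 2, 3), the 2 entry TLB uses LRU replacement, and page table reads are not logged as physical references. The slides do not state these policies, but they reproduce the sequence shown. The names `access` and `log` are hypothetical.

```python
from collections import OrderedDict

PAGE_BITS = 12  # 4KB pages
TLB_ENTRIES = 2

# Virtual page -> disk page (string) or memory page number (int).
# Virtual pages with no entry are "no map".
page_table = {0: "D1000", 1: "D1001", 2: "D1002"}
free_frames = [1, 2, 3]  # memory page 0 is reserved for the page table
tlb = OrderedDict()      # virtual page -> memory page, LRU first (assumed)
log = []


def access(vaddr):
    vpn, offset = vaddr >> PAGE_BITS, vaddr & ((1 << PAGE_BITS) - 1)
    if vpn in tlb:
        tlb.move_to_end(vpn)  # refresh LRU position
        event = "TLB hit"
    else:
        entry = page_table.get(vpn)
        if isinstance(entry, int):
            event = "TLB miss"  # page already resident
        else:
            # Disk-backed page -> page fault; unmapped page -> zero-filled page.
            event = "page fault" if entry else "no map page fault"
            entry = free_frames.pop(0)
            page_table[vpn] = entry
        if len(tlb) == TLB_ENTRIES:
            tlb.popitem(last=False)  # evict least recently used translation
        tlb[vpn] = entry
    paddr = (tlb[vpn] << PAGE_BITS) | offset
    log.append((event, paddr))
    return paddr


physical = [access(a) for a in (0x0000, 0x0004, 0x7FFC, 0x0008, 0x2134)]
```

The resulting physical addresses are 1000, 1004, 2FFC, 1008, and 3134, and the final TLB holds the translations for virtual pages 0 and 2, matching the last slide of the example.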
Multiple Processes
• Virtual cache support for multiple processes:
  – Flush the cache on each context switch.
  – Use the process ID (a unique number the operating system gives each process) as part of the tag.
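The second option can be sketched by making the process ID part of the lookup key. The class `TaggedVirtualCache` and the sample values are hypothetical, and the model ignores sets, blocks, and replacement.

```python
class TaggedVirtualCache:
    """Virtual cache whose tag includes the process ID (hypothetical sketch)."""

    def __init__(self):
        self.lines = {}  # (pid, virtual address) -> cached value

    def fill(self, pid, vaddr, value):
        self.lines[(pid, vaddr)] = value

    def lookup(self, pid, vaddr):
        # A line only hits if both the virtual address and the process ID match.
        return self.lines.get((pid, vaddr))  # None means a miss


cache = TaggedVirtualCache()
cache.fill(pid=1, vaddr=0x1000, value="process 1 data")

hit = cache.lookup(pid=1, vaddr=0x1000)   # same process: hit
miss = cache.lookup(pid=2, vaddr=0x1000)  # other process, same address: miss
```

With the process ID in the tag, a context switch no longer requires flushing the cache, since one process cannot hit on another process's lines.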
Multiple Processors
• Can run two programs at the same time
  – Each processor has its own cache. Why?
• May or may not share data
  – Sharing code is not a problem (read only)
    • Example: shared libraries, DLLs
  – Sharing data (read/write) is a problem
    • What if it is in one processor's cache?
  – Solution: Snoopy caches