Lecture-19 (Virtual Memory and Caches) CS422-Spring 2018 Biswa@CSE-IITK
CS422: Spring 2018 Biswabandan Panda, CSE@IITK 2
Summary
• Reducing hit time
1. Small and simple caches
2. Way prediction
3. Trace caches
• Increasing cache bandwidth
4. Pipelined caches
5. Multibanked caches
6. Nonblocking caches
• Reducing Miss Penalty
7. Critical word first
8. Early Restart
• Reducing Miss Rate
9. Victim Cache
10. Hardware prefetching
11. Compiler prefetching
12. Compiler Optimizations
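Inclusion: Cache Hierarchy
[Figure: three cache hierarchy organizations, labeled Inclusive, Non-Inclusive, and Exclusive. Each panel shows core requests, fills, and victims moving between L1/L2, the LLC, and memory. The Inclusive panel adds a back-invalidation from the LLC to L1/L2; the Non-Inclusive panel shows an LLC eviction; in the Exclusive panel the L1/L2 victim fills the LLC.]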
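The difference between the three organizations can be sketched with a toy two-level model. This is only an illustration under stated assumptions: the `Level` class, the `access` and `simulate` helpers, the fully associative LRU levels, and the tiny capacities are all hypothetical and not taken from the slides.

```python
from collections import OrderedDict


class Level:
    """One cache level: fully associative, LRU (an assumed simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # oldest entry first

    def __contains__(self, addr):
        return addr in self.blocks

    def touch(self, addr):
        self.blocks.move_to_end(addr)  # mark as most recently used

    def insert(self, addr):
        """Add addr as most recently used; return the evicted victim or None."""
        victim = None
        if addr not in self.blocks and len(self.blocks) >= self.capacity:
            victim, _ = self.blocks.popitem(last=False)
        self.blocks[addr] = True
        self.blocks.move_to_end(addr)
        return victim

    def remove(self, addr):
        self.blocks.pop(addr, None)


def access(upper, llc, addr, policy):
    """Handle one core request; policy is 'inclusive', 'non-inclusive' or 'exclusive'."""
    if addr in upper:
        upper.touch(addr)
        return "upper hit"
    if addr in llc:
        result = "LLC hit"
        if policy == "exclusive":
            llc.remove(addr)  # the block moves up, it is not copied
        else:
            llc.touch(addr)
    else:
        result = "miss"
        if policy != "exclusive":
            llc_victim = llc.insert(addr)  # fill the LLC from memory
            if policy == "inclusive" and llc_victim is not None:
                upper.remove(llc_victim)  # back-invalidation
    upper_victim = upper.insert(addr)  # fill the upper level
    if policy == "exclusive" and upper_victim is not None:
        llc.insert(upper_victim)  # the upper-level victim fills the LLC
    return result


def simulate(policy, trace, upper_capacity=2, llc_capacity=4):
    """Run a trace of block addresses; return both levels and per-access results."""
    upper, llc = Level(upper_capacity), Level(llc_capacity)
    results = [access(upper, llc, addr, policy) for addr in trace]
    return upper, llc, results
```

With the trace `[1, 2, 1, 3, 1, 4, 1, 5]`, block 1 keeps hitting in the upper level, so the LLC never sees those hits and eventually evicts it. Under the inclusive policy that eviction back-invalidates block 1 from the upper level, while under the non-inclusive policy block 1 stays in the upper level even though the LLC no longer holds it.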
Cache Design Space
• Several interacting dimensions
– cache size
– block size
– associativity
– replacement policy
– write-through vs write-back
– write allocation
• The optimal choice is a compromise
– depends on access characteristics
• workload
• use (I-cache, D-cache, TLB)
– depends on technology / cost
• Simplicity often wins
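[Figure: trade-off sketch with a vertical axis from Bad to Good and a horizontal axis from Less to More, for associativity, cache size, and block size; two opposing curves are labeled Factor A and Factor B.]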
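One way to see how these dimensions interact is through the address breakdown: for a set-associative cache, the number of sets is cache size / (block size × associativity), which fixes how many address bits serve as block offset and index. The sketch below assumes power-of-two sizes; the helper names `log2_exact` and `cache_geometry` are hypothetical.

```python
def log2_exact(n):
    """Return log2(n) for a power of two; raise ValueError otherwise."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


def cache_geometry(cache_size, block_size, associativity):
    """Derive set count and address-field widths (sizes in bytes)."""
    num_sets = cache_size // (block_size * associativity)
    return {
        "num_sets": num_sets,
        "offset_bits": log2_exact(block_size),  # byte within the block
        "index_bits": log2_exact(num_sets),     # selects the set
    }
```

For example, a 32 KiB cache with 64-byte blocks has 64 sets (6 index bits) when 8-way set-associative, but 512 sets (9 index bits) when direct-mapped: raising associativity at fixed capacity shrinks the index.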
Address Translation – Welcome to the Virtual World
[Figure: CPU → (VA) → TLB → (PA) → L1$ → LLC → DRAM]
Address Translation
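[Figure: the CPU issues VA 0x1002, which passes through the TLB to L1$, LLC, and DRAM; the page table supplies the translation.]
Page table:
VP0 → PP8
VP1 → PP3
VP2 → PP2
...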
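The translation of VA 0x1002 can be worked through with a small sketch. The slide does not state the page size, so 4 KiB pages are an assumption here; the `translate` function and the dictionary representation of the page table are hypothetical too.

```python
PAGE_SIZE = 4096  # assumed page size; not stated on the slide

# Page table from the slide: virtual page number -> physical page number
PAGE_TABLE = {0: 8, 1: 3, 2: 2}


def translate(va, page_table=PAGE_TABLE, page_size=PAGE_SIZE):
    """Translate a virtual address; an unmapped page raises LookupError."""
    vpn, offset = divmod(va, page_size)  # split into page number and offset
    if vpn not in page_table:
        raise LookupError(f"page fault: VPN {vpn} is not mapped")
    return page_table[vpn] * page_size + offset  # the offset is unchanged
```

Under the 4 KiB assumption, 0x1002 splits into VPN 1 and offset 0x002; VP1 maps to PP3, so `translate(0x1002)` gives 0x3002.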
Virtual World?
▪ Need to cope with additional latency of TLB:
- slow down the clock?
- pipeline the TLB and cache access?
- virtual address caches
- parallel TLB/cache access
[Figure: pipeline PC → Inst. TLB → Inst. Cache → D (Decode) → E → M → Data TLB → Data Cache → W. Both the Inst. TLB and the Data TLB are annotated "TLB miss? Page Fault? Protection violation?".]
Why Virtual? (CS330?)
Caches: Virtual or Physical?
[Figure: physical cache: CPU → (VA) → TLB → (PA) → Physical Cache → (PA) → Memory]
Alternative: place the cache before the TLB
[Figure: virtual cache: CPU → (VA) → Virtual Cache → (VA) → TLB → (PA) → Memory]
▪ one-step process in case of a hit (+)
▪ cache needs to be flushed on a context switch unless address space identifiers (ASIDs) are included in tags (-)
▪ aliasing problems due to the sharing of pages (-)
▪ maintaining cache coherence (-)
Cache and Virtual Memory
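[Figure: three organizations]
physical cache: CPU → (VA) → TLB → (PA) → cache → lower hier.
virtual (L1) cache: CPU → (VA) → cache → (VA) → TLB → (PA) → lower hier.
virtual-physical cache: CPU → (VA) → cache and TLB accessed in parallel → (PA) → lower hier.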
Virtual Caches
[Figure: pipeline (PC, Inst. Cache, D/Decode, E, M, Data Cache, W) in which the instruction and data caches are accessed with virtual addresses. The Inst. TLB and Data TLB sit behind the caches and are consulted on a miss, backed by a hardware page table walker that uses the page-table base register. Physical addresses go to the memory controller and main memory (DRAM).]
Translate on miss
Problem: Aliasing (Synonyms, Homonyms?)
[Figure: VA1 and VA2 map through the page table to the same PA in the data pages. The cache (Tag, Data) holds a 1st copy of the data at PA under tag VA1 and a 2nd copy under tag VA2.]
Two virtual pages share one physical page
A virtual cache can hold two copies of the same physical data. Writes to one copy are not visible to reads of the other!
General Solution: Prevent aliases coexisting in cache
Software (i.e., OS) solution for direct-mapped cache
VAs of shared pages must agree in the cache index bits; this ensures that all VAs accessing the same PA conflict in a direct-mapped cache (early SPARCs)
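The index-agreement rule can be checked with a short sketch. The cache parameters below (16 KiB direct-mapped, 32-byte blocks) and the function names are illustrative assumptions, not the configuration of any particular SPARC.

```python
CACHE_SIZE = 16 * 1024  # assumed direct-mapped cache size in bytes
BLOCK_SIZE = 32         # assumed block size in bytes


def cache_index(va, cache_size=CACHE_SIZE, block_size=BLOCK_SIZE):
    """Line selected by a virtual address in a direct-mapped cache."""
    num_lines = cache_size // block_size
    return (va // block_size) % num_lines


def aliases_can_coexist(va1, va2):
    """True if two VAs for the same PA land in different cache lines."""
    return cache_index(va1) != cache_index(va2)
```

With these parameters, 0x4010 and 0x18010 select the same line, so as aliases of one PA they evict each other and only one copy is ever cached. 0x4010 and 0x5010 select different lines, so the OS would have to avoid that pairing for a shared page.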
Homonyms (Virtual Tags)
[Figure: VA1 maps through the page table to two different PAs in the data pages.]
One virtual page maps to two physical pages (the same VA in different address spaces)
Tag may not uniquely identify cache data
Solution: Add ASID with tag
Or Physical tags
Or Flush on context switch
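The homonym problem and two of the fixes (ASID in the tag, flush on context switch) can be sketched with a toy virtually tagged cache. The `VirtualCache` class, its dictionary storage, and the whole-address tags are hypothetical simplifications.

```python
class VirtualCache:
    """Toy virtually tagged cache; optionally extends the tag with an ASID."""

    def __init__(self, use_asid):
        self.use_asid = use_asid
        self.lines = {}

    def _tag(self, asid, va):
        # Without an ASID, the same VA from any process yields the same tag.
        return (asid, va) if self.use_asid else va

    def fill(self, asid, va, data):
        self.lines[self._tag(asid, va)] = data

    def lookup(self, asid, va):
        """Return the cached data, or None on a miss."""
        return self.lines.get(self._tag(asid, va))

    def flush(self):
        """Alternative fix: empty the cache on every context switch."""
        self.lines.clear()
```

If process 1 fills VA 0x1000 and process 2 then looks up VA 0x1000, the cache without ASIDs returns process 1's data (a false hit), while the cache with ASIDs reports a miss.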
Virtual Memory (Virtually Indexed Physically Tagged)
• If C ≤ (page_size × associativity), where C is the cache capacity, the cache index bits come only from the page offset (same in VA and PA)
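• If both cache and TLB are on chip
– index both arrays concurrently using VA bits
– check cache tag (physical) against TLB output at the end
[Figure: the VA is split into VPN and page offset. The VPN goes to the TLB, which outputs the PPN; the page-offset bits supply the index and block offset for the physical cache. The stored tag is compared (=) with the PPN to produce "cache hit?" alongside "TLB hit?".]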
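The capacity condition can be checked numerically. In the sketch below the function names and the example parameters (32 KiB, 8-way, 64-byte blocks, 4 KiB pages) are assumptions chosen for illustration.

```python
def index_within_page_offset(cache_size, page_size, associativity):
    """True if the set index uses only page-offset bits (C <= page_size * assoc)."""
    return cache_size <= page_size * associativity


def set_index(addr, cache_size, block_size, associativity):
    """Set selected by an address, virtual or physical."""
    num_sets = cache_size // (block_size * associativity)
    return (addr // block_size) % num_sets


def physical_address(va, ppn, page_size):
    """Build the PA: translation replaces the VPN and keeps the page offset."""
    return ppn * page_size + va % page_size
```

For a 32 KiB, 8-way cache with 4 KiB pages the condition holds (32 KiB ≤ 4 KiB × 8), so `set_index` returns the same set for a VA and for its PA whatever PPN the TLB supplies. That is why the cache can be indexed before translation finishes.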
Virtual Memory (Virtually Indexed Physically Tagged)
• If C > (page_size × associativity), the cache index bits include VPN bits, so synonyms can cause problems
– The same physical address can exist in two locations
• Solutions?
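[Figure: same organization as on the previous slide (VPN to the TLB, tag compared with the PPN, "cache hit?" and "TLB hit?"), except that the cache index now extends beyond the page offset into the low VPN bits.]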
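How many places a synonym can land in follows from the number of index bits taken from the VPN. The sketch below is an illustration under assumed parameters (64 KiB, 4-way, 64-byte blocks, 4 KiB pages); `vpn_index_bits` and `set_index` are hypothetical helper names.

```python
def vpn_index_bits(cache_size, page_size, associativity):
    """Number of set-index bits that come from the VPN (0 if none)."""
    way_size = cache_size // associativity  # bytes covered by index + block offset
    if way_size <= page_size:
        return 0
    return (way_size // page_size).bit_length() - 1  # log2 for powers of two


def set_index(va, cache_size, block_size, associativity):
    """Set selected by a virtual address."""
    num_sets = cache_size // (block_size * associativity)
    return (va // block_size) % num_sets
```

With a 64 KiB, 4-way cache and 4 KiB pages, each way spans 16 KiB, so 2 index bits come from the VPN and one physical block has 4 candidate sets. The synonyms 0x10345 and 0x21345 (same page offset 0x345, different low VPN bits) then index different sets.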
• VIVT (Virtual cache)
• VIPT
• PIVT
• PIPT (Physical cache)
Four Possibilities
• VIVT (Virtual cache): Fastest, but suffers from Synonyms and Homonyms
• VIPT: Good enough, No Homonyms, mostly in L1 caches
• PIVT: ??
• PIPT (Physical cache): You know it
Four Possibilities: Cache Performance
Caches+TLB+Page-Table+Prefetcher