Page 1
CS 152 Computer Architecture and Engineering
CS 252 Graduate Computer Architecture
Lecture 9 – Virtual Memory
Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152
Page 2
Last time in Lecture 8
§ Protection and translation required for multiprogramming
– Base and bounds was early simple scheme
§ Page-based translation and protection avoids need for memory compaction, easy allocation by OS
– But need to indirect in large page table on every access
§ Address spaces accessed sparsely
– Can use multi-level page table to hold translation/protection information, but implies multiple memory accesses per reference
§ Address space access with locality
– Can use “translation lookaside buffer” (TLB) to cache address translations (sometimes known as address translation cache)
– Still have to walk page tables on TLB miss, can be hardware or software walk
§ Virtual memory uses DRAM as a “cache” of disk memory, allows very cheap main memory
Page 3
Modern Virtual Memory Systems
Illusion of a large, private, uniform store
Protection & Privacy
– several users, each with their private address space and one or more shared address spaces
– page table ≡ name space
Demand Paging
– Provides the ability to run programs larger than the primary memory
– Hides differences in machine configurations
The price is address translation on each memory reference
[Figure: OS and user i address spaces; VA → PA mapping through the TLB; Primary Memory backed by Secondary Storage]
Page 4
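Recap: Hierarchical Page Table
RISC-V Sv32 Virtual Memory Scheme
[Figure: the 32-bit virtual address from the CPU is split into p1 (bits 31–22, 10-bit L1 index), p2 (bits 21–12, 10-bit L2 index), and offset (bits 11–0). The root of the current page table (a processor register, satp in RISC-V) locates the Level 1 page table; p1 selects an entry pointing to a Level 2 page table; p2 selects an entry pointing to a data page in physical memory. Legend: page in primary memory, page in secondary memory, PTE of a nonexistent page.]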
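The field split and two-level lookup in the figure can be sketched in a few lines. This is a minimal illustration, not the Sv32 PTE format: page tables are modeled as Python dicts, PTE flag bits are ignored, and the names `split_va` and `walk` are made up for this example.

```python
PAGE_BITS = 12    # offset is bits 11..0 (4 KiB pages)
INDEX_BITS = 10   # p1 and p2 are 10-bit indices

def split_va(va):
    """Split a 32-bit virtual address into (p1, p2, offset)."""
    offset = va & ((1 << PAGE_BITS) - 1)
    p2 = (va >> PAGE_BITS) & ((1 << INDEX_BITS) - 1)
    p1 = (va >> (PAGE_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return p1, p2, offset

def walk(root, va):
    """Two-level walk. root maps p1 -> level-2 table, which maps p2 -> PPN.
    Returns the physical address, or None when no mapping exists."""
    p1, p2, offset = split_va(va)
    level2 = root.get(p1)
    if level2 is None:
        return None               # no level-2 table for this region
    ppn = level2.get(p2)
    if ppn is None:
        return None               # PTE of a nonexistent page
    return (ppn << PAGE_BITS) | offset
```

For example, `walk({1: {3: 0x80}}, 0x00403ABC)` looks up p1 = 1, p2 = 3 and keeps the offset 0xABC.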
Page 5
Recap: Page-Based Virtual-Memory Machine (Hardware Page-Table Walk)
§ Assumes page tables held in untranslated physical memory
[Figure: pipeline with PC, Inst. TLB, Inst. Cache, Decode (D), E, M, Data TLB, Data Cache, and W stages. Each TLB takes a virtual address and produces a physical address for its cache, and can signal a page fault or protection violation. On a TLB miss, the hardware page-table walker uses the page-table base register and sends physical addresses through the memory controller to main memory (DRAM).]
Page 6
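Address Translation: putting it all together
[Flowchart]
Virtual Address → TLB Lookup (hardware)
– hit → Protection Check
– miss → Page Table Walk (hardware or software)
• the page is ∈ memory → Update TLB, then repeat the TLB lookup
• the page is ∉ memory → Page Fault (OS loads page) (software)
Protection Check (hardware)
– permitted → Physical Address (to cache)
– denied → Protection Fault (software) → SEGFAULT
Where?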
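The flowchart can be traced with a small software model. This is a sketch under stated assumptions: the TLB and page table are plain dicts keyed by VPN, each entry is a dict with hypothetical fields `ppn`, `present`, and `perms`, and faults are modeled as Python exceptions rather than traps.

```python
class PageFault(Exception):
    pass

class ProtectionFault(Exception):
    pass

def translate(va, access, tlb, page_table, page_bits=12):
    """Follow the flowchart: TLB lookup, walk on miss, protection check."""
    vpn = va >> page_bits
    offset = va & ((1 << page_bits) - 1)
    entry = tlb.get(vpn)
    if entry is None:                        # TLB miss: walk the page table
        entry = page_table.get(vpn)
        if entry is None or not entry["present"]:
            raise PageFault(vpn)             # page not in memory: OS must load it
        tlb[vpn] = entry                     # update TLB
    if access not in entry["perms"]:         # protection check
        raise ProtectionFault(vpn)
    return (entry["ppn"] << page_bits) | offset
```

A successful call leaves the translation cached in `tlb`, so a second access to the same page skips the walk.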
Page 7
Page-Fault Handler
§ When the referenced page is not in DRAM:
– The missing page is located (or created)
– It is brought in from disk, and page table is updated
• Another job may be run on the CPU while the first job waits for the requested page to be read from disk
– If no free pages are left, a page is swapped out
• Pseudo-LRU replacement policy, implemented in software
§ Since it takes a long time to transfer a page (msecs), page faults are handled completely in software by the OS
– Untranslated addressing mode is essential to allow kernel to access page tables
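The handler steps above can be sketched as a toy pager. The slide only says "pseudo-LRU"; the clock (second-chance) policy used here is one common LRU approximation chosen for illustration, not necessarily what any particular OS does. The class name `Pager` and its fields are hypothetical, and disk transfers are not modeled.

```python
class Pager:
    """Toy demand pager with clock (second-chance) replacement."""

    def __init__(self, num_frames):
        self.frames = [None] * num_frames   # frame number -> resident VPN
        self.ref = [False] * num_frames     # per-frame reference bit
        self.page_table = {}                # VPN -> frame number
        self.hand = 0                       # clock hand
        self.faults = 0

    def access(self, vpn):
        frame = self.page_table.get(vpn)
        if frame is None:                   # page fault
            frame = self._handle_fault(vpn)
        self.ref[frame] = True              # mark as recently used
        return frame

    def _handle_fault(self, vpn):
        self.faults += 1
        frame = self._pick_victim()
        old = self.frames[frame]
        if old is not None:
            del self.page_table[old]        # swap the old page out
        self.frames[frame] = vpn            # bring the new page in
        self.page_table[vpn] = frame        # update the page table
        return frame

    def _pick_victim(self):
        # Sweep the hand; a set reference bit buys the page one more pass.
        while True:
            f = self.hand
            self.hand = (self.hand + 1) % len(self.frames)
            if self.frames[f] is None or not self.ref[f]:
                return f
            self.ref[f] = False
```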
Page 8
Handling VM-related exceptions
[Figure: pipeline with PC, Inst TLB, Inst. Cache, Decode (D), E, M, Data TLB, Data Cache, and W stages; both the instruction and data TLBs can signal: TLB miss? Page Fault? Protection violation?]
§ Handling a TLB miss needs a hardware or software mechanism to refill TLB
§ Handling page fault (e.g., page is on disk) needs restartable exception so software handler can resume after retrieving page
– Precise exceptions are easy to restart
– Can be imprecise but restartable, but this complicates OS software
§ A protection violation may abort process
– But often handled the same as a page fault
Page 9
Address Translation in CPU Pipeline
§ Need to cope with additional latency of TLB:
– slow down the clock?
– pipeline the TLB and cache access?
– virtual address caches
– parallel TLB/cache access
Page 10
Virtual-Address Caches
[Figure: CPU → (VA) → TLB → (PA) → Physical Cache → (PA) → Primary Memory]
Alternative: place the cache before the TLB
[Figure: CPU → (VA) → Virtual Cache → (VA) → TLB → (PA) → Primary Memory (StrongARM)]
§ one-step process in case of a hit (+)
§ cache needs to be flushed on a context switch unless address space identifiers (ASIDs) included in tags (-)
§ aliasing problems due to the sharing of pages (-)
§ maintaining cache coherence (-)
Page 11
Virtually Addressed Cache (Virtual Index/Virtual Tag)
[Figure: pipeline with PC, Inst. Cache, Decode (D), E, M, Data Cache, and W stages; both caches are accessed with virtual addresses. The Inst. TLB and Data TLB sit behind the caches and translate on a miss, sending physical addresses for instruction and data through the memory controller to main memory (DRAM). On a TLB miss, the hardware page-table walker uses the page-table base register.]
Page 12
Aliasing in Virtual-Address Caches
Two virtual pages share one physical page
[Figure: page table entries for VA1 and VA2 point to the same data page at PA; the cache (Tag, Data) holds a 1st copy of the data at PA under tag VA1 and a 2nd copy under tag VA2]
Virtual cache can have two copies of same physical data. Writes to one copy not visible to reads of other!
General Solution: Prevent aliases coexisting in cache
Software (i.e., OS) solution for direct-mapped cache
– VAs of shared pages must agree in cache index bits; this ensures all VAs accessing same PA will conflict in direct-mapped cache (early SPARCs)
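The OS rule for direct-mapped caches amounts to a check on index bits. A minimal sketch, assuming a direct-mapped cache whose size and block size are powers of two; the function names are made up for this example.

```python
def cache_index(va, cache_bytes, block_bytes):
    """Index of the direct-mapped cache line that a virtual address selects."""
    return (va % cache_bytes) // block_bytes

def aliases_conflict(va1, va2, cache_bytes, block_bytes):
    """True if two VAs of the same physical data select the same cache line,
    so only one copy can be resident at a time."""
    return cache_index(va1, cache_bytes, block_bytes) == \
           cache_index(va2, cache_bytes, block_bytes)
```

With a 16 KiB cache and 32-byte blocks, mappings at 0x10040 and 0x24040 select the same line and are safe, while 0x10040 and 0x25040 select different lines and could hold two copies of the same data.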
Page 13
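Concurrent Access to TLB & Cache (Virtual Index/Physical Tag)
[Figure: the VA is split into VPN and a k-bit page offset; the low bits of the VA supply the virtual index L and the block offset b. The TLB translates VPN to PPN while the direct-mapped cache of 2^L blocks (2^b-byte blocks) is indexed; the physical tag read from the cache is compared with the PPN to produce hit? and the data.]
Index L is available without consulting the TLB → cache and TLB accesses can begin simultaneously!
Tag comparison is made after both accesses are completed
Cases: L + b = k, L + b < k, L + b > k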
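The three cases can be worked out for concrete sizes. A small sketch, assuming a direct-mapped cache with power-of-two cache, block, and page sizes; `vipt_case` is a hypothetical helper name.

```python
def log2_exact(n):
    """Base-2 logarithm of a power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError("expected a power of two")
    return n.bit_length() - 1

def vipt_case(cache_bytes, block_bytes, page_bytes):
    """Return (L, b, k, relation) for a direct-mapped cache."""
    b = log2_exact(block_bytes)                  # block offset bits
    L = log2_exact(cache_bytes // block_bytes)   # index bits
    k = log2_exact(page_bytes)                   # page offset bits
    if L + b == k:
        relation = "L+b=k"    # index exactly fills the untranslated bits
    elif L + b < k:
        relation = "L+b<k"    # index lies entirely inside the page offset
    else:
        relation = "L+b>k"    # some index bits come from the VPN
    return L, b, k, relation
```

With 4 KiB pages and 32-byte blocks, a 4 KiB cache gives L + b = k, a 2 KiB cache gives L + b < k, and a 16 KiB cache gives L + b > k, the case in which index bits are translated bits.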
Page 14
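Virtual-Index Physical-Tag Caches: Associative Organization
[Figure: VA fields VPN, a, L = k − b, and b; the k-bit page offset supplies the virtual index. 2^a direct-mapped banks of 2^L blocks each are indexed in parallel with the TLB lookup; the PPN is compared against the physical tag from each bank to produce hit? and select the data.]
After the PPN is known, 2^a physical tags are compared
How does this scheme scale to larger caches?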
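One way to read the scaling question: if the index must stay within the page offset (L = k − b), each way holds at most one page, so capacity can only grow through associativity. A short sketch, assuming power-of-two sizes; `min_ways_for_vipt` is a hypothetical helper name.

```python
def min_ways_for_vipt(cache_bytes, page_bytes):
    """Smallest associativity 2^a such that each way is no larger than a page.
    Returns (ways, a)."""
    ways = max(1, cache_bytes // page_bytes)
    a = ways.bit_length() - 1     # log2 of a power of two
    return ways, a
```

For a 32 KiB cache with 4 KiB pages this gives 8 ways (a = 3), so 8 physical tags are compared per access.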
Page 15
CS152 Administrivia
§ PS 2 due Wednesday Feb 26
§ Midterm in class Monday March 2
– Covers lectures 1–9, plus assigned problem sets, labs, book readings
§ Lab 2 due Monday March 9
Page 16
CS252
CS252 Administrivia
§ Start thinking of class projects and forming teams of two
§ Proposal due Wednesday February 26th
§ Proposal should be one page PDF including:
– Title
– Team member names
– What are you trying to do?
– How is it done today?
– What is your idea for improvement and why do you think you’ll be successful?
– What infrastructure are you going to use for your project?
– Project timeline with milestones
§ Mail PDF of proposal to instructors
§ Give a <5-minute presentation in class in discussion section time on March 11th
§ No discussion on Monday March 4th – midterm!
Page 17
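Concurrent Access to TLB & Large L1
The problem with L1 > Page size
[Figure: VA fields VPN, a, Page Offset, b. The virtual index, which includes the a bits above the page offset, selects a line in the direct-mapped L1 PA cache while the TLB translates VPN to PPN; the PPN of the PA (PPN, Page Offset, b) is compared with the stored tag to produce hit?. Two L1 lines, one indexed by VA1 and one by VA2, each hold (PPN_a, Data).]
Can VA1 and VA2 both map to PA?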
Page 18
A solution via Second-Level Cache
[Figure: CPU with register file (RF), L1 Instruction Cache and L1 Data Cache, backed by a Unified L2 Cache and Memory]
Usually a common L2 cache backs up both Instruction and Data L1 caches
L2 is “inclusive” of both Instruction and Data caches
• Inclusive means L2 has copy of any line in either L1
Page 19
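Anti-Aliasing Using L2 [MIPS R10000, 1996]
[Figure: VA fields VPN, a, Page Offset, b. The virtual index selects a line in the direct-mapped L1 PA cache, whose lines hold (PPN_a, Data) at the indices given by VA1 and VA2; the TLB produces the PA (PPN, Page Offset, b) for the tag comparison (hit?). The Direct-Mapped L2 holds (PA, a1, Data): the PPN goes into the L2 tag, and a1 records the virtual index bits of the L1 copy.]
§ Suppose VA1 and VA2 both map to PA and VA1 is already in L1, L2 (VA1 ≠ VA2)
§ After VA2 is resolved to PA, a collision will be detected in L2.
§ VA1 will be purged from L1 and L2, and VA2 will be loaded ⇒ no aliasing!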
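The purge-on-collision behavior can be modeled in a few lines. This is a simplified sketch, not the R10000 design: both caches are unbounded dicts, L2 never evicts, and L2 records for each physical line which L1 index currently holds it (the role played by the a1 field). The class name and return strings are hypothetical.

```python
class AntiAliasCaches:
    """Virtually indexed L1 plus an L2 that tracks where each line sits in L1."""

    def __init__(self):
        self.l1 = {}   # virtual index -> physical line held at that index
        self.l2 = {}   # physical line -> L1 index of its copy, or None

    def access(self, v_index, pa_line):
        if self.l1.get(v_index) == pa_line:
            return "hit"
        victim = self.l1.pop(v_index, None)
        if victim is not None:
            self.l2[victim] = None          # victim stays in L2, leaves L1
        result = "miss"
        old_index = self.l2.get(pa_line)
        if old_index is not None:
            # L2 shows the line already lives in L1 under another virtual index.
            del self.l1[old_index]          # purge the alias
            result = "alias purged"
        self.l1[v_index] = pa_line          # load under the new index
        self.l2[pa_line] = v_index
        return result
```

After any sequence of accesses, a given physical line appears at most once in `l1`, which is the no-aliasing property the slide describes.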
Page 20
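Anti-Aliasing using L2 for a Virtually Tagged L1
[Figure: VA fields VPN, Page Offset, b. The L1 VA cache uses virtual index & tag and holds (VA1, Data) and (VA2, Data). The TLB produces the PA (PPN, Page Offset, b), which supplies the physical index & tag for the L2 PA cache. L2 “contains” L1 and stores (PA, VA1, Data), where VA1 is a “virtual tag”.]
Physically-addressed L2 can also be used to avoid aliases in virtually-addressed L1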
Page 21
CS252
Atlas Revisited
§ One PAR for each physical page
§ PARs contain the VPNs of the pages resident in primary memory
§ Advantage: The size is proportional to the size of the primary memory
§ What is the disadvantage?
[Figure: the VPN is matched against the PARs to produce the PPN]
Page 22
CS252
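Hashed Page Table: Approximating Associative Addressing
[Figure: the virtual address is split into VPN and d. The VPN and PID are hashed to an Offset, which is added to the Base of Table to give the PA of the PTE in primary memory. Page table entries hold (VPN, PID, PPN) for resident pages or (VPN, PID, DPN) for non-resident pages.]
§ Hashed Page Table is typically 2 to 3 times larger than the number of PPNs to reduce collision probability
§ It can also contain DPNs for some non-resident pages (not common)
§ If a translation cannot be resolved in this table then the software consults a data structure that has an entry for every existing page (e.g., full page table)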
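The lookup path in the figure can be sketched as follows. This is a minimal model under stated assumptions: one entry per slot, an XOR-and-modulo hash invented for illustration, and a `None` return standing in for "fall back to the full page table in software".

```python
class HashedPageTable:
    """One (vpn, pid, ppn) entry per slot; a miss defers to a full table."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots

    def _hash(self, vpn, pid):
        # Illustrative hash only; real designs choose their own function.
        return (vpn ^ pid) % len(self.slots)

    def insert(self, vpn, pid, ppn):
        i = self._hash(vpn, pid)
        entry = self.slots[i]
        if entry is not None and entry[:2] != (vpn, pid):
            return False          # collision: mapping lives only in the full table
        self.slots[i] = (vpn, pid, ppn)
        return True

    def lookup(self, vpn, pid):
        entry = self.slots[self._hash(vpn, pid)]
        if entry is not None and entry[:2] == (vpn, pid):
            return entry[2]       # PPN found in the hashed table
        return None               # unresolved: software consults the full table
```

Storing the VPN and PID in each entry is what lets the lookup reject a slot that a different page happens to hash to.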
Page 23
CS252
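PowerPC: Hashed Page Table
[Figure: the 80-bit VA is split into VPN and d. The VPN is hashed to an Offset, which is added to the Base of Table to give the PA of the Slot in primary memory; the slot's (VPN, PPN) entries are compared against the VPN.]
§ Each hash table slot has 8 PTEs <VPN, PPN> that are searched sequentially
§ If the first hash slot fails, an alternate hash function is used to look in another slot
– All these steps are done in hardware!
§ Hashed Table is typically 2 to 3 times larger than the number of physical pages
§ The full backup Page Table is managed in software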
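The slot structure described above (8 PTEs per slot, sequential search, an alternate hash on failure) can be sketched in software. Both hash functions here are placeholders chosen for illustration and are not claimed to match the PowerPC definition; the class name is hypothetical.

```python
SLOT_SIZE = 8   # PTEs per hash table slot

class GroupedHashTable:
    def __init__(self, num_slots):
        self.slots = [[] for _ in range(num_slots)]

    def _primary(self, vpn):
        return vpn % len(self.slots)        # placeholder primary hash

    def _secondary(self, vpn):
        return (~vpn) % len(self.slots)     # placeholder alternate hash

    def insert(self, vpn, ppn):
        for h in (self._primary(vpn), self._secondary(vpn)):
            if len(self.slots[h]) < SLOT_SIZE:
                self.slots[h].append((vpn, ppn))
                return True
        return False    # both slots full: only the backup table has the mapping

    def lookup(self, vpn):
        for h in (self._primary(vpn), self._secondary(vpn)):
            for entry_vpn, ppn in self.slots[h]:   # sequential search of the slot
                if entry_vpn == vpn:
                    return ppn
        return None     # miss: consult the software-managed backup page table
```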
Page 24
VM features track historical uses:
§ Bare machine, only physical addresses
– One program owned entire machine
§ Batch-style multiprogramming
– Several programs sharing CPU while waiting for I/O
– Base & bound: translation and protection between programs (supports swapping entire programs but not demand-paged virtual memory)
– Problem with external fragmentation (holes in memory), needed occasional memory defragmentation as new jobs arrived
§ Time sharing
– More interactive programs, waiting for user. Also, more jobs/second.
– Motivated move to fixed-size page translation and protection, no external fragmentation (but now internal fragmentation, wasted bytes in page)
– Motivated adoption of virtual memory to allow more jobs to share limited physical memory resources while holding working set in memory
§ Virtual Machine Monitors
– Run multiple operating systems on one machine
– Idea from 1970s IBM mainframes, now common on laptops
• e.g., run Windows on top of Mac OS X
– Hardware support for two levels of translation/protection
• Guest OS virtual -> Guest OS physical -> Host machine physical
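The two levels of translation compose as two page-table lookups in sequence. A minimal sketch, assuming single-level dict page tables mapping page number to page number and 4 KiB pages; the function names are hypothetical and faults are reduced to a `None` return.

```python
PAGE_BITS = 12

def translate_page(addr, table):
    """Map an address through a {page number: page number} table."""
    page = table.get(addr >> PAGE_BITS)
    if page is None:
        return None
    return (page << PAGE_BITS) | (addr & ((1 << PAGE_BITS) - 1))

def nested_translate(guest_va, guest_table, host_table):
    """Guest OS virtual -> guest OS physical -> host machine physical."""
    guest_pa = translate_page(guest_va, guest_table)
    if guest_pa is None:
        return None               # fault at the guest level
    return translate_page(guest_pa, host_table)
```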
Page 25
Virtual Memory Use Today - 1
§ Servers/desktops/laptops/smartphones have full demand-paged virtual memory
– Portability between machines with different memory sizes
– Protection between multiple users or multiple tasks
– Share small physical memory among active tasks
– Simplifies implementation of some OS features
§ Vector supercomputers have translation and protection but rarely complete demand-paging (Older Crays: base & bound, Japanese & Cray X1/X2: pages)
– Don’t waste expensive CPU time thrashing to disk (make jobs fit in memory)
– Mostly run in batch mode (run set of jobs that fits in memory)
– Difficult to implement restartable vector instructions
§ Modern GPUs operate similarly to vector supercomputers, with translation and protection but not demand paging
Page 26
Virtual Memory Use Today - 2
§ Most embedded processors and DSPs provide physical addressing only
– Can’t afford area/speed/power budget for virtual memory support
– Often there is no secondary storage to swap to!
– Programs custom written for particular memory configuration in product
– Difficult to implement precise or restartable exceptions for exposed architectures
Page 27
Acknowledgements
§ This course is partly inspired by previous MIT 6.823 and Berkeley CS252 computer architecture courses created by my collaborators and colleagues:
– Arvind (MIT)
– Joel Emer (Intel/MIT)
– James Hoe (CMU)
– John Kubiatowicz (UCB)
– David Patterson (UCB)