
Virtual Memory Management

Virtual Memory
- What is the size limit of a process?
  - Does it have to be smaller than physical memory? Not really.
  - Not all parts of the process have to be in memory.
- The limit is set by the address space, not by physical memory.
  - A 32-bit address gives a 4 GB address space.
  - Only part of the 4 GB needs to be in memory.
- If only part of a process address space is in memory, where is the remaining part?
  - In the memory hierarchy

Memory Hierarchy
Cache -> Memory -> Disk (swap space, not files)

Down the hierarchy
- Decreasing cost per bit
- Increasing capacity and access time

Virtual Memory and Demand Paging
- Virtual memory
  - The size of each process can be as large as the address space
  - A page is a unit of the address space
    - For example: 32 bits for addressing, 1 KB per page -> each process has 4M pages
    - Many pages do not actually contain anything
  - Each process uses virtual addresses in its virtual address space
  - Virtual addresses are mapped to physical addresses for memory accesses
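The page count in the example above is just the address space size divided by the page size. A minimal sketch in C, assuming a power-of-two page size; the function name `pages_in_address_space` is hypothetical and not part of the slides:

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages in a virtual address space of 2^address_bits bytes.
 * Assumption: page_size is a power of two. */
uint64_t pages_in_address_space(unsigned address_bits, uint64_t page_size)
{
    assert(address_bits < 64);
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (UINT64_C(1) << address_bits) / page_size;
}
```

Calling it with 32 address bits and 1 KB pages gives the 4M pages quoted on the slide; with 4 KB pages it gives 1M pages.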

Virtual Memory and Demand Paging
- Demand paging
  - All pages (that contain something) are stored in secondary storage (disk swap space)
  - A page is brought into memory only when it is needed; this is called demand paging
  - If a needed page is not in memory
    - A page fault occurs
    - The OS brings the page from swap space into memory

Virtual Memory Issues
- Allocation
  - Not an issue in paging schemes
- Addressing
  - How is a logical address mapped to a physical address?
  - How do we know where the page is?
- Page table
  - A page table is needed to keep track of the pages of a process, which may be in cache, in memory, or on disk
  - How should the page table be stored?
  - How can the page table be searched efficiently?

Virtual Memory Issues
- Replacement
  - A page that has been brought into memory may reside there until it is replaced
  - When a new page is to be brought into memory and there are no free frames, an existing page must be replaced
  - How to choose the page to replace?
- Working set management
  - Working set: the set of pages required during execution
  - What is the best working set size?

Virtual Memory Issues
- Protection
  - Protect the address space of one process from being accessed by another process
- Performance metrics
  - Time for each memory access (time for page table lookup)
  - Number of page faults per unit of time

Performance: Memory Hierarchy
- Caching
  - Another layer in the hierarchy
  - Frequently used pages are stored in the cache
- For a memory access:
  - Is it in the cache? Hit: fetch the page from the cache.
  - Miss: is it in memory? Hit: fetch the page from memory to the cache.
  - Miss: fetch the page from disk swap space to memory, and then to the cache.

Performance: Memory Hierarchy
- Memory hierarchy characteristics
  - Register: ~ 1 ns or less
  - Cache access time: ~ 1 ns to 5 ns
  - Memory access time: ~ 50 ns to 200 ns
  - Disk access time: ~ 1 ms to 5 ms
- Cache hit rate
  - Probability that a word is in the cache
- Memory hit rate
  - Probability that a word is not in the cache but is in memory

Memory Hierarchy (3 Levels)
- Cache: access time = 10 ns, hit rate = 95%
- Memory: access time = 100 ns, hit rate = 99.9%
- Disk: access time = 100 µs (100000 ns)
- Average memory access latency
    0.95 * 10 ns
  + [(1 - 0.95) * 0.999] * (100 + 10) ns
  + [1 - (0.95 + (1 - 0.95) * 0.999)] * (100000 + 100 + 10) ns
  ≈ 9.5 ns + 5.5 ns + 5 ns
  = 20 ns
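The average-latency arithmetic generalizes to any number of levels: each level contributes (probability of reaching it and hitting there) times (the accumulated access time of every level visited so far). The sketch below is one way to write that down, assuming each hit rate is conditional on missing all faster levels and that the last level always hits; `average_latency_ns` is a hypothetical name, not something from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Average access latency (in ns) over a hierarchy of n levels.
 * access_ns[i]: access time of level i on its own.
 * hit_rate[i]:  probability of a hit at level i, given a miss at all
 *               faster levels. Assumption: the last level has rate 1.0. */
double average_latency_ns(const double *access_ns, const double *hit_rate,
                          size_t n)
{
    double reach = 1.0;  /* probability that an access gets down to level i */
    double cost = 0.0;   /* accumulated time of visiting levels 0..i */
    double total = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        cost += access_ns[i];
        total += reach * hit_rate[i] * cost;
        reach *= 1.0 - hit_rate[i];
    }
    return total;
}
```

With access times {10, 100, 100000} ns and hit rates {0.95, 0.999, 1.0} this reproduces the 20 ns figure; with only the first two levels and rates {0.95, 1.0} it gives 15 ns.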

Memory Hierarchy (2 Levels)
- Cache: access time = 10 ns, hit rate = 95%
- Memory: access time = 100 ns
- Average memory access latency
    0.95 * 10 ns + (1 - 0.95) * (100 + 10) ns
  = 9.5 ns + 5.5 ns
  = 15 ns
- This is a simplified example!
  - In actual systems, access time depends on how the page table is implemented.
  - Most of the time, each memory access takes more than one memory reference.

Addressing: Page Table
- Addressing depends on the page table implementation
- The page table can be very large
  - 32-bit address space, 4 KB page size -> 1M pages
- Where to put a page table with so many entries?
  - Store it in a register array
    - Used in the PDP-11
    - Fast, but can only afford a very small page table
  - Store it in main memory
    - One extra memory reference per memory access
    - Still, 1M entries take too much memory
  - Store it on disk
    - Terrible access performance

Page Table for Virtual Memory
- Multi-level page table
  - Allows full indexing
  - Can map pages in memory and on disk
- Inverted page table (IPT)
  - Maps memory frames to pages in processes
  - Only addresses pages in memory
- Translation lookaside buffer (TLB)
  - Only addresses pages in memory
- Both IPT and TLB
  - Still need page tables for pages on disk

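Addressing -- Obtain Frame Number
[Flowchart, reconstructed as steps]
1. Start from the virtual address (page #, offset).
2. Check the cache. If the page is in the cache, get the frame #.
3. If not, check the page table in memory. If the page is in memory, put the page entry into the TLB and get the frame #.
4. If the page is not in memory, a page fault interrupt occurs:
   - The OS initiates I/O to bring the page from disk to memory
   - The CPU switches to another process
   - The I/O transfers the page to memory, replacing a memory page if necessary
   - The page table in memory is updated
5. Get the frame #, bring the page to the cache, and perform the reference.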

Two-Level Page Table
- 4 GB virtual address space (32 bits)
- Assume 1 KB page size
  - The page table has 4M entries for each process
  - Most of the entries (in the middle) are empty
    - Program in the beginning pages and data in the ending pages
- Use another table to index the page table
  - Called the root page table (RPT)
  - Call the one with 4M entries the process page table (PPT)
  - Use 4K entries in the RPT
  - Each entry in the RPT points to the beginning of a run of 1K page entries in the PPT

Two-Level Page Table (for one process)
[Figure: root page table with 4K entries. Successive root entries point to PPT entries 0 - 1023, 1024 - 2047, 2048 - 3071, and so on, up to the last 1024 entries. Each run of 1K PPT entries is contiguous.]

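Two-Level Page Table (for one process)
[Figure: two-level address translation. The virtual address is split into a root page # (rp #), a page # (p #), and an offset. rp # indexes the root page table to obtain a sub-table pointer; p # indexes the partial page table block to obtain the frame #; the frame # and the offset form the physical address of the page frame in main memory.]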

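Addressing in Two-Level PT
- 4 GB address space, 1 KB per page -> 4M pages
- The root page table has 16K entries
  - Use 14 bits for addressing the root page table
- Each partial page table block has 256 entries
  - Use 8 bits for addressing the page table block
- The remaining 10 bits are the page offset
- Example virtual address: 00000000000010 00001010 0000101011
  - Root page table index = 10 (binary) = 2
  - Partial page table block index = 1010 (binary) = 10
  - Page offset = 101011 (binary)
  - Frame number = 21 = 10101 (binary)
  - Physical address: 0000000000000000010101 0000101011
[Figure: root page table entry 2 pointing to the partial page table block that starts at page 512; the indexed entry holds frame 21.]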
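The bit slicing in the example can be sketched as follows, assuming the 14/8/10 split described on the slide. The type `va_fields` and the functions `split_virtual_address` and `physical_address` are hypothetical names; the page table lookup itself is left out, so the frame number is simply passed in.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the slide: 14-bit root index, 8-bit block index,
 * 10-bit page offset (1 KB pages). */
enum { OFFSET_BITS = 10, BLOCK_BITS = 8, ROOT_BITS = 14 };

struct va_fields {
    uint32_t root_index;   /* index into the root page table */
    uint32_t block_index;  /* index into the partial page table block */
    uint32_t offset;       /* byte offset within the page */
};

/* Slice a 32-bit virtual address into its three fields. */
struct va_fields split_virtual_address(uint32_t va)
{
    struct va_fields f;
    f.offset = va & ((1u << OFFSET_BITS) - 1);
    f.block_index = (va >> OFFSET_BITS) & ((1u << BLOCK_BITS) - 1);
    f.root_index = va >> (OFFSET_BITS + BLOCK_BITS);
    return f;
}

/* Combine a frame number (found via the page tables) with the offset. */
uint32_t physical_address(uint32_t frame, uint32_t offset)
{
    assert(offset < (1u << OFFSET_BITS));
    return (frame << OFFSET_BITS) | offset;
}
```

The slide's example address is 0x0008282B in hexadecimal; splitting it yields root index 2, block index 10, and offset 43, and frame 21 with that offset gives physical address 0x542B.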

Inverted Page Table (IPT)
- Inverted: maps frame # to process ID / page #
  - # IPT entries = # frames in physical memory
  - Indexed by physical page frame
- Useful when a frame is modified
  - Easy to locate the corresponding process ID / page #
- Finding the frame # from a page # can be very expensive
  - Use a hash table to map page # to frame #
  - Requires at least two extra memory accesses: one for the hash table, one for the IPT
  - Collisions may occur in the hash table
    - Need an additional field for chaining

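Addressing in IPT
[Figure: the page # of the virtual address is hashed; the hash table entry gives an index into the IPT. On a hit, the frame # and the offset address main memory; on a miss, the page is in the virtual address space on disk.]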

Addressing in IPT
- Given a (process#, page#) tuple, locate the corresponding physical frame
  - Hash function: h = process# XOR page#
- Example 1: find the frame number for (0, 5)
  - h(0, 5) = 0000 XOR 0101 = 0101 = 5
  - hashtable[5] = 6
  - IPT[6] = (0, 5): match
  - (process 0, page 5) is in memory frame 6

Addressing in IPT
- Example 2: find the frame number for (1, 2)
  - h(1, 2) = 0001 XOR 0010 = 0011 = 3
  - hashtable[3] = 3, chained to 0
  - IPT[3] = (2, 1): does not match
  - IPT[0] = (1, 2): match
  - (process 1, page 2) is in memory frame 0

Addressing in IPT
- Example 3: find the frame number for (8, 9)
  - h(8, 9) = 1000 XOR 1001 = 0001 = 1
  - hashtable[1] = 2
  - IPT[2] = (1, 0): does not match, and there is no further chained entry
  - (process 8, page 9) is not in memory
  - Issue a page fault
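The three lookups can be sketched as a hash probe followed by a walk along the collision chain. This is an illustration only: the table sizes, the `next` chaining field, the `NONE` sentinel, and the names `ipt_entry`, `hash_pid_page`, and `ipt_lookup` are assumptions made for the sketch, with the hash taken as the 4-bit XOR used in the examples.

```c
#include <assert.h>

#define NFRAMES 8     /* assumed number of physical frames */
#define NBUCKETS 16   /* 4-bit hash values, as in the examples */
#define NONE (-1)     /* empty bucket or end of chain */

struct ipt_entry {
    int pid;   /* process that owns the frame */
    int page;  /* page of that process held in the frame */
    int next;  /* next frame on the same hash chain, or NONE */
};

/* Hash used in the examples: process# XOR page#. */
static int hash_pid_page(int pid, int page)
{
    return (pid ^ page) % NBUCKETS;
}

/* Return the frame holding (pid, page), or NONE to signal a page fault. */
int ipt_lookup(const int *hashtable, const struct ipt_entry *ipt,
               int pid, int page)
{
    int frame = hashtable[hash_pid_page(pid, page)];
    while (frame != NONE) {
        assert(frame >= 0 && frame < NFRAMES);
        if (ipt[frame].pid == pid && ipt[frame].page == page)
            return frame;
        frame = ipt[frame].next;  /* collision: follow the chain */
    }
    return NONE;
}
```

Filling the tables as in the examples (bucket 5 -> frame 6, bucket 3 -> frame 3 chained to frame 0, bucket 1 -> frame 2) makes the lookups for (0, 5), (1, 2), and (8, 9) return 6, 0, and NONE respectively.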

Translation Lookaside Buffer
- Use the process ID and the logical page number as the key
- Search all entries in parallel
- Get the frame number of the matching entry
[Figure: TLB entries, each holding (PID + logical page #, frame #), with one comparator per entry; the matching entry outputs its frame #.]

Working Set Management
- Working set (resident set)
  - The set of pages that a process needs within a time interval
- Working set size for each process
  - Depends on the time interval considered
  - If it is too small
    - Higher level of multiprogramming
    - Better for time-sharing systems
    - May result in thrashing: page faults occur every few instructions
  - If it is too big
    - May lower the level of concurrency

Working Set Management

    int A[128][128], B[128][128];
    int i, j, x, y;   /* x and y are assumed to be initialized elsewhere */

    for (i = 0; i < 128; i++)
        for (j = 0; j < 128; j++) {
            B[i][j] = A[i][j] + x + y;
            x = (x * i) % 128;
            y = (y * j) % 128;
        }

- Memory
  - Page size = 1 KB; size(int) = 1 word = 4 bytes
  - The program and x, y, i, and j are stored in one page
  - Two rows of A fit in a page: A[0][0]-A[0][127] and A[1][0]-A[1][127]
  - Two rows of B fit in a page: B[0][0]-B[0][127] and B[1][0]-B[1][127]
- What happens if WS = 1, 2, 3, 4, or 5?
  - How many page faults in each case?
  - What is the best working set size?
- What happens if the inner and outer loops are switched?
  - How many page faults now?

Working Set Management
- When to load the working set
  - Load on demand
    - A page is loaded when it is needed
    - Suitable for new jobs
    - It may take some initialization time for a process to get all the pages it needs
  - After being swapped out and swapped back in
    - Loading pages back on demand incurs a big overhead
    - Preload the working set before the process starts to run
- Variable working-set size
  - More page faults -> larger working set size
  - Pages not used for a time period -> remove the page and reduce the working set size

Page Replacement Policies
- Principle of locality
  - A program that references a location l at some point in time is likely to reference the same location l, and locations in the immediate vicinity of l, in the near future
  - 90% of the execution time is spent in loops
  - Most of the time, a program is executed sequentially
  - A program tends to favor a subset of its pages during a time interval

Page Replacement Policies
- Replacement policies
  - When the working set of a process is full and a new page is needed, replace an existing page
  - Which page to replace?
- Global versus process-based
  - Replacement for each process
    - Each process has a working set size
  - Global
    - There is no need to consider the working set size of each process
    - May replace any page in memory

Page Replacement Policies
- Replacement policies
  - When the working set of a process is full and a new page is needed, replace an existing page
  - Which page to replace?
- Policies
  - First-in-first-out policy (FIFO)
  - Least recently used policy (LRU)
  - Clock policy
  - Modified clock policy
  - Aging policy

Page Replacement Policies
- First-in-first-out policy (FIFO)
  - Simple (use a pointer that points to the oldest page)
  - The oldest page may have been used recently
- Least recently used policy (LRU)
  - Replace the least recently used page
  - Complies with the principle of locality
  - High implementation overhead
    - Hard to keep track of all references
    - Use a stack for each process

Clock Policy
- Pointer mrp: tracks the most recently loaded page
- Flag used: indicates whether a page has been used recently
- Page replacement
  - Scan from mrp, which points to the page next to the most recently loaded page, and find the first page with used = 0
  - During scanning, set used to 0 if it was 1
  - If no page with used = 0 is found in the first round, one will definitely be found during the second round

Modified Clock Policy
- Same as before: mrp and used
- Flag dirty
  - Indicates whether a page has been modified
  - dirty = 1 -> the page has to be written out before the flag is cleared to 0
- Principle
  - Same as clock: first try to find a page with used = 0
  - Among those, pages with dirty = 0 should be considered first

Modified Clock Policy
- Page replacement: scan from mrp
  - First scan
    - Find the first page with used = 0, dirty = 0
    - No change to the used and dirty bits
  - Second scan, if no suitable page was found
    - Find the first page with used = 0, dirty = 1
    - During the scan, set used to 0 if it was 1
    - Never change the dirty bit
  - Third scan, if still no suitable page was found
    - Find the first page with used = 0, dirty = 0
  - Fourth scan, if still no suitable page was found
    - Find the first page with used = 0, dirty = 1
    - Will definitely find one now
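The four scans can be condensed into one loop. The following is a sketch under stated assumptions rather than a definitive implementation: frames are kept in parallel `used` and `dirty` arrays, `start` is the frame where scanning begins, and the name `modified_clock_victim` is hypothetical. Writing a dirty victim back to disk is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Pick the frame to replace under the modified clock policy.
 * used/dirty: per-frame flags; start: frame where the scan begins.
 * Only the second scan clears used bits; dirty bits are never changed. */
int modified_clock_victim(bool *used, const bool *dirty, int nframes,
                          int start)
{
    assert(nframes > 0 && start >= 0 && start < nframes);
    for (int scan = 1; scan <= 4; scan++) {
        /* Scans 1 and 3 look for clean pages, scans 2 and 4 for dirty ones. */
        bool want_dirty = (scan % 2 == 0);
        for (int k = 0; k < nframes; k++) {
            int f = (start + k) % nframes;
            if (!used[f] && dirty[f] == want_dirty)
                return f;
            if (scan == 2)
                used[f] = false;
        }
    }
    return start;  /* not reached: after scan 2 every used bit is 0 */
}
```

For frames with used = {1, 1, 1} and dirty = {0, 1, 0}, the first two scans find nothing, the second clears every used bit, and the third selects frame 0.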

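Example for Page Replacement Policies
- Page use pattern: 0 1 5 0 1 4 0 1 2 0 2 3 0 6 3
- Working set size: 3
- FIFO (10 page faults; the first three references fault while the frames fill up)

  Frames (oldest first) | Next reference
  0 1 5 | 0
  0 1 5 | 1
  0 1 5 | 4  (fault)
  1 5 4 | 0  (fault)
  5 4 0 | 1  (fault)
  4 0 1 | 2  (fault)
  0 1 2 | 0
  0 1 2 | 2
  0 1 2 | 3  (fault)
  1 2 3 | 0  (fault)
  2 3 0 | 6  (fault)
  3 0 6 | 3
  3 0 6 |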
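The FIFO count can be checked with a small simulation. This is a minimal sketch, assuming a fixed upper bound `MAX_FRAMES` on the working set size; the function name `fifo_page_faults` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16  /* assumed upper bound on the working set size */

/* Count page faults for a reference string under FIFO replacement. */
int fifo_page_faults(const int *refs, size_t nrefs, size_t nframes)
{
    int frames[MAX_FRAMES];
    size_t loaded = 0;  /* number of frames currently holding a page */
    size_t oldest = 0;  /* frame holding the oldest page once memory is full */
    int faults = 0;

    assert(nframes > 0 && nframes <= MAX_FRAMES);
    for (size_t r = 0; r < nrefs; r++) {
        int hit = 0;
        for (size_t f = 0; f < loaded; f++) {
            if (frames[f] == refs[r]) {
                hit = 1;
                break;
            }
        }
        if (hit)
            continue;  /* a hit does not change the FIFO order */
        faults++;
        if (loaded < nframes) {
            frames[loaded++] = refs[r];       /* fill an empty frame */
        } else {
            frames[oldest] = refs[r];         /* evict the oldest page */
            oldest = (oldest + 1) % nframes;
        }
    }
    return faults;
}
```

Run on the page use pattern above with three frames, it reports 10 page faults.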

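Example for Page Replacement Policies
- Page use pattern: 0 1 5 0 1 4 0 1 2 0 2 3 0 6 3
- LRU (7 page faults; the first three references fault while the frames fill up)
- Minimal page faults required: 7

  Frames (least recently used first) | Next reference
  0 1 5 | 0
  1 5 0 | 1
  5 0 1 | 4  (fault)
  0 1 4 | 0
  1 4 0 | 1
  4 0 1 | 2  (fault)
  0 1 2 | 0
  1 2 0 | 2
  1 0 2 | 3  (fault)
  0 2 3 | 0
  2 3 0 | 6  (fault)
  3 0 6 | 3
  0 6 3 |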
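The LRU count can be reproduced by recording the time of the last use of each frame and evicting the frame with the smallest timestamp. The sketch below uses timestamps instead of the per-process stack mentioned earlier, purely to keep the code short; `MAX_FRAMES` and `lru_page_faults` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16  /* assumed upper bound on the working set size */

/* Count page faults for a reference string under LRU replacement. */
int lru_page_faults(const int *refs, size_t nrefs, size_t nframes)
{
    int frames[MAX_FRAMES];
    size_t last_use[MAX_FRAMES];  /* index of the latest reference per frame */
    size_t loaded = 0;
    int faults = 0;

    assert(nframes > 0 && nframes <= MAX_FRAMES);
    for (size_t r = 0; r < nrefs; r++) {
        size_t f;
        for (f = 0; f < loaded; f++)
            if (frames[f] == refs[r])
                break;
        if (f == loaded) {                 /* miss */
            faults++;
            if (loaded < nframes) {
                f = loaded++;              /* fill an empty frame */
            } else {
                f = 0;                     /* find the least recently used frame */
                for (size_t g = 1; g < nframes; g++)
                    if (last_use[g] < last_use[f])
                        f = g;
            }
            frames[f] = refs[r];
        }
        last_use[f] = r;                   /* hit or miss: page was just used */
    }
    return faults;
}
```

Run on the page use pattern above with three frames, it reports 7 page faults.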

Example for Page Replacement Policies
- Page use pattern: 0 1 5 0 1 4 0 1 2 0 2 3 0 6 3
- Clock (9 page faults)
- * marks a page whose used bit is set; scanning starts at the page next to the most recently loaded page

  Frames    | Next reference
  0*        | 1
  0* 1*     | 5
  0* 1* 5*  | 0
  0* 1* 5*  | 1
  0* 1* 5*  | 4
      Scan for a page with used = 0, clearing used bits on the way: 0 1* 5*, 0 1 5*, 0 1 5.
      The first round finishes with no page selected; starting again, page 0 is selected.
  4* 1 5    | 0
      Scan for a page with used = 0: page 1 is selected.
  4* 0* 5   | 1
      Scan for a page with used = 0: page 5 is selected.
  4* 0* 1*  | 2
      Scan, clearing used bits on the way: 4 0 1.
      The first round finishes with no page selected; starting again, page 4 is selected.
  2* 0 1    | 0
  2* 0* 1   | 2
  2* 0* 1   | 3
      Scan, clearing the used bit of page 0: page 1 is selected.
  2* 0 3*   | 0
  2* 0* 3*  | 6
      Scan, clearing used bits on the way; starting again, page 2 is selected.
  6* 0 3    | 3
  6* 0 3*   |
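The clock trace can be replayed with a short simulation. This is a sketch assuming a fixed bound `MAX_FRAMES` and a hand that is left on the frame after the most recently loaded page; the names `clock_page_faults` and `hand` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FRAMES 16  /* assumed upper bound on the working set size */

/* Count page faults for a reference string under the clock policy. */
int clock_page_faults(const int *refs, size_t nrefs, size_t nframes)
{
    int frames[MAX_FRAMES];
    bool used[MAX_FRAMES];
    size_t loaded = 0;
    size_t hand = 0;  /* frame next to the most recently loaded page */
    int faults = 0;

    assert(nframes > 0 && nframes <= MAX_FRAMES);
    for (size_t r = 0; r < nrefs; r++) {
        size_t f;
        for (f = 0; f < loaded; f++)
            if (frames[f] == refs[r])
                break;
        if (f < loaded) {          /* hit: mark the page as recently used */
            used[f] = true;
            continue;
        }
        faults++;
        if (loaded < nframes) {
            f = loaded++;          /* fill an empty frame */
        } else {
            /* Clear used bits until a page with used = 0 is found. */
            while (used[hand]) {
                used[hand] = false;
                hand = (hand + 1) % nframes;
            }
            f = hand;
        }
        frames[f] = refs[r];
        used[f] = true;
        hand = (f + 1) % nframes;
    }
    return faults;
}
```

Run on the page use pattern above with three frames, it reports 9 page faults, matching the trace.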

Example for Page Replacement Policies
- Page use/write pattern: 0 1w 5 0 1w 4w 0 2 0 1w 2w 4 (w marks a write)
- Modified clock policy (* marks used = 1, x marks a modified page)

  Frames      | Next reference
  0*          | 1w
  0* 1*x      | 5
  0* 1*x 5*   | 0
  0* 1*x 5*   | 1w
  0* 1*x 5*   | 4w
      First scan, for (used=0, dirty=0): no page selected.
      Second scan, for (used=0, dirty=1), clearing used bits on the way: 0 1x 5, still no page selected.
      Third scan, for (used=0, dirty=0): page 0 selected.
  4*x 1x 5    | 0
      Scan for (used=0, dirty=0): page 5 selected.
  4*x 1x 0*   | 2
      First scan, for (used=0, dirty=0): no page selected.
      Second scan, for (used=0, dirty=1), clearing used bits on the way: 4x 1x 0*, page 1 selected.
      Write page 1 to disk.
  4x 2* 0*    | 0
  4x 2* 0*    | 1w
      First scan, for (used=0, dirty=0): no page selected.
      Second scan, for (used=0, dirty=1), clearing used bits on the way: 4x 2* 0, page 4 selected.
      Write page 4 back to disk and replace it.
  1*x 2* 0    | 2w
  1*x 2*x 0   | 4
  1x 2x 4*    |

Other Issues for VM
- Global page replacement
  - Consider all processes together
  - E.g., the Linux page replacement policy
    - Aging policy with an 8-bit aging vector
- Load control
  - How many processes should reside in memory?
    - With a global policy, this cannot be controlled by the working set size
  - Control the level of multiprogramming
    - Too many processes -> high frequency of page faults
    - Too few processes -> reduced level of concurrency

Other Issues for VM
- Global page replacement policies
  - All the page replacement policies discussed previously can be used as global policies
- Aging policy
  - Keep an aging vector for each page
  - Increase the aging vector whenever the page is accessed
  - The system periodically scans the aging vectors and reduces their values
  - A page is to be removed when its aging vector reaches 0
    - It does not need to be actually removed until a replacement request is received
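The aging rules can be sketched with an 8-bit counter per page. The slides do not say by how much the value grows on an access or shrinks on a scan, so the constants below (`AGE_BOOST` of 3, a decrement of 1 per scan) are assumptions chosen only for illustration, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants: the real increment and decay amounts are not given. */
enum { AGE_MAX = 255, AGE_BOOST = 3 };

/* Called whenever the page is accessed: raise the age, saturating at 255. */
void age_on_access(uint8_t *age)
{
    assert(age != NULL);
    *age = (uint8_t)((*age > AGE_MAX - AGE_BOOST) ? AGE_MAX
                                                  : *age + AGE_BOOST);
}

/* Called by the periodic scan: lower the age, stopping at 0. */
void age_on_periodic_scan(uint8_t *age)
{
    assert(age != NULL);
    if (*age > 0)
        (*age)--;
}

/* A page whose age has reached 0 may be taken on the next replacement. */
bool is_replacement_candidate(uint8_t age)
{
    return age == 0;
}
```

A page that is accessed once and then left alone becomes a replacement candidate after three scans under these assumed constants; a frequently accessed page stays well above 0.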

Other Issues in VM
- Protection
  - How to ensure that each process accesses only its own memory?
  - Is it a problem in virtual memory?
  - Why are there access violations?
  - How about buffer overflow attacks?
- Frame locking
  - Lock pages that cannot be replaced
    - E.g., OS code and data
  - Lock a page that has just been brought in until it is used
    - Process A is switched out when it has a page fault
    - When the page is brought in, A may not be scheduled to run right away

Readings
- Sections 1.5, 1.6
- Sections 8.1, 8.2
