The Memory Hierarchy CPSC 321 Andreas Klappenecker
Page 1: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

The Memory Hierarchy CPSC 321

Andreas Klappenecker

Page 2: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Some Results from the Survey

• Issues with the CS curriculum
  • CPSC 111 Computer Science Concepts & Programming
  • CPSC 310 Databases
  • CPSC 431 Software Engineering

• Something from the wish list:
  • More C++
  • More Software Engineering
  • More focus on industry needs
  • Less focus on industry needs

Page 3: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Some Results from the Survey

• Why (MIPS) assembly language?
• More detailed explanations of programming language xyz.
• Implement slightly reduced version of the Pentium 4 or Athlon processors
• Have another computer architecture class
• Lack of information on CS website about specialization...

Page 4: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Follow Up

• CPSC 462 Microcomputer Systems
• CPSC 410 Operating Systems

• Go to seminars/lectures by Bjarne Stroustrup, Jaakko Jarvi, or Gabriel Dos Reis

Page 5: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Today’s Menu

Caches

Page 6: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Memory

Current memory is largely implemented in CMOS technology. Two alternatives:

• SRAM
  • fast, but not area efficient
  • value stored in a pair of inverting gates
• DRAM
  • slower, but more area efficient
  • value stored as charge on a capacitor (must be refreshed)

Page 7: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Static RAM

Page 8: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Static RAM

Page 9: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Dynamic RAM

Page 10: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Dynamic RAM

Page 11: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Memory

• Users want large and fast memories
  • SRAM is too expensive for main memory
  • DRAM is too slow for many purposes

• Compromise
  • Build a memory hierarchy

[Figure: levels in the memory hierarchy. The CPU sits above Level 1, Level 2, ..., Level n; the distance from the CPU in access time increases down the levels, and the figure also shows the size of the memory at each level.]

Page 12: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Locality

• If an item is referenced, then
  • it will tend to be referenced again soon (temporal locality)
  • nearby items will tend to be referenced soon (spatial locality)

• Why does code have locality?
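
One way to approach this question is to look at the addresses a simple loop touches. The sketch below is an illustration only and is not taken from the slides: it assumes 4-byte words, an array stored contiguously at a made-up base address, and a running total kept at one fixed (also made-up) address.

```python
# Hypothetical data-address trace for:  for i in range(n): total += a[i]
# Assumptions (not from the slides): 4-byte words, array stored
# contiguously at BASE, running total kept at TOTAL_ADDR.
WORD_BYTES = 4
BASE = 0x1000        # made-up start address of the array
TOTAL_ADDR = 0x0FF0  # made-up address of the variable `total`


def loop_trace(n):
    """Return the data addresses touched by the summation loop."""
    trace = []
    for i in range(n):
        trace.append(BASE + i * WORD_BYTES)  # read a[i]
        trace.append(TOTAL_ADDR)             # update total
    return trace


trace = loop_trace(4)
array_reads = trace[0::2]
total_accesses = trace[1::2]

# Spatial locality: consecutive array reads are exactly one word apart.
strides = [b - a for a, b in zip(array_reads, array_reads[1:])]

# Temporal locality: the same address is touched on every iteration.
distinct_total_addrs = set(total_accesses)
```

The instructions of the loop body behave like `total` here: the same few addresses are fetched again on every iteration.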

Page 13: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Memory Hierarchy

• The memory is organized as a hierarchy
  • the data in a level closer to the processor is a subset of the data in any level further away
  • the memory can consist of multiple levels, but data is typically copied between two adjacent levels at a time

• Initially, we focus on two levels

Page 14: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Memory Hierarchy

Page 15: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Two Level Hierarchy

• Upper level (smaller and faster)
• Lower level (slower)
• A unit of information that is present or not within a level is called a block
• If data requested by the processor is in the upper level, then this is called a hit; otherwise it is called a miss
• If a miss occurs, then data will be retrieved from the lower level. Typically, an entire block is transferred

Page 16: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Cache

A cache represents some level of memory between CPU and main memory

[More general definitions are often used]

Page 17: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

A Toy Example

• Assumptions
  • Suppose that processor requests are each one word,
  • and that each block consists of one word

• Example
  • Before request C = [X1, X2, …, Xn-1]
  • Processor requests Xn not contained in C
  • Item Xn is brought from the memory to the cache
  • After the request C = [X1, X2, …, Xn-1, Xn]

• Issues
  • What happens if the cache is full?
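
The toy example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lecture's implementation: the cache is modelled as a Python dict with unlimited capacity (so the "cache is full" issue is deliberately ignored), and the names `request`, `memory`, and the stored values are made up.

```python
def request(cache, memory, address):
    """Serve a one-word request; return (value, hit)."""
    if address in cache:
        return cache[address], True   # hit: item already in the cache
    value = memory[address]           # miss: fetch from the lower level
    cache[address] = value            # bring the item into the cache
    return value, False


# Made-up contents: three one-word items, two of them already cached.
memory = {"X1": 10, "X2": 20, "X3": 30}
cache = {"X1": 10, "X2": 20}

first = request(cache, memory, "X3")   # X3 not in the cache: miss
second = request(cache, memory, "X3")  # X3 now in the cache: hit
```

After the first request the cache holds X1, X2, and X3, mirroring the step from [X1, …, Xn-1] to [X1, …, Xn-1, Xn].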

Page 18: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Issues

• How do we know whether the data item is in the cache?

• If it is, how do we find it?

• Simple strategy: direct mapped cache
  • exactly one location where data might be in the cache

Page 19: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

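Direct Mapped Cache

• Mapping: address modulo the number of blocks in the cache, x -> x mod B

[Figure: a cache with blocks indexed 000 to 111, and memory addresses 00001, 00101, 01001, 01101, 10001, 10101, 11001, 11101; each address maps to the cache block given by its low-order three bits, 001 or 101.]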
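
The mapping x -> x mod B can be checked directly on the memory addresses listed on this slide. The sketch assumes B = 8 cache blocks, inferred from the 3-bit cache indices; the function name `cache_index` is made up for illustration.

```python
NUM_BLOCKS = 8  # assumption: 8 cache blocks, i.e. a 3-bit index


def cache_index(block_address, num_blocks=NUM_BLOCKS):
    """Direct-mapped placement: address modulo the number of blocks."""
    return block_address % num_blocks


addresses = [0b00001, 0b00101, 0b01001, 0b01101,
             0b10001, 0b10101, 0b11001, 0b11101]

# Map each 5-bit address (as a bit string) to its 3-bit cache index.
mapping = {format(a, "05b"): format(cache_index(a), "03b") for a in addresses}
```

Because B is a power of two, the modulo operation simply keeps the low-order three bits, so several memory addresses compete for the same cache block.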

Page 20: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

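Direct Mapped Cache

• Cache with 1024 = 2^10 words
• The tag from the cache is compared against the upper portion of the address
• If the tag equals the upper 20 bits of the address and the valid bit is set, then we have a cache hit; otherwise it is a cache miss

What kind of locality are we taking advantage of?

[Figure: 32-bit address (showing bit positions) split into a 20-bit tag (bits 31 to 12), a 10-bit index (bits 11 to 2), and a 2-bit byte offset (bits 1 to 0). The index selects one of 1024 entries (0 to 1023), each holding a valid bit, a 20-bit tag, and 32 bits of data; the outputs are Hit and Data.]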
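
The address split and the hit test for this 1024-word cache can be sketched as follows. The bit layout (20-bit tag, 10-bit index, 2-bit byte offset) is from the slide; the function names, the tuple layout of a cache entry, and the sample address are assumptions made for illustration.

```python
def split_address(addr):
    """Split a 32-bit byte address into (tag, index, byte_offset)."""
    byte_offset = addr & 0x3          # bits 1..0
    index = (addr >> 2) & 0x3FF       # bits 11..2, selects one of 1024 entries
    tag = (addr >> 12) & 0xFFFFF      # bits 31..12
    return tag, index, byte_offset


def lookup(cache, addr):
    """Return (hit, data); a hit needs a set valid bit and a matching tag."""
    tag, index, _ = split_address(addr)
    valid, stored_tag, data = cache[index]
    hit = valid and stored_tag == tag
    return hit, (data if hit else None)


# Each entry is (valid, tag, data); all entries start out invalid.
cache = [(False, 0, 0)] * 1024

addr = 0x12345678                      # made-up sample address
before = lookup(cache, addr)           # miss: entry is invalid
tag, index, _ = split_address(addr)
cache[index] = (True, tag, 0xCAFE)     # fill the entry with made-up data
after = lookup(cache, addr)            # hit: valid bit set and tag matches
```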

Page 21: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Direct Mapped Cache Example

Page 22: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Direct Mapped Cache Example

Page 23: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Direct Mapped Cache Example

Page 24: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

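Direct Mapped Cache

• Taking advantage of spatial locality:

[Figure: 32-bit address (showing bit positions) split into a 16-bit tag (bits 31 to 16), a 12-bit index (bits 15 to 4), a 2-bit block offset (bits 3 to 2), and a 2-bit byte offset (bits 1 to 0). The cache has 4K entries, each holding a valid bit, a 16-bit tag, and 128 bits of data (four 32-bit words); a mux controlled by the block offset selects one 32-bit word. The outputs are Hit and Data.]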
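
To see how four-word blocks exploit spatial locality, the sketch below splits addresses using the layout on this slide (16-bit tag, 12-bit index, 2-bit block offset, 2-bit byte offset) and counts misses for a sequential read. The function names and the access pattern are assumptions chosen for illustration.

```python
def split_address(addr):
    """Split a 32-bit byte address into (tag, index, block_offset, byte_offset)."""
    byte_offset = addr & 0x3           # bits 1..0
    block_offset = (addr >> 2) & 0x3   # bits 3..2, selects a word in the block
    index = (addr >> 4) & 0xFFF        # bits 15..4, selects one of 4K entries
    tag = (addr >> 16) & 0xFFFF        # bits 31..16
    return tag, index, block_offset, byte_offset


def count_misses(addresses):
    """Count misses in an initially empty direct-mapped cache."""
    tags = {}  # index -> tag of the block currently stored there
    misses = 0
    for addr in addresses:
        tag, index, _, _ = split_address(addr)
        if tags.get(index) != tag:     # empty entry or different block: miss
            misses += 1
            tags[index] = tag          # the whole 4-word block is loaded
    return misses


# Assumed access pattern: 16 consecutive words starting at address 0.
sequential = [4 * i for i in range(16)]
misses = count_misses(sequential)
```

Only the first word of each block misses here; the three words that follow it are already in the cache because the entire block was transferred.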

Page 25: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Hits vs. Misses

• Read hits
  • this is what we want!

• Read misses
  • stall the CPU, fetch block from memory, deliver to cache, restart

• Write hits:
  • can replace data in cache and memory (write-through)
  • write the data only into the cache (write-back the cache later)

• Write misses:
  • read the entire block into the cache, then write the word
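
The difference between the two write-hit policies shows up in the number of writes that reach memory. The following is a minimal sketch, assuming one-word blocks and writes that always hit; the class names, the `evict` method, and the dirty-set bookkeeping are illustrative choices, not taken from the slides.

```python
class WriteThroughCache:
    """Every write updates both the cache and memory."""

    def __init__(self):
        self.lines = {}
        self.memory_writes = 0

    def write(self, memory, addr, value):
        self.lines[addr] = value
        memory[addr] = value           # memory is updated immediately
        self.memory_writes += 1


class WriteBackCache:
    """Writes update only the cache; memory is updated on eviction."""

    def __init__(self):
        self.lines = {}
        self.dirty = set()             # addresses modified since loading
        self.memory_writes = 0

    def write(self, memory, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)           # memory is now stale for this address

    def evict(self, memory, addr):
        if addr in self.dirty:         # write back only modified data
            memory[addr] = self.lines[addr]
            self.memory_writes += 1
            self.dirty.discard(addr)
        self.lines.pop(addr, None)


mem_wt, mem_wb = {0x40: 0}, {0x40: 0}
wt, wb = WriteThroughCache(), WriteBackCache()
for value in (1, 2, 3):                # three writes to the same word
    wt.write(mem_wt, 0x40, value)
    wb.write(mem_wb, 0x40, value)
stale_value = mem_wb[0x40]             # write-back: memory not yet updated
wb.evict(mem_wb, 0x40)
```

Three writes to one word cost three memory writes under write-through but only one under write-back, at the price of memory being temporarily out of date.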

Page 26: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Hits vs. Miss Example

Page 34: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

What Block Size?

• A larger block size tends to reduce cache misses
• But the cache miss penalty increases with the block size
• We need to balance these two constraints
• How can we measure cache performance?
• How can we improve cache performance?
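
The two opposing effects can be made concrete with a rough sketch. Everything numeric here is hypothetical: the penalty model (a fixed latency plus a per-word transfer cost), its default values, and the 64-word sequential read are assumptions chosen only to show the direction of each effect. The cache is assumed large enough that nothing is evicted.

```python
def sequential_misses(num_words, block_words):
    """Misses when reading num_words consecutive words from a cold cache."""
    seen_blocks = set()
    misses = 0
    for word in range(num_words):
        block = word // block_words
        if block not in seen_blocks:   # first touch of this block: miss
            seen_blocks.add(block)
            misses += 1
    return misses


def miss_penalty(block_words, latency=10, cycles_per_word=2):
    """Hypothetical penalty in cycles: fixed latency plus per-word transfer."""
    return latency + cycles_per_word * block_words


# block size in words -> (number of misses, penalty per miss in cycles)
results = {b: (sequential_misses(64, b), miss_penalty(b)) for b in (1, 2, 4, 8)}
```

As the block size grows, the miss count for this access pattern falls while the penalty per miss rises, which is the balance the slide refers to. Real workloads with less spatial locality would benefit less from large blocks than this sequential read does.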

Page 35: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

The performance of a cache depends on many parameters:

• Memory stall clock cycles

• Read stall clock cycles

• Write stall clock cycles
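
The slide names these quantities without defining them. A standard textbook formulation, assumed here rather than taken from the slides, relates them as follows:

```latex
\begin{align*}
\text{Memory-stall clock cycles} &= \text{Read-stall cycles} + \text{Write-stall cycles} \\
\text{Read-stall cycles} &= \frac{\text{Reads}}{\text{Program}} \times \text{Read miss rate} \times \text{Read miss penalty} \\
\text{Write-stall cycles} &= \left( \frac{\text{Writes}}{\text{Program}} \times \text{Write miss rate} \times \text{Write miss penalty} \right) + \text{Write buffer stalls}
\end{align*}
```

The write buffer term applies to write-through caches that queue writes to memory; it is zero when the buffer never fills.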

Page 39: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Cache Block Mapping

• Direct mapped cache
  • a block goes in exactly one place in the cache

• Fully associative
  • a block can go anywhere in the cache
  • difficult to find a block
  • parallel comparison to speed up search

Page 40: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Cache Block Mapping

• Set associative
  • Each block maps to a unique set, and the block can be placed into any element of that set
  • Position is given by (Block number) modulo (# of sets in cache)
  • If the sets contain n elements, then the cache is called n-way set associative
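
The three placement schemes can be viewed as one structure with different numbers of sets and ways. The sketch below is an illustration under assumptions: the class name is made up, blocks are tracked by block number only, and a full set evicts its oldest block (FIFO), a replacement policy the slides do not specify.

```python
class SetAssociativeCache:
    """n-way set-associative cache that tracks block numbers only."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [[] for _ in range(num_sets)]

    def access(self, block_number):
        """Return True on a hit; on a miss, load the block and return False."""
        # Position: (block number) modulo (number of sets in the cache).
        blocks = self.sets[block_number % self.num_sets]
        if block_number in blocks:     # search every element of the set
            return True
        if len(blocks) == self.ways:
            blocks.pop(0)              # assumption: evict the oldest (FIFO)
        blocks.append(block_number)
        return False


def count_hits(cache, trace):
    return sum(cache.access(b) for b in trace)


trace = [0, 8, 0, 8]                   # two blocks that collide modulo 8
direct_hits = count_hits(SetAssociativeCache(num_sets=8, ways=1), trace)
two_way_hits = count_hits(SetAssociativeCache(num_sets=4, ways=2), trace)
fully_hits = count_hits(SetAssociativeCache(num_sets=1, ways=8), trace)
```

All three caches hold eight blocks. With one way per set (direct mapped) blocks 0 and 8 keep evicting each other, while the 2-way and fully associative configurations keep both and hit on the repeated accesses.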

Page 41: The Memory Hierarchy CPSC 321 Andreas Klappenecker.

Cache Types