Chapter 8: System Memory
Dr Mohamed Menacer
Taibah University
2007-2008
Memory Characteristics
  Location
  Capacity
  Unit of transfer
  Access method
  Performance
  Physical type
  Physical characteristics
  Organisation
Location
  CPU
  Internal
  External
Capacity
  Word size
    The natural unit of organisation
  Number of words (or bytes)
Unit of Transfer
  Internal
    Usually governed by data bus width
  External
    Usually a block, which is much larger than a word
  Addressable unit
    Smallest location that can be uniquely addressed
    Word internally
    Cluster on disk (Microsoft file systems)
Access Methods (1)
  Sequential
    Start at the beginning and read through in order
    Access time depends on location of data and previous location
    e.g. tape
  Direct
    Individual blocks have a unique address
    Access is by jumping to the vicinity, then a sequential search
    Access time depends on location and previous location
    e.g. disk
Access Methods (2)
  Random
    Individual addresses identify locations exactly
    Access time is independent of location or previous access
    e.g. RAM
  Associative
    Data is located by a comparison with contents of a portion of the store
    Access time is independent of location or previous access
    e.g. cache
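The difference between random and associative access can be sketched in a few lines of Python. This is only an illustration: the names `ram`, `store` and `associative_lookup` are made up here, and a real associative memory compares all entries in parallel in hardware rather than in a loop.

```python
# Random access: the address itself selects the location.
ram = [10, 20, 30, 40]          # hypothetical 4-word memory
value_at_2 = ram[2]             # address 2 identifies the word exactly

# Associative access: data is found by comparing a key against
# part of each entry's contents (here, a stored tag).
store = [
    {"tag": 0x1A, "data": "block A"},
    {"tag": 0x2B, "data": "block B"},
    {"tag": 0x3C, "data": "block C"},
]

def associative_lookup(entries, tag):
    """Return the data whose stored tag matches, or None if nothing matches.

    Hardware would compare every entry at once; the loop only models the result.
    """
    for entry in entries:
        if entry["tag"] == tag:
            return entry["data"]
    return None

print(value_at_2)                       # 30
print(associative_lookup(store, 0x2B))  # block B
print(associative_lookup(store, 0x99))  # None
```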
Memory Hierarchy
  Registers
    In CPU
  Internal or main memory
    May include one or more levels of cache
    "RAM"
  External memory
    Backing store
[Figure: Memory Hierarchy - Diagram]
Hierarchy List
  Registers
  L1 Cache
  L2 Cache
  Main memory
  Disk cache
  Disk
  Optical
  Tape
Performance
  Access time
    Time between presenting the address and getting the valid data
  Memory cycle time
    Time may be required for the memory to "recover" before the next access
    Cycle time is access + recovery
  Transfer rate
    Rate at which data can be moved
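As a small worked example of these definitions, the sketch below computes cycle time from access and recovery time. The figures (60 ns access, 40 ns recovery) are invented for illustration. The transfer-rate function assumes a random-access memory that delivers one word per cycle, so that the rate is the reciprocal of the cycle time.

```python
def cycle_time_ns(access_ns, recovery_ns):
    """Cycle time is access time plus recovery time."""
    return access_ns + recovery_ns

def transfer_rate_words_per_s(cycle_ns):
    """Assumes one word per memory cycle: rate = 1 / cycle time."""
    return 1e9 / cycle_ns

cycle = cycle_time_ns(60, 40)            # illustrative values, not measurements
rate = transfer_rate_words_per_s(cycle)

print(cycle)  # 100
print(rate)   # 10000000.0
```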
Physical Types
  Semiconductor
    RAM
  Magnetic
    Disk & tape
  Optical
    CD & DVD
  Others
    Hologram
Physical Characteristics
  Decay (e.g. DRAM, whose stored charge leaks away unless refreshed)
  Volatility (e.g. RAM)
  Erasable (e.g. EPROM, EEPROM)
  Power consumption
Organisation
  Physical arrangement of bits into words
  Not always obvious
    e.g. interleaved
Cache Memory
Cache
  Small amount of fast memory
  Sits between normal main memory and CPU
  May be located on CPU chip or module
[Figure: Cache/Main Memory Structure]
Cache operation – overview
  CPU requests contents of memory location
  Check cache for this data
  If present, get from cache (fast)
  If not present, read required block from main memory to cache
  Then deliver from cache to CPU
  Cache includes tags to identify which block of main memory is in each cache slot
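The read sequence above can be sketched as a short Python model. It is a simplification under stated assumptions: the class `SimpleCache`, the 4-byte `BLOCK_SIZE` and the hit/miss counters are illustrative names, the cache is unbounded (no replacement policy), and the block number itself serves as the tag.

```python
BLOCK_SIZE = 4  # bytes per block (assumed for this sketch)

class SimpleCache:
    def __init__(self, main_memory):
        self.main_memory = main_memory  # bytes-like backing store
        self.slots = {}                 # tag (block number) -> cached block
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_number = address // BLOCK_SIZE
        offset = address % BLOCK_SIZE
        if block_number in self.slots:
            # Present: served from the cache (fast path).
            self.hits += 1
        else:
            # Not present: read the whole block from main memory into the cache.
            self.misses += 1
            start = block_number * BLOCK_SIZE
            self.slots[block_number] = self.main_memory[start:start + BLOCK_SIZE]
        # Either way, the data is delivered from the cache.
        return self.slots[block_number][offset]

memory = bytes(range(16))   # toy main memory: byte i holds the value i
cache = SimpleCache(memory)
first = cache.read(5)       # miss: block 1 is loaded
second = cache.read(6)      # hit: same block
print(first, second, cache.hits, cache.misses)  # 5 6 1 1
```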
Size does matter
  Cost
    More cache is expensive
  Speed
    More cache is faster (up to a point)
    Checking cache for data takes time
[Figure: Typical Cache Organization]
Mapping Function
  Cache of 64 KBytes
  Cache block of 4 bytes
    i.e. cache is 16K (2^14) lines of 4 bytes
  16 MBytes main memory
  24-bit address
    (2^24 = 16M)
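The arithmetic behind these figures can be checked with a few lines of Python. The constant names are chosen here for illustration, and the `bit_length() - 1` trick gives log2 only because every size involved is a power of two.

```python
CACHE_BYTES = 64 * 1024           # 64 KByte cache
BLOCK_BYTES = 4                   # 4-byte cache block
MEMORY_BYTES = 16 * 1024 * 1024   # 16 MByte main memory

cache_lines = CACHE_BYTES // BLOCK_BYTES        # number of cache lines
line_bits = cache_lines.bit_length() - 1        # log2 for a power of two
address_bits = MEMORY_BYTES.bit_length() - 1    # bits needed to address memory
memory_blocks = MEMORY_BYTES // BLOCK_BYTES     # blocks in main memory

print(cache_lines)    # 16384
print(line_bits)      # 14
print(address_bits)   # 24
print(memory_blocks)  # 4194304
```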
Direct Mapping
  Each block of main memory maps to only one cache line
    i.e. if a block is in cache, it must be in one specific place
  Address is in two parts
  Least significant w bits identify a unique word
  Most significant s bits specify one memory block
  The MSBs are split into a cache line field of r bits and a tag of s-r bits (most significant)
Direct Mapping Address Structure
  Field:  Tag (s-r)   Line or Slot (r)   Word (w)
  Bits:   8           14                 2
  24-bit address
  2-bit word identifier (4-byte block)
  22-bit block identifier
    8-bit tag (= 22 - 14)
    14-bit slot or line
  No two blocks that map to the same line have the same Tag field
  Check contents of cache by finding the line and checking the Tag
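A minimal sketch of how a 24-bit address splits into the three fields above, using shifts and masks. The function name `split_address` and the sample address 0x16339C are chosen for illustration only.

```python
WORD_BITS = 2    # w: selects a byte within the 4-byte block
LINE_BITS = 14   # r: selects one of 2^14 cache lines
TAG_BITS = 8     # s - r: distinguishes blocks that share a line

def split_address(address):
    """Split a 24-bit address into (tag, line, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word

tag, line, word = split_address(0x16339C)
print(f"{tag:02X} {line:04X} {word}")  # 16 0CE7 0
```

To check the cache, the line field indexes the cache directly and the stored tag of that line is compared with the tag field of the address.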
[Figure: Direct Mapping Cache Organization]
[Figure: Direct Mapping Example]
Direct Mapping pros & cons
  Simple
  Inexpensive
  Fixed location for given block
    If a program repeatedly accesses 2 blocks that map to the same line, the cache miss rate is very high
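The drawback can be demonstrated with a toy simulation, assuming the 8/14/2 address split used earlier in this chapter. The function `simulate` and both address traces are invented for illustration; only tags are tracked, not data.

```python
WORD_BITS = 2
LINE_BITS = 14

def simulate(addresses):
    """Count (hits, misses) for a direct-mapped cache over an address trace."""
    lines = {}           # line index -> tag currently held in that line
    hits = misses = 0
    for address in addresses:
        line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
        tag = address >> (WORD_BITS + LINE_BITS)
        if lines.get(line) == tag:
            hits += 1
        else:
            misses += 1
            lines[line] = tag   # the new block evicts whatever was there
    return hits, misses

# Two blocks with different tags that map to the same line (line 1).
conflicting = [0x000004, 0x010004] * 4
# Two blocks that map to different lines (lines 1 and 2).
friendly = [0x000004, 0x000008] * 4

print(simulate(conflicting))  # (0, 8)
print(simulate(friendly))     # (6, 2)
```

In the conflicting trace each access evicts the block the next access needs, so every access misses even though the rest of the cache is empty.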
[Figure: Pentium 4 Block Diagram]
Pentium 4 Core Processor
  Fetch/Decode Unit
    Fetches instructions from L2 cache
    Decodes them into micro-ops
    Stores micro-ops in L1 cache
  Out-of-order execution logic
    Schedules micro-ops
    Based on data dependence and resources
    May speculatively execute
  Execution units
    Execute micro-ops
    Data from L1 cache
    Results in registers
  Memory subsystem
    L2 cache and system bus
Pentium 4 Cache
  80386 – no on-chip cache
  80486 – 8k using 16-byte lines and four-way set associative organization
  Pentium (all versions) – two on-chip L1 caches
    Data & instructions
  Pentium III – L3 cache added off chip
  Pentium 4
    L1 caches
      8k bytes
      64-byte lines
      four-way set associative
    L2 cache
      Feeding both L1 caches
      256k
      128-byte lines
      8-way set associative
    L3 cache on chip
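The Pentium 4 figures above imply a number of lines and sets for each cache. In a set-associative cache the lines are grouped into sets of as many lines as there are ways, so the counts follow by division. The helper `cache_geometry` is a name made up for this sketch, and it assumes "8k" and "256k" mean 8 × 1024 and 256 × 1024 bytes.

```python
def cache_geometry(size_bytes, line_bytes, ways):
    """Return (number of lines, number of sets) for a set-associative cache."""
    lines = size_bytes // line_bytes
    sets = lines // ways
    return lines, sets

l1 = cache_geometry(8 * 1024, 64, 4)      # L1: 8k, 64-byte lines, 4-way
l2 = cache_geometry(256 * 1024, 128, 8)   # L2: 256k, 128-byte lines, 8-way

print(l1)  # (128, 32)
print(l2)  # (2048, 256)
```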