Using the M5 Simulator
Nathan Binkert, Ron Dreslinski, Lisa Hsu, Kevin Lim, Ali Saidi
Prof. Steve Reinhardt
ISCA 2005 Tutorial, June 5, 2005

Welcome!
This tutorial is for you
Feel free to ask questions
We've got a lot to cover
Lots of cool stuff didn't even make the slides
Don't be offended if we have to move on
Come talk to us later
We're all here through Wednesday
What M5 is and is not
A brief peek inside
Current status & future developments
A tool for simulating systems
Not just CPU cores: memory, I/O
Not just SPEC apps: full OS code
Not just single machines: client/server, etc.
1. A framework for event-driven simulation
Events, objects, statistics, configuration
2. A collection of predefined object models
CPUs, caches, busses, devices, etc.
This tutorial focuses on #2
You may find #1 useful even if #2 is not
Where Did M5 Come From?
Born of frustration with existing tools
Did not do what we wanted
Did not scale with added complexity
Desire to simulate TCP/IP performance
Full-system support
Multiple system simulation
Almost entirely original code
Old CPU model based on SimpleScalar sim-outorder
Full-system support used SimOS as reference
No premeditated distribution plans
Just hacking together the system we wanted
A hardware design language
Higher level for design space exploration, simulation speed
A restrictive environment
Just C++/Python with an event queue and a bunch of APIs you can choose to ignore
Finished!
Always room for improvement…
What We Would Like M5 to Be
Something that spares you the pain we've been through
A community resource
Modular enough to localize changes
Contribute back, and spare others some pain
A path to reproducible/comparable results
A common platform for evaluating ideas
Everything you care about is an object (C++/Python)
Derived from SimObject base class
Common code for creation, configuration parameters, naming, checkpointing, etc.
Uniform method-based APIs for object types
CPUs, caches, memory, etc.
Plug-compatibility across implementations
Functional vs. detailed CPU
Conventional vs. indirect-index cache
Standard event queue timing model
Global logical time (in "ticks")
No fixed relation to real time
Objects schedule their own events
Flexibility for detail vs. performance tradeoffs
E.g., a CPU typically schedules an event every cycle
Simple CPU won't schedule itself if stalled/idle
Can also schedule every nth cycle to model other clock rates
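The scheduling idea on this slide can be sketched in a few lines of Python. This is only an illustration, not M5's API: the names `EventQueue`, `ToyCPU`, `schedule`, and `run` are hypothetical, and events are plain callbacks. It shows objects scheduling their own events on a queue ordered by logical ticks, with a second CPU ticking every third cycle to model a slower clock.

```python
import heapq
import itertools


class EventQueue:
    """Toy event queue ordered by a global logical time in ticks."""

    def __init__(self):
        self.now = 0
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps same-tick events FIFO

    def schedule(self, when, action):
        if when < self.now:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._heap, (when, next(self._order), action))

    def run(self, until):
        # Process events in tick order, up to and including `until`.
        while self._heap and self._heap[0][0] <= until:
            self.now, _, action = heapq.heappop(self._heap)
            action()


class ToyCPU:
    """Reschedules itself every `period` ticks unless it is stalled."""

    def __init__(self, queue, period):
        self.queue = queue
        self.period = period
        self.stalled = False
        self.ticks_seen = []

    def start(self):
        self.queue.schedule(self.queue.now, self.tick)

    def tick(self):
        self.ticks_seen.append(self.queue.now)
        if not self.stalled:  # a stalled CPU simply stops scheduling itself
            self.queue.schedule(self.queue.now + self.period, self.tick)


queue = EventQueue()
fast = ToyCPU(queue, period=1)  # an event every cycle
slow = ToyCPU(queue, period=3)  # an event every 3rd cycle: a slower clock
fast.start()
slow.start()
queue.run(until=6)
```

Because idle objects add no events, simulated time can jump straight to the next scheduled tick, which is where the detail vs. performance flexibility comes from.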
Syscall emulation mode
Alpha Tru64 or Linux application binaries
Host-based or SimpleScalar EIO traces
Full-system mode
Models Compaq "Tsunami"-based system
Boots Linux 2.4 & 2.6 and L4; FreeBSD in progress
Ask us if you want Tru64
Extensions for >4 CPUs
Ethernet, IDE disk adapters
Handful of pre-built benchmarks available
Building Executables
Platforms
Linux, BSD, CYGWIN (most UNIX like systems?)
Linux is primary, others may take a tiny bit of work
Little endian machines!
64-bit machines help a lot
Tools
GCC/G++ 3.0+
Recently tested with 3.3-3.5
Python 2.4
SCons (we use 0.95 or 0.96.1)
-d  set the output directory to <dir>
-E  set the environment variable <var> to <val> (or 'True')
-I  add the directory <dir> to python's path
-P  execute <python> directly in the configuration
--var=val  set the python variable <var> to '<val>'
<configfile>  config file name (ends in .py)
% ~/m5/build/ALPHA_FS/m5.debug -d output ~/m5/configs/fullsys/run.py
M5 Simulator System
Copyright (c) 2001-2005
The Regents of The University of Michigan
All Rights Reserved
This code is part of the M5 simulator, developed by Nathan Binkert,
Erik Hallnor, Steve Raasch, and Steve Reinhardt, with contributions
from Ron Dreslinski, Dave Greene, Lisa Hsu, Ali Saidi, and Andrew
Schultz.
M5 compiled on May 31 2005 11:43:24
M5 executing on ziff.eecs.umich.edu
M5 simulation started Tue May 31 11:45:02 2005
Listening for console connection on port 3456
0: system.tsunami.io: Real-time clock set to Sun Jan 1 00:00:00 2006
command line: /n/ziff/z/binkertn/build/head/ALPHA_FS/m5.debug -d output
Listening for remote gdb connection on port 7000
warn: Entering event queue. Starting simulation...
Python
Config objs mapped to simulator objs
No need for scripts to generate configs
All logic for running many simulations contained in a single set of configurable config files!
Pass parameters via environment vars
-E<var>[=<val>]
Variables with units are enforced
Latency must be '2ns', not simply 2
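The unit-enforcement rule can be illustrated with a small parser. This is a sketch, not the M5 parameter code: the function name `parse_latency`, the accepted unit list, and the choice of 1 tick = 1 picosecond are all assumptions made for the example.

```python
import re

# Assumed tick resolution for this sketch: 1 tick = 1 picosecond.
TICKS_PER_UNIT = {"ps": 1, "ns": 1_000, "us": 1_000_000, "ms": 1_000_000_000}


def parse_latency(value):
    """Convert a string such as '2ns' into ticks; reject unit-less values."""
    if not isinstance(value, str):
        raise TypeError("latency needs a unit, e.g. '2ns', got %r" % (value,))
    match = re.fullmatch(r"\s*(\d+)\s*(ps|ns|us|ms)\s*", value)
    if match is None:
        raise ValueError("latency needs a unit, e.g. '2ns', got %r" % (value,))
    count, unit = match.groups()
    return int(count) * TICKS_PER_UNIT[unit]
```

Rejecting a bare `2` up front avoids silently mixing cycles, nanoseconds, and ticks in one configuration.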
Current Directory or -d <dir>
config.py, config.ini, config.out
console.<system>.sim_console
output
stats.txt
cpt.<number>/
Database Output
M5 can output to a MySQL database
--Serialize.cycle=<start cycle>
--Serialize.period=<repeat interval>
--Serialize.count=<# of checkpoints>
M5 instruction
Insert special instruction into code to trigger a checkpoint to be dropped
Our benchmarks do this
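One plausible reading of the three Serialize parameters is that the first checkpoint is dropped at the start cycle and further ones follow at the repeat interval until the count is reached. The sketch below works through that arithmetic; the function name `checkpoint_cycles` is hypothetical, and the exact semantics in M5 (for example whether the count includes the first checkpoint) are assumed, not confirmed by the slides.

```python
def checkpoint_cycles(cycle, period, count):
    """Cycles at which checkpoints would be dropped under the assumed reading:
    the first at `cycle`, then one every `period` cycles, `count` in total."""
    if count < 1 or period < 1:
        raise ValueError("count and period must be positive")
    return [cycle + i * period for i in range(count)]
```

For example, a start cycle of 1000, a period of 500, and a count of 3 would give checkpoints at cycles 1000, 1500, and 2000 under this reading.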
Starting From a Checkpoint
Same configuration as normal except you add:
--Root.checkpoint=<path>/cpt.<number>
Checkpoints must be regenerated with some config changes
Most config changes that are architecturally visible (because the kernel may have behaved differently)
Physical memory size, new kernels
M5 can
dump statistics many times
aggregate statistics based on some event
(keep stats according to kernel mode or user mode)
Switch between CPU configurations
Functional CPU, Detailed CPU
Warm-up caches in a functional CPU, do measurements in a detailed CPU
Raw copies of Linux disk image
Binaries to be run must be present on image
rcS files (m5/configs/boot/*.rcS)
Exactly like normal boot scripts
Use them to start running a binary on the disk image, configure ethernet interfaces, etc.
Can also execute m5 instructions
Nice and flexible, since not compiled in
Specified in configuration by readfile='path/to/script.rcS'
See for yourself!
Going into / of disk image and typing ls will show:
benchmarks  etc     lib         mnt      sbin  usr
bin         floppy  lost+found  modules  sys   var
dev         home    man         proc     tmp   z
Adding Your Own Benchmarks
Highly encouraged! ☺
Please share them with others!
Since M5 is Alpha targeted, need to compile Alpha binaries
Cross-compiler can be downloaded from www.kegel.com/crosstool
Or, if you have an Alpha, use that
Add the benchmark binaries to disk image
Create .rcS file that executes the binary
If you need to mount a disk image to change something (like add a benchmark binary)
As root:
mount -o loop,offset=32256 myimage.img /mnt/point
You can then manipulate the file system directly and copy in binaries
Don't forget to unmount!
New benchmark: mybench
Compile and put it in disk image:
cp mybench /mnt/point/benchmarks/mybench
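The offset 32256 in the mount command equals 63 × 512, which matches a first partition starting at sector 63 with 512-byte sectors. For an image laid out differently, the offset can be read from the partition table. The sketch below assumes a classic MBR partition table and 512-byte sectors; the function name `first_partition_offset` is hypothetical.

```python
import struct

SECTOR_SIZE = 512             # bytes per sector (assumed)
PARTITION_TABLE_OFFSET = 446  # first partition entry in a classic MBR


def first_partition_offset(mbr):
    """Byte offset of the first partition, read from a 512-byte MBR sector."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entry = mbr[PARTITION_TABLE_OFFSET:PARTITION_TABLE_OFFSET + 16]
    (start_sector,) = struct.unpack_from("<I", entry, 8)  # starting LBA
    return start_sector * SECTOR_SIZE


# Usage sketch: pass the first 512 bytes of the image file, e.g.
#   with open("myimage.img", "rb") as f:
#       print(first_partition_offset(f.read(512)))
```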
Simple CPU Model
m5/cpu/simple/simple_cpu.{hh,cc}
Uses of the SimpleCPU:
Warming up caches
Driving systems that do not require detailed modeling
Ideal starting point to learn how CPU models work within M5
Simple overview of fetching, executing, and retiring instructions
Handles all the calls to support full system mode
Simulates an in-order 1 CPI machine
Stalls on I or D cache misses
Single threaded
Can roughly model a superscalar machine by ticking multiple times
Can be extended to simple pipeline
Add in functional units with latencies
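The timing behavior described here (one cycle per instruction, plus stalls on instruction or data cache misses) reduces to simple arithmetic. The sketch below is a toy model, not the SimpleCPU code: the function name `run_simple_cpu`, the trace format, and the single fixed miss latency are assumptions for illustration.

```python
def run_simple_cpu(trace, miss_latency):
    """Count cycles for an in-order, 1-CPI machine that stalls on misses.

    trace: iterable of (icache_hit, dcache_hit) pairs, one per instruction;
    dcache_hit is None for instructions that do not access data memory.
    """
    cycles = 0
    for icache_hit, dcache_hit in trace:
        cycles += 1                 # one cycle per instruction (1 CPI)
        if not icache_hit:
            cycles += miss_latency  # stall on an instruction fetch miss
        if dcache_hit is False:
            cycles += miss_latency  # stall on a data access miss
    return cycles
```

With three instructions, one I-cache miss, one D-cache miss, and a 10-cycle miss latency, the model reports 3 + 10 + 10 = 23 cycles.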
Detailed CPU Model
m5/cpu/o3/*, m5/encumbered/cpu/full
Currently two detailed models within M5
Old model, distantly based on sim-ooo
Used for SMT and full system mode, but will eventually be phased out
New model
Will be the focus of this section
More closely couples execution and timing
Uses template policies
Previous model executes instructions at fetch
Feeds instruction to timing backend
New model executes at execute, modeling the timing for each pipeline stage
Important for coherence
UP and MP system studies
Forces both timing and execution to be accurate
Time buffer class is templated
Its template parameter is the communication struct between stages
Stages must communicate with each other via the time buffer
Avoids unrealistic interaction between pipeline stages
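A time buffer can be pictured as a short shift register between two stages: what one stage writes in a cycle only becomes visible to the other stage a fixed number of cycles later. The Python sketch below is a simplified analogue, not the M5 class: the name `TimeBuffer` and the methods `write`, `read`, and `advance` are hypothetical, and where the C++ class is templated on a communication struct, this version accepts any Python value.

```python
from collections import deque


class TimeBuffer:
    """Delays values written by one stage before another stage can read them."""

    def __init__(self, delay, empty=None):
        self.delay = delay
        self.empty = empty
        # Slot 0 is written this cycle; slot `delay` is what the reader sees.
        self._slots = deque([empty] * (delay + 1), maxlen=delay + 1)

    def write(self, value):
        self._slots[0] = value

    def read(self):
        return self._slots[self.delay]

    def advance(self):
        # Moving to the next cycle shifts every slot one step toward the reader;
        # the oldest slot falls off the end because of maxlen.
        self._slots.appendleft(self.empty)
```

Since a stage can only see what was written `delay` cycles earlier, it cannot react to another stage's same-cycle state, which is the unrealistic interaction the slide warns about.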
Impls are used to define specific CPU instances, down to the ISA
To create different types of CPUs, create a new Impl and define all of the specific types
*_impl.hh files due to using templates in our infrastructure
*.cc files have instantiations
New model will eventually add full system support, SMT support
Will include a checker/verifier at commit
Also will abstract away ISA specific functions
Lower levels of code can be shared across platforms
Represents a decoded instruction
Has classifications of the inst
Corresponds to the binary machine inst
Decoded once
Only has static information
Has all the methods needed to execute an instruction
Tells which regs are source and dest
Individual Use of StaticInst
Contains the execute() function
Specific instruction classes override this
Python generates execute() for all insts
Templated on the ISA
Different ISAs can have specific instantiations of StaticInst
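The "decoded once, static information only" idea can be sketched as a base class with an overridable `execute()` plus a decode cache keyed by the machine instruction. This is an illustrative Python analogue, not the M5 C++ class: the 3-field instruction encoding, the `Add` class, and the `decode` function are hypothetical.

```python
class StaticInst:
    """Static, decoded form of one machine instruction (toy version)."""

    def __init__(self, machine_inst, src_regs, dest_regs):
        self.machine_inst = machine_inst
        self.src_regs = src_regs    # which registers are sources
        self.dest_regs = dest_regs  # which registers are destinations

    def execute(self, regs):
        raise NotImplementedError  # specific instruction classes override this


class Add(StaticInst):
    def execute(self, regs):
        a, b = self.src_regs
        regs[self.dest_regs[0]] = regs[a] + regs[b]


_decode_cache = {}


def decode(machine_inst):
    """Decode each distinct machine instruction only once, then reuse it."""
    inst = _decode_cache.get(machine_inst)
    if inst is None:
        # Hypothetical encoding: (dest << 8) | (src_a << 4) | src_b.
        dest = (machine_inst >> 8) & 0xF
        a = (machine_inst >> 4) & 0xF
        b = machine_inst & 0xF
        inst = Add(machine_inst, (a, b), (dest,))
        _decode_cache[machine_inst] = inst
    return inst
```

Because the decoded object holds no per-execution state, the same instance can be shared by every dynamic execution of that instruction.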
Dynamic version of StaticInst
Used for detailed CPU model
Holds PC, results, renamed regs, etc.
Represents total architectural state of a single thread in the system
PC, register values, memory, etc.
Contains pointers to key classes
Everything needed to functionally execute an instruction
Currently has a few ISA specific details
Future direction has state being removed from ExecContext, and ExecContext serving mainly as an interface
Two versions of the memory system
Separate timing / functional models
Capability to have data in the timing caches
Memory tester object
New unified timing/functional access model
Works with simple CPU and exec-in-exec CPU
Doesn't support older full CPU model (execute at fetch)
Overview Cont.
Supports atomic and event driven accesses
Atomic model
Each memory transaction is independent
Each transaction is followed completely through the memory system
Speeds simulation, useful when timing not important (miss stream)
Event driven model
Events generated for each transaction as it traverses the hierarchy
Multiple transactions interact in the memory system
Useful when timing information is important, or MP system interaction is desired
All memory interactions are described in terms of a memory request (MemReq) object
Encapsulates all relevant information
Virtual and physical address
Request size
Requesting device
Etc.
Makes memory system independent
CPU or device generates a MemReq
MemReq is passed to the proper bus/cache
Each cache creates a new MemReq to pass through the hierarchy, holding on to the request it needs to respond to
Eventually the data requested is reached and the responses are processed
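The request flow above can be sketched as an atomic-style access, where each transaction is followed all the way through the hierarchy before returning. Everything here is a toy stand-in: the classes `MemReq`, `Cache`, and `Memory` and their fields are hypothetical simplifications that ignore block sizes, virtual addresses, and timing.

```python
class MemReq:
    """Toy memory request: address, size, and the device that issued it."""

    def __init__(self, paddr, size, requester):
        self.paddr = paddr
        self.size = size
        self.requester = requester


class Memory:
    def __init__(self):
        self.data = {}

    def access(self, req):
        return self.data.get(req.paddr, 0)


class Cache:
    def __init__(self, name, below):
        self.name = name
        self.below = below  # next level closer to memory
        self.lines = {}
        self.misses = 0

    def access(self, req):
        if req.paddr not in self.lines:
            self.misses += 1
            # On a miss, issue a new request downward while still holding
            # the original `req` that this cache must respond to.
            fill = MemReq(req.paddr, req.size, requester=self.name)
            self.lines[req.paddr] = self.below.access(fill)
        return self.lines[req.paddr]
```

An event driven version would instead schedule an event at each level, so that several outstanding requests could interact in the hierarchy.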
Miss Queue
m5/mem/cache/miss/*
Blocking buffer
Used to simulate a blocking cache
Speeds simulation in atomic bus model
MSHR
Blocks when miss or writeback queue is full
Configurable parameters
Size of miss and writeback queues
Number of targets per MSHR
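The two MSHR parameters (queue size and targets per MSHR) determine when a cache must block. The sketch below models only the miss queue and leaves out the writeback queue; the class `MissQueue` and its methods are hypothetical names, not the code in m5/mem/cache/miss.

```python
class MissQueue:
    """Toy MSHR file: one entry per outstanding block, several targets each."""

    def __init__(self, num_mshrs, targets_per_mshr):
        self.num_mshrs = num_mshrs
        self.targets_per_mshr = targets_per_mshr
        self.mshrs = {}  # block address -> list of waiting requesters

    def handle_miss(self, block_addr, requester):
        """Return True if the miss was accepted, False if the cache must block."""
        targets = self.mshrs.get(block_addr)
        if targets is not None:
            if len(targets) >= self.targets_per_mshr:
                return False           # no free target slot in this MSHR
            targets.append(requester)  # merge with the outstanding miss
            return True
        if len(self.mshrs) >= self.num_mshrs:
            return False               # every MSHR is in use
        self.mshrs[block_addr] = [requester]
        return True

    def handle_fill(self, block_addr):
        """Data returned: free the MSHR and report who was waiting on it."""
        return self.mshrs.pop(block_addr, [])
```

A blocking buffer corresponds roughly to the degenerate case of one MSHR with one target.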
Uni coherence model
Single Processor
Handles DMA invalidate forwarding up the cache hierarchy
CSHRs
Additional cache functions to parallel getMemReq, etc. but in the opposite direction
MOESI based Model
Also has CSHR support invalidates
Configurable parameters:
Protocol type (MSI, MOSI, MESI, MOESI)
Upgrades
Support for memory to snarf updates
Memory object must be the slave device on the coherent bus
Currently only a single level coherence model exists
Coherent bus must be bus closest to CPU
Only one coherent bus allowed in a system
Only L1s allowed to be above coherence bus (doesn't forward snoops up the hierarchy)
Other levels of cache should be uni-coherent
Coherence works in event driven mode if:
Ownership protocol is used or next level snarfs updates
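The four protocol types differ in which of the M, O, E, S states exist. The sketch below shows the textbook state change of a cached line when a read or an invalidate from another cache is snooped; it is not taken from M5's coherence code, and the function names are hypothetical.

```python
PROTOCOLS = ("MSI", "MOSI", "MESI", "MOESI")


def _check(state, protocol):
    if protocol not in PROTOCOLS:
        raise ValueError("unknown protocol %r" % (protocol,))
    if len(state) != 1 or state not in protocol:
        raise ValueError("state %r does not exist in %s" % (state, protocol))


def snoop_read(state, protocol="MOESI"):
    """Next state of a cached line after snooping another cache's read."""
    _check(state, protocol)
    if state == "M":
        # With an Owned state the dirty line can stay dirty and be shared;
        # without one it drops to Shared (after the data is written back).
        return "O" if "O" in protocol else "S"
    if state == "E":
        return "S"
    return state  # O, S and I are unchanged by a remote read


def snoop_invalidate(state, protocol="MOESI"):
    """A snooped write or upgrade from another cache invalidates the line."""
    _check(state, protocol)
    return "I"
```

In the ownership protocols (MOSI, MOESI) a modified line can supply data to other caches without first updating memory, which relates to the condition above about the next level snarfing updates.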
Support atomic or split-transaction
All timing in event driven mode done in bus
Have separate address and data busses and an arbiter to determine events
Configurable parameters:
Generic interface between bus and memory objects
Caches have master and slave interfaces
Bus Interfaces Cont.
Master interface
Connected closer to memory
Initiates bus transactions on cache misses
Has snoop path to forward address bus requests to the cache
Responds with the data when snoops need to supply
Slave interface
Connected closer to the CPU
Initiates bus transactions on coherence (CSHR) requests
Responds with the data, if a snoop didn't supply it
Simple objects to connect two different speed busses
Queues requests coming from either side and forwards them out the other when the arbiter grants the bus
All based on FunctionalMemory
PioDevice - Contains a pointer to the platform and pioInterface
DMADevice - Additionally contains a pointer to dmaInterface
PCIDev - PCI Configuration space, PCI interrupt handling
Each device is sensitive to one or more address ranges (base + size)
Functional access with MemoryController::add_child()
Timing access with PioInterface::addAddrRange()
A device implements a Read() and Write() for basic PIO reads and writes
DMAInterface::doDMA() for DMA reads/writes
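Address-range dispatch of PIO accesses can be sketched as follows. This is a toy Python model, not the M5 device classes: `ToyDevice`, `ToyBus`, and their methods are hypothetical names, each device has a single range, and registers are addressed by their offset from the base.

```python
class ToyDevice:
    """Device that responds to one address range given as base + size."""

    def __init__(self, name, base, size):
        self.name, self.base, self.size = name, base, size
        self.regs = {}  # register offset -> value

    def covers(self, addr):
        return self.base <= addr < self.base + self.size

    def read(self, addr):
        return self.regs.get(addr - self.base, 0)

    def write(self, addr, value):
        self.regs[addr - self.base] = value


class ToyBus:
    def __init__(self):
        self.devices = []

    def add_device(self, dev):
        # Refuse ranges that overlap an already registered device.
        for other in self.devices:
            if dev.base < other.base + other.size and other.base < dev.base + dev.size:
                raise ValueError("%s overlaps %s" % (dev.name, other.name))
        self.devices.append(dev)

    def find(self, addr):
        for dev in self.devices:
            if dev.covers(addr):
                return dev
        raise LookupError("no device at address %#x" % addr)
```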
Miscellaneous Devices
AlphaConsole - Back door into simulator for console code
BadDev - Device panics on access
CowDiskImage - Copy-on-write disk image
PciConfigAll - PCI configuration space object
PciDev - PCI device base class
PacketFifo - FIFO for network packets
SimConsole - Device that provides the console
Tsunami - The tsunami platform that links together all its devices
TsunamiCChip - Tsunami interrupt controller
TsunamiPChip - Tsunami PCI interface
TsunamiIO - Legacy I/O devices (RTC, PIT, etc.)
Uart - Serial UART
PCI & Interrupts
Tsunami platform
Four physical PCI slots
Only implement one PCI bus
Possible to implement second, just not done
Devices need their own interrupt and device ID
No interrupt sharing allowed
Simulator will panic if detected
PciConfigData
InterruptLine - needs to be different for each device (0x1c-0x1e)
Modeled as an IDE controller and separate IDE Disk(s)
Copy-on-write layer in between
Simple timing support
Emulates an Intel IDE 2 channel controller
Can connect 4 IDE devices
Can be created with mkblankimage.sh
Disk images are contiguous blocks created with dd
Unlike a normal image it needs to contain a partition table
Can be mounted with the loopback device
Specifying Disk Image
class FilesetDisk(IdeDisk):
raw_image = RawDiskImage(image_file =
NIC Device Model
m5/dev/ns_gige*
National Semiconductor DP83820
Gigabit Ethernet Full Duplex PCI controller
Their spec is actually public
Unmodified Linux driver will run on our model (linux/drivers/net/ns83820.c)
If Linux doesn't use it, we don't model it
Set in Python config object file
m5/python/m5/objects/Ethernet.py
NSGigE object description
Actual device has bug that does not allow unaligned copies
Fixed that in the device, and removed the associated workaround in the Linux driver
Checksum offloading didn't work in every case in Linux, so we patched it.
Overview of M5 internals (Steve)
Defining new objects (Steve)
ISA Description Language (Steve)
Debugging (Nate)
Statistics (Nate)
Process command-line args
Build configuration
Unserialize from checkpoint, if any
Set up statistics
Start processing events from event queue
On processing SimExitEvent:
Configuration script is a Python program
Python "m5" module in m5/python/m5
Object descriptions in m5/python/m5/objects
C++ forks Python interpreter (m5/python/pyconfig.cc)
Send it Python scripts/code from cmd line
Python ends by printing final config
ini-file format for historical reasons (in config.ini)
Hope to get rid of this step soon
C++ builds ConfigHierarchy from .ini file
Linking in a .o file automatically registers object in global table
Via REGISTER_SIM_OBJECT() macro
Magic of C++ global object constructors
Two-pass initialization
First pass: walk tree, call builder "create" method
Defined by CREATE_SIM_OBJECT() macro
All parameter values available
Calls C++ constructor
Second pass: call init() methods
Guaranteed that all other objects are constructed
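Registration in a global table followed by two-pass initialization can be mimicked in Python. This is an analogue for illustration only: the decorator `register_sim_object` stands in for a registration macro run by global constructors, and `BUILDERS`, `build`, `ToyCache`, and `ToyCore` are hypothetical names.

```python
BUILDERS = {}  # global table: type name -> builder


def register_sim_object(type_name):
    """Decorator standing in for a registration macro run at load time."""
    def wrap(cls):
        BUILDERS[type_name] = cls
        return cls
    return wrap


@register_sim_object("Cache")
class ToyCache:
    def __init__(self, name, params):
        self.name, self.size = name, params["size"]
        self.cpu = None

    def init(self, objects):
        self.cpu = objects.get("cpu")  # safe: every object already exists


@register_sim_object("CPU")
class ToyCore:
    def __init__(self, name, params):
        self.name = name
        self.cache = None

    def init(self, objects):
        self.cache = objects.get("cache")


def build(config):
    # First pass: construct every object from its parameters.
    objects = {name: BUILDERS[spec["type"]](name, spec)
               for name, spec in config.items()}
    # Second pass: init() may now refer to any other object.
    for obj in objects.values():
        obj.init(objects)
    return objects
```

Splitting construction from `init()` lets two objects point at each other without either constructor depending on creation order.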
Memory State Initialization
Side effect of object creation
Process object for syscall emulation
LiveProcess for direct emulation (m5/sim/process.*)
EIOProcess for SimpleScalar EIO trace playback (m5/non-free/eio/*)
Auto-detects Tru64 vs Linux binaries
System object for full-system mode
LinuxSystem, Tru64System (m5/kern/*)
Console/PAL code
Kernel image
m5/base/loader/* has aout, ecoff, elf loader code, plus symbol table support
Create/restore state checkpoints
Serializable is base of Event & SimObject
  defines serialize(), unserialize() methods
  override to save/restore object state
  .ini-format text file
If checkpoint is specified, M5 will call unserialize() on all objects after creation
  State identified by object name (system0.cpu0)
  OK if no checkpointed state (e.g., added cache)
  Detailed CPU model can unserialize from SimpleCPU serialized state
Common error: adding field to object and not updating serialize()/unserialize()
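A rough sketch of name-keyed checkpointing, using Python's standard configparser to stand in for the .ini-format checkpoint file. The class and function names (ToyCPU, write_checkpoint, restore_checkpoint) and the pc/num_insts fields are assumptions for illustration, not M5's actual interface or file layout.

```python
import configparser
import io


class Serializable:
    """Base class: subclasses override both methods to save/restore state."""
    def __init__(self, name):
        self.name = name

    def serialize(self):
        return {}

    def unserialize(self, section):
        pass


class ToyCPU(Serializable):
    def __init__(self, name):
        super().__init__(name)
        self.pc = 0
        self.num_insts = 0

    def serialize(self):
        # Every field added to the object must also be added here...
        return {"pc": str(self.pc), "num_insts": str(self.num_insts)}

    def unserialize(self, section):
        # ...and here, or it is silently lost across a checkpoint.
        self.pc = int(section["pc"])
        self.num_insts = int(section["num_insts"])


def write_checkpoint(objects):
    """Emit one ini section per object, keyed by the object's name."""
    cp = configparser.ConfigParser()
    for obj in objects:
        cp[obj.name] = obj.serialize()
    out = io.StringIO()
    cp.write(out)
    return out.getvalue()


def restore_checkpoint(text, objects):
    """Call unserialize() on every object that has saved state."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    for obj in objects:
        if cp.has_section(obj.name):   # no section is fine (e.g., added cache)
            obj.unserialize(cp[obj.name])
```

Because state is looked up by object name, an object added after the checkpoint was taken simply keeps its freshly constructed state.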
Event object is abstract superclass
Derive new subclass for specific event
  Add fields for event-specific data
  Override process() method for action
schedule(Tick t) puts on event queue
Events may be statically or dynamically allocated
  Setting AutoDelete flag will call delete after processing
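The event pattern above, sketched with a heap-based queue. EventQueue, PrintEvent, and the first-in-first-out tie-break within a tick are assumptions of this sketch rather than M5's implementation; Python's garbage collector also makes an AutoDelete equivalent unnecessary here.

```python
import heapq
import itertools


class EventQueue:
    def __init__(self):
        self._heap = []
        self._order = itertools.count()   # tie-breaker: FIFO within a tick
        self.cur_tick = 0

    def schedule(self, event, when):
        if when < self.cur_tick:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._heap, (when, next(self._order), event))

    def run(self):
        """Pop events in time order, advance the clock, call process()."""
        while self._heap:
            when, _, event = heapq.heappop(self._heap)
            self.cur_tick = when
            event.process()


class Event:
    """Abstract superclass: subclasses add fields and override process()."""
    def __init__(self, queue):
        self.queue = queue

    def schedule(self, when):
        self.queue.schedule(self, when)

    def process(self):
        raise NotImplementedError


class PrintEvent(Event):
    def __init__(self, queue, message, log):
        super().__init__(queue)
        self.message = message            # event-specific data
        self.log = log

    def process(self):
        self.log.append((self.queue.cur_tick, self.message))
```

Events fire in tick order regardless of the order in which they were scheduled.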
ISA Description Language
arch/isa_parser.py, arch/alpha/isa_desc
Custom domain-specific language
  Defines decoding & behavior of ISA
Generates C++ code
  Scads of StaticInst subclasses
  decodeInst() function
    Maps machine inst. to StaticInst instance
  Multiple scads of execute() methods
    Cross-prod. of CPU models and StaticInst subclasses
Very compact representation
  Most instructions take 1 line of C code
  2639 lines of isa_desc -> 39K lines of C++
    18K generic, 11K for each of 2 CPU models
  Characteristics auto-extracted from C
    source, dest regs; func unit class; etc.
  execute() code customized for CPU models
Amazingly well documented (for us, anyway)
  See doxygen docs
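To make the decode step concrete, here is a toy sketch of a decode function that maps a machine word to an instance of a per-instruction class. The 16-bit encoding, the opcode values, and the Add/Sub classes are invented for illustration and have nothing to do with the Alpha ISA or the generated C++; the sketch also has a single execute() per class, where the generated code has one per CPU model.

```python
class StaticInst:
    """Base class: pulls register fields out of a made-up 16-bit encoding."""
    def __init__(self, machine_inst):
        self.ra = (machine_inst >> 8) & 0xF   # first source register
        self.rb = (machine_inst >> 4) & 0xF   # second source register
        self.rc = machine_inst & 0xF          # destination register


class Add(StaticInst):
    def execute(self, regs):
        regs[self.rc] = regs[self.ra] + regs[self.rb]   # the "one line"


class Sub(StaticInst):
    def execute(self, regs):
        regs[self.rc] = regs[self.ra] - regs[self.rb]


# Hypothetical opcode assignments, stored in the top four bits.
OPCODE_TABLE = {0x1: Add, 0x2: Sub}


def decode_inst(machine_inst):
    """Map a machine instruction word to a StaticInst instance."""
    opcode = (machine_inst >> 12) & 0xF
    try:
        return OPCODE_TABLE[opcode](machine_inst)
    except KeyError:
        raise ValueError(f"unknown opcode {opcode:#x}") from None
```

A CPU model would call decode_inst() on each fetched word and then invoke execute() with its own register state.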
printf is a nice debugging tool
  Keep good printfs for tracing
  Lots of debug output is a very good thing
Add new flags to traceflags.py
  Individual flags in the baseFlags array
  Groups of flags in the compoundFlagMap dict
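A small sketch of how a list of individual flags and a dict of flag groups could fit together. The names baseFlags and compoundFlagMap come from the slide; the flag names, expand_flags, and dprintf below are illustrative assumptions, not the contents or helpers of the real traceflags.py.

```python
# Illustrative flag names only; the real lists live in traceflags.py.
baseFlags = ["Fetch", "Decode", "Cache", "Bus"]

# Each compound flag switches on a group of base flags.
compoundFlagMap = {
    "CPUAll": ["Fetch", "Decode"],
    "MemAll": ["Cache", "Bus"],
}


def expand_flags(requested):
    """Return the set of base flags enabled by a list of flag names."""
    enabled = set()
    for name in requested:
        if name in compoundFlagMap:
            enabled.update(compoundFlagMap[name])
        elif name in baseFlags:
            enabled.add(name)
        else:
            raise ValueError(f"unknown trace flag: {name}")
    return enabled


def dprintf(enabled, flag, tick, message, out):
    """Record a trace line only when its flag is switched on."""
    if flag in enabled:
        out.append(f"{tick}: {message}")
```

Guarding each trace statement with a flag is what lets the printfs stay in the code permanently without flooding the output.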
Using GDB with M5
% gdb m5/build/ALPHA_SE/m5.debug
(gdb) run m5-test/test1/run.ini --debug:break_cycles="1000 2000"
Starting program: /z/stever/bk/m5/build/ALPHA_SE/m5.debug m5-test/test1/run.ini --debug:break_cycles="1000 2000"
Starting simulation...
Program received signal SIGTRAP, Trace/breakpoint trap.
0xffffe002 in ?? ()
(gdb) p curTick
$1 = 1000
(gdb) c
Program received signal SIGTRAP, Trace/breakpoint trap.
0xffffe002 in ?? ()
(gdb) p curTick
$2 = 2000
(gdb) call sched_break_cycle(3000)
(gdb) c
Program received signal SIGTRAP, Trace/breakpoint trap.
0xffffe002 in ?? ()
(gdb) p curTick
$3 = 3000
(gdb)
Remote Debugging
% ~/m5/build/ALPHA_FS/m5.debug ~/m5/configs/fullsys/run.py
M5 Simulator System
Copyright (c) 2001-2005
The Regents of The University of Michigan
All Rights Reserved
This code is part of the M5 simulator, developed by Nathan Binkert,
Erik Hallnor, Steve Raasch, and Steve Reinhardt, with contributions
from Ron Dreslinski, Dave Greene, Lisa Hsu, Ali Saidi, and Andrew
Schultz.
M5 compiled on May 31 2005 11:43:24
M5 executing on ziff.eecs.umich.edu
M5 simulation started Tue May 31 11:45:02 2005
Listening for console connection on port 3456
0: system.tsunami.io: Real-time clock set to Sun Jan 1 00:00:00 2006
command line: /n/ziff/z/binkertn/build/head/ALPHA_FS/m5.debug 
Listening for remote gdb connection on port 7000
warn: Entering event queue. Starting simulation...
Remote Debugging
% gdb-linux-alpha arch/alpha/boot/vmlinux
... gdb banner ...
This GDB was configured as "--host=i686-pc-linux-gnu --target=alpha-linux"... (no debugging symbols found)...
(gdb) set remote Z-packet on [ This can be put in .gdbinit ]
(gdb) target remote ziff:7000
Remote debugging using ziff:7000
0xfffffc0000496844 in strcasecmp (a=0xfffffc0000b13a80 "", b=0x0)
    at arch/alpha/lib/strcasecmp.c:23
23 } while (ca == cb && ca != '\0');
(gdb)
We hope you found this tutorial useful
We hope you find M5 useful too
We’d love to work with you to make M5 even more useful to the community
We value your feedback
  Please fill out a questionnaire
  Hand it to one of us
    On your way out, or during ISCA
  Come talk to us at ISCA
Check http://m5.eecs.umich.edu for updates
Use, subscribe to our mailing lists: