June 5, 2005, ISCA 2005 Tutorial
Using the M5 Simulator
Nathan Binkert, Ron Dreslinski, Lisa Hsu, Kevin Lim, Ali Saidi
Prof. Steve Reinhardt
Welcome!
  This tutorial is for you
    Feel free to ask questions
  We've got a lot to cover
    Lots of cool stuff didn't even make the slides
    Don't be offended if we have to move on
    Come talk to us later
      We're all here through Wednesday
Outline & Schedule
  Introduction & overview      2:00-2:20
  Running M5                   2:20-2:45
  Full-System Workloads        2:45-3:00
  Current M5 object models
    CPUs: simple, detailed     3:00-3:30
    (break)                    3:30-4:00
    Memory System              4:00-4:35
    I/O                        4:35-4:55
  Extending M5                 4:55-5:30
Introduction & Overview
Steve Reinhardt
Introduction & Overview
  What M5 is and is not
  A brief peek inside
  Current status & future developments
What is M5?
  A tool for simulating systems
    Not just CPU cores: memory, I/O
    Not just SPEC apps: full OS code
    Not just single machines: client/server, etc.
Two Views of M5
  1. A framework for event-driven simulation
    Events, objects, statistics, configuration
  2. A collection of predefined object models
    CPUs, caches, buses, devices, etc.
  This tutorial focuses on #2
    You may find #1 useful even if #2 is not
Where Did M5 Come From?
  Born of frustration with existing tools
    Did not do what we wanted
    Did not scale with added complexity
  Desire to simulate TCP/IP performance
    Full-system support
    Multiple system simulation
  Almost entirely original code
    Old CPU model based on SimpleScalar sim-outorder
    Full-system support used SimOS as reference
  No premeditated distribution plans
    Just hacking together the system we wanted
Key M5 Attributes
  Heavily object-oriented
    Key to modularity, flexibility
  Necessarily complex
    ~90K lines of C++, ~7K lines of Python
  Modular enough to hide the complexity
    We hope!
  Free! All the code we wrote is open source
    BSD-style license
What M5 is Not
  A hardware design language
    Higher level for design space exploration, simulation speed
  A restrictive environment
    Just C++/Python with an event queue and a bunch of APIs you can choose to ignore
  Finished!
    Always room for improvement...
What We Would Like M5 to Be
  Something that spares you the pain we've been through
  A community resource
    Modular enough to localize changes
    Contribute back, and spare others some pain
  A path to reproducible/comparable results
    A common platform for evaluating ideas
A Peek Inside: Objects
  Everything you care about is an object (C++/Python)
    Derived from SimObject base class
    Common code for creation, configuration parameters, naming, checkpointing, etc.
  Uniform method-based APIs for object types
    CPUs, caches, memory, etc.
    Plug-compatibility across implementations
      Functional vs. detailed CPU
      Conventional vs. indirect-index cache
  Easy replication: MPs, multiple systems, ...
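The plug-compatibility idea can be sketched in a few lines of Python. This is an illustration only, not M5 code: apart from the name SimObject, every class, method, and parameter below (BaseCPU, FunctionalCPU, DetailedCPU, run, latency) is a hypothetical stand-in.

```python
class SimObject:
    """Hypothetical stand-in for the common base class: holds a name and parameters."""

    def __init__(self, name, **params):
        self.name = name
        self.params = dict(params)


class BaseCPU(SimObject):
    """Uniform method-based API that every CPU model in this sketch implements."""

    def run(self, instructions):
        raise NotImplementedError


class FunctionalCPU(BaseCPU):
    def run(self, instructions):
        # Counts instructions only; no timing information is produced.
        return {"insts": len(instructions), "cycles": None}


class DetailedCPU(BaseCPU):
    def run(self, instructions):
        # Charges a fixed per-instruction latency taken from a configuration parameter.
        latency = self.params.get("latency", 1)
        return {"insts": len(instructions), "cycles": latency * len(instructions)}


def simulate(cpu, instructions):
    # The caller depends only on the BaseCPU API, so either model plugs in.
    return cpu.run(instructions)


program = ["ld", "add", "st"]
functional_result = simulate(FunctionalCPU("cpu0"), program)
detailed_result = simulate(DetailedCPU("cpu1", latency=2), program)
```

Because simulate() only touches the shared API, swapping one model for the other changes a single constructor call and nothing else.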
A Peek Inside 2: Events
  Standard event queue timing model
    Global logical time (in "ticks")
    No fixed relation to real time
  Objects schedule their own events
    Flexibility for detail vs. performance tradeoffs
    E.g., a CPU typically schedules an event every cycle
      Simple CPU won't schedule itself if stalled/idle
      Can also schedule every nth cycle to model other clock rates
        e.g., non-integer CPU/bus clock ratios
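A minimal sketch of this timing model, assuming a priority queue keyed on ticks. The class names (EventQueue, ClockedObject) and the periods are invented for illustration and are not M5's actual API. Two objects reschedule themselves with periods of 3 and 7 ticks, which gives a non-integer clock ratio without any special handling.

```python
import heapq


class EventQueue:
    """Global logical time in ticks; events run in tick order."""

    def __init__(self):
        self.now = 0
        self._heap = []
        self._seq = 0  # tie-breaker: same-tick events run in scheduling order

    def schedule(self, when, callback):
        heapq.heappush(self._heap, (when, self._seq, callback))
        self._seq += 1

    def run(self, until):
        # Pop and run events until the next one lies beyond the time limit.
        while self._heap and self._heap[0][0] <= until:
            when, _, callback = heapq.heappop(self._heap)
            self.now = when
            callback()


class ClockedObject:
    """An object that schedules its own next event, one period ahead."""

    def __init__(self, name, queue, period, log):
        self.name, self.queue, self.period, self.log = name, queue, period, log
        queue.schedule(queue.now, self.tick)

    def tick(self):
        self.log.append((self.queue.now, self.name))
        self.queue.schedule(self.queue.now + self.period, self.tick)


queue = EventQueue()
log = []
cpu = ClockedObject("cpu", queue, period=3, log=log)  # hypothetical CPU clock
bus = ClockedObject("bus", queue, period=7, log=log)  # hypothetical bus clock
queue.run(until=14)
cpu_ticks = [t for t, name in log if name == "cpu"]
bus_ticks = [t for t, name in log if name == "bus"]
```

A stalled or idle object would simply skip the reschedule in tick() and be woken later by whatever event ends the stall.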
Current Model Status
  Three CPU models
    One functional & two detailed OOO
      Old SimpleScalar-based & new in development
    CPUs support Alpha ISA
      Others "easily" added
  Two major cache models
    Conventional & indirect-index
  Bus-based interconnect
    Split transactions, snooping coherence
Current Status (cont'd)
  Syscall emulation mode
    Alpha Tru64 or Linux application binaries
    Host-based or SimpleScalar EIO traces
  Full-system mode
    Models Compaq "Tsunami"-based system
      Boots Linux 2.4 & 2.6 and L4; FreeBSD in progress
      Ask us if you want Tru64
      Extensions for >4 CPUs
    Ethernet, IDE disk adapters
    Handful of pre-built benchmarks available
Short-term Wish List
  Finish new detailed OOO CPU model
    Add full system, SMT support
    Obsolete SimpleScalar-based model
  Minor re-architecting of memory system
    Support non-bus interconnects, directory coherence
  More ISAs
    PowerPC, ARM likely candidates
    Heterogeneous system support (?)
  More full-system benchmarks
  Better C++/Python integration
Running M5
Nathan Binkert
Running M5
  Source tree
  Building executables
  Running simulations
  Specifying configurations
  Output files
  Checkpointing
  Sampling & warm-up
Source Tree Organization
  m5: actual simulator source
  m5-test: regression tests
  ext: 3rd-party packages
    dnet, libelf, ply
  linux-dist: source for disk images
M5 Source Tree Organization
  base: general data structures/facilities
  sim: simulation core
  python: Python config code
  arch: ISA-specific components
  cpu, mem, dev: specific models
  build: build directories
  test: component tests
  util: utility programs
  configs: sample configurations
  [Slide annotation: entries are marked either "compiled in" or "not compiled in"]
Building Executables
  Platforms
    Linux, BSD, CYGWIN (most UNIX-like systems?)
      Linux is primary, others may take a tiny bit of work
    Little-endian machines!
      64-bit machines help a lot
  Tools
    GCC/G++ 3.0+
      Recently tested with 3.3-3.5
    Python 2.4
    SCons (we use 0.95 or 0.96.1)
      http://www.scons.org
% cd m5
% cd build
% scons ALPHA_SE/m5.opt ALPHA_FS/m5.debug
scons: Reading SConscript files ...
Configuring options for directory 'ALPHA_FS'.
Compiling with MySQL support!
scons: done reading SConscript files.
scons: Building targets ...
/z/binkertn/research/m5/build/head/m5/arch/isa_parser.py m5/arch/alpha/isa_desc ALPHA_FS/arch/alpha arch/alpha
Generating ALPHA_FS/arch/alpha/decoder.hh
Generating ALPHA_FS/arch/alpha/decoder.cc
Generating ALPHA_FS/arch/alpha/inorder_cpu_exec.cc
Generating ALPHA_FS/arch/alpha/simple_cpu_exec.cc
Generating ALPHA_FS/arch/alpha/fast_cpu_exec.cc
Generating ALPHA_FS/arch/alpha/full_cpu_exec.cc
Generating ALPHA_FS/arch/alpha/alpha_full_cpu_exec.cc
echo '#include "arch/alpha/isa_traits.hh"' > ALPHA_FS/targetarch/isa_traits.hh
echo '#include "arch/alpha/alpha_memory.hh"' > ALPHA_FS/targetarch/alpha_memory.hh
echo '#include "arch/alpha/byte_swap.hh"' > ALPHA_FS/targetarch/byte_swap.hh
python m5/base/traceflags.py ALPHA_FS/base/traceflags
g++ -pipe -fno-strict-aliasing -Wall -Wno-sign-compare -Werror -Wundef -g -gstabs+ -O0 -DFULL_SYSTEM -DUSE_MYSQL -DSTATS_BINNING -DDEBUG -Iext/dnet -I/usr/local/include/mysql -I/usr/include/mysql -IALPHA_FS -Im5 -c -o ALPHA_FS/arch/alpha/decoder.do ALPHA_FS/arch/alpha/decoder.cc
...
Running Simulations
  Usage:
  m5.debug [-d <dir>] [-E <var>[=<val>]] [-I <dir>] [-P <python>]
           [--<var>=<val>] <config file>

    -d            set the output directory to <dir>
    -E            set the environment variable <var> to <val> (or 'True')
    -I            add the directory <dir> to python's path
    -P            execute <python> directly in the configuration
    --var=val     set the python variable <var> to '<val>'
    <configfile>  config file name (ends in .py)

  % ALPHA_FS/m5.debug -d output -ETEST=SPECWEB configs/fullsys/run.py

  % ALPHA_FS/m5.debug -d output -EDUMPFILE=ethertrace -ETEST=NETPERF_STREAM configs/fullsys/run.py --Trace.flags="EthernetAll"

  % ALPHA_SE/m5.opt -d output m5-test/test1/run.py
% ~/m5/build/ALPHA_FS/m5.debug -d output ~/m5/configs/fullsys/run.py
M5 Simulator System
Copyright (c) 2001-2005
The Regents of The University of Michigan
All Rights Reserved

This code is part of the M5 simulator, developed by Nathan Binkert,
Erik Hallnor, Steve Raasch, and Steve Reinhardt, with contributions
from Ron Dreslinski, Dave Greene, Lisa Hsu, Ali Saidi, and Andrew
Schultz.

M5 compiled on May 31 2005 11:43:24
M5 executing on ziff.eecs.umich.edu
M5 simulation started Tue May 31 11:45:02 2005
Listening for console connection on port 3456
0: system.tsunami.io: Real-time clock set to Sun Jan 1 00:00:00 2006
command line: /n/ziff/z/binkertn/build/head/ALPHA_FS/m5.debug -d output /n/ziff/z/binkertn/research/m5/head/configs/fullsys/run.py

Listening for remote gdb connection on port 7000
warn: Entering event queue. Starting simulation...
m5term
  Allows the user to connect to the simulated console interface

  % cd m5
  % cd util/term
  % make
  gcc -o m5term term.c
  % make install
  sudo install -o root -m 555 m5term /usr/local/bin
% m5term localhost 3456
==== m5 slave console: Console 0 ====
M5 console
Got Configuration 127
memsize 8000000 pages 4000
First free page after ROM 0xFFFFFC0000018000
HWRPB 0xFFFFFC0000018000 l1pt 0xFFFFFC0000040000 l2pt 0xFFFFFC0000042000 l3pt_rpb 0xFFFFFC0000044000 l3pt_kernel 0xFFFFFC0000048000 l2reserv 0xFFFFFC0000046000
CPU Clock at 2000 MHz IntrClockFrequency=1024
Booting with 1 processor(s)
......
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 480k freed
init started: BusyBox v1.00-rc2 (2004.11.18-16:22+0000) multi-call binary

PTXdist-0.7.0 (2004-11-18T11:23:40-0500)

mounting filesystems...
EXT2-fs warning: checktime reached, running e2fsck is recommended
loading script...
Script from M5 readfile is empty, starting bash shell...
# ls
benchmarks  etc     lib         mnt      sbin  usr
bin         floppy  lost+found  modules  sys   var
dev         home    man         proc     tmp   z
#
Configuration Files
  Python
    Config objs mapped to simulator objs
  No need for scripts to generate configs
    All logic for running many simulations contained in a single set of configurable config files!
  Pass parameters via environment vars
    -E<var>[=<val>]
  Variables with units are enforced
    Latency must be '2ns', not simply 2
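The unit-enforcement rule can be pictured with a small sketch. This is not M5's parameter code: the function name parse_latency, the accepted unit list, and the choice of one tick per picosecond are all assumptions made for the example.

```python
import re

# Assumed resolution for this sketch: 1 tick = 1 picosecond.
_TICKS_PER_UNIT = {"ps": 1, "ns": 1000, "us": 1000000, "ms": 1000000000}
_LATENCY_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(ps|ns|us|ms)\s*")


def parse_latency(value):
    """Convert a latency string such as '2ns' to ticks; reject unitless values."""
    if not isinstance(value, str):
        raise TypeError("latency needs a unit, e.g. '2ns', got %r" % (value,))
    match = _LATENCY_RE.fullmatch(value)
    if match is None:
        raise ValueError("cannot parse latency %r" % (value,))
    number, unit = match.groups()
    return round(float(number) * _TICKS_PER_UNIT[unit])


cache_latency = parse_latency("2ns")
memory_latency = parse_latency("40ns")
```

Rejecting a bare 2 at configuration time avoids the silent ambiguity between "2 cycles", "2 ns", and "2 ticks".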
class DCache(BaseCache):
    latency = 3 * Parent.clock.period
    size = '32kB'
    mshrs = 32

class CPU(SimpleCPU):
    dcache = DCache(in_bus=NULL, out_bus=Parent.membus)
    icache = ICache(in_bus=NULL, out_bus=Parent.membus)

class System(LinuxSystem):
    cpu = CPU()
    membus = Bus(width=16, clock='400MHz')
    ram = BaseMemory(in_bus=Parent.membus, latency='40ns',
                     addr_range=[ Parent.physmem.range ])
    physmem = PhysicalMemory(range=AddrRange('128MB'))
    tsunami = Tsunami()
    simple_disk = SimpleDisk(disk=Parent.tsunami.disk0.image)
    sim_console = SimConsole(listener=ConsoleListener(port=3456))
    kernel = '/dist/m5/system/binaries/vmlinux-latest'
    pal = '/dist/m5/system/binaries/ts_osfpal'
    console = '/dist/m5/system/binaries/console_ts'
    boot_osflags = 'root=/dev/hda1 console=ttyS0'

root = Root(clock='2GHz')
root.client = System(readfile='/dist/m5/system/boot/netperf-stream-client.rcS')
root.server = System(readfile='/dist/m5/system/boot/netperf-server.rcS')
root.etherlink = EtherLink(int1 = Parent.server.tsunami.etherint[0],
                           int2 = Parent.client.tsunami.etherint[0])
Using Configurations
  Language is semi-declarative
    Order on the command line may matter

  % m5 -d output configs/tutorial/fullsys.py --DCache.size='64kB' --EtherLink.speed='10Gbps' --Root.clock='4GHz'
Output Files
  Current directory or -d <dir>
    config.py, config.ini, config.out
    console.<system>.sim_console
    output
    stats.txt
    cpt.<number>/
  Database output
    M5 can output to a MySQL database
Checkpointing
  --Serialize.cycle=<start cycle>
  --Serialize.period=<repeat interval>
  --Serialize.count=<# of checkpoints>
  M5 instruction
    Insert special instruction into code to trigger a checkpoint to be dropped
    Our benchmarks do this
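One way to read the three Serialize parameters is as a simple arithmetic schedule. The sketch below assumes the first checkpoint is taken at the start cycle and that the count includes it; the slide does not spell out either detail, and checkpoint_cycles is a hypothetical helper, not an M5 function.

```python
def checkpoint_cycles(start_cycle, period, count):
    """Cycles at which checkpoints would be dropped under the assumed semantics."""
    if period <= 0 or count < 0:
        raise ValueError("period must be positive and count non-negative")
    return [start_cycle + i * period for i in range(count)]


# Hypothetical settings: start at cycle 1000, repeat every 500 cycles, 3 checkpoints.
schedule = checkpoint_cycles(start_cycle=1000, period=500, count=3)
```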
Starting From a Checkpoint
  Same configuration as normal except you add:
    --Root.checkpoint=<path>/cpt.<number>
  Checkpoints must be regenerated after some config changes
    Most config changes that are architecturally visible (because the kernel may have behaved differently)
    E.g., physical memory size, new kernels
Sampling & Warm-up
  M5 can
    dump statistics many times
    aggregate statistics based on some event
      (keep stats according to kernel mode or user mode)
  Switch between CPU configurations
    Functional CPU and Detailed CPU
  Warm up caches in a functional CPU, do measurements in a detailed CPU
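The effect of warming caches before measuring can be shown with a toy model. Everything here is hypothetical (a direct-mapped cache, made-up latencies, and the helper names run_functional and run_detailed); it illustrates the idea rather than M5's sampling mechanism.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only tags and miss counts."""

    def __init__(self, num_lines, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_size
        index = block % self.num_lines
        if self.tags[index] == block:
            return True  # hit
        self.tags[index] = block  # fill the line on a miss
        self.misses += 1
        return False

    def reset_stats(self):
        self.misses = 0


def run_functional(cache, addrs):
    # Functional phase: updates cache state, produces no timing.
    for addr in addrs:
        cache.access(addr)


def run_detailed(cache, addrs, hit_latency=1, miss_latency=20):
    # Detailed phase: charges a latency per access and returns total cycles.
    return sum(hit_latency if cache.access(a) else miss_latency for a in addrs)


trace = [i * 64 for i in range(8)] * 2  # 8 distinct lines, each touched twice

cold = DirectMappedCache(num_lines=16)
cold_cycles = run_detailed(cold, trace)

warm = DirectMappedCache(num_lines=16)
run_functional(warm, trace)  # warm-up
warm.reset_stats()           # measure only the detailed phase
warm_cycles = run_detailed(warm, trace)
```

The cold run pays for compulsory misses that a long-running workload would already have absorbed, which is why the measured phase starts only after the functional pass.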
Full System Workloads
Lisa Hsu
Basic Operation
  Disk images
    Raw copies of Linux disk image
    Binaries to be run must be present on image
  rcS files (m5/configs/boot/*.rcS)
    Exactly like normal boot scripts
    Use them to start running a binary on the disk image, configure ethernet interfaces, etc.
    Can also execute m5 instructions
    Nice and flexible, since not compiled in
    Specified in configuration by readfile='path/to/script.rcS'
See for yourself!
  Listing the root directory (/) of the disk image with ls shows:

  benchmarks  etc     lib         mnt      sbin  usr
  bin         floppy  lost+found  modules  sys   var
  dev         home    man         proc     tmp   z

  Snippet of .rcS file:

  echo -n "setting up network..."
  /sbin/ifconfig eth0 192.168.0.10 txqueuelen 1000
  /sbin/ifconfig lo 127.0.0.1
  echo -n "running surge client..."
  /bin/bash -c "cd /benchmarks/surge && ./Surge 2 100 1 192.168.0.1 5"
  echo -n "halting machine"
  m5 exit
Up and Running Benchmarks
  All networking focused
    SpecWEB99
    Netperf
      stream: a transmit benchmark
      maerts: a receive benchmark
  In progress:
    NFS (server works; client tuning needed)
    iSCSI
    video streaming
Adding Your Own Benchmarks
  Highly encouraged! ☺
    Please share them with others!
  Since M5 is Alpha targeted, need to compile Alpha binaries
    Cross-compiler can be downloaded from www.kegel.com/crosstool
    Or, if you have an Alpha, use that
  Add the benchmark binaries to disk image
  Create .rcS file that executes the binary
Mounting Disk Images
  To change something on a disk image (e.g., add a benchmark binary), mount it as root:
    mount -o loop,offset=32256 myimage.img /mnt/point
  You can then manipulate the file system directly and copy in binaries
  Don't forget to unmount!
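The offset 32256 equals 63 sectors of 512 bytes, the start of the first partition in a traditional MBR layout. The sketch below assumes the disk image uses such an MBR partition table and reads the first entry's starting sector to compute the offset. The helper name first_partition_offset is hypothetical, and the example runs on a synthetic boot sector built in memory rather than a real image.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446  # first of four 16-byte MBR partition entries
START_LBA_FIELD = 8           # little-endian 32-bit start sector within an entry


def first_partition_offset(boot_sector, sector_size=SECTOR_SIZE):
    """Byte offset of the first MBR partition, suitable for mount's offset= option."""
    if len(boot_sector) < 512 or boot_sector[510:512] != b"\x55\xaa":
        raise ValueError("not an MBR boot sector")
    (start_lba,) = struct.unpack_from(
        "<I", boot_sector, PARTITION_TABLE_OFFSET + START_LBA_FIELD
    )
    return start_lba * sector_size


# Synthetic boot sector whose first partition starts at sector 63.
mbr = bytearray(512)
struct.pack_into("<I", mbr, PARTITION_TABLE_OFFSET + START_LBA_FIELD, 63)
mbr[510:512] = b"\x55\xaa"
offset = first_partition_offset(bytes(mbr))
```

For a real image, the first 512 bytes of the file would be passed in place of the synthetic sector; if the image is laid out differently, the offset in the mount command changes accordingly.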
Example
  New benchmark: mybench
  Compile and put it in disk image:
    cp mybench /mnt/point/benchmarks/mybench
  Create .rcS files:
    #!/bin/sh
    ifconfig eth0 192.168.0.1
    echo "executing mybench"
    eval /benchmarks/mybench
CPU Models
Kevin Lim
CPU Models
  Models:
    Simple CPU
    Detailed CPU
  Key classes:
    StaticInst: decoded instruction
    ExecContext: execution context
Simple CPU Model
  m5/cpu/simple/simple_cpu.{hh,cc}
  Uses of the SimpleCPU:
    Warming up caches
    Driving systems that do not require detailed modeling
  Ideal starting point to learn how CPU models work within M5
    Simple overview of fetching, executing, and retiring instructions
    Handles all the calls to support full system mode
Simple CPU Details
  Simulates an in-order 1 CPI machine
    Stalls on I or D cache misses
    Single threaded
    Can roughly model a superscalar machine by ticking multiple times
  Can be extended to simple pipeline
    Add in functional units with latencies
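The timing behaviour described above reduces to a short cycle-count formula: one cycle per instruction plus a penalty for every instruction-cache or data-cache miss. The sketch below is a back-of-the-envelope model, not SimpleCPU's code. The function name, the fixed miss penalty, and the width parameter (a crude stand-in for "ticking multiple times") are assumptions.

```python
def simple_cpu_cycles(instructions, miss_penalty=10, width=1):
    """Estimate cycles for an in-order, 1-CPI machine that stalls on cache misses.

    instructions: list of (icache_hit, dcache_hit) pairs, where dcache_hit is
    None for instructions that do not access data memory.
    """
    base = -(-len(instructions) // width)  # ceiling division: `width` insts per cycle
    stalls = 0
    for icache_hit, dcache_hit in instructions:
        if not icache_hit:
            stalls += miss_penalty
        if dcache_hit is False:
            stalls += miss_penalty
    return base + stalls


# Four instructions: one I-cache miss and one D-cache miss.
trace = [(True, None), (False, None), (True, True), (True, False)]
scalar_cycles = simple_cpu_cycles(trace)         # 4 base cycles + 2 * 10 stall cycles
wide_cycles = simple_cpu_cycles(trace, width=2)  # 2 base cycles + 2 * 10 stall cycles
```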
Functionality
  Execution driven
    Tied closely with ExecContext class
  Works in full system and syscall emulation modes
    Handles interrupts, traps, etc.
25

Detailed CPU Model
m5/cpu/o3/*, m5/encumbered/cpu/full
- Currently two detailed models within M5
  - Old model, distantly based on sim-ooo
    - Used for SMT and full system mode, but will eventually be phased out
  - New model
    - Will be the focus of this section
    - More closely couples execution and timing
    - Uses template policies

Timing accuracy
- Previous model executes instructions at fetch
  - Feeds instruction to timing backend
- New model executes at execute, modeling the timing for each pipeline stage
  - Important for coherence
  - UP and MP system studies
- Forces both timing and execution to be accurate

Code Hierarchy
- ISA-specific CPU code: AlphaFullCPU
- ISA-independent high-level CPU code: FullCPU
- ISA-independent low-level CPU code: Fetch, Decode, Rename, etc.

Pipeline layout
- OoO pipeline, loosely based on Alpha 21264
- Low level structure:
[Figure: pipeline stages Fetch, Decode, Rename, Issue/Execute/Writeback, Commit, plus backwards communication between stages. Red is a time buffer.]

Time Buffers
- Similar to queues
- Are tick()'d each CPU cycle
- Each pipeline stage places information into time buffer
- Next stage reads info out of time buffer at appropriate cycle
- Used for both forwards and backwards communication

Time Buffer Use
- Time buffer class is templated
  - Its template parameter is the communication struct between stages
- Stages must communicate to each other via the time buffer
  - Avoids unrealistic interaction between pipeline stages
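
The time buffer idea can be sketched in a few lines of Python. M5's real class is a C++ template; the class below, its method names, and the fixed two-cycle delay are assumptions chosen only to show how a value written in one cycle becomes visible to the next stage later.

```python
from collections import deque


class TimeBuffer:
    """Toy time buffer: a value written now is readable `delay` ticks later."""

    def __init__(self, delay):
        self.delay = delay
        # slots[0] is this cycle's slot; slots[delay] was written `delay` ticks ago.
        self.slots = deque([None] * (delay + 1), maxlen=delay + 1)

    def write(self, value):
        self.slots[0] = value

    def read(self):
        return self.slots[self.delay]

    def tick(self):
        # Open a fresh slot for the next cycle; maxlen drops the oldest slot.
        self.slots.appendleft(None)


# "Fetch" writes one instruction per cycle; "decode" reads two cycles behind.
fetch_to_decode = TimeBuffer(delay=2)
seen = []
for inst in ["i0", "i1", "i2", None]:
    if inst is not None:
        fetch_to_decode.write(inst)
    seen.append(fetch_to_decode.read())
    fetch_to_decode.tick()
```

Because decode can only read what fetch wrote `delay` ticks earlier, one stage cannot react to another stage's same-cycle state, which is the unrealistic interaction the slide warns about.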

Template Policies
- Template policy classes used to define CPU policies
  - Gives full type information
  - Avoids virtual functions
- "Impl" class is passed in as template parameter to all classes
  - Impl defines all the important types, classes, pipeline stages, etc.

Impl Template Policy
- Impls are used to define specific CPU instances, down to the ISA
- To create different types of CPUs, create a new Impl and define all of the specific types
- *_impl.hh files due to using templates in our infrastructure
  - *.cc files have instantiations

Future Directions
- New model will eventually add full system support, SMT support
- Will include a checker/verifier at commit
- Also will abstract away ISA specific functions
  - Lower levels of code can be shared across platforms

StaticInst Class
m5/cpu/static_inst.{hh,cc}
- Represents a decoded instruction
  - Has classifications of the inst
  - Corresponds to the binary machine inst
  - Decoded once
  - Only has static information
- Has all the methods needed to execute an instruction
  - Tells which regs are source and dest

Individual Use of StaticInst
- Contains the execute() function
  - Specific instruction classes override this
  - Python generates execute() for all insts
- Templated on the ISA
  - Different ISAs can have specific instantiations of StaticInst

DynInst Class
m5/cpu/base_dyn_inst.{hh,cc}
- Dynamic version of StaticInst
  - Used for detailed CPU model
  - Holds PC, results, renamed regs, etc.
- Templated on CPU policy

ExecContext Class
m5/cpu/exec_context.{hh,cc}
- Represents total architectural state of a single thread in the system
  - PC, register values, memory, etc.
- Contains pointers to key classes
  - Everything needed to functionally execute an instruction
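
The split between a decoded instruction and the state it acts on can be sketched as follows. The classes are toy stand-ins that borrow the slide names; the tuple "encoding", the decode cache, and the single Add instruction are assumptions, not M5's generated code.

```python
class ExecContext:
    """Toy architectural state for one thread: a PC and a register file."""

    def __init__(self, num_regs=4):
        self.pc = 0
        self.regs = [0] * num_regs


class StaticInst:
    """Static, decode-time facts about one machine instruction."""

    def __init__(self, src_regs, dest_reg):
        self.src_regs = src_regs
        self.dest_reg = dest_reg

    def execute(self, xc):
        raise NotImplementedError


class Add(StaticInst):
    def execute(self, xc):
        a, b = (xc.regs[r] for r in self.src_regs)
        xc.regs[self.dest_reg] = a + b
        xc.pc += 4   # assumes fixed 4-byte instructions


decode_cache = {}


def decode(machine_inst):
    """Decode each distinct machine instruction once and reuse the object."""
    if machine_inst not in decode_cache:
        opcode, ra, rb, rc = machine_inst   # hypothetical pre-split encoding
        if opcode != "add":
            raise ValueError(f"unknown opcode {opcode!r}")
        decode_cache[machine_inst] = Add((ra, rb), rc)
    return decode_cache[machine_inst]


xc = ExecContext()
xc.regs[1], xc.regs[2] = 2, 3
inst = decode(("add", 1, 2, 3))
inst.execute(xc)
```

The instruction object holds no per-execution state, so the same object can be executed against any number of contexts.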

ISA Details
- Currently has a few ISA specific details
- Future direction has state being removed from ExecContext, and ExecContext serving mainly as an interface

Break
3:30 - 4:00
Feel free to ask us questions!

Memory System
Ron Dreslinski

Outline
- Overview
- Caches
- Coherence Support
- Buses
- Bus Interfaces
- Bus Bridges
- Walkthrough of Memory Call Graph

Syscall Emulation Memory
[Figure: four CPUs, each with IL1 and DL1 caches; a bus; an L2; a second bus; a Simple Memory Bank; and Main Memory, annotated "Reads for Data in Cache".]

Full System Memory
[Figure: two CPUs, each with IL1 and DL1 caches; an L2; two bus bridges (BB); a Simple Memory Bank; an I/O Device; a Memory Controller; and Physical Memory.]

Overview
- Two versions of the memory system
  - Separate timing / functional models
    - Capability to have data in the timing caches
    - Memory tester object
  - New unified timing/functional access model
    - Works with simple CPU and exec-in-exec CPU
    - Doesn't support older full CPU model (execute at fetch)

Overview Cont.
- Supports atomic and event driven accesses
- Atomic model
  - Each memory transaction is independent
  - Each transaction is followed completely through the memory system
  - Speeds simulation, useful when timing not important (miss stream)
- Event driven model
  - Events generated for each transaction as it traverses the hierarchy
  - Multiple transactions interact in the memory system
  - Useful when timing information is important, or MP system interaction is desired

Functional Memory
m5/mem/functional/*
- There are several classes derived from functional memory including:
  - Full system
    - Physical memory
    - Memory controller
  - Syscall emulation
    - Main memory

Timing Memory
m5/mem/timing/*
- Simple memory bank
- Configurable parameters:
  - Address range
  - Latencies
  - Snarf updates
  - Do writes (used for memory tester)

MemReq Objects
m5/mem/mem_req.{cc,hh}
- All memory interactions are described in terms of a memory request (MemReq) object
- Encapsulates all relevant information
  - Virtual and physical address
  - Request size
  - Requesting device
  - Etc.
- Makes memory system independent

MemReq Flow
- CPU or device generates a MemReq
- MemReq is passed to the proper bus/cache
- Each cache creates a new MemReq to pass through the hierarchy, holding on to the request it needs to respond to
- Eventually the data requested is reached and the responses are processed
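
The request flow above can be sketched with toy objects. Everything here is illustrative: the class fields, the 64-byte block size, and the atomic-style call chain are assumptions, and the classes only borrow the slide names.

```python
class MemReq:
    """Toy memory request: address, size, and who issued it."""

    def __init__(self, addr, size, requester):
        self.addr, self.size, self.requester = addr, size, requester


class Memory:
    def __init__(self):
        self.seen = []   # (requester, addr, size) for every request that reached memory

    def access(self, req):
        self.seen.append((req.requester, req.addr, req.size))
        return f"block@{req.addr:#x}"


class Cache:
    def __init__(self, name, next_level, block_size=64):
        self.name, self.next_level, self.block_size = name, next_level, block_size
        self.blocks = {}

    def access(self, req):
        block_addr = req.addr - req.addr % self.block_size
        if block_addr not in self.blocks:
            # Miss: keep `req` pending and send a new block-sized request downwards.
            fill = MemReq(block_addr, self.block_size, requester=self.name)
            self.blocks[block_addr] = self.next_level.access(fill)
        return self.blocks[block_addr]   # respond to the original request


mem = Memory()
l2 = Cache("l2", mem)
l1 = Cache("l1", l2)
first = l1.access(MemReq(0x1004, 4, "cpu0"))    # misses in both caches
second = l1.access(MemReq(0x1008, 4, "cpu0"))   # hits in l1
```

Memory only ever sees the request created by the L2, never the CPU's original 4-byte request, which is the hand-off the slide describes.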

Caches
m5/mem/cache/*
- Templated on:
  - Cache tags (LRU, etc.)
  - Buffer type (blocking, MSHR)
  - Coherence protocol (uni, MSI, etc.)
[Figure: a Cache composed of Tags, Miss Queue, and Coherence Protocol.]

Main Cache Parameters
- Size
- Associativity
- Block Size
- Latency
- Trace address

Cache Tags
m5/mem/cache/tags/*
- Holds blocks and block state
- Contains the replacement policy
  - Optimal
  - LRU
  - Generational (IIC)
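
As an illustration of a tag store that owns its replacement policy, here is a toy set-associative LRU tag array. The class name, constructor parameters, and hit/miss return value are assumptions for this sketch and do not mirror M5's tag classes.

```python
from collections import OrderedDict


class LRUTags:
    """Toy set-associative tag store with LRU replacement."""

    def __init__(self, num_sets, assoc, block_size):
        self.num_sets, self.assoc, self.block_size = num_sets, assoc, block_size
        # One ordered map per set: least recently used tag first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr):
        """Return True on a hit; on a miss, fill the block, evicting the LRU one."""
        block = addr // self.block_size
        index, tag = block % self.num_sets, block // self.num_sets
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)      # now the most recently used
            return True
        if len(ways) == self.assoc:
            ways.popitem(last=False)   # evict the least recently used tag
        ways[tag] = "valid"
        return False


# One set, two ways: the third distinct block evicts the least recently used.
tags = LRUTags(num_sets=1, assoc=2, block_size=64)
results = [tags.access(a) for a in (0, 64, 0, 128, 0, 64)]
```

Swapping the eviction line for a different choice changes the policy without touching the lookup, which is why the replacement policy lives with the tags.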

Miss Queue
m5/mem/cache/miss/*
- Blocking buffer
  - Used to simulate a blocking cache
  - Speeds simulation in atomic bus model
- MSHR
  - Blocks when miss or writeback queue is full
  - Configurable parameters
    - Size of miss and writeback queues
    - Number of targets per MSHR
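
The two MSHR parameters (queue size and targets per MSHR) can be illustrated with a toy miss queue. The class and method names are hypothetical, writebacks are left out, and "blocking" is reduced to returning False.

```python
class MSHRQueue:
    """Toy miss queue: one MSHR per outstanding block, several targets each."""

    def __init__(self, num_mshrs, targets_per_mshr):
        self.num_mshrs = num_mshrs
        self.targets_per_mshr = targets_per_mshr
        self.mshrs = {}   # block address -> list of waiting requesters

    def handle_miss(self, block_addr, requester):
        """Record a miss; return False if the cache has to block instead."""
        targets = self.mshrs.get(block_addr)
        if targets is not None:
            if len(targets) == self.targets_per_mshr:
                return False            # no room for another target
            targets.append(requester)   # merge with the outstanding miss
            return True
        if len(self.mshrs) == self.num_mshrs:
            return False                # every MSHR is in use
        self.mshrs[block_addr] = [requester]
        return True

    def handle_response(self, block_addr):
        """The fill arrived: free the MSHR and return everyone waiting on it."""
        return self.mshrs.pop(block_addr)


q = MSHRQueue(num_mshrs=2, targets_per_mshr=2)
accepted = [
    q.handle_miss(0x40, "ld1"),   # allocates an MSHR
    q.handle_miss(0x40, "ld2"),   # merges as a second target
    q.handle_miss(0x40, "ld3"),   # target list full
    q.handle_miss(0x80, "ld4"),   # allocates the second MSHR
    q.handle_miss(0xC0, "ld5"),   # no MSHR left
]
woken = q.handle_response(0x40)
```

A blocking buffer corresponds roughly to the degenerate case of one MSHR with one target.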

Coherence Support
m5/mem/cache/coherence/*
- Uni coherence model
  - Single processor
  - Handles DMA invalidate forwarding up the cache hierarchy
    - CSHRs
    - Additional cache functions to parallel getMemReq, etc., but in the opposite direction
- MOESI based model
  - Also has CSHR support for invalidates

MOESI Bus Coherence
- Configurable parameters:
  - Protocol type (MSI, MOSI, MESI, MOESI)
  - Upgrades
- Support for memory to snarf updates
  - Memory object must be the slave device on the coherent bus

Coherence Limitations
- Currently only a single level coherence model exists
  - Coherent bus must be bus closest to CPU
  - Only one coherent bus allowed in a system
  - Only L1s allowed to be above coherence bus (doesn't forward snoops up the hierarchy)
  - Other levels of cache should be uni-coherent
- Coherence works in event driven mode if:
  - Ownership protocol is used or next level snarfs updates

Buses
m5/mem/bus/*
- Support atomic or split-transaction
  - All timing in event driven mode done in bus
- Have separate address and data busses and an arbiter to determine events
- Configurable parameters:
  - Width
  - Clock rate

Bus Interfaces
- Generic interface between bus and memory objects
- Caches have master and slave interfaces
[Figure: three CPUs, each attached to a Cache that has a Slave Interface and a Master Interface; the caches share a BUS, which also has a further Slave Interface attached.]

Bus Interfaces Cont.
- Master interface
  - Connected closer to memory
  - Initiates bus transactions on cache misses
  - Has snoop path to forward address bus requests to the cache
  - Responds with the data when snoops need to supply
- Slave interface
  - Connected closer to the CPU
  - Initiates bus transactions on coherence (CSHR) requests
  - Responds with the data, if a snoop didn't supply it

Bus Bridges
- Simple objects to connect two different speed busses
- Queues requests coming from either side and forwards them out the other when the arbiter grants the bus

Memory Flow

[Figure: two Caches, each with Tags, MissQ, Coherence, a Master Interface, and a Slave Interface, attached to a Bus with a further Slave Interface; paths are labeled MISS and HIT.]
Call sequence, in the order shown on the slide:
- SlaveInterface::access() (initiate request)
- Cache::access()
- Tags::handleAccess()
- MissQueue::handleMiss()
- MasterInterface::request()
- Bus::requestAddr()
- Bus::arbitrateAddr()
- MasterInterface::grantAddr()
- Cache::getMemReq()
- MissQueue::getMemReq()
- CoherenceProtocol::getBusCmd()
- MissQueue::markInService()
- Bus::sendAddr()
- MasterInterface::access() (snoop call)
- Cache::snoop()
- CoherenceProtocol::handleBusRequest()
- Optionally calls MasterInterface::respond() (if supplying data)
- SNOOPS ALL CACHES
- SlaveInterface::access()
- Continue till response
- SlaveInterface::respond()
- Bus::requestData() (snoop supplying continues here)
- Bus::arbitrateDataBus()
- SlaveInterface::grantData()
- Bus::sendData()
- MasterInterface::deliver()
- Cache::handleResponse()
- Tags::handleFill()
- MissQueue::handleResponse()
- SlaveInterface::respond()

I/O Models
Ali Saidi and Lisa Hsu

Overview
- Device Basics
- Miscellaneous Devices
- Disk Model
- Network Model

Device Basics
dev/*
- All based on FunctionalMemory
  - PioDevice: contains a pointer to the platform and pioInterface
  - DMADevice: additionally contains a pointer to dmaInterface
  - PCIDev: PCI configuration space, PCI interrupt handling
- Each device is sensitive to one or more address ranges (base + size)
  - Functional access with MemoryController::add_child()
  - Timing access with PioInterface::addAddrRange()
- A device implements a Read() and Write() for basic PIO reads and writes
- DMAInterface::doDMA() for DMA reads/writes
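
The base-plus-size address ranges can be illustrated with a toy dispatcher. The classes below are stand-ins: the `add_child(base, size, device)` signature, the offset-based `read`/`write`, and the example addresses are assumptions, not the signatures of M5's MemoryController or device classes.

```python
class ToyDevice:
    """Toy PIO device: a bag of registers addressed by offset."""

    def __init__(self, name):
        self.name = name
        self.regs = {}

    def read(self, offset):
        return self.regs.get(offset, 0)

    def write(self, offset, value):
        self.regs[offset] = value


class ToyMemoryController:
    """Routes each access to the device whose address range contains it."""

    def __init__(self):
        self.children = []   # (base, size, device)

    def add_child(self, base, size, device):
        self.children.append((base, size, device))

    def find(self, addr):
        for base, size, device in self.children:
            if base <= addr < base + size:
                return device, addr - base
        raise LookupError(f"no device claims address {addr:#x}")

    def read(self, addr):
        device, offset = self.find(addr)
        return device.read(offset)

    def write(self, addr, value):
        device, offset = self.find(addr)
        device.write(offset, value)


mc = ToyMemoryController()
uart, rtc = ToyDevice("uart"), ToyDevice("rtc")
mc.add_child(0x3F8, 8, uart)   # hypothetical address ranges
mc.add_child(0x70, 2, rtc)
mc.write(0x3F9, 0x55)
value = mc.read(0x3F9)
```

An access that no range claims raises an error here; the BadDev device on the next slide plays a comparable role for ranges that should never be touched.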

Miscellaneous Devices
- AlphaConsole: back door into simulator for console code
- BadDev: device panics on access
- CowDiskImage: copy-on-write disk image
- PciConfigAll: PCI configuration space object
- PciDev: PCI device base class
- PacketFifo: FIFO for network packets
- SimConsole: device that provides the console
- Tsunami: the Tsunami platform that links together all its devices
- TsunamiCChip: Tsunami interrupt controller
- TsunamiPChip: Tsunami PCI interface
- TsunamiIO: legacy I/O devices (RTC, PIT, etc.)
- Uart: serial UART

PCI & Interrupts
- Tsunami platform
  - Four physical PCI slots
  - Only implement one PCI bus
    - Possible to implement second, just not done
- Devices need their own interrupt and device ID
  - No interrupt sharing allowed
  - Simulator will panic if detected
- PciConfigData
  - InterruptLine: needs to be different for each device (0x1c-0x1e)

Disk Interface
dev/ide_*
- Modeled as an IDE controller and separate IDE disk(s)
- Copy-on-write layer in between
- Simple timing support
- Emulates an Intel IDE 2-channel controller
  - Can connect 4 IDE devices

Disk Images
- Can be created with mkblankimage.sh
- Disk images are contiguous blocks created with dd
- Unlike a normal image it needs to contain a partition table
- Can be mounted with the loopback device

Specifying Disk Image

    class FilesetDisk(IdeDisk):
        raw_image = RawDiskImage(image_file = disk('fileset.img'), read_only=True)
        image = CowDiskImage(child = parent.raw_image, read_only=False)

    self.disk2 = LinuxSwapDisk(driveID='master')
    self.ide = IdeController(disks=[parent.disk0, parent.disk1, parent.disk2],
                             …
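
Why a read-only raw image is wrapped in a writable copy-on-write layer can be seen in a toy model. The classes below are not M5's RawDiskImage and CowDiskImage; the sector-list representation and method names are assumptions for illustration.

```python
class ToyRawImage:
    """Read-only backing image, modeled as a list of sectors."""

    def __init__(self, sectors):
        self.sectors = list(sectors)

    def read(self, n):
        return self.sectors[n]


class ToyCowImage:
    """Copy-on-write layer: writes go to an overlay, never to the child."""

    def __init__(self, child):
        self.child = child
        self.overlay = {}   # sector number -> modified contents

    def read(self, n):
        if n in self.overlay:
            return self.overlay[n]
        return self.child.read(n)

    def write(self, n, data):
        self.overlay[n] = data


raw = ToyRawImage([b"boot", b"root"])
cow = ToyCowImage(raw)
cow.write(1, b"new!")
```

Several simulated systems can share one backing image this way, since each keeps its modifications in its own overlay.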

NIC Device Model
m5/dev/ns_gige*
- National Semiconductor DP83820
  - Gigabit Ethernet Full Duplex PCI controller
  - Their spec is actually public
- Modeled Device Features/Components:
  - PCI bus interface
  - Device registers
  - Tx/Rx FIFOs
  - Buffer Management Scheme
  - Receive Packet Filtering Logic
  - Checksum Offloading

NIC Device Model cont.
- Unmodified Linux driver will run on our model (linux/drivers/net/ns83820.c)
- If Linux doesn't use it, we don't model it
  - Except for packet filtering, which is modeled
  - Not modeled:
    - Multiple Rx/Tx priority queues
    - Power management schemes
    - FIFO drain thresholds
  - But these are easy to add if desired

NIC Added Features
- Interrupt coalescing
  - Reduce interrupt load in a high bandwidth environment
  - Collects interrupts for intr_delay time before poking CPU
- Rx/Tx FIFO size
  - Parameters to define Rx/Tx FIFO sizes
    - rx_fifo_size, tx_fifo_size
- Set in Python config object file
  - m5/python/m5/objects/Ethernet.py
  - NSGigE object description
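
Interrupt coalescing can be sketched as a timer that the first pending event starts. Only the `intr_delay` name comes from the slide; the class, its methods, and the rule that the timer starts at the first event are assumptions of this toy model.

```python
class CoalescingNic:
    """Toy NIC that batches interrupt causes for `intr_delay` ticks."""

    def __init__(self, intr_delay):
        self.intr_delay = intr_delay
        self.pending = 0        # events waiting to be reported
        self.post_tick = None   # tick at which the CPU will be interrupted
        self.posted = []        # (tick, number of events) actually delivered

    def raise_event(self, now):
        self.pending += 1
        if self.post_tick is None:
            self.post_tick = now + self.intr_delay   # first event starts the timer

    def advance(self, now):
        """Deliver one interrupt covering all pending events once the delay expires."""
        if self.post_tick is not None and now >= self.post_tick:
            self.posted.append((self.post_tick, self.pending))
            self.pending, self.post_tick = 0, None


nic = CoalescingNic(intr_delay=100)
for tick in (0, 10, 50):
    nic.raise_event(tick)   # three packets arrive close together
nic.advance(99)             # too early: nothing delivered yet
early = list(nic.posted)
nic.advance(100)            # one interrupt covers all three events
nic.raise_event(120)
nic.advance(220)
```

Three packet events cost the CPU a single interrupt, at the price of up to `intr_delay` ticks of added latency.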

Linux Mods for NIC
- Actual device has bug that does not allow unaligned copies
  - Fixed that in the device, and removed the associated workaround in the Linux driver
- Checksum offloading didn't work in every case in Linux, so we patched it.

Etherlink
m5/dev/etherlink.{cc,hh}
m5/python/objects/Ethernet.py
- Configurable link
  - Link delay
  - Bandwidth
- Attached NICs
  - Connect any two NICs
- Packet dump
  - Dumps a pcap formatted Ethernet trace
    - Read with tcpdump, Ethereal
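
How the two link parameters combine into a delivery time can be sketched as below. The class, its parameter names, the tick units, and the choice to queue a packet behind one already on the wire are assumptions, not the Etherlink implementation.

```python
class ToyLink:
    """Toy one-direction link with a bandwidth limit and a fixed delay."""

    def __init__(self, ticks_per_byte, delay):
        self.ticks_per_byte = ticks_per_byte   # inverse bandwidth
        self.delay = delay                     # propagation delay in ticks
        self.busy_until = 0                    # tick at which the wire is free again

    def send(self, now, packet_bytes):
        """Return the tick at which the whole packet has arrived at the far NIC."""
        start = max(now, self.busy_until)      # wait for any packet still on the wire
        self.busy_until = start + packet_bytes * self.ticks_per_byte
        return self.busy_until + self.delay


link = ToyLink(ticks_per_byte=8, delay=1000)
first_arrival = link.send(now=0, packet_bytes=1500)
second_arrival = link.send(now=0, packet_bytes=64)   # queued behind the first packet
```

Bandwidth sets how long the link stays busy per packet, while the delay shifts every arrival by a constant.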

Extending M5
Steve Reinhardt
Nate Binkert

Extending M5
- Overview of M5 internals (Steve)
- Defining new objects (Steve)
- ISA Description Language (Steve)
- Debugging (Nate)
- Statistics (Nate)

Execution Process
m5/sim/main.cc
- Process command-line args
- Build configuration
- Unserialize from checkpoint, if any
- Set up statistics
- Start processing events from event queue
- On processing SimExitEvent:
  - Dump statistics
  - Exit

Configuration Processing
- Configuration script is a Python program
- Python "m5" module in m5/python/m5
  - Object descriptions in m5/python/m5/objects
- C++ forks Python interpreter (m5/python/pyconfig.cc)
  - Send it Python scripts/code from cmd line
- Python ends by printing final config
  - ini-file format for historical reasons (in config.ini)
  - Hope to get rid of this step soon

SimObject Construction
m5/sim/builder.*, m5/sim/configfile.*
- C++ builds ConfigHierarchy from .ini file
- Linking in a .o file automatically registers object in global table
  - Via REGISTER_SIM_OBJECT() macro
  - Magic of C++ global object constructors
- Two-pass initialization
  - First pass: walk tree, call builder "create" method
    - Defined by CREATE_SIM_OBJECT() macro
    - All parameter values available
    - Calls C++ constructor
  - Second pass: call init() methods
    - Guaranteed that all other objects are constructed
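
The registration table and the two passes can be sketched in Python. The decorator stands in for the registration macro, and the class names, the `type` key, and the `build` function are hypothetical; M5 itself does this in C++.

```python
REGISTRY = {}   # type name -> class, filled in as classes are defined


def register_sim_object(cls):
    """Class decorator standing in for a registration macro."""
    REGISTRY[cls.__name__] = cls
    return cls


@register_sim_object
class ToyCache:
    def __init__(self, name, size):
        self.name, self.size = name, size

    def init(self, objects):
        pass   # nothing to resolve


@register_sim_object
class ToyCpu:
    def __init__(self, name, cache):
        self.name = name
        self.cache_name = cache   # only a name so far; the object may not exist yet
        self.cache = None

    def init(self, objects):
        self.cache = objects[self.cache_name]   # safe: every object is constructed


def build(config):
    objects = {}
    for name, params in config.items():   # first pass: construct from parameters
        params = dict(params)
        cls = REGISTRY[params.pop("type")]
        objects[name] = cls(name, **params)
    for obj in objects.values():          # second pass: call init()
        obj.init(objects)
    return objects


config = {
    "cpu0": {"type": "ToyCpu", "cache": "l1"},   # refers to an object defined later
    "l1": {"type": "ToyCache", "size": 32768},
}
objects = build(config)
```

The CPU is constructed before the cache it names, so resolving the reference inside the constructor would fail; deferring it to `init()` is what the second pass buys.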

Memory State Initialization
- Side effect of object creation
- Process object for syscall emulation
  - LiveProcess for direct emulation (m5/sim/process.*)
  - EIOProcess for SimpleScalar EIO trace playback (m5/non-free/eio/*)
  - Auto-detects Tru64 vs Linux binaries
- System object for full-system mode
  - LinuxSystem, Tru64System (m5/kern/*)
  - Console/PAL code
  - Kernel image
- m5/base/loader/* has aout, ecoff, elf loader code, plus symbol table support

Serialization
m5/sim/serialize.{cc,hh}
- Create/restore state checkpoints
- Serializable is base of Event & SimObject
  - defines serialize(), unserialize() methods
  - override to save/restore object state
  - .ini-format text file
- If checkpoint is specified, M5 will call unserialize() on all objects after creation
  - State identified by object name (system0.cpu0)
  - OK if no checkpointed state (e.g., added cache)
  - Detailed CPU model can unserialize from SimpleCPU serialized state
- Common error: adding field to object and not updating serialize()/unserialize()
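
A toy version of name-keyed, ini-style checkpointing is sketched below. It uses Python's configparser as a stand-in for M5's own checkpoint code; the class, its two fields, and the section layout are assumptions.

```python
import configparser
import io


class ToySerializable:
    """Toy object whose state is saved under a section named after the object."""

    def __init__(self, name):
        self.name = name
        self.pc = 0
        self.num_insts = 0

    def serialize(self, cp):
        # Every field must appear here and in unserialize(); forgetting a newly
        # added field in either place is the common error the slide mentions.
        cp[self.name] = {"pc": str(self.pc), "num_insts": str(self.num_insts)}

    def unserialize(self, cp):
        if self.name not in cp:
            return   # no checkpointed state for this object: keep defaults
        self.pc = int(cp[self.name]["pc"])
        self.num_insts = int(cp[self.name]["num_insts"])


# Save one object to ini-format text.
cpu = ToySerializable("system0.cpu0")
cpu.pc, cpu.num_insts = 0x120, 7
checkpoint = configparser.ConfigParser()
cpu.serialize(checkpoint)
buf = io.StringIO()
checkpoint.write(buf)
text = buf.getvalue()

# Restore into freshly created objects, one of which was not in the checkpoint.
loaded = configparser.ConfigParser()
loaded.read_string(text)
restored = ToySerializable("system0.cpu0")
added = ToySerializable("system0.l2")
restored.unserialize(loaded)
added.unserialize(loaded)
```

Because state is looked up by object name, an object added after the checkpoint was taken simply starts from its defaults.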
Events
m5/sim/eventq.*
  Event object is abstract superclass
  Derive new subclass for specific event
    Add fields for event-specific data
    Override process() method for action
  schedule(Tick t) puts on event queue
  Events may be statically or dynamically allocated
    Setting AutoDelete flag will call delete after processing
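The event pattern above (abstract base class, subclass with its own fields, process() override, schedule() onto a queue) can be sketched as a small standalone model. This is Python written for illustration, not the code in m5/sim/eventq.*: the `EventQueue` internals and the `LogEvent` subclass are assumptions, and AutoDelete has no counterpart here because Python reclaims objects automatically.

```python
import heapq
import itertools


class EventQueue:
    """Minimal time-ordered queue (hypothetical, for illustration only)."""

    def __init__(self):
        self.cur_tick = 0
        self._heap = []
        self._order = itertools.count()   # keeps insertion order for equal ticks

    def insert(self, event):
        heapq.heappush(self._heap, (event.when, next(self._order), event))

    def run(self):
        # Pop events in tick order and let each one perform its action.
        while self._heap:
            tick, _, event = heapq.heappop(self._heap)
            self.cur_tick = tick
            event.process()


class Event:
    """Abstract superclass: subclasses override process()."""

    def __init__(self, queue):
        self.queue = queue
        self.when = None

    def schedule(self, tick):
        self.when = tick
        self.queue.insert(self)

    def process(self):
        raise NotImplementedError


class LogEvent(Event):
    """Event-specific data (log, message) lives in fields of the subclass."""

    def __init__(self, queue, log, message):
        super().__init__(queue)
        self.log = log
        self.message = message

    def process(self):
        self.log.append((self.when, self.message))


queue = EventQueue()
log = []
LogEvent(queue, log, "second").schedule(2000)
LogEvent(queue, log, "first").schedule(1000)
queue.run()
```

Events are processed in tick order regardless of the order in which they were scheduled, so `log` ends up with the tick-1000 entry first.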
Creating New SimObjects
  Derive C++ class from SimObject
    m5/sim/sim_object.{cc,hh}
  Define parameters in a .py file
    in m5/python/m5/objects directory
  C++ needs parameter/creation boilerplate
    Ugly macros in m5/sim/builder.hh
      {BEGIN,END}_{DECLARE,INIT}_SIM_OBJECT_PARAMS
      {CREATE,REGISTER}_SIM_OBJECT
    Plan to be cleaning this up before long...
ISA Description Language
arch/isa_parser.py, arch/alpha/isa_desc
  Custom domain-specific language
  Defines decoding & behavior of ISA
  Generates C++ code
    Scads of StaticInst subclasses
    decodeInst() function
      Maps machine inst. to StaticInst instance
    Multiple scads of execute() methods
      Cross-prod. of CPU models and StaticInst subclasses
Definitions etc.
  def bitfield OPCODE  <31:26>;
  def bitfield RA      <25:21>;
  def bitfield RB      <20:16>;
  def bitfield INTFUNC <11: 5>; // function code
  def bitfield RC      < 4: 0>; // dest reg
  def operands {{
      'Ra': IntRegOperandTraits('uq', 'RA', 'IsInteger', 1),
      'Rb': IntRegOperandTraits('uq', 'RB', 'IsInteger', 2),
      'Rc': IntRegOperandTraits('uq', 'RC', 'IsInteger', 3),
  }}
  def format LoadAddress(code) {{
      // Python code here...
  }}
  def format IntegerOperate(code) {{
      // Python code here...
  }}
Instruction Decode & Semantics
  decode OPCODE {
      format LoadAddress {
          0x08: lda({{ Ra = Rb + disp; }});
          0x09: ldah({{ Ra = Rb + (disp << 16); }});
      }
      format IntegerOperate {
          0x10: decode INTFUNC {
              0x00: addl({{ Rc.sl = Ra.sl + Rb_or_imm.sl; }});
              0x20: addq({{ Rc = Ra + Rb_or_imm; }});
              0x22: s4addq({{ Rc = (Ra << 2) + Rb_or_imm; }});
              0x32: s8addq({{ Rc = (Ra << 3) + Rb_or_imm; }});
              // etc.
          }
      }
      // etc.
  }
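To make the bitfield and decode declarations concrete, here is a rough hand-written equivalent for two of the instructions. It is a standalone Python sketch, not output of isa_parser.py. The bit ranges for OPCODE, RA, RB, INTFUNC and RC come from the definitions above; the 16-bit signed displacement in bits <15:0> is an assumption taken from the Alpha memory instruction format, only the register form of Rb_or_imm is handled, and the helper names (`bits`, `sign_extend`, `execute`) are made up for this example.

```python
MASK64 = (1 << 64) - 1


def bits(word, hi, lo):
    """Return bits <hi:lo> of word (inclusive) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value, width):
    """Interpret the low `width` bits of value as a two's-complement number."""
    sign = 1 << (width - 1)
    return (value & (sign - 1)) - (value & sign)


def execute(word, regs):
    """Decode one 32-bit instruction word, update regs, return the mnemonic."""
    opcode = bits(word, 31, 26)
    ra = bits(word, 25, 21)
    rb = bits(word, 20, 16)
    if opcode == 0x08:                      # lda: Ra = Rb + disp
        disp = sign_extend(bits(word, 15, 0), 16)   # assumed field position
        regs[ra] = (regs[rb] + disp) & MASK64
        return "lda"
    if opcode == 0x10:                      # integer operate group
        func = bits(word, 11, 5)
        rc = bits(word, 4, 0)
        if func == 0x20:                    # addq: Rc = Ra + Rb (register form)
            regs[rc] = (regs[ra] + regs[rb]) & MASK64
            return "addq"
    raise ValueError("instruction not covered by this sketch: %#010x" % word)


regs = [0] * 32
regs[2] = 100

# lda r1, 16(r2): OPCODE=0x08, RA=1, RB=2, disp=16
lda_word = (0x08 << 26) | (1 << 21) | (2 << 16) | 16
# addq r1, r2, r3: OPCODE=0x10, RA=1, RB=2, INTFUNC=0x20, RC=3
addq_word = (0x10 << 26) | (1 << 21) | (2 << 16) | (0x20 << 5) | 3

first = execute(lda_word, regs)
second = execute(addq_word, regs)
```

The nested `if` chain mirrors the nested `decode OPCODE { ... decode INTFUNC { ... } }` structure; the ISA description lets each of those branches be written as a single line.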
Key Features
  Very compact representation
    Most instructions take 1 line of C code
    2639 lines of isa_desc -> 39K lines of C++
      18K generic, 11K for each of 2 CPU models
  Characteristics auto-extracted from C
    source, dest regs; func unit class; etc.
  execute() code customized for CPU models
  Amazingly well documented (for us, anyway)
    See doxygen docs
Debugging M5
  DPRINTF and Tracing
  Instruction Tracing
    rundiff
  Using the Debugger
  Remote GDB
Tracing
m5/base/trace.{cc,hh} m5/base/traceflags.py
  printf is a nice debugging tool
  Keep good printfs for tracing
  Lots of debug output is a very good thing
  Add new flags to traceflags.py
    Individual flags in the baseFlags array
    Groups of flags in the compoundFlagMap dict
    Fetch, Decode, Ethernet, IPI, TLB, DMA, Bus, Cache, Loader, AlphaConsole, etc...
Tracing
m5/base/trace.{cc,hh} m5/base/traceflags.py
  DPRINTF(Flag, "normal printf %s\n", "arguments");
  Command line flags:
    m5.opt --Trace.flags="Space Separated List"
  From gdb:
    (gdb) call setTraceFlag("Flag")
    (gdb) call clearTraceFlag("Flag")
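The idea behind flag-gated tracing can be shown with a toy model: a message is printed only while its flag is enabled, and a group name expands to several individual flags. The sketch is standalone Python and does not reproduce m5/base/trace.{cc,hh} or traceflags.py; the flag subset, the `MemoryAll` group and every function name below are hypothetical.

```python
import io
import sys

# Individual flags (a subset of the names listed on the slide).
base_flags = ["Fetch", "Decode", "Bus", "Cache", "Loader"]

# Groups of flags; "MemoryAll" is an invented group for this example.
compound_flag_map = {"MemoryAll": ["Bus", "Cache"]}

enabled = set()


def expand(name):
    """Map a flag or group name to the list of individual flags it covers."""
    if name in compound_flag_map:
        return compound_flag_map[name]
    if name in base_flags:
        return [name]
    raise KeyError("unknown trace flag: " + name)


def set_trace_flag(name):
    enabled.update(expand(name))


def clear_trace_flag(name):
    enabled.difference_update(expand(name))


def dprintf(flag, fmt, *args, out=sys.stdout):
    """printf-style output that is emitted only while `flag` is enabled."""
    if flag in enabled:
        out.write(fmt % args)


trace = io.StringIO()
set_trace_flag("MemoryAll")
dprintf("Cache", "miss at %#x\n", 0x1000, out=trace)   # enabled via the group
dprintf("Fetch", "fetching pc %#x\n", 0x2000, out=trace)  # disabled: dropped
clear_trace_flag("Cache")
dprintf("Cache", "hit at %#x\n", 0x1000, out=trace)    # disabled again
```

Because disabled messages cost only a set lookup, trace statements can stay in the code permanently, which is the point of "keep good printfs for tracing".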
Instruction Tracing
  --Trace.flags="InstExec"
  --ExecutionTrace.speculative=True
    capture speculative instructions
  --ExecutionTrace.print_cycle=True
    print cycle number
Using GDB with M5
  % gdb m5/build/ALPHA_SE/m5.debug
  (gdb) run m5-test/test1/run.ini --debug:break_cycles="1000 2000"
  Starting program: /z/stever/bk/m5/build/ALPHA_SE/m5.debug m5-test/test1/run.ini --debug:break_cycles="1000 2000"
  Starting simulation...
  Program received signal SIGTRAP, Trace/breakpoint trap.
  0xffffe002 in ?? ()
  (gdb) p curTick
  $1 = 1000
  (gdb) c
  Program received signal SIGTRAP, Trace/breakpoint trap.
  0xffffe002 in ?? ()
  (gdb) p curTick
  $2 = 2000
  (gdb) call sched_break_cycle(3000)
  (gdb) c
  Program received signal SIGTRAP, Trace/breakpoint trap.
  0xffffe002 in ?? ()
  (gdb) p curTick
  $3 = 3000
  (gdb)
Remote Debugging
  % ~/m5/build/ALPHA_FS/m5.debug ~/m5/configs/fullsys/run.py
  M5 Simulator System
  Copyright (c) 2001-2005
  The Regents of The University of Michigan
  All Rights Reserved
  This code is part of the M5 simulator, developed by Nathan Binkert,
  Erik Hallnor, Steve Raasch, and Steve Reinhardt, with contributions
  from Ron Dreslinski, Dave Greene, Lisa Hsu, Ali Saidi, and Andrew
  Schultz.
  M5 compiled on May 31 2005 11:43:24
  M5 executing on ziff.eecs.umich.edu
  M5 simulation started Tue May 31 11:45:02 2005
  Listening for console connection on port 3456
  0: system.tsunami.io: Real-time clock set to Sun Jan 1 00:00:00 2006
  command line: /n/ziff/z/binkertn/build/head/ALPHA_FS/m5.debug /n/ziff/z/binkertn/research/m5/head/configs/fullsys/run.py
  Listening for remote gdb connection on port 7000
  warn: Entering event queue. Starting simulation...
Remote Debugging
  % gdb-linux-alpha arch/alpha/boot/vmlinux
  ... gdb banner ...
  This GDB was configured as "--host=i686-pc-linux-gnu --target=alpha-linux"... (no debugging symbols found)...
  (gdb) set remote Z-packet on    [ This can be put in .gdbinit ]
  (gdb) target remote ziff:7000
  Remote debugging using ziff:7000
  0xfffffc0000496844 in strcasecmp (a=0xfffffc0000b13a80 "", b=0x0)
      at arch/alpha/lib/strcasecmp.c:23
  23      } while (ca == cb && ca != '\0');
  (gdb)
The M5 Statistics Package
  Statistics types
    Scalar<>
    Vector<>
    Formula
    Distribution<>
    VectorDist<>
  M5 has phases: once it moves to the running phase, no new stats can be added
Statistics Example
.hh file
  class MySimObject : public SimObject
  {
    private:
      Stats::Scalar<> txBytes;        // running count of bytes sent
      Stats::Formula txBandwidth;     // derived from txBytes at output time
      Stats::Vector<> syscall;        // one counter per system call number
    public:
      void regStats();
  };
.cc file (regStats)
  void
  MySimObject::regStats()
  {
      txBytes
          .name(name() + ".txBytes")
          .desc("Bytes Transmitted")
          .prereq(txBytes);
      txBandwidth
          .name(name() + ".txBandwidth")
          .desc("Transmit Bandwidth (bits/s)")
          .precision(0)
          .prereq(txBytes);
      // bytes -> bits, divided by simulated time
      txBandwidth = txBytes * Stats::constant(8) / simSeconds;
      syscall
          .init(SystemCalls<Linux>::Number)
          .name(name() + ".syscall")
          .desc("number of syscalls executed")
          .flags(total | pdf | nozero | nonan);
  }
Statistics Output
  client.tsunami.etherdev.txBandwidth      4302720
  client.tsunami.etherdev.txBytes            13446
  server.tsunami.etherdev.txBandwidth   4684921600
  server.tsunami.etherdev.txBytes         14640380
  sim_seconds                             0.025000
  server.cpu.kern.syscall                      492
  server.cpu.kern.syscall_1                    189    38.41%    38.41%
  server.cpu.kern.syscall_2                    249    50.61%    89.02%
  server.cpu.kern.syscall_3                     54    10.98%   100.00%
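The numbers in this output can be checked by hand against the formula txBandwidth = txBytes * 8 / simSeconds and against the total/pdf flags on the syscall vector. The short Python sketch below redoes that arithmetic with the values copied from the output; the function names are made up, and the reading of the two percentage columns as per-entry share and running cumulative share is inferred from the figures themselves.

```python
sim_seconds = 0.025
tx_bytes = {"client": 13446, "server": 14640380}
syscall_counts = {1: 189, 2: 249, 3: 54}


def tx_bandwidth(num_bytes, seconds):
    """Bits per second: bytes * 8 / simulated seconds, printed with precision 0."""
    return round(num_bytes * 8 / seconds)


def pdf_rows(counts):
    """Return (total, rows); each row is (key, count, percent, cumulative percent)."""
    total = sum(counts.values())
    running = 0
    rows = []
    for key, count in counts.items():
        running += count
        rows.append((key, count,
                     round(100 * count / total, 2),
                     round(100 * running / total, 2)))
    return total, rows


client_bw = tx_bandwidth(tx_bytes["client"], sim_seconds)
server_bw = tx_bandwidth(tx_bytes["server"], sim_seconds)
total, rows = pdf_rows(syscall_counts)
```

Both bandwidth figures and all three percentage rows agree with the printed output, which confirms that the first percentage column is each entry's share of the total and the second is the running sum.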
Wrap-Up
  Steve Reinhardt
Thank You!
  We hope you found this tutorial useful
  We hope you find M5 useful too
  We'd love to work with you to make M5 even more useful to the community
  We value your feedback
    Please fill out a questionnaire
    Hand it to one of us
      On your way out, or during ISCA
Keep In Touch
  Come talk to us at ISCA
  Check http://m5.eecs.umich.edu for updates
  Use, subscribe to our mailing lists:
    [email protected]@lists.sourceforge.net
    m5sim-announce@[email protected]