10/7/05 IOSCA 2005 Tutorial
Using the M5 Simulator
Nathan Binkert and Ali Saidi
Thanks also to Ron Dreslinski, Lisa Hsu, Kevin Lim, Steve Raasch, Erik Hallnor, and Prof. Steve Reinhardt
This tutorial is for you
  Feel free to ask questions
We’ve got a lot to cover
  Lots of cool stuff didn’t even make the slides
  Don’t be offended if we have to move on
  Come talk to us later
What M5 is and is not
A brief peek inside
Current status & future developments
A tool for simulating systems
  Not just CPU cores: memory, I/O
  Not just SPEC apps: full OS code
  Not just single machines: client/server, etc.
A framework for event-driven simulation
  Events, objects, statistics, configuration
A collection of predefined object models
  CPUs, caches, busses, devices, etc.
Born of frustration with existing tools
  Did not do what we wanted
  Did not scale with added complexity
Desire to simulate TCP/IP performance
  Full-system support
  Multiple system simulation
Almost entirely original code
  Old CPU model based on SimpleScalar sim-outorder
  Full-system support used SimOS as reference
No premeditated distribution plans
  Just hacking together the system we wanted
What We Would Like M5 to Be
Something that spares you the pain we’ve been through
A community resource
  Modular enough to localize changes
  Contribute back, and spare others some pain
A path to reproducible/comparable results
  A common platform for evaluating ideas
Everything you care about is an object (C++/Python)
  Derived from SimObject base class
    Common code for creation, configuration parameters, naming, checkpointing, etc.
Uniform method-based APIs for object types
  CPUs, caches, memory, etc.
  Plug-compatibility across implementations
    Functional vs. detailed CPU
    Conventional vs. indirect-index cache
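The plug-compatibility idea can be illustrated with a small Python sketch. This is not M5 source code: the class names `BaseCPU`, `FunctionalCPU`, and `DetailedCPU`, the `run` method, and the `mem_latency` parameter are all hypothetical, chosen only to show two implementations sitting behind one method-based interface.

```python
class SimObject:
    """Toy base class holding the shared name/parameter handling (illustrative only)."""

    def __init__(self, name, **params):
        self.name = name
        self.params = dict(params)

    def describe(self):
        return "%s(%s)" % (type(self).__name__, self.name)


class BaseCPU(SimObject):
    """Hypothetical uniform API that every CPU model in this sketch implements."""

    def run(self, instructions):
        raise NotImplementedError


class FunctionalCPU(BaseCPU):
    def run(self, instructions):
        # No timing detail: charge one tick per instruction.
        return len(instructions)


class DetailedCPU(BaseCPU):
    def run(self, instructions):
        # Charge an assumed extra latency for memory operations.
        mem_latency = self.params.get("mem_latency", 3)
        return sum(mem_latency if op in ("load", "store") else 1
                   for op in instructions)


def simulate(cpu, instructions):
    # The caller relies only on the BaseCPU interface, so models are swappable.
    return cpu.describe(), cpu.run(instructions)


program = ["add", "load", "add", "store"]
print(simulate(FunctionalCPU("cpu0"), program))
print(simulate(DetailedCPU("cpu0", mem_latency=3), program))
```

Because `simulate` never looks at the concrete class, swapping the functional model for the detailed one changes only the line that constructs the object.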
Standard event queue timing model
  Global logical time (in “ticks”)
    No fixed relation to real time
  Objects schedule their own events
    Flexibility for detail vs. performance tradeoffs
    E.g., a CPU typically schedules an event every cycle
      Simple CPU won’t schedule itself if stalled/idle
      Can also schedule every nth cycle to model other ...
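A minimal sketch of the event-queue model described above, assuming a single global queue ordered by tick. `EventQueue` and `ToyCPU` are hypothetical names for illustration and do not reproduce M5's actual classes.

```python
import heapq
import itertools


class EventQueue:
    """Global logical time in ticks; events run in tick order."""

    def __init__(self):
        self.tick = 0
        self._heap = []
        self._order = itertools.count()  # tie-breaker for events on the same tick

    def schedule(self, when, callback):
        if when < self.tick:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._heap, (when, next(self._order), callback))

    def run(self, until):
        while self._heap and self._heap[0][0] <= until:
            when, _, callback = heapq.heappop(self._heap)
            self.tick = when
            callback()


class ToyCPU:
    """Schedules its own next event every `period` ticks while it has work."""

    def __init__(self, queue, period, work):
        self.queue = queue
        self.period = period
        self.remaining = work
        self.trace = []
        queue.schedule(queue.tick, self.cycle)

    def cycle(self):
        self.trace.append(self.queue.tick)
        self.remaining -= 1
        if self.remaining > 0:
            # An idle CPU simply stops rescheduling itself.
            self.queue.schedule(self.queue.tick + self.period, self.cycle)


queue = EventQueue()
fast = ToyCPU(queue, period=1, work=3)
slow = ToyCPU(queue, period=4, work=2)
queue.run(until=10)
print(fast.trace, slow.trace)
```

The `period` argument stands in for scheduling every nth cycle: the slow object touches the queue only once every four ticks, which costs less simulation work than ticking it every cycle.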
Syscall emulation mode
  Alpha Tru64 or Linux application binaries
  Host-based or SimpleScalar EIO traces
Full-system mode
  Models Compaq “Tsunami”-based system
    Boots Linux 2.4 & 2.6, FreeBSD and L4
    Ask us if you want Tru64
    Extensions for >4 CPUs
  Ethernet, IDE disk adapters
  Handful of pre-built benchmarks available
Finish new detailed OOO CPU model
  Add full system, SMT support
  Obsolete SimpleScalar-based model
Minor re-architecting of memory system
  Support non-bus interconnects, directory coherence
More ISAs
  PowerPC, ARM likely candidates
  Heterogeneous system support (?)
More full-system benchmarks
Better C++/Python integration
Platforms
  Linux, BSD, CYGWIN (most UNIX-like systems?)
    Linux is primary, others may take a tiny bit of work
  Little endian machines!
  64-bit machines help a lot
Tools
  GCC/G++ 3.0+
    Recently tested with 3.3, 3.4, 4.0
  Python 2.3+
  SCons (we use 0.96+)
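The minimum tool versions listed above can be checked mechanically. The sketch below compares dotted version strings as integer tuples; the `MINIMUMS` table is copied from the slide, while `parse_version` and `meets_minimum` are hypothetical helpers, not part of M5 or SCons. It assumes purely numeric version strings such as `3.4` or `0.96.1`.

```python
# Minimum versions as stated on the slide.
MINIMUMS = {"gcc": "3.0", "python": "2.3", "scons": "0.96"}


def parse_version(text):
    """Turn '0.96.1' into (0, 96, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(tool, installed):
    """True if the installed version is at least the slide's minimum."""
    return parse_version(installed) >= parse_version(MINIMUMS[tool])


for tool, version in [("gcc", "3.4"), ("python", "2.2"), ("scons", "0.96.1")]:
    print(tool, version, meets_minimum(tool, version))
```

Comparing tuples rather than strings avoids the usual pitfall where `"0.100"` sorts before `"0.96"` as text.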
% cd m5
% cd build
% scons ALPHA_FS/m5.opt
scons: Reading SConscript files ...
Checking for C header file fenv.h... yes
Compiling in ALPHA_FS with MySQL support.
scons: done reading SConscript files.
scons: Building targets ...
/z/saidi/work/m5/build/m5/arch/isa_parser.py m5/arch/alpha/isa_desc ALPHA_FS/arch/alpha arch/alpha
Generating ALPHA_FS/arch/alpha/decoder.hh
Generating ALPHA_FS/arch/alpha/decoder.cc
Generating ALPHA_FS/arch/alpha/simple_cpu_exec.cc
Generating ALPHA_FS/arch/alpha/fast_cpu_exec.cc
Generating ALPHA_FS/arch/alpha/full_cpu_exec.cc
Generating ALPHA_FS/arch/alpha/alpha_o3_exec.cc
Defining SS_COMPATIBLE_FP as 0 in ALPHA_FS/config/ss_compatible_fp.hh.
Defining USE_FENV as 1 in ALPHA_FS/config/use_fenv.hh.
echo '#include "arch/alpha/isa_traits.hh"' > ALPHA_FS/targetarch/isa_traits.hh
Defining FULL_SYSTEM as 1 in ALPHA_FS/config/full_system.hh.
echo '#include "arch/alpha/alpha_memory.hh"' > ALPHA_FS/targetarch/alpha_memory.hh
echo '#include "arch/alpha/byte_swap.hh"' > ALPHA_FS/targetarch/byte_swap.hh
Defining NO_FAST_ALLOC as 0 in ALPHA_FS/config/no_fast_alloc.hh.
Defining STATS_BINNING as 1 in ALPHA_FS/config/stats_binning.hh.
python m5/base/traceflags.py ALPHA_FS/base/traceflags
...
-d            set the output directory to <dir>
-E            set the environment variable <var> to <val> (or 'True')
-I            add the directory <dir> to python's path
-P            execute <python> directly in the configuration
--var=val     set the python variable <var> to '<val>'
<configfile>  config file name (ends in .py)
% ~/m5/build/ALPHA_FS/m5.debug -d output ~/m5/configs/fullsys/run.py
M5 Simulator System
Copyright (c) 2001-2005
The Regents of The University of Michigan
All Rights Reserved

This code is part of the M5 simulator, developed by Nathan Binkert,
Erik Hallnor, Steve Raasch, and Steve Reinhardt, with contributions
from Ron Dreslinski, Dave Greene, Lisa Hsu, Kevin Lim, Ali Saidi,
and Andrew Schultz.

M5 compiled on Oct 2 2005 22:16:33
M5 simulation started Sun Oct 2 22:17:40 2005
Listening for console connection on port 3456
0: system.tsunami.io.rtc: Real-time clock set to Sun Jan 1
Listening for remote gdb connection on port 7000
warn: Entering event queue. Starting simulation...
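When launching many runs from a script, it can help to assemble the command line programmatically. The sketch below builds an argument list from the options shown above (`-d`, `-E<var>[=<val>]`, `--var=val`, and the config file). `m5_command` is a hypothetical helper, the `-d <dir>` spacing follows the example invocation, and the variable names in the usage lines are made up for illustration. The sketch only builds the list and does not run the simulator.

```python
def m5_command(binary, config, outdir=None, env=None, py_vars=None):
    """Return an argv list for one simulator run (illustrative sketch)."""
    args = [binary]
    if outdir is not None:
        args += ["-d", outdir]
    for var, val in sorted((env or {}).items()):
        # A value of None produces the bare -E<var> form.
        args.append("-E%s" % var if val is None else "-E%s=%s" % (var, val))
    for var, val in sorted((py_vars or {}).items()):
        args.append("--%s=%s" % (var, val))
    args.append(config)  # the config file name goes last
    return args


cmd = m5_command(
    "~/m5/build/ALPHA_FS/m5.debug",
    "~/m5/configs/fullsys/run.py",
    outdir="output",
    env={"BENCHMARK": "netperf", "DUAL": None},  # hypothetical variable names
)
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run` after expanding `~` with `os.path.expanduser`.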
Python
  Config objs mapped to simulator objs
  No need for scripts to generate configs
    All logic for running many simulations contained in a single set of configurable config files!
  Pass parameters via environment vars
    -E<var>[=<val>]
  Variables with units are enforced
    Latency must be ‘2ns’, not simply 2
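The two ideas above, parameters arriving through environment variables and unit-checked values, can be sketched together. This is not M5's parameter code: `parse_latency`, `latency_from_env`, the supported unit list, and the `L2_LATENCY` variable name are assumptions made for illustration.

```python
import os
import re

# Conversion factors to nanoseconds for the units this sketch accepts.
_UNITS_NS = {"ps": 0.001, "ns": 1.0, "us": 1000.0, "ms": 1000000.0}


def parse_latency(value):
    """Return the latency in nanoseconds; reject values without a unit."""
    if not isinstance(value, str):
        raise TypeError("latency needs a unit string such as '2ns', got %r" % (value,))
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ps|ns|us|ms)", value.strip())
    if match is None:
        raise ValueError("latency %r has no recognized unit" % (value,))
    number, unit = match.groups()
    return float(number) * _UNITS_NS[unit]


def latency_from_env(var, default):
    """Read a latency parameter from the environment, falling back to a default."""
    return parse_latency(os.environ.get(var, default))


print(parse_latency("2ns"))
print(latency_from_env("L2_LATENCY", "10ns"))
try:
    parse_latency(2)
except TypeError as err:
    print("rejected:", err)
```

Rejecting the bare number up front means a config cannot silently treat a value meant as cycles as if it were nanoseconds, or the reverse.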
--Serialize.cycle=<start cycle>
--Serialize.period=<repeat interval>
--Serialize.count=<# of checkpoints>
M5 instruction
  Insert special instruction into code to trigger a checkpoint to be dropped
  Our benchmarks do this
Checkpoints must be regenerated with some config changes
  Most config changes that are architecturally visible (because the kernel may have behaved differently)
  Physical memory size, new kernels
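One plausible reading of the three `Serialize` parameters is that checkpoints are dropped at the start cycle and then at every repeat interval until the requested count is reached. The sketch below encodes that reading. It is an assumption about how the parameters combine, not a description of M5's implementation, and `checkpoint_cycles` is a hypothetical helper.

```python
def checkpoint_cycles(start, period=None, count=1):
    """List the cycles at which checkpoints would be taken under the assumed semantics."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if period is None or count == 1:
        # Without a repeat interval only the first checkpoint is taken.
        return [start]
    return [start + i * period for i in range(count)]


print(checkpoint_cycles(1000))
print(checkpoint_cycles(1000, period=500, count=3))
```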
M5 can
  dump statistics many times
  aggregate statistics based on some event (keep stats according to kernel mode or user mode)
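Aggregating statistics by mode can be pictured as keeping one set of counters per bin and switching the active bin on an event such as a kernel/user transition. The `BinnedStats` class below is a hypothetical illustration of that idea and is unrelated to M5's actual statistics package.

```python
from collections import defaultdict


class BinnedStats:
    """Counters kept separately for each mode (bin)."""

    def __init__(self, initial_mode="kernel"):
        self.mode = initial_mode
        self._bins = defaultdict(lambda: defaultdict(int))

    def switch(self, mode):
        # Called on the event that changes the active bin.
        self.mode = mode

    def count(self, stat, amount=1):
        self._bins[self.mode][stat] += amount

    def dump(self):
        # Can be called any number of times during a run.
        return {mode: dict(stats) for mode, stats in self._bins.items()}


stats = BinnedStats()
stats.count("insts", 100)   # boot code runs in kernel mode
stats.switch("user")
stats.count("insts", 40)
stats.switch("kernel")      # e.g. a system call
stats.count("insts", 10)
print(stats.dump())
```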
Switch between CPU configurations
  Functional CPU -> Detailed CPU
  Warm up caches in a functional CPU, do measurements in a detailed CPU
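The value of warming up before measuring shows up even in a toy model. The sketch below uses a hypothetical direct-mapped cache: a functional phase touches the cache without collecting statistics, and a detailed phase counts hits and misses. All names and sizes are illustrative and none come from M5.

```python
class DirectMappedCache:
    """Tiny direct-mapped cache that tracks only tags."""

    def __init__(self, num_lines, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines

    def access(self, addr):
        block = addr // self.line_size
        index = block % self.num_lines
        hit = self.tags[index] == block
        self.tags[index] = block  # fill the line on a miss
        return hit


def functional_run(cache, addrs):
    # Fast phase: update cache state, record nothing.
    for addr in addrs:
        cache.access(addr)


def detailed_run(cache, addrs):
    # Measurement phase: count hits and misses.
    hits = 0
    for addr in addrs:
        if cache.access(addr):
            hits += 1
    return hits, len(addrs) - hits


addrs = [0, 64, 128, 192]
cold = DirectMappedCache(num_lines=4)
warm = DirectMappedCache(num_lines=4)
functional_run(warm, addrs)
print("cold (hits, misses):", detailed_run(cold, addrs))
print("warm (hits, misses):", detailed_run(warm, addrs))
```

Measuring on the cold cache counts only compulsory misses, which says little about steady-state behavior; the warmed cache gives the numbers the detailed phase is meant to capture.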
Raw copies of Linux disk image
  Binaries to be run must be present on image
rcS files (m5/configs/boot/*.rcS)
  Exactly like normal boot scripts
  Use them to start running a binary on the disk image, configure ethernet interfaces, etc.
  Can also execute m5 instructions
  Nice and flexible, since not compiled in
  Specified in configuration by readfile='path/to/script.rcS'
See for yourself!
Going into / of disk image and typing ls will show:
benchmarks  etc     lib         mnt      sbin  usr
bin         floppy  lost+found  modules  sys   var
dev         home    man         proc     tmp   z
Adding Your Own Benchmarks
Highly encouraged!
  Please share them with others!
Since M5 is Alpha targeted, need to compile Alpha binaries
  Cross-compiler can be downloaded from www.kegel.com/crosstool
  Or, if you have an Alpha, use that
Add the benchmark binaries to disk image
Create .rcS file that executes the binary
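The last step, creating an .rcS file, can be scripted when there are many benchmarks. The sketch below generates the script text in Python. It assumes the file is an ordinary `/bin/sh` boot script, and the benchmark path `/benchmarks/mybench` with its arguments is hypothetical; `make_rcs` is not an M5 utility.

```python
def make_rcs(commands):
    """Return the text of a boot script that runs the given shell commands in order."""
    lines = ["#!/bin/sh"] + list(commands)
    return "\n".join(lines) + "\n"


script = make_rcs([
    "echo 'starting benchmark'",
    "/benchmarks/mybench --size 100",  # hypothetical binary already copied onto the disk image
])
print(script)
```

The generated text would be saved under a name such as `mybench.rcS` and referenced from the configuration through `readfile`, as described earlier.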
If you need to mount a disk image to change something (like add a benchmark binary)
We hope you found this tutorial useful
We hope you find M5 useful too
We’d love to work with you to make M5 even more useful to the community