Microprocessors
The MIPS Architecture (User Level Instruction Set)
Mar 21st, 2002
Jan 07, 2016
Generations of MIPS
- Original architecture: R2000
- Later versions: R3000 through R10000
  - Added instructions
  - Moved from 32-bit to 64-bit
  - Doubled the number of floating-point registers
  - Removed some restrictions
  - Removed features inhibiting high performance
- We will look at the R2000 here
Overall Characteristics
- Byte-addressed memory
  - 32-bit addresses = 4 gigabytes max memory
  - Configurable on reset to big- or little-endian
  - Normally run in big-endian mode
- Registers
  - 32 x 32-bit integer registers
  - Multiply/divide registers HI and LO (32 bits each)
  - Program counter (32 bits)
More on Registers
- Register 0 is special
  - Reads as zero bits
  - Discards any write attempts
- Register 31 is special
  - It is the link register for the call (jump and link) instructions
More on Memory
- In user mode, 2 gigabytes addressable
  - Positive addresses from 0 to 7FFF_FFFF
  - Memory is mapped and cached
- In system mode
  - User memory mapped and cached as above
  - 8000_0000 – 9FFF_FFFF (cached only, fixed mapping to the first 0.5 gigabyte of physical memory)
  - A000_0000 – BFFF_FFFF (not cached, fixed mapping to the first 0.5 gigabyte of physical memory)
  - C000_0000 – FFFF_FFFF (mapped and cached)
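The address map above can be sketched as a small lookup. This is only an illustration of the ranges listed on the slide; the function name, the region labels, and the dictionary layout are assumptions made for this example.

```python
def classify_address(addr: int) -> dict:
    """Classify a 32-bit virtual address using the ranges listed on the slide."""
    addr &= 0xFFFF_FFFF
    if addr <= 0x7FFF_FFFF:
        # User space: translated (mapped) and cached.
        return {"region": "user", "mapped": True, "cached": True, "physical": None}
    if addr <= 0x9FFF_FFFF:
        # Fixed window onto the first 0.5 gigabyte of physical memory, cached.
        return {"region": "system, cached only", "mapped": False, "cached": True,
                "physical": addr - 0x8000_0000}
    if addr <= 0xBFFF_FFFF:
        # Same physical window, but accesses bypass the cache.
        return {"region": "system, not cached", "mapped": False, "cached": False,
                "physical": addr - 0xA000_0000}
    # Top gigabyte: translated and cached, system mode only.
    return {"region": "system, mapped", "mapped": True, "cached": True, "physical": None}


if __name__ == "__main__":
    for a in (0x0040_0000, 0x8000_1000, 0xA000_1000, 0xC000_0000):
        print(f"{a:08X}", classify_address(a))
```

Note that 8000_1000 and A000_1000 refer to the same physical location; only the caching differs.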
Instruction Execution
- Uses a 5-stage pipeline
  - IF: instruction fetch
  - RD: decode and read register values
  - ALU: perform required operation
  - MEM: access memory
  - WB: write back results to registers
- One clock per pipeline stage
Pipeline Configuration
R2000 Instruction Pipeline
  IF   RD   ALU  MEM  WB
       IF   RD   ALU  MEM  WB
            IF   RD   ALU  MEM  WB
- One instruction enters the pipeline on each clock
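The staggered diagram can be reproduced with a few lines of code. This is a minimal sketch that assumes an ideal pipeline with no stalls (no cache misses, no multiply/divide interlocks); the function names are made up for this example.

```python
STAGES = ["IF", "RD", "ALU", "MEM", "WB"]


def stage_at(instruction: int, clock: int):
    """Stage occupied by instruction i (0-based) at a given clock, or None."""
    step = clock - instruction  # instruction i enters IF at clock i
    return STAGES[step] if 0 <= step < len(STAGES) else None


def total_clocks(n: int) -> int:
    """Clocks needed to run n instructions through the pipeline without stalls."""
    return n + len(STAGES) - 1 if n > 0 else 0


def render(n: int) -> str:
    """Draw one row per instruction, one column per clock."""
    lines = []
    for i in range(n):
        cells = [(stage_at(i, c) or "") for c in range(total_clocks(n))]
        lines.append(" ".join(f"{cell:<3}" for cell in cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(3))
```

With three instructions the last write-back happens on the seventh clock, so once the pipeline is full the throughput approaches one instruction per clock.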
Interface to Memory
- External cache memory
  - First chip with large cache memory
  - Separate caches for data and instructions
  - Allows fetching an instruction and data in the same clock
- Write buffer
  - All data is written to the write buffer
  - Reads go through the write buffer
    - To ensure serial consistency
Instruction Formats (One Slide!)
- All instructions are 32 bits
- I-type (Immediate)
    op (6) | rs (5) | rt (5) | immediate (16)
- J-type (Jump)
    op (6) | target (26)
- R-type (Register)
    op (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
I-Type Instruction
- I-type (Immediate)
    op (6) | rs (5) | rt (5) | immediate (16)
- The opcode is always the first 6 bits
- The rs field is the source register
- The rt field is the target register
- The immediate field is a 16-bit signed value (the logical immediates ANDI, ORI, and XORI zero-extend it instead)
- Used for register instructions: rt := rs op immed
- Used for loads/stores: rt is the register loaded/stored
  - Address for load/store is rs (base) + offset
- Used for conditional jump instructions: rs, rt are the registers tested, immediate is a 16-bit signed offset
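As an illustration of the I-type layout, the sketch below packs and unpacks the four fields. The function names are hypothetical, and the opcode value 8 for ADDI comes from the standard MIPS encoding rather than from these slides.

```python
def encode_i_type(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack op (6 bits), rs (5), rt (5), and a 16-bit immediate into one word."""
    return ((op & 0x3F) << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16) | (imm & 0xFFFF)


def decode_i_type(word: int) -> dict:
    """Unpack an I-type word, treating the immediate as a signed value."""
    imm = word & 0xFFFF
    if imm & 0x8000:          # sign bit of the 16-bit field is set
        imm -= 0x1_0000
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": imm,
    }


if __name__ == "__main__":
    ADDI = 8  # assumed opcode value (standard MIPS encoding)
    word = encode_i_type(ADDI, rs=9, rt=8, imm=-4)   # R8 := R9 + (-4)
    print(f"{word:08X}", decode_i_type(word))
```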
More on Load/Stores
- Addressing modes are
  - Immediate (first 32K of memory, really useful only for the operating system): base is register 0
  - Register indirect: base is the register, offset is set to zero
  - Register+offset: base is the register, offset gives a +/- 32K offset from this register
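All three modes are the same calculation, base plus sign-extended offset, with different choices of base and offset. A minimal sketch (the function name is an assumption for this example):

```python
MASK32 = 0xFFFF_FFFF


def effective_address(base_value: int, offset16: int) -> int:
    """Load/store address: base register value plus the sign-extended 16-bit offset."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:
        offset -= 0x1_0000
    return (base_value + offset) & MASK32


if __name__ == "__main__":
    print(hex(effective_address(0, 0x7FFF)))        # immediate: base is register 0 (always zero)
    print(hex(effective_address(0x1000, 0)))        # register indirect: offset is zero
    print(hex(effective_address(0x1000, 0xFFFC)))   # register+offset: offset is -4
```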
J-Type Instruction
- J-type (Jump)
    op (6) | target (26)
- The opcode is always the first 6 bits
- Target is a 28-bit address (26-bit field with the last 2 bits zero)
- Used for unconditional jumps/calls
- Can only address 256 megabytes
  - Upper 4 bits of the PC are unchanged
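The target calculation can be sketched as follows. The function name is hypothetical, and the `pc` argument is assumed to be whatever program counter value supplies the unchanged upper 4 bits.

```python
def jump_target(pc: int, target26: int) -> int:
    """Combine the upper 4 bits of the PC with the 26-bit field shifted left by 2."""
    region = pc & 0xF000_0000                   # upper 4 bits stay unchanged
    offset = (target26 & 0x03FF_FFFF) << 2      # 28-bit address, last 2 bits zero
    return region | offset


if __name__ == "__main__":
    print(hex(jump_target(0x8000_1004, 0x400)))
    # Largest reachable span is 2**28 bytes = 256 megabytes.
    print(hex(jump_target(0x8000_1004, 0x03FF_FFFF)))
```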
R-Type Instruction
- R-type (Register)
    op (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
- Opcode is always the first 6 bits
- Typical use is rd := rs (op) rt
- (op) is determined by the op + funct fields
- shamt is the shift amount for shift instructions
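A field-by-field decode of an R-type word might look like this. The function name is an assumption; the sample word 0x01095020 is the standard MIPS encoding of ADD with rd = 10, rs = 8, rt = 9 (op = 0, funct = 0x20), which is not stated on the slides.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit R-type word into its six fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


if __name__ == "__main__":
    print(decode_r_type(0x01095020))
```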
The Instruction Set
- The following slides cover the entire user level instruction set of the MIPS R2000
- Contrast this with the extensive instruction set of CISC chips (or modern RISC chips!)
Load Instructions
- LB – Load byte, sign extended
- LBU – Load byte, zero extended
- LH – Load halfword, sign extended
- LHU – Load halfword, zero extended
- LW – Load word
- LWL – Load word left
- LWR – Load word right
Notes on Sign/Zero Extension
- Sign extend means copy the sign bit into the upper bits
  - For example, for LB, if memory has FE (-2) then the 32-bit register will have FFFF_FFFE (which is also -2)
- Zero extend means supply zero bits
  - For example, for LBU, if memory has FE (-2 or 254 depending on how you look at it), the register will have 0000_00FE (254)
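The two extension rules can be checked with a short sketch. The helper names are made up for this example; both return the 32-bit register contents as an unsigned integer.

```python
MASK32 = 0xFFFF_FFFF


def sign_extend(value: int, bits: int) -> int:
    """Copy the top bit of a `bits`-wide value into the upper bits of a 32-bit register."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits            # interpret as negative
    return value & MASK32


def zero_extend(value: int, bits: int) -> int:
    """Fill the upper bits of the 32-bit register with zeros."""
    return value & ((1 << bits) - 1)


if __name__ == "__main__":
    print(f"{sign_extend(0xFE, 8):08X}")   # what LB would leave in the register
    print(f"{zero_extend(0xFE, 8):08X}")   # what LBU would leave in the register
```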
Notes on Load Left/Right
- Memory references must be aligned
- Load Left/Load Right allow a non-aligned word to be loaded in two separate instructions
- Load left loads the left bytes of the register
- Load right loads the right bytes of the register
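The sketch below models how the pair combines to fetch a non-aligned word. It assumes big-endian mode (the normal mode according to the earlier slide) and the usual pairing of load left at the word's first byte with load right at its last byte; the function names and the `bytes`-based memory are assumptions for illustration only.

```python
MASK32 = 0xFFFF_FFFF


def lwl(reg: int, mem: bytes, addr: int) -> int:
    """Load word left (big-endian): fill the left bytes of reg from addr to the word end."""
    offset = addr & 3
    word_end = addr - offset + 4
    loaded = int.from_bytes(mem[addr:word_end], "big")     # 4 - offset bytes
    kept = reg & ((1 << (8 * offset)) - 1)                 # low `offset` bytes unchanged
    return ((loaded << (8 * offset)) | kept) & MASK32


def lwr(reg: int, mem: bytes, addr: int) -> int:
    """Load word right (big-endian): fill the right bytes of reg from the word start to addr."""
    offset = addr & 3
    word_start = addr - offset
    count = offset + 1                                      # bytes loaded
    loaded = int.from_bytes(mem[word_start:addr + 1], "big")
    kept = reg & (MASK32 & ~((1 << (8 * count)) - 1))       # upper bytes unchanged
    return kept | loaded


def load_unaligned_word(mem: bytes, addr: int) -> int:
    """Two-instruction sequence: LWL at addr, then LWR at addr + 3."""
    reg = lwl(0, mem, addr)
    return lwr(reg, mem, addr + 3)


if __name__ == "__main__":
    memory = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77])
    print(f"{load_unaligned_word(memory, 1):08X}")
```

The same idea, mirrored, applies to the SWL/SWR store pair.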
Load Delay Slot
- Instruction immediately after a load cannot reference the loaded register
- If it does reference it, then the result is undefined
- Idea is that if the data is in cache, then there is no need to delay or interlock (1 clock is enough time to get the data)
- If the data is not in cache, the pipeline will stall
Store Instructions
- SB – Store byte
- SH – Store halfword
- SW – Store word
- SWL – Store word left
- SWR – Store word right
- SWL/SWR allow storing of a non-aligned word in two instructions
Computational Immediate
- ADDI – Add immediate
- ADDIU – Add immediate unsigned
- SLTI – Set on less than immediate
- SLTIU – Set on less than immediate (unsigned)
- ANDI – And immediate
- ORI – Or immediate
- XORI – Exclusive or immediate
- LUI – Load upper immediate
Note on Set Instructions
- For the set instructions
  - A comparison is done (signed or unsigned) on the two source values (register and immediate, or register/register)
  - The result register has 0 or 1 depending on whether the less than condition is false or true (same values as used by C language)
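The difference between the signed and unsigned comparisons shows up when the top bit is set. A minimal sketch of the register/register forms, with helper names chosen for this example:

```python
MASK32 = 0xFFFF_FFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a 2's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x8000_0000 else value


def slt(rs: int, rt: int) -> int:
    """Set on less than, signed comparison: result is 1 or 0."""
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs: int, rt: int) -> int:
    """Set on less than, unsigned comparison: result is 1 or 0."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


if __name__ == "__main__":
    # FFFF_FFFF is -1 when signed, but the largest value when unsigned.
    print(slt(0xFFFF_FFFF, 1), sltu(0xFFFF_FFFF, 1))
```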
Note on Load Upper Immediate
- LUI – Load Upper Immediate
  - The immediate value is loaded into the upper 16 bits of the target register, lower 16 bits are set to zero.
  - Use with following ORI to set full 32-bit immediate value (e.g. an address).
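The LUI/ORI pairing can be modelled in a few lines. This sketch relies on ORI zero-extending its 16-bit immediate (standard MIPS behaviour), which is why a lower half with its top bit set still works; the function names are made up for this example.

```python
def lui(imm16: int) -> int:
    """Load upper immediate: immediate goes in the upper 16 bits, lower 16 are zero."""
    return (imm16 & 0xFFFF) << 16


def ori(rs_value: int, imm16: int) -> int:
    """Or immediate: the 16-bit immediate is zero-extended before the OR."""
    return rs_value | (imm16 & 0xFFFF)


def load_constant(value: int) -> int:
    """Build a full 32-bit constant with LUI followed by ORI."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return ori(lui(upper), lower)


if __name__ == "__main__":
    print(f"{lui(0x1234):08X}")
    print(f"{load_constant(0x1234_8765):08X}")
```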
Note on signed/unsigned add
- For add/subtract instructions
  - Signed instructions trap on signed overflow
  - Unsigned instructions do not trap
- Result is the same (in the absence of a trap)
- 2’s complement is used for signed values
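A short model makes the "same result, different trap behaviour" point concrete. The exception class and function names are hypothetical; a Python exception stands in for the hardware overflow trap.

```python
MASK32 = 0xFFFF_FFFF


class OverflowTrap(Exception):
    """Stand-in for the hardware trap on signed overflow (name is an assumption)."""


def addu(rs: int, rt: int) -> int:
    """Unsigned add: wraps modulo 2**32 and never traps."""
    return (rs + rt) & MASK32


def add(rs: int, rt: int) -> int:
    """Signed add: same bit pattern as addu, but traps on signed overflow."""
    rs &= MASK32
    rt &= MASK32
    result = addu(rs, rt)
    # Overflow occurs when both operands share a sign that the result does not.
    if (rs ^ result) & (rt ^ result) & 0x8000_0000:
        raise OverflowTrap(f"{rs:08X} + {rt:08X}")
    return result


if __name__ == "__main__":
    print(f"{add(0xFFFF_FFFF, 1):08X}")     # -1 + 1 = 0, no overflow
    print(f"{addu(0x7FFF_FFFF, 1):08X}")    # wraps silently
    try:
        add(0x7FFF_FFFF, 1)
    except OverflowTrap as trap:
        print("trap:", trap)
```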
Computational Reg/Reg
- ADD – Add registers
- ADDU – Add registers unsigned
- SUB – Subtract registers
- SUBU – Subtract registers unsigned
- SLT – Set on registers less than
- SLTU – Set on registers less than unsigned
- AND – Logical and registers
- OR – Logical or registers
- XOR – Logical exclusive or registers
- NOR – Logical nor registers
Shift Instructions (fixed count)
- Here rt is input, rd is output, shamt is shift count in bits
- SLL – Shift left logical (zero fill)
- SRL – Shift right logical (zero fill)
- SRA – Shift right arithmetic (sign bit fill)
  - Note this is not quite an ordinary divide, since -5 shifted right one bit gives -3, not -2
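The -5 example can be verified directly. In this sketch the three shifts operate on 32-bit patterns; the helper names are assumptions, and Python's `>>` on a negative integer is used because it is itself an arithmetic shift.

```python
MASK32 = 0xFFFF_FFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a 2's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x8000_0000 else value


def sll(rt: int, shamt: int) -> int:
    """Shift left logical: zeros enter on the right."""
    return (rt << (shamt & 31)) & MASK32


def srl(rt: int, shamt: int) -> int:
    """Shift right logical: zeros enter on the left."""
    return (rt & MASK32) >> (shamt & 31)


def sra(rt: int, shamt: int) -> int:
    """Shift right arithmetic: copies of the sign bit enter on the left."""
    return (to_signed(rt) >> (shamt & 31)) & MASK32


if __name__ == "__main__":
    minus_five = -5 & MASK32
    print(to_signed(sra(minus_five, 1)))   # arithmetic shift rounds toward minus infinity
    print(int(-5 / 2))                     # ordinary divide truncates toward zero
```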
Shift Instructions (variable count)
- For these instructions, rt is input register, rd is output register, low 5 bits of rs is shift amount
- SLLV – Shift left logical variable
- SRLV – Shift right logical variable
- SRAV – Shift right arithmetic variable
Multiply/Divide Instructions
- For these, input is in rs, rt, result in HI/LO
  - MULT – Signed multiply
  - MULTU – Unsigned multiply
- For these, rs is dividend, rt is divisor
  - LO gets quotient, HI gets remainder
  - DIV – Signed divide
  - DIVU – Unsigned divide
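The HI/LO results can be modelled as a pair of 32-bit values. This is a sketch under stated assumptions: signed division is taken to truncate toward zero with the remainder taking the sign of the dividend, and a zero divisor raises a Python error here only because the slides do not say what the hardware leaves in HI/LO. All function names are hypothetical.

```python
MASK32 = 0xFFFF_FFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a 2's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x8000_0000 else value


def split64(value: int) -> tuple:
    """Split a 64-bit product into (HI, LO)."""
    value &= 0xFFFF_FFFF_FFFF_FFFF
    return (value >> 32, value & MASK32)


def mult(rs: int, rt: int) -> tuple:
    """Signed multiply: 64-bit product in (HI, LO)."""
    return split64(to_signed(rs) * to_signed(rt))


def multu(rs: int, rt: int) -> tuple:
    """Unsigned multiply: 64-bit product in (HI, LO)."""
    return split64((rs & MASK32) * (rt & MASK32))


def div(rs: int, rt: int) -> tuple:
    """Signed divide: (HI, LO) = (remainder, quotient), truncating toward zero (assumed)."""
    dividend, divisor = to_signed(rs), to_signed(rt)
    if divisor == 0:
        raise ValueError("zero divisor: result not defined in this sketch")
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return (remainder & MASK32, quotient & MASK32)


def divu(rs: int, rt: int) -> tuple:
    """Unsigned divide: (HI, LO) = (remainder, quotient)."""
    dividend, divisor = rs & MASK32, rt & MASK32
    if divisor == 0:
        raise ValueError("zero divisor: result not defined in this sketch")
    return (dividend % divisor, dividend // divisor)


if __name__ == "__main__":
    print([f"{x:08X}" for x in mult(0xFFFF_FFFF, 2)])    # -1 * 2
    print([f"{x:08X}" for x in multu(0xFFFF_FFFF, 2)])
    print([f"{x:08X}" for x in div(-7 & MASK32, 2)])
```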
Accessing HI/LO registers
- MFHI – Move from HI to rd
- MFLO – Move from LO to rd
- MTHI – Move from rs to HI
- MTLO – Move from rs to LO
- Note: for the move from instructions, there is an interlock to wait for completion of the (relatively long) divide/multiply
Jump Instructions
- J – Unconditional jump (26-bit target)
- JAL – Jump and link (26-bit target, R31 set to instruction address + 8)
- JR – Jump register (target is in rs)
- JALR – Jump and link register (target in rs, rd set to instruction address + 8)
Branch Instructions
- All these instructions use rs/rt as input registers, and the target is given by the offset
- BEQ – Branch on Equal, branch if rs = rt
- BNE – Branch on Not Equal, branch if rs /= rt
- BLEZ – Branch on LE zero, branch if rs <= 0
- BGTZ – Branch on GT zero, branch if rs > 0
- BLTZ – Branch on LT zero, branch if rs < 0
- BGEZ – Branch on GE zero, branch if rs >= 0
- BLTZAL – Branch and link on LT zero
- BGEZAL – Branch and link on GE zero
- Note: the last two instructions set R31 to instruction address + 8
Jump/Branch Delay Slots
- All jumps and branches have a delay slot
  - This slot is unconditional
  - The instruction in the delay slot is executed logically before the jump
  - Jump cannot depend on a value from the instruction in the delay slot
  - This is why the link register is set to instruction + 8, since that is the return point
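The address arithmetic around the delay slot can be sketched as follows. The slides only say the branch target "is in offset"; this example assumes the standard MIPS rule that the sign-extended 16-bit offset counts words relative to the delay slot instruction. The function names are made up for illustration.

```python
MASK32 = 0xFFFF_FFFF


def sign_extend16(value: int) -> int:
    """Interpret a 16-bit field as a signed integer."""
    value &= 0xFFFF
    return value - 0x1_0000 if value & 0x8000 else value


def branch_target(branch_addr: int, offset16: int) -> int:
    """Target = address of the delay slot + (signed offset in words * 4) (assumed rule)."""
    delay_slot = branch_addr + 4
    return (delay_slot + (sign_extend16(offset16) << 2)) & MASK32


def link_address(instr_addr: int) -> int:
    """Return point: skip the jump/branch itself and its delay slot."""
    return (instr_addr + 8) & MASK32


if __name__ == "__main__":
    print(hex(branch_target(0x1000, 3)))        # forward branch
    print(hex(branch_target(0x1000, 0xFFFF)))   # offset -1 branches back to itself
    print(hex(link_address(0x1000)))            # value a linking jump would save
```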
Special Instructions
- SYSCALL
  - Initiates system call trap
  - Parameters passed in registers (like Int 21 in DOS on the ia32)
  - Transfers control to system exception handler
- BREAK
  - Initiates breakpoint trap
  - Transfers control to system exception handler
That’s All Folks!
- This is slide 33
- And we have done the ENTIRE MIPS user level instruction set without any omissions
- Well, we left out floating-point, but technically that is the coprocessor, not the MIPS processor itself.