Page 1
ECE2030 Introduction to Computer Engineering
Lecture 18: Instruction Set Architecture
Prof. Hsien-Hsin Sean Lee
School of Electrical and Computer Engineering
Georgia Tech
Page 2
Breakdown of a Computing Problem
[Figure: layered flow from problem to silicon]
Stages: Problem → Algorithms → Programming in High-Level Language → Compiler/Assembler/Linker → Instruction Set Architecture (ISA) → System architecture and Micro-architecture of the Target Machine (one implementation) → Functional units/Data Path → Gate-Level Design → Transistors → Manufacturing
Abstraction levels: Human Level, System Level, RTL Level, Logic Level, Circuit Level, Silicon Level
Page 3
Instruction Set Architecture (ISA)
• An abstraction
  – Interface between hardware and low-level software
  – Relieves programmers of specifying control signals to drive a machine
• Defined by
  – An instruction set
  – Software conventions
• Independent of any specific internal implementation (microarchitecture + system architecture)
Page 4
ISA design principles
• Compatibility
• Implementability
• Programmability
• Usability
• Encoding efficiency

High Level Language
    main() {
      int i, b, c, a[10];
      for (i=0; i<10; i++)
        …
      a[2] = b + c*i;
    }
Compiler
ISA
    …
    lw  r2, mem[r7]
    add r3, r4, r2
    st  r3, mem[r8]
Assembler
Binary code
Page 5
General Purpose Computer (Von Neumann Machine)
• Central Processing Unit (CPU)
• Memory: Data & Instruction
    0101 1001 1010 1001 1000 0100 1000 1110 1111 0011 0010 1011 1000 … …
• A stored-program computer called EDVAC was proposed in 1944 while developing ENIAC, the first general-purpose computer
• Contributors: Presper Eckert, John Mauchly, John von Neumann
• EDSAC, built by Maurice Wilkes, implemented the first operational stored-program machine
Page 6
Basic Operation
MICROPROCESSOR
• Instruction fetch from memory
    1000 1100 1110 0010 0000 0000 0000 0000 (= lw R2, mem[R7])
• Instruction Decoder/Microcode ROM
• Datapath Unit
• Data written back to memory
It’s called a Harvard Architecture (Mark-III/IV) if instruction and data memory are separated
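As a rough illustration of what the instruction decoder does with the fetched word above, the sketch below splits the 32-bit pattern into the MIPS I-format fields. The function name `decode_i_format` is hypothetical and chosen only for this example; the field positions follow the I-format layout used later in the lecture.

```python
def decode_i_format(word):
    """Split a 32-bit MIPS I-format word into (opcode, rs, rt, immediate)."""
    opcode = (word >> 26) & 0x3F      # bits 31-26
    rs = (word >> 21) & 0x1F          # bits 25-21
    rt = (word >> 16) & 0x1F          # bits 20-16
    imm = word & 0xFFFF               # bits 15-0
    if imm & 0x8000:                  # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm

# The bit pattern from the Basic Operation slide.
word = int("1000 1100 1110 0010 0000 0000 0000 0000".replace(" ", ""), 2)
fields = decode_i_format(word)
print(hex(word), fields)
```

Under this decoding the opcode is 35 (0x23, lw), rs is 7, rt is 2 and the offset is 0, which matches the slide's reading of the word as lw R2, mem[R7].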
Page 7
Commercial ISA
• CDC6600, IBM 360, DEC VAX (good old days, 360 is now IBM z-series)
• x86 (Intel 32, Intel 64, AMD64), Itanium (IA-64)
• Sun Sparc
• Xscale (PocketPC)
• IBM PowerPC (Mac, BlueGene)
• ARM, MIPS (embedded, MIPS once was popular in workstations)
Page 8
Basic Instruction Format (Assembly code)
<instruction mnemonic> <destination operand>, <source op>, <source op>
• ISA defines a set of “architectural registers” to avoid going to memory all the time
  – X86: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
  – MIPS: r0 to r31 and Hi, Lo (or sometimes we use aliases to show the software convention when using these registers)
• Instruction main classes
  – Arithmetic / Logical
  – Data transfer (load or store for different data sizes)
  – Change-of-flow
    • Conditional branches
    • Unconditional branches (e.g. jump, subroutine calls)
• Operands
  – Architectural registers
  – Memory addresses
  – Target address for change-of-flow
Page 9
MIPS Register Aliases
Register    Names        Usage by Software Convention
$0          $zero        Hardwired to zero
$1          $at          Reserved by assembler
$2 - $3     $v0 - $v1    Function return result registers
$4 - $7     $a0 - $a3    Function passing argument value registers
$8 - $15    $t0 - $t7    Temporary registers, caller saved
$16 - $23   $s0 - $s7    Saved registers, callee saved
$24 - $25   $t8 - $t9    Temporary registers, caller saved
$26 - $27   $k0 - $k1    Reserved for OS kernel
$28         $gp          Global pointer
$29         $sp          Stack pointer
$30         $fp          Frame pointer
$31         $ra          Return address (written by the call instruction)
$hi         $hi          High result register (remainder/div, high word/mult)
$lo         $lo          Low result register (quotient/div, low word/mult)
Page 10
Basic Instruction Format (Assembly code)
<instruction mnemonic> <destination operand>, <source op>, <source op>

Operation                      MIPS assembly
R8 = R6 + R7                   add $8, $6, $7  or  add $t0, $a2, $a3
R9 = R9 + 2004                 addi $9, $9, 2004
R3 = R4 ⊕ R5                   xor $3, $4, $5
R10 = R8 << R9                 sllv $10, $8, $9
R24 = R15 >> 2                 sra $24, $15, 2 (arith right shift)
R2 = mem[R3+100]               lw $2, 100($3)
mem[R3+100] = R2               sw $2, 100($3)
if (R2<R3) R4=1 else R4=0      slt $4, $2, $3
Procedural call                jal _func ($31=PC+4; go to the address of label _func, assuming no delay slot)
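The shift and compare rows in the table above depend on whether a register is treated as signed. The following is a minimal Python sketch of those semantics on 32-bit values; the helper names (`to_signed`, `sra`, `srl`, `slt`, `sllv`) are illustrative and simply mirror the mnemonics. `srl` (logical right shift) is not on the slide and is included only for contrast with `sra`.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def sra(value, shamt):
    """Arithmetic right shift: the sign bit is replicated."""
    return (to_signed(value) >> shamt) & MASK32

def srl(value, shamt):
    """Logical right shift: zeros are shifted in."""
    return (value & MASK32) >> shamt

def sllv(value, amount):
    """Variable left shift: only the low 5 bits of the amount are used."""
    return (value << (amount & 0x1F)) & MASK32

def slt(a, b):
    """Set-on-less-than with a signed comparison."""
    return 1 if to_signed(a) < to_signed(b) else 0

print(hex(sra(0xFFFFFFF0, 2)), hex(srl(0xFFFFFFF0, 2)))
print(slt(0xFFFFFFFF, 1), sllv(1, 33))
```

With the pattern 0xFFFFFFF0 (that is, -16), `sra` by 2 keeps the value negative while `srl` does not, and `slt` treats 0xFFFFFFFF as -1.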
Page 11
MIPS R-format Encoding
Field:  opcode   rs       rt       rd       shamt    funct
Bits:   31-26    25-21    20-16    15-11    10-6     5-0

add $4, $3, $2   (rd = $4, rs = $3, rt = $2)
        000000   00011    00010    00100    00000    100000
Encoding = 0x00622020
Page 12
MIPS R-format Encoding
Field:  opcode   rs       rt       rd       shamt    funct
Bits:   31-26    25-21    20-16    15-11    10-6     5-0

sll $3, $5, 7   (rd = $3, rt = $5, shamt = 7)
        000000   00000    00101    00011    00111    000000
Encoding = 0x000519C0
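The two R-format examples (add and sll) can be reproduced by shifting each field into its bit position. This is a small sketch, not an assembler; `encode_r` is a hypothetical helper name, and the funct values 0x20 (add) and 0x00 (sll) are read off the bit patterns on the slides.

```python
def encode_r(rs, rt, rd, shamt, funct, opcode=0):
    """Pack R-format fields: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)

# add $4, $3, $2  ->  rd=4, rs=3, rt=2, funct=0x20
add_word = encode_r(rs=3, rt=2, rd=4, shamt=0, funct=0x20)

# sll $3, $5, 7   ->  rd=3, rt=5, shamt=7, rs unused (0), funct=0x00
sll_word = encode_r(rs=0, rt=5, rd=3, shamt=7, funct=0x00)

print(f"0x{add_word:08X} 0x{sll_word:08X}")
```

Note that the assembly operand order (destination first) differs from the field order in the encoded word, which is why the keyword arguments are spelled out.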
Page 13
MIPS I-format Encoding
Field:  opcode   rs       rt       Immediate Value
Bits:   31-26    25-21    20-16    15-0

lw $5, 3000($2)   (rt = $5, rs = $2, Immediate = 3000)
        100011   00010    00101    0000101110111000
Encoding = 0x8C450BB8
Page 14
MIPS I-format Encoding
Field:  opcode   rs       rt       Immediate Value
Bits:   31-26    25-21    20-16    15-0

sw $5, 3000($2)   (rt = $5, rs = $2, Immediate = 3000)
        101011   00010    00101    0000101110111000
Encoding = 0xAC450BB8
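The lw and sw encodings differ only in the opcode field. A minimal sketch of I-format packing, assuming a hypothetical helper `encode_i` and the opcodes 0x23 (lw) and 0x2B (sw) taken from the bit patterns on the slides:

```python
def encode_i(opcode, rs, rt, imm):
    """Pack I-format fields: opcode(6) rs(5) rt(5) immediate(16)."""
    # Masking keeps negative offsets in 16-bit two's-complement form.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

LW, SW = 0x23, 0x2B

lw_word = encode_i(LW, rs=2, rt=5, imm=3000)   # lw $5, 3000($2)
sw_word = encode_i(SW, rs=2, rt=5, imm=3000)   # sw $5, 3000($2)

print(f"0x{lw_word:08X} 0x{sw_word:08X}")
```

In the assembly syntax the base register appears in parentheses, and it is that register that lands in the rs field.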
Page 15
MIPS J-format Encoding
Field:  opcode   Target Address
Bits:   31-26    25-0

jal 0x00400030
        000011   00000100000000000000001100
Encoding = 0x0C10000C

Target address 0x00400030 = 0000 0000 0100 0000 0000 0000 0011 0000
• Each instruction is 4 bytes, so the two low-order address bits are always 0 and are not stored; the 26-bit field holds address bits 27-2, and the upper 4 bits come from the PC
• jal jumps and saves the return address in $ra ($31)
• Use “jr $31” to return
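The address arithmetic for jal can be sketched as follows. `encode_j` and `jump_target` are hypothetical helper names, and the PC value used in the example (0x00400000) is an assumption chosen only so that the jump stays in the same 256 MB region.

```python
def encode_j(opcode, target_address):
    """Pack J-format fields: opcode(6) followed by address bits 27-2."""
    if target_address % 4:
        raise ValueError("instruction addresses must be 4-byte aligned")
    return (opcode << 26) | ((target_address >> 2) & 0x03FFFFFF)

def jump_target(word, pc):
    """Rebuild the full address: upper 4 bits of PC+4, then the field, then 00."""
    return ((pc + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)

JAL = 0x03
jal_word = encode_j(JAL, 0x00400030)

print(f"0x{jal_word:08X}", hex(jump_target(jal_word, pc=0x00400000)))
```

Decoding the word again with `jump_target` recovers 0x00400030, which shows that no information is lost by dropping the two low-order bits.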
Page 16
JR and JALR
• JALR (Jump And Link Register) and JR (Jump Register)
  – Considered as R-type
  – Unconditional jump
  – JALR used for procedural call

Field:  opcode   rs       0        rd (default=31)   0        funct
Bits:   31-26    25-21    20-16    15-11             10-6     5-0

jalr r2
        000000   00010    00000    11111             00000    001001
jr r2   (rd field is 0)
        000000   00010    00000    00000             00000    001000
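The slide gives only the bit patterns for jalr and jr. As a sketch, the same fields can be packed with a generic helper to obtain the hexadecimal words; `pack_fields` is a hypothetical name, and the funct codes 0x09 (jalr) and 0x08 (jr) are read off the patterns above.

```python
def pack_fields(fields):
    """Concatenate (value, width) pairs, most significant field first."""
    word = 0
    for value, width in fields:
        if value >> width:
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def r_type(rs, rd, funct):
    # opcode(6)=0, rs(5), rt(5)=0, rd(5), shamt(5)=0, funct(6)
    return pack_fields([(0, 6), (rs, 5), (0, 5), (rd, 5), (0, 5), (funct, 6)])

jalr_word = r_type(rs=2, rd=31, funct=0x09)   # jalr r2 (links into $31)
jr_word = r_type(rs=2, rd=0, funct=0x08)      # jr r2

print(f"0x{jalr_word:08X} 0x{jr_word:08X}")
```

The two words differ in the rd field and in one bit of funct, which is what distinguishes a call through a register from a plain register jump.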
Page 17
Assembly Program Example
        .data
        .globl array
array:  .word 0x12345678
        .word 0x98765432
        .word 0x66bbccdd
        .word 0x44332211

        .text
        .globl __start
__start:
        jal   main
        # more code below

        .globl main
main:
        la    $8, array
        # byte accesses (lb sign-extends)
        lb    $9, ($8)
        lb    $10, 1($8)
        add   $11, $9, $10
        sb    $11, ($8)
        # halfword accesses (lh sign-extends, lhu zero-extends)
        addiu $8, $8, 4
        lh    $9, ($8)
        lhu   $10, 2($8)
        add   $11, $9, $10
        sh    $11, ($8)
        # word accesses
        addiu $8, $8, 4
        lw    $9, ($8)
        lw    $10, 4($8)
        sub   $11, $9, $10
        sw    $11, ($8)
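To trace what the program above does to the array, here is a rough Python model of the loads and stores. It assumes little-endian byte order and that the array starts at offset 0 of a private byte buffer; the slide does not state the endianness, so the final values hold only under that assumption. The helpers `load` and `store` are illustrative and use Python's `struct` format codes to stand in for lb/lh/lhu/lw and sb/sh/sw.

```python
import struct

# Assumption: little-endian memory, array placed at offset 0.
mem = bytearray(struct.pack("<4I", 0x12345678, 0x98765432, 0x66BBCCDD, 0x44332211))

def load(fmt, addr):
    """fmt: 'b' = lb, 'h' = lh, 'H' = lhu, 'i' = lw."""
    return struct.unpack_from("<" + fmt, mem, addr)[0]

def store(fmt, addr, value):
    """fmt: 'B' = sb, 'H' = sh, 'I' = sw; value is truncated to the access width."""
    width_bits = 8 * struct.calcsize("<" + fmt)
    struct.pack_into("<" + fmt, mem, addr, value & ((1 << width_bits) - 1))

base = 0                                                   # la $8, array
store("B", base, load("b", base) + load("b", base + 1))    # lb, lb, add, sb
base += 4                                                  # addiu $8, $8, 4
store("H", base, load("h", base) + load("H", base + 2))    # lh, lhu, add, sh
base += 4                                                  # addiu $8, $8, 4
store("I", base, load("i", base) - load("i", base + 4))    # lw, lw, sub, sw

words = list(struct.unpack("<4I", mem))
print([f"0x{w:08X}" for w in words])
```

Under the little-endian assumption the first three words end up as 0x123456CE, 0x9876ECA8 and 0x2288AACC, while the fourth is only read. On a big-endian machine the byte and halfword steps would touch different bytes and give different results.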
Page 18
Interface for System Services
Page 20
ISA Design Philosophy
RISC (Reduced Instruction Set Computers)
versus
CISC (Complex Instruction Set Computers)
• IBM 801, led by John Cocke, pioneered the RISC concept
• Berkeley’s RISC-I and Stanford’s MIPS led the first academic implementations
Page 21
RISC versus CISC
• Why CISC?
  – Memory was expensive and slow back then
  – Cramming more functions into one instruction
  – Using microcode ROM (μROM) for “complex” operations
• Justification for RISC
  – Complex apps are mostly composed of simple assignments
  – RAM speed catching up
  – Compiler (human) getting smarter
  – Higher frequency with shorter pipe stages (also easier to design a regular pipeline)

CISC | RISC
Variable length instructions | Fixed-length instructions
Abundant instructions and addressing modes | Fewer instructions and addressing modes
Longer decoding | Easier decoding
Mem-to-mem operations | Load/store architecture
Use on-core microcode | No microinstructions, directly executed by HW logic
Less pipelineability | Better pipelineability
Closer semantic gap between high level code and assembly (shift complexity to microcode) | Needs smart compilers
Intel IA32, IBM 360, DEC VAX, Motorola 68030 | MIPS, IBM 801, IBM PowerPC, Sun Sparc
Page 22
Other ISA Design Philosophy
• VLIW (Very Long Instruction Word)
  – A Dumb Machine with a Smart Compiler
  – Packing multiple (RISC-like) operations into one VLIW
  – Instruction scheduling performed completely by compiler
  – Multiflow, Cydrome in the 80s and most digital signal processors (DSPs) today
• EPIC (Explicit Parallel Instruction Computing)
  – The return of the VLIW
  – With new features in the ISA such as
    • Data and control speculation
    • Full Predication
  – Intel/HP’s Itanium and Itanium 2 (once called IA-64)