Lec18 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Instruction Set Architecture

ECE2030 Introduction to Computer Engineering

Lecture 18: Instruction Set Architecture

Prof. Hsien-Hsin Sean Lee
School of Electrical and Computer Engineering
Georgia Tech


Breakdown of a Computing Problem

Layers, from top to bottom:

• Problem
• Algorithms
• Programming in High-Level Language
• Compiler/Assembler/Linker
• Instruction Set Architecture (ISA)
• Target Machine (one implementation): System architecture, Micro-architecture
• Functional units/Data Path
• Gates Level Design
• Transistors
• Manufacturing

Level labels on the figure: Human Level, System Level, RTL Level, Logic Level, Circuit Level, Silicon Level


Instruction Set Architecture (ISA)

• An abstraction
– Interface between hardware and low-level software
– Relieves programmers from specifying control signals to drive a machine
• Defined by
– An instruction set
– Software convention
• Independent from a specific internal implementation (microarchitecture + system architecture)


ISA design principles

• Compatibility
• Implementability
• Programmability
• Usability
• Encoding efficiency

High Level Language

    main() { int i,b,c,a[10]; for (i=0; i<10; i++) … a[2] = b + c*i; }

        ↓ Compiler

ISA

    …
    lw r2, mem[r7]
    add r3, r4, r2
    st r3, mem[r8]

        ↓ Assembler

Binary code


General Purpose Computer

Von Neumann Machine

Central Processing Unit (CPU)
Memory (Data & Instruction)

Memory contents: 0101 1001 1010 1001 1000 0100 1000 1110 1111 0011 0010 1011 1000 … …

• A stored-program computer called EDVAC was proposed in 1944 while developing ENIAC, the first general-purpose computer
• Contributors: Presper Eckert, John Mauchly, John von Neumann
• EDSAC, built by Maurice Wilkes, was one of the first operational stored-program machines


Basic Operation

MICROPROCESSOR

1000 1100 1110 0010 0000 0000 0000 0000 (= lw R2, mem[R7])

• Instruction fetch from memory
• Instruction Decoder/Microcode ROM
• Datapath Unit
• Data written back to memory

It’s called Harvard Architecture (Mark-III/IV) if instruction and data memory are separated
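To make the decode step concrete, here is a minimal Python sketch that splits the example instruction word above into its fields. It assumes the standard MIPS I-format layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); the helper name `decode_i` is illustrative and not part of the lecture.

```python
def decode_i(word: int) -> dict:
    """Split a 32-bit MIPS I-format word into its fields (illustrative helper)."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs":     (word >> 21) & 0x1F,  # bits 25..21, base register
        "rt":     (word >> 16) & 0x1F,  # bits 20..16, destination register for lw
        "imm":    word & 0xFFFF,        # bits 15..0, address offset
    }

# The bit pattern shown on the slide, with spaces removed.
word = int("1000 1100 1110 0010 0000 0000 0000 0000".replace(" ", ""), 2)
fields = decode_i(word)
print(f"0x{word:08X}", fields)
```

Opcode 0x23 is lw, rs = 7 is the base register, and rt = 2 is the destination, which matches lw R2, mem[R7].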


Commercial ISA

• CDC6600, IBM 360, DEC VAX (good old days, 360 is now IBM z-series)
• x86 (Intel 32, Intel 64, AMD64), Itanium (IA-64)
• Sun Sparc
• Xscale (PocketPC)
• IBM PowerPC (Mac, BlueGene)
• ARM, MIPS (embedded, MIPS once was popular in workstations)


Basic Instruction Format (Assembly code)

• ISA defines a set of “architectural registers” to avoid going to memory all the time
– x86: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
– MIPS: r0 to r31 and Hi, Lo (or sometimes we use aliases to show the software convention when using these registers)
• Instruction main classes
– Arithmetic / Logical
– Data transfer (load or store for different data sizes)
– Change-of-flow
  • Conditional branches
  • Unconditional branches (e.g., jump, subroutine calls)
• Operands
– Architectural registers
– Memory addresses
– Target address for change-of-flow

<instruction mnemonic> <destination operand>, <source op>, <source op>


MIPS Register Aliases

Register    Names        Usage by Software Convention
$0          $zero        Hardwired to zero
$1          $at          Reserved by assembler
$2 - $3     $v0 - $v1    Function return result registers
$4 - $7     $a0 - $a3    Function argument value registers
$8 - $15    $t0 - $t7    Temporary registers, caller saved
$16 - $23   $s0 - $s7    Saved registers, callee saved
$24 - $25   $t8 - $t9    Temporary registers, caller saved
$26 - $27   $k0 - $k1    Reserved for OS kernel
$28         $gp          Global pointer
$29         $sp          Stack pointer
$30         $fp          Frame pointer
$31         $ra          Return address (written by the call instruction)
$hi         $hi          High result register (remainder/div, high word/mult)
$lo         $lo          Low result register (quotient/div, low word/mult)
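The numbered-to-named mapping in the table can be sketched as a simple lookup. This is only an illustration of the software convention listed above; the names `ALIASES` and `alias` are made up for this example.

```python
# Register-number to alias mapping for $0..$31, following the table above.
ALIASES = (
    ["$zero", "$at", "$v0", "$v1"]          # $0  - $3
    + [f"$a{i}" for i in range(4)]          # $4  - $7
    + [f"$t{i}" for i in range(8)]          # $8  - $15
    + [f"$s{i}" for i in range(8)]          # $16 - $23
    + ["$t8", "$t9", "$k0", "$k1",          # $24 - $27
       "$gp", "$sp", "$fp", "$ra"]          # $28 - $31
)

def alias(reg: int) -> str:
    """Return the software-convention name of general-purpose register $reg."""
    return ALIASES[reg]

for n in (0, 6, 7, 8, 31):
    print(f"${n} is {alias(n)}")
```

The assembler accepts either form, so the alias is purely a readability aid for the calling convention.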


Basic Instruction Format (Assembly code)

<instruction mnemonic> <destination operand>, <source op>, <source op>

operation                     MIPS assembly
R8 = R6 + R7                  add $8, $6, $7 or add $t0, $a2, $a3
R9 = R9 + 2004                addi $9, $9, 2004
R3 = R4 ⊕ R5                  xor $3, $4, $5
R10 = R8 << R9                sllv $10, $8, $9
R24 = R15 >> 2                sra $24, $15, 2 (arith right shift)
R2 = mem[R3+100]              lw $2, 100($3)
mem[R3+100] = R2              sw $2, 100($3)
if (R2<R3) R4=1 else R4=0     slt $4, $2, $3
Procedure call                jal _func ($31=PC+4; jump to the address of label _func, assuming no delay slot)
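The shift and compare rows depend on how 32-bit values are interpreted, which a short Python model can make explicit. This is a sketch of the usual MIPS semantics, not simulator code from the lecture; the function names mirror the mnemonics, and `srl` (logical right shift) is added only as a contrast to `sra`.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def sllv(rt: int, rs: int) -> int:
    """Shift left logical variable: only the low 5 bits of rs are used."""
    return (rt << (rs & 0x1F)) & MASK

def sra(rt: int, shamt: int) -> int:
    """Arithmetic right shift: the sign bit is replicated into the top bits."""
    return (to_signed(rt) >> shamt) & MASK

def srl(rt: int, shamt: int) -> int:
    """Logical right shift: zeros are shifted into the top bits."""
    return (rt & MASK) >> shamt

def slt(rs: int, rt: int) -> int:
    """Set on less than, comparing the operands as signed values."""
    return 1 if to_signed(rs) < to_signed(rt) else 0

r15 = 0xFFFFFFF0  # -16 as a signed 32-bit value
print(f"sra: 0x{sra(r15, 2):08X}  srl: 0x{srl(r15, 2):08X}")
print("slt(-1, 1) =", slt(0xFFFFFFFF, 1))
```

The arithmetic shift keeps -16 >> 2 equal to -4, while the logical shift turns the same bit pattern into a large positive number.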


MIPS R-format Encoding

Bits:    31-26    25-21   20-16   15-11   10-6    5-0
Field:   opcode   rs      rt      rd      shamt   funct

add $4, $3, $2    (rd = $4, rs = $3, rt = $2)

         000000   00011   00010   00100   00000   100000

Encoding = 0x00622020
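The R-format packing can be checked with a few shifts and ORs. The sketch below assumes the field positions shown on this slide and the standard funct value 0x20 for add; `encode_r` is a hypothetical helper name.

```python
def encode_r(rs: int, rt: int, rd: int, shamt: int, funct: int, opcode: int = 0) -> int:
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (
        (opcode << 26)   # bits 31..26
        | (rs << 21)     # bits 25..21
        | (rt << 16)     # bits 20..16
        | (rd << 11)     # bits 15..11
        | (shamt << 6)   # bits 10..6
        | funct          # bits 5..0
    )

# add $4, $3, $2  ->  rd = 4, rs = 3, rt = 2
add_word = encode_r(rs=3, rt=2, rd=4, shamt=0, funct=0x20)
print(f"0x{add_word:08X}")
```

Note that the assembly operand order (rd, rs, rt) differs from the field order in the encoded word (rs, rt, rd).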


MIPS R-format Encoding

Bits:    31-26    25-21   20-16   15-11   10-6    5-0
Field:   opcode   rs      rt      rd      shamt   funct

sll $3, $5, 7     (rd = $3, rt = $5, shamt = 7)

         000000   00000   00101   00011   00111   000000

Encoding = 0x000519C0
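Going the other way, the fields can be recovered from the encoded word by shifting and masking. This is a minimal sketch assuming the R-format field positions on this slide; `decode_r` is an illustrative name.

```python
def decode_r(word: int) -> dict:
    """Extract the six R-format fields from a 32-bit instruction word."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs":     (word >> 21) & 0x1F,
        "rt":     (word >> 16) & 0x1F,
        "rd":     (word >> 11) & 0x1F,
        "shamt":  (word >> 6) & 0x1F,
        "funct":  word & 0x3F,
    }

sll_fields = decode_r(0x000519C0)
print(sll_fields)
```

For a constant shift such as sll, rs is unused and encoded as 0, and the shift amount travels in the shamt field instead of a register.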


MIPS I-format Encoding

Bits:    31-26    25-21   20-16   15-0
Field:   opcode   rs      rt      Immediate Value

lw $5, 3000($2)   (rt = $5, Immediate = 3000, rs = $2)

         100011   00010   00101   0000101110111000

Encoding = 0x8C450BB8
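The I-format packing works the same way, except that the low 16 bits hold the immediate. The sketch below assumes the standard lw opcode 0x23; `encode_i` and the range check are illustrative choices, not part of the lecture.

```python
LW = 0x23  # opcode for lw

def encode_i(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack an I-format instruction; imm must fit in 16 bits."""
    if not -0x8000 <= imm <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    # Masking keeps the two's-complement bit pattern of a negative offset.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# lw $5, 3000($2)  ->  rt = 5, rs = 2, immediate = 3000
lw_word = encode_i(LW, rs=2, rt=5, imm=3000)
print(f"0x{lw_word:08X}")
```

A negative offset such as -4 is stored as 0xFFFC in the immediate field.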


MIPS I-format Encoding

Bits:    31-26    25-21   20-16   15-0
Field:   opcode   rs      rt      Immediate Value

sw $5, 3000($2)   (rt = $5, Immediate = 3000, rs = $2)

         101011   00010   00101   0000101110111000

Encoding = 0xAC450BB8
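At run time the 16-bit immediate is sign-extended and added to the base register to form the memory address. The sketch below shows that calculation for the sw word on this slide; the base value 0x10010000 is a made-up register content, and the function names are illustrative.

```python
def sign_extend16(imm: int) -> int:
    """Sign-extend a 16-bit field to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def effective_address(base: int, imm16: int) -> int:
    """Address used by lw/sw: base register plus sign-extended offset, modulo 2^32."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF

sw_word = 0xAC450BB8
offset_field = sw_word & 0xFFFF        # low 16 bits of the instruction
base = 0x10010000                      # hypothetical content of $2
addr = effective_address(base, offset_field)
print(f"offset = {sign_extend16(offset_field)}, address = 0x{addr:08X}")
```

Because of the sign extension, a load or store can reach roughly 32 KB on either side of the base register.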


MIPS J-format Encoding

Bits:    31-26    25-0
Field:   opcode   Target Address

jal 0x00400030

         000011   00000100000000000000001100

Encoding = 0x0C10000C

Address 0x00400030 = 0000 0000 0100 0000 0000 0000 0011 0000
The upper 4 bits are not encoded, and the low 2 bits are dropped because every instruction = 4 bytes; the remaining bits 27-2 form the 26-bit Target Address.

• jal jumps and saves the return address in $ra ($31)
• Use “jr $31” to return
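The target-field arithmetic can be sketched in a few lines. The code assumes the standard MIPS rule that the upper 4 address bits are taken from the address of the instruction after the jump; the helper names and the example PC value are illustrative.

```python
JAL = 0x03  # opcode for jal

def encode_jal(target_addr: int) -> int:
    """Encode jal: drop the two low (always zero) bits and keep 26 bits."""
    if target_addr % 4 != 0:
        raise ValueError("instruction addresses are multiples of 4")
    return (JAL << 26) | ((target_addr >> 2) & 0x03FFFFFF)

def jump_target(word: int, pc: int) -> int:
    """Rebuild the 32-bit destination from the 26-bit field and the current PC."""
    upper = (pc + 4) & 0xF0000000           # upper 4 bits come from PC + 4
    return upper | ((word & 0x03FFFFFF) << 2)

jal_word = encode_jal(0x00400030)
print(f"0x{jal_word:08X}")
print(f"0x{jump_target(jal_word, pc=0x00400000):08X}")  # pc is a hypothetical value
```

Since only 26 bits are stored, a jal can reach any instruction inside the same 256 MB region as the following instruction.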


JR and JALR

• JALR (Jump And Link Register) and JR (Jump Register)
– Considered as R-type
– Unconditional jump
– JALR used for procedural call

Bits:    31-26    25-21   20-16   15-11             10-6    5-0

jalr r2
Field:   opcode   rs      0       rd (default=31)   0       funct
         000000   00010   00000   11111             00000   001001

jr r2
Field:   opcode   rs      0       0                 0       funct
         000000   00010   00000   00000             00000   001000
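Because both instructions are R-type, their words can be built with the same field packing used for add. The sketch below assumes the standard funct values 0x09 for jalr and 0x08 for jr; `encode_r` is a hypothetical helper.

```python
def encode_r(rs: int, rt: int, rd: int, shamt: int, funct: int, opcode: int = 0) -> int:
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

JALR_FUNCT = 0x09
JR_FUNCT = 0x08

# jalr r2: jump to the address in r2, link register defaults to $31
jalr_word = encode_r(rs=2, rt=0, rd=31, shamt=0, funct=JALR_FUNCT)
# jr r2: jump to the address in r2, no link register
jr_word = encode_r(rs=2, rt=0, rd=0, shamt=0, funct=JR_FUNCT)

print(f"jalr r2 = 0x{jalr_word:08X}")
print(f"jr   r2 = 0x{jr_word:08X}")
```

The two words differ only in the rd field and the lowest funct bit.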


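Assembly Program Example

        .data
        .globl array
array:  .word 0x12345678
        .word 0x98765432
        .word 0x66bbccdd
        .word 0x44332211

        .text
        .globl __start
__start:
        jal main
        # more code below

        .globl main
main:
        la $8, array            # $8 = address of array
        lb $9, ($8)             # load byte, sign-extended
        lb $10, 1($8)
        add $11, $9, $10
        sb $11, ($8)            # store low byte of $11
        addiu $8, $8, 4         # advance to the next word
        lh $9, ($8)             # load halfword, sign-extended
        lhu $10, 2($8)          # load halfword, zero-extended
        add $11, $9, $10
        sh $11, ($8)            # store low halfword of $11
        addiu $8, $8, 4         # advance to the next word
        lw $9, ($8)             # load full words
        lw $10, 4($8)
        sub $11, $9, $10
        sw $11, ($8)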
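To see what the program does to the array, the loads and stores can be traced in Python. This is a rough sketch rather than a full simulator: it assumes a little-endian memory (MIPS can run in either byte order, so a big-endian machine would give different byte and halfword results), places the array at offset 0, and models registers as plain Python integers.

```python
import struct

# Data segment: the four .word values, stored little-endian (assumption).
mem = bytearray(struct.pack("<4I", 0x12345678, 0x98765432, 0x66BBCCDD, 0x44332211))
reg = [0] * 32

reg[8] = 0                                              # la    $8, array
reg[9] = struct.unpack_from("<b", mem, reg[8])[0]       # lb    $9, ($8)   sign-extended
reg[10] = struct.unpack_from("<b", mem, reg[8] + 1)[0]  # lb    $10, 1($8)
reg[11] = reg[9] + reg[10]                              # add   $11, $9, $10
struct.pack_into("<B", mem, reg[8], reg[11] & 0xFF)     # sb    $11, ($8)
reg[8] += 4                                             # addiu $8, $8, 4
reg[9] = struct.unpack_from("<h", mem, reg[8])[0]       # lh    $9, ($8)   sign-extended
reg[10] = struct.unpack_from("<H", mem, reg[8] + 2)[0]  # lhu   $10, 2($8) zero-extended
reg[11] = reg[9] + reg[10]                              # add   $11, $9, $10
struct.pack_into("<H", mem, reg[8], reg[11] & 0xFFFF)   # sh    $11, ($8)
reg[8] += 4                                             # addiu $8, $8, 4
reg[9] = struct.unpack_from("<I", mem, reg[8])[0]       # lw    $9, ($8)
reg[10] = struct.unpack_from("<I", mem, reg[8] + 4)[0]  # lw    $10, 4($8)
reg[11] = (reg[9] - reg[10]) & 0xFFFFFFFF               # sub   $11, $9, $10
struct.pack_into("<I", mem, reg[8], reg[11])            # sw    $11, ($8)

words = list(struct.unpack("<4I", mem))
print([f"0x{w:08X}" for w in words])
```

Under the little-endian assumption the first three words of the array are modified in place and the fourth is left untouched.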


Interface for System Services

Backup


ISA Design Philosophy

RISC (Reduced Instruction Set Computers) versus CISC (Complex Instruction Set Computers)

• IBM 801 led by John Cocke pioneered RISC concept
• Berkeley’s RISC-I and Stanford’s MIPS led the first academic implementations


RISC versus CISC

• Why CISC?
– Memory was expensive and slow back then
– Cramming more functions into one instruction
– Using microcode ROM (μROM) for “complex” operations
• Justification for RISC
– Complex apps are mostly composed of simple assignments
– RAM speed catching up
– Compiler (human) getting smarter
– Higher frequency calls for shorter pipe stages (also easier to design a regular pipeline)

CISC                                           RISC
Variable length instructions                   Fixed-length instructions
Abundant instructions and addressing modes     Fewer instructions and addressing modes
Longer decoding                                Easier decoding
Mem-to-mem operations                          Load/store architecture
Use on-core microcode                          No microinstructions, directly executed by HW logic
Less pipelineability                           Better pipelineability
Narrower semantic gap between high level       Needs smart compilers
  code and assembly (shift complexity
  to microcode)
Intel IA32, IBM 360, DEC VAX, Motorola 68030   MIPS, IBM 801, IBM PowerPC, Sun Sparc
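The decoding row of the table can be illustrated with a toy comparison. In the sketch below, fixed-length instructions are found by stepping 4 bytes at a time, while the variable-length case uses a purely hypothetical format in which the first byte of each instruction gives its total length; real CISC encodings are considerably more involved.

```python
def split_fixed(code: bytes, width: int = 4) -> list:
    """Fixed-length ISA: every instruction boundary is known in advance."""
    return [code[pc:pc + width] for pc in range(0, len(code), width)]

def split_variable(code: bytes) -> list:
    """Hypothetical variable-length ISA: each instruction must be inspected
    before the start of the next one is known."""
    out, pc = [], 0
    while pc < len(code):
        length = code[pc]            # first byte = total instruction length (made up)
        if length == 0:
            raise ValueError("invalid length byte")
        out.append(code[pc:pc + length])
        pc += length
    return out

fixed = split_fixed(bytes.fromhex("00622020000519C0"))
variable = split_variable(bytes([2, 0xAA, 3, 0xBB, 0xCC, 1]))
print([w.hex() for w in fixed])
print([w.hex() for w in variable])
```

The fixed-length split has no dependence between instructions, which is one reason such ISAs are easier to fetch and decode in a pipeline.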


Other ISA Design Philosophy

• VLIW (Very Long Instruction Word)
– A Dumb Machine with a Smart Compiler
– Packing multiple (RISC-like) operations into one VLIW
– Instruction scheduling performed completely by compiler
– Multiflow, Cydrome in the 80s and most digital signal processors (DSPs) today
• EPIC (Explicit Parallel Instruction Computing)
– The return of the VLIW
– With new features in the ISA such as
  • Data and control speculation
  • Full Predication
– Intel/HP’s Itanium and Itanium 2 (or once called IA-64)
