ARM Introduction & Instruction Set Architecture
Aleksandar Milenkovic
E-mail: milenka@ece.uah.edu
Web: http://www.ece.uah.edu/~milenka

Feb 25, 2016

Page 2: ARM Introduction & Instruction Set Architecture

Outline
  ARM Architecture
  ARM Organization and Implementation
  ARM Instruction Set
  Architectural Support for High-level Languages
  Thumb Instruction Set
  Architectural Support for System Development
  ARM Processor Cores
  Memory Hierarchy
  Architectural Support for Operating Systems
  ARM CPU Cores
  Embedded ARM Applications

Page 3: ARM Introduction & Instruction Set Architecture

ARM History
  ARM – Acorn RISC Machine (1983 – 1985)
    Acorn Computers Limited, Cambridge, England
  ARM – Advanced RISC Machine 1990
    ARM Limited, 1990
  ARM has been licensed to many semiconductor manufacturers

Page 4: ARM Introduction & Instruction Set Architecture

ARM's visible registers
  User level
    15 GPRs, PC, CPSR (current program status register)
  Remaining registers are used for system-level programming and for handling exceptions

  Mode             Registers
  user mode        r0–r14, r15 (PC), CPSR                 (usable in user mode)
  fiq mode         r8_fiq–r14_fiq, SPSR_fiq               (system modes only)
  svc mode         r13_svc, r14_svc, SPSR_svc             (system modes only)
  abort mode       r13_abt, r14_abt, SPSR_abt             (system modes only)
  irq mode         r13_irq, r14_irq, SPSR_irq             (system modes only)
  undefined mode   r13_und, r14_und, SPSR_und             (system modes only)
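The register banking listed above can be read as a lookup: in a given mode, a register number resolves either to the shared user-mode register or to that mode's private copy. The following is a minimal Python sketch of that idea; the `BANKED` table layout and the `physical_register` helper are hypothetical names used only for illustration.

```python
# Illustrative model of ARM register banking.
# The table and function names are hypothetical, not part of any ARM tool.
BANKED = {
    "user": set(),
    "fiq": set(range(8, 15)),  # r8_fiq .. r14_fiq
    "svc": {13, 14},
    "abt": {13, 14},
    "irq": {13, 14},
    "und": {13, 14},
}


def physical_register(mode: str, n: int) -> str:
    """Return the name of the register that rN refers to in the given mode."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be in 0..15")
    if n in BANKED[mode]:
        return f"r{n}_{mode}"
    return "r15 (PC)" if n == 15 else f"r{n}"
```

For example, `physical_register("irq", 13)` resolves to the IRQ-mode stack pointer, while `physical_register("irq", 0)` resolves to the shared `r0`.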

Page 5: ARM Introduction & Instruction Set Architecture

ARM CPSR format
  N (Negative), Z (Zero), C (Carry), V (oVerflow)
  mode – control processor mode
  T – control instruction set
    T = 1 – instruction stream is 16-bit Thumb instructions
    T = 0 – instruction stream is 32-bit ARM instructions
  I F – interrupt enables (setting a bit disables the corresponding interrupt)

  Bits     31 30 29 28 | 27 ... 8 | 7 | 6 | 5 | 4 ... 0
  Field     N  Z  C  V |  unused  | I | F | T |  mode
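The bit positions in the CPSR layout (N, Z, C, V in bits 31 to 28, I, F, T in bits 7 to 5, mode in bits 4 to 0) translate directly into shifts and masks. A small Python sketch follows; `decode_cpsr` is a hypothetical helper name.

```python
def decode_cpsr(cpsr: int) -> dict:
    """Split a 32-bit CPSR value into the fields shown in the layout."""
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "I": (cpsr >> 7) & 1,
        "F": (cpsr >> 6) & 1,
        "T": (cpsr >> 5) & 1,
        "mode": cpsr & 0x1F,  # bits 4..0
    }
```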

Page 6: ARM Introduction & Instruction Set Architecture

ARM memory organization
  Linear array of bytes numbered from 0 to 2^32 – 1
  Data items
    bytes (8 bits)
    half-words (16 bits) – always aligned to 2-byte boundaries (start at an even byte address)
    words (32 bits) – always aligned to 4-byte boundaries (start at a byte address which is a multiple of 4)

  [Figure: memory drawn as rows of four bytes, bit 31 on the left to bit 0 on the right, byte addresses 0 to 23; example items are byte0 to byte3, half-word4, byte6, word8, half-word12, half-word14 and word16, each named after its address.]
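The alignment rules above reduce to a divisibility test on the byte address. A minimal Python sketch, with `is_aligned` as a hypothetical helper name:

```python
def is_aligned(address: int, size: int) -> bool:
    """Check the alignment rule for a data item.

    size is 1 (byte), 2 (half-word) or 4 (word).
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    return address % size == 0
```

Address 14 is a valid half-word address but not a valid word address, which matches the half-word14 item in the figure.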

Page 7: ARM Introduction & Instruction Set Architecture

ARM instruction set
  Load-store architecture
    operands are in GPRs
    load/store – only instructions that operate with memory
  Instructions
    Data Processing – use and change only register values
    Data Transfer – copy memory values into registers (load) or copy register values into memory (store)
    Control Flow
      o branch
      o branch-and-link – save return address to resume the original sequence
      o trapping into system code – supervisor calls

Page 8: ARM Introduction & Instruction Set Architecture

ARM instruction set (cont'd)
  Three-address data processing instructions
  Conditional execution of every instruction
  Powerful load/store multiple register instructions
  Ability to perform a general shift operation and a general ALU operation in a single instruction that executes in a single clock cycle
  Open instruction set extension through coprocessor instruction set, including adding new registers and data types to the programmer's model
  Very dense 16-bit compressed representation of the instruction set in the Thumb architecture

Page 9: ARM Introduction & Instruction Set Architecture

I/O system
  I/O is memory mapped
    internal registers of peripherals (disk controllers, network interfaces, etc.) are addressable locations within the ARM's memory map and may be read and written using the load-store instructions
  Peripherals may use either the normal interrupt (IRQ) or fast interrupt (FIQ) input
    normally most interrupt sources share the IRQ input, while just one or two time-critical sources are connected to the FIQ input
  Some systems may include external DMA hardware to handle high-bandwidth I/O traffic

Page 10: ARM Introduction & Instruction Set Architecture

ARM exceptions
  ARM supports a range of interrupts, traps, and supervisor calls – all are grouped under the general heading of exceptions
  Handling exceptions
    current state is saved by copying the PC into r14_exc and CPSR into SPSR_exc (exc stands for exception type)
    processor operating mode is changed to the appropriate exception mode
    PC is forced to a value between 00₁₆ and 1C₁₆, the particular value depending on the type of exception
    the instruction at the location the PC is forced to (the vector address) usually contains a branch to the exception handler; the exception handler will use r13_exc, which is normally initialized to point to a dedicated stack in memory, to save some user registers
    return: restore the user registers and then restore PC and CPSR atomically
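The entry steps listed above (save PC and CPSR, switch mode, force PC to the vector) can be sketched as a small state update. The vector addresses and exception-to-mode mapping below are those of the classic ARM architecture and are not listed on the slide; the dictionary-based state and the name `enter_exception` are assumptions made for illustration, and details such as the exact return-address offset and interrupt masking are left out.

```python
# Classic ARM exception vectors (assumption: not taken from the slide itself).
VECTORS = {
    "reset": 0x00,
    "undefined": 0x04,
    "swi": 0x08,
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    "irq": 0x18,
    "fiq": 0x1C,
}

# Mode entered for each exception type.
EXCEPTION_MODE = {
    "reset": "svc",
    "undefined": "und",
    "swi": "svc",
    "prefetch_abort": "abt",
    "data_abort": "abt",
    "irq": "irq",
    "fiq": "fiq",
}


def enter_exception(state: dict, kind: str) -> dict:
    """Return a new state after taking an exception of the given kind.

    The mode is tracked as a separate field rather than inside the CPSR
    value, purely to keep the sketch short.
    """
    mode = EXCEPTION_MODE[kind]
    new_state = dict(state)
    new_state[f"r14_{mode}"] = state["pc"]     # save PC into r14_exc
    new_state[f"spsr_{mode}"] = state["cpsr"]  # save CPSR into SPSR_exc
    new_state["mode"] = mode                   # switch to the exception mode
    new_state["pc"] = VECTORS[kind]            # force PC to the vector address
    return new_state
```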

Page 11: ARM Introduction & Instruction Set Architecture

ARM cross-development toolkit
  Software development
    tools developed by ARM Limited
    public domain tools (ARM back end for gcc C compiler)
  Cross-development
    tools run on a different architecture from the one for which they produce code

  [Figure: toolkit structure. C source goes through the C compiler and asm source through the assembler to produce .aof object files; the linker combines these with C libraries and object libraries into an .axf image, which is debugged with ARMsd on either the ARMulator (system model) or a development board.]

Page 12: ARM Introduction & Instruction Set Architecture

Outline
  ARM Architecture
  ARM Assembly Language Programming
  ARM Organization and Implementation
  ARM Instruction Set
  Architectural Support for High-level Languages
  Thumb Instruction Set
  Architectural Support for System Development
  ARM Processor Cores
  Memory Hierarchy
  Architectural Support for Operating Systems
  ARM CPU Cores
  Embedded ARM Applications

Page 13: ARM Introduction & Instruction Set Architecture

ARM Instruction Set
  Data Processing Instructions
  Data Transfer Instructions
  Control flow Instructions

Page 14: ARM Introduction & Instruction Set Architecture

Data Processing Instructions
  Classes of data processing instructions
    Arithmetic operations
    Bit-wise logical operations
    Register-movement operations
    Comparison operations
  Operands: 32 bits wide; there are 3 ways to specify operands
    come from registers
    the second operand may be a constant (immediate)
    shifted register operand
  Result: 32 bits wide, placed in a register
    long multiply produces a 64-bit result

Page 15: ARM Introduction & Instruction Set Architecture

Data Processing Instructions (cont'd)

  Arithmetic Operations
      ADD r0, r1, r2    r0 := r1 + r2
      ADC r0, r1, r2    r0 := r1 + r2 + C
      SUB r0, r1, r2    r0 := r1 - r2
      SBC r0, r1, r2    r0 := r1 - r2 + C - 1
      RSB r0, r1, r2    r0 := r2 - r1
      RSC r0, r1, r2    r0 := r2 - r1 + C - 1

  Bit-wise Logical Operations
      AND r0, r1, r2    r0 := r1 and r2
      ORR r0, r1, r2    r0 := r1 or r2
      EOR r0, r1, r2    r0 := r1 xor r2
      BIC r0, r1, r2    r0 := r1 and (not r2)

  Register Movement
      MOV r0, r2        r0 := r2
      MVN r0, r2        r0 := not r2

  Comparison Operations
      CMP r1, r2        set cc on r1 - r2
      CMN r1, r2        set cc on r1 + r2
      TST r1, r2        set cc on r1 and r2
      TEQ r1, r2        set cc on r1 xor r2

Page 16: ARM Introduction & Instruction Set Architecture

Data Processing Instructions (cont'd)
  Immediate operands:
    immediate = (0->255) x 2^(2n), 0 <= n <= 12

      ADD r3, r3, #3            r3 := r3 + 3
      AND r8, r7, #&ff          r8 := r7[7:0], & for hex

  Shifted register operands
    the second operand is subject to a shift operation before it is combined with the first operand

      ADD r3, r2, r1, LSL #3    r3 := r2 + 8 x r1
      ADD r5, r5, r3, LSL r2    r5 := r5 + 2^r2 x r3
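The immediate rule stated on this slide, an 8-bit value multiplied by 2^(2n) with 0 <= n <= 12, can be checked mechanically. The Python sketch below follows the slide's formula only; the hardware encoding is an 8-bit value rotated right by an even amount, so a few wrap-around bit patterns that also qualify in practice are not covered here. The name `is_slide_immediate` is hypothetical.

```python
def is_slide_immediate(value: int) -> bool:
    """True if value == k * 2**(2*n) for some 0 <= k <= 255 and 0 <= n <= 12."""
    if not 0 <= value < 2**32:
        raise ValueError("value must fit in 32 bits")
    for n in range(13):
        shift = 2 * n
        # value must be a multiple of 2**shift with an 8-bit quotient
        if value % (1 << shift) == 0 and (value >> shift) <= 255:
            return True
    return False
```

Under this rule `#3`, `#&ff` and `#&100` are valid immediates, while `#&101` and `#&102` are not, because no even shift leaves an 8-bit quotient.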

Page 17: ARM Introduction & Instruction Set Architecture

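ARM shift operations
  LSL – Logical Shift Left
  LSR – Logical Shift Right
  ASR – Arithmetic Shift Right
  ROR – Rotate Right
  RRX – Rotate Right Extended by 1 place

  [Figure: effect of each shift on a 32-bit word (bit 31 to bit 0). LSL #5 fills the five low bits with 00000; LSR #5 fills the five high bits with 00000; ASR #5 fills the five high bits with copies of the sign bit (11111 for a negative operand, 00000 for a positive operand); ROR #5 wraps the five low bits around to the top; RRX rotates right by one place through the carry flag C.]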
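The five shift operations can be modelled on 32-bit values with ordinary integer arithmetic. This is a sketch assuming shift amounts in the range 0 to 31; the function names simply mirror the mnemonics, and `rrx` returns the carry-out alongside the result.

```python
MASK = 0xFFFFFFFF  # keep results to 32 bits


def lsl(x: int, n: int) -> int:
    """Logical shift left: zeros enter at the bottom."""
    return (x << n) & MASK


def lsr(x: int, n: int) -> int:
    """Logical shift right: zeros enter at the top."""
    return (x & MASK) >> n


def asr(x: int, n: int) -> int:
    """Arithmetic shift right: the sign bit (bit 31) is replicated."""
    x &= MASK
    if x & 0x80000000:
        return ((x >> n) | (MASK << (32 - n))) & MASK
    return x >> n


def ror(x: int, n: int) -> int:
    """Rotate right: bits leaving the bottom re-enter at the top."""
    n %= 32
    x &= MASK
    return ((x >> n) | (x << (32 - n))) & MASK


def rrx(x: int, carry: int) -> tuple:
    """Rotate right by one through carry; returns (result, carry_out)."""
    x &= MASK
    return ((carry & 1) << 31) | (x >> 1), x & 1
```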

Page 18: ARM Introduction & Instruction Set Architecture

Setting the condition codes
  Any DPI can set the condition codes (N, Z, V, and C)
    for all DPIs except the comparison operations a specific request must be made
    at the assembly language level this request is indicated by adding an `S` to the opcode
  Example (64-bit addition: r3:r2 := r1:r0 + r3:r2)

      ADDS r2, r2, r0    ; carry out to C
      ADC  r3, r3, r1    ; ... add into high word

  Arithmetic operations set all the flags (N, Z, C, and V)
  Logical and move operations set N and Z
    preserve V and either preserve C when there is no shift operation, or set C according to the shift operation (the last bit shifted out)
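The ADDS/ADC pair implements a 64-bit addition out of two 32-bit additions linked by the carry flag. A Python sketch of that mechanism follows; the function names mirror the mnemonics and `add64` is a hypothetical wrapper that takes the high and low words in the same register roles as the example.

```python
MASK = 0xFFFFFFFF


def adds(a: int, b: int) -> tuple:
    """32-bit add that also returns the carry flag, like ADDS."""
    total = (a & MASK) + (b & MASK)
    return total & MASK, total >> 32


def adc(a: int, b: int, carry: int) -> int:
    """32-bit add with carry-in, like ADC (flags not updated)."""
    return ((a & MASK) + (b & MASK) + carry) & MASK


def add64(r3: int, r2: int, r1: int, r0: int) -> tuple:
    """Compute r3:r2 := r1:r0 + r3:r2 and return the new (r3, r2)."""
    r2, c = adds(r2, r0)  # ADDS r2, r2, r0 ; carry out to C
    r3 = adc(r3, r1, c)   # ADC  r3, r3, r1 ; add into high word
    return r3, r2
```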

Page 19: ARM Introduction & Instruction Set Architecture

Multiplies
  Example (Multiply, Multiply-Accumulate)

      MUL r4, r3, r2        r4 := [r3 x r2]<31:0>
      MLA r4, r3, r2, r1    r4 := [r3 x r2 + r1]<31:0>

  Note
    least significant 32 bits are placed in the result register, the rest are ignored
    immediate second operand is not supported
    result register must not be the same as the first source register
    if the `S` bit is set, V is preserved and C is rendered meaningless
  Example (r0 = r0 x 35)

      ADD r0, r0, r0, LSL #2    ; r0' = r0 x 5
      RSB r0, r0, r0, LSL #3    ; r0'' = 7 x r0'
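Two points from this slide can be checked numerically: MUL and MLA keep only the low 32 bits of the product, and multiplying by 35 can be done without a multiplier as 5 x 7, using a shift-and-add (x + 4x) followed by a shift-and-reverse-subtract (8y - y). The Python sketch below uses hypothetical function names.

```python
MASK = 0xFFFFFFFF


def mul(a: int, b: int) -> int:
    """MUL: low 32 bits of the product."""
    return (a * b) & MASK


def mla(a: int, b: int, acc: int) -> int:
    """MLA: low 32 bits of product plus accumulator."""
    return (a * b + acc) & MASK


def times35(r0: int) -> int:
    """Multiply by 35 using two shifted-operand instructions."""
    r0 = (r0 + (r0 << 2)) & MASK  # ADD r0, r0, r0, LSL #2 -> 5 x r0
    r0 = ((r0 << 3) - r0) & MASK  # RSB r0, r0, r0, LSL #3 -> 7 x (5 x r0)
    return r0
```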

Page 20: ARM Introduction & Instruction Set Architecture

Data transfer instructions
  Single register load and store instructions
    transfer of a data item (byte, half-word, word) between ARM registers and memory
  Multiple register load and store instructions
    enable transfer of large quantities of data
    used for procedure entry and exit, to save/restore workspace registers, to copy blocks of data around memory
  Single register swap instructions
    allow exchange between a register and memory in one instruction
    used to implement semaphores to ensure mutual exclusion on accesses to shared data in multis

Page 21: ARM Introduction & Instruction Set Architecture

Data Transfer Instructions (cont'd)
  Single register load and store

  Register-indirect addressing
      LDR r0, [r1]          r0 := mem32[r1]
      STR r0, [r1]          mem32[r1] := r0
    Note: r1 keeps a word address (2 LSBs are 0)

  Base+offset addressing (offset of up to 4 Kbytes)
      LDR r0, [r1, #4]      r0 := mem32[r1 + 4]

  Auto-indexing addressing
      LDR r0, [r1, #4]!     r0 := mem32[r1 + 4]
                            r1 := r1 + 4

  Post-indexed addressing
      LDR r0, [r1], #4      r0 := mem32[r1]
                            r1 := r1 + 4

      LDRB r0, [r1]         r0 := mem8[r1]
    Note: no restrictions for r1
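The difference between base+offset, auto-indexing and post-indexed addressing lies in which address is used for the access and whether the base register is written back. A Python sketch, assuming memory is modelled as a dictionary from byte address to word and registers as a dictionary from name to value; all function names are hypothetical.

```python
def ldr_offset(mem: dict, regs: dict, rd: str, rn: str, off: int) -> None:
    """LDR rd, [rn, #off]: access rn + off, base unchanged."""
    regs[rd] = mem[regs[rn] + off]


def ldr_pre(mem: dict, regs: dict, rd: str, rn: str, off: int) -> None:
    """LDR rd, [rn, #off]!: update the base first, then access it."""
    regs[rn] += off
    regs[rd] = mem[regs[rn]]


def ldr_post(mem: dict, regs: dict, rd: str, rn: str, off: int) -> None:
    """LDR rd, [rn], #off: access the old base, then update it."""
    regs[rd] = mem[regs[rn]]
    regs[rn] += off
```

With words 11 and 22 stored at addresses 0x100 and 0x104 and `r1` holding 0x100, the first two forms load 22 and the post-indexed form loads 11; only the last two leave `r1` at 0x104.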

Page 22: ARM Introduction & Instruction Set Architecture

Data Transfer Instructions (cont'd)

      COPY:   ADR r1, TABLE1      ; r1 points to TABLE1
              ADR r2, TABLE2      ; r2 points to TABLE2
      LOOP:   LDR r0, [r1]
              STR r0, [r2]
              ADD r1, r1, #4
              ADD r2, r2, #4
              ...
      TABLE1: ...
      TABLE2: ...

      COPY:   ADR r1, TABLE1      ; r1 points to TABLE1
              ADR r2, TABLE2      ; r2 points to TABLE2
      LOOP:   LDR r0, [r1], #4
              STR r0, [r2], #4
              ...
      TABLE1: ...
      TABLE2: ...

Page 23: ARM Introduction & Instruction Set Architecture

Data Transfer Instructions
  Multiple register data transfers

      LDMIA r1, {r0, r2, r5}    r0 := mem32[r1]
                                r2 := mem32[r1 + 4]
                                r5 := mem32[r1 + 8]

    Note: any subset (or all) of the registers may be transferred with a single instruction
    Note: the order of registers within the list is insignificant
    Note: including r15 in the list will cause a change in the control flow
  Block copy view
    data is to be stored above or below the address held in the base register
    address incrementing or decrementing begins before or after storing the first value
  Stack organizations
    FA – full ascending
    EA – empty ascending
    FD – full descending
    ED – empty descending

Page 24: ARM Introduction & Instruction Set Architecture

Multiple register transfer addressing modes

  [Figure: four memory diagrams spanning addresses 1000₁₆ to 1018₁₆ (with 100c₁₆ marked), each showing where r0, r1 and r5 are stored and the base register before (r9) and after (r9') the transfer, for:
      STMIA r9!, {r0,r1,r5}
      STMIB r9!, {r0,r1,r5}
      STMDA r9!, {r0,r1,r5}
      STMDB r9!, {r0,r1,r5}]
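The four store-multiple modes differ only in the first address used and in the direction of the base update. The Python sketch below computes the addresses for each mode, assuming 4-byte words, base write-back (the `!` form), and the architectural rule that the lowest-numbered register always goes to the lowest address; the helper name `stm` and the use of 100c₁₆ as the starting base in the usage note are assumptions for illustration.

```python
def stm(mode: str, base: int, values: list) -> tuple:
    """Model STM<mode> base!, {regs}.

    values holds the register contents in ascending register-number order.
    Returns ({address: value}, new_base).
    """
    size = 4 * len(values)
    if mode == "IA":    # increment after
        start, new_base = base, base + size
    elif mode == "IB":  # increment before
        start, new_base = base + 4, base + size
    elif mode == "DA":  # decrement after
        start, new_base = base - size + 4, base - size
    elif mode == "DB":  # decrement before
        start, new_base = base - size, base - size
    else:
        raise ValueError("mode must be IA, IB, DA or DB")
    stored = {start + 4 * i: v for i, v in enumerate(values)}
    return stored, new_base
```

Starting from a base of 0x100c with three registers, the incrementing modes leave the base at 0x1018 and the decrementing modes leave it at 0x1000.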

Page 25: ARM Introduction & Instruction Set Architecture


The mapping between the stack and block copy views

Page 26: ARM Introduction & Instruction Set Architecture


Control flow instructions

Page 27: ARM Introduction & Instruction Set Architecture

Conditional execution
  Conditional execution to avoid branch instructions used to skip a small number of non-branch instructions
  Example

              CMP r0, #5          ;
              BEQ BYPASS          ; if (r0!=5) {
              ADD r1, r1, r0      ;   r1:=r1+r0-r2
              SUB r1, r1, r2      ; }
      BYPASS: ...

  With conditional execution

              CMP   r0, #5        ;
              ADDNE r1, r1, r0    ;
              SUBNE r1, r1, r2    ;
              ...

  Note: add 2-letter condition after the 3-letter opcode

              ; if ((a==b) && (c==d)) e++;
              CMP   r0, r1
              CMPEQ r2, r3
              ADDEQ r4, r4, #1
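The CMP / CMPEQ / ADDEQ sequence works because a conditional instruction that is skipped leaves the flags untouched, so the Z flag seen by ADDEQ is set only if both comparisons found equality. A Python sketch of that flag flow, modelling only the Z flag; the function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def cmp_z(a: int, b: int) -> bool:
    """Z flag after CMP a, b: set when the 32-bit difference is zero."""
    return ((a - b) & MASK) == 0


def increment_if_both_equal(r0: int, r1: int, r2: int, r3: int, r4: int) -> int:
    """Model of: if ((a==b) && (c==d)) e++; with a..e in r0..r4."""
    z = cmp_z(r0, r1)          # CMP   r0, r1
    if z:                      # CMPEQ r2, r3 (executes only when Z is set)
        z = cmp_z(r2, r3)
    if z:                      # ADDEQ r4, r4, #1
        r4 = (r4 + 1) & MASK
    return r4
```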

Page 28: ARM Introduction & Instruction Set Architecture

Branch and link instructions
  Branch to subroutine (r14 serves as a link register)

              BL SUBR                     ; branch to SUBR
              ..                          ; return here
      SUBR:   ..                          ; SUBR entry point
              MOV pc, r14                 ; return

  Nested subroutines

              BL SUB1
              ..
      SUB1:                               ; save work and link register
              STMFD r13!, {r0-r2,r14}
              BL SUB2
              ..
              LDMFD r13!, {r0-r2,pc}
      SUB2:   ..
              MOV pc, r14                 ; copy r14 into r15
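In the nested case, STMFD pushes the work registers and the link register onto a full descending stack, and the matching LDMFD pops the saved link value straight into the PC, which performs the return. A Python sketch of the push and pop, assuming 4-byte words and memory modelled as a dictionary; the function names mirror the mnemonics and the addresses in the usage note are made up for illustration.

```python
def stmfd(mem: dict, sp: int, values: list) -> int:
    """STMFD sp!, {regs}: decrement the stack pointer, then store.

    values are in ascending register-number order; the lowest register
    ends up at the lowest address. Returns the new stack pointer.
    """
    sp -= 4 * len(values)
    for i, v in enumerate(values):
        mem[sp + 4 * i] = v
    return sp


def ldmfd(mem: dict, sp: int, count: int) -> tuple:
    """LDMFD sp!, {regs}: load from the stack, then increment the pointer.

    Returns (values, new_sp).
    """
    values = [mem[sp + 4 * i] for i in range(count)]
    return values, sp + 4 * count
```

Pushing `[r0, r1, r2, r14]` from a stack pointer of 0x2000 leaves it at 0x1FF0; popping four words restores 0x2000, and the last word popped is the value that LDMFD would place in the PC.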

Page 29: ARM Introduction & Instruction Set Architecture

Supervisor calls
  Supervisor is a program which operates at a privileged level – it can do things that a user-level program cannot do directly
    Example: send text to the display
  ARM ISA includes SWI (SoftWare Interrupt)

              ; output r0[7:0]
              SWI SWI_WriteC
              ; return from a user program back to monitor
              SWI SWI_Exit

Page 30: ARM Introduction & Instruction Set Architecture

Jump tables
  Call one of a set of subroutines depending on a value computed by the program

              BL JTAB
              ...
      JTAB:   CMP r0, #0
              BEQ SUB0
              CMP r0, #1
              BEQ SUB1
              CMP r0, #2
              BEQ SUB2

  Note: slow when the list is long, and all subroutines are equally frequent

              BL JTAB
              ...
      JTAB:   ADR r1, SUBTAB
              CMP r0, #SUBMAX               ; overrun?
              LDRLS pc, [r1, r0, LSL #2]
              B ERROR
      SUBTAB: DCD SUB0
              DCD SUB1
              DCD SUB2
              ...
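The table-based version replaces the chain of compares with one bounds check and one indexed load of the target address. The same idea in Python is a list of callables indexed by the selector. In this sketch `dispatch` is a hypothetical name, and SUBMAX is assumed to be the highest valid index, which is consistent with the LS (lower or same) condition but is not stated on the slide.

```python
def dispatch(r0: int, subtab: list, error):
    """Call subtab[r0] if the index is in range, otherwise call error."""
    submax = len(subtab) - 1   # assumption: SUBMAX is the highest valid index
    if 0 <= r0 <= submax:      # CMP r0, #SUBMAX ; LDRLS pc, [r1, r0, LSL #2]
        return subtab[r0]()
    return error()             # B ERROR
```

The cost of selecting a subroutine no longer depends on its position in the table, which addresses the note about long lists.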

Page 31: ARM Introduction & Instruction Set Architecture

Hello ARM World!

                  AREA HelloW, CODE, READONLY   ; declare code area
      SWI_WriteC  EQU &0                        ; output character in r0
      SWI_Exit    EQU &11                       ; finish program
                  ENTRY                         ; code entry point
      START:      ADR r1, TEXT                  ; r1 <- Hello ARM World!
      LOOP:       LDRB r0, [r1], #1             ; get the next byte
                  CMP r0, #0                    ; check for text end
                  SWINE SWI_WriteC              ; if not end of string, print
                  BNE LOOP
                  SWI SWI_Exit                  ; end of execution
      TEXT        = "Hello ARM World!", &0a, &0d, 0
                  END
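The control flow of the program is a post-indexed byte load in a loop that stops at the zero terminator. A Python sketch of the same loop, assuming the text is held in a `bytes` object and collecting the characters instead of issuing supervisor calls; `write_string` is a hypothetical name.

```python
def write_string(memory: bytes, start: int) -> str:
    """Collect characters from memory[start:] up to the zero terminator."""
    out = []
    r1 = start
    while True:
        r0 = memory[r1]      # LDRB r0, [r1], #1 ; get the next byte
        r1 += 1              # post-index update of the pointer
        if r0 == 0:          # CMP r0, #0 ; check for text end
            break
        out.append(chr(r0))  # SWINE SWI_WriteC ; print if not end of string
    return "".join(out)
```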