Architecture Revisions
[Figure: architecture version plotted against time (1994 to 2006), showing V4, ARMv5, ARMv6 and ARMv7 with example cores: StrongARM®, ARM7TDMI-S™, ARM720T™, ARM92xT, SC100™, ARM9x6E, ARM102xE, XScale™, ARM926EJ-S™, ARM1026EJ-S™, SC200™, ARM1136JF-S™, ARM1176JZF-S™, ARM1156T2F-S™]
XScale is a trademark of Intel Corporation
Data Sizes and Instruction Sets
The ARM is a 32-bit architecture.
When used in relation to the ARM:
  Byte means 8 bits
  Halfword means 16 bits (two bytes)
  Word means 32 bits (four bytes)
Most ARM cores implement two instruction sets:
  32-bit ARM Instruction Set
  16-bit Thumb Instruction Set
Jazelle cores can also execute Java bytecode
Processor Modes
The ARM has seven basic operating modes:
User : unprivileged mode under which most tasks run
FIQ : entered when a high priority (fast) interrupt is raised
IRQ : entered when a low priority (normal) interrupt is raised
Supervisor : entered on reset and when a Software Interrupt instruction is executed
Abort : used to handle memory access violations
Undef : used to handle undefined instructions
System : privileged mode using the same registers as user mode
The ARM Register Set
[Figure: for each mode, the current visible registers and the banked out registers]
  User mode:  r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
  FIQ mode:   r0-r7, r15 (pc), cpsr; banked r8-r12, r13 (sp), r14 (lr), spsr
  IRQ mode:   r0-r12, r15 (pc), cpsr; banked r13 (sp), r14 (lr), spsr
  SVC mode:   r0-r12, r15 (pc), cpsr; banked r13 (sp), r14 (lr), spsr
  Undef mode: r0-r12, r15 (pc), cpsr; banked r13 (sp), r14 (lr), spsr
  Abort mode: r0-r12, r15 (pc), cpsr; banked r13 (sp), r14 (lr), spsr
Exception Handling
When an exception occurs, the ARM:
  Copies CPSR into SPSR_<mode>
  Sets appropriate CPSR bits:
    Change to ARM state
    Change to exception mode
    Disable interrupts (if appropriate)
  Stores the return address in LR_<mode>
  Sets PC to vector address
To return, the exception handler needs to:
  Restore CPSR from SPSR_<mode>
  Restore PC from LR_<mode>
This can only be done in ARM state.

Vector Table
  0x1C  FIQ
  0x18  IRQ
  0x14  (Reserved)
  0x10  Data Abort
  0x0C  Prefetch Abort
  0x08  Software Interrupt
  0x04  Undefined Instruction
  0x00  Reset
Vector table can be at 0xFFFF0000 on ARM720T and on ARM9/10 family devices
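The entry and return sequence above can be sketched as a small Python model. This is only an illustration, not how any real core or simulator is written: the `cpu` dictionary, the function names and the `return_address` parameter are hypothetical, and the mode encodings and the rule that FIQ is masked only on FIQ and reset are assumptions taken from the ARM architecture rather than from these slides.

```python
# Vector offsets as listed in the vector table.
VECTORS = {"reset": 0x00, "undef": 0x04, "swi": 0x08, "prefetch_abort": 0x0C,
           "data_abort": 0x10, "irq": 0x18, "fiq": 0x1C}

# Mode entered for each exception, and the CPSR mode-bit encodings (assumed).
EXCEPTION_MODE = {"reset": "svc", "undef": "undef", "swi": "svc",
                  "prefetch_abort": "abort", "data_abort": "abort",
                  "irq": "irq", "fiq": "fiq"}
MODE_BITS = {"user": 0x10, "fiq": 0x11, "irq": 0x12, "svc": 0x13,
             "abort": 0x17, "undef": 0x1B, "system": 0x1F}

I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5


def take_exception(cpu, kind, return_address, high_vectors=False):
    """Model exception entry; the caller supplies the return address."""
    mode = EXCEPTION_MODE[kind]
    cpu["spsr"][mode] = cpu["cpsr"]               # copy CPSR into SPSR_<mode>
    cpsr = cpu["cpsr"] & ~T_BIT                   # change to ARM state
    cpsr = (cpsr & ~0x1F) | MODE_BITS[mode]       # change to exception mode
    cpsr |= I_BIT                                 # IRQ is always disabled
    if kind in ("fiq", "reset"):
        cpsr |= F_BIT                             # FIQ disabled only here
    cpu["cpsr"] = cpsr
    cpu["lr"][mode] = return_address              # return address in LR_<mode>
    base = 0xFFFF0000 if high_vectors else 0x00000000
    cpu["pc"] = base + VECTORS[kind]              # PC set to vector address


def exception_return(cpu, mode):
    """Model the return: restore CPSR from SPSR_<mode>, PC from LR_<mode>."""
    cpu["cpsr"] = cpu["spsr"][mode]
    cpu["pc"] = cpu["lr"][mode]


def new_cpu(cpsr, pc):
    return {"cpsr": cpsr, "pc": pc, "spsr": {}, "lr": {}}
```

For example, taking an IRQ from User mode in Thumb state leaves the model at `pc == 0x18` in IRQ mode and ARM state, and `exception_return(cpu, "irq")` puts the saved CPSR back.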
Program Status Registers
Bit layout:
  [31] N  [30] Z  [29] C  [28] V  [27] Q  [24] J  [7] I  [6] F  [5] T  [4:0] mode
  All other bits are undefined.
  Fields: f = bits [31:24], s = bits [23:16], x = bits [15:8], c = bits [7:0]
Condition code flags
  N = Negative result from ALU
  Z = Zero result from ALU
  C = ALU operation Carried out
  V = ALU operation oVerflowed
Sticky Overflow flag - Q flag
  Architecture 5TE/J only
  Indicates if saturation has occurred
J bit
  Architecture 5TEJ only
  J = 1: Processor in Jazelle state
Interrupt Disable bits
  I = 1: Disables the IRQ.
  F = 1: Disables the FIQ.
T Bit
  Architecture xT only
  T = 0: Processor in ARM state
  T = 1: Processor in Thumb state
Mode bits
  Specify the processor mode
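As a rough illustration of the layout described above, the following Python sketch splits a 32-bit status register value into its named bits. The function name `decode_psr` is hypothetical, and the mode-bit encodings in `MODES` are an assumption from the ARM architecture, since the slide only says that the mode bits specify the processor mode.

```python
# Mode-bit encodings (assumed; not listed on the slide).
MODES = {0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ",
         0b10011: "Supervisor", 0b10111: "Abort",
         0b11011: "Undef", 0b11111: "System"}


def decode_psr(value):
    """Return the named fields of a CPSR/SPSR value as a dictionary."""
    def bit(position):
        return (value >> position) & 1

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),  # flags
        "Q": bit(27),                                            # sticky overflow
        "J": bit(24),                                            # Jazelle state
        "I": bit(7), "F": bit(6),                                # interrupt disables
        "T": bit(5),                                             # Thumb state
        "mode": MODES.get(value & 0x1F, "unknown"),
    }
```

For instance, `decode_psr(0x600000D3)` reports Z and C set, both interrupts disabled, ARM state, and Supervisor mode.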
Program Counter (r15)
When the processor is executing in ARM state:
  All instructions are 32 bits wide
  All instructions must be word aligned
  Therefore the pc value is stored in bits [31:2] with bits [1:0] undefined (as instruction cannot be halfword or byte aligned)
When the processor is executing in Thumb state:
  All instructions are 16 bits wide
  All instructions must be halfword aligned
  Therefore the pc value is stored in bits [31:1] with bit [0] undefined (as instruction cannot be byte aligned)
When the processor is executing in Jazelle state:
  All instructions are 8 bits wide
  Processor performs a word access to read 4 instructions at once
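The alignment rules for the program counter can be expressed as bit masks. The sketch below is only an illustration; the function name `aligned_pc` is hypothetical, and treating Jazelle state as byte aligned is an assumption drawn from the 8-bit instruction width.

```python
# Bits of the pc that carry the address in each state.
PC_MASKS = {
    "ARM": 0xFFFFFFFC,      # bits [31:2]; bits [1:0] undefined
    "Thumb": 0xFFFFFFFE,    # bits [31:1]; bit [0] undefined
    "Jazelle": 0xFFFFFFFF,  # assumed byte aligned (8-bit instructions)
}


def aligned_pc(value, state):
    """Drop the address bits that are undefined in the given state."""
    return value & PC_MASKS[state]
```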
Conditional Execution and Flags
ARM instructions can be made to execute conditionally by postfixing them with the appropriate condition code field.
  This improves code density and performance by reducing the number of forward branch instructions.
By default, data processing instructions do not affect the condition code flags but the flags can be optionally set by using “S”. CMP does not need “S”.

loop
    …
    SUBS r1,r1,#1    ; decrement r1 and set flags
    BNE loop         ; if Z flag clear then branch
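To make the flag-setting behaviour of the SUBS/BNE loop concrete, here is a minimal Python model of a 32-bit subtract that sets N, Z, C and V, plus a loop that repeats until Z is set. The names `subs` and `run_loop` are hypothetical, and the flag formulas (C set when no borrow occurs, V set on signed overflow) are the usual ARM definitions rather than something stated on the slide.

```python
MASK32 = 0xFFFFFFFF


def subs(a, b):
    """Model SUBS: return (a - b) as a 32-bit value plus the flags it sets."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    flags = {
        "N": result >> 31,                     # top bit of the result
        "Z": int(result == 0),                 # result is zero
        "C": int(a >= b),                      # no borrow occurred
        "V": ((a ^ b) & (a ^ result)) >> 31,   # signed overflow
    }
    return result, flags


def run_loop(r1):
    """Model 'loop: SUBS r1,r1,#1 / BNE loop'; returns (iterations, r1)."""
    iterations = 0
    while True:
        iterations += 1
        r1, flags = subs(r1, 1)   # decrement r1 and set flags
        if flags["Z"] == 1:       # BNE not taken once Z is set
            break
    return iterations, r1
```

Starting with r1 = 3, the loop body runs three times. Starting with r1 = 0 the counter wraps to 0xFFFFFFFF first, so the loop would run 2^32 times.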
Condition Codes
The possible condition codes are listed below
  Note AL is the default and does not need to be specified

Suffix  Description              Flags tested
EQ      Equal                    Z=1
NE      Not equal                Z=0
CS/HS   Unsigned higher or same  C=1
CC/LO   Unsigned lower           C=0
MI      Minus                    N=1
PL      Positive or Zero         N=0
VS      Overflow                 V=1
VC      No overflow              V=0
HI      Unsigned higher          C=1 & Z=0
LS      Unsigned lower or same   C=0 or Z=1
GE      Greater or equal         N=V
LT      Less than                N!=V
GT      Greater than             Z=0 & N=V
LE      Less than or equal       Z=1 or N=!V
AL      Always
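The condition code table translates directly into a lookup of predicates over the four flags. The sketch below is one way to write that down in Python; `CONDITIONS` and `condition_passed` are hypothetical names used only for this illustration.

```python
# One predicate per suffix, taking the N, Z, C and V flags as 0 or 1.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "HS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "LO": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}


def condition_passed(suffix, n, z, c, v):
    """Return True if an instruction with this suffix would execute."""
    return CONDITIONS[suffix.upper()](n, z, c, v)
```

After a compare of two equal values (Z=1, C=1), `condition_passed("EQ", 0, 1, 1, 0)` is true while `condition_passed("HI", 0, 1, 1, 0)` is false.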
Address accessed by LDR/STR is specified by a base register with an offset
For word and unsigned byte accesses, offset can be:
  An unsigned 12-bit immediate value (i.e. 0 - 4095 bytes)
    LDR r0, [r1, #8]
  A register, optionally shifted by an immediate value
    LDR r0, [r1, r2]
    LDR r0, [r1, r2, LSL#2]
  This can be either added or subtracted from the base register:
    LDR r0, [r1, #-8]
    LDR r0, [r1, -r2, LSL#2]
For halfword and signed halfword / byte, offset can be:
  An unsigned 8 bit immediate value (i.e. 0 - 255 bytes)
  A register (unshifted)
Choice of pre-indexed or post-indexed addressing
Choice of whether to update the base pointer (pre-indexed only)
    LDR r0, [r1, #-8]!
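The addressing choices above can be modelled in a few lines of Python. This is a sketch under assumptions: the register file is a plain dictionary, and the names `shifted`, `immediate_ok` and `ldr_address` are hypothetical. It computes only the address that would be accessed and the effect on the base register, not the memory transfer itself.

```python
MASK32 = 0xFFFFFFFF


def shifted(rm, lsl=0):
    """Register offset, optionally shifted left by an immediate (LSL #n)."""
    return (rm << lsl) & MASK32


def immediate_ok(offset, access):
    """Check the immediate range: 12 bits for word and unsigned byte
    accesses, 8 bits for halfword and signed halfword/byte accesses."""
    limit = 4095 if access in ("word", "unsigned_byte") else 255
    return abs(offset) <= limit


def ldr_address(regs, base, offset, mode="offset"):
    """Return the accessed address; update the base register if required.

    mode "offset": [rn, off]   access base+offset, base unchanged
    mode "pre":    [rn, off]!  access base+offset, base updated
    mode "post":   [rn], off   access base, then base updated
    """
    start = regs[base]
    target = (start + offset) & MASK32
    if mode == "offset":
        return target
    if mode == "pre":
        regs[base] = target
        return target
    if mode == "post":
        regs[base] = target
        return start
    raise ValueError("unknown addressing mode: " + mode)
```

With r1 = 0x1000, the model of `LDR r0, [r1, #-8]!` accesses 0xFF8 and leaves r1 = 0xFF8, whereas the plain offset form accesses the same address and leaves r1 alone.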
Load/Store Exercise
Assume an array of 25 words. A compiler associates y with r1. Assume that the base address for the array is located in r2. Translate this C statement/assignment using just three instructions:
Multiply and Divide
Most ARM cores do not offer integer divide instructions
  Division operations will be performed by C library routines or inline shifts
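As an illustration of what a software division routine does on a core without a divide instruction, the following Python sketch performs unsigned 32-bit division by shift and subtract. It is a generic textbook method, not the routine from any particular C library, and the name `udiv32` is hypothetical.

```python
def udiv32(numerator, divisor):
    """Unsigned 32-bit restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(31, -1, -1):
        # Shift the next numerator bit into the running remainder.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder
```

When the divisor is a power of two, an inline shift gives the same unsigned quotient: `100 >> 3` equals `udiv32(100, 8)[0]`.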
Branch instructions
Branch : B{<cond>} label
Branch with Link : BL{<cond>} subroutine_label

Encoding:
  [31:28]  Cond    Condition field
  [27:25]  1 0 1
  [24]     L       Link bit: 0 = Branch, 1 = Branch with link
  [23:0]   Offset

The processor core shifts the offset field left by 2 positions, sign-extends it and adds it to the PC
  ± 32 Mbyte range
  How to perform longer branches?
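The offset arithmetic can be checked with a short Python sketch that encodes and decodes a branch in ARM state. The function names are hypothetical, and two details are assumptions from the ARM architecture rather than from the slide: the PC reads as the branch instruction's address plus 8, and the condition field value for AL is 0xE.

```python
MASK32 = 0xFFFFFFFF


def encode_branch(address, target, link=False, cond=0xE):
    """Build the 32-bit encoding of B/BL at 'address' jumping to 'target'."""
    offset = target - (address + 8)          # PC is 8 bytes ahead (assumed)
    if offset % 4 != 0:
        raise ValueError("target must be word aligned")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target outside the +/- 32 Mbyte range")
    imm24 = (offset >> 2) & 0xFFFFFF         # 24-bit offset field
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | imm24


def branch_target(address, instruction):
    """Recover the destination from the offset field of a B/BL encoding."""
    imm24 = instruction & 0xFFFFFF
    if imm24 & 0x800000:                     # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return (address + 8 + (imm24 << 2)) & MASK32
```

A branch to itself at 0x8000 needs an offset of -8 bytes, which this model encodes as 0xEAFFFFFE.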
Register Usage
The compiler has a set of rules known as a Procedure Call Standard that determine how to pass parameters to a function (see AAPCS)
CPSR flags may be corrupted by function call.
Assembler code which links with compiled code must follow the AAPCS at external interfaces
The AAPCS is part of the new ABI for the ARM Architecture

Register   Usage
r0-r3      Arguments into function, Result(s) from function, otherwise corruptible (Additional parameters passed on stack)
r4-r8      Register variables, Must be preserved
r9/sb      Register variable, Must be preserved - Stack base
r10/sl     Register variable, Must be preserved - Stack limit if software stack checking selected
r11        Register variable, Must be preserved
r12        Scratch register (corruptible)
r13/sp     Stack Pointer - SP should always be 8-byte (2 word) aligned
r14/lr     Link Register - R14 can be used as a temporary once value stacked
r15/pc     Program Counter
ARM Branches and Subroutines
B <label>
  PC relative. ±32 Mbyte range.
BL <subroutine>
  Stores return address in LR
  Returning implemented by restoring the PC from LR
  For non-leaf functions, LR will have to be stacked

    :
    BL func1
    :

func1
    STMFD sp!,{regs,lr}
    :
    BL func2
    :
    LDMFD sp!,{regs,pc}

func2
    :
    MOV pc, lr
PSR access
Bit layout:
  [31] N  [30] Z  [29] C  [28] V  [27] Q  [24] J  [19:16] GE[3:0]  [15:10] IT cond_abcde  [9] E  [8] A  [7] I  [6] F  [5] T  [4:0] mode
  Fields: f = bits [31:24], s = bits [23:16], x = bits [15:8], c = bits [7:0]
MRS and MSR allow contents of CPSR / SPSR to be transferred to / from a general purpose register or take an immediate value
  MSR allows the whole status register, or just parts of it, to be updated
Interrupts can be enabled/disabled and modes changed, by writing to the CPSR
  Typically a read/modify/write strategy should be used:

    MRS r0,CPSR        ; read CPSR into r0
    BIC r0,r0,#0x80    ; clear bit 7 to enable IRQ
    MSR CPSR_c,r0      ; write modified value to ‘c’ byte only

In User Mode, all bits can be read but only the condition flags (_f) can be modified
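The read/modify/write sequence and the field suffixes can be modelled as masked updates. The Python sketch below is illustrative only: `msr` and `enable_irq` are hypothetical helper names, and the `privileged` flag is a simplified stand-in for the User Mode restriction.

```python
MASK32 = 0xFFFFFFFF

# Byte lanes selected by the _f, _s, _x and _c suffixes.
FIELD_MASKS = {"f": 0xFF000000, "s": 0x00FF0000,
               "x": 0x0000FF00, "c": 0x000000FF}


def msr(psr, value, fields, privileged=True):
    """Write 'value' into the selected fields of 'psr'; return the new PSR."""
    mask = 0
    for field in fields:
        mask |= FIELD_MASKS[field]
    if not privileged:
        mask &= FIELD_MASKS["f"]     # User Mode may only change the flags
    return (psr & ~mask & MASK32) | (value & mask)


def enable_irq(cpsr):
    """Model the MRS / BIC / MSR CPSR_c sequence; return the new CPSR."""
    r0 = cpsr                        # MRS r0,CPSR
    r0 &= ~0x80 & MASK32             # BIC r0,r0,#0x80 (clear the I bit)
    return msr(cpsr, r0, "c")        # MSR CPSR_c,r0
```

Starting from 0x600000D3, `enable_irq` yields 0x60000053: only bit 7 changes, and the flags byte is untouched because just the `c` field is written.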
Agenda
  Introduction to ARM Ltd
  Fundamentals, Programmer’s Model, and Instructions
  Core Family Pipelines
  AMBA
Pipeline changes for ARM9TDMI
[Figure: pipeline stages of the two cores]
  ARM7TDMI
    FETCH:   Instruction Fetch
    DECODE:  Thumb to ARM decompress, ARM decode, Reg Select
    EXECUTE: Reg Read, Shift, ALU, Reg Write
  ARM9TDMI
    FETCH:   Instruction Fetch
    DECODE:  ARM or Thumb Inst Decode, Reg Decode, Reg Read
    EXECUTE: Shift + ALU
    MEMORY:  Memory Access
    WRITE:   Reg Write
ARM10 vs. ARM11 Pipelines
[Figure: pipeline stages of the two cores]
  ARM10
    FETCH:   Branch Prediction, Instruction Fetch
    ISSUE:   ARM or Thumb Instruction Decode
    DECODE:  Reg Read
    EXECUTE: Shift + ALU, Multiply
    MEMORY:  Memory Access, Multiply Add
    WRITE:   Reg Write
  ARM11
    Fetch1, Fetch2, Decode, Issue, followed by one of:
      Shift, ALU, Saturate
      MAC1, MAC2, MAC3
      Address, Data Cache 1, Data Cache 2
    then Writeback
Agenda
  Introduction to ARM Ltd
  Fundamentals, Programmer’s Model, and Instructions
  Core Family Pipelines
  AMBA
Example ARM-based System
[Figure: block diagram of an ARM Core connected to 16 bit RAM, 8 bit ROM, 32 bit RAM and I/O Peripherals, with an Interrupt Controller driving the core's nFIQ and nIRQ inputs]
[Figure: example bus system with an AHB and an APB joined by an APB Bridge]
  AHB: High Performance ARM processor, High-bandwidth on-chip RAM, High Bandwidth External Memory Interface, DMA Bus Master, APB Bridge
  APB: Timer, Keypad, UART, PIO
AHB
  High Performance
  Pipelined
  Burst Support
  Multiple Bus Masters