ECE 425

Subroutines and Stacks

Subroutines
- A subroutine is a separate, independent module of a program that performs a specific task.
- Subroutines shorten code and provide reusable tools.
- High-level languages typically have libraries of subroutines.
- The objective is to avoid reinventing the wheel.

Subroutine Modularity
[Figure: call hierarchy with Main Program at the top, Subroutines 1, 2, and 3 below it, and nested Subroutines 2.1, 3.1, 3.1.1, and 3.2.]

Subroutine Techniques
Techniques
- Keep them short.
- Make them reusable (more on this coming).
- Main will frequently do nothing but call subroutines.
Good subroutine practice
- Independence: does not depend on other code, can be used in many programs.
- Registers: stores/restores key registers upon entrance/exit using push and pull commands (where?).
- Data and code are location independent: use local variables, do not use hardcoded addresses.

Nothing But Subroutines
Well-written code will often consist of nothing but subroutine calls at the top level:

main    BL      GetPosition
        BL      CalcOffset
        BL      DisplayData

Using BL presumes the subroutine code is within 32 MB of the calling routine. This is almost always a fair assumption.

The Stack
- Calls to subroutines require use of the stack.
- The stack is a piece of memory dedicated to temporary storage of run-time variables.
- What type of memory? RAM, ROM, DRAM, SRAM, register file?
- The stack is always organized as a LIFO queue (Last In, First Out).
- PUSH an element onto the stack; POP one off.
- Mostly used for saving/restoring machine registers.

[Figure: an example of stack operations]

Load/Store Architecture
- Being a RISC machine, the ARM processor does not have dedicated PUSH/POP instructions.
- It uses LDR/STR instead.
- The assembler may translate PUSH/POP mnemonics to load/store opcodes.
- R13 is normally designated as the stack pointer (sp).
- It points to the top of the stack.
- The top can be either the next empty location or the last filled one. This gives more flexibility than other architectures.

Push/Pull Multiple
- Going to a subroutine may require saving/restoring a lot of registers.
- ARM has the LDM/STM instructions to push/pull multiple registers with one opcode.
- Actual reads/writes are still done sequentially. You can't push/pull more than one register on any given clock cycle.
- Saves code space, and saves fetching/decoding multiple load/store opcodes.

LDM/STM
- Load/store multiple ops can transfer between 1 and 16 registers to/from memory.
- Only 32-bit words are transferred; not valid for bytes or half words.
- The order of the registers to be transferred cannot be specified: reordering the transfer list will be ignored.
- The lowest register number will be transferred to/from the lowest address.
- The contents of the base register will be used to determine the lowest address.

LDM/STM Addressing Modes
LDM and STM have special operating modes:
- IA: increment after
- IB: increment before
- DA: decrement after
- DB: decrement before
These suffixes are not for use with other instructions.
The PUSH and POP mnemonics are the same as STMDB sp!, {...} and LDMIA sp!, {...}, respectively.

LDMIA Example
Load r0, r2, r4 with memory data starting at address 0xBEEF0000.

        LDR     r1, =0xBEEF0000
        LDMIA   r1, {r0, r2, r4}    ; mem32[r1] -> r0; mem32[r1 + 4] -> r2; mem32[r1 + 8] -> r4
        LDMIA   r1, {r4, r0, r2}    ; EXACTLY the same

r1 will remain unchanged. Effective addresses are generated on the fly.


STMDB Example
Save the values of r0, r2, r4 to the stack with top byte address 0xBEEF0000.

        LDR     r1, =0xBEEF0000
        STMDB   r1, {r0, r2, r4}    ; r0 -> mem32[r1 - 12]; r2 -> mem32[r1 - 8]; r4 -> mem32[r1 - 4]
        STMDB   r1, {r4, r0, r2}    ; EXACTLY the same

The lowest-numbered register always goes to the lowest address, so r0 lands at r1 - 12 and r4 at r1 - 4.
r1 will remain unchanged. Effective addresses are generated on the fly.
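The register-to-address mapping in the LDMIA and STMDB examples can be checked with a small model. The following is a minimal Python sketch, not ARM code: memory is modeled as a dictionary of word addresses, and the function names and the stored values (0xA0, 0xA2, 0xA4) are assumptions made up for illustration.

```python
WORD = 4  # LDM/STM transfer 32-bit words only


def stmdb(memory, base, regs):
    """Model STMDB base, {regs} without writeback.

    regs maps register number -> value (hypothetical test values).
    """
    address = base - WORD * len(regs)      # lowest address that is written
    for number in sorted(regs):            # list order is ignored: lowest register first
        memory[address] = regs[number]
        address += WORD


def ldmia(memory, base, numbers):
    """Model LDMIA base, {numbers} without writeback; returns {register: value}."""
    loaded = {}
    address = base
    for number in sorted(numbers):         # lowest register reads the lowest address
        loaded[number] = memory[address]
        address += WORD
    return loaded


memory = {}
# STMDB r1, {r4, r0, r2} with r1 = 0xBEEF0000 (listed out of order on purpose)
stmdb(memory, 0xBEEF0000, {4: 0xA4, 0: 0xA0, 2: 0xA2})
# LDMIA from the lowest address written reads the same three words back
restored = ldmia(memory, 0xBEEF0000 - 12, [2, 4, 0])
```

In this model r0 ends up at 0xBEEF0000 - 12 and r4 at 0xBEEF0000 - 4, regardless of the order in which the registers are listed.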


What's the Point?
- Can be used for block data transfers.
- Mostly used for context switches.
- The most common context switch is going to a subroutine.
- Save the current processor state so it can be restored after the subroutine.
- A context switch can also be an interrupt (to be covered shortly) or switching users/tasks in multi-threaded applications.

What If You Want Base Register to Change?
- Previous examples all leave the base address pointer alone.
- The final address can be written back to it.
- Just append ! to the base register.

        LDMIA   r10!, {r0}          ; r10 will be incremented by 4 (one word)
        LDMIA   r10!, {r0-r3}       ; r10 will be incremented by 16

Up/Down, Empty/Full
- Most processors have rigid definitions for stack ops.
  - e.g., stacks always decrement on push, increment on pop.
  - e.g., stack pointer always points to the next address to be pushed.
- ARM leaves that up to the developer.
  - The stack can grow up or down (i.e., the address could increase or decrease).
  - The stack pointer can point to the next address to be filled or the last address that was filled.

What kind of stack is it?

Up/Down, Empty/Full
In all cases, r9 starts out with 100C. If the ! was left out, r9 would always be 100C. Addresses are in hexadecimal and increase from 1000 to 1018.

Instruction                 r0 at   r1 at   r5 at   r9 final value
STMIA r9!, {r0, r1, r5}     100C    1010    1014    1018
STMIB r9!, {r0, r1, r5}     1010    1014    1018    1018
STMDA r9!, {r0, r1, r5}     1004    1008    100C    1000
STMDB r9!, {r0, r1, r5}     1000    1004    1008    1000

Up/Down, Empty/Full
- A stack can be full or empty.
  - Full: the stack pointer (sp) points to the last address written to.
  - Empty: the stack pointer points to the next available address.
- A stack can be ascending or descending.
  - Ascending: the memory address goes up as data is pushed in.
  - Descending: the memory address goes down as data is pushed in.
- How to use the stack? Normally, LDM/STM work together.
  - Store register values before executing subroutine tasks.
  - Load back the original register values after subroutine tasks.
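The four store-multiple modes and the effect of the ! writeback can be worked through numerically. This is a minimal Python sketch that models only the address arithmetic; the function name `stm` and its return format are assumptions for illustration, not part of any ARM tool.

```python
WORD = 4  # bytes per 32-bit register


def stm(mode, base, reg_numbers, writeback=True):
    """Model the addresses used by STM<mode> base{!}, {registers}.

    Returns (placement, final_base), where placement maps a register name
    to the address its value is stored at.
    """
    count = len(reg_numbers)
    lowest = {
        "IA": base,                        # increment after: start at base
        "IB": base + WORD,                 # increment before: skip one word first
        "DA": base - WORD * (count - 1),   # decrement after: base is the highest word
        "DB": base - WORD * count,         # decrement before: block ends just below base
    }[mode]
    # The lowest-numbered register always goes to the lowest address.
    placement = {f"r{n}": lowest + WORD * i
                 for i, n in enumerate(sorted(reg_numbers))}
    if not writeback:
        final_base = base                  # without !, the base register is untouched
    elif mode in ("IA", "IB"):
        final_base = base + WORD * count
    else:
        final_base = base - WORD * count
    return placement, final_base


for mode in ("IA", "IB", "DA", "DB"):
    placement, final_base = stm(mode, 0x100C, [0, 1, 5])
    cells = ", ".join(f"{reg} at {addr:04X}" for reg, addr in placement.items())
    print(f"STM{mode} r9!, {{r0, r1, r5}}: {cells}; r9 final value {final_base:04X}")
```

Starting from r9 = 100C, the incrementing modes finish at 1018 and the decrementing modes at 1000; the "before" and "after" variants differ only in which three words are written.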

How to use the stack operations?
A stack can be:
- Full descending (FD)
- Full ascending (FA)
- Empty descending (ED)
- Empty ascending (EA)
Select a type, and be consistent.

Stack Type          PUSH             PULL
Full descending     STMFD (STMDB)    LDMFD (LDMIA)
Full ascending      STMFA (STMIB)    LDMFA (LDMDA)
Empty descending    STMED (STMDA)    LDMED (LDMIB)
Empty ascending     STMEA (STMIA)    LDMEA (LDMDB)

Subroutines
The general structure of a subroutine in a program is:
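Why each PUSH mode is paired with one particular PULL mode can be checked by modeling the addresses. The following Python sketch is an illustration under stated assumptions: memory is a dictionary, the starting stack pointer 0x2000 and the pushed values are made up, and the helper names are hypothetical.

```python
WORD = 4

# Stack type -> (store mode used for PUSH, load mode used for PULL)
STACK_TYPES = {
    "FD": ("DB", "IA"),
    "FA": ("IB", "DA"),
    "ED": ("DA", "IB"),
    "EA": ("IA", "DB"),
}


def block(mode, base, count):
    """Return (word addresses low to high, base after writeback) for one LDM/STM."""
    lowest = {
        "IA": base,
        "IB": base + WORD,
        "DA": base - WORD * (count - 1),
        "DB": base - WORD * count,
    }[mode]
    step = WORD * count if mode[0] == "I" else -WORD * count
    return [lowest + WORD * i for i in range(count)], base + step


def push(memory, sp, values, stack_type):
    """values are listed in ascending register order; returns the new sp."""
    addresses, new_sp = block(STACK_TYPES[stack_type][0], sp, len(values))
    memory.update(zip(addresses, values))
    return new_sp


def pull(memory, sp, count, stack_type):
    """Returns (values in ascending register order, new sp)."""
    addresses, new_sp = block(STACK_TYPES[stack_type][1], sp, count)
    return [memory[a] for a in addresses], new_sp


results = {}
for stack_type in STACK_TYPES:
    memory = {}
    sp_after_push = push(memory, 0x2000, [11, 22, 33], stack_type)
    values, sp_after_pull = pull(memory, sp_after_push, 3, stack_type)
    results[stack_type] = (values, sp_after_pull)
```

For all four stack types the matching pair returns the original values and leaves the stack pointer where it started, which is the reason for selecting one type and staying consistent.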

Using stack in subroutines

        BL      SUB1
        ...
SUB1    STMFD   sp!, {r0-r2, r5}
        ...                         ; r0-r2, r5 are used in here; may be altered by SUB1
        LDMFD   sp!, {r0-r2, r5}
        MOV     pc, lr

        BL      SUB1
        ...
SUB1    STMFD   sp!, {r0-r2, r5}
        BL      SUB2                ; calls SUB2 in SUB1
        LDMFD   sp!, {r0-r2, r5}
        MOV     pc, lr

Anything wrong?

Subroutines Nesting
- A subroutine may call another subroutine or itself.
- A routine that can call itself is recursive.
- Example: factorial. Factorial(N) = N * Factorial(N-1)
- The return address is stored in r14, which is transferred to pc when the routine terminates.
- But if a subroutine is called recursively (or calls other subroutines), what happens?

Subroutine Nesting
- Main calls subroutine A. The return address to main is stored in r14.
- Subroutine A calls subroutine B. r14 is overwritten with the new return address to A.
- Subroutine B finishes; r14 is transferred to PC.
- PC now points where? Into subroutine A, which resumes executing.
- When subroutine A finishes, r14 still holds the return address into A, so A starts executing again instead of returning to main.
Solution:
- Use the stack.
- Push the link register before the next BL.
- Pop it upon return.
- Always push/pop the link register value in a subroutine.

Previous Example

        BL      SUB1
        ...
SUB1    STMFD   sp!, {r0-r2, r5, lr}
        BL      SUB2                ; calls SUB2 in SUB1
        LDMFD   sp!, {r0-r2, r5, pc}
        MOV     pc, lr              ; no longer necessary, why?

[Figure: call flow. Main executes BL SUB1; SUB1 executes STMFD sp!, {regs, lr}, then BL SUB2, then LDMFD sp!, {regs, pc}; SUB2 ends with MOV pc, lr.]

Parameter Passing
- Usually a subroutine needs to operate on some data that has been set by the calling program.
- For flexibility, it would be poor practice to hardwire the passed parameters.
  - For example, a square root function that only calculates the square root of 8.
- These data can be passed to the subroutine by value or by reference.
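The link register problem in nested calls can be traced step by step. This is a minimal Python sketch of the main -> SUB1 -> SUB2 sequence; the return addresses are symbolic strings ("main+4", "SUB1+8") invented for illustration, and a Python list stands in for the stack.

```python
def return_target(save_lr):
    """Trace main -> SUB1 -> SUB2 and report where SUB1's final return lands."""
    stack = []
    lr = "main+4"              # BL SUB1 in main: lr holds the return address into main
    if save_lr:
        stack.append(lr)       # SUB1 entry: STMFD sp!, {lr}
    lr = "SUB1+8"              # BL SUB2 in SUB1: lr is overwritten
    pc = lr                    # SUB2 exit: MOV pc, lr (back in SUB1, as intended)
    if save_lr:
        pc = stack.pop()       # SUB1 exit: LDMFD sp!, {pc} restores the saved address
    else:
        pc = lr                # SUB1 exit: MOV pc, lr (lr still points into SUB1)
    return pc


with_stack = return_target(save_lr=True)
without_stack = return_target(save_lr=False)
```

With the link register saved, SUB1 returns into main. Without it, SUB1 jumps back into itself, which is the fault in the second example under "Using stack in subroutines".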

Call by Value
- Uses CPU registers: the subroutine operates on the values of the associated registers.
- Values used by the routine are parameters.
- Call by value means the parameters are held in certain registers.
- Any/all registers r0-r12 may be used.
- The ARM Application Procedure Call Standard calls for them to be held in r0-r3.
- If that's not enough, use the stack, too.
- The calling program sets the registers; the subroutine uses the values found there.

Saving register values
- If the subroutine is going to operate on a register, should that register be pushed before starting the subroutine?
- A subroutine may need registers for temporary storage and intermediate variables.
- The solution is to push any registers the routine will use, and pop them when it's done.
- Use STM to save machine state, including the link register.
- Use LDM to restore machine state, including the link register (to the program counter).

Call by Reference
- If there is a large amount of parameter data stored in memory, use call by reference.
  - For example: get the average of 100 grades listed in a table.
- Used to manipulate data in memory.
- Data are stored in memory consecutively.
- The start address of the data block is passed to the subroutine.
- How are memory addresses passed? By putting them in registers!

Call by Reference Example

startaddr   EQU     0x5000          ; start address of block to be scrambled
stopaddr    EQU     0x6000          ; stop scrambling here

            LDR     sp, =SRAM_BASE
            LDR     r0, =startaddr
            LDR     r1, =stopaddr
            BL      scramble

What will happen if sp is not initialized?

ARM APC
- The Application Procedure Call Standard sets guidelines for subroutine behavior.
- Allows multiple programmers to write routines that won't corrupt other segments.
- Says some registers may be changed, others must be preserved.
- Requires the stack to be eight-byte aligned (the top-of-stack address ends in binary 000) and full descending.
- The stack may be used for parameters and to preserve other register values that the routine would otherwise corrupt.

APC Registers

Registers   Usage
r0-r3       Pass parameters via r0-r3.
r4-r11      Values must be preserved.
r12         Can be changed by the routine and does not need to be restored.
r13-r15     Stack pointer, link register and program counter. Their functioning does not change in subroutines.
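The eight-byte alignment requirement amounts to the three least significant bits of the stack pointer being zero. The following Python sketch checks that condition after pushing different numbers of registers onto a full descending stack; the starting address 0x20001000 and the function names are assumptions for illustration.

```python
WORD = 4


def is_aligned8(sp):
    """True when the address ends in binary 000, i.e. is a multiple of 8."""
    return sp & 0x7 == 0


def sp_after_push(sp, register_count):
    """Stack pointer after STMFD sp!, {...} with the given number of registers."""
    return sp - WORD * register_count


initial_sp = 0x20001000                  # hypothetical, already eight-byte aligned
five_regs = sp_after_push(initial_sp, 5) # e.g. {r0-r2, r5, lr}: 20 bytes
six_regs = sp_after_push(initial_sp, 6)  # an even register count: 24 bytes
```

In this model, pushing five registers leaves the stack pointer only four-byte aligned, while pushing six keeps it eight-byte aligned. Saving an even number of registers is one way to keep the alignment.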
