Advanced Procedures Assembly Language Programming Chapter 8.

Programming is:

1. Exploring the problem space – Knowing what the problem is (10%) - Requirements

2. Understanding the problem – Knowing what to program (10%) - Specification

3. Solving the problem – Knowing how to program it (20%)

Design = Data + Algorithms + Users (I/O)

4. Implementing the solution – Coding (20%)

5. Evaluating the solution – Testing (40%)

Unit, Integration, Performance/Stress, Usability, Security

Things to Consider

Limitations, Constraints, Trade-offs

Performance (Efficiency), Space/Size

Data – Encoding, representation, size, portability

Architecture, Algorithm, Programming language

Libraries, Tools, Networks, Platform

Users, Security, Privacy

Faults, fault tolerance, error recovery, error handling, failures, downtime, overflow, underflow

This list is endless and grows with experience!

Program Design Techniques

Top-Down Design (functional decomposition) involves the following:

1. design your program before starting to code

2. break large tasks into smaller ones

3. use a hierarchical structure based on procedure calls

4. test individual procedures separately

The Problems …

1. Assumes programmer has a strong understanding of the necessary and correct architecture

2. Important decisions made early – mistakes thus costly

3. Assumes hierarchical structure is actually possible

4. All initial work on design, none on coding, nothing to show for large periods of time

5. Highly likely to fail or be ineffective on large projects

Program Design, Alternatively

Bottom-Up Design (functional synthesis) involves the following:

1. Pick one small part of the program, write a procedure to perform it

2. Repeat until a collection of small parts is formed

3. Write a procedure to join several small parts

4. Continue until all the parts are implemented and joined

Build some Lego blocks, connect them, build some more …

Some comments …

1. Design tends to emerge, not necessarily planned

2. Gets parts working sooner, easier to spot programming road-blocks

3. Hard to know how big the parts should be

4. Can ignore some parts if time runs out (and they are not important)

5. Not good for teams as no plan exists

6. Generally safer, but less overall structure

Reality of Programming

A mixture of bottom-up and top-down approaches

Top-down design – Some planning needed before you start

Bottom-up implementation – Don't over plan

Iterative methodologies – Get it to work then refine and improve

Opportunistic approaches – do what works, when it works

Much personal preference exists (strong sex-based differences as well)

Creating Procedures

Large problems can be divided into smaller tasks to make them more manageable

A procedure is the ASM equivalent of a Java Method, C/C++ Function, Basic Subroutine, or Pascal Procedure

Same thing as what is in the Irvine32 library

The following is an assembly language procedure named sample:

sample PROC
    … Code for procedure goes here …
    ret
sample ENDP

CALL and RET

The CALL instruction calls a procedure

1. pushes offset of next instruction on the stack (saves the value of the instruction pointer)

2. copies the address of the called procedure into EIP (puts the address of the procedure into the instruction pointer)

3. begins to execute the code of the procedure

The RET instruction returns from a procedure

1. pops top of stack into EIP (over-writes instruction pointer with the offset of the instruction after the call)

CALL-RET Example

main PROC
    00000020  call MySub
    00000025  mov eax,ebx
    .
    .
main ENDP

MySub PROC
    00000040  mov eax,edx
    .
    .
    ret
MySub ENDP

00000025 is the offset of the instruction immediately following the CALL instruction

00000040 is the offset of the first instruction inside MySub

CALL-RET in Action

The CALL instruction pushes 00000025 onto the stack, and loads 00000040 into EIP

CALL = PUSH eip, then MOV EIP, OFFSET proc (conceptually)

[Figure: after CALL, ESP points to the stack entry 00000025 and EIP = 00000040]

The RET instruction pops 00000025 from the stack into EIP

RET = POP eip

[Figure: stack shown before RET executes, ESP points to the stack entry 00000025; after RET, EIP = 00000025]
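The push-and-jump behaviour described above can be mimicked in C. This is only a toy model, not how the processor is implemented: the `Cpu` struct, its 16-slot stack, and the names `cpu_init`, `sim_call`, and `sim_ret` are all invented for illustration.

```c
#include <stdint.h>

/* Toy model of the state that CALL and RET touch (illustrative only). */
typedef struct {
    uint32_t eip;        /* instruction pointer */
    int      sp;         /* index of the top of stack; the stack grows downward */
    uint32_t stack[16];  /* tiny stack, one doubleword per slot */
} Cpu;

/* Start with an empty stack: sp sits one past the last slot. */
void cpu_init(Cpu *cpu, uint32_t eip) {
    cpu->eip = eip;
    cpu->sp = 16;
}

/* CALL: push the offset of the next instruction, then jump to the target. */
void sim_call(Cpu *cpu, uint32_t next_instruction, uint32_t target) {
    cpu->stack[--cpu->sp] = next_instruction;
    cpu->eip = target;
}

/* RET: pop the saved offset back into the instruction pointer. */
void sim_ret(Cpu *cpu) {
    cpu->eip = cpu->stack[cpu->sp++];
}
```

Calling `sim_call(&cpu, 0x00000025, 0x00000040)` followed by `sim_ret(&cpu)` walks through the same offsets as the main/MySub example.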

Nested Procedure Calls

main PROC
    .
    .
    call Sub1
    exit
main ENDP

Sub1 PROC
    .
    .
    call Sub2
    ret
Sub1 ENDP

Sub2 PROC
    .
    .
    call Sub3
    ret
Sub2 ENDP

Sub3 PROC
    .
    .
    ret
Sub3 ENDP

By the time Sub3 is called, the stack contains all three return addresses:

    (ret to main)
    (ret to Sub1)
    (ret to Sub2)    <- ESP

USES Operator

Lists the registers that are used by a procedure

MASM inserts code that saves them on entry and restores them before returning

ArraySum PROC USES esi ecx
    mov eax,0        ; set the sum to zero
    etc.

MASM generates the push and pop instructions shown below:

ArraySum PROC
    push esi
    push ecx
    .
    .
    pop ecx
    pop esi
    ret
ArraySum ENDP

Terminology

int sum (int x, int y) { return (x+y); }
…
printf("%d\n", sum(2,3));

x, y are the parameters of the procedure called sum

2, 3 are the arguments for a specific call, invocation, instance of sum

Stack Frame

Also known as an activation record

Area of the stack set aside for a procedure's return address, passed parameters, saved registers, and local variables

Created by the following steps:

Calling program pushes arguments on the stack and calls the procedure

The called procedure pushes EBP on the stack, and sets EBP to ESP

If local variables are needed, a constant is subtracted from ESP to make room on the stack

Registers are saved if they will be altered

Calling a Proc

Push arguments (parameter values) on stack

Call proc (and push the return address on the stack)

Push EBP on the stack

Set EBP equal to ESP (points to its own location on stack)

Add stack space for local variables ("push space on stack")

Push registers to save on the stack

MEMORISE THIS SET OF STEPS!

Stack after the CALL

This is what we must build:

    Val/Address Arg 2       [EBP + 12]
    Val/Address Arg 1       [EBP + 8]
    Return Address          [EBP + 4]
    Old/Saved EBP           <- EBP
    Local Variable 1        [EBP - 4]
    EAX (saved)
    …
    EDX (saved)             <- ESP
    Empty Space
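The six steps can be traced with a small C simulation. This is a sketch under stated assumptions: the `Machine` struct, the 256-byte stack, and the helpers `push32`, `load32`, and `build_frame` are hypothetical, and the saved-register value is a placeholder.

```c
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 256 };   /* assumed size of the toy stack */

typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;   /* byte offset of the top of stack */
    uint32_t ebp;   /* byte offset of the frame base */
} Machine;

/* PUSH: move ESP down by four bytes, then store the doubleword there. */
void push32(Machine *m, uint32_t value) {
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &value, 4);
}

/* Read the doubleword stored at a given stack offset. */
uint32_t load32(const Machine *m, uint32_t addr) {
    uint32_t value;
    memcpy(&value, &m->mem[addr], 4);
    return value;
}

/* Build one stack frame by following the six steps in order. */
void build_frame(Machine *m, uint32_t arg1, uint32_t arg2,
                 uint32_t return_address, uint32_t local_bytes) {
    m->esp = STACK_BYTES;        /* empty stack */
    m->ebp = 0;                  /* stands in for the caller's EBP */
    push32(m, arg2);             /* 1. push arguments, last one first */
    push32(m, arg1);
    push32(m, return_address);   /* 2. CALL pushes the return address */
    push32(m, m->ebp);           /* 3. push ebp */
    m->ebp = m->esp;             /* 4. mov ebp,esp */
    m->esp -= local_bytes;       /* 5. sub esp,n for local variables */
    push32(m, 0xAAAAAAAAu);      /* 6. save a register (placeholder value) */
}
```

After `build_frame(&m, 5, 6, 0x25, 4)`, `load32(&m, m.ebp + 8)` yields the first argument and `load32(&m, m.ebp + 12)` the second, matching the offsets in the diagram.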

Stack Parameters

Sometimes more convenient than register parameters

Any number of values can be pushed/passed

Values are in memory and don’t need to be moved there later

Two possible ways of calling DumpMem

With register parameters:

    pushad
    mov esi,OFFSET array
    mov ecx,LENGTHOF array
    mov ebx,TYPE array
    call DumpMem
    popad

With stack parameters:

    push TYPE array
    push LENGTHOF array
    push OFFSET array
    call DumpMem

Passing Args

Push arguments on stack

Suggestion: Use only 32-bit values in protected mode to keep the stack aligned – Don’t push random length strings!

1. By Value: Push the values on the stack

2. By Reference: Push the address (offsets) on the stack

Call the called-procedure

Put return value in eax

Return

Remove arguments from the stack if the called-procedure did not remove them
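The difference between the two passing styles is easier to see in a high-level language. A minimal C sketch, with the function names `add_one_by_value` and `add_one_by_reference` invented for illustration:

```c
/* By value: the callee receives a copy, so the caller's variable is untouched. */
int add_one_by_value(int n) {
    n = n + 1;
    return n;            /* result comes back as the return value (EAX in the slides) */
}

/* By reference: the callee receives the address and writes through it. */
void add_one_by_reference(int *n) {
    *n = *n + 1;
}
```

With `int val1 = 5;`, calling `add_one_by_value(val1)` returns 6 and leaves `val1` at 5, while `add_one_by_reference(&val1)` changes `val1` itself to 6.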

Example: Pass by Value

.data
val1 DWORD 5
val2 DWORD 6

.code
push val2
push val1

Stack prior to CALL:

    (val2) 6
    (val1) 5    <- ESP

Example: Pass by Reference

.data
val1 DWORD 5
val2 DWORD 6

.code
push OFFSET val2
push OFFSET val1

Stack prior to CALL:

    (offset val2) 00000004
    (offset val1) 00000000    <- ESP

Stack after the CALL

This is what we must build

[Figure: the stack frame diagram again, with Val/Address Arg 2 at EBP + 12 shown above the return address, old/saved EBP, and saved registers EAX … EDX]

Simple Example

The ArrayFill procedure fills an array with 16-bit random integers

The calling program passes the address of the array, along with a count of the number of array elements:

.data
count = 100
array WORD count DUP(?)

.code
push OFFSET array
push count
call ArrayFill

Passing an Array

ArrayFill PROC
    push ebp
    mov ebp,esp
    pushad
    mov esi,[ebp+12]
    mov ecx,[ebp+8]
    .

Stack after mov ebp,esp:

    offset(array)     [EBP + 12]
    count             [EBP + 8]
    return address
    EBP               <- EBP

ESI points to the beginning of the array, so it's easy to use a loop to access each array element

ArrayFill can reference an array without knowing the array's name

Stack Parameters (C/C++)

C and C++ functions access stack parameters using constant offsets from EBP

Example: [ebp + 8]

EBP is called the base pointer or frame pointer because it holds the base address of the stack frame.

EBP does not change value during the function.

EBP must be restored to its original value when a function returns.

RET Instruction

Return from subroutine

Pops stack into the instruction pointer (EIP or IP) -- Control transfers to the target address

Syntax:

    RET
    RET n

Optional operand n causes n bytes to be added to the stack pointer after EIP (or IP) is assigned a value

Adding n bytes to ESP “deletes” the pushed arguments from the stack

Stack after the CALL

All this has to go …

[Figure: the stack frame diagram (arguments, return address, old/saved EBP, saved registers EAX … EDX); all of it must be removed from the stack]

Deleting Parameters

In the Windows StdCall Model the called procedure uses ret n to remove the parameters

In C/C++ Model, the calling procedure removes them after the call has returned either by

1. Popping them off the stack

2. Adding n to ESP

Example

Procedure Difference subtracts the second argument from the first one

Sample call:

    push 14              ; first argument
    push 30              ; second argument
    call Difference      ; EAX = -16

Difference PROC
    push ebp
    mov ebp,esp
    mov eax,[ebp + 12]   ; first argument
    sub eax,[ebp + 8]    ; second argument
    pop ebp
    ret 8                ; remove the 2 args
Difference ENDP

Passing 8/16-bit Arguments

Cannot push 8-bit values on stack

Pushing a 16-bit operand may cause a page fault or an ESP alignment problem -- it is also incompatible with Windows API functions

Expand smaller arguments into 32-bit values, using MOVZX or MOVSX

.data
charVal BYTE 'x'

.code
movzx eax,charVal
push eax
call Uppercase
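MOVZX and MOVSX differ only in how they fill the upper bits. The C casts below behave the same way; the function names `zero_extend8` and `sign_extend8` are illustrative and not part of any library.

```c
#include <stdint.h>

/* Like MOVZX: widen an 8-bit value to 32 bits, filling the upper bits with 0. */
uint32_t zero_extend8(uint8_t value) {
    return (uint32_t)value;
}

/* Like MOVSX: widen an 8-bit value to 32 bits, copying the sign bit upward. */
int32_t sign_extend8(int8_t value) {
    return (int32_t)value;
}
```

For the byte 80h, zero extension gives 00000080h while sign extension gives FFFFFF80h, so MOVZX suits unsigned data such as characters and MOVSX suits signed data.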

Passing Multiword Arguments

Push high-order values on the stack first and work backward in memory

Results in little-endian ordering of data

Example:

.data

longVal DQ 1234567800ABCDEFh

.code

push DWORD PTR longVal + 4 ; high doubleword

push DWORD PTR longVal ; low doubleword

call WriteHex64
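The split into high and low doublewords can be checked in C. A small sketch, where `high_dword`, `low_dword`, and `join_dwords` are hypothetical helper names:

```c
#include <stdint.h>

/* The doubleword pushed first: the upper 32 bits of the quadword. */
uint32_t high_dword(uint64_t value) {
    return (uint32_t)(value >> 32);
}

/* The doubleword pushed second: the lower 32 bits of the quadword. */
uint32_t low_dword(uint64_t value) {
    return (uint32_t)(value & 0xFFFFFFFFu);
}

/* Rebuild the quadword as the callee sees it in memory:
   low doubleword at the lower address, high doubleword above it. */
uint64_t join_dwords(uint32_t high, uint32_t low) {
    return ((uint64_t)high << 32) | low;
}
```

For longVal = 1234567800ABCDEFh the high doubleword is 12345678h and the low doubleword is 00ABCDEFh. Because the stack grows downward, pushing the high half first leaves the low half at the lower address, which is the little-endian layout.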

Saving & Restoring Registers

Push registers on stack just after assigning ESP to EBP to save registers that are modified inside the procedure

Remember: Don't overwrite the register containing the return value when you restore the registers!

MySub PROC

push ebp

mov ebp,esp

push ecx ; save local registers

push edx

Local Variables

Only statements within the subroutine can view or modify local variables

Storage used by local variables is released when subroutine ends

A local variable can have the same name as a local variable in another function without creating a name clash

Essential when writing recursive procedures, as well as procedures executed by multiple execution threads

Creating LOCAL Variables

Example - create two DWORD local variables:

Say: int x=10, y=20;

MySub PROC
    push ebp
    mov ebp,esp
    sub esp,8                   ; create 2 DWORD variables
    mov DWORD PTR [ebp-4],10    ; [ebp-4] = x = 10
    mov DWORD PTR [ebp-8],20    ; [ebp-8] = y = 20

Stack:

    ret address
    saved ebp     <- EBP
    10 (x)        [ebp-4]
    20 (y)        [ebp-8]

A Complete Procedure

void swap (string s, index i1, index i2) {
    char c = s[i1];
    s[i1] = s[i2];
    s[i2] = c;
    return;
}

Stack frame for swap:

    (Address of) s         EBP+16
    I1                     EBP+12
    I2                     EBP+8
    Return Address
    Saved EBP (of main)    <- EBP
    C                      EBP-4
    ESI
    EDI
    EAX                    <- ESP
    Empty Space …

A Complete Procedure

Calling sequence:

    push OFFSET sbuf          ; s
    push 0                    ; i1
    push (SIZE sbuf) - 1      ; i2
    call swap

swap PROC
    push ebp
    mov ebp,esp
    sub esp,4                 ; room for c
    push esi
    push edi
    push eax

    ; esi = address of str[index1]
    mov esi,[ebp+16]
    add esi,[ebp+12]
    ; edi = address of str[index2]
    mov edi,[ebp+16]
    add edi,[ebp+8]
    ; c = str[index1]
    mov al,BYTE PTR [esi]
    mov [ebp-4],eax
    ; str[index1] = str[index2]
    mov al,BYTE PTR [edi]
    mov BYTE PTR [esi],al
    ; str[index2] = c
    mov eax,[ebp-4]
    mov BYTE PTR [edi],al

    pop eax
    pop edi
    pop esi
    add esp,4
    pop ebp
    ret 12
swap ENDP
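For comparison, the swap procedure above can be written as compilable C. This is a sketch, not the slides' code: the name `swap_chars` and the use of `char *` and `int` in place of the `string` and `index` pseudo-types are assumptions.

```c
/* Swap the characters at positions i1 and i2 of the string s, in place. */
void swap_chars(char *s, int i1, int i2) {
    char c = s[i1];   /* c = s[i1]; c plays the role of the local at [ebp-4] */
    s[i1] = s[i2];    /* s[i1] = s[i2] */
    s[i2] = c;        /* s[i2] = c */
}
```

With `char sbuf[] = "hello";`, the call `swap_chars(sbuf, 0, 4)` exchanges the first and last letters. The three statements correspond one-to-one with the three commented groups in the assembly body.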

ENTER Instruction

ENTER partially creates a stack frame for a called proc

1. pushes EBP on the stack

2. sets EBP to the base of the stack frame

3. reserves space for local variables

Example:

MySub PROC
    enter 8,0        ; 0 = nesting level (not used here)

Equivalent to:

MySub PROC
    push ebp
    mov ebp,esp
    sub esp,8

LEAVE Instruction

Partially removes the stack frame for a procedure

MySub PROC
    enter 8,0
    .
    .
    .
    leave
    ret
MySub ENDP

Equivalent operations:

    push ebp
    mov ebp,esp
    sub esp,8          ; 2 local DWORDs
    .
    .
    .
    mov esp,ebp        ; free local space
    pop ebp

A Simpler Proc

swap PROC
    enter 4,0
    pushad
    ;c = s[i1];
    ;s[i1] = s[i2];
    ;s[i2] = c;
    popad
    leave
    ret 12
swap ENDP

Calling sequence:

    push OFFSET sbuf
    push 0
    push (SIZE sbuf) - 1
    call swap

Stack frame:

    (Address of) s         EBP+16
    I1                     EBP+12
    I2                     EBP+8
    Return Address
    Saved EBP (of main)    <- EBP
    C                      EBP-4
    EAX
    ...
    EDI                    <- ESP
    Empty Space

LOCAL Directive

The LOCAL directive declares a list of local variables

immediately follows the PROC directive

each variable is assigned a type

replaces ENTER and LEAVE

Syntax: LOCAL varlist

Example:

MySub PROC
    LOCAL var1:BYTE, var2:WORD, var3:SDWORD

Using LOCAL

Examples:

LOCAL t1:BYTE ; single character

LOCAL flagVals[20]:BYTE ; array of bytes

LOCAL pArray:PTR WORD ; pointer to an array

Note: inc BYTE PTR [esi]

Treats the memory that ESI points to as a byte. Different use of PTR!

Example

BubbleSort PROC
    LOCAL temp:DWORD, SwapFlag:BYTE
    . . .
    ret
BubbleSort ENDP

MASM generates the following code:

BubbleSort PROC
    push ebp
    mov ebp,esp
    add esp,0FFFFFFF8h     ; add -8 to ESP
    . . .
    mov esp,ebp
    pop ebp
    ret
BubbleSort ENDP

Stack:

    return address
    EBP           <- EBP
    temp          [EBP - 4]
    SwapFlag      [EBP - 8]    <- ESP

Non-Doubleword Local Variables

Local variables can be different sizes

1. 8-bit: assigned to next available byte

2. 16-bit: assigned to next even (word) boundary

3. 32-bit: assigned to next doubleword boundary

Be very careful – you can save space but it can add complexity and make code difficult to maintain and understand

Generally not worth the hassle

LEA Instruction

LEA returns offsets of direct and indirect operands

OFFSET operator only returns constant offsets

LEA required when obtaining offsets of stack parameters and local variables (anything that isn't in the .data segment)

Example

CopyString PROC
    LOCAL count:DWORD, temp[20]:BYTE

    ;mov edi,OFFSET count    <<ERROR: invalid operand>>
    ;mov esi,OFFSET temp     <<ERROR: invalid operand>>
    lea edi,count            ; ok
    lea esi,temp             ; ok

LEA Example

Suppose you have a local variable at [ebp-8]

And you need the address of that local variable in ESI

You cannot use this: mov esi, OFFSET [ebp-8] ; error

Use this instead: lea esi, [ebp-8] ; YAY!

Why? Because OFFSET is evaluated at assembly time, while the address of a stack variable is only known at run time, so LEA has to compute it
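The reason a local variable has no assembly-time address can be demonstrated in C: the same local sits at a different address in every activation of its procedure. The function `local_addresses_differ` below is a hypothetical illustration, not something from the slides.

```c
#include <stddef.h>

/* Returns 1 when the local variable of a nested activation lives at a
   different address from the same local in the outer activation. */
int local_addresses_differ(const int *outer_local) {
    int local = 0;                       /* lives in this call's stack frame */
    if (outer_local == NULL) {
        /* First activation: hand our local's address to a nested call. */
        return local_addresses_differ(&local);
    }
    /* Nested activation: both locals are alive, so they are distinct objects. */
    return outer_local != &local;
}
```

Calling `local_addresses_differ(NULL)` returns 1. In C the `&` operator on a local does the job that LEA does in assembly: it computes the address from the current frame while the program runs.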

What is Recursion?

The process created when either:

1. A procedure calls itself

2. A cycle exists: i.e., Procedure A calls procedure B, which in turn calls procedure A

Using a graph in which each node is a procedure and each edge is a procedure call, recursion forms a cycle:

[Figure: call graph with nodes A, B, C, D, and E, in which the calls form a cycle]

Calculating a Sum

CalcSum PROC
    cmp ecx,0        ; check counter value
    jz L2            ; quit if zero
    add eax,ecx      ; otherwise, add to sum
    dec ecx          ; decrement counter
    call CalcSum     ; recursive call
L2: ret
CalcSum ENDP

The CalcSum procedure recursively calculates the sum of the integers from 1 to ECX. Receives: ECX = count (with EAX = 0). Returns: EAX = sum

Values of ECX and EAX on entry to each call, starting from ECX = 5:

    Call#   pre   1   2   3   4    5    6    post
    ecx     5     5   4   3   2    1    0    0
    eax     0     0   5   9   12   14   15   15
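The same recursion can be written in C to check the table. In this sketch the parameter `count` stands in for ECX and `sum` for EAX; the name `calc_sum` and the choice to pass the running sum as a parameter are assumptions made to mirror the register usage.

```c
/* Mirrors CalcSum: count plays the role of ECX, sum the role of EAX. */
unsigned calc_sum(unsigned count, unsigned sum) {
    if (count == 0) {
        return sum;                           /* jz L2: quit if zero */
    }
    return calc_sum(count - 1, sum + count);  /* add eax,ecx / dec ecx / call CalcSum */
}
```

`calc_sum(5, 0)` makes six calls in total, and the arguments of each call match one column of the table.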

Calculating a Factorial

int factorial(int n)
{
    if (n == 0) {
        return(1);
    } else {
        return(n*factorial(n-1));
    }
} /*end factorial*/

Recursive calls:

    5! = 5 * 4!
    4! = 4 * 3!
    3! = 3 * 2!
    2! = 2 * 1!
    1! = 1 * 0!
    0! = 1 (base case)

Recursive calls backing up:

    1 = 1
    1 * 1 = 1
    2 * 1 = 2
    3 * 2 = 6
    4 * 6 = 24
    5 * 24 = 120

This function calculates the factorial of integer n. A new value of n is saved in each stack frame:

When each call instance returns, the product it returns is multiplied by the previous value of n.

Calculating a Factorial

Factorial PROC
    push ebp
    mov ebp,esp
    mov eax,[ebp+8]      ; get n
    cmp eax,0            ; n > 0?
    ja L1                ; yes: continue
    mov eax,1            ; no: return 1
    jmp L2

L1: dec eax
    push eax             ; Factorial(n-1)
    call Factorial

; Instructions from this point on execute when each
; recursive call returns.

    mov ebx,[ebp+8]      ; get n
    mul ebx              ; eax = eax * ebx

L2: pop ebp              ; return EAX
    ret 4                ; clean up stack
Factorial ENDP

Calculating a Factorial

Suppose we want to calculate 12!

This diagram shows the first few stack frames created by recursive calls to Factorial

    12            n
    ReturnMain
    ebp0
    11            n-1
    ReturnFact
    ebp1
    10            n-2
    ReturnFact
    ebp2
    9             n-3
    ReturnFact
    ebp3
    (etc...)

Each recursive call uses 12 bytes of stack space

This is NOT how you should use recursion!!!

A loop would use less memory, have fewer procedure calls, and run much, much, faster
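A loop version for comparison, sketched in C. The names `factorial_recursive` and `factorial_loop` are illustrative, and both assume the result fits in 32 bits, which holds up to 12!.

```c
#include <stdint.h>

/* Recursive form: one stack frame per value of n. */
uint32_t factorial_recursive(uint32_t n) {
    return (n == 0) ? 1u : n * factorial_recursive(n - 1);
}

/* Loop form: a single frame and no procedure calls. */
uint32_t factorial_loop(uint32_t n) {
    uint32_t product = 1;
    while (n > 1) {
        product *= n;   /* multiply the running product by n */
        n--;            /* then step down toward 1 */
    }
    return product;
}
```

Both return the same value for 12!, but the loop keeps its state in two variables instead of building a chain of stack frames.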

INVOKE Directive

The INVOKE directive is a replacement for Intel’s CALL instruction that lets you pass multiple arguments

Syntax: INVOKE procedureName [, argumentList]

ArgumentList is an optional comma-delimited list of procedure arguments

Arguments can be:

a) immediate values

b) integer expressions

c) variable names

d) address and ADDR expressions

e) register names

INVOKE Examples

.data
byteVal BYTE 10
wordVal WORD 1000h

.code
; direct operands:
INVOKE Sub1,byteVal,wordVal

; address of variable:
INVOKE Sub2,ADDR byteVal

; register name, integer expression:
INVOKE Sub3,eax,(10 * 20)

; address expression (indirect operand):
INVOKE Sub4,[ebx]

ADDR Operator

• Returns a near or far pointer to a variable, depending on which memory model your program uses:

• Small model: returns 16-bit offset

• Large model: returns 32-bit segment/offset

• Flat model: returns 32-bit offset

• Example:

.data
myWord WORD ?

.code
INVOKE mySub,ADDR myWord

YUCK! It is better practice to be EXPLICIT and to not rely on the assembler to "get it right" for you! Many of these directives (INVOKE, ENTER, LEAVE) hide details and inspire mistakes.

PROC Revealed …

The PROC directive declares a procedure with an optional list of named parameters.

Syntax: label PROC paramList

paramList is a list of parameters separated by commas. Each parameter has the following syntax: paramName : type

type must either be:

1. one of the standard ASM types (BYTE, SBYTE, WORD, etc.)

2. a pointer (offset/address) to one of these types

PROC Directive

Alternate format permits parameter list to be on one or more separate lines:

label PROC,        ; comma required
    paramList

The parameters can be on the same line . . .

    param-1:type-1, param-2:type-2, . . ., param-n:type-n

Or they can be on separate lines:

    param-1:type-1,
    param-2:type-2,
    . . .,
    param-n:type-n

PROC Example 1

AddTwo PROC,
    val1:DWORD, val2:DWORD

    mov eax,val1
    add eax,val2
    ret
AddTwo ENDP

• The AddTwo procedure receives two integers and returns their sum in EAX

PROC Example 2

FillArray PROC,
    pArray:PTR BYTE, fillVal:BYTE, arraySize:DWORD

    mov ecx,arraySize
    mov esi,pArray
    mov al,fillVal
L1: mov [esi],al
    inc esi
    loop L1
    ret
FillArray ENDP

FillArray receives a pointer to an array of bytes, a single byte fill value that will be copied to each element of the array, and the size of the array

PROC Example 3

ReadFile PROC,
    pBuffer:PTR BYTE
    LOCAL fileHandle:DWORD
    . . .
ReadFile ENDP

Swap PROC,
    pValX:PTR DWORD,
    pValY:PTR DWORD
    . . .
Swap ENDP

PROTO Directive

Creates a procedure prototype

Syntax: label PROTO paramList

Every procedure called by the INVOKE directive must have a prototype

A complete procedure definition can also serve as its own prototype

PROTO Directive

Standard configuration:

1. PROTO appears at top of the program listing

2. INVOKE appears in the code segment

3. the procedure implementation occurs later in the program

MySub PROTO            ; procedure prototype

.code
INVOKE MySub           ; procedure call

MySub PROC             ; procedure implementation
    ; Does Stuff …
MySub ENDP

PROTO Example

Prototype for the ArraySum procedure, showing its parameter list:

ArraySum PROTO,
    ptrArray:PTR DWORD,    ; points to the array
    szArray:DWORD          ; array size

The End

And they all wrote assembly language happily ever after ...
