CSC 2400 Computer Systems I Lecture 9 Deeper into Assembly
Jan 04, 2016
Recap
Programs are compiled in four stages:
– preprocessor
– compiler
– assembler
– linker
High-level languages, which are independent of the machine, go through these steps
Machine language is the output, and is dependent on the target machine
Recall: Steps in the Build Process
• To build one step at a time:
    Preprocess: circle.c → circle.i    $ gcc -E circle.c > circle.i
    Compile:    circle.i → circle.s    $ gcc -S circle.i
    Assemble:   circle.s → circle.o    $ gcc -c circle.s
    Link:       circle.o → circle      $ gcc circle.o -o circle
• Why build one step at a time?
    – Helpful for learning how to interpret error messages
    – Permits partial builds (described later in course)
High-Level Language
• Make programming easier by describing operations in a semi-natural language
• Increase the portability of the code
• One line may involve many low-level operations
• Examples: C, C++, Java
count = 0;
while (n > 1) {
    count++;
    if (n & 1)
        n = n*3 + 1;
    else
        n = n/2;
}
Assembly Language
• Tied to the specifics of the underlying machine
• Commands and names to make the code readable and writeable by humans
• E.g., IA-32 from Intel
    mov ECX, DWORD PTR count
    mov EDX, DWORD PTR n
    mov ECX, 0
.loop:
    cmp EDX, 1
    jle .endloop
    add ECX, 1
    mov EAX, EDX
    and EAX, 1
    je .else
    mov EAX, EDX
    add EDX, EAX
    add EDX, EAX
    add EDX, 1
    jmp .endif
.else:
    sar EDX, 1
.endif:
    jmp .loop
.endloop:
Machine Language
• Also tied to the underlying machine
• What the computer sees and deals with
• Every instruction is a sequence of one or more numbers
• All stored in memory on the computer, and read and executed
• Unreadable by humans
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
9222 9120 1121 A120 1121 A121 7211 0000
0000 0001 0002 0003 0004 0005 0006 0007
0008 0009 000A 000B 000C 000D 000E 000F
0000 0000 0000 FE10 FACE CAFE ACED CEDE
1234 5678 9ABC DEF0 0000 0000 F00D 0000
0000 0000 EEEE 1111 EEEE 1111 0000 0000
B1B2 F1F5 0000 0000 0000 0000 0000 0000
Goals for this lecture
• Deepen understanding of…
    – Basics of computer architecture
    – Relationship between C and assembly language
    – IA-32 assembly language through an example
• Why do you need to know this?
Computer Architecture
Refresher
Recall: Von Neumann Model
[Figure: Von Neumann model, with the MAR, MDR, PC, and IR registers]
General Architecture for a Computer
A Typical Computer
This model provides a good study framework.
Memory
• The only large storage area that the CPU can access directly
• Hence, any program executing must be in memory
Memory Hierarchy (read Think OS, Chapter 7)
A typical memory hierarchy. The numbers are very rough approximations.
Cache Principle: The more frequently data is accessed, the faster the access should be.
Memory (Unix Process Layout)
[Figure: the CPU exchanges addresses, instructions, and data with memory, which is divided into TEXT, DATA, HEAP, and STACK sections]
• Stores executable machine-language instructions (TEXT)
• Stores data (DATA, HEAP, and STACK sections)
Central Processing Unit (CPU)
[Figure: the CPU (control unit, ALU, registers) exchanges addresses, instructions, and data with memory]
• Control unit: fetch, decode, and execute
• Arithmetic and logic unit: execution of low-level operations
• Registers: high-speed temporary storage
Registers (Ch. 15, section 15.2)
• Small amount of storage on the CPU
    – Can be accessed more quickly than main memory
• Instructions move data in and out of registers
    – Loading registers from main memory
    – Storing registers to main memory
• Instructions manipulate the register contents
    – Registers essentially act as temporary variables
    – For efficient manipulation of the data
• Registers are the top of the memory hierarchy
    – Ahead of main memory, disk, tape, …
IA-32 Architecture
[Figure: the CPU, with EIP, the registers, and the EFLAGS condition codes, exchanges addresses, instructions, and data with memory (TEXT, DATA, HEAP, STACK)]
Registers: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP
Condition codes (EFLAGS): CF, ZF, SF, OF
• EIP: instruction pointer
• ESP, EBP: reserved for special use
• EAX: always contains return value
CPU – Control Unit and ALU
[Figure: the CPU (control unit, ALU, registers) exchanges addresses, instructions, and data with memory (TEXT, DATA, HEAP, STACK)]
• Control unit: fetch, decode, and execute
• Arithmetic and logic unit (ALU): execution of low-level operations
Control Unit: Instruction Decoder
• Determines what operations need to take place
    – Translate the machine-language instruction
• Control what operations are done on what data
    – E.g., control what data are fed to the ALU
    – E.g., enable the ALU to do multiplication or addition
    – E.g., read from a particular address in memory
[Figure: ALU with inputs src1 and src2, an operation select, and outputs dst and flag/carry]
Central Processing Unit (CPU)
• Runs the Fetch-Decode-Execute loop
[Figure: flowchart from START through the fetch cycle (fetch next instruction), the decode cycle (decode instruction), and the execute cycle (execute instruction), repeating until HALT]
    1. Fetch the next instruction from memory
    2. Decode the instruction and fetch the operands
    3. Execute the instruction and store the result
Fetch-Decode-Execute Cycle
• Where is the “next instruction” held in the machine?
    – A CPU register called the Instruction Pointer (EIP) holds the address of the instruction to be fetched next
• Fetch cycle
    – Copy instruction from memory into a register
• Decode cycle
    – Decode instruction and fetch operands, if necessary
• Execute cycle
    – Execute the instruction
    – Increment EIP (the PC of the Von Neumann model) by the instruction length
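The cycle can be sketched in C as an interpreter loop. This is only an illustration on a hypothetical toy machine, not IA-32: the two-byte instruction format, the opcode names (OP_HALT, OP_LOAD, OP_ADD), and the function run are all assumptions made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical toy machine, NOT IA-32: each instruction is two bytes,
   an opcode followed by one operand. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2 };

/* Runs the program stored in mem and returns the accumulator at HALT. */
int run(const uint8_t *mem)
{
    unsigned ip = 0;   /* instruction pointer (plays the role of EIP) */
    int acc = 0;       /* a single general-purpose register */

    for (;;) {
        /* Fetch: copy the instruction at address ip out of memory. */
        uint8_t opcode  = mem[ip];
        uint8_t operand = mem[ip + 1];

        /* Advance ip by the instruction length (2 bytes here). */
        ip += 2;

        /* Decode and execute. */
        switch (opcode) {
        case OP_LOAD: acc = operand;  break;
        case OP_ADD:  acc += operand; break;
        case OP_HALT: return acc;
        default:      return -1;   /* unknown opcode */
        }
    }
}
```

Running it on the byte sequence { OP_LOAD, 5, OP_ADD, 7, OP_HALT, 0 } loads 5, adds 7, and halts with 12 in the accumulator.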
C Code vs. Assembly Code
Examples from IA-32
Kinds of Instructions
• Reading and writing data
    – count = 0
    – n
• Arithmetic and logic operations
    – Increment: count++
    – Multiply: n * 3
    – Divide: n/2
    – Logical AND: n & 1
• Checking results of comparisons
    – Is (n > 1) true or false?
    – Is (n & 1) non-zero or zero?
• Changing the flow of control
    – To the end of the while loop (if “n ≤ 1”)
    – Back to the beginning of the loop
    – To the else clause (if “n & 1” is 0)
count = 0;
while (n > 1) {
count++;
if (n & 1)
n = n*3 + 1;
else
n = n/2;
}
Variables in Registers
Registers
count: ECX, n: EDX
count = 0;
while (n > 1) {
count++;
if (n & 1)
n = n*3 + 1;
else
n = n/2;
}
mov ECX, DWORD PTR count
mov EDX, DWORD PTR n
Immediate and Register Addressing
count=0;
while (n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
mov ECX, 0
add ECX, 1
Immediate operand (0 or 1): read directly from the instruction
Result: written to a register
count: ECX, n: EDX
General Syntax
op Dest, Src
Perform operation op on Src and Dest
Save result in Dest
Immediate and Register Addressing
count=0;
while (n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
mov EAX, EDX
and EAX, 1
Computing intermediate value in register EAX
count: ECX, n: EDX
mov EAX, EDX
add EDX, EAX
add EDX, EAX
add EDX, 1
Immediate and Register Addressing
count=0;
while (n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
Adding n twice is cheaper than multiplication!
count: ECX, n: EDX
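A minimal C sketch of the same idea, with one statement per instruction. The function name times3_plus1 and the local names eax and edx are chosen here for illustration only; they simply stand in for the registers.

```c
/* n*3 + 1 computed with additions only, mirroring
   mov EAX, EDX / add EDX, EAX / add EDX, EAX / add EDX, 1 */
int times3_plus1(int n)
{
    int edx = n;   /* n lives in EDX */
    int eax = edx; /* mov EAX, EDX */
    edx += eax;    /* add EDX, EAX  -> 2n */
    edx += eax;    /* add EDX, EAX  -> 3n */
    edx += 1;      /* add EDX, 1    -> 3n + 1 */
    return edx;
}
```

For n = 7 this produces 22, the same value as n*3 + 1.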
sar EDX, 1
Immediate and Register Addressing
count=0;
while (n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
Shifting right by 1 bit is cheaper than division!
count: ECX, n: EDX
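A small C sketch of the shift, assuming n is non-negative (inside the loop n > 1 always holds). The restriction matters: for a negative odd n, C's n/2 truncates toward zero while an arithmetic right shift rounds toward negative infinity, and right-shifting a negative signed value is implementation-defined in C. The function name half_by_shift is hypothetical.

```c
/* Halve a non-negative value with a right shift, as sar EDX, 1 does.
   Only equivalent to n/2 when n >= 0. */
int half_by_shift(int n)
{
    return n >> 1;
}
```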
Changing Program Flow
• Cannot simply run next instruction
    – Check result of a previous operation
    – Jump to appropriate next instruction
• Jump instructions
    – Load new address in instruction pointer
• Example jump instructions
    – Jump unconditionally (e.g., “}”)
    – Jump if zero (e.g., “n & 1”)
    – Jump if greater/less (e.g., “n > 1”)
count=0;
while (n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
Jump Instructions
• Jump depends on the result of previous arithmetic instruction.
Conditional and Unconditional Jumps
• Comparison: cmp compares two integers
    – Done by subtracting the second operand from the first
    – Discards the result, but sets flags as a side effect:
        cmp EDX, 1      (computes EDX – 1)
        jle .endloop    (checks whether result was 0 or negative)
• Logical operation: and combines two integers bit by bit and also sets flags
        and EAX, 1      (bit-wise AND of EAX with 1)
        je .else        (checks whether result was 0)
• Also, can do an unconditional branch with jmp
        jmp .endif
        jmp .loop
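How a conditional jump reads the flags left by cmp can be sketched in C. This is a simplified model, assuming 32-bit operands and tracking only ZF, SF, and OF; the names struct flags, cmp_flags, and jle_taken are invented for this sketch. The jle condition used (ZF set, or SF different from OF) is the one documented for IA-32.

```c
#include <stdint.h>

/* Flags produced by "cmp a, b", which computes a - b and discards the result. */
struct flags { int zf, sf, of; };

struct flags cmp_flags(int32_t a, int32_t b)
{
    /* Subtract in unsigned arithmetic so wraparound is well defined. */
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    struct flags f;
    f.zf = (diff == 0);                             /* result was zero */
    f.sf = (int)(diff >> 31);                       /* sign bit of the result */
    f.of = (int)(((ua ^ ub) & (ua ^ diff)) >> 31);  /* signed overflow */
    return f;
}

/* jle is taken when ZF is set or SF differs from OF. */
int jle_taken(int32_t a, int32_t b)
{
    struct flags f = cmp_flags(a, b);
    return f.zf || (f.sf != f.of);
}
```

With a = 2 and b = 1 the jump is not taken, so the loop body runs; with a = 1 and b = 1 it is taken. Comparing SF with OF, rather than looking at the sign alone, keeps the answer right even when the subtraction overflows.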
.loop:
    cmp EDX, 1
    jle .endloop
    …
    jmp .loop
.endloop:
Jump and Labels: While Loop
while (n>1) {
}
Checking if EDX is less than or equal to 1.
count: ECX, n: EDX
    mov ECX, 0
.loop:
    cmp EDX, 1
    jle .endloop
    add ECX, 1
    mov EAX, EDX
    and EAX, 1
    je .else
    mov EAX, EDX
    add EDX, EAX
    add EDX, EAX
    add EDX, 1
    jmp .endif
.else:
    sar EDX, 1
.endif:
    jmp .loop
.endloop:
Jump and Labels: While Loop
count=0;
while (n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
count: ECX, n: EDX
    mov EAX, EDX
    and EAX, 1
    je .else
    …
    jmp .endif
.else:
    …
.endif:
    …
Jump and Labels: If-Then-Else
if (n&1)
...
else
...
“then” block: the instructions between je .else and jmp .endif
“else” block: the instructions between .else: and .endif:
count: ECX, n: EDX
    mov ECX, 0
.loop:
    cmp EDX, 1
    jle .endloop
    add ECX, 1
    mov EAX, EDX
    and EAX, 1
    je .else
    mov EAX, EDX
    add EDX, EAX
    add EDX, EAX
    add EDX, 1
    jmp .endif
.else:
    sar EDX, 1
.endif:
    jmp .loop
.endloop:
Jump and Labels: If-Then-Else
count=0;
while(n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
“then” block
“else” block
count: ECX, n: EDX
    mov ECX, 0
.loop:
    cmp EDX, 1
    jle .endloop
    add ECX, 1
    mov EAX, EDX
    and EAX, 1
    je .else
    mov EAX, EDX
    add EDX, EAX
    add EDX, EAX
    add EDX, 1
    jmp .endif
.else:
    sar EDX, 1
.endif:
    jmp .loop
.endloop:
Code More Efficient…
count=0;
while(n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
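Replace “jmp .endif” with “jmp .loop”: the “then” block can jump straight back to the loop test instead of passing through .endif.
count: ECX, n: EDX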
    mov ECX, 0
.loop:
    cmp EDX, 1
    jle .endloop
    add ECX, 1
    mov EAX, EDX
    and EAX, 1
    je .else
    mov EAX, EDX
    add EDX, EAX
    add EDX, EAX
    add EDX, 1
    jmp .endif
.else:
    sar EDX, 1
.endif:
    jmp .loop
.endloop:
Complete Example
count=0;
while (n>1) {
count++;
if (n&1)
n = n*3+1;
else
n = n/2;
}
count: ECX, n: EDX
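One way to see the correspondence is to rewrite the C loop with gotos and labels placed exactly where the assembly has its jumps. The sketch below is an illustration only; the function name count_steps and the label spellings (else_, endif, endloop) are chosen for this example.

```c
/* Counts loop iterations until n reaches 1, using gotos and labels
   that mirror the jumps in the assembly listing. */
int count_steps(int n)
{
    int count = 0;                 /* mov ECX, 0 */
loop:
    if (n <= 1) goto endloop;      /* cmp EDX, 1 / jle .endloop */
    count++;                       /* add ECX, 1 */
    if ((n & 1) == 0) goto else_;  /* and EAX, 1 / je .else */
    n = n + n + n + 1;             /* mov + three adds: n*3 + 1 */
    goto endif;                    /* jmp .endif */
else_:
    n = n >> 1;                    /* sar EDX, 1 (n > 1 here, so same as n/2) */
endif:
    goto loop;                     /* jmp .loop */
endloop:
    return count;
}
```

Starting from n = 6 the values visited are 3, 10, 5, 16, 8, 4, 2, 1, so the function returns 8.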
Reading IA-32 Assembly Language
• Referring to a register: EAX, eax, EBX, ebx, etc.
• Result stored in the first argument
    – E.g., “mov EAX, EDX” moves EDX into EAX
    – E.g., “add ECX, 1” increments register ECX
• Assembler directives: starting with a period (“.”)
    – E.g., “.section .text” to start the text section of memory
• Comment: pound sign (“#”)
    – E.g., “# Purpose: Convert lower to upper case”
X86 C Example
int Example(int x) { ??? }

Write comments:
    push EBP
    mov EBP, ESP
    mov ECX, [EBP+8]    # ECX = x
    xor EAX, EAX        # EAX =
    xor EDX, EDX        # EDX =
    cmp EDX, ECX        # if
    jge L2              # goto L2
L1: add EAX, EDX        # EAX =
    inc EDX             # EDX =
    cmp EDX, ECX        # if
    jl L1               # goto L1
L2: mov ESP, EBP
    pop EBP
    ret
Name the Variables
int Example(int x) { ??? }

EAX: result, EDX: i
    push EBP
    mov EBP, ESP
    mov ECX, [EBP+8]    # ECX = x
    xor EAX, EAX        # result =
    xor EDX, EDX        # i =
    cmp EDX, ECX        # if
    jge L2              # goto L2
L1: add EAX, EDX        # result =
    inc EDX             # i =
    cmp EDX, ECX        # if
    jl L1               # goto L1
L2: mov ESP, EBP
    pop EBP
    ret
Identify the Loop
    result = 0;
    i = 0;
    if (i >= x) goto L2;
L1:
    result += i;
    i++;
    if (i < x) goto L1;
L2:
Identify the Loop
Step 1: the goto form
    result = 0;
    i = 0;
    if (i >= x) goto L2;
L1:
    result += i;
    i++;
    if (i < x) goto L1;
L2:

Step 2: a do-while loop
    result = 0;
    i = 0;
    if (i >= x) goto L1;
    do {
        result += i;
        i++;
    } while (i < x);
L1:

Step 3: a while loop
    result = 0;
    i = 0;
    while (i < x) {
        result += i;
        i++;
    }

Step 4: a for loop
    result = 0;
    for (i = 0; i < x; i++)
        result += i;
C Code
int Example(int x)
{
    int result = 0;
    int i;
    for (i = 0; i < x; i++)
        result += i;
    return result;
}
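To check that the recovered for loop really matches the jump structure of the assembly, the two forms can be placed side by side. This is a sketch; the function names example_goto and example_for are made up here so both versions can coexist.

```c
/* The goto form read directly off the assembly. */
int example_goto(int x)
{
    int result = 0, i = 0;
    if (i >= x) goto L2;
L1:
    result += i;
    i++;
    if (i < x) goto L1;
L2:
    return result;
}

/* The equivalent for loop. */
int example_for(int x)
{
    int result = 0;
    for (int i = 0; i < x; i++)
        result += i;
    return result;
}
```

Both return 0 + 1 + 2 + 3 + 4 = 10 for x = 5, and both return 0 when x is zero or negative because the initial test skips the loop.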
Exercise
int F(int x, int y) { ??? }

    push EBP
    mov EBP, ESP
    mov ECX, [EBP+8]     # ECX = x
    mov EDX, [EBP+12]    # EDX = y
    xor EAX, EAX
    cmp ECX, EDX
    jle .L1
.L2:
    dec ECX
    inc EDX
    inc EAX
    cmp ECX, EDX
    jg .L2
.L1:
    inc EAX
    mov ESP, EBP
    pop EBP
    ret
C Code
int F(int x, int y)
Conclusions
• Hardware
    – Memory is the only large storage area the CPU can access directly
    – Executables are stored on the disk
    – Fetch-Decode-Execute Cycle for running executables
• Assembly language
    – In between high-level language and machine code
    – Programming the “bare metal” of the hardware
    – Loading and storing data, arithmetic and logic operations, checking results, and changing control flow
• To get more familiar with IA-32 assembly
    – Generate your own assembly-language code
        gcc -masm=intel -S code.c
X86 Addressing Modes
Arithmetic Instructions (1)
• Two Operand Instructions
    ADD  Dest, Src    Dest = Dest + Src
    SUB  Dest, Src    Dest = Dest - Src
    IMUL Dest, Src    Dest = Dest * Src
    SAL  Dest, Src    Dest = Dest << Src
    SAR  Dest, Src    Dest = Dest >> Src   (arithmetic)
    SHR  Dest, Src    Dest = Dest >> Src   (logical)
    XOR  Dest, Src    Dest = Dest ^ Src
    AND  Dest, Src    Dest = Dest & Src
    OR   Dest, Src    Dest = Dest | Src
Arithmetic Instructions (2)
• One Operand Instructions
    INC Dest    Dest = Dest + 1
    DEC Dest    Dest = Dest - 1
    NEG Dest    Dest = -Dest
    NOT Dest    Dest = ~Dest
Compare and Test Instructions
    CMP  Dest, Src    Compute Dest - Src without setting Dest
    TEST Dest, Src    Compute Dest & Src without setting Dest
Jump Instructions
• Jump depending on the result of the previous arithmetic instruction:
Loading and Storing Data
IA-32 Architecture
[Figure: the CPU, with EIP, the registers, and the EFLAGS condition codes, exchanges addresses, instructions, and data with memory (object code, program data, stack)]
Registers: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP
Condition codes (EFLAGS): CF, ZF, SF, OF
• EIP: instruction pointer
• ESP, EBP: reserved for special use
• EAX: always contains return value
IA-32 General Purpose Registers
General-purpose registers

    32-bit (bits 31–0)   16-bit (bits 15–0)   8-bit high (bits 15–8)   8-bit low (bits 7–0)
    EAX                  AX                   AH                       AL
    EBX                  BX                   BH                       BL
    ECX                  CX                   CH                       CL
    EDX                  DX                   DH                       DL
    ESI                  SI
    EDI                  DI
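The overlap between a 32-bit register and its smaller names can be modeled in C with masks and shifts. This is only a sketch of the bit layout; the function names low16, high8, and low8 are invented for this example and stand for reading AX, AH, and AL out of a value held in EAX.

```c
#include <stdint.h>

/* AX: bits 15..0 of EAX */
uint16_t low16(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* AH: bits 15..8 of EAX */
uint8_t high8(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* AL: bits 7..0 of EAX */
uint8_t low8(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }
```

If EAX holds 0x12345678, then AX is 0x5678, AH is 0x56, and AL is 0x78.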
Byte Order in Multi-Byte Entities
• Intel is a little endian architecture
    – Least significant byte of multi-byte entity is stored at lowest memory address
    – “Little end goes first”
• Some other systems use big endian
    – Most significant byte of multi-byte entity is stored at lowest memory address
    – “Big end goes first”

The int 5 at address 1000 (little endian):
    Address:  1000      1001      1002      1003
    Byte:     00000101  00000000  00000000  00000000

The int 5 at address 1000 (big endian):
    Address:  1000      1001      1002      1003
    Byte:     00000000  00000000  00000000  00000101
Little Endian Example
#include <stdio.h>

int main(void) {
    int i = 0x003377ff, j;
    unsigned char *p = (unsigned char *) &i;
    /* Print the four bytes of i, starting at the lowest address */
    for (j = 0; j < 4; j++)
        printf("Byte %d: %x\n", j, p[j]);
    return 0;
}

Output on a little-endian machine:
Byte 0: ff
Byte 1: 77
Byte 2: 33
Byte 3: 0
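Two small helpers make the byte-order idea concrete: one probes the byte order of the machine it runs on, and one reassembles a value from bytes listed lowest address first. Both are sketches; the names is_little_endian and from_little_endian are hypothetical, not library functions.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* the byte stored at the lowest address */
    return first == 1;
}

/* Assembles a 32-bit value from four bytes given in little-endian order. */
uint32_t from_little_endian(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Feeding from_little_endian the bytes ff, 77, 33, 00 gives back 0x003377ff on any machine, because the shifts work on values rather than on memory layout.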
CMP AL, 5
JLE else
INC AL
jmp endif
else:
DEC AL
endif:
C Example: One-Byte Data
char i;
…
if (i > 5) {
    i++;
} else {
    i--;
}
Global char variable i is in AL, the lower byte of the “A” register.
CMP EAX, 5
JLE else
INC EAX
JMP endif
else:
DEC EAX
endif:
C Example: Four-Byte Data
int i;
…
if (i > 5) {
    i++;
} else {
    i--;
}
Global int variable i is in EAX, the full 32 bits of the “A” register.
Loading and Storing Data
• Processors have many ways to access data
    – Known as “addressing modes”
    – Two simple ways seen in previous examples
• Addressing Modes
    – Register
    – Immediate
    – Direct Memory
    – Base Memory
    – (Base or Index) plus Displacement Memory
    – (Base and Index) plus Displacement Memory
Register Addressing
• Registers embedded in the instruction
• Examples
    XOR EAX, EAX
    MOV EBX, ECX
    INC BH
Immediate Addressing
• Data embedded in the instruction
• Examples
    ADD EAX, 3
    MOV AX, -40
Direct Memory Addressing
• Memory address embedded in the instruction
• Examples
    MOV AL, [2000]
    – Read the byte from memory address 2000
    – Load the byte value in the register AL
• Note that
    – 2000 refers to the constant value 2000
    – [2000] refers to the memory location at address 2000
(Indirect) Base Memory Addressing
Indirect Memory Addressing
• Load or store from a previously-computed address
    – Register with the base address is in the instruction
• Examples
    MOV AX, [BX]              (register indirect addressing)
    CMP DL, [BX+8]            (base+displacement addressing)
    MOV EAX, [EBX+ESI*4+8]    (base+index+displacement)
Indexed Addressing Example
MOV EAX, 0
MOV EBX, 0
MOV ECX, OFFSET FLAT:a
sumloop:
ADD EBX, [ECX+EAX*4]
INC EAX
CMP EAX, 20
jne sumloop
int a[20];
…
int i, sum=0;
for (i=0; i<20; i++)
sum += a[i];
EAX: i, EBX: sum, ECX: address of a[0]
a is a global variable
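The address arithmetic hidden in [ECX+EAX*4] can be spelled out in C by working with byte addresses. This is a sketch, assuming the scale factor equals sizeof(int) (4 on IA-32); the function name sum_indexed is invented for the example.

```c
#include <stddef.h>

/* Sums n ints starting at a, forming each address as base + i*sizeof(int)
   the way [ECX+EAX*4] does on IA-32. */
int sum_indexed(const int *a, size_t n)
{
    const char *base = (const char *)a;   /* ECX: address of a[0] */
    int sum = 0;                          /* EBX */
    for (size_t i = 0; i < n; i++) {      /* EAX: i */
        const int *elem = (const int *)(base + i * sizeof(int));
        sum += *elem;                     /* ADD EBX, [ECX+EAX*4] */
    }
    return sum;
}
```

In ordinary C the expression a[i] performs the same scaling implicitly, which is why the source loop never mentions the factor of 4.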
LEA: Load Effective Address
• LEA Dest, Src
    – Src is address mode expression
    – Set Dest to address denoted by expression
• Example
    LEA EAX, [EBX+4*ESI]
    – Load into EAX the value EBX+4*ESI
• Compare to
    MOV EAX, [EBX+4*ESI]
    – Load into EAX the value stored in memory at address EBX+4*ESI
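In C terms, the difference is the one between taking an element's address and reading the element. The sketch below assumes 4-byte elements so that pointer arithmetic supplies the factor of 4; lea_like and mov_like are hypothetical names for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Like LEA EAX, [EBX+4*ESI]: computes the address and touches no memory.
   Pointer arithmetic on int32_t scales the index by 4 automatically. */
const int32_t *lea_like(const int32_t *base, size_t index)
{
    return base + index;
}

/* Like MOV EAX, [EBX+4*ESI]: reads the value stored at that address. */
int32_t mov_like(const int32_t *base, size_t index)
{
    return *(base + index);
}
```

For an array v holding 10, 20, 30, lea_like(v, 2) yields the address of v[2] while mov_like(v, 2) yields 30.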
Using LEA for Arithmetic Expressions
int arith(int x, int y, int z)
{
    int t1 = x + y;
    int t2 = z + t1;
    int t3 = x + 4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}

arith:
    # Setup
    PUSH EBP
    MOV EBP, ESP
    # Body
    MOV ECX, DWORD PTR [EBP+8]
    MOV EDX, DWORD PTR [EBP+12]
    LEA EAX, [EDX+EDX*2]
    SAL EAX, 4
    LEA EAX, [ECX+4+EAX]
    ADD EDX, ECX
    ADD EDX, DWORD PTR [EBP+16]
    IMUL EAX, EDX
    # Finish
    POP EBP
    RET
Understanding arith

int arith(int x, int y, int z)
{
    int t1 = x + y;
    int t2 = z + t1;
    int t3 = x + 4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}

x is at address ebp+8
y is at address ebp+12
z is at address ebp+16
(To be explained in next lecture)

    MOV ECX, DWORD PTR [EBP+8]     ; ECX =
    MOV EDX, DWORD PTR [EBP+12]    ; EDX =
    LEA EAX, [EDX+EDX*2]           ; EAX =
    SAL EAX, 4                     ; EAX =
    LEA EAX, [ECX+4+EAX]           ; EAX =
    ADD EDX, ECX                   ; EDX =
    ADD EDX, DWORD PTR [EBP+16]    ; EDX =
    IMUL EAX, EDX                  ; EAX =
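One way to work through the blanks is to write a C statement for each instruction and compare the result with the original function. The sketch below is one possible reading; arith_by_steps and its local names (ecx, edx, eax, standing for the registers) are introduced only for this illustration, and the left shift by 4 is written as a multiplication by 16 to stay well defined for negative values.

```c
int arith(int x, int y, int z)
{
    int t1 = x + y;
    int t2 = z + t1;
    int t3 = x + 4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    return t2 * t5;
}

/* One C statement per instruction; variable names stand for registers. */
int arith_by_steps(int x, int y, int z)
{
    int ecx = x;              /* MOV ECX, [EBP+8]                     */
    int edx = y;              /* MOV EDX, [EBP+12]                    */
    int eax = edx + edx * 2;  /* LEA EAX, [EDX+EDX*2] -> 3y           */
    eax = eax * 16;           /* SAL EAX, 4           -> 48y          */
    eax = ecx + 4 + eax;      /* LEA EAX, [ECX+4+EAX] -> x + 4 + 48y  */
    edx = edx + ecx;          /* ADD EDX, ECX         -> x + y        */
    edx = edx + z;            /* ADD EDX, [EBP+16]    -> x + y + z    */
    return eax * edx;         /* IMUL EAX, EDX                        */
}
```

With x = 1, y = 2, z = 3 both functions return 606. The compiler's trick is visible in the middle: 48y is built as (3y) * 16 with one LEA and one shift instead of a multiply.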
Data Access Methods: Summary
• Immediate addressing: data stored in the instruction itself
    MOV ECX, 10
• Register addressing: data stored in a register
    MOV ECX, EAX
• Direct addressing: address stored in instruction
    MOV ECX, [200]
• Indirect addressing: address stored in a register
    MOV ECX, [EAX]
    MOV ECX, [EAX+4]
    MOV ECX, [EAX + ESI*4 + 12]
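Data Transfer Instructions
• MOV Dest, Src
    General move instruction
• PUSH Src
    PUSH EBX
    # equivalent instructions
    SUB ESP, 4
    MOV [ESP], EBX
  [Figure: with EBX = 17, PUSH EBX moves ESP down by 4 and stores 17 at the new top of the stack]
• POP Dest
    POP ECX
    # equivalent instructions
    MOV ECX, [ESP]
    ADD ESP, 4
  [Figure: with 44 on top of the stack, POP ECX copies 44 into ECX and moves ESP up by 4]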
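The equivalent instruction pairs for PUSH and POP can be mimicked in C with an array standing in for stack memory. This is a hypothetical model, not how a real process stack is accessed: struct machine, STACK_WORDS, machine_init, push, and pop are all names invented for the sketch, and no overflow or underflow checks are made.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical stack model: memory is an array of 32-bit words and esp is
   a byte offset into it, starting at the top and growing toward 0. */
struct machine {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;
};

void machine_init(struct machine *m)
{
    m->esp = STACK_WORDS * 4;    /* empty stack: ESP just past the end */
}

void push(struct machine *m, uint32_t value)
{
    m->esp -= 4;                 /* SUB ESP, 4     */
    m->mem[m->esp / 4] = value;  /* MOV [ESP], src */
}

uint32_t pop(struct machine *m)
{
    uint32_t value = m->mem[m->esp / 4];  /* MOV dest, [ESP] */
    m->esp += 4;                          /* ADD ESP, 4      */
    return value;
}
```

Pushing 17 lowers esp by 4, and the following pop returns 17 and restores esp, which matches the order of the two-instruction expansions: decrement before the store on a push, load before the increment on a pop.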