The Assembly Language Level Part B – The Assembly Process
Page 1: The Assembly Language Level Part B – The Assembly Process.

The Assembly Language Level

Part B – The Assembly Process

Page 2: The Assembly Language Level Part B – The Assembly Process.

Specifying numeric values

Java

    int i = 10;
    int j = 0x10;
    int k = 010;

    System.out.println( i );
    System.out.println( j );
    System.out.println( k );

What appears?

MASM

    i   dword 10d
    j   dword 10h
    k   dword 10o
    el  = 20D
    m   = 20H
    n   = 20O

C/C++

    int i = 10;
    int j = 0x10;
    int k = 010;
    #define el 20
    #define m  0x20
    #define n  020

Page 3: The Assembly Language Level Part B – The Assembly Process.

Specifying numeric values

What appears? The Java program prints 10, 16, and 8 on separate lines: 10 is decimal, 0x10 is hexadecimal (16), and 010 is octal (8).

Page 4: The Assembly Language Level Part B – The Assembly Process.

FORWARD REFERENCES (when a symbol is used before it is defined)

Page 5: The Assembly Language Level Part B – The Assembly Process.

Two-pass assembler (translator)

    class test {
        public static void main ( String args[] ) {
            System.out.println( "k=" + k );
            f();
        }

        static int k = 72;

        static void f ( ) {
            System.out.println( "in f()" );
        }
    }

valid forward references (k and f are both used in main before they are defined)

Page 6: The Assembly Language Level Part B – The Assembly Process.

Two-pass assembler (translator)

    class test {
        public static void main ( String args[] ) {
            System.out.println( "k=" + k );
            int k = 72;
        }
    }

invalid forward reference (the local variable k is used before it is declared, so this does not compile)

Page 7: The Assembly Language Level Part B – The Assembly Process.

Two-pass assembler (translator)

• Forward reference
  – symbol is used before it is defined
• Solutions:
  1. (One-pass translators make only one pass and don’t allow any forward references at all.)
  2. Make two passes (read the source file twice).
     • The first pass collects symbol definitions and label locations.
     • The second pass uses the table built in the first pass.
  3. Make one pass over the source and produce an intermediate form.
     • A second pass is made over this intermediate data.
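Solution 2 can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the assembler from the slides: the toy language, the class name TwoPassSketch, and the rule that every instruction occupies exactly one location are all made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy two-pass translator. Assumptions (not from the slides): each
// instruction occupies exactly one location, a label is a line ending
// in ':', and "jmp <label>" is the only instruction that uses a symbol.
class TwoPassSketch {

    // Pass one: record the location of every label.
    static Map<String, Integer> passOne(List<String> source) {
        Map<String, Integer> symbols = new HashMap<>();
        int ilc = 0;                          // instruction location counter
        for (String line : source) {
            String s = line.trim();
            if (s.endsWith(":")) {
                symbols.put(s.substring(0, s.length() - 1), ilc);
            } else if (!s.isEmpty()) {
                ilc++;                        // one location per instruction
            }
        }
        return symbols;
    }

    // Pass two: replace each label operand with the location from pass one.
    static List<String> passTwo(List<String> source, Map<String, Integer> symbols) {
        List<String> out = new ArrayList<>();
        for (String line : source) {
            String s = line.trim();
            if (s.isEmpty() || s.endsWith(":")) continue;
            if (s.startsWith("jmp ")) {
                String label = s.substring(4).trim();
                Integer target = symbols.get(label);
                if (target == null) {
                    throw new IllegalArgumentException("undefined symbol: " + label);
                }
                out.add("jmp " + target);
            } else {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "done" is used on the first line but defined two lines later.
        List<String> src = List.of("jmp done", "nop", "done:", "halt");
        System.out.println(passTwo(src, passOne(src)));   // [jmp 2, nop, halt]
    }
}
```

The forward reference to done is harmless here because pass two starts only after pass one has seen the whole source.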

Page 8: The Assembly Language Level Part B – The Assembly Process.

PASS ONE

Page 9: The Assembly Language Level Part B – The Assembly Process.

Pass one

• Purpose: build the symbol table
  – a table containing:
    • the name of the symbol and its value
    • or the name of the label and its location
  – ILC = Instruction Location Counter
  – Variable set to 0 and incremented by the instruction (or data) length for each line of code.
  – Note: both code and data may be labeled.

Page 10: The Assembly Language Level Part B – The Assembly Process.

Tanenbaum, Structured Computer Organization, Fifth Edition, (c) 2006

Pearson Education, Inc. All rights reserved. 0-13-148521-0

Pass One: Two-Pass Assemblers

The instruction location counter (ILC) keeps track of the address where the instructions will be loaded in memory.

In this example, the statements prior to MARIA occupy 100 bytes.

Page 11: The Assembly Language Level Part B – The Assembly Process.

Pass one

• Symbol types:
  1. Symbol and corresponding value
     • Assigned by a pseudoinstruction
     • Ex. bufsize equ 8192
  2. Label and corresponding location

            ;CF flag
            test  ebx, CF            ;carry set?
            jz    cf0                ;jump if 0
            print SADD(" CF:1")      ;flag is on
            jmp   nxt                ;skip over else part
    cf0:
            print SADD(" CF:0")      ;flag is off
    nxt:

Page 12: The Assembly Language Level Part B – The Assembly Process.

Pass one

• Employs 3 (or sometimes 4) tables:
  1. Symbol table
  2. Pseudoinstruction table
  3. Opcode table
  4. Literal table (optional)

Page 13: The Assembly Language Level Part B – The Assembly Process.

Pass one

• Employs 3 (or sometimes 4) tables:
  1. Symbol table
     • Symbol name
     • Value (ILC for labels; definition/value for equ)
     • Length of data field (especially for strings)
     • Relocation bits (relocatable?)
     • Scope of symbol
  2. Pseudoinstruction table
  3. Opcode table
  4. Literal table (optional)

Page 14: The Assembly Language Level Part B – The Assembly Process.


Pass One: Two-Pass Assemblers

A symbol table for the program of Fig. 7-7.

Page 15: The Assembly Language Level Part B – The Assembly Process.

Pass one

• Employs 3 (or sometimes 4) tables:
  1. Symbol table
  2. Pseudoinstruction table (directive and size in bytes)

        db  1
        dw  2
        dd  4

  3. Opcode table
  4. Literal table (optional)
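The db/dw/dd entries above can be read as "how far the ILC advances per operand". A minimal sketch, assuming operands are a comma-separated list of plain numbers (strings and dup are not handled); the class and method names are hypothetical:

```java
import java.util.Map;

// Sketch of a pseudoinstruction table that maps a data directive to the
// number of bytes one operand occupies (db 1, dw 2, dd 4, as on the slide).
// Assumption: operands are a comma-separated list of plain numbers.
class PseudoTableSketch {
    static final Map<String, Integer> SIZES = Map.of("db", 1, "dw", 2, "dd", 4);

    // Returns how far the ILC advances for a line such as "dw 1, 2, 3".
    static int lengthOf(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        Integer size = SIZES.get(parts[0].toLowerCase());
        if (size == null || parts.length < 2) {
            throw new IllegalArgumentException("not a data directive: " + line);
        }
        int operands = parts[1].split(",").length;
        return size * operands;
    }

    public static void main(String[] args) {
        int ilc = 0;
        for (String line : new String[] { "db 7", "dw 1, 2, 3", "dd 100" }) {
            System.out.println(ilc + ": " + line);   // location of this line
            ilc += lengthOf(line);
        }
        System.out.println("next free location: " + ilc);   // 11
    }
}
```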

Page 16: The Assembly Language Level Part B – The Assembly Process.

Pass one

• Employs 3 (or sometimes 4) tables:
  1. Symbol table
  2. Pseudoinstruction table
  3. Opcode table
  4. Literal table (optional)

Page 17: The Assembly Language Level Part B – The Assembly Process.


Pass one: Two-Pass Assemblers

A few excerpts from the opcode table for a Pentium 4 assembler.

Page 18: The Assembly Language Level Part B – The Assembly Process.

Pass one

• Employs 3 (or sometimes 4) tables:
  1. Symbol table
  2. Pseudoinstruction table
  3. Opcode table
  4. Literal table (optional)
     • for pseudoimmediate instructions
       – when there is not any support for immediate instructions
       – only loads from memory are allowed
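One way to picture the literal table: each constant is stored once in a pool in memory, and the instruction that needs it loads from the pool address instead. The sketch below assumes 32-bit literals and a pool laid out in order of first use; the class name and the base address of 100 are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a literal table. Assumptions (illustrative only): every literal
// is a 32-bit constant, and the literal pool is laid out in order of first use.
class LiteralTableSketch {
    private final Map<Integer, Integer> offsets = new LinkedHashMap<>();

    // Records a literal once and returns its offset within the pool.
    int offsetOf(int literal) {
        Integer known = offsets.get(literal);
        if (known != null) return known;      // already in the pool
        int offset = offsets.size() * 4;      // 4 bytes per literal
        offsets.put(literal, offset);
        return offset;
    }

    int poolSize() { return offsets.size() * 4; }

    public static void main(String[] args) {
        LiteralTableSketch table = new LiteralTableSketch();
        int poolBase = 100;                   // hypothetical: pool follows 100 bytes of code
        for (int literal : new int[] { 5, 72, 5 }) {
            System.out.println("load from " + (poolBase + table.offsetOf(literal))
                    + " ; literal " + literal);
        }
        System.out.println("pool size: " + table.poolSize());   // 8
    }
}
```

The second use of 5 reuses the first entry, so the pool holds two literals rather than three.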

Page 19: The Assembly Language Level Part B – The Assembly Process.


Pass One (part 1)

Pass one of a simple assembler.


Page 20: The Assembly Language Level Part B – The Assembly Process.


Pass One (part 2)

Pass one of a simple assembler.


(Of course, a line might contain more than one literal!)

Page 21: The Assembly Language Level Part B – The Assembly Process.


Pass One (part 3)

Pass one of a simple assembler.

Page 22: The Assembly Language Level Part B – The Assembly Process.

PASS TWO

Page 23: The Assembly Language Level Part B – The Assembly Process.

Pass two

• Purpose:
  – to generate object code
  – to optionally generate assembly listing
• Reads in each line, 1 line at a time (from original or intermediate code).
• Processes each line.
  – writes out binary object code

Page 24: The Assembly Language Level Part B – The Assembly Process.

MASM listing (.lst) file

00000000 .data ;insert variables below

= "<hit return>" prompt equ "<hit return>" ;prompt string

00000000 .code ;insert executable instructions below

00000000 main PROC ;program execution begins here

00000000 B8 00000001 mov eax, 1 ;set regs values

00000005 BB 00000002 mov ebx, 2

0000000A B9 00000003 mov ecx, 3

0000000F BA 00000004 mov edx, 4

00000014 BE 00000005 mov esi, 5

00000019 BF 00000006 mov edi, 6

0000001E E8 00000036 call dump ;show contents of regs

00000000 1 .data

00000000 3C 68 69 74 20 1 ??0019 db prompt, 0

72 65 74 75 72

6E 3E 00

00000000 1 .data?

00000000 1 ??001A db 132 dup (?)

00000023 1 .code

0000003C C6 80 00000000 R 1 mov BYTE PTR [??001A+eax], 0

00

0000004D B8 00000000 R mov eax, input(prompt) ;prompt the user

exit ;end of program

00000059 main ENDP

Page 25: The Assembly Language Level Part B – The Assembly Process.

00000000 .code ;insert executable instructions below

00000000 main PROC ;program execution begins here

00000000 B8 00000001 mov eax, 1 ;set regs values

00000005 BB 00000002 mov ebx, 2

0000000A B9 00000003 mov ecx, 3

0000000F BA 00000004 mov edx, 4

00000014 BE 00000005 mov esi, 5

00000019 BF 00000006 mov edi, 6
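The opcode column in this excerpt follows the standard IA-32 encoding of mov r32, imm32: one opcode byte (B8 plus the register number) followed by a 4-byte immediate, which is why each location is 5 higher than the previous one. A sketch of how pass two might emit these bytes; the class and method names are hypothetical. Note that the listing prints the immediate as a 32-bit value, while the bytes themselves are stored little-endian.

```java
// Sketch of how pass two could produce the bytes for "mov r32, imm32".
// The encoding (opcode B8 + register number, then a 32-bit little-endian
// immediate) is the standard IA-32 one; the names here are made up.
class MovEncodingSketch {
    // Index in this array is the register number used in the opcode byte.
    static final String[] REGS = { "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi" };

    static byte[] encodeMov(String reg, int imm) {
        for (int r = 0; r < REGS.length; r++) {
            if (REGS[r].equals(reg)) {
                return new byte[] {
                    (byte) (0xB8 + r),
                    (byte) imm, (byte) (imm >>> 8), (byte) (imm >>> 16), (byte) (imm >>> 24)
                };
            }
        }
        throw new IllegalArgumentException("unknown register: " + reg);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02X ", b));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int ilc = 0;
        String[] regs = { "eax", "ebx", "ecx" };
        for (int i = 0; i < regs.length; i++) {
            byte[] code = encodeMov(regs[i], i + 1);
            System.out.println(String.format("%08X  %s", ilc, hex(code)));
            ilc += code.length;               // 5 bytes per instruction
        }
    }
}
```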


Page 27: The Assembly Language Level Part B – The Assembly Process.

00000000 .data ;insert variables below

= "<hit return>" prompt equ "<hit return>" ;prompt string

00000000 1 .data

00000000 3C 68 69 74 20 1 ??0019 db prompt, 0

72 65 74 75 72

6E 3E 00

ASCII & MASM listing file
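The bytes in the db line are simply the ASCII codes of the prompt string followed by the 0 terminator. A small sketch that reproduces them (the class and method names are made up for the example):

```java
import java.nio.charset.StandardCharsets;

// Sketch: show the bytes that "db prompt, 0" places in the data segment.
class AsciiBytesSketch {
    // Returns the ASCII codes of text in hex, followed by a 00 terminator.
    static String dbBytes(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.US_ASCII)) {
            sb.append(String.format("%02X ", b));
        }
        return sb.append("00").toString();
    }

    public static void main(String[] args) {
        // Prints: 3C 68 69 74 20 72 65 74 75 72 6E 3E 00
        System.out.println(dbBytes("<hit return>"));
    }
}
```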

Page 28: The Assembly Language Level Part B – The Assembly Process.

Pass two

Possible errors:
1. Symbol used but not defined
2. Multiply defined symbol
3. Unrecognized opcode
4. Too few/many operands
5. Bad (binary/octal/decimal/hex) number
6. Illegal register use (e.g., branch to a register)
7. END missing
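As an illustration of error 5, the sketch below checks a numeric literal written with the MASM-style suffixes used earlier in these slides (b, o, d, h). It assumes a literal with no suffix is decimal and that every literal starts with a digit; it is not a complete MASM number parser, and the class name is hypothetical.

```java
// Sketch of a check for error 5 ("bad number"). Assumption: literals use
// the suffixes b, o, d, h, and a literal with no suffix is decimal.
class NumberCheckSketch {

    // Returns the value of the literal, or throws NumberFormatException.
    static int parse(String literal) {
        String s = literal.trim().toLowerCase();
        if (s.isEmpty() || !Character.isDigit(s.charAt(0))) {
            throw new NumberFormatException("must start with a digit: " + literal);
        }
        int radix;
        String digits = s.substring(0, s.length() - 1);   // literal without suffix
        switch (s.charAt(s.length() - 1)) {
            case 'b': radix = 2;  break;
            case 'o': radix = 8;  break;
            case 'd': radix = 10; break;
            case 'h': radix = 16; break;
            default:  radix = 10; digits = s; break;       // no suffix: decimal
        }
        return Integer.parseInt(digits, radix);            // rejects illegal digits
    }

    public static void main(String[] args) {
        for (String lit : new String[] { "10d", "10h", "10o", "19o" }) {
            try {
                System.out.println(lit + " = " + parse(lit));
            } catch (NumberFormatException e) {
                System.out.println(lit + " : bad number");  // 9 is not an octal digit
            }
        }
    }
}
```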

Page 29: The Assembly Language Level Part B – The Assembly Process.

Pass Two (part 1)

Pass two of a simple assembler.


Read one entry from the intermediate file.

(Questionable limitations (16 bytes).)

Page 30: The Assembly Language Level Part B – The Assembly Process.

Pass Two (part 2)

Pass two of a simple assembler.

Page 31: The Assembly Language Level Part B – The Assembly Process.

SYMBOL TABLE ORGANIZATION

Page 32: The Assembly Language Level Part B – The Assembly Process.

Symbol table

• Stores <name,value> pairs.
• How do we find the value associated with a particular symbol?
• If we store the symbol table in an unordered array, then linear search requires:
  – best case: 1 comparison
  – worst case: n comparisons
  – average case: about n/2 comparisons
  – O(n)

Page 33: The Assembly Language Level Part B – The Assembly Process.

Symbol table

• Stores <name,value> pairs.
• How do we find the value associated with a particular symbol?
• If we store the symbol table in a sorted array, then binary search requires:
  – best case: 1
  – worst case: log2 (n)
  – O(log2 n)
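A sketch of lookup in a sorted symbol array. The symbol names and values are made up for illustration, and the class name is hypothetical.

```java
// Sketch of symbol lookup in a sorted array using binary search.
// The symbol names and values in main are made up for illustration.
class SortedSymbolTableSketch {
    private final String[] names;   // sorted in ascending order
    private final int[] values;     // values[i] belongs to names[i]

    SortedSymbolTableSketch(String[] sortedNames, int[] values) {
        this.names = sortedNames;
        this.values = values;
    }

    // Returns the value for name, or null if the symbol is not present.
    // Each loop iteration halves the range still to be searched.
    Integer lookup(String name) {
        int lo = 0, hi = names.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = name.compareTo(names[mid]);
            if (cmp == 0) return values[mid];
            if (cmp < 0) hi = mid - 1; else lo = mid + 1;
        }
        return null;
    }

    public static void main(String[] args) {
        SortedSymbolTableSketch table = new SortedSymbolTableSketch(
                new String[] { "cf0", "main", "nxt" }, new int[] { 20, 0, 31 });
        System.out.println(table.lookup("nxt"));    // 31
        System.out.println(table.lookup("dump"));   // null
    }
}
```

One cost the search bound does not show: keeping the array sorted while pass one is still adding symbols takes O(n) per insertion.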

Page 34: The Assembly Language Level Part B – The Assembly Process.

Symbol table

• Stores <name,value> pairs.
• How do we find the value associated with a particular symbol?
• If we store the symbol table in a hash table, then search requires:
  – best case = worst case = average case = O(1) iff we have a perfect hash function
    • H:S-->I (H maps Strings onto the Ints)

Page 35: The Assembly Language Level Part B – The Assembly Process.

Symbol table

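• Hash function H:S-->I
  – H maps Strings onto the Ints
  – We store our symbols in a table of size k.
  – Possible hash functions:

    $H(s) = \left( \sum_i s_i \right) \bmod k$

    $H(s) = \left( \sum_i i \cdot s_i \right) \bmod k$

    where $s_i$ is the character code at position $i$ of the symbol $s$.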
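The two candidate hash functions on this slide (sum of the character codes, and position-weighted sum of the character codes, each taken modulo the table size k) can be sketched as follows. The slide does not state whether positions start at 0 or 1; this sketch assumes 1 so that the first character also contributes. The class and method names are hypothetical.

```java
// Sketch of two simple hash functions for a table of size k.
// Assumption: positions are counted from 1 in the weighted version.
class HashFunctionSketch {

    // H(s) = (sum of character codes) % k
    static int sumHash(String s, int k) {
        long sum = 0;
        for (int i = 0; i < s.length(); i++) sum += s.charAt(i);
        return (int) (sum % k);
    }

    // H(s) = (sum of position * character code) % k
    static int weightedHash(String s, int k) {
        long sum = 0;
        for (int i = 0; i < s.length(); i++) sum += (long) (i + 1) * s.charAt(i);
        return (int) (sum % k);
    }

    public static void main(String[] args) {
        int k = 8;
        // Anagrams always collide under the plain sum.
        System.out.println(sumHash("ab", k) + " " + sumHash("ba", k));            // 3 3
        // The weighted sum depends on character order.
        System.out.println(weightedHash("ab", k) + " " + weightedHash("ba", k));  // 5 4
    }
}
```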

Page 36: The Assembly Language Level Part B – The Assembly Process.


The Symbol Table (1)

Hash coding. (a) Symbols, values, and the hash codes derived from the symbols.

We need a method to cope with the situation when the hash scheme is less than perfect (i.e., when a collision occurs).

Page 37: The Assembly Language Level Part B – The Assembly Process.


Chaining: One solution is to make a linked list of symbols that hash to the same value.
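A minimal sketch of chaining, assuming a fixed number of slots and the simple sum-of-characters hash; the class name and the sample symbols are made up for illustration.

```java
import java.util.LinkedList;

// Sketch of a chained hash table for <name,value> pairs. The hash function
// (sum of character codes % number of slots) is one simple choice.
class ChainedTableSketch {
    private static class Entry {
        final String name;
        int value;
        Entry(String name, int value) { this.name = name; this.value = value; }
    }

    private final LinkedList<Entry>[] slots;   // one linked list per slot

    @SuppressWarnings("unchecked")
    ChainedTableSketch(int size) {
        slots = new LinkedList[size];
        for (int i = 0; i < size; i++) slots[i] = new LinkedList<>();
    }

    private int hash(String name) {
        int sum = 0;
        for (int i = 0; i < name.length(); i++) sum += name.charAt(i);
        return sum % slots.length;
    }

    // Updates the value if the name is already in the chain, else appends.
    void put(String name, int value) {
        for (Entry e : slots[hash(name)]) {
            if (e.name.equals(name)) { e.value = value; return; }
        }
        slots[hash(name)].add(new Entry(name, value));
    }

    // Walks the chain for the name's slot; returns null if not found.
    Integer get(String name) {
        for (Entry e : slots[hash(name)]) {
            if (e.name.equals(name)) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedTableSketch table = new ChainedTableSketch(8);
        table.put("ab", 1);
        table.put("ba", 2);                     // collides with "ab", same chain
        System.out.println(table.get("ab"));    // 1
        System.out.println(table.get("ba"));    // 2
        System.out.println(table.get("zz"));    // null
    }
}
```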

Page 38: The Assembly Language Level Part B – The Assembly Process.


The Symbol Table (2)

Hash coding. (b) Eight-entry hash table with linked lists of symbols and values.

Page 39: The Assembly Language Level Part B – The Assembly Process.

Analysis of hashing with chaining

• If all symbols hash to the same position (i.e., to only one position), what happens?
• Load factor
  – Given a hash table T with m slots that stores n elements, we define the load factor f for T as f = n / m.
    » f is the average number of elements in a chain.
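The question in the first bullet can be explored with a small experiment. In this sketch (names and symbols are made up), six anagrams are hashed with the sum-of-characters function into 4 slots: the load factor is only 1.5, yet every symbol lands in the same chain, so a lookup degenerates into a linear search of all n entries.

```java
import java.util.Arrays;

// Sketch: count chain lengths for a set of symbols and report the load
// factor f = n / m. The symbols and the hash function are illustrative.
class LoadFactorSketch {

    // lengths[j] is the number of symbols that hash to slot j.
    static int[] chainLengths(String[] symbols, int m) {
        int[] lengths = new int[m];
        for (String s : symbols) {
            int sum = 0;
            for (int i = 0; i < s.length(); i++) sum += s.charAt(i);
            lengths[sum % m]++;
        }
        return lengths;
    }

    static double loadFactor(int n, int m) {
        return (double) n / m;
    }

    public static void main(String[] args) {
        String[] symbols = { "abc", "acb", "bac", "bca", "cab", "cba" };
        int m = 4;
        System.out.println(Arrays.toString(chainLengths(symbols, m)));   // [0, 0, 6, 0]
        System.out.println("f = " + loadFactor(symbols.length, m));      // f = 1.5
    }
}
```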

Page 40: The Assembly Language Level Part B – The Assembly Process.

Java and hash tables

• http://download.oracle.com/javase/6/docs/api/java/util/Hashtable.html

This example creates a hashtable of names & numbers. It uses names as keys:

    Hashtable<String, Integer> tbl = new Hashtable< String, Integer >();
    tbl.put( "fred", 1 );
    tbl.put( "ethel", 29 );
    tbl.put( "barney", 3 );

To retrieve a number, use the following code:

    Integer v = tbl.get( "ethel" );
    if (v != null) {
        System.out.println( "ethel = " + v );
    }

Note: What’s going on with 1, 29, and 3, and Integer?
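One way to answer the note: Hashtable stores objects only, so the compiler converts each int literal to an Integer object (autoboxing), and get returns an Integer that may be null. A short sketch; the lookup helper and the key "lucy" are made up for the example.

```java
import java.util.Hashtable;

// Sketch of autoboxing with Hashtable<String, Integer>.
class AutoboxingSketch {
    // Returns the stored value, or fallback when the key is absent.
    static int lookup(Hashtable<String, Integer> tbl, String key, int fallback) {
        Integer boxed = tbl.get(key);                 // an Integer object, or null
        return (boxed != null) ? boxed : fallback;    // auto-unboxing to int
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> tbl = new Hashtable<>();
        tbl.put("ethel", 29);                  // compiler inserts Integer.valueOf(29)
        tbl.put("fred", Integer.valueOf(1));   // the same thing written out

        System.out.println(lookup(tbl, "ethel", -1));   // 29
        System.out.println(lookup(tbl, "lucy", -1));    // -1
        // Writing "int v = tbl.get("lucy");" instead would throw a
        // NullPointerException, because null cannot be unboxed.
    }
}
```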

Page 41: The Assembly Language Level Part B – The Assembly Process.

C++ and hash tables

See http://en.wikipedia.org/wiki/Hash_map_%28C%2B%2B%29

    #include <cstring>     // strcmp
    #include <hash_map>    // pre-C++11 extension; std::unordered_map is the standard successor

    struct eqstr {
        bool operator() ( const char* s1, const char* s2 ) const {
            return strcmp(s1,s2) == 0;  //true when eq
        }
    };
    …
    hash_map< const char*, int, hash<const char*>, eqstr > tbl;
    …
    tbl["fred"] = 1;
    tbl["ethel"] = 29;
    tbl["barney"] = 3;
    …
    int v = tbl["ethel"];

Because operator[] inserts a default entry when the key is missing, one may use

    iterator find ( const key_type& k )

or

    size_type count ( const key_type& k ) const

to check whether the key exists first.

Page 42: The Assembly Language Level Part B – The Assembly Process.

Comparison (Java vs. C++)

Java

    import java.util.Hashtable;
    …
    Hashtable<String, Integer> tbl = new Hashtable< String, Integer >();
    …
    tbl.put( "fred", 1 );
    tbl.put( "ethel", 29 );
    tbl.put( "barney", 3 );
    …
    Integer v = tbl.get( "ethel" );
    if (v != null) {
        System.out.println( "ethel = " + v );
    }

C++

    #include <hash_map>

    struct eqstr {
        bool operator() ( const char* s1, const char* s2 ) const {
            return strcmp(s1,s2)==0;
        }
    };
    …
    hash_map< const char*, int, hash<const char*>, eqstr > tbl;
    …
    tbl["fred"] = 1;
    tbl["ethel"] = 29;
    tbl["barney"] = 3;
    …
    int v = tbl["ethel"];
    //can/should use find or count first to check for existence

Why?

Nicer syntax.

Objects only!

Objects or primitive types.