See: P&H Appendix A.1-2, A.3-4 and 2.12 Anne Bracy CS 3410 Computer Science Cornell University The slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, McKee, and Sirer.
• Compiler: creates assembly files ✔
• Assembler: creates object files (= machine code)
• Linker: joins object files into one executable
• Loader: brings executable into memory and starts executing a process
[Figure: the toolchain. C source files (calc.c, math.c) → Compiler → assembly files (calc.s, math.s, io.s) → Assembler → obj files (calc.o, math.o, io.o, plus libc.o, libm.o) → Linker → executable program (calc.exe, exists on disk) → loader → process (executing in memory)]
How do we (as humans or compiler) program on top of a given ISA?
Assembly language → binary machine code
Input = Program:
• MIPS instructions
• Program data (strings, variables, etc.)
Output = object file: .o file in Unix, .obj in Windows, containing MIPS instructions in executable form

addi r5, r0, 10
muli r5, r5, 2      # encoded below as sll r5, r5, 1
addi r5, r5, 15

00100000000001010000000000001010
00000000000001010010100001000000
00100000101001010000000000001111
Assembly language is used to specify programs at a low level

Will I program in assembly?
A: I do...
• For CS 3410 (and some CS 4410/4411)
• For kernel hacking, device drivers, GPU, etc.
• For performance (but compilers are getting better)
• For highly time critical sections
• For hardware without high level languages
• For new & advanced instructions: rdtsc, debug registers, performance counters, synchronization, ...
Input:
• assembly instructions
• pseudo-instructions
• data and layout directives
Output: Object File

Slightly higher level than plain assembly, e.g. takes care of delay slots (will reorder instructions or insert nops)
Arithmetic/Logical
• ADD, ADDU, SUB, SUBU, AND, OR, XOR, NOR, SLT, SLTU
• ADDI, ADDIU, ANDI, ORI, XORI, LUI, SLL, SRL, SLLV, SRLV, SRAV, SLTI, SLTIU
• MULT, DIV, MFLO, MTLO, MFHI, MTHI
Memory Access
• LW, LH, LB, LHU, LBU, LWL, LWR
• SW, SH, SB, SWL, SWR
Control flow
• BEQ, BNE, BLEZ, BLTZ, BGEZ, BGTZ
• J, JR, JAL, JALR, BEQL, BNEL, BLEZL, BGTZL
Special
• LL, SC, SYSCALL, BREAK, SYNC, COPROC
Pseudo-Instructions
NOP                      # do nothing
• SLL r0, r0, 0
MOVE reg, reg            # copy between regs
• ADD r2, r0, r1         # copies contents of r1 to r2
LI reg, imm              # load immediate (up to 32 bits)
LA reg, label            # load address (32 bits)
B label                  # unconditional branch
BLT reg, reg, label      # branch less than
• SLT r1, rA, rB         # r1 = 1 if R[rA] < R[rB]; o.w. r1 = 0
• BNE r1, r0, label      # go to address label if r1 != r0; i.e. rA < rB
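Programs consist of segments used for different purposes
• Text: holds instructions
• Data: holds statically allocated program data such as variables, strings, etc.

[Figure: memory with a text segment holding instructions (add r1,r2,r3; ori r2,r4,3; ...) and a data segment holding values ("cornell cs", 13, 25)]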
Assembly files consist of a mix of
• instructions
• pseudo-instructions
• assembler (data/layout) directives on how to lay out values in memory

Assembled to an Object File
• Header
• Text Segment
• Data Segment
• Relocation Information
• Symbol Table
• Debugging Information
        .text
        .ent main
main:   la $4, Larray
        li $5, 15
        ...
        li $4, 0
        jal exit
        .end main
        .data
Larray:
        .long 51, 491, 3991
Assembly is a low-level task
• Need to assemble assembly language into machine code binary. Requires
– Assembly language instructions
– pseudo-instructions
– And specify layout and data using assembler directives
• Modern (Harvard / Von Neumann) processors store both data and instructions in memory … but keep them in separate segments … and have separate caches
Put it all together: An example of compiling a program from source to assembly to machine object code.
[Figure: the same toolchain for this example. C source file (add100.c) → Compiler → assembly file (add100.s) → Assembler → obj file (add100.o) → Linker → executable program (add100, exists on disk) → loader → process (executing in memory)]
#include <stdio.h>

int n = 100;
int main (int argc, char* argv[]) {
    int i;
    int m = n;
    int sum = 0;
    for (i = 1; i <= m; i++)
        sum += i;
    printf ("Sum 1 to %d is %d\n", n, sum);
}

# Compile
[csug03] mipsel-linux-gcc -S add1To100.c

export PATH=${PATH}:/courses/cs3410/mipsel-linux/bin:/courses/cs3410/mips-sim/bin
or
setenv PATH ${PATH}:/courses/cs3410/mipsel-linux/bin:/courses/cs3410/mips-sim/bin
Example: Add 1 to 100

        .data
        .globl n
        .align 2
n:      .word 100
        .rdata
        .align 2
$str0:  .asciiz "Sum 1 to %d is %d\n"
        .text
        .align 2
        .globl main
main:   addiu $sp,$sp,-48     # prologue
        sw $31,44($sp)
        sw $fp,40($sp)
        move $fp,$sp
        sw $4,48($fp)
        sw $5,52($fp)
        la $2,n
        lw $2,0($2)
        sw $2,28($fp)
        sw $0,32($fp)
        li $2,1
        sw $2,24($fp)
$L2:    lw $2,24($fp)
        lw $3,28($fp)
        slt $2,$3,$2
        bne $2,$0,$L3
        lw $3,32($fp)
        lw $2,24($fp)
        addu $2,$3,$2
        sw $2,32($fp)
        lw $2,24($fp)
        addiu $2,$2,1
        sw $2,24($fp)
        b $L2
$L3:    la $4,$str0           # printf
        lw $5,28($fp)
        lw $6,32($fp)
        jal printf
        move $sp,$fp          # epilogue
        lw $31,44($sp)
        lw $fp,40($sp)
        addiu $sp,$sp,48
        j $31
Annotations on the listing above, tracing the first loop iteration ($2 = $v0, $3 = $v1, $4-$6 = $a0-$a2):
• main: sw $4 and sw $5 save $a0 and $a1; la/lw load n, so $v0 = 100; then m = 100, sum = 0, i = 1
• $L2: $v0 = i = 1, $v1 = m = 100; slt tests if (m < i), i.e. 100 < 1
• loop body: $v0 = 1 (i), $v1 = 0 (sum); addu gives $v0 = 1 (0+1), so sum = 1 while i = 1; addiu gives i = 2 (1+1)
• $L3: $a0 = str, $a1 = m = 100, $a2 = sum
# Assemble
[csug01] mipsel-linux-gcc -c add1To100.s

# Link
[csug01] mipsel-linux-gcc -o add1To100 add1To100.o ${LINKFLAGS}
# -nostartfiles -nodefaultlibs
# -static -mno-xgot -mno-embedded-pic -mno-abicalls -G 0 -DMIPS -Wall

# Load
[csug01] simulate add1To100
Sum 1 to 100 is 5050
MIPS program exits with status 0 (approx. 2007 instructions in 143000 nsec at 14.14034 MHz)
#include <stdio.h>
#include <stdlib.h>

int n = 100;
int main (int argc, char* argv[]) {
    int i, m = n, sum = 0;
    int* A = malloc(4*m + 4);
    for (i = 1; i <= m; i++) {
        sum += i;
        A[i] = sum;
    }
    printf ("Sum 1 to %d is %d\n", n, sum);
}
Variables Visibility Lifetime Location
Function-Local
Global
Dynamic
int *trouble() {
    int a; …
    return &a;       // "addr of" something on the stack!
                     // invalid after return
}
char *evil() {
    char s[20];
    gets(s);         // buffer overflow
    return s;
}
int *bad() {
    s = malloc(20); …
    free(s); …
    return s;        // freed (i.e. a dangling) pointer
}
calc.c
    vector* v = malloc(8);
    v->x = prompt("enter x");
    v->y = prompt("enter y");
    int c = pi + tnorm(v);
    print("result %d", c);

math.c
    int tnorm(vector* v) {
        return abs(v->x) + abs(v->y);
    }

lib3410.o
    global variable: pi
    entry point: prompt
    entry point: print
    entry point: malloc
[Figure: process memory layout, top to bottom: system reserved, stack, dynamic data (heap), static data, code (text), system reserved]
Compiler produces assembly files
• (contain MIPS assembly, pseudo-instructions, directives, etc.)
Assembler produces object files
• (contain MIPS machine code, missing symbols, some layout information, etc.)
Linker produces executable file
• (contains MIPS machine code, no missing symbols, some layout information)
Loader puts program into memory and jumps to first instruction
• (machine code)
Compiler output is assembly files

Assembler output is obj files
• How does the assembler resolve references/labels?
• How does the assembler resolve external references?

Linker joins object files into one executable
• How does the linker combine separately compiled files?
• How does linker resolve unresolved references?
• How does linker relocate data and code segments?

Loader brings it into memory and starts execution
• How does the loader start executing a program?
• How does the loader handle shared libraries?
[Figure: first half of the toolchain. C source files (calc.c, math.c) → Compiler → assembly files (calc.s, math.s, io.s) → Assembler → obj files (calc.o, math.o, io.o); .o = Linux, .obj = Windows]
Output of assembler: object files
• Binary machine code, but not executable

Each file assembled separately
• How does assembler handle forward references?

How does the assembler handle local references?

Two-pass assembly
• First pass through whole program: allocate instructions, lay out data, determine addresses
• Second pass: emit instructions and data, using label offsets from 1st pass

One-pass (or backpatch) assembly
• One pass through whole program: emit instructions, emit 0 for jumps to labels not yet determined (keep track of these)
• Backpatch, fill in 0 offsets as labels are defined
Example:
        bne $1, $2, L
        sll $0, $0, 0
L:      addiu $2, $3, 0x2

The assembler will change this to
        bne $1, $2, +1
        sll $0, $0, 0
        addiu $2, $3, 0x2

Final machine code
        0x14220001   # bne
        0x00000000   # sll
        0x24620002   # addiu

00010100001000100000000000000001
00000000000000000000000000000000
00100100011000100000000000000010
Output of assembler: object files
• Binary machine code, not executable
• How does assembler handle forward references?
• May refer to external symbols (need a "symbol table")
• Each object file has illusion of its own address space
– Addresses will need to be fixed later
– e.g. .text (code) starts at addr 0x00000000, .data starts @ addr 0x10000000

[Figure: math.c → math.s → math.o; .o = Linux, .obj = Windows]
How does the assembler handle external references?

Global labels: Externally visible "exported" symbols
• Can be referenced from other object files
• Exported functions, global variables
• e.g. pi (from a couple of slides ago)

Local labels: Internally visible only symbols
• Only used within this object file
• static functions, static variables, loop labels, …
• e.g. static foo, static bar, static baz
• e.g. $str, $L0, $L2
Object File

Header
• Size and position of pieces of file
Text Segment
• instructions
Data Segment
• static data (local/global vars, strings, constants)
Debugging Information
• line number → code address map, etc.
Symbol Table
• External (exported) references
• Unresolved (imported) references
math.c
    int pi = 3;                          // global
    int e = 2;                           // global
    static int randomval = 7;            // local (to current file)

    extern char *username;               // external (defined in another file)
    extern int printf(char *str, …);     // external (defined in another file)

    int square(int x) { … }              // global
    static int is_prime(int x) { … }     // local
    int pick_prime() { … }
    int pick_random() {
        return randomval;
    }

gcc -S … math.c                  (Compiler)
gcc -c … math.s                  (Assembler)
objdump --disassemble math.o
objdump --syms math.o
csug01 ~$ mipsel-linux-objdump --disassemble math.o
math.o: file format elf32-tradlittlemips
Disassembly of section .text:

00000000 <pick_random>:
   0:   27bdfff8   addiu sp,sp,-8
   4:   afbe0000   sw s8,0(sp)
   8:   03a0f021   move s8,sp
   c:   3c020000   lui v0,0x0
  10:   8c420008   lw v0,8(v0)
  14:   03c0e821   move sp,s8
  18:   8fbe0000   lw s8,0(sp)
  1c:   27bd0008   addiu sp,sp,8
  20:   03e00008   jr ra
  24:   00000000   nop

00000028 <square>:
  28:   27bdfff8   addiu sp,sp,-8
  2c:   afbe0000   sw s8,0(sp)
  30:   03a0f021   move s8,sp
  34:   afc40008   sw a0,8(s8)
  …
Reading the disassembly above:
• Columns are address, instruction, and disassembly, e.g. Mem[8] = instruction 0x03a0f021 (move s8,sp)
• pick_random has a prologue (0 to 8), a body (c to 10), and an epilogue (14 to 24)
• <pick_random> and <square> are symbols
• The address used by the lui/lw pair at c and 10 is resolved (fixed) later
csug01 ~$ mipsel-linux-objdump --syms math.o
math.o: file format elf32-tradlittlemips

SYMBOL TABLE:
00000000 l    df *ABS*          00000000 math.c
00000000 l    d  .text          00000000 .text
00000000 l    d  .data          00000000 .data
00000000 l    d  .bss           00000000 .bss
00000000 l    d  .mdebug.abi32  00000000 .mdebug.abi32
00000008 l     O .data          00000004 randomval
00000060 l     F .text          00000028 is_prime
00000000 l    d  .rodata        00000000 .rodata
00000000 l    d  .comment       00000000 .comment
00000000 g     O .data          00000004 pi
00000004 g     O .data          00000004 e
00000000 g     F .text          00000028 pick_random
00000028 g     F .text          00000038 square
00000088 g     F .text          0000004c pick_prime
00000000         *UND*          00000000 username
00000000         *UND*          00000000 printf
Reading the symbol table above:
• First column: address
• l: local, g: global
• F: func, O: obj
• Then the segment, then the size
• e.g. is_prime is a static local func @ addr = 0x60, size = 0x28 bytes
• *UND* marks an external reference (username, printf)
• Compiler: creates assembly files ✔
• Assembler: creates object files (= machine code) ✔
• Linker: joins object files into one executable
• Loader: brings executable into memory and starts executing a process

How do we link together separately compiled and assembled machine object files?
Linker combines object files into an executable file
• Relocate each object's text and data segments
• Resolve as-yet-unresolved symbols
• Record top-level entry point in executable file

End result: a program on disk, ready to execute
• E.g. ./calc (Linux)
  ./calc.exe (Windows)
  simulate calc (class MIPS simulator)
main.o
  .text:            ... 0C000000 21035000 1b80050C 8C040000 21047002 0C000000 ...
  Symbol tbl:       00 T main; 00 D uname; *UND* printf; *UND* pi
  Relocation info:  40, JAL, printf; 4C, LW/gp, pi; 50, JAL, square

math.o
  .text:            ... 21032040 0C000000 1b301402 3C040000 34040000 ...
  Symbol tbl:       20 T square; 00 D pi; *UND* printf; *UND* uname
  Relocation info:  28, JAL, printf; 30, LUI, uname; 34, LA, uname

printf.o
  .text:            ...
  Symbol tbl:       3C T printf

External references need to be resolved (fixed)

Steps
1) Find UND symbols in symbol table
2) Relocate segments that collide
   e.g. uname @0x00, pi @0x00, square @0x00, main @0x00
calc.exe (linker output for main.o, math.o and printf.o above)
  Header:  Entry: 0040 0100; text: 0040 0000; data: 1000 0000
  .text
    00400000 (math):    ... 21032040 0C40023C 1b301402 3C041000 34040004
                        JAL printf; LA uname becomes LUI 1000, ORI 0004
    00400100 (main):    ... 0C40023C 21035000 1b80050c 8C048004 21047002 0C400020
                        JAL printf; LW $4,-32764($gp) ($4 = pi); JAL square
    00400200 (printf):  ... 10201000 21040330 22500102
  .data
    10000000, 10000004: 00000003 0077616B (uname, pi)
Object File

Header
• location of main entry point (if any)
Text Segment
• instructions
Data Segment
• static data (local/global vars, strings, constants)
Relocation Information
• Instructions and data that depend on actual addresses
• Linker patches these bits after relocating segments
Symbol Table
• Exported and imported references
Debugging Information
Unix
• a.out
• COFF: Common Object File Format
• ELF: Executable and Linking Format
• …

Windows
• PE: Portable Executable

All support both executable and object files
• Compiler: creates assembly files ✔
• Assembler: creates object files (= machine code) ✔
• Linker: joins object files into one executable ✔
• Loader: brings executable into memory and starts executing a process
Loader reads executable from disk into memory
• Initializes registers, stack, arguments to first function
• Jumps to entry-point

Part of the Operating System (OS)
Static Library: Collection of object files (think: like a zip archive)

Q: But every program contains entire library!
A: Linker picks only object files needed to resolve undefined references at link time

e.g. libc.a contains many objects:
• printf.o, fprintf.o, vprintf.o, sprintf.o, snprintf.o, …
• read.o, write.o, open.o, close.o, mkdir.o, readdir.o, …
• rand.o, exit.o, sleep.o, time.o, …
Q: But every program still contains part of library!
A: shared libraries
• executable files all point to single shared library on disk
• final linking (and relocations) done by the loader

Optimizations:
• Library compiled at fixed non-zero address
• Jump table in each program instead of relocations
• Can even patch jumps on-the-fly
Direct call:
00400010 <main>:
    ...
    jal 0x00400330
    ...
    jal 0x00400620
    ...
    jal 0x00400330
    ...
00400330 <printf>:
    ...
00400620 <gets>:
    ...

Drawbacks: Linker or loader must edit every use of a symbol (call site, global var use, …)
Idea: Put all symbols in a single "global offset table" (GOT)

Code does lookup as needed

Indirect call:
00400010 <main>:
    ...
    lw $t9,-32708($gp)      # was: jal 0x00400330
    jalr $t9
    ...
    lw $t9,-32704($gp)      # was: jal 0x00400620
    jalr $t9
    ...
    lw $t9,-32708($gp)      # was: jal 0x00400330
    jalr $t9
    ...
00400330 <printf>:
    ...
00400620 <gets>:
    ...

# data segment
# global offset table, to be loaded at -32712($gp)
.got
    .word 0x00400010   # main    (offset 0)
    .word 0x00400330   # printf  (offset 4): printf = 4+(-32712)+$gp
    .word 0x00400620   # gets    (offset 8): gets = 8+(-32712)+$gp
Indirect call with on-demand dynamic linking:
00400010 <main>:
    ...
    # load address of prints from .got[1]
    lw t9, -32708(gp)
    # also load the index 1
    li t8, 1
    # now call it
    jalr t9
    ...

.got
    .word 00400888   # open
    .word 00400888   # prints
    .word 00400888   # gets
    .word 00400888   # foo

...
00400888 <dlresolve>:
    # t9 = 0x400888
    # t8 = index of func that needs to be loaded

    # load that func
    ...              # t7 = loadfromdisk(t8)

    # save func's address so next call goes direct
    ...              # got[t8] = t7

    # also jump to func
    jr t7
    # it will return directly to main, not here
Windows: dynamically loaded library (DLL)
• PE format

Unix: dynamic shared object (DSO)
• ELF format

Unix also supports Position Independent Code (PIC)
– Program determines its current address whenever needed (no absolute jumps!)
– Local data: access via offset from current PC, etc.
– External data: indirection through Global Offset Table (GOT)
– … which in turn is accessed via offset from current PC
Static linking
• Big executable files (all/most of needed libraries inside)
• Don't benefit from updates to library
• No load-time linking

Dynamic linking
• Small executable files (just point to shared library)
• Library update benefits all programs that use it
• Load-time cost to do final linking
– But dll code is probably already in memory
– And can do the linking incrementally, on-demand