The True Story of Hello World

Post on 20-May-2015


DESCRIPTION

The story behind Long Lele's hello world

Transcript

Preview: the True Story of Hello World?

Are you kidding?

Web Cookie: Google Native Client -- native code for web apps

Native Cookie: Structure and Interpretation of Computer Programs

Related topics

1. Loader and Linker
2. ELF
3. LLVM
4. Library
5. More behind the scenes

Wrenches
readelf objdump nm file strings strace strip ld

NASA did this after Bush cancelled the back-to-the-Moon plan

news updated

You can interrupt me at any time rather than waiting for the Q&A section. That makes it easier to refer to the context and avoids cutting things off and repeating them in the Q&A.

I am still a beginner myself, so please correct any mistakes.

Please read the notes that accompany the talk beforehand. If you have not read them yet and have a computer at hand, you can also follow along during the talk.

The True Story of Hello World (or at least a good part of it)

original story, zh_CN

original author: Antônio Augusto M. Fröhlich

Off topic: about the author

Prof. Dr. Antônio Augusto Fröhlich
LISHA, Software/Hardware Integration Lab, Federal University of Santa Catarina

Courses taught:
Object-Oriented Programming
System Programming
Operating Systems
Computational Biology (parallel programming)

Hello World in C

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("Hello world!\n");
    return 0;
}

#include: header file directive

main: the so-called entry point

printf: C library call (buffered)

return 0: tells the OS that everything is OK

Let's compile, link, and run it as a beginner

# compile: -c produces an object file (Optimization in GCC and here)
% gcc -Os -c hello.c
# link
% ld --dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/crtn.o -lc hello.o -o hello
# run
% ./hello

How about LLVM?

# compile the C file into a native executable:
% llvm-gcc hello.c -o hello
# compile the C file into an LLVM bitcode file:
% llvm-gcc -O3 -emit-llvm hello.c -c -o hello.bc
# run the program using the just-in-time compiler:
% lli hello.bc

LLVM Getting Started
Writing your own toy compiler
zh_CN: Writing your own compiler with Flex, Bison, and LLVM

% objdump -hrt hello.o

hello.o:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00000027  00000000  00000000  00000034  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data            00000000  00000000  00000000  0000005c  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  2 .bss             00000000  00000000  00000000  0000005c  2**2
                     ALLOC
  3 .rodata.str1.1   0000000e  00000000  00000000  0000005c  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment         00000024  00000000  00000000  0000006a  2**0
                     CONTENTS, READONLY
  5 .note.GNU-stack  00000000  00000000  00000000  0000008e  2**0
                     CONTENTS, READONLY
SYMBOL TABLE:
00000000 l    df *ABS*            00000000 hello.c
00000000 l    d  .text            00000000 .text
00000000 l    d  .data            00000000 .data
00000000 l    d  .bss             00000000 .bss
00000000 l    d  .rodata.str1.1   00000000 .rodata.str1.1
00000000 l    d  .note.GNU-stack  00000000 .note.GNU-stack
00000000 l    d  .comment         00000000 .comment
00000000 g     F .text            00000027 main
00000000         *UND*            00000000 __printf_chk

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE        VALUE
00000012 R_386_32    .rodata.str1.1
00000019 R_386_PC32  __printf_chk

% readelf -l hello

Elf file type is EXEC (Executable file)
Entry point 0x80482e0
There are 7 program headers, starting at offset 52

Program Headers:
  Type       Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR       0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
  INTERP     0x000114 0x08048114 0x08048114 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD       0x000000 0x08048000 0x08048000 0x003d6 0x003d6 R E 0x1000
  LOAD       0x0003d8 0x080493d8 0x080493d8 0x000e8 0x000e8 RW  0x1000
  DYNAMIC    0x0003d8 0x080493d8 0x080493d8 0x000c8 0x000c8 RW  0x4
  NOTE       0x000128 0x08048128 0x08048128 0x00020 0x00020 R   0x4
  GNU_STACK  0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata
   03     .dynamic .got .got.plt .data
   04     .dynamic
   05     .note.ABI-tag
   06

A more detailed explanation

An example of an LKM with various ELF sections

Executable and Linking Format (ELF)

% file hello
ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped

1. AMD-64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), for GNU/Linux 2.6.8, dynamically linked (uses shared libs), not stripped
2. RISC_6000: executable (RISC System/6000 V3.1) or obj module not stripped
3. UltraSPARC: ELF 32-bit MSB executable, SPARC32PLUS, V8+ Required, version 1 (SYSV), dynamically linked (uses shared libs), not stripped
4. loongson: ELF 32-bit LSB executable, MIPS, MIPS-I version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked (uses shared libs), not stripped

The file format data above was collected from unix-center.net.

Sections and Segments

Hello World in assembly

% gcc -Os -S hello.c -o -
        .file   "hello.c"
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "Hello world!\n"
        .text
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $12, %esp
        pushl   $.LC0
        pushl   $1
        call    __printf_chk
        movl    -4(%ebp), %ecx
        xorl    %eax, %eax
        leave
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
        .section        .note.GNU-stack,"",@progbits

Look into

% objdump -s hello
Contents of section .interp:
 8048114 2f6c6962 2f6c642d 6c696e75 782e736f  /lib/ld-linux.so
 8048124 2e3200                               .2.
Contents of section .dynstr:
 80481c0 006c6962 632e736f 2e36005f 494f5f73  .libc.so.6._IO_s
 80481d0 7464696e 5f757365 64005f5f 6c696263  tdin_used.__libc
 80481e0 5f737461 72745f6d 61696e00 5f5f7072  _start_main.__pr
 80481f0 696e7466 5f63686b 005f5f67 6d6f6e5f  intf_chk.__gmon_
 8048200 73746172 745f5f00 474c4942 435f322e  start__.GLIBC_2.
 8048210 3000474c 4942435f 322e332e 3400      0.GLIBC_2.3.4.
.......
Contents of section .rodata:
 80483c0 03000000 01000200 48656c6c 6f20776f  ........Hello wo
 80483d0 726c6421 0a00                        rld!..
Contents of section .comment:
 0000 4743433a 20285562 756e7475 20342e34  GCC: (Ubuntu 4.4
 0010 2e332d34 7562756e 74753529 20342e34  .3-4ubuntu5) 4.4
 0020 2e3300                               .3.

Entry point

% ld -o hello_ld hello.o -lc
ld: warning: cannot find entry symbol _start; defaulting to 0000000008048074
% ./hello_ld
bash: ./hello_ld: No such file or directory
% gcc -nostdlib -o hello-nostdlib hello.c
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 00000000080480b8
/tmp/ccmKwryA.o: In function `main':
hello.c:(.text+0x11): undefined reference to `puts'
collect2: ld returned 1 exit status

So what does ld forget here?

Difference between hello and hello-ld

% nm hello
080493d8 d _DYNAMIC
080494a4 d _GLOBAL_OFFSET_TABLE_
080483c4 R _IO_stdin_used
080494c0 A __bss_start
080494bc D __data_start
         w __gmon_start__
0804837a T __i686.get_pc_thunk.bx
080493d6 d __init_array_end
080493d6 d __init_array_start
08048310 T __libc_csu_fini
08048320 T __libc_csu_init
         U __libc_start_main@@GLIBC_2.0
         U __printf_chk@@GLIBC_2.3.4
080494c0 A _edata
080494c0 A _end
080483a8 T _fini
080483c0 R _fp_hw
08048278 T _init
080482e0 T _start
080494bc W data_start
08048380 T main

% nm hello-ld
080491e4 d _DYNAMIC
08049284 d _GLOBAL_OFFSET_TABLE_
08049294 A __bss_start
         U __printf_chk@@GLIBC_2.3.4
08049294 A _edata
08049294 A _end
         U _start
080481ac T main

% strace ./hello
execve("./hello", ["./hello"], [/* 43 vars */]) = 0
brk(0) = 0x848a000
uname({sys="Linux", node="lele-laptop", ...}) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7757000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=170053, ...}) = 0
mmap2(NULL, 170053, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb772d000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/cmov/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000m\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1405508, ...}) = 0
mmap2(NULL, 1415592, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb84000
mprotect(0xcd7000, 4096, PROT_NONE) = 0
mmap2(0xcd8000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x153) = 0xcd8000
mmap2(0xcdb000, 10664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xcdb000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb772c000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb772c6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xcd8000, 8192, PROT_READ) = 0
mprotect(0xea6000, 4096, PROT_READ) = 0
munmap(0xb772d000, 170053) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7756000
write(1, "Hello world!\n", 13) = 13
exit_group(0) = ?

Layout of Hello World

Internal Symbols

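extern char etext, edata, end, __executable_start;   /* symbols provided by the linker */

printf("(end of text)etext:%p\n", (void *)&etext);
printf("(end of data)edata:%p\n", (void *)&edata);
printf("(end of segments)end:%p\n", (void *)&end);
printf("(__executable_start):%p\n", (void *)&__executable_start);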

(end of text)etext:0x80486b8
(end of data)edata:0x804a028
(end of segments)end:0x804c060
(__executable_start):0x8048000

Child and Parent (wait4)

% strace -e trace=process -f sh -c "hello" > /dev/null
execve("/bin/sh", ["sh", "-c", "hello"], [/* 43 vars */]) = 0
clone(Process 6067 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb781a938) = 6067
[pid 6066] wait4(-1, Process 6066 suspended
 <unfinished ...>
[pid 6067] execve("/usr/bin/hello", ["hello"], [/* 43 vars */]) = 0
[pid 6067] exit_group(0) = ?
Process 6066 resumed
Process 6067 detached
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6067
--- SIGCHLD (Child exited) @ 0 (0) ---
exit_group(0)

Can we do it without the C library?

% more hello-nostd.c
void _start() {
    /* exit system call */
    asm("movl $1,%eax;"
        "xorl %ebx,%ebx;"
        "int $0x80"
    );
}

int main(int argc, char *argv[])
{
    char* str="Hello world!\n";
    return 0;
}
% gcc -nostdlib hello-nostd.c -o hello-nostd

To call main from _start with asm (call main), see: http://blog.ksplice.com/2010/03/libc-free-world/

Freak out

1. gcc -v
2. ld --verbose
3. objdump -d (man objdump)

Web References

1. A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux
2. Hello from a libc-free world! (part one and part two)
3. bash: ./hello-ld: No such file or directory
4. Linkers and Loaders (Linux Journal)
5. Structure and Interpretation of Computer Programs
6. Linux Standard Base (LSB)

Books

1. Linkers and Loaders, John R. Levine, 2000
2. 程序员的自我修养: 链接、装载与库 (A Programmer's Self-Cultivation: Linking, Loading and Libraries), Yu Jiazi / Shi Fan / Pan Aimin, 2009
3. Computer Systems: A Programmer's Perspective, Randal E. Bryant / David O'Hallaron, 2003
