Compilers and Software Security. Gaurav S. Kc, gskc@cs.columbia.edu, http://www.cs.columbia.edu/~gskc. Programming Systems Lab. Tuesday, 22nd April 2003.

Dec 16, 2015

Page 1: Compilers and Software Security

Gaurav S. Kc
gskc@cs.columbia.edu
http://www.cs.columbia.edu/~gskc

Programming Systems Lab

Tuesday, 22nd April 2003

Page 2: Outline

Security
Runtime Management of Processes
Vulnerabilities and Attack Techniques
Compilers 4115
Security Research
Conclusion

Page 3: Security

What does security mean?
– Focus: Security of resources
  • No unauthorised access (using authentication)
  • Availability for authorised users (no DoS)
– Also: Security of data during transit
  • Protection from eavesdropping
  • Protection from malformation
  • Solutions: PKI for encryption, digital signatures for non-repudiation

Page 4: Security: Models & Threats

Social aspects of security failure
– 3Bs: Burglary, Bribery, Brutality
– Social Engineering

Threats to Security During Transit
– Man-in-the-middle attack
  • Identity spoofing / Masquerading
  • Packet sniffing
  • Communication replay

Page 5: Threats to Application Security

Trojan Horses
Malicious, security-breaking program disguised as something benign, like a screen saver or game program
– Keystroke loggers & powerful remote-control utilities like Back Orifice
– Abnormal system behaviour, e.g. open server socket, CTRL-ALT-DEL signal handler
– Zombie nodes, awaiting instructions for conducting DDoS

Computer Viruses
Executable code that, when run by someone, infects or attaches itself to other executable code in a computer in an effort to reproduce itself
– Can be malicious: erase files, lock up systems
– Boot Sector, File, Macro, Multipartite, Polymorphic, Stealth
– Anti-virus: search for known signatures in suspect files

Page 6: Threats to Application Security 2

Internet Worms
A worm is a self-replicating program that does not alter files, but resides in active memory and duplicates itself by means of computer networks
– Morris Worm (RTM) exploited fingerd, sendmail, weak passwords
– Code Red exploited a (publicised) vulnerability in Microsoft IIS
– Code Red II had a Trojan payload
– Nimda: Swiss Army knife of worms: worm, virus, trojan! Spread via its own e-mail engine, IIS servers that it scanned, and shared disks on corporate networks.

Common Trait:
Well-crafted input data can let you take control of a computer
– WinNuke: for rebooting remote Win95 machine :)

Page 7: Outline

Security
Runtime Management of Processes
Vulnerabilities and Attack Techniques
Compilers 4115
Security Research
Conclusion

Page 8: Process Runtime

Program

int main(int argc, char *argv[], char *env[]) {
    return 0;
}

[Figure: process address space from 0xffffffff down to 0x00000000. Kernel space lies above 0xbfffffff; below it come env[] and argv[], then char *env[], char *argv[] and int argc, the runtime stack, the runtime heap, .bss, .data, and .text starting at 0x08048000.]

x86
– 32-bit von Neumann machine
– 2^32 ≈ 4G memory locations

Breakdown of process space

Stack
– <= 0xbfffffff, grows downwards
– Environment variables, program parameters
– Automatically allocated stack variables
– Activation records

Heap
– Dynamic allocation
– Explicitly through malloc, free

Page 9: Process Runtime 2

[Figure: process address space (kernel space, env[], argv[], char *env[], char *argv[], int argc, runtime stack, runtime heap, .bss, .data, .text; 0x00000000 to 0xffffffff, stack top at 0xbfffffff, .text at 0x08048000), annotated with: Block Started by Symbol (static & global uninitialised data), Data Section (static & global initialised data), Text Section (executable machine code).]

.bss
– name comes from an assembler directive of the IBM 704 assembler
– runtime allocation of space
– RWX

.data
– compile-time space allocation, and initialisation values
– RWX

.text
– program code
– runtime DLLs
– RO, X

.rodata
– RO, X
– constants, e.g. const int x = 4; "hello, world"

Page 10: Activation Records

Subroutines
– functions and procedures
– abstraction of computation
– structured programming concept

Stack frame, Function frame, Activation frame
– Block of stack space reserved for duration of function

Logical stack frames are crucial for implementing subroutines
– Each frame contains information related to the context of the given function. The stack grows downwards for each nested invocation.

Reserved registers
– %eip (next instruction), %esp (top of stack), %ebp (frame base, fixed offsets)

Page 11: Activation Records 2

Source function

#include <string.h>
#define SIZE 9

void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    strcpy(buffer, s);
    return;
}

int main(void) {
    function("yep", 2.f, 93);
    return 0;
}

Visualisation of the runtime stack frame (higher addresses first)

function parameters
    16(%ebp)    int x
    12(%ebp)    float y
     8(%ebp)    char *s
return address
                ret. addr: 0x0abcdef0
old frame pointer
                old fp: 0x4fedcba8
automatic variables
   -12(%ebp)    int a
   -16(%ebp)    int b
   -40(%ebp)    char buffer[SIZE]
   -44(%ebp)    int c

Registers marked in the figure: PC, FP, SP

Page 12: Activation Records 3

Source function

#define SIZE 9

void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    strcpy(buffer, s);
    return;
}

int main(void) {
    function("yep", 2.f, 93);
    return 0;
}

Assembly equivalent

.LC0:
    .string "yep"

main:
    ...
    pushl $93                # building the stack frame: int x
    pushl $0x40000000        # float y (2.0f)
    pushl $.LC0              # char *s
    call function
    ...

function:
    pushl %ebp               # prologue
    movl %esp, %ebp
    subl $56, %esp
    subl $8, %esp            # function body
    pushl 8(%ebp)            # s
    leal -40(%ebp), %eax     # buffer
    pushl %eax
    call strcpy
    addl $16, %esp
    leave                    # epilogue
    ret

Page 13: Outline

Security
Runtime Management of Processes
Vulnerabilities and Attack Techniques
Compilers 4115
Security Research
Conclusion

Page 14: Vulnerabilities

C: Low-level, high-level systems language
Efficient execution, usable for real-time solutions

Pointers and Arrays
– Pointer to (null-terminated?) block of memory

Lack of bounds checking
– Buffer overflow causes havoc

Page 15: Attack Techniques

Criteria for successful attack
– Locate a buffer that has an unsafe operation applied to it
– Well-crafted input data to trigger the overflow

Buffer overrun vulnerabilities
– Stack-based: Stack-smashing attack
– Heap-based: Function pointers, C++ virtual pointers, Exception handlers (CodeRed)

Format-string exploits
– %n format converter for *printf family of functions
– writes the number of bytes output so far to the %n argument (int *)

printf("\x70\xf7\xff\xbf%%n"); // 0xbffff770 := 4

Page 16: Smashing the Stack

void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    ...;
    strcpy(buffer, s);
    ...
}

[Figure: stack frame of function (int x, float y, char *s; ret. addr: 0x0abcdef0; old fp: 0x4fedcba8; int a, int b; char buffer[SIZE], int c) after the attack: the buffer holds injected code, exec("/bin/sh"), the return address is overwritten with 0xBadAdda0, and PC follows it.]

Stack-smashing attack
• Buffer overrun
• Code injection
• Return address overwritten

To overflow an (automatic) stack buffer, one would need:
– Shellcode, i.e. characters representing machine code (obtain from gdb, as)
– Memory location of injected shellcode (typically buffer address)

Can approximate to make up for lack of precise information
– nop instructions at the beginning of the shellcode
– overwrite locations around 0(%ebp) with shellcode address

Targets: suid-installed programs. Shellcode: shell, export xterm display

Page 17: Heap-Based Attacks

Function pointer
– Higher address: function pointer
– Lower address: buffer

[Figure: .bss holding char buffer[ ] and, above it, int (*f)(void)]

C++ pointer to vtable
– Higher address: virtual pointer
– Lower address: buffer

[Figure: C++ object holding char buffer[ ] and void *vptr]

class ABC {
    char buffer[10];
public:
    virtual void print() { cout << buffer; }
    void set(char *s) { strcpy(buffer, s); }
};

int main(int argc, char *argv[]) {
    static char buffer[10];
    static int (*f)(void) = exit;
    // gets(buffer);
    strcpy(buffer, argv[1]);
    (*f)();

    ABC *abc = new ABC();
    abc->set(argv[1]);
    abc->print();
}

Page 18: Outline

Security
Runtime Management of Processes
Vulnerabilities and Attack Techniques
Compilers 4115
Security Research
Conclusion

Page 19: Compilers 4115

GCC: GNU Compiler Collection
– Just a wrapper for different phases
  • cpp: C preprocessor
    program.c → program.i
  • cc1: C compiler proper
    program.i → program.s
  • as: Assembler (a.out, ELF relocatable files)
    program.s → program.o
  • ld: Link editor (ELF executables)
    program.o → program

Page 20: GCC

Command line options
gcc -save-temps (-pipe) -Wall -O0 -dr -v -static -I$HOME/include -L$HOME/lib -lsocket -lm -lpthread

Standard libraries
/lib/libc.so.6, /lib/ld-linux.so.2

Standard library header files
/usr/include

Page 21: Other tools

GNU Debugger: gdb

GNU Binutils
– objcopy: add/remove ELF sections
– readelf, objdump: print ELF information

Miscellaneous
– ldd: list dynamic dependencies (DLLs)
– strace: trace syscall invocations

Page 22: Outline

Security
Runtime Management of Processes
Vulnerabilities and Attack Techniques
Compilers 4115
Security Research
Conclusion

Page 23: Security Research

Know thy enemy
– Monitor the attacker's behaviour and tactics
– In a constrained resource environment

Honeypots
– Illusion of an "easy target" to lure attackers

Jail
– Sandboxed environment using chroot
– All necessary files are available locally

Virtual machines

Sandboxes with limited syscalls

Page 24: Automatic Defence Mechanisms

Face thy enemy
– Applications fortified with runtime checks

StackGuard, MemGuard, .NET cl.exe /GS
– "canary" word to detect stack-smashing
– READONLY stack frame
– .NET C/C++ compiler protects 0(%ebp), 4(%ebp)

Libsafe, Libverify
– "safe" implementation of standard libraries
– runtime backup/checking of return address

Page 25: Defence through Diversity

Code Diversity
– Code randomisation for diversity
– Security through obscurity even for open-source software
– No more: breach once, breach everywhere

Compiler-based Protection
– Secure the stack data
– Potentially vulnerable heap data

Page 26: Casper

Paper: Casper: Compiler-assisted securing of programs at runtime

Via added runtime checks as part of function invocations

Add protection code
Protect what: control data in stack frames
What from: most stack-smashing attacks
Available as patches:
• Compiler: gcc-2.95
• Debugger: gdb-5.2.1

Page 27: Casper in Action

Similar in nature to StackGuard, but with much smaller overhead

XOR property: self-inverse, i.e. applying the same mask twice restores the original value. Simplest form of encryption / obfuscation of data

[Figure: stack frame of function (int x, float y, char *s; ret. addr: 0x0abcdef0; old fp: 0x4fedcba8; int a, int b; char buffer[SIZE], int c) with PC and the masked return address.]

Casper protection
• Mask original return address value when entering function
• Unmask and restore the original return address value when returning from function
• Overwritten value will be "restored" to invalid code address

ret. addr := 32-bit mask XOR ret. addr

Page 28: Get the Processor Involved

Paper: Countering Code-Injection Attacks With Instruction-Set Randomization

Machine instruction translation, unique per process

Reversible mapping: machine instruction ↔ garbage bit sequence

1. Post-compilation stage
  • Encode all executable sections with key
  • Store codec key in file header
2. Modified von Neumann: fetch, decrypt, decode, execute
  • decrypt: "Processor" restores each block of bytes to valid, original instruction
  • Injected code gets probabilistically transformed to garbage bit-sequence that cannot be decoded

Page 29: Binary Encryption and Execution

[Figure: SOURCE CODE → (compile) → MACHINE EXECUTABLE FILE → (encrypt with key via objcopy) → ENCRYPTED EXECUTABLE FILE → (fetch, decrypt with key)]

Page 30: Binary Encryption and Execution 2

Bochs Pentium emulator is the "modified machine"
– Support for hidden register %gav
– Interrupt routine handler saves %gav to process structure

Linux 2.2.14
– Kernel recognises new register
– Support for register in process structure

as and objcopy for program encryption and codec storage

Page 31: Future Work

Randomised ISA on real machine
– Programmable Transmeta chips
– Dynamo: Dynamic optimiser of native code

Activation records
– automatically managed, randomised layout

Heap smashing techniques
– break type-system
– corrupt malloc data

Diversified research
– Languages, Compilers: C++, Sun CC, Visual C++
– Other architectures: Solaris, Alpha (DLX ;-)

Page 32: Conclusion

Security
– Process Security

Runtime Management of Processes
– Stack, Heap, Activation Records

Vulnerabilities and Attack Techniques
– Buffer overrun. Stack-smashing. Pointer overwriting.

Compilers 4115
– GCC, GDB, Binutils

Security Research
– Monitoring. Runtime protection

Page 33: References

1. The Bochs Pentium emulator. http://bochs.sourceforge.net/

2. Aleph One. Smashing The Stack For Fun And Profit. http://www.phrack.org/show.php?p=49&a=14

3. Arash Baratloo, N. Singh, T. Tsai. Transparent Run-Time Defense Against Stack Smashing Attacks

4. Crispin Cowan, M. Barringer, et al. FormatGuard: Automatic Protection From printf Format String Vulnerabilities

5. Crispin Cowan, Calton Pu, et al. StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks

6. Gaurav S. Kc, Stephen A. Edwards, Gail E. Kaiser, Angelos Keromytis. Casper: Compiler-assisted securing of programs at runtime

7. Gaurav S. Kc, Angelos D. Keromytis, Vassilis Prevelakis. Countering Code-Injection Attacks With Instruction-Set Randomization

Page 34: Optimisation of Tail-Recursion

C source code (not tail-recursive)

int factorial(int n) {
    if (1 >= n) return 1;
    return n * factorial(n - 1);
}
int val = factorial(x);

Assembly

factorial:
    ...
    pushl n-1
    call factorial
    ...

C source code (tail-recursive, with accumulator v)

int factorial(int n, int v) {
    if (1 >= n) return v;
    return factorial(n - 1, v * n);
}
int val = factorial(x, 1);

Assembly

factorial:
    ...
    v := v*n
    n := n-1
    goto factorial

Page 35: x86 Processor

Dual integer pipeline

Hidden register %eip does not always fetch the "next" instruction

Page 36: Binary Encryption Code: GNU as

if [ -z "$1" ]; then echo "usage: $0 <ELF_executable_image> [key]"; exit 1; fi
if [ -z "$2" ]; then XOR_KEY="0x$RANDOM"; else XOR_KEY="$2"; fi

# file names
NEW_FILE="$1.$XOR_KEY"
ORG_FILE="$1"
INTERMEDIATE="$XOR_KEY.o"

# modified binary (--encrypt-xor-key is not an option of stock objcopy)
OBJCOPY=/home/gskc/usr/binutils-2.13.2/bin/objcopy

# create an intermediate ELF object file with an .xor.stuff section
as -o "$INTERMEDIATE" <<EOF
.section .xor.stuff
.long $XOR_KEY
EOF

# merge the .xor.stuff section into the specified file
"$OBJCOPY" --encrypt-xor-key "$XOR_KEY" --add-section .xor.stuff="$INTERMEDIATE" "$ORG_FILE" "$NEW_FILE"

# clean up
rm -f "$INTERMEDIATE"