Bitblaze Alex Bazhanyuk, @Abazhanyuk “RE” school, DefCon-UA, 2012
Transcript
Page 1: 4

Bitblaze

Alex Bazhanyuk, @Abazhanyuk

“RE” school, DefCon-UA, 2012

Page 2: 4

Motivation

● Service

● Kernel space: kernel, drivers

● Mobile

● Embedded

Page 3: 4

Goal

● vulnerabilities

● malware

Page 4: 4

What we have?

● Taint analysis

● Symbolic execution

● Solver: STP, SMT, Z3

● Data-flow analysis, special case for graph analysis

Page 5: 4

Theory of taint analysis

Page 6: 4

One of the data-tracing techniques

Taint sources:

Network, Keyboard, Memory, Disk, Function outputs

• Taint propagation: a data flow technique

Shadow memory

Whole-system

Across register/memory/disk/swapping

Page 7: 4

Fundamentals of taint analysis

Page 8: 4

Taint propagation

•If an operation uses the value of some tainted object, say X, to derive a value for another object, say Y, then Y becomes tainted: object X taints object Y.

•Taint operator t

•X→ t(Y)

•Taint operator is transitive

If X → t(Y) and Y → t(Z), then X → t(Z)
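The propagation rule and its transitivity can be illustrated with a toy shadow memory. This is only a sketch: the class name, the location names and the "network" label are made up for illustration and are not part of TEMU.

```python
class TaintTracker:
    """Toy shadow memory: maps a location name to the set of taint labels it carries."""

    def __init__(self):
        self.shadow = {}

    def taint_source(self, location, label):
        # Mark a location as tainted by an external source.
        self.shadow[location] = {label}

    def propagate(self, dest, *sources):
        # dest is derived from sources, so it inherits the union of their labels.
        labels = set()
        for src in sources:
            labels |= self.shadow.get(src, set())
        if labels:
            self.shadow[dest] = labels
        else:
            self.shadow.pop(dest, None)  # overwritten with untainted data

    def is_tainted(self, location):
        return bool(self.shadow.get(location))


tracker = TaintTracker()
tracker.taint_source("X", "network")
tracker.propagate("Y", "X")       # Y = f(X)     : X -> t(Y)
tracker.propagate("Z", "Y", "W")  # Z = g(Y, W)  : Y -> t(Z), hence X -> t(Z)
tracker.propagate("V", "W")       # V = h(W)     : W is clean, so V stays clean
```

Z ends up tainted although it never reads X directly, which is exactly the transitivity of t.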

Page 9: 4

Static Taint Analysis

Analysis performed over multiple paths of a program

* Typically performed on a control flow graph (CFG):

statements are nodes, and there is an edge between nodes if there is a possible transfer of control.
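A minimal sketch of static taint analysis over a CFG, assuming a hypothetical five-statement program (the node names and the tuple layout are invented for this example). Taint sets are merged at join points, so a variable counts as tainted if it may be tainted on at least one path.

```python
from collections import deque

# Hypothetical CFG: node -> (defined variable, used variables, successors)
CFG = {
    "n1": ("x", [], ["n2"]),             # x = read_input()   (taint source)
    "n2": (None, ["x"], ["n3", "n4"]),   # if x > 0
    "n3": ("y", ["x"], ["n5"]),          # y = x + 1
    "n4": ("y", [], ["n5"]),             # y = 0
    "n5": ("z", ["y"], []),              # z = y
}
SOURCES = {"n1"}


def static_taint(cfg, sources, entry):
    """Return, for every node, the variables that may be tainted after it."""
    tainted_in = {node: set() for node in cfg}
    tainted_out = {node: set() for node in cfg}
    visited = {entry}
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        defined, used, successors = cfg[node]
        out = set(tainted_in[node])
        if defined is not None:
            if node in sources or any(u in out for u in used):
                out.add(defined)       # derived from tainted data
            else:
                out.discard(defined)   # overwritten with clean data
        tainted_out[node] = out
        for succ in successors:
            changed = not out <= tainted_in[succ]
            tainted_in[succ] |= out    # union over all incoming paths
            if changed or succ not in visited:
                visited.add(succ)
                worklist.append(succ)
    return tainted_out


result = static_taint(CFG, SOURCES, "n1")
```

On the n4 branch y is clean, but after the join at n5 both y and z are reported, because the n3 path taints them.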

Page 10: 4
Page 11: 4
Page 12: 4

BitBlaze: Binary Analysis Infrastructure

Automatically extracting security-related properties from

binary code

Build a unified binary analysis platform for security

- Static analysis + Dynamic analysis + Symbolic Analysis

- Leverages recent advances in program analysis, formal methods, binary instrumentation…

Solve security problems via binary analysis

• More than a dozen different security applications

• Over 25 research publications

Page 13: 4

BitBlaze

http://bitblaze.cs.berkeley.edu/

TEMU,VINE

Rudder, Panorama, Renovo

Page 14: 4

TEMU

Page 15: 4

Limitations of TEMU

Only gcc-3.4

Qemu 0.9.1 - TEMU

Qemu 0.10 - TCG(Tiny Code Generator)-TODO

Qemu 0.10 != Qemu 1.01

Page 16: 4

Creating a trace of a network service:

$sudo ./tracecap/temu -m 256 -net nic,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup -monitor stdio winxp.img

(qemu) load_plugin tracecap/tracecap.so

general/trace_only_after_first_taint is enabled.

general/log_external_calls is disabled.

general/write_ops_at_insn_end is enabled.

general/save_state_at_trace_stop is disabled.

tracing/tracing_table_lookup is enabled.

tracing/tracing_tainted_only is disabled.

tracing/tracing_single_thread_only is disabled.

tracing/tracing_kernel is disabled.

tracing/tracing_kernel_tainted is disabled.

tracing/tracing_kernel_partial is disabled.

network/ignore_dns is disabled.

Enabled: 0x00 Proto: 0x00 Sport: 0 Dport: 0 Src: 0.0.0.0 Dst: 0.0.0.0

Loading plugin options from: SRC_PATH/tracecap/ini/hook_plugin.ini

Loading plugins from: SRC_PATH/shared/hooks/hook_plugins

tracecap/tracecap.so is loaded successfully!

(qemu) enable_emulation

Emulation is now enabled

(qemu) taint_nic 1

(qemu) tracebyname "FileZilla_server.exe" /traces/file_zilla

(Send data over the network)

(qemu) Time of first tainted data: 1289487729.962905

(qemu) trace_stop

Page 17: 4

VINE

Page 18: 4

The Vine Intermediate Language

Page 19: 4

fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 R@ecx[0x00000002][4](RCW) T0 M@0xfb7bfff8[0x00000000][4](CW) T1 {15 (1231, 69624) (1231, 69625) (1231, 69626) (1231, 69627) }

fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 R@ecx[0x00000001][4](RCW) T0 M@0xfb7bfffc[0x00000000][4](CW) T1 {15 (1231, 69628) (1231, 69629) (1231, 69630) (1231, 69631) }

fc32dcee: mov %edx,%ecx R@edx[0x0000015c][4](R) T0 R@ecx[0x00000000][4](W) T0

fc32dcf0: and $0x3,%ecx I@0x00000000[0x00000003][1](R) T0 R@ecx[0x0000015c][4](RW) T0

fc32dcf5: andl $0x0,-0x4(%ebp) I@0x00000000[0x00000000][1](R) T0 M@0xfb5ae738[0x00000002][4](RW) T0

fc32dcf9: jmp 0x00000000fc32c726 J@0x00000000[0xffffea2d][4](R) T0

fc32c726: cmpl $0x0,-0x58(%ebp) I@0x00000000[0x00000000][1](R) T0 M@0xfb5ae6e4[0x00000000][4](R) T0

fc32c72a: je 0x00000000fc32c369 J@0x00000000[0xfffffc3f][4](R) T0

fc32c369: mov 0xc(%ebp),%eax M@0xfb5ae748[0x81144e70][4](R) T0 R@eax[0x00000000][4](W) T0

fc32c36c: xor %edx,%edx R@edx[0x0000015c][4](R) T0 R@edx[0x0000015c][4](RW) T0

fc32c36e: cmp %edx,%eax R@edx[0x00000000][4](R) T0 R@eax[0x81144e70][4](R) T0

fc32c370: je 0x00000000fc32c3d0 J@0x00000000[0x00000060][4](R) T0

fc32c372: cmp %dl,-0x1d(%ebp) R@dl[0x00000000][1](R) T0 M@0xfb5ae71f[0x00000000][1](R) T0

fc32c375: jne 0x00000000fc32c3d0 J@0x00000000[0x0000005b][4](R) T0

fc32c377: cmp %dl,-0x1a(%ebp) R@dl[0x00000000][1](R) T0 M@0xfb5ae722[0x00000001][1](R) T0

fc32c37a: jne 0x00000000fc32c3a7 J@0x00000000[0x0000002d][4](R) T0

Example of disasm:

Page 20: 4

Taint

T0 - means that the operand is not tainted.

T1 - means that the operand is tainted; the curly brackets that follow show what is tainted and what it depends on.

Here's an example:

fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 R@ecx[0x00000001][4](RCW) T0 M@0xfb7bfffc[0x00000000][4](CW) T1 {15 (1231, 628) (1231, 629) (1231, 630) (1231, 631) }

One can see that 4 bytes of information are tainted and that they depend on the offsets 628, 629, 630, 631. 1231 is the number (name) of the taint origin.
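A small sketch of how the taint record of such a line could be pulled apart with regular expressions. It only assumes the textual layout visible in the example (a `T1` followed by braces holding one number and a list of `(origin, offset)` pairs); the leading number inside the braces is skipped without interpreting it, and the function name is hypothetical.

```python
import re

# "T1 {<number> (<origin>, <offset>) (<origin>, <offset>) ... }"
TAINT_RECORD = re.compile(r"T1 \{(\d+)((?: \(\d+, \d+\))+) ?\}")
PAIR = re.compile(r"\((\d+), (\d+)\)")


def parse_taint(line):
    """Return the (origin, offset) pairs of every tainted operand on a trace line."""
    pairs = []
    for match in TAINT_RECORD.finditer(line):
        pairs.extend((int(origin), int(offset))
                     for origin, offset in PAIR.findall(match.group(2)))
    return pairs


line = ("fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 "
        "R@ecx[0x00000001][4](RCW) T0 M@0xfb7bfffc[0x00000000][4](CW) T1 "
        "{15 (1231, 628) (1231, 629) (1231, 630) (1231, 631) }")
records = parse_taint(line)
untainted = parse_taint("fc32dcee: mov %edx,%ecx R@edx[0x0000015c][4](R) T0")
```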

Page 21: 4

appreplay

./vine-1.0/trace_utils/appreplay -trace font.trace -ir-out font.trace.il -assertion-on-var false -use-post-var false

where:

appreplay - the OCaml utility that we run;

-trace - the path to the trace;

-ir-out - the path to which the IL code is written;

-assertion-on-var false -use-post-var false - flags that control the format of the IL code; setting them to false makes the text more readable.

Page 22: 4

Example of IL code:

Begins with the declaration of variables:

INPUT - the free memory cells: those that were tainted at the very beginning (back in TEMU), i.e. the input that enters the program from an external source.

var cond_000017_0x4010ce_00_162:reg1_t;

var cond_000013_0x4010c3_00_161:reg1_t;

var cond_000012_0x4010c0_00_160:reg1_t;

var cond_000007_0x4010b6_00_159:reg1_t;

var INPUT_10000_0000_62:reg8_t;

var INPUT_10000_0001_63:reg8_t;

var INPUT_10000_0002_64:reg8_t;

var INPUT_10000_0003_65:reg8_t;

var mem_arr_57:reg8_t[4294967296]; – memory as an array

var mem_35:mem32l_t;

var R_EBP_0:reg32_t;

var R_ESP_1:reg32_t;

var R_ESI_2:reg32_t;

var R_EDI_3:reg32_t;

var R_EIP_4:reg32_t;

var R_EAX_5:reg32_t;

var R_EBX_6:reg32_t;

var R_ECX_7:reg32_t;

var R_EDX_8:reg32_t;

var EFLAGS_9:reg32_t;

var R_CF_10:reg1_t;

var R_PF_11:reg1_t;

var R_AF_12:reg1_t;

var R_ZF_13:reg1_t;

var R_SF_14:reg1_t;

var R_OF_15:reg1_t;

var R_CC_OP_16:reg32_t;

var R_CC_DEP1_17:reg32_t;

var R_CC_DEP2_18:reg32_t;

var R_CC_NDEP_19:reg32_t;

var R_DFLAG_20:reg32_t;

var R_IDFLAG_21:reg32_t;

var R_ACFLAG_22:reg32_t;

var R_EMWARN_23:reg32_t;

var R_LDT_24:reg32_t;

var R_GDT_25:reg32_t;

var R_CS_26:reg16_t;

var R_DS_27:reg16_t;

var R_ES_28:reg16_t;

var R_FS_29:reg16_t;

var R_GS_30:reg16_t;

var R_SS_31:reg16_t;

var R_FTOP_32:reg32_t;

var R_FPROUND_33:reg32_t;

var R_FC3210_34:reg32_t;

Page 23: 4

label pc_0x40143c_1: – a label whose name contains the address of the instruction

/*Filter IRs:*/

Now comes the filter initialization:

{

/*Initializers*/

R_EAX_5:reg32_t = 0x73657930:reg32_t;

{

var idx_144:reg32_t;

var val_143:reg8_t;

idx_144:reg32_t = 0x12fef0:reg32_t;

val_143:reg8_t = INPUT_10000_0000_62:reg8_t;

mem_arr_57[idx_144:reg32_t + 0:reg32_t]:reg8_t =

cast((val_143:reg8_t & 0xff:reg8_t) >> 0:reg8_t)L:reg8_t;

}

{

var idx_146:reg32_t;

var val_145:reg8_t;

idx_146:reg32_t = 0x12fef1:reg32_t;

val_145:reg8_t = INPUT_10000_0001_63:reg8_t;

mem_arr_57[idx_146:reg32_t + 0:reg32_t]:reg8_t =

cast((val_145:reg8_t & 0xff:reg8_t) >> 0:reg8_t)L:reg8_t;

}

Then comes the filter itself.

/*ASM IR:*/

{
var T_32t0_58:reg32_t;
var T_32t1_59:reg32_t;
var T_32t2_60:reg32_t;
var T_32t3_61:reg32_t;
T_32t2_60:reg32_t = R_ESP_1:reg32_t;
T_32t1_59:reg32_t = T_32t2_60:reg32_t + 0x1c8:reg32_t;
T_32t3_61:reg32_t =
((cast(mem_arr_57[T_32t1_59:reg32_t + 0:reg32_t]:reg8_t)U:reg32_t << 0:reg32_t
| cast(mem_arr_57[T_32t1_59:reg32_t + 1:reg32_t]:reg8_t)U:reg32_t << 8:reg32_t)
| cast(mem_arr_57[T_32t1_59:reg32_t + 2:reg32_t]:reg8_t)U:reg32_t << 0x10:reg32_t)
| cast(mem_arr_57[T_32t1_59:reg32_t + 3:reg32_t]:reg8_t)U:reg32_t << 0x18:reg32_t;
R_EAX_5:reg32_t = T_32t3_61:reg32_t;

}
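The ASM IR block above builds a 32-bit value out of four bytes of mem_arr, shifting the bytes by 0, 8, 0x10 and 0x18 bits and OR-ing them together, i.e. a little-endian load. The same computation as a sketch in Python; the address 0x1000 and the byte values are made up for illustration.

```python
def load32_le(memory, address):
    """Assemble a 32-bit value from four consecutive bytes, lowest address first."""
    value = 0
    for i in range(4):
        value |= (memory[address + i] & 0xFF) << (8 * i)
    return value


# Hypothetical memory contents, chosen so that the result is easy to recognise.
memory = {0x1000: 0x30, 0x1001: 0x79, 0x1002: 0x65, 0x1003: 0x73}
eax = load32_le(memory, 0x1000)
```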

Page 24: 4

Weakest precondition

Vine:

● CFG → PDG (DDG) → GCL → WP form

PDG - Program Dependence Graph

DDG - Data Dependence Graph

GCL - Guarded Command Language

WP(S, R) – the weakest precondition: S is a statement (program), R is the postcondition that must hold after S executes.
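A sketch of the standard weakest-precondition rules for a tiny guarded-command language. It is not Vine's implementation: Vine produces a formula, while here predicates are plain Python functions over a state dictionary, and the tuple encoding of statements is an assumption made for this example.

```python
def wp(stmt, post):
    """Weakest precondition of stmt with respect to the postcondition post."""
    kind = stmt[0]
    if kind == "assign":                 # ("assign", var, expr)
        _, var, expr = stmt
        return lambda s: post({**s, var: expr(s)})
    if kind == "assume":                 # ("assume", cond): cond implies post
        _, cond = stmt
        return lambda s: (not cond(s)) or post(s)
    if kind == "assert":                 # ("assert", cond): cond and post
        _, cond = stmt
        return lambda s: cond(s) and post(s)
    if kind == "seq":                    # ("seq", S1, S2): wp(S1, wp(S2, R))
        _, first, second = stmt
        return wp(first, wp(second, post))
    if kind == "choice":                 # ("choice", S1, S2): both branches must be safe
        _, left, right = stmt
        wp_left, wp_right = wp(left, post), wp(right, post)
        return lambda s: wp_left(s) and wp_right(s)
    raise ValueError(f"unknown statement kind: {kind}")


# S: x := x + 1; assert x > 0        R: x < 10
program = ("seq",
           ("assign", "x", lambda s: s["x"] + 1),
           ("assert", lambda s: s["x"] > 0))
postcondition = lambda s: s["x"] < 10
precondition = wp(program, postcondition)
```

For this program the precondition is equivalent to 0 <= x <= 8: x + 1 has to pass the assert and still be below 10.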

Page 25: 4
Page 26: 4

What is STP and what does it do?

STP - a solver for bit-vector expressions.

This is a separate project, independent of BitBlaze:

http://sites.google.com/site/stpfastprover/

To produce STP code from IL code:

./vine-1.0/utils/wputil trace.il -stpout stp.code

where the input is IL code, and the output is STP code.

Page 27: 4

STP program example

bv : BITVECTOR(10);

a : BOOLEAN;

QUERY(

0bin01100000[5:3]=(0bin1111001@bv[0:0])[4:2]

AND

0bin1@(IF a THEN 0bin0 ELSE 0bin1 ENDIF)=(IF a THEN 0bin110 ELSE 0bin011 ENDIF)[1:0]

);
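To read the query above, it helps to evaluate it by hand. The sketch below brute-forces all values of bv and a in Python, assuming the usual bit-vector conventions: `x[high:low]` extracts bits with bit 0 as the least significant, and `@` is concatenation with the left operand in the high bits. The helper names are invented for this example.

```python
from itertools import product


def extract(value, high, low):
    """Bits high..low inclusive, bit 0 being the least significant."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def concat(left, right, right_width):
    """left @ right, where right is right_width bits wide."""
    return (left << right_width) | right


def query_holds(bv, a):
    lhs1 = extract(0b01100000, 5, 3)
    rhs1 = extract(concat(0b1111001, extract(bv, 0, 0), 1), 4, 2)
    lhs2 = concat(0b1, 0b0 if a else 0b1, 1)
    rhs2 = extract(0b110 if a else 0b011, 1, 0)
    return lhs1 == rhs1 and lhs2 == rhs2


# bv is a 10-bit vector and a is a boolean, so there are only 2048 cases to try.
holds_everywhere = all(query_holds(bv, a)
                       for bv, a in product(range(1 << 10), (False, True)))
```

Under these conventions both equalities hold for every assignment, so holds_everywhere is True.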

Page 28: 4

STP Example code:

Begin with the declaration of variables:

R_EBX_6_16 : BITVECTOR(32);

INPUT_10000_0003_65_7 : BITVECTOR(8);

INPUT_10000_0002_64_6 : BITVECTOR(8);

INPUT_10000_0001_63_5 : BITVECTOR(8);

mem_arr_57_8 : ARRAY BITVECTOR(64) OF BITVECTOR(8);

INPUT_10000_0000_62_4 : BITVECTOR(8);

% end free variables.

The expression itself, in the form of an assertion (ASSERT):

ASSERT( 0bin1 =

(LET R_EAX_5_232 =

0hex73657930

IN

(LET idx_144_233 =

0hex0012fef0

IN

(LET val_143_234 =

INPUT_10000_0000_62_4

IN

(LET mem_arr_57_393 =

(mem_arr_57_8 WITH [(0bin00000000000000000000000000000000 @ BVPLUS(32, idx_144_233,0hex00000000))] := (val_143_234 & 0hexff)[7:0])

…….

IN

(cond_000017_0x4010ce_00_162_392 & 0bin1))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))));

Then comes the query:

Is this expression never true?

QUERY (FALSE);

And a request to print a counterexample:

COUNTEREXAMPLE;

Page 29: 4

STP example

How to ask STP for a solution:

./stp stp.code

Example of STP output:

ASSERT( INPUT_10000_0001_63_5 = 0x00 );

ASSERT( INPUT_10000_0002_64_6 = 0x00 );

ASSERT( INPUT_10000_0000_62_4 = 0x61 );

ASSERT( INPUT_10000_0003_65_7 = 0x00 );

Invalid.
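The counterexample can be turned back into input bytes for the next run. A sketch, assuming that the second numeric field of each INPUT_... name is the byte offset within the input (this matches the names shown here but is an assumption) and that values are printed as 0x.. hex; the function name is hypothetical.

```python
import re

# ASSERT( INPUT_<origin>_<offset>_<id>_<id> = 0x<hex> );
ASSIGNMENT = re.compile(
    r"ASSERT\(\s*INPUT_(\d+)_(\d+)_\d+_\d+\s*=\s*0x([0-9a-fA-F]+)\s*\);")


def input_bytes(stp_output):
    """Collect the INPUT assignments and return them as bytes ordered by offset."""
    by_offset = {}
    for _origin, offset, value in ASSIGNMENT.findall(stp_output):
        by_offset[int(offset)] = int(value, 16)
    return bytes(by_offset[offset] for offset in sorted(by_offset))


stp_output = """\
ASSERT( INPUT_10000_0001_63_5 = 0x00 );
ASSERT( INPUT_10000_0002_64_6 = 0x00 );
ASSERT( INPUT_10000_0000_62_4 = 0x61 );
ASSERT( INPUT_10000_0003_65_7 = 0x00 );
Invalid.
"""
new_input = input_bytes(stp_output)
```

Here the four assignments give the bytes 61 00 00 00, i.e. the letter "a" followed by three zero bytes.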

Page 30: 4

Solver

● SeL4

● Hash: md5, …

● Physics

Page 31: 4

Diagram of the basic method

[Diagram: software + input data → TEMU → trace, alloc-file, state → Vine (trace → appreplay → IL code → wputil → STP code) → STP → input data]

Page 32: 4

SASV Components:

Temu (tracecap: start/stop tracing. Various additions to tracecap(hooks etc.))

Vine (appreplay, wputil)

STP

IDA plugins:

- DangerFunctions – finds calls to malloc,strcpy,memcpy etc.

- IndirectCalls – indirect jumps, indirect calls.

- ida2sql (zynamics) – exports the IDB into a MySQL database. (http://blog.zynamics.com/2010/06/29/ida2sql-exporting-ida-databases-to-mysql/)

Iterators – wrappers for TEMU, Vine, STP.

Various publishers – for DeviceIoControl etc.

Page 33: 4

How does SASV work?

Page 34: 4

SASV

Pattern:

Min Goal: 100% coverage of the dangerous code

Max Goal: 100% coverage of all the code

Page 35: 4

SASV basic algorithm

1. IDA plugins -> dangerous places

2. Publisher -> invokes the code under analysis

3. TEMU -> trace

4. Trace -> appreplay -> IL

5. IL -> change path algo -> IL’

6. IL’ -> wputil -> STP_program’

7. STP_program’ -> STP -> data for n+1 iteration

8. Goto #2
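Steps 4 to 7 of the loop can be sketched as a thin wrapper around the command lines shown on the earlier slides. This is not the SASV code: the function names, the output file naming and the `change_path` hook (standing in for the path-changing algorithm of step 5) are assumptions, and the TEMU tracing step is left out because it is driven interactively from the QEMU monitor.

```python
import subprocess


def appreplay_cmd(trace, il_out):
    # Same flags as in the appreplay example earlier in the deck.
    return ["./vine-1.0/trace_utils/appreplay", "-trace", trace, "-ir-out", il_out,
            "-assertion-on-var", "false", "-use-post-var", "false"]


def wputil_cmd(il_in, stp_out):
    return ["./vine-1.0/utils/wputil", il_in, "-stpout", stp_out]


def stp_cmd(stp_in):
    return ["./stp", stp_in]


def iteration(trace, change_path, run=subprocess.run):
    """Run one trace -> IL -> IL' -> STP code -> STP pass and return STP's output."""
    il, il_new, stp_code = trace + ".il", trace + ".new.il", trace + ".stp"
    run(appreplay_cmd(trace, il), check=True)        # step 4: trace -> IL
    change_path(il, il_new)                          # step 5: IL -> IL' (hypothetical hook)
    run(wputil_cmd(il_new, stp_code), check=True)    # step 6: IL' -> STP program
    result = run(stp_cmd(stp_code), check=True,      # step 7: STP -> data for next iteration
                 capture_output=True, text=True)
    return result.stdout
```

The `run` parameter exists only so that the external tools can be replaced by a stub when they are not installed.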

Page 36: 4

Diagram for a new path in the graph

[Diagram: software + input data → TEMU → trace, alloc-file, state → Vine (trace → appreplay → IL code → changer, symbolic execution → IL’ code → wputil → STP’ code) → STP → new input data → next iteration]

Page 37: 4

Complex system

[Diagram: three schemes combining SASV with a blackbox fuzzer.

1) Labels: SASV, blackbox fuzzer, new input data.

2) Labels: SASV, blackbox fuzzer, coverage, input data, set of input data, new input data.

3) Labels: SASV, blackbox fuzzer, coverage, input data, set of new input data, new input data.]

Page 38: 4

Diagram of the basic method

[Diagram: software + input data → TEMU → trace, alloc-file, state → Vine (trace → appreplay → IL code → wputil → STP code) → STP → input data]

Page 39: 4

Bitblaze vs blackbox fuzzer: Optipng

PNG format file:

- 7309 iterations. Unique addresses: 12313. Time: 73:32 hours. Coverage: 36.27%.

- size of first input data: 2.3Kb.

BMP format file:

- 2368 iterations. Unique addresses: 9620. Time: 12:35 hours. Coverage: 28.34%.

- size of first input data: 1.1Kb.

GIF format file:

- 16884 iterations. Unique addresses: 10361. Time: 112:21 hours. Coverage: 30.52%.

- size of first input data: 0.9Kb, 2.3Kb, 15Kb.

1 byte change:

PNG format file:

- initial set: 3654

- time for one iteration: 10 sec

- number of iterations: 43200

GIF format file:

- initial set: 8442

- time for one iteration: 3 sec

- number of iterations: 86400

BMP format file:

- initial set: 2368

- time for one iteration: 3 sec

- number of iterations: 86400

2 byte change:

PNG format file:

- initial set: 3654

- time for one iteration: 10 sec

- number of iterations: 86400

GIF format file:

- initial set: 8442

- time for one iteration: 3 sec

- number of iterations: 172800

BMP format file:

- initial set: 2368

- time for one iteration: 3 sec

- number of iterations: 172800
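The blackbox-fuzzer figures above can be converted into total running time. A small sketch, assuming the iterations run one after another and that "time for one iteration" is an average:

```python
def total_hours(seconds_per_iteration, iterations):
    """Total running time in hours for sequential iterations."""
    return seconds_per_iteration * iterations / 3600


# (seconds per iteration, number of iterations), taken from the lists above.
fuzzer_runs = {
    ("1 byte", "PNG"): (10, 43200),
    ("1 byte", "GIF"): (3, 86400),
    ("1 byte", "BMP"): (3, 86400),
    ("2 byte", "PNG"): (10, 86400),
    ("2 byte", "GIF"): (3, 172800),
    ("2 byte", "BMP"): (3, 172800),
}
hours = {key: total_hours(*value) for key, value in fuzzer_runs.items()}
```

Under that assumption the 1 byte runs take 120 hours for PNG and 72 hours each for GIF and BMP, and the 2 byte runs take 240 and 144 hours respectively.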

Page 40: 4

Result

Blackbox-fuzzer – 1

Bitblaze – 0

Blackbox-fuzzer + BitBlaze = 3

Page 41: 4

CFG

Page 42: 4

Call graph

Page 43: 4

Disadvantages

Choosing the target for a vulnerability search is a difficult question.

Performance – speed of tracing in TEMU is AWFUL

Page 44: 4
Page 45: 4

Get rid of that damned QEMU!

Move taint propagation to Hypervisor!

Damn good idea!

But a lot of code to port/rewrite

Page 46: 4

S2E

● Symbolic execution

● Architecture Hypothesis for S²E Android support

Page 47: 4

BAP

● Dynamic: PIN-tools

● Static: LLVM

Page 48: 4

S2E + SASV

S2E=Qemu+Klee

Klee=LLVM+Stp

Input data => taint analysis (new concept)

Support Arm

Support Qemu 0.12

Page 49: 4

LLVM

LLVM ELLCC - The Embedded LLVM Compiler Collection

Page 50: 4

Data-flow analysis

Page 51: 4

Integrity graph

Page 52: 4

Data Flow Analysis Schema

Page 53: 4

State machine