Page 1: Constraint-Based Analysis

CS 8803 FPL
Oct 24, 2012

(Slides courtesy of Alex Aiken)

Page 2: Code Example

    void f(state *x, state *y) {
        result = spin_trylock(&x->lock);
        spin_lock(&y->lock);
        …
        if (!result) spin_unlock(&x->lock);
        spin_unlock(&y->lock);
    }

[Figure: the code above with callouts labeled Path Sensitivity, Pointers & Heap, Inter-procedural, and Flow Sensitivity. Highlighted fragments: result, (!result), (&x->lock), (&y->lock), spin_trylock, spin_lock, spin_unlock. A state machine with states Locked, Unlocked, and Error and transitions labeled lock and unlock accompanies the code.]

Page 3: Saturn

• What?
  – SAT-based approach to static bug detection
• How?
  – SAT-based approach
  – Program constructs → Boolean constraints
  – Inference → SAT solving
• Why SAT?
  – Lots of reasons, but for now:
  – Program states naturally expressed as bits
  – The theory for bits is SAT
  – Efficient solvers widely available

Page 4: Intuition

• Analyzing in one direction is problematic
  – Forwards or backwards
  – Consider null dereference analysis
    • No null ptr assignments: forwards is best
    • No dereferences: backwards is best
• Constraints
  – Give a global picture of the program
  – Allow more efficient order of solution

Page 5: Straight-line Code

    void f(int x, int y) {
        int z = x & y;
        assert(z == x);
    }

[Figure: bit-level encoding. The inputs $x_{31} \dots x_0$ and $y_{31} \dots y_0$ feed a bitwise AND, giving $z_i = x_i \land y_i$ for $i = 0, \dots, 31$; the comparison $z == x$ produces the result bit $R$.]

Page 6: Straight-line Code

    void f(int x, int y) {
        int z = x & y;
        assert(z == x);
    }

Query: Is-Satisfiable($\lnot R$)

Answer: Yes

x = [00…1], y = [00…0]

The negated assertion is satisfiable.

Therefore, the assertion may fail.
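
The query on this slide can be imitated without a SAT solver when the inputs are small. The sketch below is only an illustration under stated assumptions: the operands are cut down to 8 bits so that exhaustive enumeration is cheap, and the enumeration stands in for the solver. The function names `assertion_holds` and `find_counterexample` are hypothetical and do not come from Saturn.

```c
#include <stdbool.h>
#include <stdint.h>

/* The result bit R: true when the assertion z == x holds for these inputs. */
bool assertion_holds(uint8_t x, uint8_t y) {
    uint8_t z = x & y;
    return z == x;
}

/* Stand-in for asking whether the negated assertion is satisfiable:
 * enumerate every pair of 8-bit inputs and report the first pair that
 * falsifies the assertion. Returns true and stores the witness if one
 * exists, false if the assertion holds for all inputs. */
bool find_counterexample(uint8_t *x_out, uint8_t *y_out) {
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            if (!assertion_holds((uint8_t)x, (uint8_t)y)) {
                *x_out = (uint8_t)x;
                *y_out = (uint8_t)y;
                return true;
            }
        }
    }
    return false;
}
```

With this enumeration order the first witness found is x = 1, y = 0, the same shape as the assignment shown on the slide. A real solver reaches such a witness by search over the bit-level constraints rather than by enumeration.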

Page 7: Control Flow – Preparation

• Approach
  – Assumes loop-free program
  – Unroll loops, drop backedges
• May miss errors that are deeply buried
  – Bug finding, not verification
  – Many errors surface in a few iterations
• Advantages
  – Simplicity, reduces false positives

Page 8: Control Flow – Example

    if (c) x = a;
    else x = b;
    res = x;

• Merges
  – preserve path sensitivity
  – select bits based on the values of incoming guards

Guards (G) and values at each program point:

  before the branch:  $G = \mathit{true}$
  after x = a:        $G = c$, $x: [a_{31} \dots a_0]$
  after x = b:        $G = \lnot c$, $x: [b_{31} \dots b_0]$
  at res = x:         $G = c \lor \lnot c$, $x: [v_{31} \dots v_0]$
                      where $v_i = (c \land a_i) \lor (\lnot c \land b_i)$
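
The merge rule selects each bit of the result from one of the two incoming values according to the guard. As a minimal sketch, and assuming a concrete 0/1 value for the guard c rather than a symbolic one, the same selection can be written with bit masks. The names `merge_under_guard` and `branchy` are hypothetical.

```c
#include <stdint.h>

/* Bit-by-bit merge under guard c:
 * each result bit is (c AND a_i) OR (NOT c AND b_i). */
uint32_t merge_under_guard(int c, uint32_t a, uint32_t b) {
    uint32_t mask = c ? UINT32_MAX : 0u;  /* guard replicated across 32 bits */
    return (mask & a) | (~mask & b);
}

/* The branching code from the slide, kept for comparison. */
uint32_t branchy(int c, uint32_t a, uint32_t b) {
    uint32_t x;
    if (c) x = a; else x = b;
    return x;
}
```

Both functions return the same value for every input, which is the sense in which the merged encoding keeps the path information instead of discarding it.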

Page 9: Pointers – Overview

• May point to different locations …
  – Thus, use points-to sets

    p: { l1, …, ln }

• … but path sensitive
  – Use guards on points-to relationships

    p: { (g1, l1), …, (gn, ln) }

Page 10: Pointers – Example

    p = &x;
        $G = \mathit{true}$, $p: \{ (\mathit{true}, x) \}$
    if (c) p = &y;
        $G = c$, $p: \{ (\mathit{true}, y) \}$
    res = *p;
        $G = \mathit{true}$, $p: \{ (c, y), (\lnot c, x) \}$

The dereference res = *p is translated to:

    if (c) res = y;
    else if (!c) res = x;
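
The translation of a dereference into conditional assignments can be sketched in a few lines. This is an illustration only: the guards are evaluated for a concrete value of c instead of being kept symbolic, and the type `guarded_loc` and the functions `points_to_p` and `deref` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* One entry (g, l) of a guarded location set:
 * the pointer refers to *loc when guard is true. */
struct guarded_loc {
    bool guard;
    const int *loc;
};

/* Points-to set of p after "p = &x; if (c) p = &y;":
 * p points to y under guard c and to x under guard !c. */
void points_to_p(bool c, const int *x, const int *y,
                 struct guarded_loc out[2]) {
    out[0].guard = c;
    out[0].loc = y;
    out[1].guard = !c;
    out[1].loc = x;
}

/* "res = *p" as a chain of conditional assignments, one per entry. */
int deref(const struct guarded_loc *set, size_t n) {
    int res = 0;  /* placeholder value if no guard holds */
    for (size_t i = 0; i < n; i++) {
        if (set[i].guard) res = *set[i].loc;
    }
    return res;
}
```

For example, with x = 10 and y = 20, `deref` yields 20 when c is true and 10 when c is false.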

Page 11: Pointers – Recap

• Guarded Location Sets

    { (g1, l1), …, (gn, ln) }

• Guards
  – Condition under which points-to relationship holds
  – Collected from statement guards
• Pointer Dereference
  – Conditional Assignments

Page 12: Not Covered

• Other Constructs
  – Structs, …
• Modeling of the environment
• Optimizations
  – several to reduce size of formulas
  – some form of program slicing important

Page 13: What can we do with Saturn?

    int f(lock_t *l) {
        lock(l);
        …
        unlock(l);
    }

lock(l) is modeled as:

    if (l->state == Unlocked)
        l->state = Locked;
    else
        l->state = Error;

unlock(l) is modeled as:

    if (l->state == Locked)
        l->state = Unlocked;
    else
        l->state = Error;

[Figure: state machine with states Locked, Unlocked, and Error; transitions labeled lock and unlock.]

Page 14: General FSM Checking

• Encode FSM in the program
  – State → Integer
  – Transition → Conditional Assignments
• Check code behavior
  – SAT queries
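
The encoding can be made concrete with the lock state machine used throughout the deck. The sketch below is a plain C simulation, not Saturn's constraint generation: the state is an integer-valued enum, each transition is a conditional assignment, and the elided body of f is simply left out. The names `lock_obj`, `do_lock`, `do_unlock`, and `run_f` are hypothetical.

```c
/* FSM state encoded as an integer. */
enum lock_state { Unlocked, Locked, Error };

struct lock_obj {
    enum lock_state state;
};

/* lock transition as a conditional assignment. */
void do_lock(struct lock_obj *l) {
    if (l->state == Unlocked) l->state = Locked;
    else l->state = Error;
}

/* unlock transition as a conditional assignment. */
void do_unlock(struct lock_obj *l) {
    if (l->state == Locked) l->state = Unlocked;
    else l->state = Error;
}

/* A function that locks and then unlocks, run from a given initial
 * state; the returned value is the final state of the lock. */
enum lock_state run_f(enum lock_state initial) {
    struct lock_obj l = { initial };
    do_lock(&l);
    do_unlock(&l);
    return l.state;
}
```

Starting from Unlocked the function ends in Unlocked; starting from Locked it ends in Error. A checker in this style would ask whether any feasible path reaches Error, which is where the SAT query comes in.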

Page 15: How are we doing so far?

• Precision:
• Scalability:
  – SAT limit is 1M clauses
  – About 10 functions
• Solution:
  – Divide and conquer
  – Function summaries

Page 16: Function Summaries (1st try)

    int f(lock_t *l) {
        lock(l);
        …
        …
        unlock(l);
        return 0;
    }

• Function behavior can be summarized with a set of state transitions
• Summary:

    *l: Unlocked → Unlocked
        Locked → Error

Page 17: A Difficulty

    int f(lock_t *l) {
        lock(l);
        …
        if (err) return -1;
        …
        unlock(l);
        return 0;
    }

• Problem
  – two possible output states
  – distinguished by return value (retval == 0) …
• Summary
  1. (retval == 0)
     *l: Unlocked → Unlocked
         Locked → Error
  2. $\lnot$(retval == 0)
     *l: Unlocked → Locked
         Locked → Error

Page 18: FSM Function Summaries

• Summary representation (simplified):

    { Pin, Pout, R }

• User gives:
  – Pin: predicates on initial state
  – Pout: predicates on final state
  – Express interprocedural path sensitivity
• Saturn computes:
  – R: guarded state transitions
  – Used to simulate function behavior at call site

Page 19: Lock Summary (2nd try)

    int f(lock_t *l) {
        lock(l);
        …
        if (err) return -1;
        …
        unlock(l);
        return 0;
    }

• Output predicate:
  – Pout = { (retval == 0) }
• Summary (R):
  1. (retval == 0)
     *l: Unlocked → Unlocked
         Locked → Error
  2. $\lnot$(retval == 0)
     *l: Unlocked → Locked
         Locked → Error
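
A summary of this kind can be used at a call site in place of the function body. The sketch below is a concrete simulation under assumptions, not Saturn's symbolic encoding: `apply_summary` encodes the two guarded transition rows for f, an Error input is assumed to stay Error (the slide lists only Unlocked and Locked), and `caller` is a hypothetical caller that releases the lock itself when f reports failure.

```c
#include <stdbool.h>

enum lock_state { Unlocked, Locked, Error };

/* Guarded transitions of f, keyed on the output predicate (retval == 0):
 *   retval == 0 : Unlocked -> Unlocked, Locked -> Error
 *   retval != 0 : Unlocked -> Locked,   Locked -> Error
 * Assumption: an Error input stays Error. */
enum lock_state apply_summary(bool retval_is_zero, enum lock_state in) {
    if (in != Unlocked) return Error;
    return retval_is_zero ? Unlocked : Locked;
}

/* Hypothetical caller: "r = f(l); if (r != 0) unlock(l);".
 * The call is replaced by the summary; unlock is the usual FSM step. */
enum lock_state caller(bool retval_is_zero, enum lock_state in) {
    enum lock_state s = apply_summary(retval_is_zero, in);
    if (!retval_is_zero) {
        s = (s == Locked) ? Unlocked : Error;
    }
    return s;
}
```

Because the summary distinguishes the two return cases, this caller is seen to end in Unlocked on both paths when it starts from Unlocked. The single-row summary from the first attempt could not show that.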

Page 20: Lock checker for Linux

• Parameters:
  – States: { Locked, Unlocked, Error }
  – Pin = {}
  – Pout = { (retval == 0) }
• Experiment:
  – Linux Kernel 2.6.5: 4.8 MLOC
  – ~40 lock/unlock/trylock primitives
  – 20 hours to analyze
    • 3.0 GHz Pentium IV, 1 GB memory

Page 21: Double Locking/Unlocking

    static void sscape_coproc_close(…) {
        spin_lock_irqsave(&devc->lock, flags);
        if (…)
            sscape_write(devc, DMAA_REG, 0x20);
        …
    }

    static void sscape_write(struct … *devc, …) {
        spin_lock_irqsave(&devc->lock, flags);
        …
    }

Page 22: Ambiguous Return State

    int i2o_claim_device(…) {
        down(&i2o_configuration_lock);
        if (d->owner) {
            up(&i2o_configuration_lock);
            return -EBUSY;
        }
        if (…) {
            return -EBUSY;
        }
        …
    }

Page 23: Bugs

Type              Bugs   False Pos.   % Bugs
Double Locking     134           99      57%
Ambiguous State     45           22      67%
Total              179          121      60%

Previous Work: MC (31), CQual (18), <20% Bugs

Page 24: Function Summary Database

• 63,000 functions in Linux
  – More than 23,000 are lock related
  – 17,000 with locking constraints on entry
  – Around 9,000 affect more than one lock
  – 193 lock wrappers
  – 375 unlock wrappers
  – 36 with return value/lock state correlation
• Available on the web . . .

Page 25: Another Checker

• Memory leaks
  – Common, esp. in error handling code
  – Hard to find
  – Problematic in long running applications
• Current techniques
  – Escape analysis
  – Ownership types
  – Region based analysis
  – …

Page 26: Simple Leak

    char *f() {
        char *p;
        p = (char*)malloc(…);
        …
        if (err) return NULL;
        …
        return p;
    }

Page 27: Scenario 1 – Malloc Wrappers

    char *f() {
        char *p;
        p = (char*)strdup(…);
        …
        if (err) return NULL;
        …
        return p;
    }

Page 28: Scenario 2 – External References

    char *f(struct *s) {
        char *p;
        p = (char*)malloc(…);
        s->name = p;
        if (err) return NULL;
        …
        return p;
    }

Page 29: Scenario 3 – Function Calls

    char *f(struct state *s) {
        char *p;
        p = (char*)malloc(…);
        g(s, p);
        if (err) return NULL;
        …
        return p;
    }

    void g(s, p) {
        s->name = p;
    }

Page 30: Scenario 4 – Data dependency

    void f(int len) {
        char fastbuf[10], *p;
        if (len < 10) p = fastbuf;
        else p = (char *)malloc(len);
        …
        if (p != fastbuf) free(p);
    }

Page 31: Requirements

• Track points-to relationships precisely
• Infer escaping functions
  – ones that create external references to objects passed in via parameters
• Infer allocation functions

Page 32: Analysis Part I – Points-to Rule

• PointsTo(p, l)
  – condition under which p points to l

Points-to set of p: $\{ (g_0, l_0), \dots, (g_{n-1}, l_{n-1}) \}$

$\mathit{PointsTo}(p, l) = \begin{cases} g_i & \text{if } l_i = l \\ \mathit{false} & \text{otherwise} \end{cases}$
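
The points-to rule is a lookup in the guarded location set. The sketch below is one possible illustration, not Saturn's representation: guards are stored as 8-bit truth tables over at most three Boolean variables, so that a guard is satisfiable exactly when its table is nonzero. The type `formula`, the constants, the location IDs, and the function `points_to` are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* A Boolean formula over up to 3 variables, stored as a truth table:
 * bit i holds the formula's value under assignment number i. */
typedef uint8_t formula;
#define F_FALSE ((formula)0x00)
#define F_TRUE  ((formula)0xFF)
#define F_VAR0  ((formula)0xAA)  /* variable 0, true in assignments 1,3,5,7 */

/* Hypothetical location IDs for the example. */
enum { LOC_X, LOC_Y, LOC_Z };

/* One entry (g_i, l_i) of a guarded location set. */
struct guarded_loc {
    formula guard;
    int loc;
};

/* PointsTo(p, l): the guard g_i of the entry whose location is l,
 * or false when l does not occur in the set. */
formula points_to(const struct guarded_loc *set, size_t n, int l) {
    for (size_t i = 0; i < n; i++) {
        if (set[i].loc == l) return set[i].guard;
    }
    return F_FALSE;
}
```

For the set { (c, y), (not c, x) } with c as variable 0, the lookup returns the table for c when asked about y, the table for not c when asked about x, and false for any other location.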

Page 33: Analysis Part II – EscapeVia

• EscapeVia(l, p, X)
  – the condition under which location l escapes via pointer p, excluding references in set X
• Access Roots
  – Every object in the function body is accessed through one of the following “roots”
    • Parameters (p1…pn)
    • The Return Value (ret_val)
    • Global Variables
    • Local Variables
    • Heap Allocated Objects

Page 34: Analysis Part II – EscapeVia

• Never escape through local variables

  $\mathit{RootOf}(p) \in \mathit{Locals} \cup X \;\Rightarrow\; \mathit{EscapeVia}(l, p, X) = \mathit{false}$

• Always escape through global variables

  $\mathit{RootOf}(p) \in \mathit{Globals} \;\Rightarrow\; \mathit{EscapeVia}(l, p, X) = \mathit{PointsTo}(p, l)$

Page 35: Analysis Part II – EscapeVia

• Escaping through parameters/return

  $\mathit{RootOf}(p) \in (\mathit{Params} \cup \{ \mathit{ret\_val} \}) - X \;\Rightarrow\; \mathit{EscapeVia}(l, p, X) = \mathit{PointsTo}(p, l)$

• Escaping via another allocated location

  $\mathit{RootOf}(p) \in \mathit{NewLocs} - X \;\Rightarrow\; \mathit{EscapeVia}(l, p, X) = \mathit{PointsTo}(p, l) \land \mathit{Escaped}(p, X \cup \{ \mathit{RootOf}(l) \})$

Page 36: Analysis Part III – Escape/Leak

• Escape Condition

  $\mathit{Escaped}(l, X) = \bigvee_{p} \mathit{EscapeVia}(l, p, X)$

• Leak Condition

  $\mathit{Leaked}(l, X) = \lnot \mathit{Escaped}(l, X)$

• Leak Checker
  For all new locations l, there is a leak if

  Satisfiable(Leaked(l, {}))
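
The leak check can be tried on the simple leak from earlier in the deck, where an allocated block is returned unless err is set. The sketch below is a simplified illustration under stated assumptions: formulas are 8-bit truth tables over at most three variables (so satisfiable means nonzero), the exclusion set X is empty, and the recursive case for pointers rooted in other new locations is omitted. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Truth table over up to 3 Boolean variables; bit i is the value of
 * the formula under assignment number i. */
typedef uint8_t formula;
#define F_FALSE ((formula)0x00)
#define F_TRUE  ((formula)0xFF)
#define F_ERR   ((formula)0xAA)  /* variable 0 stands for "err" */

enum root_kind { ROOT_LOCAL, ROOT_GLOBAL, ROOT_PARAM_OR_RET };

/* What the analysis knows about one pointer p relative to location l. */
struct pointer_fact {
    enum root_kind root;   /* kind of root that p is accessed through */
    formula points_to_l;   /* condition under which p points to l */
};

/* EscapeVia with X empty: false for locals, otherwise the points-to
 * condition. The new-location case is left out of this sketch. */
formula escape_via(const struct pointer_fact *p) {
    return p->root == ROOT_LOCAL ? F_FALSE : p->points_to_l;
}

/* Escaped: disjunction of EscapeVia over all pointers. */
formula escaped(const struct pointer_fact *ps, size_t n) {
    formula e = F_FALSE;
    for (size_t i = 0; i < n; i++) {
        e = (formula)(e | escape_via(&ps[i]));
    }
    return e;
}

/* Leaked: negation of Escaped. */
formula leaked(const struct pointer_fact *ps, size_t n) {
    return (formula)~escaped(ps, n);
}

/* A truth table is satisfiable when at least one bit is set. */
bool satisfiable(formula f) {
    return f != 0;
}
```

For the simple leak, the local p always points to the block and the return value points to it only when err is false. The leak condition then reduces to err, which is satisfiable, so a leak is reported. If the function returned p on every path, the condition would reduce to false.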

Page 37: Results

            LOC (K)   # Alloc Func.    # Bugs   FP (%)
Samba           404              80        83    8.79%
OpenSSL         296             101       117    0.85%
BinUtils        909              91   136(66)    3.55%
OpenSSH          36              19    29(10)       0%
Total         1,646             291       365    3.69%

Page 38: Why SAT? (Revisited …)

• Moore’s Law
• Uniform modeling of constructs as bits
• Constraints
  – Local specification
  – Global solution
• Incremental SAT solving
  – makes multiple queries efficient

Page 39: Why SAT? (Cont.)

• Path sensitivity is important
  – To find bugs
  – To reduce false positives
  – Much easier to model precisely with SAT
• Compositionality is important
  – Function summaries critical for scalability
  – Easy to construct with SAT queries