Saturn Overview 1 An Overview of the Saturn Project
Page 1: Saturn Overview1 An Overview of the Saturn Project.

Saturn Overview 1

An Overview of the Saturn Project

Page 2: Saturn Overview1 An Overview of the Saturn Project.

The Three-Way Trade-Off

• Precision
  – Modeling programs accurately enough to be useful
• Scalability
  – Saying anything at all about large programs
• Human Effort
  – How much work must the user do?
  – Either giving specifications, or interpreting results

Today’s focus

Not so much about this . . .

Page 3: Saturn Overview1 An Overview of the Saturn Project.

Precision

int f(int x) { . . . }

Intraprocedural analysis with minimal abstraction.

Primary abstraction is done at function boundaries.

[Figure: the body of f is translated to a formula Ff; the labels A(Fg) and A(Fh) mark abstracted callees, and A(Ff) joins the summary set [A(Ff), A(Fg), A(Fh)].]

Page 4: Saturn Overview1 An Overview of the Saturn Project.

Scalability

• Design constraint: SAT formula size ~ function size
• Analyze one function at a time
• Parallel implementation
  – Server sends functions to clients to analyze
  – Typically use 50-100 cores to analyze Linux

Page 5: Saturn Overview1 An Overview of the Saturn Project.

Summaries

• Abstract at function boundaries
  – Compute a summary for function’s behavior
• Summaries should be small
  – Ideally linear in the size of the function’s interface
• Summaries are our primary form of abstraction
  – Saturn delays abstraction to function boundaries

Slogan: Analysis design is summary design!

Page 6: Saturn Overview1 An Overview of the Saturn Project.

Expressiveness

• Analyses written in Calypso
• Logic programs
  – Express traversals of the program
  – E.g., backwards/forwards propagation
• Constraints
  – For when we don’t know traversal order
• Written ~40,000 lines of Calypso code

Page 7: Saturn Overview1 An Overview of the Saturn Project.

Availability

• An open source project
  – BSD license
• All Calypso code available for published experiments

saturn.stanford.edu

Page 8: Saturn Overview1 An Overview of the Saturn Project.

People

Alex Aiken
Suhabe Bugrara
Isil Dillig
Thomas Dillig
Brian Hackett
Peter Hawkins
Yichen Xie (past)

Page 9: Saturn Overview1 An Overview of the Saturn Project.

Outline

• Saturn overview
• An example analysis
  – Intraprocedural
  – Interprocedural
• What else can you do?
• Survey of results

Page 10: Saturn Overview1 An Overview of the Saturn Project.

Saturn Architecture

[Diagram components: C Program; C Frontend; C Syntax Databases; Calypso Interpreter; Calypso analyses; Constraint Solvers; Summary Databases; Summary Reports; UI]

Page 11: Saturn Overview1 An Overview of the Saturn Project.

Parsing and C Frontend

Source Code

Build Interceptor

Preprocessed Source Code

CIL frontend

Abstract Syntax Tree Databases

Other possible frontends


Page 12: Saturn Overview1 An Overview of the Saturn Project.

Calypso

• General purpose logic programming language
  – Pure
  – Prolog-like syntax
• Bottom-up evaluation
  – Magic sets transformation
• Also a (minor) moon of Saturn

Page 13: Saturn Overview1 An Overview of the Saturn Project.

Helpful Features

• Strong static type and mode checking
• Permanent data (sessions)
  – Stored as Berkeley DB databases
  – A session is just a named bundle of predicates
• Support for unit-at-a-time analysis

Page 14: Saturn Overview1 An Overview of the Saturn Project.

Extensible Interpreter

[Diagram components: Logic Program Interpreter; SAT Solver (#sat predicate, …); LP Solver; DOT graph package; UI package]

Page 15: Saturn Overview1 An Overview of the Saturn Project.

Scalability

• Interpreter is not very efficient
• OK, it’s slow
• But can run distributed analyses
  – 50-100 CPUs
• Scalability is more important than raw speed
  – Can run intensive analyses of the entire Linux kernel (>6MLOC) in a few hours.

Page 16: Saturn Overview1 An Overview of the Saturn Project.

Cluster Architecture

Master Node

Worker Node 1

Worker Node 100

Calypso DB

Calypso DB

Databases


Page 17: Saturn Overview1 An Overview of the Saturn Project.

Job Scheduling

• Dynamically track dependencies between jobs
• Rerun jobs if new dependencies found
  – Optimistic concurrency control
• Job = a function body
• Iterate to fixpoint for circular dependencies
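The scheduling policy above can be illustrated with a small sequential sketch. This is not Saturn's scheduler: the names `run_to_fixpoint`, `lookup`, `CALLS`, and `reachable` are hypothetical, the toy "analysis" just computes which functions are transitively called, and everything runs in one process rather than on a cluster. It only shows the idea of discovering dependencies while a job runs, rerunning dependents when a summary changes, and stopping at a fixpoint for circular dependencies.

```python
from collections import deque

def run_to_fixpoint(functions, analyze):
    summaries = {}    # function name -> latest summary
    dependents = {}   # callee -> callers that have read the callee's summary
    queue = deque(functions)
    queued = set(queue)
    while queue:
        fn = queue.popleft()
        queued.discard(fn)

        def lookup(callee, caller=fn):
            # Record the dependency at the moment the job discovers it.
            dependents.setdefault(callee, set()).add(caller)
            return summaries.get(callee)

        new_summary = analyze(fn, lookup)
        if fn not in summaries or summaries[fn] != new_summary:
            summaries[fn] = new_summary
            # Callers ran optimistically against an older summary: rerun them.
            for caller in sorted(dependents.get(fn, ())):
                if caller not in queued:
                    queue.append(caller)
                    queued.add(caller)
    return summaries

# Toy call graph with a cycle a -> b -> c -> a (illustrative data only).
CALLS = {"a": ["b"], "b": ["c"], "c": ["a"], "d": []}

def reachable(fn, lookup):
    # Toy job: the summary of fn is the set of functions it can reach.
    result = set(CALLS[fn])
    for callee in CALLS[fn]:
        result |= lookup(callee) or frozenset()
    return frozenset(result)

summaries = run_to_fixpoint(CALLS, reachable)
```

Because the toy summaries only grow and are bounded by the set of function names, the loop terminates even though a, b, and c depend on each other.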

Page 18: Saturn Overview1 An Overview of the Saturn Project.

Constraint Solvers

Calypso Analyses

Alias Analysis

Function Pointer Analysis

C Syntax Predicates

CFG Construction

Memory Model

NULL checker

Typestate verifier


Page 19: Saturn Overview1 An Overview of the Saturn Project.

The Paradigmatic Locking Analysis

Check that a thread does not:
  – acquire the same lock twice
  – release the lock twice

Otherwise the application may deadlock or crash.

Page 20: Saturn Overview1 An Overview of the Saturn Project.

Specification

[State machine with states locked, unlocked, and error:
  unlocked --lock--> locked
  locked --unlock--> unlocked
  locked --lock--> error
  unlocked --unlock--> error]

Page 21: Saturn Overview1 An Overview of the Saturn Project.

Basic Setup

• We assume
  – one locking function lock(l)
  – one unlocking function unlock(l)
• We analyze one function at a time
  – produce a locking summary describing the FSM transitions associated with a given lock

Page 22: Saturn Overview1 An Overview of the Saturn Project.

An Example Function & Summary

f( . . ., lock *L, . . .) { lock(L); . . . unlock(L); }

L: unlocked -> unlocked
   locked -> error

• Summaries are input state -> output state
  – The net effect of the function on the lock
• Summary size is independent of function size
  – Bounded by the square of the number of states
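As a rough illustration of the summary above, the following sketch runs the lock state machine over a straight-line sequence of lock and unlock operations from each possible input state. The names `TRANSITIONS`, `step`, and `summarize` are hypothetical, and treating error as an absorbing state is an assumption made for the sketch; Saturn itself derives summaries from guarded facts rather than by simulation.

```python
# Lock FSM: (current state, operation) -> next state.
TRANSITIONS = {
    ("unlocked", "lock"): "locked",
    ("locked", "unlock"): "unlocked",
    ("locked", "lock"): "error",
    ("unlocked", "unlock"): "error",
}

def step(state, op):
    # Assumption for this sketch: once in "error", the lock stays there.
    return "error" if state == "error" else TRANSITIONS[(state, op)]

def summarize(ops):
    # Map each possible input state to the state at function exit.
    summary = {}
    for start in ("unlocked", "locked"):
        state = start
        for op in ops:
            state = step(state, op)
        summary[start] = state
    return summary

# Body of f: lock(L); ...; unlock(L);
summary_f = summarize(["lock", "unlock"])

# A much longer body yields a summary of the same size.
summary_long = summarize(["lock", "unlock"] * 50)
```

Here `summary_f` maps unlocked to unlocked and locked to error, matching the summary on the slide, and `summary_long` has the same two entries despite the longer body.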

Page 23: Saturn Overview1 An Overview of the Saturn Project.

Lock States

type lockstate ::= locked | unlocked | error.

• Predicates to describe lock states on nodes and edges of the CFG:

predicate node_state(P:pp,L:t_trace,S:lockstate,G:g_guard).
predicate edge_state(P:pp,L:t_trace,S:lockstate,G:g_guard).

• Program point pp is a unique id for each point in the program
• Trace t_trace is a unique name for a memory location
• Guard g_guard is a boolean constraint

Page 24: Saturn Overview1 An Overview of the Saturn Project.

The Intraprocedural Analysis

1. Initialize lock states at function entry
2. Join operator:
   – Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
   – assignments
   – tests
   – function calls

Page 25: Saturn Overview1 An Overview of the Saturn Project.

Initializing a Lock

• Use a fresh boolean variable
• Interpretation:
  – the variable is true ⇒ L is locked
  – its negation is true ⇒ L is unlocked
• Enforces that L cannot be both locked and unlocked simultaneously

Page 26: Saturn Overview1 An Overview of the Saturn Project.

Notation

[Figure: program point P annotated with (lock, state, guard).]

At program point P, the lock is in the given state if the guard is true.

Page 27: Saturn Overview1 An Overview of the Saturn Project.

Initialization Rules

node_state(P0,L,locked,LG) :-
    entry(P0), is_lock(L), fresh_variable(L, LG).

node_state(P0,L,unlocked,UG) :-
    entry(P0), node_state(P0,L,locked,LG),
    #not(LG, UG).

fresh_variable(L, LG): allocates a new boolean variable associated with lock L.

[Figure: f( . . ., lock *L, . . .) { . . . } with entry point P0 annotated (L, locked, LG) and (L, unlocked, UG).]
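The effect of the two initialization rules can be mimicked in a few lines. This is only a sketch under assumptions: guards are modeled as Python callables over an assignment dictionary, and the helper names `var`, `neg`, and `entry_states` are hypothetical stand-ins for `fresh_variable` and `#not`, not part of Calypso.

```python
def var(name):
    # A guard that is true exactly when the named boolean variable is true.
    return lambda env: env[name]

def neg(guard):
    # Negation of a guard, playing the role of #not(LG, UG).
    return lambda env: not guard(env)

def entry_states(lock):
    # One fresh variable per lock; the unlocked guard is its negation.
    lg = var("LG_" + lock)
    ug = neg(lg)
    return [(lock, "locked", lg), (lock, "unlocked", ug)]

facts = entry_states("L")

# For each value of the fresh variable, list the states whose guard holds.
holding = [
    [state for (_, state, guard) in facts if guard({"LG_L": value})]
    for value in (True, False)
]
```

Under either value of the fresh variable exactly one of the two facts holds, which is how the encoding rules out a lock that is both locked and unlocked at entry.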

Page 28: Saturn Overview1 An Overview of the Saturn Project.

The Intraprocedural Analysis

1. Initialize lock states at function entry
2. Join operator:
   – Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
   – assignments
   – tests
   – function calls

Page 29: Saturn Overview1 An Overview of the Saturn Project.

Joins

node_state(P,L,S,G) :-
    edge_state(P,L,S,_),
    \/edge_state(P,L,S,EG):#or_all(EG,G).

Note: There is no abstraction in the join . . .

[Figure: after an if (…), incoming edges annotated (L, locked, F1) and (L, locked, F2) join to (L, locked, F1 ∨ F2).]
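A minimal sketch of this join, assuming guards are modeled as Python callables over an assignment dictionary: facts on incoming edges that agree on lock and state are merged by taking the disjunction of their guards. The function `join` and the example guards `f1` and `f2` are hypothetical; the Calypso rule does the same grouping with `#or_all`.

```python
def join(edge_facts):
    # edge_facts: (lock, state, guard) triples on the incoming edges of a node.
    grouped = {}
    for lock, state, guard in edge_facts:
        grouped.setdefault((lock, state), []).append(guard)
    # One node fact per (lock, state); its guard is the OR of the edge guards.
    return [
        (lock, state, (lambda env, gs=tuple(gs): any(g(env) for g in gs)))
        for (lock, state), gs in grouped.items()
    ]

# Two branches of an if, each reaching the join with L locked.
f1 = lambda env: env["c"]
f2 = lambda env: (not env["c"]) and env["d"]

joined = join([("L", "locked", f1), ("L", "locked", f2)])
lock, state, guard = joined[0]
```

No information is dropped: the joined guard is exactly F1 ∨ F2, so path sensitivity within the function is preserved.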

Page 30: Saturn Overview1 An Overview of the Saturn Project.

The Intraprocedural Analysis

1. Initialize lock states at function entry
2. Join operator:
   – Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
   – assignments
   – function calls
   – etc.

Page 31: Saturn Overview1 An Overview of the Saturn Project.

Assignments

Assignments do not affect lock state:

edge_state(P1,L,S,G) :-
    assign(P0,P1,_), node_state(P0,L,S,G).

[Figure: assignment X = E; between P0 and P1, with (L, S, G) before and (L, S, G) after.]

Page 32: Saturn Overview1 An Overview of the Saturn Project.

Interprocedural Analysis Basics

• Function summaries are the building blocks of interprocedural analysis.
• Generating a function summary requires:
  – Predicates encoding relevant facts
  – A session to store these predicates

Page 33: Saturn Overview1 An Overview of the Saturn Project.

Interprocedural Analysis Outline

1. Generating function summaries
2. Using function summaries
   – How do we retrieve the summary of a callee?
   – How do we map facts associated with a callee to the namespace of the currently analyzed function?

Page 34: Saturn Overview1 An Overview of the Saturn Project.

Summary Declaration

session sum_locking(FN:string) containing [lock_trans].

predicate lock_trans(L: t_trace, S0: lockstate, S1: lockstate).

Declares a persistent database sum_locking (function name) holding lock_trans facts.

Page 35: Saturn Overview1 An Overview of the Saturn Project.

Summary Generation: Primitives

Summaries for lock and unlock:

sum_locking("lock")->lock_trans(*arg0,locked,error) :- .
sum_locking("lock")->lock_trans(*arg0,unlocked,locked) :- .
sum_locking("unlock")->lock_trans(*arg0,unlocked,error) :- .
sum_locking("unlock")->lock_trans(*arg0,locked,unlocked) :- .

*arg0 is the memory location modified by lock and unlock.

Page 36: Saturn Overview1 An Overview of the Saturn Project.

Summary Generation: Other Functions

sum_locking(F)->lock_trans(L, S0, S1) :-
    current_function(F),
    entry(P0), node_state(P0, L, S0, G0),
    exit(P1), node_state(P1, L, S1, G1),
    #and(G0, G1, G), guard_satisfiable(G).

[Figure: F( . . ., lock *L, . . .) { . . . } with (L, S0, G0) at entry P0 and (L, S1, G1) at exit P1; if SAT(G0 ∧ G1), then F: S0 → S1.]
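The rule above pairs every entry fact with every exit fact for the same lock and keeps the pair when the conjunction of the two guards is satisfiable. A small sketch, assuming guards are Python callables and using brute-force enumeration in place of a SAT solver; `satisfiable`, `generate_summary`, and the variable name `b` are hypothetical.

```python
from itertools import product

def satisfiable(guard, variables):
    # Brute-force stand-in for a SAT query: try every assignment.
    return any(
        guard(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )

def generate_summary(entry_facts, exit_facts, variables):
    summary = set()
    for lock0, s0, g0 in entry_facts:
        for lock1, s1, g1 in exit_facts:
            both = lambda env: g0(env) and g1(env)   # G0 AND G1
            if lock0 == lock1 and satisfiable(both, variables):
                summary.add((lock0, s0, s1))
    return summary

# Facts for f { lock(L); ...; unlock(L); } with fresh entry variable b:
# locked at entry under b, unlocked at entry under not b.
b = lambda env: env["b"]
not_b = lambda env: not env["b"]
entry_facts = [("L", "locked", b), ("L", "unlocked", not_b)]
exit_facts = [("L", "error", b), ("L", "unlocked", not_b)]

summary = generate_summary(entry_facts, exit_facts, ["b"])
```

The pairs locked/unlocked and unlocked/error are rejected because their guards contradict each other, leaving the two transitions of the earlier example summary.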

Page 37: Saturn Overview1 An Overview of the Saturn Project.

Summary Application Rule

call_transfer(I, L, S0, S1, G) :-
    direct_call(I, F), call(P0, _, I),
    sum_locking(F)->lock_trans(CL, S0, S1),
    instantiate(s_call{I}, P0, CL, L, G).

[Figure: G( . . .) { F(. . .) } with callee summary F: S0 → S1; (S0, L, G) at P0 before the call becomes (S1, L, G) after it.]
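Applying a callee summary at a call site can be sketched as follows. This simplifies heavily: `instantiate` is replaced by a fixed dictionary from callee lock names to caller lock names, guards are opaque strings that are carried through unchanged, and the names `apply_summary` and `binding` are hypothetical.

```python
def apply_summary(caller_facts, callee_summary, binding):
    # caller_facts: (lock, state, guard) triples holding just before the call.
    # callee_summary: (callee_lock, s0, s1) transitions from the callee's session.
    # binding: callee lock name -> caller lock name (stand-in for instantiate).
    result = []
    for callee_lock, s0, s1 in sorted(callee_summary):
        lock = binding.get(callee_lock)
        for fact_lock, state, guard in caller_facts:
            if fact_lock == lock and state == s0:
                # The lock moves from S0 to S1 under the same guard.
                result.append((lock, s1, guard))
    return result

# Summary of the primitive lock function, as given on the primitives slide.
lock_summary = {("*arg0", "locked", "error"), ("*arg0", "unlocked", "locked")}

# Caller state before a call lock(M); G1 and G2 are placeholder guards.
before = [("M", "locked", "G1"), ("M", "unlocked", "G2")]
after = apply_summary(before, lock_summary, {"*arg0": "M"})
```

After the call, M is in the error state under G1 and locked under G2, so a double acquire is visible as a reachable error fact.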

Page 38: Saturn Overview1 An Overview of the Saturn Project.

Applications

• Bug finding

• Verification

• Software Understanding


Page 39: Saturn Overview1 An Overview of the Saturn Project.

Saturn Bug Finding

• Early work
  – Locking
    • Scalable Error Detection using Boolean Satisfiability. POPL 2005
  – Memory leaks
    • Context- and Path-Sensitive Memory Leak Detection. FSE 2005
  – Scripting languages
    • Static Detection of Security Vulnerabilities in Scripting Languages. 15th USENIX Security Symposium, 2006
• Recent work
  – Inconsistency Checking
    • Static Error Detection Using Semantic Inconsistency Inference. PLDI 2007

Page 40: Saturn Overview1 An Overview of the Saturn Project.

Examples: Null pointer dereferences

Application       KLOC   Warnings   Bugs   False Alarms   FA Rate
Openssl-0.9.8b     339         55     47              6    11.30%
Samba-3.0.23b      516         68     46             19    29.20%
Openssh-4.3p2      155          9      8              1    11.10%
Pine-4.64          372        150    119             28    19.00%
Mplayer-1.0pre8    762        119     89             28    23.90%
Sendmail-8.13.8    365          9      8              1    11.10%
Linux-2.6.17.1    6200        373    299             66    18.10%
Total             8793        783    616            149    19.50%
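The slide does not define the FA Rate column, but the figures appear consistent with false alarms divided by the sum of confirmed bugs and false alarms (not by the warning count). The check below is a sketch under that assumption; `fa_rate` and `ROWS` are hypothetical names and the data is copied from the table.

```python
def fa_rate(bugs, false_alarms):
    # Assumed definition: share of classified reports that are false alarms.
    return 100.0 * false_alarms / (bugs + false_alarms)

# application -> (bugs, false alarms, FA rate as printed in the table)
ROWS = {
    "Openssl-0.9.8b": (47, 6, 11.3),
    "Samba-3.0.23b": (46, 19, 29.2),
    "Openssh-4.3p2": (8, 1, 11.1),
    "Pine-4.64": (119, 28, 19.0),
    "Mplayer-1.0pre8": (89, 28, 23.9),
    "Sendmail-8.13.8": (8, 1, 11.1),
    "Linux-2.6.17.1": (299, 66, 18.1),
    "Total": (616, 149, 19.5),
}

# Difference between the recomputed rate and the printed one, per row.
deviations = {
    name: abs(fa_rate(bugs, alarms) - printed)
    for name, (bugs, alarms, printed) in ROWS.items()
}
```

Every recomputed rate agrees with the printed value to within rounding, for example 6 / (47 + 6) ≈ 11.3% for Openssl.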


Page 41: Saturn Overview1 An Overview of the Saturn Project.

Lessons Learned

• Saturn-based tools improve bug-finding
  – Multiple times more bugs than previous results
  – Lower false positive rate
• Why?
  – “Sounder” than previous bug finding tools
    • Bit-level modeling, handling casts, aliasing, etc.
  – Precise
    • Fully intraprocedurally path-sensitive
    • Partially interprocedurally path-sensitive

Page 42: Saturn Overview1 An Overview of the Saturn Project.

Lessons Learned (Cont.)

• Design of function summary is key to scalability and precision

• Summary-based analysis only looks at the relevant parts of the heap for a given function

• Programmers write functions with simple interfaces


Page 43: Saturn Overview1 An Overview of the Saturn Project.

Saturn Verification

• Unchecked user pointer dereferences
  – Important OS security property
  – Also called “probing” or “user/kernel pointers”
• Precision requirements
  – Context-sensitive
  – Flow-sensitive
  – Field-sensitive
  – Intraprocedurally path-sensitive

Page 44: Saturn Overview1 An Overview of the Saturn Project.

Current Results for Linux-2.6.17.1

• 6.2 MLOC with 91,543 functions
• Verified 616 / 627 system call arguments
  – 98.2%
  – 11 false alarms
• Verified 851,686 / 852,092 dereferences
  – 99.95%
  – 406 false alarms

Page 45: Saturn Overview1 An Overview of the Saturn Project.

Preliminary Lessons Learned

• Bug finders can be sloppy: ignore functions or points-to edges that inhibit scalability or precision
• Soundness is substantially more difficult than finding bugs
• Lightweight, sparsely placed annotations
  – Have programmers add some information
  – Makes verification tractable
  – Only 22 annotations needed for user pointer analysis

Page 46: Saturn Overview1 An Overview of the Saturn Project.

Saturn for Software Understanding

• A program analysis is a code search engine
• Generic question: Do programmers ever do X?
  – Write an analysis to find out
  – Run it on lots of code
  – Classify the results
  – Write a paper . . .

Page 47: Saturn Overview1 An Overview of the Saturn Project.

Examples

• Aliasing is used in very stylized ways, at least in C
  – Cursors into data structures
  – Parent/child pointers
  – And 7 other idioms

  How is Aliasing Used in Systems Software? FSE 2006

• Do programmers take the address of function ptrs?
  – Answer: Almost never.
  – Allows simpler analysis of function pointers

Page 48: Saturn Overview1 An Overview of the Saturn Project.

Other Things We’ve Thought About

• Shape analysis
  – We notice the lack of shape information
• Interprocedural path-sensitivity
  – Needed for some common programming patterns
• Proving correctness of Saturn analyses

Page 49: Saturn Overview1 An Overview of the Saturn Project.

Related Work

• Lots
  – All bug finding and verification tools of the last 10 years
• Particularly, though
  – Systems using logic programming (bddbddb)
  – ESP
  – Metal
  – CQual
  – Blast

Page 50: Saturn Overview1 An Overview of the Saturn Project.

saturn.stanford.edu
