Characterizing and Reasoning about Security Vulnerabilities Shuo Chen Center for Reliable and High-Performance Computing Coordinated Science Laboratory.

Characterizing and Reasoning about Security Vulnerabilities

Shuo Chen
Center for Reliable and High-Performance Computing
Coordinated Science Laboratory
University of Illinois at Urbana-Champaign

Preliminary Examination, May 4th, 2004

Committee Chair: Prof. Ravishankar K. Iyer
Committee: Prof. Vikram Adve, Prof. Jose Meseguer, Prof. David Nicol

Significance of Software Implementation Errors

• Bugtraq: 70% of security vulnerabilities are due to implementation errors.

Bugtraq vulnerability categories (pie chart):
  Input Validation Error                      23%
  Boundary Condition Error                    21%
  Design Error                                18%
  Failure to Handle Exceptional Conditions    11%
  Access Validation Error                     10%
  Unknown                                      6%
  Configuration Error                          5%
  Origin Validation Error                      3%
  Race Condition Error                         2%
  Environment Error                            1%

What I Have Done

• Analyzed CERT and Bugtraq reports and the corresponding application source code.
• Developed a new FSM representation that decomposes each security vulnerability into a series of elementary activities (primitive FSMs), each indicating a simple predicate.
• The FSM analysis showed:
  – Many vulnerabilities (66%) are due to pointer taintedness: a user input value is used as a pointer value (which should be transparent to users).
  – A significant portion of vulnerabilities (33.6%) are due to errors in library functions or incorrect invocations of library functions.
• The FSM modeling led to a formal reasoning approach to examine pointer taintedness in applications.

Vulnerability breakdown (pie chart):
  Buffer Overflow     44%
  Other               33%
  Heap Corruption      8%
  Format String        7%
  Integer Overflow     6%
  Globbing             2%

Formal Analysis of Pointer Taintedness

• Pointer taintedness: a pointer value, including a return address, is derived directly or indirectly from user input (formally defined using equational logic).
• Provides a unifying perspective for reasoning about a significant number of security vulnerabilities.
• The notion of pointer taintedness enables:
  – Static analysis: reasoning about the possibility of pointer taintedness by source code analysis;
  – Runtime checking: inserting assertions in object code to check pointer taintedness at runtime;
  – Hardware architecture-based support to detect pointer taintedness.
• Current focus: extraction of security specifications of library functions based on pointer taintedness semantics.

Publications of My Research

Papers:

– J. Xu, S. Chen, Z. Kalbarczyk, R. K. Iyer. "An Experimental Study of Security Vulnerabilities Caused by Errors". DSN 2001.

– S. Chen, J. Xu, R. K. Iyer, K. Whisnant. "Modeling and Analyzing the Security Threat of Firewall Data Corruption Caused by Instruction Transient Errors". DSN 2002.

– S. Chen, Z. Kalbarczyk, J. Xu, R. K. Iyer. "A Data-Driven Finite State Machine Model for Analyzing Security Vulnerabilities". DSN 2003.

– S. Chen, K. Pattabiraman, Z. Kalbarczyk, R. K. Iyer. "Formal Reasoning of Various Categories of Widely Exploited Security Vulnerabilities Using Pointer Taintedness Semantics". IFIP Information Security Conference, 2004.

Security Vulnerability Report:

– S. Chen and J. Xu. "Bugtraq ID 6255: NULL HTTPD Heap Corruption Vulnerability". The Bugtraq List.

A Finite State Machine Approach for Analyzing Security Vulnerabilities

Overview of the Study

• An analysis of security vulnerability databases (CERT and Bugtraq).
• Examination of security vulnerabilities at the application source-code level.
• A security vulnerability usually consists of a series of vulnerabilities in multiple elementary activities. Each can be represented by a primitive FSM, indicating a simple predicate.
• The FSMs provide a formalism for reasoning about and describing security vulnerabilities.
• Usefulness of the formalism: discovery of the HTTP daemon heap overflow vulnerability.

Observation from Data Analysis

Vulnerability: #3163, Sendmail signed integer overflow
  Assigned category: Input validation error
  Description in Bugtraq report: A negative input integer is accepted as an array index.
  Elementary activity: Get an input integer

Vulnerability: #5493, FreeBSD System Call Signed Integer Vulnerability
  Assigned category: Boundary condition error
  Description in Bugtraq report: A negative value supplied for the argument allows exceeding the boundary of an array.
  Elementary activity: Use the integer as the index to an array

Vulnerability: #3958, RSYNC Signed Array Index Remote Code Execution Vulnerability
  Assigned category: Access validation error
  Description in Bugtraq report: A remotely supplied signed value is used as an array index, allowing the corruption of a function pointer or a return address.
  Elementary activity: Execute code referred to by a function pointer or a return address

The same kind of vulnerability can be classified into different categories. Why? Because each vulnerability involves multiple elementary activities, and each report emphasizes a different one.

Primitive FSM

We use a Primitive FSM (pFSM) to depict an elementary activity. It specifies a predicate (SPEC) that should be guaranteed in order to ensure security.

[Figure: a pFSM with three states (SPEC check state, reject state, accept state) and four transitions (SPEC_ACCEPT, SPEC_REJECT, IMPL_REJECT, IMPL_ACCEPT).]

NULL HTTPD Heap Corruption Vulnerabilities (Bugtraq #5774, #6255)

[Figure: a chain of four pFSMs across three operations. Labels recovered from the figure:]

Op 1: Read user input from a socket into a heap buffer
  pFSM1: get (contentLen, input)
    SPEC accept: contentLen >= 0
    SPEC reject: contentLen < 0
    Then: Calloc PostData[1024+contentLen]
  pFSM2: Copy input from the socket
    SPEC accept: length(input) <= Size(PostData)
    SPEC reject: Size(PostData) < length(input)
    If violated: B->fd and B->bk changed (heap structure corrupted)

Op 2: Free the buffer
  pFSM3: Calloc is called, with B->fd = A and B->bk = C
    Accept: B->fd and B->bk unchanged
    Reject: B->fd and B->bk changed
    When buf is freed, execute B->fd->bk = B->bk
    If violated: a function pointer is corrupted

Op 3: Manipulate the function pointer
  pFSM4: Load pFree to the memory during program initialization
    Accept: pFree unchanged
    Reject: pFree changed
    Execute pFree when function free is called
    If violated: attacker's malicious code is executed

Op 1: Read User Data from a Socket to a Heap Buffer

[Figure: pFSM1 and pFSM2 of the NULL HTTPD model. A "?" in the figure marks a check that the implementation does not perform.]
  pFSM1: get (contentLen, input), where contentLen is an integer and input is the string to be read from a socket
    SPEC accept: contentLen >= 0
    SPEC reject: contentLen < 0
    Then: Calloc PostData[1024+contentLen]
  pFSM2: Copy input from the socket to PostData by recv() call
    SPEC accept: length(input) <= Size(PostData)
    SPEC reject: length(input) > Size(PostData)

Vulnerable code:

    /* Get contentLen; a negative value is not rejected */
    PostData = calloc(contentLen + 1024, sizeof(char));
    x = 0; rc = 0;
    pPostData = PostData;
    do {
        rc = recv(sock, pPostData, 1024, 0);
        if (rc == -1) {
            closeconnect(sid, 1);
            return;
        }
        pPostData += rc;
        x += rc;
    } while ((rc == 1024) || (x < contentLen));
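The two predicates of pFSM1 and pFSM2 can be turned into explicit checks around the read loop. The following is a minimal sketch, not the NULL HTTPD fix: the helper names alloc_post_data and next_chunk and the constant EXTRA are assumptions introduced for illustration. It rejects a negative contentLen before allocation and limits each read request to the space left in the buffer.

```c
#include <stddef.h>
#include <stdlib.h>

#define EXTRA 1024  /* slack added to contentLen, and the chunk size per read */

/* pFSM1 predicate: reject a negative contentLen before it reaches calloc().
 * On success, stores the buffer capacity in *cap_out (hypothetical helper). */
char *alloc_post_data(long content_len, size_t *cap_out)
{
    if (content_len < 0)
        return NULL;
    *cap_out = (size_t)content_len + EXTRA;
    return calloc(*cap_out, sizeof(char));
}

/* pFSM2 predicate: never request more bytes than remain in the buffer.
 * Returns the size to pass to the next recv() call; 0 means the buffer is full. */
size_t next_chunk(size_t cap, size_t used)
{
    size_t room = (used < cap) ? cap - used : 0;
    return room < EXTRA ? room : EXTRA;
}
```

A read loop would pass next_chunk(cap, x) as the length argument of recv() and stop when it returns 0, so that length(input) <= Size(PostData) holds by construction.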

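Sendmail Debugging Function Signed Integer Overflow (Bugtraq #3163)

[Figure: a chain of three pFSMs across two operations. A "?" in the figure marks a check that the implementation does not perform. Labels recovered from the figure:]

Operation 1: Write integer i to tTvect[x]
  pFSM1: get text strings str_x and str_i
    SPEC accept: (integer represented by str_x) <= 2^31
    SPEC reject: (integer represented by str_x) > 2^31
    Implementation check: ?
    Then: convert str_i and str_x to integers i and x
  pFSM2:
    SPEC accept: 0 <= x <= 100
    SPEC reject: x < 0 or x > 100
    Implementation accept: x <= 100
    Implementation reject: x > 100
    Then: tTvect[x] = i
    If violated: the function pointer is tainted

Operation 2: Manipulate the function pointer
  pFSM3: Load the function pointer
    SPEC accept: addr_setuid unchanged
    SPEC reject: addr_setuid changed
    Implementation check: ?
    Then: execute code referred to by addr_setuid
    If violated: execute malicious code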
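The predicates of pFSM1 and pFSM2 translate into a range-checked conversion followed by a two-sided bounds check. The sketch below is illustrative only and is not Sendmail's code: the function name set_debug_level, the array size, and the choice of unsigned char elements are assumptions. The point is that checking only x <= 100, as the implementation in the model does, lets a negative x through.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define TT_SIZE 101                 /* valid indices are 0..100 */

unsigned char tTvect[TT_SIZE];      /* stand-in for the debug-level vector */

/* Returns 0 on success, -1 if any predicate is violated. */
int set_debug_level(const char *str_x, const char *str_i)
{
    char *end;
    long x, i;

    /* pFSM1: the text must denote an integer that converts without wrapping. */
    errno = 0;
    x = strtol(str_x, &end, 10);
    if (errno == ERANGE || end == str_x || *end != '\0' ||
        x > INT_MAX || x < INT_MIN)
        return -1;

    /* pFSM2: check both bounds, not only the upper one. */
    if (x < 0 || x > 100)
        return -1;

    errno = 0;
    i = strtol(str_i, &end, 10);
    if (errno == ERANGE || end == str_i || *end != '\0' ||
        i < 0 || i > UCHAR_MAX)
        return -1;

    tTvect[x] = (unsigned char)i;
    return 0;
}
```

With these checks the write tTvect[x] = i cannot reach memory outside the array, so the function pointer examined in pFSM3 stays unchanged.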

Modeled Vulnerabilities

• Signed Integer Overflow
• Heap Corruption
• Stack Overflow
• Format String Vulnerabilities
• File Race Conditions
• Some Input Validation Vulnerabilities

Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Pointer Taintedness Caused Vulnerabilities

• Format string vulnerability
  – Taint an argument pointer of functions such as printf, fprintf, sprintf and syslog.
• Stack smashing
  – Taint a return address.
• Heap corruption
  – Taint the free-chunk doubly-linked list of the heap.
• Glibc globbing vulnerabilities
  – User input resides in a location that is used as a pointer by the parent function of glob().

Example of Format String Vulnerability

Vulnerable code: recv(buf); printf(buf); /* should be printf("%s", buf) */

In vfprintf(), if (fmt points to "%n") then **ap = (character count)

User input: \xdd \xcc \xbb \xaa %d %d %d %n

[Figure: stack layout, high addresses at the top, stack growing toward low addresses. The buffer holds 0xaabbccdd followed by %d %d %d %n. fmt is the format string pointer and ap is the argument pointer. When fmt reaches %n, *ap is a tainted value.]

Taintedness Semantics (Memory Model)

• A store represents a snapshot of the memory state at a point in the program execution.
• For each memory location, we can evaluate two properties: content and taintedness (true/false).
• Operations on memory locations:
  – The fetch operation Ftch(S,A) gives the content of the memory address A in store S.
  – The location-taintedness operation LocT(S,A) gives the taintedness of the location A in store S.
• Operations on expressions:
  – The evaluation operation Eval(S,E) evaluates expression E in store S.
  – The expression-taintedness operation ExpT(S,E) computes the taintedness of expression E in store S.

Axioms of Eval and ExpT operations

  Eval(S, I) = I    // I is an integer constant
  Eval(S, ^ E1) = Ftch(S, Eval(S,E1))
  Eval(S, E1 + E2) = Eval(S, E1) + Eval(S, E2)
  Eval(S, E1 - E2) = Eval(S, E1) - Eval(S, E2)
  … …
  ExpT(S, I) = false
  ExpT(S, ^ E1) = LocT(S, Eval(S,E1))
  ExpT(S, E1 + E2) = ExpT(S,E1) or ExpT(S,E2)
  ExpT(S, E1 - E2) = ExpT(S,E1) or ExpT(S,E2)
  … …

Note: ^ is the dereference operator; ^100 gives the content of location 100.

E.g., is the expression (^100) - 2 tainted?
  ExpT(S, (^100) - 2) = ExpT(S, ^100) or ExpT(S, 2) = LocT(S,100) or false = LocT(S,100)
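The axioms of Eval and ExpT read directly as two recursive functions over an expression tree. The sketch below is a plain C rendering under stated assumptions: the fixed-size Store, the Expr type and its constructors are hypothetical, only the four listed expression forms are covered, and addresses are assumed to lie in 0..MEM_SIZE-1.

```c
#include <stdbool.h>
#include <stddef.h>

#define MEM_SIZE 256

/* A store: content and taintedness for every memory location. */
typedef struct {
    int  content[MEM_SIZE];
    bool tainted[MEM_SIZE];
} Store;

typedef enum { CONST, DEREF, ADD, SUB } Kind;

typedef struct Expr {
    Kind kind;
    int value;                      /* used by CONST only */
    const struct Expr *e1, *e2;     /* operands; e2 unused by DEREF */
} Expr;

int  Ftch(const Store *s, int a) { return s->content[a]; }
bool LocT(const Store *s, int a) { return s->tainted[a]; }

/* Eval(S,E): one case per axiom. */
int Eval(const Store *s, const Expr *e)
{
    switch (e->kind) {
    case CONST: return e->value;
    case DEREF: return Ftch(s, Eval(s, e->e1));
    case ADD:   return Eval(s, e->e1) + Eval(s, e->e2);
    case SUB:   return Eval(s, e->e1) - Eval(s, e->e2);
    }
    return 0;
}

/* ExpT(S,E): constants are untainted, a dereference takes the taintedness
 * of the location it reads, arithmetic is tainted if either operand is. */
bool ExpT(const Store *s, const Expr *e)
{
    switch (e->kind) {
    case CONST: return false;
    case DEREF: return LocT(s, Eval(s, e->e1));
    case ADD:
    case SUB:   return ExpT(s, e->e1) || ExpT(s, e->e2);
    }
    return false;
}
```

Building the tree for (^100) - 2 and calling ExpT on it returns exactly the taint bit of location 100, which matches the worked example on the slide.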

Semantics of Language L

• Extends the semantics proposed by Goguen and Malcolm.
• The following operations (arithmetic/logic) are defined:
  – +, -, *, /, %, !, &&, ||, !=, ==, ……
• The following instructions are defined:
  – mov [Exp1] <- Exp2
  – branch (Condition) Label
  – call FuncName(Exp1,Exp2,…)
• Axioms defining mov instruction semantics:
  – Specify the effects of applying the mov instruction on a store.
  – Allow taintedness to propagate from Exp2 to [Exp1].
• Axioms defining the semantics of recv (similarly, scanf, recvfrom):
  – Specify the memory locations tainted by the recv call.

Extracting Function Specifications by Theorem Prover

Workflow (from the slide diagram):

• Input: C source code of a library function.
• The C code is automatically translated to code in Language L.
• Theorem generation. The critical instructions are indirect writes. For each mov [^ E1] <- E2, generate theorems:
  a) E1 should not be tainted.
  b) The mov instruction should not taint any location outside the buffer pointed to by E1.
• The theorems are given to the ITP theorem prover.
• Output: a set of sufficient conditions that imply the validity of the theorems. They are the security specifications of the analyzed function.

Example: strcpy()

C source (L0 to L2 mark the lines referenced by the theorems):

    char * strcpy (char * dst, char * src) {
        char * res;
        res = dst;              /* L0 */
        while (*src != 0) {
            *dst = *src;        /* L1 */
            dst++;
            src++;
        }
        *dst = 0;               /* L2 */
        return res;
    }

Translated to Language L:

    0: mov [res] <- ^ dst
       lbl(#while#6)
       branch (^ ^ src is 0) #ex#while#6
    1: mov [^ dst] <- ^ ^ src
       mov [dst] <- (^ dst) + 1
       mov [src] <- (^ src) + 1
       branch true #while#6
       lbl(#ex#while#6)
    2: mov [^ dst] <- 0
       mov [ret] <- ^ res

Theorems generated and passed to the theorem prover:

a) Suppose S1 is the store before Line L1, then LocT(S1,dst) = false.
b) If S0 is the store before Line L0, and S2 is the store after Line L1, then
   I < Eval(S0, ^dst) or Eval(S0, ^dst+dstsize) <= I  =>  LocT(S2,I) = LocT(S0,I)
c) Suppose S3 is the store before Line L2, then LocT(S3,dst) = false.

Specifications Suggested by Theorem Prover

Suppose that when function strcpy() is called, the size of the destination buffer (dst) is dstsize and the length of the user input string (src) is srclen.

Specifications extracted by the theorem proving approach:

Documented in the Linux man page:
  – srclen <= dstsize
  – The buffers src and dst do not overlap in such a way that the buffer dst covers the NULL-terminator of the src string.

Not documented:
  – The buffers dst and src do not cover the function frame of strcpy.
  – Initially, dst is not tainted.
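The size and overlap specifications are the two that a caller can check directly at the call site. The wrapper below is a hedged sketch, not part of the analyzed library: checked_strcpy and ranges_overlap are hypothetical names, srclen is taken to be strlen(src) so the test uses srclen < dstsize to leave room for the terminator, and the overlap test is deliberately stricter than the extracted specification (it rejects any overlap at all). The two frame-related and taintedness specifications cannot be expressed in portable C and are not checked here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if [a, a+alen) and [b, b+blen) share at least one byte. */
static bool ranges_overlap(const char *a, size_t alen,
                           const char *b, size_t blen)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa < pb + blen && pb < pa + alen;
}

/* Copies src into dst only if the size and overlap preconditions hold.
 * Returns false and leaves dst untouched otherwise. */
bool checked_strcpy(char *dst, size_t dstsize, const char *src)
{
    size_t srclen = strlen(src);

    if (srclen >= dstsize)                               /* size specification */
        return false;
    if (ranges_overlap(dst, dstsize, src, srclen + 1))   /* overlap specification */
        return false;

    memcpy(dst, src, srclen + 1);                        /* includes the terminator */
    return true;
}
```

A caller would replace strcpy(dst, src) with checked_strcpy(dst, sizeof dst, src) wherever the destination size is known, and treat a false return as an input validation failure.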

Example Scenario

Can the extracted specifications be violated in application code?

Specification at risk: the destination buffer should not cover the function frame of strcpy.

    char input[240];
    void foo( ) {
        int offset;
        char buf[200];
        scanf("%s", input);
        offset = 200 - strlen(input);   /* negative when strlen(input) > 200 */
        strcpy(buf + offset, input);    /* dst then starts below buf */
    }

[Figure: stack layout, high addresses at the top, stack growing toward low addresses. Labels: frame of foo with buf; frame of strcpy with dst, src, Return Addr., Frame Pointer and res; the pointer buf+offset; index.]

Other Examples

• A simplified version of printf()
  – 55 lines of C code
  – Four security specifications are extracted, including one indicating the format string vulnerability.
• Function free() of a heap management system
  – 36 lines of C code
  – Seven security specifications are extracted, including several indicating heap corruption vulnerabilities.
• Socket read functions of Apache HTTPD and NULL HTTPD
  – The Apache function is proved to be free of pointer taintedness.
  – Two (known) vulnerabilities are exposed in the theorem proving process.

Summary

• FSM representation: decompose each vulnerability into multiple simple predicates (with real vulnerability examples).
• A common characteristic of many predicates: their violations result in pointer taintedness.
• Defined a memory model to reason about pointer taintedness.
• Developed a theorem proving approach to extract security specifications from library functions.

Future Directions

• Develop a VCGen (verification condition generator) to facilitate theorem proving (in progress).
• Apply the pointer taintedness analysis to a substantial number of commonly used library functions to extract their security specifications.
• Compiler techniques for inserting “guarding code” to check unproved properties at runtime.
• Explore the possibility of building the taintedness notion into virtual machines.
• Architectural support for pointer taintedness detection: a module working with RSE (Reliability and Security Engine).

Backup Slides

Format String Vulnerability

    int vfprintf (FILE *s, const char *format, va_list ap)
    {
        char * p;
        … …
        *(int *) va_arg (ap, void *) = count;      /* handling of %n */
        … …
    }

    int printf (const char *format, ...)
    {
        … …
        count = vfprintf (stdout, format, arg);
        … …
    }

    int i, j;
    int main()
    {
        char buf[100];
        *(unsigned int *)buf = (unsigned int)&i;   /* first 4 bytes: address of i (32-bit target) */
        *(buf+4) = 0;
        strcat(buf, "%d%d%d12345%n");
        printf(buf);
    }

[Figure: stack layout, high addresses at the top, stack growing toward low addresses. Frame of main: buf, holding 0x08049978 (the addr of i), %d%d%d, the string 12345, and %n. Frame of printf: format, ReturnAddr of Printf, FramePointer of Printf, arg, count. A 12-byte gap. Frame of vfprintf: ap, format, s, ReturnAddr of Vfprintf, FramePointer of Vfprintf, p. The pointers p and ap are shown at successive positions as the format string is processed.]

buf = \x78 \x99 \x04 \x08 %d %d %d '1' '2' '3' '4' '5' %n
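The attack depends on user data being interpreted as the format argument. A minimal sketch of the safe calling pattern follows; format_user_text is a hypothetical helper name, and snprintf is used instead of printf only so that the result can be inspected. With a constant "%s" format, directives inside the user text are copied as plain characters and %n is never executed.

```c
#include <stdio.h>

/* Formats user-controlled text as data, never as a format string.
 * Returns the value of snprintf: the length the full output would have. */
int format_user_text(char *out, size_t outsize, const char *user_text)
{
    return snprintf(out, outsize, "%s", user_text);
}
```

Passing the string "%d%d%d12345%n" to this helper yields the same 13 characters in out, with no argument pointer ever dereferenced on behalf of the input.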

Elementary Activities of Sendmail Vulnerability

[Figures: one pFSM per elementary activity. A "?" in a figure marks a check that the implementation does not perform.]

Elementary Activity 1: get user input (pFSM1)
  Get strings str_x and str_i, convert them to integers x and i.
  SPEC accept: (integer represented by str_x) <= 2^31
  SPEC reject: (integer represented by str_x) > 2^31
  Implementation check: ?

Elementary Activity 2: assign debug level (pFSM2)
  Starts after str_x and str_i are converted to integers x and i.
  SPEC accept: 0 <= x <= 100
  SPEC reject: x < 0 or x > 100
  Implementation accept: x <= 100
  Implementation reject: x > 100
  Then: tTvect[x] = i
  If violated: a function pointer (psetuid) is corrupted

Elementary Activity 3: manipulation of function pointer psetuid (pFSM3)
  Starting the sendmail program loads psetuid to the memory.
  SPEC accept: psetuid is unchanged
  SPEC reject: psetuid is changed
  Implementation check: ?
  Then: execute the code referred to by psetuid
  If violated: execute malicious code

Appropriateness of Dereference

• A data value x is appropriate to be dereferenced if and only if one of the following conditions is true, assuming Y and Z are integer constants:
  – x is &foo (foo is a program variable)
  – x is malloc(Y)
  – (recursive definition) there exist values a, b and c that are appropriate to dereference, and x = a + b - c + Z
• Theorems to prove for an indirect write mov [^ E1] <- E2:
  – E1 should be appropriate to dereference.
  – If E2 is not appropriate to dereference, then [^ E1] should not be appropriate to dereference.

About Equational Logic

A logic defined by equations. Equations are used to rewrite symbolic terms, by replacing the term on the left of the equation with the term on the right of the equation. The emphasis is on executability.

Define the natural numbers (NAT):

Operators:
  0 : a constant of NAT
  s_ : NAT -> NAT (successor operator)
  _+_ : NAT NAT -> NAT (addition operator)

Equations:
  0 + N = N
  (s M) + N = M + (s N)

Example:
  (s s s 0) + (s s 0)
  = (s s 0) + (s s s 0)
  = (s 0) + (s s s s 0)
  = 0 + (s s s s s 0)
  = s s s s s 0

Intuitively, this represents “3 + 2 = 5”.
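Executability here means that the two equations can be applied mechanically, left to right, until no rule matches. The sketch below illustrates this in C under a simplifying assumption: a numeral s^k 0 is represented by the count k of successor applications, and rewrite_add is a hypothetical name. Each loop iteration is one application of (s M) + N = M + (s N), and the final step applies 0 + N = N.

```c
/* Rewrites (s^m 0) + (s^n 0) to normal form and counts the rewrite steps.
 * Numerals are represented by their number of successor applications. */
unsigned rewrite_add(unsigned m, unsigned n, unsigned *steps)
{
    *steps = 0;
    while (m > 0) {     /* rule: (s M) + N = M + (s N) */
        m -= 1;
        n += 1;
        *steps += 1;
    }
    *steps += 1;        /* rule: 0 + N = N */
    return n;
}
```

For the slide's example, rewrite_add(3, 2, &steps) returns 5 after four rewrite steps, one per equality sign in the derivation.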

Semantics of mov and recv

Axioms of the mov instruction

Ftch((S ; mov [E1] <- E2),X) = Eval(S,E2) if (Eval(S,E1) is X) .

Ftch((S ; mov [E1] <- E2),X) = Ftch(S,X) if not (Eval(S,E1) is X) .

LocT((S ; mov [E1] <- E2),X) = ExpT(S,E2) if (Eval(S,E1) is X) .

LocT((S ; mov [E1] <- E2),X) = LocT(S,X) if not (Eval(S,E1) is X) .

Semantics of recv (similarly, scanf, recvfrom)

– LocT(S ; call recv (sock , buf , len, flag), A) = true if Eval(S,buf) <= A and A < Eval(S, buf + len) .

– LocT(S ; call recv (sock , buf , len, flag), A) = LocT(S, A) otherwise .
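Read operationally, these axioms describe how one instruction turns a store into the next: mov changes exactly one location (content and taint bit), and recv sets the taint bit of every location in [buf, buf + len). The following sketch models that with a shadow taint array. The Store layout and the names apply_mov and apply_recv are assumptions, operands are passed already evaluated in the old store, and the bytes written by recv are not modeled because the axioms only constrain taintedness.

```c
#include <stdbool.h>

#define MEM_SIZE 256

/* A store: content and taintedness for every memory location. */
typedef struct {
    int  content[MEM_SIZE];
    bool tainted[MEM_SIZE];
} Store;

/* S ; mov [E1] <- E2, with addr = Eval(S,E1), value = Eval(S,E2) and
 * taint = ExpT(S,E2). Every other location keeps its content and taint. */
void apply_mov(Store *s, int addr, int value, bool taint)
{
    s->content[addr] = value;
    s->tainted[addr] = taint;
}

/* S ; call recv(sock, buf, len, flag): each location A with
 * buf <= A < buf + len becomes tainted; all others are unchanged. */
void apply_recv(Store *s, int buf, int len)
{
    for (int a = buf; a < buf + len; a++)
        s->tainted[a] = true;
}
```

Copying a received value elsewhere with apply_mov carries the taint bit along, which is the propagation from Exp2 to [Exp1] that the mov axioms allow.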

Related Work

• Security modeling
  – Sheyner and Wing: attack graphs
  – Ortalo and Deswarte: Markov models
• Static code analysis
  – Buffer overflow detection: Wagner, many others
  – Format string detection: CQUAL, SPLINT
  – Assembly code verification: Proof-Carrying Code
  – Generic (annotation based): SPLINT, Eau Claire
• Taintedness analysis
  – Perl runtime
  – CQUAL and SPLINT: taintedness of program variables.
    » A symbol gets tainted only if an explicit C statement passes a tainted value to it by assignment, argument passing or function return. No underlying memory model.
    » Not sufficient to detect real pointer taintedness vulnerabilities.

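Position My Work

[Figure: three boxes labeled Library Functions, Security Specs and Application Code. "My work" is placed between Library Functions and Security Specs; "Existing static analysis tools" is placed between Security Specs and Application Code.]

Example security specs:
  – src_len < dst_size (strcpy)
  – src and dst do not overlap (strcpy)
  – Do not free a stack buffer
  – Do not double free a buffer
  – First argument of printf cannot come from user
  – … …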

Presentation Outline

• A Brief Description of FSM Approach of Modeling and Analyzing Security Vulnerabilities
• Real Examples of Pointer Taintedness
• Definition of Pointer Taintedness in Equational Logic
• Extraction of Function Specifications by Theorem Proving
• Summary and Future Directions

Extraction of Security Specs of Library Functions using Pointer Taintedness

• A formal approach to reason about potential vulnerabilities in library source code.
• Reasoning based on a hypothetical memory model: a boolean property, taintedness, is associated with each memory location.
• The semantics of pointer taintedness is defined in equational logic.
• A theorem prover is employed to extract security specifications of library functions.
• Security specifications extracted by the analysis:
  – expose different classes of known security vulnerabilities, such as format string, heap corruption and buffer overflow vulnerabilities;
  – indicate function invocation scenarios that may expose new vulnerabilities.

Observations from Data Analysis (cont.)

• Exploiting a vulnerability involves multiple vulnerable operations on several objects.
• Exploits must pass through multiple elementary activities, each providing an opportunity for performing a security check.
• For each elementary activity, the vulnerability data and corresponding code inspections allow us to define a predicate which, if violated, will result in a security vulnerability.
