Theorem Proving and Data Structure Verification Charles Bouillaguet Viktor Kuncak, MIT Computer Science and Artificial Intelligence Lab Spring-Summer 2006

Jan 03, 2016


Starters

You MUST ask questions

(c) Jean Daligault


Inconsistent data structures

• Can cause program crashes

• Looping

• Unexpected outcome of operations

• Removing two instead of one element

[Figure: list nodes linked by next and prev pointers]


Implementing data structures is hard

• Often small but complex code

• Lots of pointers

• Unbounded, dynamic allocation

• Complex shape invariants

• Trees, directed acyclic graphs, parent pointers

• Properties involving arithmetic (ordering)

• Need strong invariants to guarantee correctness

• e.g. lookup in an ordered tree needs sortedness


How to obtain reliable data structure implementations?

• Formal methods:

• Build a proof that the program is correct

• for all possible program executions (sound)

• Verified properties:

• Programs do not crash

• Data structure invariants are preserved

• Data structure content is correctly updated

• Goal: automated verification


Abstract view of a program

• Sequence of instructions

• Instructions mutate the program state

• State = values of the variables

• We want to avoid some states

• {x = null} before y = x.next

• {n = 0} before m = k / n

• for each program point, define the set of acceptable states we want to stay in to avoid crashes

• try to prove property: all instructions yield an acceptable state


Not enough

• Avoiding crashes is good

• Checking that the program does what it is supposed to do is better !

• for some definitions of “checking”, this is an undecidable problem

• i.e. no algorithm can do it in all cases

• We don’t care :-)

• How to do that

• see rest of the presentation ?


Checking correctness of behavior

• Procedures have contracts

• Contracts have 2 parts

• a require clause (precondition):

• property of the state at call site (what the procedure needs)

• an ensure clause (postcondition):

• property of the returned state (what the procedure provides)

• Contracts (specification) describe what the code (implementation) is supposed to do


Examples

• Euclidean division

• Requires a state where 0 ≤ m ≤ n

• Ensures a state where n = q ⋅ m + r ∧ (m ≠ 0 ⇒ 0 ≤ r < m)

• Removal in a linked list

• Requires a state where

• following field next from object first defines a list (no loop, unique parent...)

• object o is in content of the list

• Ensures a state where

• the list is still a list (looks simple, but tricky in practice)

• new content = old content - {o}
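A contract can be read as a pair of executable checks. The sketch below is only an illustration and not part of the slides: `ContractSketch` and `divide` are hypothetical names, the require clause is tested on entry and the ensure clause on exit. Handling m = 0 by returning q = 0 and r = n is an assumption, chosen so that the stated postcondition still holds in that case.

```java
/** Hypothetical illustration: the Euclidean division contract as runtime checks. */
final class ContractSketch {
    /** Returns {q, r} such that n = q*m + r and (m != 0 implies 0 <= r < m). */
    static int[] divide(int n, int m) {
        // require clause (precondition): 0 <= m <= n
        if (!(0 <= m && m <= n)) {
            throw new IllegalArgumentException("precondition violated: need 0 <= m <= n");
        }
        int q = 0;
        int r = n;
        // Assumed treatment of m == 0: the loop is skipped, so q = 0 and r = n.
        while (m != 0 && r >= m) {
            r -= m;
            q++;
        }
        // ensure clause (postcondition): n = q*m + r and (m != 0 implies 0 <= r < m)
        boolean post = (n == q * m + r) && (m == 0 || (0 <= r && r < m));
        if (!post) {
            throw new IllegalStateException("postcondition violated");
        }
        return new int[] { q, r };
    }
}
```

Runtime checks only cover the executions that are actually run; the verification approach in the rest of the talk aims to establish the same property for all executions.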


Assume-guarantee verification

• Check that code does what contract says !

• All states satisfying the precondition must result in states satisfying the postcondition

[Figure: two diagrams, each mapping the set of input states (precondition, 0<m<n) to a set of resulting states, drawn against the set of correct states (postcondition). Correct (div): all resulting states satisfy the postcondition. Incorrect (div’): some resulting states do not satisfy the postcondition.]


How to check that ?

• Set of states = logic formula

• free variables = variables of the program

• set inclusion = implication

• To verify procedure:

• compute φ (depends on F and pre)

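• check that φ ⇒ post

[Figure: F maps the set pre to the set φ. Correct case: φ lies inside post (φ ⇒ post). Incorrect case: φ is not contained in post (¬(φ ⇒ post)). A formula φ is identified with the set {x | φ}.]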


Computing reachable states

• How to compute set of reachable states ?

• Strongest Post-condition

• smallest set containing all the possible outcomes of F starting from precondition ϑ.

[Figure: F maps the set pre to sp(F, ϑ), drawn against the set post]
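On a small finite state space the strongest postcondition can be computed by plain enumeration, which makes the "set inclusion = implication" reading concrete. The sketch below is a hypothetical illustration, not how a verifier works on real programs: the state is a single integer variable, F is a function on it, and the bounds `lo` and `hi` are assumptions introduced to keep the enumeration finite.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

/** Hypothetical illustration: sp by enumeration over a one-variable, bounded state space. */
final class StrongestPostSketch {
    /** Image of the precondition under f: every outcome of f started from a state in pre. */
    static Set<Integer> sp(IntUnaryOperator f, IntPredicate pre, int lo, int hi) {
        Set<Integer> reachable = new TreeSet<>();
        for (int x = lo; x <= hi; x++) {
            if (pre.test(x)) {
                reachable.add(f.applyAsInt(x));
            }
        }
        return reachable;
    }

    /** Checks that sp(F, pre) is a subset of post, i.e. that sp(F, pre) implies post. */
    static boolean verifies(IntUnaryOperator f, IntPredicate pre, IntPredicate post,
                            int lo, int hi) {
        for (int y : sp(f, pre, lo, hi)) {
            if (!post.test(y)) {
                return false; // a reachable state violates the postcondition
            }
        }
        return true;
    }
}
```

For example, `verifies(x -> x - 1, x -> x >= 0, y -> y >= 0, -10, 10)` is false because the input state 0 leads to -1, which is the "incorrect" picture from the previous slide.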


Alternative verification approach

• Other possibility: Weakest Precondition

• symmetric: biggest set of states from which post-condition is enforced (depends on F and post)

[Figure: two diagrams comparing pre with wp(F, post). Correct (F): pre ⇒ wp(F, post), all states satisfying the precondition lead to states satisfying the postcondition. Incorrect (F’): ¬(pre ⇒ wp(F’, post)), some state satisfying the precondition leads to a postcondition violation.]


Weakest Precondition computation

• Instruction by instruction

• wp(I_1 ; I_2, post) = wp(I_1, wp(I_2, post))

• Starting from the end

• F = I_1 ; ... ; I_{n-1} ; I_n

[Figure: the chain I_1, ..., I_{n-2}, I_{n-1}, I_n followed by post, annotated from right to left with wp(I_n, post), wp(I_{n-1} ; I_n, post), ..., wp(F, post)]


WP for individual instructions

• some instructions are easy (assignment)

• some are hard

• if

• loops

• procedure calls

• exceptions

[Slide annotations on the hard cases: “multiple possible outcomes”, “can completely change state”]


Loop problems

• example loop

• Problems

• Properties of x and y after n iterations ?

• Depends on properties after (n-1) iterations

• Depends on properties after (n-2) iterations

• ....

• And n only known at run-time

• possibly unbounded number of iterations


Loop problems

• example loop

• program state after the loop ?

[Figure: the initial state is known; the state after the loop is unconstrained (zero knowledge)]

• cannot simulate an unbounded number of iterations by symbolic execution


Solution

• Describe state mutations by loop invariant

• Property I of state

• true at the beginning of every loop iteration, however many iterations have run

• therefore true when the loop ends

• constrains the state space during and after the loop

• by induction, reduces to

• I(state_0)

• ∀n. I(state_n) ⇒ I(state_{n+1})

I holds initially + I is inductive ⇒ ∀k. I(x_k, y_k)


Back to example

[Figure: the invariant I(i, x_i, y_i) restricts the domain of (x, y); panels show the state after 0, 5 and 15 iterations]


Back to example

• Unrealistic drawing

[Figure: “fantasy invariant” versus “real invariant”]

I(x_i, y_i) ≡ x_i^2 + x_i y_i + y_i^2 = x_0^2 + x_0 y_0 + y_0^2
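The loop from the slides was shown as an image and is not in this transcript, so the sketch below uses a stand-in: the body (x, y) := (-y, x + y) is an assumption, picked only because it preserves x^2 + xy + y^2. The hypothetical `LoopInvariantSketch.run` checks the two proof obligations at run time: the invariant holds initially, and each iteration preserves it.

```java
/** Hypothetical illustration: checking a loop invariant on a stand-in loop body. */
final class LoopInvariantSketch {
    /** The quantity the invariant keeps constant: x^2 + x*y + y^2. */
    static long q(long x, long y) {
        return x * x + x * y + y * y;
    }

    /** Runs n iterations of the assumed body (x, y) := (-y, x + y); returns the final {x, y}. */
    static long[] run(long x0, long y0, int n) {
        long x = x0;
        long y = y0;
        final long initial = q(x0, y0);
        // Base case: I(state_0)
        if (q(x, y) != initial) {
            throw new IllegalStateException("invariant does not hold initially");
        }
        for (int i = 0; i < n; i++) {
            long nextX = -y;
            long nextY = x + y;
            x = nextX;
            y = nextY;
            // Inductive step: I(state_i) implies I(state_{i+1})
            if (q(x, y) != initial) {
                throw new IllegalStateException("invariant broken at iteration " + (i + 1));
            }
        }
        return new long[] { x, y };
    }
}
```

Whatever n turns out to be at run time, the final state is known to lie on the curve q(x, y) = q(x0, y0), which is the kind of constraint the invariant is meant to provide.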


Small notes on loop invariants

• There is always one (“True”)

• There is a strongest one (i.e. smallest set)

• Take (infinite ?) conjunction of all of them (ouch)

• Need one strong enough to prove the postcondition

• “True” usually not strong enough

• Techniques to find them automatically

• Slow, do not always work, interesting, out of scope

• Recursive code eliminates problem

• but creates some others


Real-life loop invariant

• Removal of an element in a linked list

public void remove (Object o1) {
    Node prev = null;
    Node current = first;
    while (current.data != o1) {
        prev = current;
        current = current.next;
    }
    ...
}

Loop invariant:

current ≠ null
∃n. n ≠ null ∧ o1 = n.data ∧ n ∈ current.next*
current ∈ first.next*
prev ≠ null ⇒ prev.next = current
prev = null ⇒ current = first

[Slide annotations: scan the list to find element to remove; prev points to next (+ special empty case)]

end of loop: knows that prev.next = current, current ≠ null, current.data = o1
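The invariant can also be exercised at run time before attempting a proof. The sketch below is a hypothetical illustration: `CheckedList`, `addFirst`, `checkInvariant` and `content` are names introduced here, and the unlinking code that replaces the elided part of `remove` is an assumption read off the field updates that appear in the verification condition shown later. The clause current ∈ first.next* is not checked.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration: the remove loop with its invariant checked at run time. */
final class CheckedList {
    static final class Node {
        Object data;
        Node next;
        Node(Object data, Node next) { this.data = data; this.next = next; }
    }

    Node first;

    void addFirst(Object o) { first = new Node(o, first); }

    /** Requires: o1 is in the content of the list. */
    void remove(Object o1) {
        Node prev = null;
        Node current = first;
        checkInvariant(prev, current, o1);      // invariant holds initially
        while (current.data != o1) {
            prev = current;
            current = current.next;
            checkInvariant(prev, current, o1);  // invariant is preserved
        }
        // Assumed completion of the part elided on the slide.
        if (prev != null) {
            prev.next = current.next;
        } else {
            first = current.next;
        }
        current.next = null;
    }

    private void checkInvariant(Node prev, Node current, Object o1) {
        if (current == null) throw new IllegalStateException("current = null");
        boolean found = false;                  // some node reachable from current holds o1
        for (Node n = current; n != null; n = n.next) found |= (n.data == o1);
        if (!found) throw new IllegalStateException("o1 not reachable from current");
        if (prev != null && prev.next != current) throw new IllegalStateException("prev.next != current");
        if (prev == null && current != first) throw new IllegalStateException("current != first");
    }

    /** Elements in list order, used to observe "new content = old content - {o}". */
    List<Object> content() {
        List<Object> out = new ArrayList<>();
        for (Node n = first; n != null; n = n.next) out.add(n.data);
        return out;
    }
}
```

The first invariant clause is what makes `current.data` and `current.next` safe to dereference inside the loop, and the last two clauses are what the unlinking step relies on once the loop exits.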


How do (most) verification systems work ?

• Compute weakest precondition

• prove resulting formula

• external theorem prover

• Weakest precondition directly from Java: hard

• Simplification passes before weakest precondition


Three-address code

Before:

new_left = remove(k, t.left);
r.data = t.data;

After:

tmp_27 = t.left;
tmp_28 = FuncTree.remove(k, tmp_27);
new_left = tmp_28;

tmp_35 = t.data;
r.data = tmp_35;

• expression simplification by introduction of fresh variables

• sequentializes side effects

• makes null pointer checks easier to insert


Guarded Command Language

• Java contains nasty things

• if, while loops, method calls

• exceptions, dynamic dispatch

• First, convert Java to a simpler, more abstract language

• Then, use the simpler language to compute weakest preconditions


Guarded Command Language

• Adapted from Dijkstra ‘76

Statement        Meaning
assert φ         if φ is true, continue. Fail otherwise
assume φ         if φ is false, succeed. Continue otherwise
stmt1 ; stmt2    sequential composition
stmt1 ⎮ stmt2    non-deterministic choice
havoc x          non-deterministically change variable x
x := e           assign e to x


if, calls, Loops ?

• ...can be expressed

• using contracts and loop invariants

• IF encoded using non-deterministic choice

• both cases can happen

[| if (cond) then branch_true else branch_false |]  =

(assume cond ; [| branch_true |])
⎮ (assume ¬cond ; [| branch_false |])


if, calls, Loops ?

• Procedure calls

• could be inlined

• exponential blowup !

• encoded using contracts

• modularity ! Check 1 procedure at a time

• assume that all other procedures are correct

[| r = m(x, y, z) |]  =

assert m.precondition;
havoc r;
havoc {vars modified by m};
assume m.postcondition;


if, calls, Loops ?

• loops: induction on invariant (no termination check)

[| while (condition) /*: invariant */ {body} |]  =

assert invariant;
havoc vars(body);
assume invariant;
((assume condition; [| body |]; assert invariant; assume False)
 ⎮ assume not condition);

• assert invariant: the invariant holds initially

• havoc vars(body); assume invariant: no assumptions on loop variables except that the invariant holds

• assume condition: the loop condition holds before the loop body

• assert invariant (after the body): the invariant still holds

• assume False: this branch succeeds

• assume not condition: the condition does not hold and verification continues


WP from guarded command language

• Easy from guarded command

• Note

• do not enforce loop/recursion termination...

stmt             wlp(stmt, ψ)
assert φ         φ ∧ ψ
assume φ         φ ⇒ ψ
stmt1 ; stmt2    wlp(stmt1, wlp(stmt2, ψ))
stmt1 ⎮ stmt2    wlp(stmt1, ψ) ∧ wlp(stmt2, ψ)
havoc x          ∀x. ψ
x := e           ψ{x ↦ e}
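Each row of the wlp table becomes one line of code if formulas are modelled as predicates on states rather than as syntax. The sketch below is a hypothetical illustration and not Jahob's implementation (Jahob works on symbolic formulas): `WlpSketch` and all its members are names introduced here, and the bounded domain LO..HI is an assumption that lets ∀x be checked by enumeration. `absProgram` is the guarded-command encoding of an if statement computing an absolute value.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration: the wlp table, with formulas modelled as predicates on states. */
final class WlpSketch {
    interface Formula { boolean holds(Map<String, Integer> state); }
    interface Expr { int eval(Map<String, Integer> state); }
    /** A guarded command, identified with its transformer psi -> wlp(stmt, psi). */
    interface Stmt { Formula wlp(Formula psi); }

    // Assumption: variables range over LO..HI so that "forall x" can be enumerated.
    static final int LO = -3, HI = 3;

    /** Copy of state s in which variable x has value v (substitution on the state side). */
    static Map<String, Integer> updated(Map<String, Integer> s, String x, int v) {
        Map<String, Integer> t = new HashMap<>(s);
        t.put(x, v);
        return t;
    }

    static Stmt assertS(Formula phi) { return psi -> s -> phi.holds(s) && psi.holds(s); }
    static Stmt assume(Formula phi) { return psi -> s -> !phi.holds(s) || psi.holds(s); }
    static Stmt seq(Stmt a, Stmt b) { return psi -> a.wlp(b.wlp(psi)); }
    static Stmt choice(Stmt a, Stmt b) {
        return psi -> s -> a.wlp(psi).holds(s) && b.wlp(psi).holds(s);
    }
    static Stmt assign(String x, Expr e) { return psi -> s -> psi.holds(updated(s, x, e.eval(s))); }
    static Stmt havoc(String x) {
        return psi -> s -> {
            for (int v = LO; v <= HI; v++) {
                if (!psi.holds(updated(s, x, v))) return false;
            }
            return true;
        };
    }

    /** Validity over the bounded domain, for the two variables x and y. */
    static boolean valid(Formula f) {
        for (int x = LO; x <= HI; x++) {
            for (int y = LO; y <= HI; y++) {
                Map<String, Integer> s = new HashMap<>();
                s.put("x", x);
                s.put("y", y);
                if (!f.holds(s)) return false;
            }
        }
        return true;
    }

    /** (assume x < 0 ; y := -x) | (assume not (x < 0) ; y := x) */
    static Stmt absProgram() {
        Formula negative = s -> s.get("x") < 0;
        return choice(
            seq(assume(negative), assign("y", s -> -s.get("x"))),
            seq(assume(s -> !negative.holds(s)), assign("y", s -> s.get("x"))));
    }
}
```

With the postcondition y ≥ 0, `valid(absProgram().wlp(post))` holds, whereas the wlp of `assert x ≠ 0` with respect to True is not valid because the state x = 0 falsifies it.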


Real life verification condition

• Same List.remove method((List_inlist = (% this. {n. ((rtrancl_pt (% x y. ((Node_next x) = y)) (List_first this) n) & (n ~= null))})) --> ((List_content = (% this. {x. ((x ~= null) & (EX n. ((x = (Node_data n)) & (n : (List_inlist this)))))})) --> ((((Node_data null) = null) & ((Node_next null) = null) & ((List_first null) = null) & (ALL xObj. (xObj : Object)) & ((Node Int List) = {null}) & ((Array Int List) = {null}) & ((Array Int Node) = {null}) & (null : Object_alloc) & (pointsto Node Node_data Object) & (pointsto Node Node_next Node) & (pointsto List List_first Node) & comment ''unalloc_lonely'' (ALL x. ((x ~: Object_alloc) --> ((ALL y. ((Node_data y) ~= x)) & (ALL y. ((Node_next y) ~= x)) & (ALL y. ((List_first y) ~= x)) & ((Node_data x) = null) & ((Node_next x) = null) & ((List_first x) = null)))) & (o1 : (List_content this)) & comment ''List_PrivateInv '' (ALL this. (((this : Object_alloc) & (this : List)) --> (ALL n. ((n : (List_inlist this)) --> ((Node_data n) ~= null))))) & comment ''List_PrivateInv '' (tree [List_first, Node_next]) & comment ''List_PrivateInv '' (ALL this. (((this : Object_alloc) & (this : List)) --> (ALL n m. (((n : (List_inlist this)) & (m : (List_inlist this)) & ((Node_data n) = (Node_data m))) --> (n = m)))))) --> ((comment ''thisNotNull'' (this ~= null) & comment ''thisType'' ((this : List) & (this : Object_alloc)) & comment ''o1_type'' ((o1 : Object) & (o1 : Object_alloc))) --> (comment ''InvHoldsInitially'' (((List_inlist this) = {n. ((rtrancl_pt (% x y. ((Node_next x) = y)) (List_first this) n) & (n ~= null))}) & ((List_first this) ~= null) & (EX n. ((o1 = (Node_data n)) & (n : (List_inlist this)))) & (rtrancl_pt (% x y. ((Node_next x) = y)) (List_first this) (List_first this)) & ((null ~= null) --> ((Node_next null) = (List_first this))) & ((null = null) --> ((List_first this) = (List_first this)))) & (((inlist_curr_28 = {n. ((rtrancl_pt (% x y. ((Node_next x) = y)) current_25 n) & (n ~= null))}) & (current_25 ~= null) & (EX n. 
((o1 = (Node_data n)) & (n : inlist_curr_28))) & (rtrancl_pt (% x y. ((Node_next x) = y)) (List_first this) current_25) & ((prev_27 ~= null) --> ((Node_next prev_27) = current_25)) & ((prev_27 = null) --> (current_25 = (List_first this)))) --> (comment ''NullCheckFieldNode_data'' (current_25 ~= null) & (((Node_data current_25) ~= o1) --> (comment ''NullCheckFieldNode_next'' (current_25 ~= null) & (EX n. ((o1 = (Node_data n)) & (n : (inlist_curr_28 - {current_25})))) & ((EX n. ((o1 = (Node_data n)) & (n : (inlist_curr_28 - {current_25})))) --> comment ''InvPreservation'' (((inlist_curr_28 - {current_25}) = {n. ((rtrancl_pt (% x y. ((Node_next x) = y)) (Node_next current_25) n) & (n ~= null))}) & ((Node_next current_25) ~= null) & (EX n. ((o1 = (Node_data n)) & (n : (inlist_curr_28 - {current_25})))) & (rtrancl_pt (% x y. ((Node_next x) = y)) (List_first this) (Node_next current_25)) & ((current_25 ~= null) --> ((Node_next current_25) = (Node_next current_25))) & ((current_25 = null) --> ((Node_next current_25) = (List_first this))))))) & ((~((Node_data current_25) ~= o1)) --> (((prev_27 ~= null) --> (comment ''NullCheckFieldNode_next'' (current_25 ~= null) & comment ''ObjNullCheck'' (prev_27 ~= null) & comment ''ObjNullCheck'' (current_25 ~= null) & ((List_inlist_10 = (% this__9. {n. ((rtrancl_pt (% x y. ((((Node_next(prev_27 := (Node_next current_25)))(current_25 := null)) x) = y)) (List_first this__9) n) & (n ~= null))})) --> ((List_content_9 = (% this__10. {x. ((x ~= null) & (EX n. ((x = (Node_data n)) & (n : (List_inlist_10 this__10)))))})) --> (comment ''currentNotInInlist'' ((List_inlist_10 this) = ((List_inlist this) - {current_25})) & (comment ''currentNotInInlist'' ((List_inlist_10 this) = ((List_inlist this) - {current_25})) --> (comment ''otherInlistSame'' (ALL x. (((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_inlist_10 x) = (List_inlist x)))) & (comment ''otherInlistSame'' (ALL x. 
(((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_inlist_10 x) = (List_inlist x)))) --> (comment ''otherContentSame'' (ALL x. (((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_content_9 x) = (List_content x)))) & (comment ''otherContentSame'' (ALL x. (((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_content_9 x) = (List_content x)))) --> ((((List_content_9 this) = ((List_content this) - {o1})) & (ALL framedObj. (((framedObj : Object_alloc) & (framedObj : List) & (framedObj ~= this)) --> ((List_content_9 framedObj) = (List_content framedObj))))) & comment ''List_PrivateInv '' (ALL this__11. (((this__11 : Object_alloc) & (this__11 : List)) --> (ALL n. ((n : (List_inlist_10 this__11)) --> ((Node_data n) ~= null))))) & comment ''List_PrivateInv '' (tree [List_first, ((Node_next(prev_27 := (Node_next current_25)))(current_25 := null))]) & comment ''List_PrivateInv '' (ALL this__12. (((this__12 : Object_alloc) & (this__12 : List)) --> (ALL n m. (((n : (List_inlist_10 this__12)) & (m : (List_inlist_10 this__12)) & ((Node_data n) = (Node_data m))) --> (n = m)))))))))))))))) & ((~(prev_27 ~= null)) --> (comment ''NullCheckFieldNode_next'' ((List_first this) ~= null) & comment ''ObjNullCheck'' (current_25 ~= null) & ((List_inlist_2 = (% this__5. {n. ((rtrancl_pt (% x y. (((Node_next(current_25 := null)) x) = y)) ((List_first(this := (Node_next (List_first this)))) this__5) n) & (n ~= null))})) --> ((List_content_1 = (% this__6. {x. ((x ~= null) & (EX n. ((x = (Node_data n)) & (n : (List_inlist_2 this__6)))))})) --> (comment ''currentNotInInlist'' ((List_inlist_2 this) = ((List_inlist this) - {current_25})) & (comment ''currentNotInInlist'' ((List_inlist_2 this) = ((List_inlist this) - {current_25})) --> (comment ''otherInlistSame'' (ALL x. (((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_inlist_2 x) = (List_inlist x)))) & (comment ''otherInlistSame'' (ALL x. 
(((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_inlist_2 x) = (List_inlist x)))) --> (comment ''otherContentSame'' (ALL x. (((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_content_1 x) = (List_content x)))) & (comment ''otherContentSame'' (ALL x. (((x : Object_alloc) & (x : List) & (x ~= this)) --> ((List_content_1 x) = (List_content x)))) --> ((((List_content_1 this) = ((List_content this) - {o1})) & (ALL framedObj. (((framedObj : Object_alloc) & (framedObj : List) & (framedObj ~= this)) --> ((List_content_1 framedObj) = (List_content framedObj))))) & comment ''List_PrivateInv '' (ALL this__7. (((this__7 : Object_alloc) & (this__7 : List)) --> (ALL n. ((n : (List_inlist_2 this__7)) --> ((Node_data n) ~= null))))) & comment ''List_PrivateInv '' (tree [(List_first(this := (Node_next (List_first this)))), (Node_next(current_25 := null))]) & comment ''List_PrivateInv '' (ALL this__8. (((this__8 : Object_alloc) & (this__8 : List)) --> (ALL n m. (((n : (List_inlist_2 this__8)) & (m : (List_inlist_2 this__8)) & ((Node_data n) = (Node_data m))) --> (n = m)))))))))))))))))))))))))

7 kB


My work begins NOW

• We have generated a big formula

• We need to prove that it is valid (i.e. True).

• But first, a word about the system itself

• Jahob system for verifying data structures and programs.

• Input language: subset of Java

• Specification language: subset of Isabelle

• implementation of the previous verification scheme


Jahob system, in theory...

[Figure: automated reasoning infrastructure]


...And in practice

• Written in OCaml

• 4 active developers

• in 3 countries

• including myself

• 44.5 k lines of code

• 96 modules

• CVS repository

• big test suite

[Figure: automated reasoning]


Why automated reasoning ?

• Generating the formula was EASY !

• Polynomial time if we fix the system

• Proving it is hard

• complexity: reasonable tower of exponentials (or worse)

• How do we prove it ?

• humans cannot program correctly, how could they prove correctly ?

• Machine-checked proof: use of proof assistants

• COQ, Isabelle, ACL2, Nuprl, HOL, ...


Proof assistants ?

• Check that a “proof” is correct

• “proof” of a part of a VC from Jahob, in COQ

eauto ; intros. intuition ; subst .
apply Extensionality_Ensembles.
unfold Same_set. unfold Included. unfold In. unfold In in H1.
intuition. destruct H0.
destruct (eq_nat_dec x1 ArraySet_size).
subst. rewrite arraywrite_match in H0 ; auto.
intuition. subst. apply Union_intror. auto with sets.
assert (x1 < ArraySet_size). omega. clear n.
apply Union_introl. rewrite arraywrite_not_same_i in H0.
unfold In. exists x1. intuition. omega.

inversion H0 ; subst ; clear H0.
unfold In in H3. destruct H3. exists x1. intuition.
rewrite arraywrite_not_same_i. intuition ; omega. omega.
exists ArraySet_size. intuition.
inversion H3. subst. rewrite arraywrite_match ; trivial.

1 hour to write.

To verify a whole procedure, dozens of these


Proof assistants VS Theorem provers

• Proof assistants: check correctness of a proof

• good point: no complexity limit for proofs

• bad point: long and painful to use (wizardry required)

• Theorem provers: check validity of formula

• good point: no need to write proof ! Automatic !

• bad point: sometimes takes forever and crashes

• not surprising: problem is undecidable...

• In practice: can work, but can take long without hacks

• I worked here


My contributions

• Main work

• translation HOL->FOL

• interface to E, Vampire, SPASS

• 6 modules

• 3k lines of code

• Various hacks

• COQ interface

• bugfixes

• improvements


HOL ? FOL ? E ? SPASS ?

• HOL: your usual mathematical language (convenient to write specifications); expressive, but hard to prove automatically

• FOL: restricted logic (no sets, ...), but has better theorem provers

• E, SPASS: 1st-order theorem provers; very efficient on “easy” formulas (provable with a low number of deduction steps)

[Figure labels: reachability, reachability reasoning in trees, arithmetic (quantifier-free), proof assistants, cardinality constraints]


A Hack: making formulas easier

• HOL formulas have type decorations

• Object, Integer, ...

• In general, this information is important

∀(x,y : S). x = y
∃(u,v : T). u ≠ v

• Satisfiable with sorts (S={a}, T={c,d})

• Unsatisfiable without

• But sort information makes proofs longer !


Stripping sorts

• Lemma: If sorts are

•pairwise disjoint

•of same cardinality

•Then we can forget about them

• Moreover, shortest proof is always shorter without sorts

• Removing sorts:

•important speed-up of proof process

OK for Object, Integer


Effectiveness

Benchmark          Time (s)           Proof length       Generated clauses
                   with     w/o       with     w/o       with       w/o
                   sorts    sorts     sorts    sorts     sorts      sorts
Tree.remove        4.5      0.53      250      154       14 348     5 959
                   44.0     0.46      1082     315       97 672     5 505
                   5.2      0.75      209      201       17 081     6 597
                   30.1     0.38      869      266       77 091     5 474
                   5.8      0.75      249      167       18 065     6 365
                   7.3      0.28      863      231       34 032     3 492
Tree.remove_max    83.1     4.8       797      314       118 364    28 478
                   37.9     0.85      2622     502       115 928    8 289


Summary of verified data structures

• Using Jahob system with 1st-order provers

• no automatic verification before

Benchmark                  Lines of code   Lines of spec.   # of methods   Verif. time
Association list (fun.)    76              26               9              50s
Functional ordered tree    186             38               10             4min 16s
Imperative linked list     60              24               9              17.9s
Library                    97              63               9              31.5s
Hash table                 41              39               6              38s


LIVE DEMO !!!!


MIT, STATA center



Inside


Conclusion

• Developing a full system is hard

• Watching it in action is cool !

• Formal methods are the future of Comp. Sci.

• Always have been...

• Always will be

• Questions ?