Provably Correct Compilers (Part 2) Nazrul Alam and Krishnaprasad Vikram April 21, 2005.


Automated Soundness Proofs for Dataflow Analyses and Transformations via Local Rules (Rhodium Paper)

Sorin Lerner, Todd Millstein, Erika Rice, Craig Chambers

Today’s Focus…

Rhodium: Successor of Cobalt

• Increased expressiveness
– New model for expressing opts: local propagation rules with explicit dataflow facts
– Heap summaries
– Infinite analysis domains
– Flow-sensitive and -insensitive
– Intraprocedural and interprocedural

• Some Rhodium opts not expressible in Cobalt:
– Arithmetic invariant detection, integer range analysis, loop-induction-variable strength reduction, Andersen's may-point-to analysis with allocation-site summaries

Recap from Last Class

• Optimization needs analysis.

• Each analysis is formulated as a global path condition in Cobalt.

• In Rhodium it is done with local rules.

Similarities with Cobalt (also helpful for refreshing your memory)

[Figure: system overview. Rhodium optimizations (RdmOpt) are written by the programmer; the Rhodium execution engine and the checker are given. Inside the compiler, the execution engine runs the optimizations on a source program such as if (…) { x := …; } else { y := …; } …; to produce an executable (Exec). Each optimization is paired with a checker, which hands its verification task to an automatic theorem prover.]

Verification Task

Show that for any original program:

behavior of original program = behavior of optimized program

Three techniques to simplify the verification task

1. Rhodium is declarative
– no loops, no branches, no program counter
– declare intent using rules
– execution engine takes care of the rest

2. Factor out heuristics
– legal transformations vs. profitable transformations
– heuristics do not affect correctness; only the legality part must be reasoned about

3. Split verification task
– opt-dependent vs. opt-independent

Result:
• Expressive language
• Automated correctness checking

Where is the difference?

Rhodium's local rules are different from Cobalt's global conditions.

So how exactly does Rhodium work?

MustPointTo analysis

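a = &b
c = a
d = *c

[Figure: after a = &b, a points to b; after c = a, c also points to b; d = *c can therefore be replaced by d = b.]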

MustPointTo info in Rhodium

define fact mustPointTo(X:Var, Y:Var)
with meaning σ(X) == σ(&Y)

a = &b
    mustPointTo(a, b)
c = a
    mustPointTo(a, b), mustPointTo(c, b)
d = *c

Fact correct on edge if: whenever program execution reaches edge, meaning of fact evaluates to true in the program state
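To make the "meaning" of a fact concrete, here is a minimal Python sketch that executes the three-statement example and evaluates σ(X) == σ(&Y) on the states that reach each edge. The state model (a dict from variable names to values, with the address of v written as the tuple ("addr", v)), the statement encoding, and the names step and must_point_to_holds are illustrative assumptions, not part of Rhodium.

```python
# Hypothetical model (not from the slides): a program state sigma is a dict
# from variable names to values, and the address &v is the tuple ("addr", v).

def addr(var):
    return ("addr", var)

def must_point_to_holds(state, x, y):
    """Meaning of mustPointTo(x, y): sigma(x) == sigma(&y)."""
    return state.get(x) == addr(y)

def step(state, stmt):
    """Execute one statement and return the new state (old state untouched)."""
    kind, dst, src = stmt
    new = dict(state)
    if kind == "addr_of":        # dst = &src
        new[dst] = addr(src)
    elif kind == "copy":         # dst = src
        new[dst] = state[src]
    elif kind == "load":         # dst = *src
        new[dst] = state[state[src][1]]
    else:
        raise ValueError(f"unknown statement kind: {kind}")
    return new

program = [("addr_of", "a", "b"), ("copy", "c", "a"), ("load", "d", "c")]

# states[i] is the state on the edge entering statement i.
states = [{"b": 7}]
for stmt in program:
    states.append(step(states[-1], stmt))

fact_a_b_after_first = must_point_to_holds(states[1], "a", "b")
fact_c_b_after_second = must_point_to_holds(states[2], "c", "b")
print(fact_a_b_after_first, fact_c_b_after_second, states[3]["d"])
```

This checks the facts on a single execution only. Correctness of a fact on an edge quantifies over every execution that reaches that edge, which is why Rhodium relies on a proof rather than on testing.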

Propagating facts

define fact mustPointTo(X:Var, Y:Var)
with meaning σ(X) == σ(&Y)

if currStmt == [X = &Y]
then mustPointTo(X,Y)@out

if mustPointTo(X,Y)@in ∧ currStmt == [Z = X]
then mustPointTo(Z,Y)@out

a = &b
    mustPointTo(a, b)
c = a
    mustPointTo(a, b), mustPointTo(c, b)
d = *c
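The two propagation rules can be read as a per-statement transfer function. The sketch below is an illustration in Python, not Rhodium syntax: facts are (pointer, target) pairs, statements are hypothetical (kind, dst, src) tuples, and the frame rule that keeps facts about unassigned variables is an assumption added here so that mustPointTo(a, b) survives c = a as in the example.

```python
def propagate(facts_in, stmt):
    """Return the mustPointTo facts on the outgoing edge of stmt.

    facts_in is a set of (pointer, target) pairs, one per mustPointTo fact.
    """
    kind, dst, src = stmt
    # Assumed frame rule (not on the slides): a fact about X survives any
    # statement that does not assign to X. Sound here only because this toy
    # language has no stores through pointers.
    facts_out = {(x, y) for (x, y) in facts_in if x != dst}
    if kind == "addr_of":
        # if currStmt == [X = &Y] then mustPointTo(X,Y)@out
        facts_out.add((dst, src))
    elif kind == "copy":
        # if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out
        facts_out |= {(dst, y) for (x, y) in facts_in if x == src}
    return facts_out

program = [("addr_of", "a", "b"), ("copy", "c", "a"), ("load", "d", "c")]
facts = set()
trace = []
for stmt in program:
    facts = propagate(facts, stmt)
    trace.append(sorted(facts))
print(trace)
```

On the example, the fact set grows from mustPointTo(a, b) after the first statement to both mustPointTo(a, b) and mustPointTo(c, b) after the second, matching the edge annotations above.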

Transformations

if mustPointTo(X,Y)@in ∧ currStmt == [Z = *X]
then transform to [Z = Y]

a = &b
c = a
    mustPointTo(c, b)
d = *c          (transformed to d = b)






Semantics of a Rhodium opt

• Run propagation rules in a loop until there are no more changes (optimistic iterative analysis)

• Then run transformation rules

• Then run profitability heuristics

• For better precision, combine propagation rules and transformation rules.
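The first three steps can be sketched as a tiny engine: iterate the propagation rules over a control-flow graph until nothing changes, then apply the transformation rule wherever its condition holds, subject to a profitability filter. Everything here is an illustrative assumption rather than the real Rhodium execution engine: the tuple encoding of statements, the predecessor lists, the use of None as the optimistic "every fact holds" starting value, the frame rule, and the profitable callback.

```python
def transfer(facts_in, stmt):
    """Propagation rules for mustPointTo on one statement (see lead-in)."""
    kind, dst, src = stmt
    out = {(x, y) for (x, y) in facts_in if x != dst}   # assumed frame rule
    if kind == "addr_of":                               # [X = &Y]
        out.add((dst, src))
    elif kind == "copy":                                # [Z = X]
        out |= {(dst, y) for (x, y) in facts_in if x == src}
    return out

def analyze(stmts, preds):
    """Iterate the propagation rules until no edge changes.

    preds[i] lists the predecessors of statement i. None stands for the
    optimistic initial value "every fact holds" and is skipped at merges.
    """
    facts_out = [None] * len(stmts)
    facts_in = [set() for _ in stmts]
    changed = True
    while changed:
        changed = False
        for i, stmt in enumerate(stmts):
            known = [facts_out[p] for p in preds[i] if facts_out[p] is not None]
            if preds[i] and not known:
                continue                      # nothing has reached this node yet
            # A must-fact holds at a merge only if it holds on every incoming edge.
            facts_in[i] = set.intersection(*known) if known else set()
            new_out = transfer(facts_in[i], stmt)
            if new_out != facts_out[i]:
                facts_out[i] = new_out
                changed = True
    return facts_in

def transform(stmts, facts_in, profitable=lambda old, new: True):
    """if mustPointTo(X,Y)@in and currStmt == [Z = *X] then transform to [Z = Y]."""
    result = []
    for stmt, facts in zip(stmts, facts_in):
        kind, dst, src = stmt
        targets = sorted(y for (x, y) in facts if x == src) if kind == "load" else []
        new = ("copy", dst, targets[0]) if targets else stmt
        result.append(new if new != stmt and profitable(stmt, new) else stmt)
    return result

# a = &b; if (...) { c = a } else { c = &b }; d = *c
stmts = [("addr_of", "a", "b"), ("copy", "c", "a"), ("addr_of", "c", "b"), ("load", "d", "c")]
preds = [[], [0], [0], [1, 2]]
optimized = transform(stmts, analyze(stmts, preds))
print(optimized[3])
```

Both branches establish mustPointTo(c, b), so the fact survives the merge and the load is rewritten to d = b. If the else branch assigned c = &e instead, the intersection at the merge would drop every fact about c and the load would be left alone.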

Rhodium is more expressive

Cobalt: Simple Pointer Analysis

Rhodium version

And Rhodium can do more: optimizations that cannot be expressed in Cobalt

• Arithmetic simplification optimization

• Loop induction-variable strength reduction

Checking Rhodium optimizations

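[Figure: the compiler runs the Rhodium optimizations (RdmOpt) on a source program such as if (…) { x := …; } else { y := …; } …; to produce an executable (Exec); each optimization has its own checker.]

Rhodium correctness checker

[Figure: a Rhodium optimization consists of fact definitions (define fact …), propagation rules (if … then …), transformation rules (if … then transform …), and profitability heuristics. The checker ignores the profitability heuristics. A verification-condition generator (VCGen) turns each rule into a local VC, and an automatic theorem prover discharges the local VCs.]

• Opt-dependent: VCGen produces the local VCs for the given optimization, and the theorem prover proves them.

• Opt-independent: Lemma. For any Rhodium opt: if the local VCs are true, then the opt is correct.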

Local correctness of prop. rules

define fact mustPointTo(X,Y)
with meaning σ(X) == σ(&Y)

if mustPointTo(X,Y)@in ∧ currStmt == [Z = X]
then mustPointTo(Z,Y)@out

Fact correct on edge iff: whenever program execution reaches edge, meaning of fact evaluates to true in the program state

Assume: all incoming facts are correct

Show: propagated fact is correct

Local VC (generated and proven automatically)

Assume: X == &Y (in)
        out = step(in, [Z = X])

Show: Z == &Y (out)
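The shape of a local VC can be illustrated with a bounded brute-force search in Python. This is only a stand-in for the automatic theorem prover: it enumerates a small, hypothetical universe of three variables and four values, whereas the real checker proves the VC for all states. The names find_counterexample, meaning, and step_copy, and the encoding of addresses as ("addr", v), are assumptions of this sketch.

```python
from itertools import product

VARS = ["a", "b", "c"]
VALUES = [("addr", v) for v in VARS] + [0]

def meaning(state, x, y):
    """Meaning of mustPointTo(x, y): sigma(x) == sigma(&y)."""
    return state[x] == ("addr", y)

def step_copy(state, dst, src):
    """out = step(in, [dst = src])."""
    out = dict(state)
    out[dst] = state[src]
    return out

def find_counterexample(stmt_of, conclusion_of):
    """Search for (X, Y, Z, in-state) violating the local VC; None if none found."""
    for x, y, z in product(VARS, repeat=3):
        for vals in product(VALUES, repeat=len(VARS)):
            state_in = dict(zip(VARS, vals))
            if not meaning(state_in, x, y):      # Assume: X == &Y (in)
                continue
            dst, src = stmt_of(x, y, z)
            state_out = step_copy(state_in, dst, src)
            p, q = conclusion_of(x, y, z)
            if not meaning(state_out, p, q):     # Show: conclusion (out)
                return x, y, z, state_in
    return None

# Rule from the slides: currStmt == [Z = X], conclude mustPointTo(Z, Y).
sound_rule = find_counterexample(lambda x, y, z: (z, x), lambda x, y, z: (z, y))
# Deliberately wrong variant: currStmt == [X = Z], conclude mustPointTo(Z, Y).
wrong_rule = find_counterexample(lambda x, y, z: (x, z), lambda x, y, z: (z, y))
print(sound_rule is None, wrong_rule is not None)
```

The search also covers the cases where X, Y, and Z name the same variable, which is where hand-written rules tend to go wrong. Finding no counterexample in a bounded universe is evidence, not proof; the point of the sketch is the Assume/Show structure, which involves one statement and one rule at a time.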



Evaluation

Dimensions of evaluation

• Correctness guarantees

• Usefulness of the checker

• Expressiveness

Correctness guarantees

• Once checked, optimizations are guaranteed to be correct

• Caveat: trusted computing base
– execution engine
– checker implementation
– proofs done by hand once by Lerner

• Adding a new optimization does not increase the size of the trusted computing base


Usefulness of the checker

• Found subtle bugs in Lerner’s initial implementation of various optimizations

define fact equals(X:Var, E:Expr)
with meaning σ(X) == σ(E)

First attempt:

if currStmt == [X = E]
then equals(X,E)@out

Bug: x = x + 1 produces equals(x, x + 1), which is false after the assignment.

Second attempt:

if currStmt == [X = E] ∧ "X does not appear in E"
then equals(X,E)@out

Bug: x = *y + 1 produces equals(x, *y + 1), which is false if y points to x.

Corrected rule:

if currStmt == [X = E] ∧ "E does not use X"
then equals(X,E)@out
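Both bugs can be reproduced on concrete states. The Python sketch below uses a hypothetical expression encoding and checks whether equals(X, E) actually holds after executing X = E, alongside the purely syntactic side condition and a conservative stand-in for "E does not use X". The may_use function is an assumption of this sketch (it treats any dereference as possibly reading x); the slides do not show how Rhodium defines that side condition.

```python
# Hypothetical expression encoding: ("const", n), ("var", v), ("deref", v)
# for *v, and ("add", e1, e2). Addresses are ("addr", v) as a modeling choice.

def evaluate(state, e):
    tag = e[0]
    if tag == "const":
        return e[1]
    if tag == "var":
        return state[e[1]]
    if tag == "deref":
        return state[state[e[1]][1]]
    if tag == "add":
        return evaluate(state, e[1]) + evaluate(state, e[2])
    raise ValueError(f"unknown expression tag: {tag}")

def assign(state, x, e):
    """Execute [x = e] and return the new state."""
    out = dict(state)
    out[x] = evaluate(state, e)
    return out

def equals_holds(state, x, e):
    """Meaning of equals(x, e): sigma(x) == sigma(e)."""
    return state[x] == evaluate(state, e)

def appears_in(x, e):
    """Syntactic side condition: does the name x occur in e?"""
    if e[0] in ("var", "deref"):
        return e[1] == x
    if e[0] == "add":
        return appears_in(x, e[1]) or appears_in(x, e[2])
    return False

def has_deref(e):
    if e[0] == "deref":
        return True
    return e[0] == "add" and (has_deref(e[1]) or has_deref(e[2]))

def may_use(x, e):
    """Conservative stand-in for "E uses X": any dereference might read x."""
    return appears_in(x, e) or has_deref(e)

state = {"x": 0, "y": ("addr", "x"), "z": 5}
one = ("const", 1)
cases = {
    "x = x + 1": ("add", ("var", "x"), one),
    "x = *y + 1": ("add", ("deref", "y"), one),
    "x = z + 1": ("add", ("var", "z"), one),
}
for text, e in cases.items():
    fact_ok = equals_holds(assign(state, "x", e), "x", e)
    print(text, fact_ok, appears_in("x", e), may_use("x", e))
```

For x = *y + 1 with y pointing to x, the fact is false after the assignment even though x does not appear in the expression text, which is exactly the case the syntactic side condition misses.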


Rhodium expressiveness

• Traditional optimizations
– const prop and folding, branch folding, dead assignment elim, common sub-expression elim, partial redundancy elim, partial dead assignment elim, arithmetic invariant detection, and integer range analysis

• Pointer analyses
– must-point-to analysis, Andersen's may-point-to analysis with heap summaries

• Loop opts
– loop-induction-variable strength reduction, code hoisting, code sinking

• Array opts
– constant propagation through array elements, redundant array load elimination


Expressiveness limitations

• May not be able to express your optimization in Rhodium
– opts that build complicated data structures
– opts that perform complicated many-to-many transformations (e.g., loop fusion, loop unrolling)

• A correct Rhodium optimization may be rejected by the correctness checker
– limitations of the theorem prover
– limitations of first-order logic


Summary

• Rhodium system
– makes it easier to write optimizations
– provides correctness guarantees
– is expressive enough for realistic optimizations

• Rhodium system provides a foundation for safe extensible program manipulators

Future work

• Overcome the limitations of Rhodium
– opts that build complicated data structures
– opts that perform complicated many-to-many transformations (e.g., loop fusion, loop unrolling)

• Overcome the limitations of the theorem prover
– Simplify is conservative
– Using higher-order logic?

• Automatically infer the whole compiler from a high-level specification
– Infer analyses
– Infer transformations
– Program optimizations by demonstration
– Automatically pick good data representations
– Automatically explore the tradeoffs between scalability and precision

References

• Correctness
– By hand [Cousot and Cousot 77, 79, Benton 04, Lacey et al. 02]
– With interactive theorem prover [Cachera et al. 04]
– One compilation at a time [Pnueli et al. 98, Necula 00, Rinard 99]

• Declarative languages for writing transformations
– Attribute grammars [Reps and Teitelbaum 88]
– Temporal logic [Steffen 91, Lacey et al. 02]

• Execution engines
– Incremental execution of transformations [Sittampalam et al. 04]
– Running opts specified with temporal logic [Steffen 91]

More facts

define fact mustNotPointTo(X:Var, Y:Var)
with meaning σ(X) ≠ σ(&Y)

define fact hasConstantValue(X:Var, C:Const)
with meaning σ(X) == C

define fact doesNotPointIntoHeap(X:Var)
with meaning ∃ Y:Var . σ(X) == σ(&Y)

More rules

if currStmt == [X = *A] ∧ mustNotPointToHeap(A)@in ∧ ∀ B:Var . mayPointTo(A,B)@in ⇒ mustNotPointTo(B,Y)
then mustNotPointTo(X,Y)@out
