Provably Correct Compilers (Part 2) Nazrul Alam and Krishnaprasad Vikram April 21, 2005
Dec 21, 2015
Automated Soundness Proofs for Dataflow Analyses and Transformations via Local Rules (Rhodium Paper)
Sorin Lerner, Todd Millstein, Erika Rice, Craig Chambers
Today’s Focus…
Rhodium: Successor of Cobalt
• Increased expressiveness
  – New model for expressing opts: local propagation rules with explicit dataflow facts
  – Heap summaries
  – Infinite analysis domains
  – Flow-sensitive and -insensitive
  – Intraprocedural and interprocedural
• Some Rhodium opts not expressible in Cobalt:
  – Arithmetic invariant detection, integer range analysis, loop-induction-variable strength reduction, Andersen's may-point-to analysis with allocation-site summaries
Recap from Last Class
• Optimization needs analysis.
• Each analysis is formulated as a global path condition in Cobalt.
• In Rhodium it is done with local rules.
Similarities with Cobalt… (also helpful for refreshing your memory)
[Figure: the Rhodium optimizations (RdmOpt) are written by the programmer; the Checker and the Rhodium execution engine are given. Each RdmOpt goes through the Checker, and the execution engine runs the RdmOpts inside the compiler, which turns a program such as if (…) { x := …; } else { y := …; } …; into an executable (Exec).]
Verification Task
Show that for any original program:
  behavior of original program = behavior of optimized program
[Figure: the Checker turns each Rdm Opt into a verification task for the automatic theorem prover.]
Three techniques to simplify Verification Task
1. Rhodium is declarative
  • no loops, no branches, no program counter
  • declare intent using rules
  • execution engine takes care of the rest
2. Factor out heuristics
  – legal transformations vs. profitable transformations
  – heuristics do not affect correctness; only the legality part must be reasoned about
3. Split verification task
  – opt-dependent vs. opt-independent
Result:
  • Expressive language
  • Automated correctness checking
Where is the difference?
Rhodium's local rules are different from Cobalt's global condition.
Then how exactly does Rhodium work?
MustPointTo analysis
  a = &b
  c = a
  d = *c   →   d = b
[Figure: points-to graph in which a and c both point to b]
MustPointTo info in Rhodium
  a = &b
      mustPointTo(a, b)
  c = a
      mustPointTo(a, b), mustPointTo(c, b)
  d = *c
define fact mustPointTo(X:Var, Y:Var)
  with meaning σ(X) == σ(&Y)
Fact correct on edge if:
  whenever program execution reaches edge, meaning of fact evaluates to true in the program state
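The meaning of a fact is just a predicate over program states, so it can be evaluated directly. The sketch below is only an illustration and not part of Rhodium: the state representation (a dictionary from variable names to values) and the helper names addr and must_point_to_holds are assumptions made for this example.

```python
# Hypothetical model: a state maps each variable name to its current value.
# The address of variable v is modelled as the tuple ("addr", v).

def addr(var):
    return ("addr", var)

def must_point_to_holds(state, x, y):
    """Meaning of mustPointTo(x, y): sigma(x) == sigma(&y)."""
    return state[x] == addr(y)

# A state reached after executing a = &b; c = a (b and d hold arbitrary values).
state = {"a": addr("b"), "b": 7, "c": addr("b"), "d": 0}

print(must_point_to_holds(state, "a", "b"))  # True
print(must_point_to_holds(state, "c", "b"))  # True
print(must_point_to_holds(state, "d", "b"))  # False
```

A fact is correct on an edge when this predicate holds in every state that can reach the edge, not just in one sample state as here.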
Propagating facts
define fact mustPointTo(X:Var, Y:Var)
  with meaning σ(X) == σ(&Y)
if currStmt == [X = &Y]
  then mustPointTo(X,Y)@out
if mustPointTo(X,Y)@in ∧ currStmt == [Z = X]
  then mustPointTo(Z,Y)@out
• At a = &b, the first rule produces mustPointTo(a, b) on the outgoing edge.
• At c = a, the second rule uses mustPointTo(a, b)@in to produce mustPointTo(c, b) on the outgoing edge.
• At d = *c, both mustPointTo(a, b) and mustPointTo(c, b) hold on the incoming edge.
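The two propagation rules can be read as a flow function from the facts on the incoming edge to the facts on the outgoing edge. The following is a minimal sketch under stated assumptions: statements are encoded as hypothetical (kind, lhs, rhs) tuples, facts are (pointer, target) pairs, and the frame rule that keeps a fact when its pointer is not overwritten is an assumption added here so that mustPointTo(a, b) survives c = a as in the example; it does not appear on the slides.

```python
def flow(stmt, facts_in):
    """Map the mustPointTo facts on the incoming edge to those on the outgoing edge."""
    kind, lhs, rhs = stmt
    # Assumed frame rule (not on the slides): a fact survives if its pointer is not assigned.
    facts_out = {(x, y) for (x, y) in facts_in if x != lhs}
    if kind == "addr_of":
        # if currStmt == [X = &Y] then mustPointTo(X,Y)@out
        facts_out.add((lhs, rhs))
    elif kind == "copy":
        # if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out
        facts_out |= {(lhs, y) for (x, y) in facts_in if x == rhs}
    return facts_out

program = [("addr_of", "a", "b"),   # a = &b
           ("copy", "c", "a"),      # c = a
           ("load", "d", "c")]      # d = *c

facts = set()
edges = []
for stmt in program:
    facts = flow(stmt, facts)
    edges.append(sorted(facts))

for stmt, out in zip(program, edges):
    print(stmt, "->", out)
```

On this straight-line program the edge after c = a carries both mustPointTo(a, b) and mustPointTo(c, b), matching the annotated example.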
Transformations
if mustPointTo(X,Y)@in ∧ currStmt == [Z = *X]
  then transform to [Z = Y]
• At d = *c, mustPointTo(c, b) holds on the incoming edge, so the statement is transformed to d = b.
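As a rough illustration of the transformation rule, the sketch below rewrites a load through a pointer with a known target and then runs both versions of the program on one sample state. The tuple encoding of statements, the tiny interpreter run, and the initial state are all assumptions made for this example; agreeing on one run is only a sanity check, not the proof that the checker provides.

```python
def transform(stmt, facts_in):
    """if mustPointTo(X,Y)@in and currStmt == [Z = *X] then transform to [Z = Y]."""
    kind, lhs, rhs = stmt
    if kind == "load":
        targets = [y for (x, y) in sorted(facts_in) if x == rhs]
        if targets:
            return ("copy", lhs, targets[0])
    return stmt

def run(program, state):
    """Hypothetical interpreter for the three statement forms used on the slides."""
    state = dict(state)
    for kind, lhs, rhs in program:
        if kind == "addr_of":            # lhs = &rhs
            state[lhs] = ("addr", rhs)
        elif kind == "copy":             # lhs = rhs
            state[lhs] = state[rhs]
        elif kind == "load":             # lhs = *rhs
            state[lhs] = state[state[rhs][1]]
    return state

original = [("addr_of", "a", "b"), ("copy", "c", "a"), ("load", "d", "c")]
facts_before_load = {("a", "b"), ("c", "b")}
optimized = original[:2] + [transform(original[2], facts_before_load)]

init = {"a": 0, "b": 42, "c": 0, "d": 0}
print(optimized[2])                                  # ('copy', 'd', 'b')
print(run(original, init) == run(optimized, init))   # True
```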
Semantics of a Rhodium opt
• Run propagation rules in a loop until there are no more changes (optimistic iterative analysis)
• Then run transformation rules
• Then run profitability heuristics
• For better precision, combine propagation rules and transformation rules.
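The first step, running propagation rules until nothing changes, can be sketched as a round-robin fixpoint loop. Everything concrete below is an assumption for illustration: the control-flow graph with a back edge, the flow function (the two mustPointTo rules plus an assumed frame rule), merging by set intersection, and starting every edge at the full set of facts to model the optimistic start. It is not the Rhodium engine itself.

```python
from itertools import product

def flow(stmt, facts_in):
    kind, lhs, rhs = stmt
    # Assumed frame rule: keep facts whose pointer is not overwritten.
    facts_out = {(x, y) for (x, y) in facts_in if x != lhs}
    if kind == "addr_of":
        facts_out.add((lhs, rhs))
    elif kind == "copy":
        facts_out |= {(lhs, y) for (x, y) in facts_in if x == rhs}
    return facts_out

def analyze(stmts, preds, entry, universe):
    # Optimistic start: every outgoing edge initially carries all facts.
    out = {n: set(universe) for n in stmts}
    changed = True
    while changed:                       # loop until there are no more changes
        changed = False
        for n, stmt in stmts.items():
            if n == entry:
                facts_in = set()         # nothing is known at program entry
            else:
                # Assumed merge: a must-fact holds only if it holds on all predecessors.
                facts_in = set.intersection(*(out[p] for p in preds[n]))
            new_out = flow(stmt, facts_in)
            if new_out != out[n]:
                out[n], changed = new_out, True
    return out

# a = &b; loop { c = a; d = *c }
stmts = {"n0": ("addr_of", "a", "b"),
         "n1": ("copy", "c", "a"),
         "n2": ("load", "d", "c")}
preds = {"n0": [], "n1": ["n0", "n2"], "n2": ["n1"]}
universe = set(product("abcd", repeat=2))

result = analyze(stmts, preds, "n0", universe)
print(sorted(result["n2"]))  # [('a', 'b'), ('c', 'b')]
```

Starting from the full fact set lets mustPointTo(a, b) survive the merge at the loop head; starting the back edge from the empty set would discard it.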
Rhodium is more expressive
Cobalt: Simple Pointer Analysis
Rhodium version
And Rhodium can do more… (cannot be expressed in Cobalt)
Arithmetic Simplification Optimization
Loop induction-variable strength reduction
Checking Rhodium optimizations
[Figure: each Rhodium optimization (RdmOpt) goes through the Checker before the Rhodium execution engine runs it inside the compiler on a program such as if (…) { x := …; } else { y := …; } …; to produce the executable (Exec).]
Rhodium correctness checker
• Input: a Rhodium optimization (define fact …, if … then …, if … then transform …). Profitability heuristics are factored out and need not be checked.
• Opt-dependent: VCGen generates a Local VC for each rule, and the automatic theorem prover discharges it.
• Opt-independent:
  Lemma: For any Rhodium opt, if the Local VCs are true, then the opt is correct.
  Proof: done by hand, once, for all Rhodium opts.
Local correctness of prop. rules
define fact mustPointTo(X,Y)
  with meaning σ(X) == σ(&Y)
if mustPointTo(X,Y)@in ∧ currStmt == [Z = X]
  then mustPointTo(Z,Y)@out
Fact correct on edge iff:
  whenever program execution reaches edge, meaning of fact evaluates to true in the program state
Local VC (generated and proven automatically)
  Assume: all incoming facts are correct
    X == &Y (in)
    out = step(in, [Z = X])
  Show: propagated fact is correct
    Z == &Y (out)
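To make the local VC concrete, the sketch below checks it by brute force over a small, finite set of states instead of proving it for all states as the theorem prover does. The three-variable universe, the value set, and the helper names step_copy, meaning and local_vc_holds are assumptions chosen only to keep the enumeration tiny.

```python
from itertools import product

VARS = ["x", "y", "z"]
VALUES = [("addr", v) for v in VARS] + [0, 1]

def step_copy(state, z, x):
    """Semantics of [Z = X]: the out state equals the in state except that Z holds X's value."""
    out = dict(state)
    out[z] = state[x]
    return out

def meaning(state, p, t):
    """Meaning of mustPointTo(p, t) in a state: sigma(p) == sigma(&t)."""
    return state[p] == ("addr", t)

def local_vc_holds():
    # Try every instantiation of the rule variables X, Y, Z (including aliased ones)
    # and every state over the small value set.
    for X, Y, Z in product(VARS, repeat=3):
        for values in product(VALUES, repeat=len(VARS)):
            s_in = dict(zip(VARS, values))
            if not meaning(s_in, X, Y):       # Assume: X == &Y (in)
                continue
            s_out = step_copy(s_in, Z, X)     # out = step(in, [Z = X])
            if not meaning(s_out, Z, Y):      # Show: Z == &Y (out)
                return False
    return True

print(local_vc_holds())  # True
```

An exhaustive check like this only covers the enumerated states; the point of the automatic theorem prover is to establish the same implication for every state.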
Evaluation
Dimensions of evaluation
• Correctness guarantees
• Usefulness of the checker
• Expressiveness
Correctness guarantees
• Once checked, optimizations are guaranteed to be correct
• Caveat: trusted computing base
  – execution engine
  – checker implementation
  – proofs done by hand once by Lerner
• Adding a new optimization does not increase the size of the trusted computing base
• Ease of use • Guarantees • Usefulness • Expressiveness
Usefulness of the checker
• Found subtle bugs in Lerner's initial implementation of various optimizations
define fact equals(X:Var, E:Expr)
  with meaning σ(X) == σ(E)
• Initial rule (buggy):
  if currStmt == [X = E]
    then equals(X,E)@out
  Counterexample: x = x + 1 would produce equals(x, x + 1)
• First fix (still buggy):
  if currStmt == [X = E] ∧ "X does not appear in E"
    then equals(X,E)@out
  Counterexample: x = *y + 1 would produce equals(x, *y + 1), which is wrong when y points to x
• Final rule:
  if currStmt == [X = E] ∧ "E does not use X"
    then equals(X,E)@out
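The two counterexamples can be replayed by executing the assignment and then evaluating the meaning of equals(X, E) in the resulting state. This is a sketch under assumptions: the tuple encoding of expressions, the helpers eval_expr, assign and equals_holds, and the concrete initial values are all invented for the illustration.

```python
def addr(var):
    return ("addr", var)

def eval_expr(expr, state):
    """Evaluate a tiny expression language: constants, variables, *p, and e1 + e2."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return state[expr[1]]
    if kind == "deref":
        return state[state[expr[1]][1]]
    if kind == "add":
        return eval_expr(expr[1], state) + eval_expr(expr[2], state)
    raise ValueError(kind)

def assign(state, x, expr):
    """Execute [x = expr] and return the new state."""
    out = dict(state)
    out[x] = eval_expr(expr, state)
    return out

def equals_holds(state, x, expr):
    """Meaning of equals(x, expr): sigma(x) == sigma(expr)."""
    return state[x] == eval_expr(expr, state)

x_plus_1 = ("add", ("var", "x"), ("const", 1))
star_y_plus_1 = ("add", ("deref", "y"), ("const", 1))

# x = x + 1: X appears in E, so the fact is false afterwards.
s1 = assign({"x": 5, "y": addr("x")}, "x", x_plus_1)
print(equals_holds(s1, "x", x_plus_1))        # False

# x = *y + 1 with y pointing to x: X does not appear in E, but E still uses X.
s2 = assign({"x": 5, "y": addr("x")}, "x", star_y_plus_1)
print(equals_holds(s2, "x", star_y_plus_1))   # False

# x = *y + 1 with y pointing to another variable w: the fact holds.
s3 = assign({"x": 5, "w": 10, "y": addr("w")}, "x", star_y_plus_1)
print(equals_holds(s3, "x", star_y_plus_1))   # True
```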
Rhodium expressiveness
• Traditional optimizations:
  – const prop and folding, branch folding, dead assignment elim, common sub-expression elim, partial redundancy elim, partial dead assignment elim, arithmetic invariant detection, and integer range analysis
• Pointer analyses
  – must-point-to analysis, Andersen's may-point-to analysis with heap summaries
• Loop opts
  – loop-induction-variable strength reduction, code hoisting, code sinking
• Array opts
  – constant propagation through array elements, redundant array load elimination
Expressiveness limitations
• May not be able to express your optimization in Rhodium
  – opts that build complicated data structures
  – opts that perform complicated many-to-many transformations (e.g., loop fusion, loop unrolling)
• A correct Rhodium optimization may be rejected by the correctness checker
  – limitations of the theorem prover
  – limitations of first-order logic
Summary
• Rhodium system
  – makes it easier to write optimizations
  – provides correctness guarantees
  – is expressive enough for realistic optimizations
• Rhodium system provides a foundation for safe extensible program manipulators
Future work
• Overcome the limitations of Rhodium
  – opts that build complicated data structures
  – opts that perform complicated many-to-many transformations (e.g., loop fusion, loop unrolling)
• Overcome the limitations of the theorem prover
  – Simplify is conservative
  – Using higher-order logic?
Future work (cont.)
• Automatically infer the whole compiler from a high-level specification
  – Infer analyses
  – Infer transformations
  – Program optimizations by demonstration
  – Automatically pick good data representations
  – Automatically explore the tradeoffs between scalability and precision
References
• Correctness
  – By hand [Cousot and Cousot 77, 79, Benton 04, Lacey et al. 02]
  – With interactive theorem prover [Cachera et al. 04]
  – One compilation at a time [Pnueli et al. 98, Necula 00, Rinard 99]
• Declarative languages for writing transformations
  – Attribute grammars [Reps and Teitelbaum 88]
  – Temporal logic [Steffen 91, Lacey et al. 02]
• Execution engines
  – Incremental execution of transformations [Sittampalam et al. 04]
  – Running opts specified with temporal logic [Steffen 91]
More facts
define fact mustNotPointTo(X:Var, Y:Var)
  with meaning σ(X) ≠ σ(&Y)
define fact hasConstantValue(X:Var, C:Const)
  with meaning σ(X) == C
define fact doesNotPointIntoHeap(X:Var)
  with meaning ∃ Y:Var . σ(X) == σ(&Y)
More rules
if currStmt == [X = *A] ∧
   mustNotPointToHeap(A)@in ∧
   ∀ B:Var . mayPointTo(A,B)@in ⇒ mustNotPointTo(B,Y)
  then mustNotPointTo(X,Y)@out