A Decision Procedure for Program Analysis and Bug Finding
Vijay Ganesh
Affiliation: CSAIL, MIT
Supported by Lincoln Labs
February 7th, 2008
Motivating Example
• Examples: Boolean SAT, Real Arithmetic, Bit-vectors
• Reduction is easy for many problems
• This approach is better than coming up with special-purpose algorithms: more efficient and saves work
• AI, program analysis, bug finding, verification, …
[Diagram: Input Formula → Decision Procedure → SAT / UNSAT]
STP
1. Design and Architecture of STP (CAV '07, CCS '06)
2. Abstraction-Refinement based heuristics for Deciding Arrays
3. Solver Algorithm for deciding Linear Bit-vector Arithmetic, O(n^3)
4. Experimental Results
Projects using STP
Bug Finders
• EXE by Dawson Engler, Cristian Cadar and others (Stanford)
• MINESWEEPER by Dawn Song and her group (CMU)
• CATCHCONV by David Molnar and David Wagner (Berkeley)
• Backward Path Sensitive Analysis by Tim Leek (MIT Lincoln)
Security Tools
• REPLAYER: Security analysis through protocol replay (CMU)
• Smart Fuzzer by Roberto Paleari (University of Milan, Italy)
Program Analysis
• by Rupak Majumdar (UCLA)
Hardware verification
• Cache coherence protocols by Dill group (Stanford)
• By a chip company
Software verification
• Crypto algorithms by Dill group (Stanford)
Projects using STP: Smart Fuzzing through Path Selection
Smart Fuzzer by Roberto Paleari (University of Milan, Italy)
• Do dynamic analysis to determine the dependency between input and control transfer (if conditional)
• Collect path conditions
• Feed to STP to find values that drive a path
• Feed to STP to find values that drive the 'other' path
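The path-selection loop in the bullets above can be sketched in a few lines. This is only an illustration under stated assumptions, not Paleari's tool: `program`, the trace format, `flip_last`, and `solve` are hypothetical names, and `solve` brute-forces a single 8-bit input where a real fuzzer would hand the path condition to STP.

```python
from typing import Callable, List, Optional, Tuple

# A branch record: (predicate over the input byte, outcome observed on this run).
Branch = Tuple[Callable[[int], bool], bool]


def program(x: int, trace: List[Branch]) -> str:
    """Hypothetical program under test; records each input-dependent branch."""
    c1 = lambda v: v > 100
    trace.append((c1, c1(x)))
    if c1(x):
        c2 = lambda v: v % 7 == 3
        trace.append((c2, c2(x)))
        if c2(x):
            return "rare path"
        return "common path"
    return "early exit"


def solve(path: List[Branch]) -> Optional[int]:
    """Stand-in for STP: brute-force an 8-bit input satisfying every branch."""
    for v in range(256):
        if all(pred(v) == want for pred, want in path):
            return v
    return None


def flip_last(trace: List[Branch]) -> List[Branch]:
    """Keep the path prefix and negate the final branch (the 'other' path)."""
    pred, taken = trace[-1]
    return trace[:-1] + [(pred, not taken)]


trace: List[Branch] = []
first = program(200, trace)          # concrete run that collects path conditions
new_input = solve(flip_last(trace))  # input that drives the other side of the last branch
second = program(new_input, [])      # re-run on the generated input
```

Here the first run takes the common path, and the generated input reaches the branch that the concrete run missed.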
Projects using STP: Formal Verification of Crypto Algorithms
Eric Smith and David Dill
Technique
• Annotate code with invariants
• Symbolically execute the Java implementation of the crypto algorithm
• Plug the symbolically executed terms into the invariants
• Feed invariants into ACL2 + STP
• ACL2 handles any induction and integer-related reasoning, and STP handles (in)equalities over bit-vector terms
Projects using STP: Cross Checking, Model Checking, Equivalence Checking(?)
Cross Checking: EXE (Dawson Engler, Cristian Cadar, …)
• Different implementations of grep… Do they match?
• Symbolic-simulate Grep1
• Symbolic-simulate Grep2
• Equate the two and feed to STP
Model Checking Cache Coherence Protocols: Chang and Dill
• Does model satisfy property P?
• Convert to decision problem and feed to STP
• If you are using BDDs, try SAT or STP
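The "equate the two and feed to STP" step amounts to asking a solver for an input on which the two implementations disagree. A minimal sketch of that idea follows; the two power-of-two checks and `find_mismatch` are hypothetical examples (not grep, and not EXE's code), and exhaustive search over an assumed 8-bit input stands in for the solver query.

```python
from typing import Callable, Optional

WIDTH = 8  # assumed bit-vector width for this toy example


def is_pow2_reference(x: int) -> bool:
    """Reference implementation: exactly one bit is set."""
    return bin(x).count("1") == 1


def is_pow2_trick(x: int) -> bool:
    """Bit-trick implementation; it forgets the x == 0 case."""
    return x & (x - 1) == 0


def is_pow2_fixed(x: int) -> bool:
    """Bit-trick implementation with the zero case handled."""
    return x != 0 and x & (x - 1) == 0


def find_mismatch(f: Callable[[int], bool],
                  g: Callable[[int], bool],
                  width: int = WIDTH) -> Optional[int]:
    """Stand-in for the solver query 'exists x such that f(x) != g(x)'."""
    for x in range(1 << width):
        if f(x) != g(x):
            return x
    return None  # no disagreement: the two agree on every input of this width


counterexample = find_mismatch(is_pow2_reference, is_pow2_trick)
agreement = find_mismatch(is_pow2_reference, is_pow2_fixed)
```

A returned value is a concrete input exposing the disagreement; `None` plays the role of the solver reporting that the two are equivalent on the modelled inputs.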
Projects using STP: Work by Dawn Song and her group
• Automatic discovery of deviations in binary implementations: error detection and fingerprint generation
• Protocol Replay: Try to reproduce a dialog between an initiator and a network host
    • Auto generation of modules for honeypots so that they can correctly respond to connection attempts by worms
• Automatic patch-based exploit generation: Using STP to reveal exploit information from a Windows patch
Quantifier-free Theory of Bit-vectors and Arrays
Expressions in STP correspond to
• C/Java… programming language expressions
• Microprocessor instruction set
• Arrays represent program memory or array data structure in C/Java…
Except
• Our bit-vectors are of any fixed length
• No floating point
• No loops
SAT problem for this theory is NP-complete
Quantifier-free Theory of Bit-vectors and Arrays
Array Terms
• Read (Array, index)
• Write (Array, index, val)
• Example: R(W(A, i, 0b00), i) = 0b00
Conditional in programming/multiplexors in hardware
• ite (c, t1, t2) = if ( c ) then t1 else t2 endif
Predicates: =, <=, <=s
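The array terms above obey the usual read-over-write rule, R(W(A, i, v), j) = ite(i = j, v, R(A, j)), of which the slide's example is the i = j case. A small executable model can make this concrete. The names `read`, `write`, and `ite` mirror the slide's terms, but the dictionary representation and the default value 0 for unwritten cells are assumptions made for this sketch only.

```python
from typing import Dict

Array = Dict[int, int]  # index -> value; a modelling choice for this sketch


def read(array: Array, index: int) -> int:
    """R(A, i). Unwritten cells read as 0 here (an assumption of this model)."""
    return array.get(index, 0)


def write(array: Array, index: int, val: int) -> Array:
    """W(A, i, v): returns a new array and leaves the original untouched."""
    updated = dict(array)
    updated[index] = val
    return updated


def ite(c: bool, t1: int, t2: int) -> int:
    """if c then t1 else t2 endif."""
    return t1 if c else t2


A: Array = {0: 0b01, 1: 0b10}

# The slide's example: reading back the index that was just written.
example = read(write(A, 1, 0b00), 1)

# Read-over-write, checked exhaustively on a small domain.
rule_holds = all(
    read(write(A, i, v), j) == ite(i == j, v, read(A, j))
    for i in range(4) for j in range(4) for v in range(4)
)
```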
Features of STP
• Can handle very large formulas efficiently
    • Large number of array reads (10^5)
    • Deeply nested array writes (10^4 deep)
    • Very large number of linear equations (10^6)
    • Very large number of variables (10^6)
• Enabled several software and hardware applications
• Won the SMT-COMP 2006 competition in the bit-vector category
Alternative Architectures
[Diagram 1, labels: Input Formula, Simplifier, SAT, decision procedures DP1, DP2, …, DPn, New Derived Constraints, Result]
[Diagram 2 (STP), labels: Input Formula, Substitutions, Simplifications, Linear Solving, Array Abstraction, BitBlast, CNF Conversion, Boolean SAT, Refinement Loop, Result]
Standard Handling of Array reads
• Problem: O(n^2) axioms added, where n is the number of read indices
• Lethal if n is large: n = 10000, # of axioms: ~100 million
• Blowup seems hard to avoid (e.g. UCLID)
• This is "aliasing" from another perspective
• Key Observation: Most indices don't alias
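The quadratic blowup can be seen by writing out the standard elimination of array reads: each read R(A, i_k) is replaced by a fresh variable v_k, and one consistency axiom is added for every pair of read indices. The sketch below is an assumption-laden illustration (the function names and the string form of the axioms are made up here). It counts unordered pairs, n(n-1)/2, which is roughly 5 × 10^7 for n = 10000, the same quadratic order as the figure quoted above.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def eliminate_reads(indices: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Replace each read(A, i_k) by a fresh variable v_k and emit, for every
    pair of indices, the axiom (i_a = i_b) => (v_a = v_b)."""
    fresh = {idx: f"v{k}" for k, idx in enumerate(indices)}
    axioms = [
        f"({a} = {b}) => ({fresh[a]} = {fresh[b]})"
        for a, b in combinations(indices, 2)
    ]
    return fresh, axioms


def axiom_count(n: int) -> int:
    """Number of pairwise axioms for n distinct read indices."""
    return n * (n - 1) // 2


fresh, axioms = eliminate_reads(["i", "j", "k"])
large = axiom_count(10_000)
```

For three indices this produces three axioms. The key observation in the list above is that, in practice, most of these index equalities are false, so most of the axioms are never needed.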
3.2 GHz Pentium, 512 KB cache, 32-bit machine, MO (memory-out) @ 3.2 GB, TO (timeout) @ 30 minutes
Examples courtesy Dawn Song (CMU) and David Molnar (Berkeley)
Algorithm for Solving Linear Bit-vector Equations
Previous Work
• Mostly variants of Gaussian Elimination
• Unsuitable for online decision procedures
Basic Idea in STP
• Solve for a variable and substitute it away
Online Algorithm
• Enables other algebraic simplifications
• If a whole variable cannot be isolated, then isolate part of the bit-vector variable, solve, and substitute it away
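The "solve and substitute" step can be illustrated for a single equation modulo 2^w. The sketch below is not STP's implementation; `solve_linear` and its return format are hypothetical. It relies on two facts: an odd coefficient is invertible modulo a power of two, and when every coefficient is even, the equation can be divided by the common power of two 2^k, which determines only the low w-k bits of the chosen variable (the "part of the bit-vector variable" case).

```python
from typing import Dict, Optional, Tuple


def trailing_zeros(c: int) -> int:
    """Number of trailing zero bits of a positive integer."""
    return (c & -c).bit_length() - 1


def solve_linear(coeffs: Dict[str, int], const: int, width: int
                 ) -> Optional[Tuple[str, int, Dict[str, int], int]]:
    """Solve sum(coeffs[v] * v) == const (mod 2**width) for one variable.

    Returns (var, bits, expr, c0), meaning: the low `bits` bits of `var`
    equal c0 + sum(expr[v] * v) modulo 2**bits. Returns None if the
    equation is unsatisfiable.
    """
    mod = 1 << width
    live = {v: c % mod for v, c in coeffs.items() if c % mod}
    if not live:
        raise ValueError("no variable left to solve for")
    k = min(trailing_zeros(c) for c in live.values())
    if const % (1 << k):
        return None  # left side is a multiple of 2**k, right side is not
    bits = width - k
    m = 1 << bits
    reduced = {v: c >> k for v, c in live.items()}   # divide through by 2**k
    rhs = ((const % mod) >> k) % m
    var = next(v for v, c in reduced.items() if c & 1)  # an odd coefficient exists
    inv = pow(reduced[var], -1, m)                    # inverse mod 2**bits (Python 3.8+)
    expr = {v: (-inv * c) % m for v, c in reduced.items() if v != var}
    return var, bits, expr, (inv * rhs) % m


whole = solve_linear({"x": 3, "y": 2}, 5, 4)    # 3x + 2y = 5 (mod 16)
partial = solve_linear({"x": 2, "y": 4}, 6, 4)  # 2x + 4y = 6 (mod 16)
unsat = solve_linear({"x": 2, "y": 4}, 5, 4)    # even left side, odd right side
```

In the first call all four bits of x are determined (x = 7 + 10y mod 16) and x can be substituted away. In the second only the low three bits are (x[2:0] = 3 + 6y mod 8), and the top bit of x remains free.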
Purpose of Linear Solver
• Helps eliminate lots of redundant variables
• Makes problem much easier for SAT
• Essential for many real-world large examples
Importance of Online Linear Solver
Online solving enables algebraic simplifications
[Diagram (STP architecture), labels: Input Formula, Substitutions, Simplifications, Linear Solving, Array Abstraction, BitBlast, CNF Conversion, Boolean SAT, Refinement Loop, Result]
Algorithm for Solving Linear Bit-vector Equations