Modeling Data in Formal Verification: Bits, Bit Vectors, or Words
Karam AbdElkader
Based on presentations from:
• Randal E. Bryant - Carnegie Mellon University
• Decision Procedures: An Algorithmic Point of View, D. Kroening - Oxford University, O. Strichman - Technion
Agenda
• Decision procedures for Bit-Vector Logic
• Flattening Bit-Vector Logic
• Incremental Flattening
• Bit-Vector Arithmetic With Abstraction
– 3 –
Overview
Issue
• How should data be modeled in formal analysis?
• Verification, test generation, security analysis, …
Approaches
• Bits: Every bit is represented individually
  • Basis for most CAD, model checking
• Words: View each word as arbitrary value
  • E.g., unbounded integers
  • Historic program verification work
• Bit Vectors: Finite precision words
  • Captures true semantics of hardware and software
  • More opportunities for abstraction than with bits
– 4 –
Bit-Level Modeling
Represent Every Bit of State Individually
• Behavior expressed as Boolean next-state functions over current state
• Historic method for most CAD, testing, and verification tools
  • E.g., model checkers
[Figure: Control Logic and Data Path connected through combinational logic blocks Com. Log. 1 and Com. Log. 2]
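As a minimal sketch of what "Boolean next-state functions over current state" can look like, the following models a 2-bit counter one bit at a time. The counter and the names `next_state`, `run`, and `as_int` are hypothetical illustrations, not taken from the slides.

```python
def next_state(b1, b0):
    """Boolean next-state functions of a 2-bit counter (hypothetical system)."""
    n0 = not b0        # low bit toggles on every step
    n1 = b1 != b0      # high bit flips when the low bit is 1 (XOR)
    return n1, n0

def run(steps, state=(False, False)):
    """Apply the next-state functions repeatedly and record every state."""
    trace = [state]
    for _ in range(steps):
        state = next_state(*state)
        trace.append(state)
    return trace

def as_int(state):
    """Read the two state bits back as an unsigned number."""
    b1, b0 = state
    return 2 * int(b1) + int(b0)
```

Each state bit appears twice, once as an input (current state) and once as an output (next state), which is the doubling of Boolean variables that bit-level tools have to carry.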
– 5 –
Bit-Level Modeling in Practice
Strengths
• Allows precise modeling of system
• Well developed technology
  • BDDs & SAT for Boolean reasoning
Limitations
• Every state bit introduces two Boolean variables
  • Current state & next state
• Overly detailed modeling of system functions
  • Don’t want to capture full details of FPU
Making It Work
• Use extensive abstraction to reduce bit count
• Hard to abstract functionality
– 6 –
Word-Level Abstraction #1: Bits → Integers
View Data as Symbolic Words
• Arbitrary integers
  • No assumptions about size or encoding
  • Classic model for reasoning about software
• Can store in memories & registers
[Figure: individual bits x0, x1, x2, …, xn-1 abstracted into a single word x]
– 7 –
Abstracting Data Bits
[Figure: Control Logic and Data Path with Com. Log. 1 and Com. Log. 2; once the data bits are abstracted to words, the combinational logic blocks are marked "?"]
What do we do about logic functions?
– 8 –
Word-Level Abstraction #2: Uninterpreted Functions
For any Block that Transforms or Evaluates Data:
• Replace with generic, unspecified function
• Only assumed property is functional consistency:
  $a = x \land b = y \Rightarrow f(a, b) = f(x, y)$
[Figure: ALU replaced by uninterpreted function f]
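Functional consistency can be illustrated with a small sketch: a stand-in for an uninterpreted function that knows nothing about its block except that equal arguments must yield equal results. The class name `UninterpretedFunction` and the symbolic result names are assumptions made for this illustration.

```python
import itertools

class UninterpretedFunction:
    """Returns a symbolic result per argument tuple; equal arguments share one result."""

    def __init__(self, name):
        self.name = name
        self._table = {}                  # argument tuple -> symbolic result
        self._fresh = itertools.count()   # supplies fresh result names

    def __call__(self, *args):
        # Functional consistency: reuse the result recorded for identical arguments.
        if args not in self._table:
            self._table[args] = f"{self.name}_{next(self._fresh)}"
        return self._table[args]

    def distinct_applications(self):
        """Number of different argument tuples seen so far."""
        return len(self._table)
```

This sketch hands out a different symbol for every new argument tuple, which is only one possible interpretation. A decision procedure would also allow distinct arguments to map to equal values.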
– 9 –
Abstracting Functions
For Any Block that Transforms Data:
• Replace by uninterpreted function
• Ignore detailed functionality
• Conservative approximation of actual system
[Figure: Control Logic and Data Path with the combinational logic blocks replaced by uninterpreted functions F1 and F2]
– 10 –
Word-Level Modeling: History
Historic
• Used by theorem provers
More Recently
• Burch & Dill, CAV ’94
  • Verify that pipelined processor has same behavior as unpipelined reference model
  • Use word-level abstractions of data paths and memories
  • Use decision procedure to determine equivalence
• Bryant, Lahiri, Seshia, CAV ’02
  • UCLID verifier
  • Tool for describing & verifying systems at word level
– 11 –
Pipeline Verification Example
[Figure: Pipelined Processor (PC, +4, Instr Mem, IF/ID, Reg. File, ID/EX, ALU, EX/WB, "=" comparators, Control) next to the Reference Model (Instr Mem, +4, Reg. File, ALU, Control); signal labels Rd, Ra, Rb, Imm, Op, Adat, Bdat]
– 12 –
Abstracted Pipeline Verification
[Figure: the same Pipelined Processor and Reference Model, with data-path blocks replaced by uninterpreted functions F1, F2, F3]
– 13 –
Experience with Word-Level Modeling
Powerful Abstraction Tool
• Allows focus on control of large-scale system
• Can model systems with very large memories
Hard to Generate Abstract Model
• Hand-generated: how to validate?
• Automatic abstraction: limited success
  • Andraus & Sakallah, DAC 2004
Realistic Features Break Abstraction
• E.g., Set ALU function to A+0 to pass operand to output
Desire
• Should be able to mix detailed bit-level representation with abstracted word-level representation
– 14 –
Bit Vectors: Motivating Example #1
Do these functions produce identical results?

int abs(int x) {
    /* mask is all ones when x < 0, else 0 (assumes 32-bit int, arithmetic right shift) */
    int mask = x >> 31;
    return (x ^ mask) + ~mask + 1;
}

int test_abs(int x) {
    return (x < 0) ? -x : x;
}

Strategy
• Represent and reason about bit-level program behavior
• Specific to machine word size, integer representations, and operations
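One way to get a feel for the question is to compare the two functions exhaustively at a smaller word size. The sketch below assumes an 8-bit word, two's complement, an arithmetic right shift, and wrap-around on overflow. The names `abs_bits`, `abs_ref`, and `find_mismatch` are hypothetical. It is not a proof for 32-bit `int`, where negating the most negative value is undefined behavior in C.

```python
BITS = 8                     # assumed small word size so the check can be exhaustive
MASK = (1 << BITS) - 1

def to_signed(u):
    """Interpret a BITS-wide unsigned pattern as a two's complement value."""
    return u - (1 << BITS) if u & (1 << (BITS - 1)) else u

def abs_bits(u):
    """Bit-level version: (x ^ mask) + ~mask + 1, wrapped to BITS bits."""
    mask = MASK if u >> (BITS - 1) else 0   # models the arithmetic shift x >> (BITS - 1)
    return ((u ^ mask) + (~mask & MASK) + 1) & MASK

def abs_ref(u):
    """Reference version: (x < 0) ? -x : x, wrapped to BITS bits."""
    x = to_signed(u)
    return (-x if x < 0 else x) & MASK

def find_mismatch():
    """Return the first bit pattern on which the two versions differ, else None."""
    for u in range(1 << BITS):
        if abs_bits(u) != abs_ref(u):
            return u
    return None
```

Under these assumptions the two versions agree on every 8-bit pattern, including the most negative value, which both map back to itself.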
– 15 –
Motivating Example #2
Is there an input string that causes value 234 to be written to address a4a3a2a1?
• Use bit blasting as core technique
• Apply to simplified versions of formula
• Successive approximations until solved or shown unsatisfiable
– 68 –
Iterative Approach Background: Approximating Formula
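Example Approximation Techniques
• Underapproximating
  • Restrict word-level variables to smaller ranges of values
• Overapproximating
  • Replace subformula with Boolean variable
[Figure: original formula $\varphi$ between overapproximation $\varphi^+$ and underapproximation $\varphi^-$]
• Overapproximation $\varphi^+$: more solutions. If $\varphi^+$ is unsatisfiable, then so is $\varphi$
• Underapproximation $\varphi^-$: fewer solutions. A satisfying solution of $\varphi^-$ also satisfies $\varphi$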
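The two properties can be stated as a containment of solution sets. In this sketch, $\varphi$ stands for the original formula, $\varphi^-$ and $\varphi^+$ for its under- and overapproximation, and $\mathrm{Sol}(\cdot)$ for the set of satisfying assignments. The $\mathrm{Sol}$ notation is introduced here for illustration only.

```latex
\[
\mathrm{Sol}(\varphi^-) \subseteq \mathrm{Sol}(\varphi) \subseteq \mathrm{Sol}(\varphi^+)
\]
\[
\varphi^- \text{ satisfiable} \;\Rightarrow\; \varphi \text{ satisfiable},
\qquad
\varphi^+ \text{ unsatisfiable} \;\Rightarrow\; \varphi \text{ unsatisfiable}
\]
```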
– 69 –
Starting Iterations
Initial Underapproximation $\varphi_1^-$
• (Greatly) restrict ranges of word-level variables
• Intuition: Satisfiable formula often has small-domain solution
– 70 –
First Half of Iteration
SAT Result for $\varphi_1^-$
• Satisfiable
  • Then have found solution for $\varphi$
• Unsatisfiable
  • Use UNSAT proof to generate overapproximation $\varphi_1^+$ (described later)
[Figure: $\varphi_1^-$: if SAT, then done; UNSAT proof: generate overapproximation $\varphi_1^+$]
– 71 –
Second Half of Iteration
SAT Result for $\varphi_1^+$
• Unsatisfiable
  • Then have shown $\varphi$ unsatisfiable
• Satisfiable
  • Solution indicates variable ranges that must be expanded
  • Generate refined underapproximation $\varphi_2^-$
[Figure: $\varphi_1^+$: if UNSAT, then done; SAT: use solution to generate refined underapproximation $\varphi_2^-$]
– 72 –
Example
$\varphi := (x = y + 2) \land (x^2 > y^2)$
$\varphi_1^- := (x[1] = y[1] + 2) \land (x[1]^2 > y[1]^2)$: UNSAT, look at proof
$\varphi_1^+ := (x = y + 2)$: SAT, $x = 2$, $y = 0$
$\varphi_2^- := (x[2] = y[2] + 2) \land (x[2]^2 > y[2]^2)$: SAT, done.
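The example can be replayed with a small sketch of the iteration. Several things here are assumptions and not the slides' method: a brute-force search stands in for the SAT solver, `x[k]` is read as "x restricted to unsigned k-bit values", arithmetic does not wrap, the full word size `WIDTH` is 4, and the overapproximation is hard-coded as the conjunct `x = y + 2` instead of being extracted from an UNSAT proof.

```python
from itertools import product

WIDTH = 4  # assumed full word size; small so brute force stays cheap

def phi(x, y):
    """Original formula of the example."""
    return x == y + 2 and x * x > y * y

def phi_over(x, y):
    """Hard-coded overapproximation: only the conjunct x = y + 2 is kept."""
    return x == y + 2

def solve(formula, bits):
    """Brute-force stand-in for a SAT call over unsigned values below 2**bits."""
    for x, y in product(range(2 ** bits), repeat=2):
        if formula(x, y):
            return x, y
    return None

def iterate():
    """Alternate under- and overapproximations until one of them decides phi."""
    bits, trace = 1, []
    while True:
        under = solve(phi, bits)          # underapproximation: restricted ranges
        trace.append((f"under[{bits}]", under))
        if under is not None:
            return "SAT", under, trace    # solution found on an underapproximation
        if bits == WIDTH:
            return "UNSAT", None, trace   # the underapproximation has become phi itself
        over = solve(phi_over, WIDTH)     # overapproximation: full ranges
        trace.append(("over", over))
        if over is None:
            return "UNSAT", None, trace   # UNSAT shown on an overapproximation
        # Widen the ranges far enough for the overapproximation's solution to fit.
        bits = max(bits + 1, max(v.bit_length() for v in over))
```

Because `bits` grows on every round and is capped at `WIDTH`, the loop terminates. In the worst case the last underapproximation is the original formula.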
– 73 –
Iterative Behavior
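Underapproximations
• Successively more precise abstractions of $\varphi$
• Allow wider variable ranges
Overapproximations
• No predictable relation
• UNSAT proof not unique
[Figure: sequence $\varphi_1^-, \varphi_1^+, \varphi_2^-, \varphi_2^+, \ldots, \varphi_k^-, \varphi_k^+$]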
– 74 –
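Overall Effect
Soundness
• Only terminate with solution on underapproximation
• Only terminate as UNSAT on overapproximation
Completeness
• Successive underapproximations approach $\varphi$
• Finite variable ranges guarantee termination
  • In worst case, get $\varphi_k^- \equiv \varphi$
[Figure: sequence $\varphi_1^-, \varphi_1^+, \varphi_2^-, \varphi_2^+, \ldots, \varphi_k^-, \varphi_k^+$, ending in SAT on an underapproximation or UNSAT on an overapproximation]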
– 75 –
Generating Overapproximation
Given
• Underapproximation $\varphi_1^-$
• Bit-blasted translation of $\varphi_1^-$ into Boolean formula
• Proof that Boolean formula unsatisfiable
Generate
• Overapproximation $\varphi_1^+$
• If $\varphi_1^+$ satisfiable, must lead to refined underapproximation
[Figure: $\varphi_1^-$, then UNSAT proof: generate overapproximation $\varphi_1^+$, then $\varphi_2^-$]
– 76 –
Bit-Vector Formula Structure
• DAG representation to allow shared subformulas
[Figure: DAG for $\varphi$ built from ∧, ∨, ¬ nodes over the leaves a, x = y, w & 0xFFFF = x, x % 26 = v, and a comparison between x + 2z and 1]
– 77 –
Structure of Underapproximation
• Linear complexity translation to CNF
• Each word-level variable encoded as set of Boolean variables
• Additional Boolean variables represent subformula values
[Figure: underapproximation $\varphi^-$ as the conjunction (∧) of the formula DAG with Range Constraints on w, x, y, z]
– 78 –
Encoding Range Constraints
Explicit
• View as additional predicates in formula
Implicit
• Reduce number of variables in encoding

  Constraint        Encoding
  0 ≤ w < 8         0 0 0 ··· 0 w2 w1 w0
  −4 ≤ x < 4        xs xs xs ··· xs xs x1 x0

• Yields smaller SAT encodings
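The implicit encoding can be sketched as follows: only the low bits are free Boolean variables, and the upper bits are either fixed to 0 (unsigned range) or tied to the sign bit (signed range). The 16-bit word size and all function names are assumptions for this illustration.

```python
from itertools import product

WIDTH = 16  # assumed word size

def zero_extend(free_bits, width=WIDTH):
    """free_bits is least significant first; every upper bit is the constant 0."""
    return list(free_bits) + [0] * (width - len(free_bits))

def sign_extend(free_bits, width=WIDTH):
    """Every upper bit repeats the sign bit, which is the last free bit."""
    return list(free_bits) + [free_bits[-1]] * (width - len(free_bits))

def unsigned_value(bits):
    return sum(b << i for i, b in enumerate(bits))

def signed_value(bits):
    """Two's complement reading of the full word."""
    u = unsigned_value(bits)
    return u - (1 << len(bits)) if bits[-1] else u

def reachable(extend, decode, free=3):
    """All word values expressible with the given number of free bits."""
    return sorted(decode(extend(b)) for b in product([0, 1], repeat=free))
```

With three free bits the word can take exactly eight values in either encoding, so the range constraint needs no extra clauses: it is built into which bits are variables at all.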
– 79 –
UNSAT Proof
• Subset of clauses that is unsatisfiable
• Clause variables define portion of DAG
• Subgraph that cannot be satisfied with given range constraints
[Figure: formula DAG conjoined (∧) with Range Constraints on w, x, y, z]
– 80 –
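Extracting Circuit from UNSAT Proof
• Subgraph that cannot be satisfied with given range constraints
  • Even when replacing rest of graph with unconstrained variables
[Figure: extracted subgraph over a, x = y, and the comparison between x + 2z and 1, with the rest of the DAG replaced by unconstrained variables b1 and b2; conjoined (∧) with Range Constraints on w, x, y, z: UNSAT]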
– 81 –
Generated Overapproximation
• Remove range constraints on word-level variables
• Creates overapproximation $\varphi_1^+$
  • Ignores correlations between values of subformulas
[Figure: $\varphi_1^+$ is the extracted subgraph with unconstrained variables b1 and b2, without range constraints]
– 82 –
Generated Overapproximation: Algorithm
– 83 –
Refinement Property
Claim
• $\varphi_1^+$ has no solutions that satisfy $\varphi_1^-$’s range constraints
  • Because $\varphi_1^+$ contains portion of $\varphi_1^-$ that was shown to be unsatisfiable under range constraints
[Figure: $\varphi_1^+$ conjoined (∧) with Range Constraints on w, x, y, z: UNSAT]
– 84 –
Refinement Property (Cont.)
Consequence
• Solving $\varphi_1^+$ will expand range of some variables
• Leading to more exact underapproximation $\varphi_2^-$
[Figure: $\varphi_1^+$ without range constraints]
– 85 –
Effect of Iteration
Each Complete Iteration
• Expands ranges of some word-level variables
• Creates refined underapproximation
[Figure: $\varphi_1^-$, then UNSAT proof: generate overapproximation $\varphi_1^+$, then SAT: use solution to generate refined underapproximation $\varphi_2^-$]
– 86 –
Approximation Methods
So Far
• Range constraints
  • Underapproximate by constraining values of word-level variables
• Subformula elimination
  • Overapproximate by assuming subformula value arbitrary
General Requirements
• Systematic under- and over-approximations
• Way to connect from one to another
Goal: Devise Additional Approximation Strategies
– 87 –
Function Approximation Example
§: Prohibit Via Additional Range Constraints
• Gives underapproximation
• Restricts values of (possibly intermediate) terms
§: Abstract as f (x,y)
• Overapproximate as uninterpreted function f
• Value constrained only by functional consistency
[Figure: multiplier * with inputs x and y]
Case table for x * y:

             x = 0    x = 1    x = else
  y = 0        0        0        0
  y = 1        0        1        x
  y = else     0        y        §
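The case table can be sketched as a function that handles the easy cases exactly and delegates the remaining § entry to a pluggable handler. The handler names `prohibit` and `make_uninterpreted`, and the use of `None` to mark an excluded case, are assumptions for this illustration.

```python
import itertools

def mul_approx(x, y, else_case):
    """Multiply according to the case table; else_case handles the § entry."""
    if x == 0 or y == 0:
        return 0
    if x == 1:
        return y
    if y == 1:
        return x
    return else_case(x, y)

def prohibit(x, y):
    """Underapproximation: the § case is ruled out by extra range constraints."""
    return None   # None marks 'no value allowed here' in this sketch

def make_uninterpreted(name="f"):
    """Overapproximation: § becomes f(x, y), constrained only by functional consistency."""
    table, fresh = {}, itertools.count()
    def f(x, y):
        if (x, y) not in table:
            table[(x, y)] = f"{name}{next(fresh)}"   # fresh symbolic result
        return table[(x, y)]
    return f
```

Swapping the handler switches between the two approximations without touching the exact cases, which is what lets the same formula structure serve both directions.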
– 89 –
Results: UCLID BV vs. Bit-blasting
• UCLID always better than bit blasting
• Generally better than other available procedures
• SAT time is the dominating factor
[results on 2.8 GHz Xeon, 2 GB RAM]
– 90 –
Challenges with Iterative Approximation
Formulating Overall Strategy
• Which abstractions to apply, when and where
• How quickly to relax constraints in iterations
  • Which variables to expand and by how much?
  • Too conservative: Each call to SAT solver incurs cost
  • Too lenient: Devolves to complete bit blasting
Predicting SAT Solver Performance
• Hard to predict time required by call to SAT solver
• Will particular abstraction simplify or complicate SAT?
Combination Especially Difficult
• Multiple iterations with unpredictable inner loop
– 91 –
Summary: Modeling Levels
Bits
• Limited ability to scale
• Hard to apply functional abstractions
Words
• Allows abstracting data while precisely representing control
• Overlooks finite word-size effects
Bit Vectors
• Realistic semantic model for hardware & software
• Captures all details of actual operation
  • Detects errors related to overflow and other artifacts of finite representation
• Can apply abstractions found at word-level
– 92 –
Areas of Agreement
SAT-Based Framework Is Only Logical Choice
• SAT solvers are good & getting better
Want to Automatically Exploit Abstractions
• Function structure
• Arithmetic properties