A Translation from Typed Assembly Language to Certified Assembly
Programming
Zhong Shao Yale University
Joint work with Zhaozhong Ni
Paper URL: http://flint.cs.yale.edu/flint/publications/talcap.html
August 11, 2006
Research objective (of the FLINT group)
To build a certified software platform with real guarantees of reliability & security!
[Figure: layered software stack. Hardware at the bottom, certified L1 software directly above it, then legacy SW layers 1-4 and certified L2-L5 SW.]
The lowest SW layer is the key!
Buggy L1 software can easily take over the machine!
[Figure: the same stack with buggy L1 software (or VM) directly above the hardware; next to legacy SW layers 1-4, the certified L2-L5 SW layers are also shown as infected L2-L5 SW.]
Must be Trusted!
Structure of our certified framework
• certified code (proof + machine code)
• machine model
• safety policy
• mechanized meta-logic
• proof checker
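[Figure: the proof checker takes the safety policy, the proof, and the machine code as input. On "Yes" the machine code goes to the CPU; on "No" it does not.]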
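The check-then-run structure above can be sketched on a toy example. Everything here is an assumption made for illustration, not the FLINT framework: the "machine" is a tiny straight-line stack machine, the safety policy is "never pop an empty stack", and the "proof" is a list of claimed stack depths that the checker validates one instruction at a time.

```python
# Toy illustration only: a hypothetical stack machine, not the real framework.
# (minimum stack depth required, change in stack depth) for each opcode
EFFECT = {"push": (0, +1), "pop": (1, -1), "add": (2, -1)}


def check(code, proof):
    """Proof checker: accept only if the claimed depths justify every instruction."""
    if len(proof) != len(code) + 1 or proof[0] != 0:
        return False
    for i, instr in enumerate(code):
        if instr[0] not in EFFECT:
            return False
        need, delta = EFFECT[instr[0]]
        # The claimed depth must satisfy the policy and match the next claim.
        if proof[i] < need or proof[i + 1] != proof[i] + delta:
            return False
    return True


def run(code):
    """The 'CPU': executes code without any runtime safety checks."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        elif instr[0] == "pop":
            stack.pop()
        else:  # "add"
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack


def load_and_run(code, proof):
    """Only code whose proof passes the checker ever reaches the CPU."""
    if not check(code, proof):
        raise ValueError("proof rejected: code is not run")
    return run(code)
```

In this sketch only `check` and `run` need to be trusted; whatever produced the proof does not.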
What makes a good mechanized meta logic?
You’d better be very paranoid!
• The logic must be “rock-solid”, i.e., consistent!
• The logic must be expressive enough to express everything a hacker wants to say
• Support explicit, machine-checkable proof objects
• The logic must be simple so that the proof checker can be hand-verified
• Can serve as both a logical framework and a meta-logical framework, so that one can carry out proofs in specialized logics
• Compatible with “automated proof construction”
How to scale?
Modularity, modularity, modularity!
[Figure: six certified modules, each a triple (specification Si, binary code Ci, formal proof Pi) for i = 1..6, are combined by linking into one certified program (specification S, binary code C, formal proof P).]
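A minimal sketch of the interface check behind modular linking, under loud assumptions: specifications are plain strings standing in for real assertions, formal proofs are omitted entirely, and the names `Module`, `link`, and `is_closed` are hypothetical. It only illustrates the idea that separately certified modules can be combined when their interfaces agree.

```python
from dataclasses import dataclass
from typing import Dict

Spec = str  # stand-in for a real specification (assumption)


@dataclass
class Module:
    imports: Dict[str, Spec]  # labels this module assumes, with expected specs
    exports: Dict[str, Spec]  # labels this module defines, with certified specs


def link(m1: Module, m2: Module) -> Module:
    """Combine two modules if their interfaces are compatible."""
    clash = m1.exports.keys() & m2.exports.keys()
    if clash:
        raise ValueError(f"label defined twice: {sorted(clash)}")
    for label in m1.imports.keys() & m2.imports.keys():
        if m1.imports[label] != m2.imports[label]:
            raise ValueError(f"conflicting assumptions about {label}")
    return Module({**m1.imports, **m2.imports}, {**m1.exports, **m2.exports})


def is_closed(m: Module) -> bool:
    """A program is complete when every assumed label is defined with the same spec."""
    return all(m.exports.get(label) == spec for label, spec in m.imports.items())


m1 = Module(imports={"g": "Sg"}, exports={"f": "Sf"})
m2 = Module(imports={"f": "Sf"}, exports={"g": "Sg"})
program = link(m1, m2)
```

Here `is_closed(m1)` is false on its own, while `is_closed(program)` is true once both modules are linked.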
Another form of modularity
Software is often organized as a vertical stack of abstractions!
Not everything is certified at the assembly level!
[Figure: vertical stack with hardware at the bottom, then certified L1 software, certified L2 SW, certified L3 SW, certified L4 SW, and certified L5 SW.]
Must accurately specify & certify all these interfaces!
A really “juicy” research area …
Many interesting & exciting problems:
• How to certify each standard language and OS abstraction?
– general code pointers
– procedure call/return
– general stack-based control abstraction
– mutable data structures (& malloc/free …)
– self-modifying code (& OS boot loader …)
– interrupt/signal handling
– device drivers and IO managers
– thread libraries and synchronization
– multiprocessor and memory model
– OS kernel/user abstraction
– …
• How to combine proof assistant with general-purpose programming?
• Other exciting interplays between machine-checked proofs & computation
Related research projects at Yale
Certifying different language & OS abstractions:
– certified assembly programming [CAP ESOP’02]
– embedded code pointers [XCAP POPL’06]
– non-preemptive threads [CCAP ICFP’04 & CMAP ICFP’05]
– stack-based control abstractions [SCAP PLDI’06]
– self-modifying code & local reasoning [Cai et al GCAP on-going]
– thread libraries and synchronizations [Ni et al on-going]
– interrupts & multiprocessors [Ferreira et al on-going]
– open framework for interoperability [Feng et al OCAP on-going]
– boot-loaders & preemptive threads [Feng et al on-going]
– memory management using malloc/free [CAP ESOP’02]
– garbage collector & mutator [McCreight et al on-going]
Features of a CAP-style system
• All built on a mechanized meta logic (e.g., Coq)
• Both the machine-level program and the property are specified by formulas in the meta logic
• Like TLA except our meta logic is mechanized
• Hoare-style assertions & inference rules enforce both the correctness & type safety properties
• No need for a separate type system; not a “refinement”
• Assertion languages can vary:
– Borrow those from Coq (shallow embedding) --- CAP
– Hybrid: Coq assertions + a thin layer of syntax --- XCAP
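The shallow-embedding idea can be illustrated loosely in Python, where assertions are ordinary host-language predicates over machine states rather than a separate syntax. This is only an analogy: a CAP-style system proves such triples in the meta logic for all states, whereas this sketch merely tests them on sample states. The instruction `addi`, the register names, and the helper `holds_on` are all hypothetical.

```python
from typing import Callable, Dict, Iterable

State = Dict[str, int]              # register file of a toy machine (assumption)
Assertion = Callable[[State], bool]  # assertions are plain host-language predicates


def addi(rd: str, rs: str, n: int) -> Callable[[State], State]:
    """Toy instruction: rd := rs + n."""
    def step(s: State) -> State:
        t = dict(s)
        t[rd] = s[rs] + n
        return t
    return step


def holds_on(pre: Assertion, instr, post: Assertion, samples: Iterable[State]) -> bool:
    """Test {pre} instr {post} on sample states (testing, not a proof)."""
    return all(post(instr(s)) for s in samples if pre(s))


pre = lambda s: s["r1"] >= 0    # precondition on the state before the instruction
post = lambda s: s["r2"] > 0    # postcondition on the state after it
instr = addi("r2", "r1", 1)
samples = [{"r1": v, "r2": 0} for v in range(-3, 4)]
```

With these definitions `holds_on(pre, instr, post, samples)` is true, while strengthening the postcondition to `s["r2"] > 1` makes it fail on the sample with `r1 = 0`.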
TAL vs. CAP
• Type-based approach
– TAL [Morrisett98]
– Touchstone PCC [Colby00]
– Syntactic FPCC [Hamid02]
– FTAL [Crary03]
– LTAL [Chen03]
– …
Strengths: modular; generates proofs easily; type safety
• Logic-based approach
– Original PCC [Necula98]
– CAP [Yu03]
– CCAP/CMAP [Yu04, Feng05]
– XCAP [Ni & Shao 06]
– SCAP [Feng et al 06]
– …
Strengths: expressive; advanced properties; good interoperability
This talk! We also show how to embed TAL into our new XCAP!
SCAP [Feng et al PLDI’06]
XCAP [Ni & Shao POPL’06]
Can we have the best of both worlds?
• Can a Hoare-style CAP system support:
– embedded code pointers?
– closures?
– exceptions?
– runtime stacks?
– general references with weak update?
– recursive data structures?
Syntax of target machine TM
Operational semantics of TM
Mechanized meta logic
All implemented in the Coq proof assistant!
XCAP assertion language
Validity rules for PropX: ⟦P⟧ = · ⊢ P
Soundness of interpretation
XCAP inference rules: top level
XCAP inference rules: instructions
Soundness of XCAP
Memory mutation in XCAP?
• Strong update!
– special conjunction (p * q) in separation logic
– directly definable in Prop and PropX
– explicit alias control, popular at the system level
• Weak update (general reference)?
– mutable reference (int ref) in ML
– managed data pointers (int __gc*) in .NET
– rely on GC to recycle memory
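A minimal sketch of the separating conjunction (p * q), assuming heaps are modeled as Python dicts from addresses to values. It follows the usual separation-logic reading: p * q holds of a heap if the heap splits into two disjoint parts, one satisfying p and the other q. The helper names `points_to` and `star` are hypothetical, and the brute-force search over splits is only suitable for tiny heaps.

```python
from itertools import combinations


def points_to(addr, val):
    """Assertion: the heap is exactly the single cell addr -> val."""
    return lambda h: h == {addr: val}


def star(p, q):
    """Separating conjunction: some disjoint split h = h1 + h2 has p(h1) and q(h2)."""
    def holds(h):
        addrs = list(h)
        for k in range(len(addrs) + 1):
            for left in combinations(addrs, k):
                h1 = {a: h[a] for a in left}
                h2 = {a: h[a] for a in addrs if a not in left}
                if p(h1) and q(h2):
                    return True
        return False
    return holds


heap = {1: 10, 2: 20}
before = star(points_to(1, 10), points_to(2, 20))

# Strong update: cell 1 may change freely (even to a different kind of value),
# because the assertion about cell 2 covers a disjoint part of the heap.
updated = dict(heap)
updated[1] = "hello"
after = star(points_to(1, "hello"), points_to(2, 20))
```

Here `before(heap)` and `after(updated)` both hold, while `star(points_to(1, 10), points_to(1, 10))({1: 10})` does not, since one cell cannot be split into two disjoint parts: this is the explicit alias control.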
XCAP extension: general reference
XCAP ext.: supporting weak update
Recursive specification in XCAP?
• Simple recursive data structures!
– linked list, queue, stack, tree, etc.
– supported via inductive definition of Prop
• Complex recursive structures with ECP?
– object (self refers to the entire object)
– threading invariant (each thread assumes others)
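For the simple case, an inductively defined list predicate can be sketched as a structurally recursive function. The heap layout is an assumption made for this example: each list cell occupies two consecutive addresses (value, then next pointer), and address 0 is the null pointer. The sketch does not cover the harder structures with embedded code pointers.

```python
def is_list(heap, ptr, items):
    """Does ptr point to a linked list in heap holding exactly items?

    Mirrors an inductive definition:
      list(ptr, [])      iff ptr is null
      list(ptr, x :: xs) iff heap[ptr] = x and list(heap[ptr + 1], xs)
    """
    if not items:
        return ptr == 0
    if ptr == 0 or ptr not in heap or ptr + 1 not in heap:
        return False
    return heap[ptr] == items[0] and is_list(heap, heap[ptr + 1], items[1:])


# Two cells: (1, next=20) at address 10 and (2, next=null) at address 20.
heap = {10: 1, 11: 20, 20: 2, 21: 0}
```

The recursion is on the list `items`, so it terminates even on a cyclic heap, much as an inductive definition is well-founded on its index.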
XCAP ext.: recursive predicates
TAL: type definitions
TAL typing rules: top level
TAL typing rules: instructions
TAL typing rules: instructions (cont’d)
TAL uses “syntactic” subtyping!
Well-formed state in TAL
TAL-to-XCAP translation
Step #1: we build a “logic-based” TAL that uses semantic subtyping
Step #2: translating regular TAL into “logic-based” TAL
(this is fairly straightforward!)
Step #3: translating “logic-based” TAL into XCAP
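The difference between syntactic and semantic subtyping can be sketched on toy register-file types. All names and the type language here (`"int"`, `"top"`, dicts from register names to types) are assumptions for illustration, not the definitions used in the talk. The syntactic check applies one fixed rule (forgetting registers), while the semantic check first translates each type into a predicate and then asks for implication, which is tested here on samples rather than proved.

```python
def has_type(value, ty):
    """Interpretation of a toy value type as a predicate on values."""
    if ty == "int":
        return isinstance(value, int)
    if ty == "top":
        return True
    raise ValueError(f"unknown type {ty}")


def syntactic_sub(g1, g2):
    """g1 <= g2 by a single syntactic rule: g2 only forgets registers of g1."""
    return all(r in g1 and g1[r] == t for r, t in g2.items())


def interp(g):
    """Translate a register-file type into a predicate on register files."""
    return lambda regs: all(r in regs and has_type(regs[r], t) for r, t in g.items())


def semantic_sub(g1, g2, samples):
    """g1 <= g2 if every sample satisfying interp(g1) also satisfies interp(g2)."""
    p1, p2 = interp(g1), interp(g2)
    return all(p2(s) for s in samples if p1(s))


g_ints = {"r1": "int", "r2": "int"}
g_one = {"r1": "int"}
g_top = {"r1": "top"}
samples = [{"r1": 1, "r2": 2}, {"r1": "x", "r2": 2}, {"r1": 5}]
```

In this toy, `g_ints <= g_one` holds both ways of checking, but `g_ints <= g_top` is only accepted semantically, because no syntactic rule relating `int` to `top` was written down.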
TAL-to-XCAP: translating value types
• translation of preconditions
• translation of code heap types
• translation of data heap types
TAL-to-XCAP: other translations
TAL-to-XCAP: typing preservation
Conclusion
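[Figure: software stack, top to bottom: user application, library, device driver, OS kernel, firmware; the upper layers are labeled TAL and the lower layers XCAP.]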
XCAP can be extended to support general references, weak update, and recursive specifications!
We give a direct embedding of TAL into XCAP.