A Dependently Typed Assembly Language Robert Harper (Joint work with Robert Harper)

Dec 21, 2015

Page 1

A Dependently Typed Assembly Language

(Joint work with Robert Harper)

Page 2

The general motivation

Q: Why do we want to type low-level languages?

A: We want to reap the benefits of type systems at low levels.

Page 3

Advantages of type systems

• Capturing program errors at compile-time (well-known)

• Enabling aggressive compiler optimizations (recent)

• Supporting sophisticated module systems (SML)

• Facilitating program verification (NuPRL, Coq, PVS)

• Serving as program documentation

Page 4

The goal of this work

The goal is to capture memory safety of assembly code through a type system

Memory Safety = Type Safety + Safe Array Access

Page 5

Array bounds checking problem

Array bounds checking refers to determining whether the value of an expression is within the bounds of an array when the expression is used to index the array.

Page 6

Byte copy: A version in C

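void bcopy(int src[], int dst[]) {
  int i;

  /* length() is the slide's notation for array length; it is not standard C */
  if (length(src) != length(dst)) {
    printf("bcopy: unequal lengths\n");
    exit(1);
  }
  /* copy every element, starting at index 0 */
  for (i = 0; i < length(src); i++)
    dst[i] = src[i];
}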

Page 7

Dynamic array bounds checking

• is required for safe languages such as Java, Modula-3, ML, Pascal

• can be expensive in practice (e.g. numerical computation)

• bounds violation is a rich source of program errors in unsafe languages such as C, C++ (e.g. off-by-one error)
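
To make the cost concrete, here is a minimal Python sketch of what a dynamic bounds check amounts to. The helper names `checked_get` and `sum_checked` are hypothetical and do not come from the slides; the point is only that a safe language pays for a comparison on every access.

```python
def checked_get(arr, i):
    """Read arr[i] with the bounds check written out (hypothetical helper)."""
    # Two comparisons per access: this is the run-time work a safe language does.
    if not (0 <= i < len(arr)):
        raise IndexError(f"index {i} out of bounds for length {len(arr)}")
    return arr[i]


def sum_checked(arr):
    """Sum an array, paying for a bounds check on every access."""
    total = 0
    for i in range(len(arr)):
        total += checked_get(arr, i)
    return total
```

In `sum_checked` the loop index is always in range, so every check is redundant. That is the kind of check a static approach aims to eliminate.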

Page 8

Some experimental results

[Bar chart. Axis: 0 to 60. Benchmarks: bcopy, binary search, bubble sort, matrix mult, queen, quick sort, hanoi tower. Series: w/ checks, w/o checks.]

Page 9

Static array bounds checking

• Flow analysis
– no annotations required (fully automatic)
– limited to simple cases and sensitive to program structures
– little or no feedback for detecting program errors

Page 10

Static array bounds checking

• Type-based approaches
– the ML type system is too coarse
– full dependent types are too fine
– dependent ML provides an intermediate type system with practical type-checking
  • it is adequate for array bounds checking elimination
  • the programmer must write some type annotations

Page 11

What are dependent types?

Dependent types depend on the values of language expressions.

For instance,

type     dependent type
array    array(x)
int      int(x)
stack    stack(x)

Page 12

Byte copy: A version in de Caml

let bcopy src dst = begin
  for i = 0 to vect_length(src) - 1 do
    dst..(i) <- src..(i)
  done
end
withtype {n:nat} int vect(n) -> int vect(n) -> unit
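
The annotation says that both vectors share one length n, so every index from 0 to n - 1 is valid for both. The following Python sketch illustrates that idea under a loud assumption: Python cannot express the dependent type, so the shared length is checked once at entry as a stand-in for the obligation that de Caml discharges at compile time.

```python
def bcopy(src, dst):
    """Copy src into dst element by element.

    Stand-in for {n:nat} int vect(n) -> int vect(n) -> unit: the equal-length
    requirement is checked once here instead of being proved statically.
    """
    n = len(src)
    if len(dst) != n:
        raise ValueError("bcopy: unequal lengths")
    # With len(src) == len(dst) == n, every i in range(n) satisfies 0 <= i < n.
    # This is the fact a dependent type-checker would establish before run time.
    for i in range(n):
        dst[i] = src[i]
```

Once the lengths are known to agree, no per-access check is needed inside the loop.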

Page 13

Array bounds checking in mobile code

• It needs to be enforced for safety concerns

• It is difficult to eliminate since the machine which executes the code may not trust the source of the code

• It is time-consuming to be compiled away

Page 14

Some key applications of DTAL

• Compiler verification
• Mobile code security
• Mobile code efficiency

Page 15

Increment: A flow chart

Each row shows an instruction and the machine state after it:

(start)          sp: i:int :: l:label
pop r1           r1 = i; sp: l:label
add r1, r1, 1    r1 = i + 1; sp: l:label
pop r2           r1 = i + 1, r2 = l; sp:
push r1          r2 = l; sp: i+1:int
jmp r2           continue ...

Page 16

Increment: An assembly version

inc: pop r1
     add r1, r1, 1
     pop r2
     push r1
     jmp r2
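
The routine can be traced with a short Python simulation. This is only a sketch: the stack is a Python list whose last element is the top, and the return label is modelled as an opaque value. Both choices are assumptions made for illustration.

```python
def run_inc(stack):
    """Simulate the five-instruction inc routine.

    Expects the argument i on top of the stack and the return label below it.
    Returns the label that control jumps to and the resulting stack.
    """
    regs = {}
    regs["r1"] = stack.pop()       # pop r1        : r1 = i
    regs["r1"] = regs["r1"] + 1    # add r1, r1, 1 : r1 = i + 1
    regs["r2"] = stack.pop()       # pop r2        : r2 = return label
    stack.append(regs["r1"])       # push r1       : result goes on top
    return regs["r2"], stack       # jmp r2        : continue at the label
```

For example, `run_inc(["ret", 41])` returns `("ret", [42])`: the caller's label is recovered and the incremented value is left on the stack.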

Page 17

State types

A state type corresponds to a code continuation. It records the type information about the register file and the stack.

For instance,
• [r1: int(i), r2: int array(i)]
• ('a)[r1: 'a, r2: [r1: 'a]]
• ('a,'b)[r1: 'a, r2: 'b, r3: [r1: 'a, r2: 'b]]
• {n:nat}[sp: [sp: stack(n)] :: stack(n)]

Page 18

Register file

We use an array representation for the register file.

r1: tau1
r2: tau2
...
rn: taun

Page 19

Stack

We use a list representation for the stack.

[Diagram: a list of stack cells (i1, tau1), (i2, tau2), ..., (in, taun); a generic cell is (i, tau)]

Page 20

Increment: A version in DTAL

inc: {i:int}{n:nat}
     [sp: int(i) :: [sp: int(i+1) :: stack(n)] :: stack(n)]

  pop r1          // r1: int(i)
  add r1, r1, 1   // r1: int(i+1)
  pop r2          // r2: [sp: int(i+1) :: stack(n)]
  push r1         // sp: int(i+1) :: stack(n)
  jmp r2
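
The comments above track how the types of r1, r2 and sp change after each instruction. The Python sketch below replays that bookkeeping. The tuple encoding of types is invented for this sketch, and the final check uses plain equality, whereas a real DTAL checker would use subtyping and constraint solving.

```python
def check_inc():
    """Track register and stack types through inc, one instruction at a time.

    Encoding (an assumption of this sketch):
      ("int", k)    stands for int(i + k)
      ("code", sp)  stands for the state type [sp: sp]
    A stack type is a tuple of cell types ending in the tail "stack(n)".
    """
    tail = "stack(n)"
    ret = ("code", (("int", 1), tail))   # [sp: int(i+1) :: stack(n)]
    sp = (("int", 0), ret, tail)         # int(i) :: ret :: stack(n)
    regs = {}

    regs["r1"], sp = sp[0], sp[1:]       # pop r1        -> r1: int(i)
    kind, k = regs["r1"]
    regs["r1"] = (kind, k + 1)           # add r1, r1, 1 -> r1: int(i+1)
    regs["r2"], sp = sp[0], sp[1:]       # pop r2        -> r2: code pointer
    sp = (regs["r1"],) + sp              # push r1       -> sp: int(i+1) :: stack(n)

    # jmp r2: the current stack type must be what the jump target expects.
    kind, expected_sp = regs["r2"]
    return kind == "code" and sp == expected_sp
```

`check_inc()` returns `True`, which mirrors the slide: at the jump, the stack has exactly the type that the return continuation requires.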

Page 21

Type index objects

• index i,j ::= a | c | i+j | i-j | i*j | i/j

• index prop P ::= false | true | i<j | i<=j | i=j | i>=j | i>j | not P | P1 and P2 | P1 or P2

• index sort gamma ::= int | {a: gamma | P}

For instance, nat is a shorthand for {a: int | a >= 0}.
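
The grammar of index expressions and propositions can be made concrete with a small evaluator. This is a sketch only: the nested-tuple encoding is chosen for illustration, and `/` is assumed to be integer (floor) division, which the slide does not specify.

```python
def eval_index(e, env):
    """Evaluate an index expression: a variable name, a constant, or (op, l, r)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return env[e]
    op, a, b = e
    x, y = eval_index(a, env), eval_index(b, env)
    if op == "+":
        return x + y
    if op == "-":
        return x - y
    if op == "*":
        return x * y
    if op == "/":
        return x // y  # assumption: integer (floor) division
    raise ValueError(f"unknown index operator {op!r}")


def eval_prop(p, env):
    """Evaluate an index proposition to a Python bool."""
    if isinstance(p, bool):
        return p
    op = p[0]
    if op == "not":
        return not eval_prop(p[1], env)
    if op == "and":
        return eval_prop(p[1], env) and eval_prop(p[2], env)
    if op == "or":
        return eval_prop(p[1], env) or eval_prop(p[2], env)
    x, y = eval_index(p[1], env), eval_index(p[2], env)
    compare = {"<": x < y, "<=": x <= y, "=": x == y, ">=": x >= y, ">": x > y}
    return compare[op]


def in_nat(value):
    """Membership in the subset sort nat = {a: int | a >= 0}."""
    return isinstance(value, int) and eval_prop((">=", "a", 0), {"a": value})
```

A bounds condition such as `0 <= k and k < i` is then the proposition `("and", ("<=", 0, "k"), ("<", "k", "i"))`, evaluated under an environment that assigns values to `k` and `i`.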

Page 22

Types

• types tau ::= alpha | sigma | int(x) | tau array(x) | stack(x) | prod(tau1,…,taun) | {a:gamma}.tau

• state types sigma ::=[(alpha1,…,alphan){a1:gamma1,…,an:gamman}.state]

• state state ::= (register file,stack)

Note:
int is for {a:int}.int(a)
nat is for {a:nat}.int(a)

Page 23

Instructions in DTAL

• instructions ins ::= aop rd, rs, v | bop r, v | jmp v | load rd, rs(v) | store r | newtuple[tau] r | newarray[tau] r | mov r, v | pop r | push r

• values v ::= () | i | l | r

• instruction sequences I ::= halt | ins; I | l; I

Page 24

Programs in DTAL

• label maps Lambda ::= (l1: sigma1, …, ln: sigman)
• programs ::= (Lambda, I)

Page 25

Memory allocation

• Tuple of type prod(tau1,…,taun)

  [ tau1 | tau2 | ... | taun-1 | taun ]

• Array of type tau array(n)

  [ tau | tau | ... | tau | tau ]

Page 26

Memory allocation: an example

A tuple of type prod(int, prod(int, int)):

[ int | int | int ]

Page 27

Array types are non-variant

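tau1 <= tau2 does not imply tau1 array(n) <= tau2 array(n)

Here is a counterexample, in which r1 and r2 point to the same array:

r1: nat array(2)
r2: int array(2)
array contents: [ 0 | 0 ]
r1 == r2

If nat array(2) <= int array(2) were allowed, storing a negative number through r2 would invalidate the type of r1.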
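
The counterexample can be replayed in Python, where `r1` and `r2` are two names for the same list. The helper `is_nat_array` is hypothetical; it checks at run time the invariant that the type nat array(2) is supposed to guarantee.

```python
def is_nat_array(arr):
    """True if every element satisfies the nat invariant (a >= 0)."""
    return all(isinstance(x, int) and x >= 0 for x in arr)


shared = [0, 0]   # the two-element array from the slide
r1 = shared       # intended type: nat array(2)
r2 = shared       # intended type: int array(2), the same heap object as r1

before = is_nat_array(r1)   # the invariant holds initially
r2[0] = -1                  # a legal store through the int array(2) view
after = is_nat_array(r1)    # the invariant is now broken for r1
```

After the store, `before` is `True` and `after` is `False`. Treating a nat array as an int array lets a write through one alias break the other alias's type, which is why array types cannot be covariant in their element type.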

Page 28

State types are contra-variant

state state’ implies

[state] <= [state’]

Page 29

Typing unconditional jumps

state V : [state’]

state state’

I

state jmp v; I

Page 30

Typing conditional jumps

Instruction: beq r, v; I (test: r = 0?)

Given assumption; state, with r: int(x) and v: [state’]:

• r = 0 (jump taken): state must match state’ under assumption, x = 0
• r != 0 (fall through): I is checked in state under assumption, x != 0

Page 31

Byte copy: A flow chart

Registers: r1: array size, r2: src, r3: dst

bcopy:  r4 <- 0
loop:   r5 <- r1 - r4
        r5 > 0? if not, go to finish
        r5 <- r2(r4)
        r3(r4) <- r5
        r4 <- r4 + 1
        go to loop
finish

Page 32

Byte copy: A version in DTAL

bcopy: {i:nat}
       [r1: int(i), r2: int array(i), r3: int array(i)]
  mov r4, 0   // r4 <- 0
  jmp loop    // start loop

Page 33

Byte copy: a version in DTAL

loop: {i:nat, k:nat}
      [r1: int(i), r2: int array(i), r3: int array(i), r4: int(k)]
  sub r5, r1, r4     // r5 = r1 - r4
  blte r5, finish    // r5 <= 0 ?
  load r5, r2(r4)    // safe load
  store r3(r4), r5   // safe store
  add r4, r4, 1      // r4 <- r4 + 1
  jmp loop           // loop again

finish: []
  halt
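
The load and store are marked safe because, on the fall-through path of `blte`, the checker may assume i - k > 0, and together with k: nat this gives 0 <= k < i. The Python sketch below sanity-checks those index constraints by brute force over small values. A real DTAL checker would hand them to a constraint solver; the enumeration and the function name are assumptions made for illustration and prove nothing beyond the tested range.

```python
def loop_constraints_hold(limit=20):
    """Brute-force check of the index constraints behind the bcopy loop."""
    for i in range(limit):        # i: nat, the array size held in r1
        for k in range(limit):    # k: nat, the loop counter held in r4
            if i - k <= 0:
                continue          # blte r5, finish is taken: no array access
            # Fall-through path: i - k > 0 is now an assumption.
            if not (0 <= k < i):
                return False      # would make load r5, r2(r4) or the store unsafe
            if not (k + 1 >= 0):
                return False      # jmp loop needs r4: int(k') with k': nat
    return True
```

`loop_constraints_hold()` returns `True` for the default range, which matches the slide's claim that both memory accesses need no run-time check.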

Page 34

Operational semantics of DTAL

We use a standard abstract machine for assigning operational semantics to DTAL programs. The machine consists of three finite maps:

(Heap, Register File, Stack)

Page 35

Soundness

• The execution of a DTAL program can either
– terminate normally, or
– run forever, or
– stall.

• A well-typed DTAL program can never stall.
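
The three outcomes can be observed on a toy abstract machine. The sketch below is not the DTAL semantics: the instruction encoding, the step budget that stands in for "runs forever", and the omission of the stack (the programs shown do not use it) are all assumptions made to keep the example short.

```python
def run(blocks, entry, heap, regs, max_steps=1000):
    """Run a tiny DTAL-like machine; return "halt", "stall", or "running".

    blocks maps a label to a list of instruction tuples, heap maps an
    address to a Python list, and regs is the register file.
    """
    def val(v):
        # An operand is either a register name or an integer constant.
        return regs[v] if isinstance(v, str) else v

    code, pc = blocks[entry], 0
    for _ in range(max_steps):
        if pc >= len(code):
            return "stall"                 # fell off the end of a block
        ins = code[pc]
        op = ins[0]
        pc += 1
        if op == "halt":
            return "halt"
        elif op == "mov":
            regs[ins[1]] = val(ins[2])
        elif op == "add":
            regs[ins[1]] = val(ins[2]) + val(ins[3])
        elif op == "sub":
            regs[ins[1]] = val(ins[2]) - val(ins[3])
        elif op == "jmp":
            code, pc = blocks[ins[1]], 0
        elif op == "blte":
            if val(ins[1]) <= 0:
                code, pc = blocks[ins[2]], 0
        elif op == "load":                 # ("load", rd, rs, v): rd <- rs(v)
            arr, idx = heap[regs[ins[2]]], val(ins[3])
            if not (0 <= idx < len(arr)):
                return "stall"             # out-of-bounds read: machine is stuck
            regs[ins[1]] = arr[idx]
        elif op == "store":                # ("store", rd, v, rs): rd(v) <- rs
            arr, idx = heap[regs[ins[1]]], val(ins[2])
            if not (0 <= idx < len(arr)):
                return "stall"             # out-of-bounds write: machine is stuck
            arr[idx] = val(ins[3])
        else:
            return "stall"                 # unknown instruction
    return "running"                       # step budget exhausted


BCOPY = {
    "bcopy": [("mov", "r4", 0), ("jmp", "loop")],
    "loop": [
        ("sub", "r5", "r1", "r4"),
        ("blte", "r5", "finish"),
        ("load", "r5", "r2", "r4"),
        ("store", "r3", "r4", "r5"),
        ("add", "r4", "r4", 1),
        ("jmp", "loop"),
    ],
    "finish": [("halt",)],
}
```

Started with r1 equal to the true array length, `run(BCOPY, "bcopy", ...)` halts after copying. Started with r1 larger than the arrays, which violates the state type of `bcopy`, it stalls at the first out-of-bounds load. The soundness result says the type system rules out the second situation before the code runs.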

Page 36

Related work

Here is a (partial) list of some closely related work.
– Dependent types in practical programming (Xi & Pfenning)
– TALC Compiler (Morrisett et al. at Cornell)
– Safe C compiler (Necula & Lee)
– TIL compiler (the Fox project at CMU)

Page 37

Current status & Future work

• We have finished the following.
– Theoretical development of DTAL
– A prototype implementation of a type-checker for DTAL

• We are working on the following.
– Designing a dependent type system for JVML (de JVML)
– Compiling (a subset of) Java into de JVML

Page 38

Conclusion

We have demonstrated some uses of dependent types at the assembly level.
– They can help compiler debugging and verification
– They can certify the memory safety property of mobile code
– They can lead to safer programs without compromising efficiency