Yoyak: static analysis framework Heejong Lee ScalaDays 2015
Transcript
Page 1: Yoyak ScalaDays 2015

Yoyak: static analysis framework

Heejong Lee

ScalaDays 2015

Page 2: Yoyak ScalaDays 2015

Speaker Introduction

• Has worked in the static analysis industry since 2008

• Studied programming language theory in graduate school

• Has developed several static analyzers, most of them commercial

• Began using Scala six years ago and still uses it actively in everyday development

Page 3: Yoyak ScalaDays 2015

Agenda

• Static analysis

• Theory of abstract interpretation

• Yoyak framework: implementation highlights

• Yoyak framework: Scala experience

• Yoyak framework: Roadmap

Page 4: Yoyak ScalaDays 2015

Static Analysis

Page 5: Yoyak ScalaDays 2015

What is Static Analysis?

• Analyzes source code without actually running it

• Some people prefer to call it white-box testing

• Used for finding bugs, optimizing a compiled binary, calculating a software metric, proving safety properties, etc.

Page 6: Yoyak ScalaDays 2015

Examples of Static Analysis

• Finding bugs: symbolic execution

• Optimizing a compiled binary: data flow analysis

• Calculating a software metric: syntactic analysis

• Proving safety properties: model checking, abstract interpretation, type system

Page 7: Yoyak ScalaDays 2015

Two important terms in Static Analysis

• Soundness

• The analysis result should contain every possibility that can happen at runtime

• If the analysis uses an over-approximation, it is sound

• Completeness

• The analysis result should not contain any possibility that cannot happen at runtime

• If the analysis uses an under-approximation, it is complete

Page 8: Yoyak ScalaDays 2015

Two important terms in Static Analysis

Over-approximation of Semantics

Program Semantics

Under-approximation of Semantics

Page 9: Yoyak ScalaDays 2015

Abstract Interpretation

The beauty of abstraction

http://cargocollective.com/carlyfox/Design

Page 10: Yoyak ScalaDays 2015

What is the result of this expression?

$19224 \times 7483919 \times (11952 - 20392)$

Page 11: Yoyak ScalaDays 2015

What is the result of this expression?

$19224 \times 7483919 \times (11952 - 20392)$

$= -1214270048744640$

How long does it take without a calculator?

Page 12: Yoyak ScalaDays 2015

What is the result of this expression?

$19224 \times 7483919 \times (11952 - 20392)$

$= -1214270048744640$

What if we are not interested in the exact number and only want to know whether it is positive or negative?

Page 13: Yoyak ScalaDays 2015

What is the result of this expression?

$19224 \times 7483919 \times (11952 - 20392)$

$(+) \times (+) \times (-) = (-)$

$= n \quad (n \in \mathbb{Z} \land n < 0)$

Page 14: Yoyak ScalaDays 2015

What is the result of this expression?

$19224 \times 7483919 \times (11952 - 20392)$

$= -1214270048744640$ (takes 30 seconds)

$= n \quad (n \in \mathbb{Z} \land n < 0)$ (takes 3 seconds)

• inaccurate but not incorrect

• accurate enough for a specific purpose

• much faster than a real calculation

This is abstract interpretation
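
The sign calculation above can be sketched in Scala. This is only an illustration: the names `Sign`, `alpha` and `times` are hypothetical and are not part of Yoyak. The sign of the parenthesized difference is obtained by abstracting its concrete value, because a pure sign domain cannot decide the sign of "positive minus positive".

```scala
object SignDemo {
  // Hypothetical sign domain: Top means "sign unknown"
  sealed trait Sign
  case object Neg  extends Sign
  case object Zero extends Sign
  case object Pos  extends Sign
  case object Top  extends Sign

  // Abstraction of one concrete integer to its sign
  def alpha(n: BigInt): Sign = n.signum match {
    case -1 => Neg
    case 0  => Zero
    case _  => Pos
  }

  // Abstract multiplication: the rule of signs
  def times(a: Sign, b: Sign): Sign = (a, b) match {
    case (Zero, _) | (_, Zero) => Zero
    case (Top, _) | (_, Top)   => Top
    case (x, y) if x == y      => Pos
    case _                     => Neg
  }

  // (+) x (+) x (-), without ever multiplying the large numbers
  val result: Sign =
    times(times(alpha(BigInt(19224)), alpha(BigInt(7483919))),
          alpha(BigInt(11952) - BigInt(20392)))
}
```

`SignDemo.result` evaluates to `Neg`, which agrees with the sign of the exact product.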

Page 15: Yoyak ScalaDays 2015

Is this program safe from buffer overruns?

void foo(int x) {
  String[] strs = new String[10];
  int index = 0;
  while(x > 0) {
    index = index + 1;
    x = x - 1;
  }
  strs[index] = "hello!";
}

Page 16: Yoyak ScalaDays 2015

No, ArrayIndexOutOfBoundsException may occur at the last line

void foo(int x) {
  String[] strs = new String[10];
  int index = 0;            // index = [0,0]
  while(x > 0) {
    index = index + 1;      // index = [1,∞]
    x = x - 1;
  }
  strs[index] = "hello!";   // index = [0,∞]
}

Page 17: Yoyak ScalaDays 2015

Abstract interpretation for dummies

• Roughly but soundly execute the program

Page 18: Yoyak ScalaDays 2015

Abstract interpretation for brains

?

Page 19: Yoyak ScalaDays 2015

First, we need to define precisely, in mathematical terms, what “domain” and “semantics” mean

Page 20: Yoyak ScalaDays 2015

Let me introduce you to the Javar language

Page 21: Yoyak ScalaDays 2015

1

Page 22: Yoyak ScalaDays 2015

1

What does this program mean?

Page 23: Yoyak ScalaDays 2015

Javar-1

$C \to n \quad (n \in \mathbb{Z})$

Page 24: Yoyak ScalaDays 2015

Javar-1 semantic domain

$n \in Value = \mathbb{Z}$

$[\![C]\!] \in Value$

Page 25: Yoyak ScalaDays 2015

Javar-1 semantics

$[\![n]\!] = n$

Page 26: Yoyak ScalaDays 2015

1+1

Page 27: Yoyak ScalaDays 2015

Javar-2

$C \to n \ op \ n \quad (n \in \mathbb{Z},\ op \in \{+, -, *, /\})$

Page 28: Yoyak ScalaDays 2015

Javar-{1,2} semantic domain

$n \in Value = \mathbb{Z}$

$[\![C]\!] \in Value$

Page 29: Yoyak ScalaDays 2015

Javar-2 semantics

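$[\![n]\!] = n$

$[\![n_1 + n_2]\!] = [\![n_1]\!] + [\![n_2]\!]$

$[\![n_1 - n_2]\!] = [\![n_1]\!] - [\![n_2]\!]$

$[\![n_1 * n_2]\!] = [\![n_1]\!] \times [\![n_2]\!]$

$[\![n_1 / n_2]\!] = [\![n_1]\!] \div [\![n_2]\!]$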

Page 30: Yoyak ScalaDays 2015

x := x + 1

Page 31: Yoyak ScalaDays 2015

Javar-3

$C \to x := E$

$E \to n \quad (n \in \mathbb{Z})$
$\quad \mid x$
$\quad \mid E \ op \ E \quad (op \in \{+, -, *, /\})$

Page 32: Yoyak ScalaDays 2015

Javar-3 semantic domain

$M \in Memory = Var \to Value$

$n \in Value = \mathbb{Z}$

$x \in Var = Variables$

$[\![C]\!] \in Memory \to Memory$

$[\![E]\!] \in Memory \to \mathbb{Z}$

Page 33: Yoyak ScalaDays 2015

Javar-3 semantics

$[\![x := E]\!]M = M\{x \to [\![E]\!]M\}$

$[\![n]\!]M = n$

$[\![x]\!]M = M(x)$

$[\![E_1 \ \{+, -, *, /\} \ E_2]\!]M = [\![E_1]\!]M \ \{+, -, \times, \div\} \ [\![E_2]\!]M$

Page 34: Yoyak ScalaDays 2015

x := 100 + 2; if(x) x := x * 10 else x := x / 2; while(x) x := x - 1

Page 35: Yoyak ScalaDays 2015

Javar-4

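$C \to x := E$
$\quad \mid \mathtt{if}\ (E)\ C\ \mathtt{else}\ C$
$\quad \mid \mathtt{while}\ (E)\ C$
$\quad \mid C ; C$

$E \to n \quad (n \in \mathbb{Z})$
$\quad \mid x$
$\quad \mid E \ op \ E \quad (op \in \{+, -, *, /\})$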

Page 36: Yoyak ScalaDays 2015

Javar-{3,4} semantic domain

$M \in Memory = Var \to Value$

$n \in Value = \mathbb{Z}$

$x \in Var = Variables$

$[\![C]\!] \in Memory \to Memory$

$[\![E]\!] \in Memory \to \mathbb{Z}$

Page 37: Yoyak ScalaDays 2015

Javar-4 semantics

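$[\![x := E]\!]M = M\{x \to [\![E]\!]M\}$

$[\![\mathtt{if}(E)\ C_1\ \mathtt{else}\ C_2]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![C_1]\!]M \text{ else } [\![C_2]\!]M$

$[\![\mathtt{while}(E)\ C]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\mathtt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$

$[\![n]\!]M = n$

$[\![x]\!]M = M(x)$

$[\![E_1 \ \{+, -, *, /\} \ E_2]\!]M = [\![E_1]\!]M \ \{+, -, \times, \div\} \ [\![E_2]\!]M$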
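
The Javar-4 semantics can be transcribed almost line by line into a small Scala interpreter. The sketch below is an illustration only: the constructor names (`Num`, `Var`, `BinOp`, `Assign`, `If`, `While`, `Sequence`) are hypothetical, and the rule for `C;C`, which this slide does not list, is assumed to be plain composition as in the later interval-analysis slides.

```scala
object Javar4 {
  sealed trait Expr
  final case class Num(n: Int) extends Expr
  final case class Var(x: String) extends Expr
  final case class BinOp(op: Char, l: Expr, r: Expr) extends Expr

  sealed trait Cmd
  final case class Assign(x: String, e: Expr) extends Cmd
  final case class If(cond: Expr, thenC: Cmd, elseC: Cmd) extends Cmd
  final case class While(cond: Expr, body: Cmd) extends Cmd
  final case class Sequence(c1: Cmd, c2: Cmd) extends Cmd

  // Memory = Var -> Value
  type Memory = Map[String, Int]

  // [[E]] : Memory -> Z
  def eval(e: Expr, m: Memory): Int = e match {
    case Num(n) => n
    case Var(x) => m(x)
    case BinOp(op, l, r) =>
      val a = eval(l, m)
      val b = eval(r, m)
      op match {
        case '+' => a + b
        case '-' => a - b
        case '*' => a * b
        case '/' => a / b
      }
  }

  // [[C]] : Memory -> Memory
  def exec(c: Cmd, m: Memory): Memory = c match {
    case Assign(x, e)     => m + (x -> eval(e, m))
    case If(e, c1, c2)    => if (eval(e, m) != 0) exec(c1, m) else exec(c2, m)
    case While(e, body)   => if (eval(e, m) != 0) exec(c, exec(body, m)) else m
    case Sequence(c1, c2) => exec(c2, exec(c1, m))
  }

  // x := 100 + 2; if(x) x := x * 10 else x := x / 2; while(x) x := x - 1
  val program: Cmd = Sequence(
    Assign("x", BinOp('+', Num(100), Num(2))),
    Sequence(
      If(Var("x"),
         Assign("x", BinOp('*', Var("x"), Num(10))),
         Assign("x", BinOp('/', Var("x"), Num(2)))),
      While(Var("x"), Assign("x", BinOp('-', Var("x"), Num(1))))))
}
```

Running `Javar4.exec(Javar4.program, Map.empty)` walks x from 102 to 1020 and then counts it down to 0.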

Page 38: Yoyak ScalaDays 2015

This is not a definition

$[\![\mathtt{while}(E)\ C]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\mathtt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$

GNU = GNU’s Not Unix

Page 39: Yoyak ScalaDays 2015

The existence and uniqueness of the least fixed point are guaranteed by domain theory

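$[\![\mathtt{while}(E)\ C]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\mathtt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$

$[\![\mathtt{while}(E)\ C]\!] = \lambda M.\ \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\mathtt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$

$F = \lambda M.\ \text{if } [\![E]\!]M \neq 0 \text{ then } F([\![C]\!]M) \text{ else } M$

$F = H(F)$

$[\![\mathtt{while}(E)\ C]\!] = \mathrm{fix}(\lambda F.\ \lambda M.\ \text{if } [\![E]\!]M \neq 0 \text{ then } F([\![C]\!]M) \text{ else } M)$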
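
The last equation can be mimicked in Scala with an explicit fixed-point combinator. This is a sketch under assumptions: `fix` here simply unfolds H on demand (an operational reading, not the least-fixed-point construction of domain theory), and `cond` and `body` stand in for the semantics of E and C in the loop `while(x) x := x - 1`.

```scala
object FixDemo {
  type Memory = Map[String, Int]
  type Sem = Memory => Memory

  // fix(H) behaves like F with F = H(F); H is unfolded one step per call
  def fix(h: Sem => Sem): Sem = m => h(fix(h))(m)

  // [[E]]M for E = x, and [[C]]M for C = (x := x - 1)
  val cond: Memory => Int = m => m("x")
  val body: Sem = m => m + ("x" -> (m("x") - 1))

  // [[while(E) C]] = fix(λF. λM. if [[E]]M ≠ 0 then F([[C]]M) else M)
  val whileSem: Sem = fix(f => m => if (cond(m) != 0) f(body(m)) else m)
}
```

For example, `FixDemo.whileSem(Map("x" -> 3))` returns a memory in which x is 0.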

Page 40: Yoyak ScalaDays 2015

Abstract interpretation revisited

• Safely estimate program semantics in finite time

• Abstraction is not omission: it guarantees soundness

• Most static analysis techniques can be formulated as abstract interpretation

Page 41: Yoyak ScalaDays 2015

Key Elements of Abstract Interpretation

• Domain : concrete domain, abstract domain

• Semantics : concrete semantics, abstract semantics

• Galois connection : pair of abstraction and concretization functions

• CPO : complete partial order

• Continuous function : preserves least upper bounds

Page 42: Yoyak ScalaDays 2015

Galois Connection

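$\forall x \in D,\ \hat{x} \in \hat{D} :\ \alpha(x) \sqsubseteq \hat{x} \iff x \sqsubseteq \gamma(\hat{x})$

[Figure: an element $x$ of the concrete domain $D$ and an element $\hat{x}$ of the abstract domain $\hat{D}$]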
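
As a concrete instance, take finite, non-empty sets of integers as the concrete domain and intervals as the abstract domain. The sketch below checks the Galois-connection equivalence on sample values. All names are hypothetical, and the concretization γ is represented only through a membership test because it may be an infinite set.

```scala
object GaloisDemo {
  final case class Interv(lo: Int, hi: Int)

  // α: smallest interval containing every element (s must be non-empty)
  def alpha(s: Set[Int]): Interv = Interv(s.min, s.max)

  // Abstract order: a ⊑ b iff a is contained in b
  def leq(a: Interv, b: Interv): Boolean = b.lo <= a.lo && a.hi <= b.hi

  // s ⊆ γ(i), checked element by element
  def inGamma(s: Set[Int], i: Interv): Boolean =
    s.forall(n => i.lo <= n && n <= i.hi)

  // α(s) ⊑ i  ⟺  s ⊆ γ(i)
  def holds(s: Set[Int], i: Interv): Boolean =
    leq(alpha(s), i) == inGamma(s, i)
}
```

Both sides of the equivalence are true for `Set(1, 10)` against `Interv(0, 10)` and both are false against `Interv(2, 10)`, so `holds` returns true in either case.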

Page 43: Yoyak ScalaDays 2015

CPO

A partial order ⊑ exists on D

A least element ⊥ exists: ⊥ ⊑ y (for all y ∈ D)

Every totally ordered subset (chain) of D has a least upper bound x where x ∈ D

Page 44: Yoyak ScalaDays 2015

Lattices

Partially ordered set in which every two elements have a unique LUB(⊔)

and a unique GLB(⊓)

Page 45: Yoyak ScalaDays 2015

Continuous Function

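$\forall \text{ ordered subset } S \subseteq D,\quad F\left(\bigsqcup_{x \in S} x\right) = \bigsqcup_{x \in S} F(x)$

[Figure: a chain x, y, z in D mapped by F to the chain F(x), F(y), F(z) in D]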

Page 46: Yoyak ScalaDays 2015

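Abstract Interpretation in a Nutshell

Program Semantics: Concrete vs. Abstract

Domain
  Concrete: $D$ should be CPO
  Abstract: $\hat{D}$ should be CPO

Galois Connection
  $\alpha : D \to \hat{D}$, $\gamma : \hat{D} \to D$

Semantic Function
  Concrete: $F : D \to D$ should be continuous
  Abstract: $\hat{F} : \hat{D} \to \hat{D}$ should be monotonic

Program Execution
  Concrete: $\mathrm{lfp}\, F = \bigsqcup_{i \in \mathbb{N}} F^i(\bot)$
  Abstract: $\bigsqcup_{i \in \mathbb{N}} \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$

Performing analysis using abstract interpretation = calculating $\hat{X}$ in finite time

And the following formula is always satisfied (soundness guarantee)

$\mathrm{lfp}\, F \sqsubseteq \gamma(\hat{X})$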

Page 47: Yoyak ScalaDays 2015

Abstract Interpretation in a Nutshell

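$\mathrm{lfp}\, F \sqsubseteq \gamma(\hat{X})$

$\alpha \circ F \sqsubseteq \hat{F} \circ \alpha$

[Figure: the concrete domain $D$ and the abstract domain $\hat{D}$, showing $\mathrm{lfp}\, F$ and $\hat{X}$; the gap between them is labeled "false positives"]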

Page 48: Yoyak ScalaDays 2015

Is this program safe from buffer overruns?

void foo(int x) {
  String[] strs = new String[10];
  int index = 0;
  if(x > 0) {
    index = 1;
  } else {
    index = 10;
  }
  strs[index] = "hello!";
}

Page 49: Yoyak ScalaDays 2015

void foo(int x) {
  String[] strs = new String[10];
  int index = 0;            // index = [0,0]
  if(x > 0) {
    index = 1;              // index = [1,1]
  } else {
    index = 10;             // index = [10,10]
  }
  strs[index] = "hello!";   // index = [1,10]
}

Page 50: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Concrete domain: the domain in the real world

$Memory = Var \to Value$

$Value = 2^{\mathbb{Z}}$

$\mathcal{C} \in C \to Memory \to Memory$

$\mathcal{V} \in E \to Memory \to Value$

Page 51: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Concrete semantics: the semantics in the real world

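$\mathcal{C}[\![x := E]\!]\, m = m\{x \mapsto \mathcal{V}[\![E]\!]\, m\}$

$\mathcal{C}[\![\mathtt{if}(E)\ C_1\ C_2]\!]\, m = \mathcal{V}[\![E]\!]\, m \ ?\ \mathcal{C}[\![C_1]\!]\, m : \mathcal{C}[\![C_2]\!]\, m$

$\mathcal{C}[\![\mathtt{while}(E)\ C]\!]\, m = \mathcal{V}[\![E]\!]\, m \ ?\ \mathcal{C}[\![\mathtt{while}(E)\ C]\!]\,(\mathcal{C}[\![C]\!]\, m) : m$

$\mathcal{C}[\![C_1 ; C_2]\!]\, m = \mathcal{C}[\![C_2]\!]\,(\mathcal{C}[\![C_1]\!]\, m)$

$\mathcal{V}[\![x]\!]\, m = m\ x$

$\mathcal{V}[\![n]\!]\, m = \{n\}$

$\mathcal{V}[\![E_1 + E_2]\!]\, m = (\mathcal{V}[\![E_1]\!]\, m) + (\mathcal{V}[\![E_2]\!]\, m)$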

Page 52: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Concrete execution of a program

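$\bot \sqsubset F(\bot) \sqsubset F(F(\bot)) \sqsubset F(F(F(\bot))) \sqsubset \dots \sqsubset F^i(\bot) = F^{i+1}(\bot)$

$F^i(\bot) \in Memory$ is the execution result of a program

$F = \lambda m.\ \mathcal{C}[\![C]\!]\, m$

$\mathrm{lfp}\, F = \bigsqcup_{i \in \mathbb{N}} F^i(\{\})$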

Page 53: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Abstract domain: the domain we will use in an analysis

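$\widehat{Memory} = Var \to \widehat{Value}$

$\widehat{Value} = \hat{\mathbb{Z}} \cup \{\bot\}$

$\hat{\mathbb{Z}} = \{[a, b] \mid a \in \mathbb{Z} \cup \{-\infty\},\ b \in \mathbb{Z} \cup \{\infty\},\ a \le b\}$

$\hat{\mathcal{C}} \in C \to \widehat{Memory} \to \widehat{Memory}$

$\hat{\mathcal{V}} \in E \to \widehat{Memory} \to \widehat{Value}$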

Page 54: Yoyak ScalaDays 2015

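[Figure: Hasse diagram of the interval lattice. The lowest row holds the singleton intervals ..., [-3,-3], [-2,-2], [-1,-1], [0,0], [1,1], [2,2], ...; above them come progressively wider intervals such as [-1,0], [0,1], [-2,0], [-1,1], [-3,0], [-2,1], [-1,2]; then the half-bounded intervals such as [0,∞], [-1,∞], [-2,∞] and [-∞,0], [-∞,1], [-∞,2]; [-∞,∞] is at the top.]

Lattice of Interval Domain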
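
A minimal Scala sketch of this lattice follows. It is not Yoyak's `Interval` implementation: bounds are represented as `Double` purely so that ±∞ comes for free, and the names `Bot`, `Interv`, `leq`, `join` and `plus` are chosen for illustration.

```scala
object IntervalDemo {
  val Inf: Double = Double.PositiveInfinity

  sealed trait Interval
  case object Bot extends Interval                       // ⊥: no value at all
  final case class Interv(lo: Double, hi: Double) extends Interval {
    require(lo <= hi, "lower bound must not exceed upper bound")
  }
  val Top: Interval = Interv(-Inf, Inf)                  // [-∞,∞]

  // Partial order ⊑: interval containment, with ⊥ below everything
  def leq(a: Interval, b: Interval): Boolean = (a, b) match {
    case (Bot, _)                         => true
    case (_, Bot)                         => false
    case (Interv(l1, h1), Interv(l2, h2)) => l2 <= l1 && h1 <= h2
  }

  // Least upper bound ⊔: smallest interval covering both arguments
  def join(a: Interval, b: Interval): Interval = (a, b) match {
    case (Bot, x)                         => x
    case (x, Bot)                         => x
    case (Interv(l1, h1), Interv(l2, h2)) => Interv(math.min(l1, l2), math.max(h1, h2))
  }

  // Abstract addition: add the bounds pairwise; ⊥ is absorbing
  def plus(a: Interval, b: Interval): Interval = (a, b) match {
    case (Interv(l1, h1), Interv(l2, h2)) => Interv(l1 + l2, h1 + h2)
    case _                                => Bot
  }
}
```

With these definitions, joining [1,1] and [10,10] yields [1,10], and every interval is below `Top`.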

Page 55: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Abstract semantics: the semantics we will use in an analysis

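$\hat{\mathcal{C}}[\![x := E]\!]\, \hat{m} = \hat{m}\{x \mapsto \hat{\mathcal{V}}[\![E]\!]\, \hat{m}\}$

$\hat{\mathcal{C}}[\![\mathtt{if}(E)\ C_1\ C_2]\!]\, \hat{m} = \hat{\mathcal{C}}[\![C_1]\!]\, \hat{m} \sqcup \hat{\mathcal{C}}[\![C_2]\!]\, \hat{m}$

$\hat{\mathcal{C}}[\![\mathtt{while}(E)\ C]\!]\, \hat{m} = \hat{m} \sqcup \hat{\mathcal{C}}[\![\mathtt{while}(E)\ C]\!]\,(\hat{\mathcal{C}}[\![C]\!]\, \hat{m})$

$\hat{\mathcal{C}}[\![C_1 ; C_2]\!]\, \hat{m} = \hat{\mathcal{C}}[\![C_2]\!]\,(\hat{\mathcal{C}}[\![C_1]\!]\, \hat{m})$

$\hat{\mathcal{V}}[\![x]\!]\, \hat{m} = \hat{m}\ x$

$\hat{\mathcal{V}}[\![n]\!]\, \hat{m} = \alpha\{n\}$

$\hat{\mathcal{V}}[\![E_1 + E_2]\!]\, \hat{m} = (\hat{\mathcal{V}}[\![E_1]\!]\, \hat{m}) \mathbin{\hat{+}} (\hat{\mathcal{V}}[\![E_2]\!]\, \hat{m})$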

Page 56: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Abstract execution of a program

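$\hat{F} = \lambda \hat{m}.\ \hat{\mathcal{C}}[\![C]\!]\, \hat{m}$

$\bigsqcup_{i \in \mathbb{N}} \hat{F}^i(\{\}) \sqsubseteq \hat{X}$

$\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \sqsubset \dots \sqsubset \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$

$\hat{X}$ is the analysis result of a program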

Page 57: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Widening

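What if this chain has infinite length?

$\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \sqsubset \dots \sqsubset \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$

We need a widening operator $\nabla$

$\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \sqsubset \dots \sqsubset \hat{F}^{i-1}(\hat{\bot}) \mathbin{\nabla} \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$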

Page 58: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

• Widening

$\bot \sqsubset [0, 0] \sqsubset [0, 1] \sqsubset [0, 2] \sqsubset \dots \sqsubset [0, i-1] \mathbin{\nabla} [0, i] \sqsubseteq [0, \infty]$

void foo(int x) {
  String[] strs = new String[10];
  int index = 0;            // index = [0,0]
  while(x > 0) {
    index = index + 1;      // index = [1,∞]
    x = x - 1;
  }
  strs[index] = "hello!";   // index = [0,∞]
}
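
The effect of widening on this loop can be reproduced with a few lines of Scala. The sketch assumes the standard interval widening (any bound that grows jumps straight to infinity), which may differ in detail from Yoyak's `Widening`, and it tracks only the variable `index`. All names are hypothetical.

```scala
object WideningDemo {
  val Inf: Double = Double.PositiveInfinity
  final case class Interv(lo: Double, hi: Double)

  def join(a: Interv, b: Interv): Interv =
    Interv(math.min(a.lo, b.lo), math.max(a.hi, b.hi))

  def leq(a: Interv, b: Interv): Boolean = b.lo <= a.lo && a.hi <= b.hi

  // Widening ∇: keep stable bounds, push unstable bounds to infinity
  def widen(old: Interv, next: Interv): Interv =
    Interv(if (next.lo < old.lo) -Inf else old.lo,
           if (next.hi > old.hi) Inf else old.hi)

  // Abstract effect of one loop iteration: index = index + 1
  def body(i: Interv): Interv = Interv(i.lo + 1, i.hi + 1)

  // Iterate at the loop head until the value no longer grows
  def loopHead(init: Interv): Interv = {
    var cur = init
    var stable = false
    while (!stable) {
      val next = join(init, body(cur))   // entry value ⊔ value flowing back from the body
      if (leq(next, cur)) stable = true
      else cur = widen(cur, next)
    }
    cur
  }
}
```

Starting from [0,0], the first round produces [0,1], widening turns it into [0,∞], and the second round confirms that this value is stable, so the iteration stops after two rounds instead of climbing forever.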

Page 59: Yoyak ScalaDays 2015

Is this program safe from buffer overruns?

void foo(int x) {
  String[] strs = new String[10];
  int index = 0;
  if(x > 0) {
    index = 1;
  } else {
    index = 10;
  }
  strs[index] = "hello!";
}

Page 60: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

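[Figure: syntax tree with numbered nodes. Node 0 is the root with children 1 and 2; node 2 has children 3 and 4; node 3 has children 5 and 6.]

index = 0; if(x > 0) index = 1 else index = 10; result = index

$\hat{\mathcal{C}}[\![C_0]\!]\, \hat{m} = \hat{\mathcal{C}}[\![C_2]\!]\,(\hat{\mathcal{C}}[\![C_1]\!]\, \hat{m})$

$\hat{\mathcal{C}}[\![C_1]\!]\, \hat{m} = \hat{m}\{index \mapsto \alpha\{0\}\}$

$\hat{\mathcal{C}}[\![C_2]\!]\, \hat{m} = \hat{\mathcal{C}}[\![C_4]\!]\,(\hat{\mathcal{C}}[\![C_3]\!]\, \hat{m})$

$\hat{\mathcal{C}}[\![C_3]\!]\, \hat{m} = \hat{\mathcal{C}}[\![C_5]\!]\, \hat{m} \sqcup \hat{\mathcal{C}}[\![C_6]\!]\, \hat{m}$

$\hat{\mathcal{C}}[\![C_4]\!]\, \hat{m} = \hat{m}\{result \mapsto \hat{m}\ index\}$

$\hat{\mathcal{C}}[\![C_5]\!]\, \hat{m} = \hat{m}\{index \mapsto \alpha\{1\}\}$

$\hat{\mathcal{C}}[\![C_6]\!]\, \hat{m} = \hat{m}\{index \mapsto \alpha\{10\}\}$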

Page 61: Yoyak ScalaDays 2015

Interval analysis based on abstract interpretation

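$\hat{\mathcal{C}}[\![C_0]\!]\, \{\} = \hat{\mathcal{C}}[\![C_2]\!]\,(\hat{\mathcal{C}}[\![C_1]\!]\, \{\})$

$\hat{\mathcal{C}}[\![C_1]\!]\, \{\} = \{index \mapsto [0, 0]\}$

$\hat{\mathcal{C}}[\![C_2]\!]\, \{index \mapsto [0, 0]\} = \hat{\mathcal{C}}[\![C_4]\!]\,(\hat{\mathcal{C}}[\![C_3]\!]\, \{index \mapsto [0, 0]\})$

$\hat{\mathcal{C}}[\![C_3]\!]\, \{index \mapsto [0, 0]\} = \hat{\mathcal{C}}[\![C_5]\!]\, \{index \mapsto [0, 0]\} \sqcup \hat{\mathcal{C}}[\![C_6]\!]\, \{index \mapsto [0, 0]\}$

$\hat{\mathcal{C}}[\![C_4]\!]\, \{index \mapsto [1, 10]\} = \{index \mapsto [1, 10],\ result \mapsto [1, 10]\}$

$\hat{\mathcal{C}}[\![C_5]\!]\, \{index \mapsto [0, 0]\} = \{index \mapsto [1, 1]\}$

$\hat{\mathcal{C}}[\![C_6]\!]\, \{index \mapsto [0, 0]\} = \{index \mapsto [10, 10]\}$

$\hat{\mathcal{C}}[\![C_0]\!]\, \{\} = \{index \mapsto [1, 10],\ result \mapsto [1, 10]\}$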
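
The same derivation can be replayed mechanically. The following Scala sketch implements just enough of the abstract semantics for this program: constant assignment, variable copy, the join at `if`, and sequencing. The command constructors are hypothetical, a variable missing from the abstract memory is treated as ⊥, and the branch condition is ignored exactly as in the abstract `if` rule of the earlier slide.

```scala
object AbstractIfDemo {
  final case class Interv(lo: Int, hi: Int)
  type AbsMemory = Map[String, Interv]

  def join(a: Interv, b: Interv): Interv =
    Interv(math.min(a.lo, b.lo), math.max(a.hi, b.hi))

  // Pointwise join of two abstract memories (absent key = ⊥)
  def joinMem(m1: AbsMemory, m2: AbsMemory): AbsMemory =
    m2.foldLeft(m1) { case (acc, (k, v)) =>
      acc + (k -> acc.get(k).map(join(_, v)).getOrElse(v))
    }

  sealed trait Cmd
  final case class AssignConst(x: String, n: Int) extends Cmd    // x := n
  final case class AssignVar(x: String, y: String) extends Cmd   // x := y
  final case class If(thenC: Cmd, elseC: Cmd) extends Cmd        // condition not modeled
  final case class Sequence(c1: Cmd, c2: Cmd) extends Cmd        // c1; c2

  def exec(c: Cmd, m: AbsMemory): AbsMemory = c match {
    case AssignConst(x, n) => m + (x -> Interv(n, n))             // α{n} = [n,n]
    case AssignVar(x, y)   => m + (x -> m(y))
    case If(t, f)          => joinMem(exec(t, m), exec(f, m))
    case Sequence(c1, c2)  => exec(c2, exec(c1, m))
  }

  // index = 0; if(x > 0) index = 1 else index = 10; result = index
  val program: Cmd = Sequence(
    AssignConst("index", 0),
    Sequence(If(AssignConst("index", 1), AssignConst("index", 10)),
             AssignVar("result", "index")))

  val result: AbsMemory = exec(program, Map.empty)
}
```

`AbstractIfDemo.result` maps both `index` and `result` to the interval [1,10], matching the last line of the derivation.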

Page 62: Yoyak ScalaDays 2015

Is this program safe from buffer overruns?

void foo(int x) {
  String[] strs = new String[10];
  int index = 0;
  if(x > 0) {
    index = 1;
  } else {
    index = 10;
  }
  strs[index] = "hello!";
}

index may hold any integer between 1 and 10

Since the size of the buffer strs is 10 (valid indices 0 to 9), ArrayIndexOutOfBoundsException may occur at strs[index]

Page 63: Yoyak ScalaDays 2015

Yoyak

Do not reinvent the wheel

https://trimaps.com/assets/website/dontreinventthemap-6ba62b8ba05d4957d2ed772584d7e4cd.png

Page 64: Yoyak ScalaDays 2015

Motivation

• Do not reinvent the wheel : many components that static analyzers often use are reusable

• CFG data types : construction, optimization, visualization

• Graph algorithms : unrolling loops, finding loop heads, finding topological order

• Intermediate language data types : construction, optimization, pretty printing

• Common abstract domains : integer interval, abstract object, abstract memory

• Common abstract semantics : assignment, invoking methods, evaluating binary expressions

Page 65: Yoyak ScalaDays 2015

Motivation

• A natural fit for a framework : the theory of abstract interpretation guarantees soundness and termination of the analysis if the user supplies a valid abstract domain and abstract semantics

Generic fixed point computation engine

Abstract domain D Abstract semantics F

Fixed point x = F(x) (x∈D)

Page 66: Yoyak ScalaDays 2015

Overview

Yoyak

Abstract Domain Fixed Point Computation Abstract Semantics

MapDom

MemDom

Interval

ArithmeticOps

LatticeOps

StdSemantics

ForwardAnalysis

AbstractTransferable

Widening

Galois

IL

FlowSensitiveFixedPointComputation

Worklist

WideningAtLoopHeads

InterproceduralIteration

DoWidening

CommonIL

Attachable

Typable

Page 67: Yoyak ScalaDays 2015

Fixed-point Computation in Yoyak

Built-in work-list algorithm

[Figure: control-flow graph with the following nodes]

ENTRY

x := 10

Assume (y == 0) println("0")

Assume (y != 0)

Assume (y == 1) println("1")

Assume (y != 1)

println("2")

Assume (z) throw new Ex();

Assume (!z) println("done") return;

EXIT

def computeFixedPoint(startNodes: List[BasicBlock])(implicit widening: Option[Widening[D]] = None) : MapDom[BasicBlock,D] = {
  worklist.add(startNodes:_*)
  var map = MapDom.empty[BasicBlock,D]
  while(worklist.size() > 0) {
    val bb = worklist.pop().get
    val prevInputs = memoryFetcher(map,bb)
    val prev = getInput(map,prevInputs)
    val (mapOut,next) = work(map,prev,bb)
    val orig = map.get(bb)
    val isStableOpt = ops.<=(next,orig)
    if(isStableOpt.isEmpty) {
      println("error: abs. transfer func. is not distributive")
    }
    if(!isStableOpt.get) {
      val widened =
        if(widening.nonEmpty) { doWidening(widening.get)(orig,next,bb) }
        else next
      map = mapOut.update(bb->widened)
      val nextWork = getNextBlocks(bb)
      worklist.add(nextWork:_*)
    }
  }
  map
}
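
For readers without the Yoyak sources at hand, the idea behind the work-list loop can be shown in a self-contained form. The sketch below is not Yoyak's implementation: the `Lattice` trait, the `fixedPoint` signature and the toy graph are assumptions made for illustration, and widening is left out, so termination relies on the lattice having finite height.

```scala
import scala.collection.mutable

object WorklistDemo {
  // Hypothetical minimal lattice interface
  trait Lattice[D] {
    def bottom: D
    def leq(a: D, b: D): Boolean
    def join(a: D, b: D): D
  }

  // Generic forward work-list iteration over a graph with nodes of type N
  def fixedPoint[N, D](entry: N, init: D, succs: N => List[N], preds: N => List[N],
                       transfer: (N, D) => D)(implicit lat: Lattice[D]): Map[N, D] = {
    var out = Map.empty[N, D]
    def get(n: N): D = out.getOrElse(n, lat.bottom)
    val worklist = mutable.Queue(entry)
    while (worklist.nonEmpty) {
      val n = worklist.dequeue()
      // Input = join of all predecessor outputs (plus the initial state at the entry)
      val in = preds(n).map(get).foldLeft(if (n == entry) init else lat.bottom)(lat.join)
      val next = transfer(n, in)
      if (!lat.leq(next, get(n))) {        // not stable yet: record and propagate
        out = out.updated(n, lat.join(get(n), next))
        worklist ++= succs(n)
      }
    }
    out
  }

  // Toy instance: the set of nodes that may have executed, on the graph 1 -> 2 -> 3 -> 2
  implicit val intSetLattice: Lattice[Set[Int]] = new Lattice[Set[Int]] {
    def bottom: Set[Int] = Set.empty[Int]
    def leq(a: Set[Int], b: Set[Int]): Boolean = a.subsetOf(b)
    def join(a: Set[Int], b: Set[Int]): Set[Int] = a ++ b
  }
  val succs: Map[Int, List[Int]] = Map(1 -> List(2), 2 -> List(3), 3 -> List(2))
  val preds: Map[Int, List[Int]] = Map(1 -> Nil, 2 -> List(1, 3), 3 -> List(2))

  val demo: Map[Int, Set[Int]] =
    fixedPoint[Int, Set[Int]](1, Set.empty[Int], succs, preds, (n, in) => in + n)
}
```

In the toy run, node 2 is processed twice because of the back edge from node 3, after which every node is stable and the queue drains.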

Page 68: Yoyak ScalaDays 2015

Fixed-point Computation in Yoyak

Built-in work-list algorithm

trait FlowSensitiveFixedPointComputation[D<:Galois]
    extends FlowSensitiveIteration[D] with CfgNavigator[D] with DoWidening[D] {

  def computeFixedPoint(startNodes: List[BasicBlock])(implicit widening: Option[Widening[D]] = None) : MapDom[BasicBlock,D] = {

class FlowSensitiveForwardAnalysis[D<:Galois](val cfg: CFG)(
    implicit val ops: LatticeOps[D],
    val absTransfer: AbstractTransferable[D],
    val widening: Option[Widening[D]] = None)
    extends FlowSensitiveFixedPointComputation[D] with WideningAtLoopHeads[D] {

Page 69: Yoyak ScalaDays 2015

Abstract Semantics in Yoyak

Built-in work-list algorithm

trait AbstractTransferable[D<:Galois] {
  protected def transferIdentity(stmt: Identity, input: D#Abst)(implicit context: Context) : D#Abst = input
  protected def transferAssign(stmt: Assign, input: D#Abst)(implicit context: Context) : D#Abst = input
  protected def transferInvoke(stmt: Invoke, input: D#Abst)(implicit context: Context) : D#Abst = input
  protected def transferIf(stmt: If, input: D#Abst)(implicit context: Context) : D#Abst = input
  protected def transferAssume(stmt: Assume, input: D#Abst)(implicit context: Context) : D#Abst = input

  // so on

Page 70: Yoyak ScalaDays 2015

Abstract Semantics in Yoyak

Built-in standard semantics

trait StdSemantics[A<:Galois,D,Mem<:MemDomLike[A,D,Mem]] extends AbstractTransferable[GaloisIdentity[Mem]] {
  val arithOps : ArithmeticOps[A]

  override protected def transferAssign(stmt: Assign, input: Mem)(implicit context: Context) : Mem = {
    val (rv,output) = eval(stmt.rv,input)
    output.update(stmt.lv,rv)
  }

Page 71: Yoyak ScalaDays 2015

Abstract Domain in Yoyak

Composable abstract domains

class MapDom[K,V <: Galois : LatticeOps] {

trait LatticeOps[D <: Galois] extends ParOrdOps[D] {
  def \/(lhs: D#Abst, rhs: D#Abst) : D#Abst
  def bottom : D#Abst

trait ParOrdOps[D <: Galois] {
  def <=(lhs: D#Abst, rhs: D#Abst) : Option[Boolean]

trait Galois {
  type Conc
  type Abst

Page 72: Yoyak ScalaDays 2015

Abstract Domain in Yoyak

Built-in Interval Domain

scala> import com.simplytyped.yoyak.framework.domain.arith._
import com.simplytyped.yoyak.framework.domain.arith._

scala> import com.simplytyped.yoyak.framework.domain.arith.Interval._
import com.simplytyped.yoyak.framework.domain.arith.Interval._

scala> val intv1 = Interv.of(10)
intv1: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(10),IInt(10))

scala> val intv2 = Interv.in(IInt(-10),IInt(10))
intv2: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(-10),IInt(10))

scala> val intv3 = Interv.in(IInfMinus,IInf)
intv3: com.simplytyped.yoyak.framework.domain.arith.Interval = IntervTop

scala> val intv4 = Interv.in(IInt(-10),IInf)
intv4: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(-10),IInf)

Page 73: Yoyak ScalaDays 2015

Abstract Domain in Yoyak

Built-in Interval Domain

scala> import IntervalInt.arithOps
import IntervalInt.arithOps

scala> arithOps.+(intv1,intv2) // [10,10] + [-10,10]
res1: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(0),IInt(20))

scala> arithOps.-(intv1,intv2) // [10,10] - [-10,10]
res2: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(0),IInt(20))

scala> arithOps.+(intv2,intv3) // [-10,10] + [-∞,∞]
res3: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = IntervTop

scala> arithOps.*(intv2,intv4) // [-10,10] * [-10,∞]
res4: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = IntervTop

scala> arithOps.*(intv1,intv4) // [10,10] * [-10,∞]
res5: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(-100),IInf)

Page 74: Yoyak ScalaDays 2015

Abstract Domain in Yoyak

Built-in Standard Object Model

trait StdObjectModel[A<:Galois,D<:Galois,This<:StdObjectModel[A,D,This]]
    extends MemDomLike[A,D,This] with ArrayJoinModel[A,D,This] {

  implicit val arithOps : ArithmeticOps[A]
  implicit val boxedOps : LatticeWithTopOps[D]

  def update(kv: (Loc,AbsValue[A,D])) : This
  def remove(loc: Local) : This
  def alloc(from: Stmt) : (AbsRef,This)
  def get(k: Loc) : AbsValue[A,D]
  def isStaticAddr(addr: AbsAddr) : Boolean
  def isDynamicAddr(addr: AbsAddr) : Boolean

class MemDom[A <: Galois : ArithmeticOps, D <: Galois : LatticeWithTopOps] extends StdObjectModel[A,D,MemDom[A,D]] {

Page 75: Yoyak ScalaDays 2015

Abstract Domain in Yoyak

Built-in Memory Domain

scala> import com.simplytyped.yoyak.framework.domain.mem.MemDom
scala> import com.simplytyped.yoyak.framework.domain.mem.MemElems._
scala> import com.simplytyped.yoyak.framework.domain.Galois._
scala> import com.simplytyped.yoyak.framework.domain.arith.Interv
scala> import com.simplytyped.yoyak.framework.domain.arith.IntervalInt
scala> import com.simplytyped.yoyak.il.CommonIL.Value._

scala> val memory = new MemDom[IntervalInt,SetAbstraction[String]]
memory: com.simplytyped.yoyak.framework.domain.mem.MemDom[com.simplytyped.yoyak.framework.domain.arith.IntervalInt,com.simplytyped.yoyak.framework.domain.Galois.SetAbstraction[String]] = com.simplytyped.yoyak.framework.domain.mem.MemDom@8443a1

Page 76: Yoyak ScalaDays 2015

Abstract Domain in Yoyak

Built-in Memory Domain

scala> val memory2 = memory.update(Local("x") -> AbsArith[IntervalInt](Interv.of(1)))

scala> val memory3 = memory.update(Local("x") -> AbsArith[IntervalInt](Interv.of(10)))

scala> val memory4 = MemDom.ops[IntervalInt,SetAbstraction[String]].\/(memory2,memory3)

scala> memory4.get(Local("x"))
res1: com.simplytyped.yoyak.framework.domain.mem.MemElems.AbsValue[com.simplytyped.yoyak.framework.domain.arith.IntervalInt,com.simplytyped.yoyak.framework.domain.Galois.SetAbstraction[String]] = AbsArith(Interv(IInt(1),IInt(10)))

Page 77: Yoyak ScalaDays 2015

IL in Yoyak

CommonIL

abstract class Stmt extends Attachable {
  override def equals(that: Any): Boolean = this eq that.asInstanceOf[AnyRef]
  override def hashCode() : Int = System.identityHashCode(this)

  private[Stmt] def copyAttr(stmt: Stmt) : this.type = {sourcePos = stmt.pos; this}
}

Page 78: Yoyak ScalaDays 2015

IL in Yoyak

CommonIL

case class Block(stmts: StatementContainer) extends Stmt
case class Switch(v: Value.Loc, keys: List[Value.t], targets: List[Target]) extends Stmt
case class Placeholder(x: AnyRef) extends Stmt

sealed trait CoreStmt extends Stmt
case class If(cond: Value.CondBinExp, target: Target) extends CoreStmt
case class Goto(target: Target) extends CoreStmt

sealed trait CfgStmt extends CoreStmt
case class Identity(lv: Value.Local, rv: Value.Param) extends CfgStmt
case class Assign(lv: Value.Loc, rv: Value.t) extends CfgStmt
case class Invoke(ret: Option[Value.Local], callee: Type.InvokeType) extends CfgStmt
case class Assume(cond: Value.CondBinExp) extends CfgStmt
case class Return(v: Option[Value.Loc]) extends CfgStmt
case class Nop() extends CfgStmt
case class EnterMonitor(v: Value.Loc) extends CfgStmt
case class ExitMonitor(v: Value.Loc) extends CfgStmt
case class Throw(v: Value.Loc) extends CfgStmt

Page 79: Yoyak ScalaDays 2015

IL in Yoyak

Stmt

x := 10;
switch (y) {
  case 0: println("0"); break;
  case 1: println("1");
  default: println("2");
}
if(z) { throw new Exception(); } else { println("done"); }
return 0;

CoreStmt

x := 10;
if(y == 0) { println("0"); goto D; }
if(y == 1) { println("1"); }
D: println("2");
if(z) { throw new Exception(); } else { println("done"); }
return 0;

CfgStmt

[Figure: control-flow graph with the following nodes]

ENTRY

x := 10

Assume (y == 0) println("0")

Assume (y != 0)

Assume (y == 1) println("1")

Assume (y != 1)

println("2")

Assume (z) throw new Ex();

Assume (!z) println("done") return;

EXIT

Page 80: Yoyak ScalaDays 2015

Simple Interval Analysis in Yoyak

class IntervalAnalysis(cfg: CFG) {
  def run() = {
    import IntervalAnalysis.{memDomOps,absTransfer,widening}
    val analysis = new FlowSensitiveForwardAnalysis[GMemory](cfg)
    val output = analysis.compute
    output
  }
}

object IntervalAnalysis {
  type Memory = MemDom[IntervalInt,SetAbstraction[Any]]
  type GMemory = GaloisIdentity[Memory]
  implicit val absTransfer : AbstractTransferable[GMemory] =
    new StdSemantics[IntervalInt,SetAbstraction[Any],Memory] {
      val arithOps: ArithmeticOps[IntervalInt] = IntervalInt.arithOps
    }

  implicit val memDomOps : LatticeOps[GMemory] = MemDom.ops[IntervalInt,SetAbstraction[Any]]
  implicit val widening : Option[Widening[GMemory]] = {
    implicit val NoWideningForSetAbstraction = Widening.NoWidening[SetAbstraction[Any]]
    Some(MemDom.widening[IntervalInt,SetAbstraction[Any]])
  }
}

Page 81: Yoyak ScalaDays 2015

Simple Interval Analysis in Yoyak

MemDom

StdObjectModel

MapDom

AbsValue

AbsRef

AbsArithIntervalInt

AbsBoxSetAb[Any]

AbsBottom

AbsTop

AbsObject

AbsAddr

IntervalAnalysis

FlowSensitiveForwardAnalysis

FlowSensitiveFixedPointComputation

Worklist

LatticeOps

FlowSensitiveIteration

AbstractTransferable

CfgNavigator

WideningAtLoopHeads

Widening

MapDom

BasicBlock

MemDom

MemDom.op

IntervalInt.widening

IntervalAnalysisTransferFunction

CFG

Fixed-point result

StdSemantics

ArithmeticOps

IntervalInt.arithOps

Page 82: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Scala is a very good language to implement a static analyzer

• Functions are first-class citizens

• Type class support

• Algebraic data type support

• Native support for mutable and immutable values

• Excellent support for parallelization

Page 83: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Functions are first-class citizens

Natural way to express mathematical logic

// optimize Cfg
(insertAssume _ andThen removeIfandGoto) apply rawCfg

Page 84: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Type class support

Type classes can avoid F-bounded polymorphism, which is the fast lane to overworking

• F-bounded polymorphism

• Commonly arises when inheritance meets immutability

• Seriously hurts code readability

Page 85: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• F-bounded polymorphism

trait Queue[T, This <: Queue[T, This]] {
  def push(elem: T) : This
}
trait GoodQueue[T, This <: GoodQueue[T, This]] extends Queue[T, This] {
  def pop : (T, This)
}
trait BetterQueue[T, R, This <: BetterQueue[T, R, This]] extends GoodQueue[T, This] {
  def giveMeSomethingNew : R
}
trait QueueUnited[T, R, Q <: Queue[T, Q], G <: GoodQueue[T, G], B <: BetterQueue[T, R, B], This <: QueueUnited[T, R, Q, G, B, This]] extends BetterQueue[T, R, This] {
  def giveUp : Unit
}

• Always need the type of concrete subclass

• Reiterate all type variables again in subclass reference

• Type class liberates methods from inheritance

Page 86: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Type class

trait QueueLike[T,This] {
  def push(elem: T) : This
}
trait GoodQueueLike[T,This] {
  implicit val queueLike : QueueLike[T,This]
  def push(elem: T) : This = queueLike.push(elem)
  def pop(q: This) : (T,This)
}
trait BetterQueueLike[T,R,This] {
  implicit val goodQueueLike : GoodQueueLike[T,This]
  def push(elem: T) : This = goodQueueLike.push(elem)
  def pop(q: This) : (T,This) = goodQueueLike.pop(q)
  def giveMeSomethingNew : R
}
class QueueUnited[T,R,This](implicit val q : QueueLike[T,This], g : GoodQueueLike[T,This], b : BetterQueueLike[T,R,This]) {
  def push(elem: T) : This = b.push(elem)
  def pop(q: This) : (T,This) = b.pop(q)
  def giveMeSomethingNew : R = b.giveMeSomethingNew
  def giveUp : Unit = {}
}
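
To make the contrast concrete, here is a small, self-contained type-class example in the same spirit. It is only a sketch: `QueueOps`, `listQueueOps` and `pushAll` are hypothetical names rather than Yoyak code. The point is that the data type (`List`) needs no special supertype and no self-type parameter; the operations arrive through an implicit instance.

```scala
object TypeClassDemo {
  // Hypothetical type class: queue operations for a container Q holding elements T
  trait QueueOps[Q, T] {
    def push(q: Q, elem: T): Q
    def pop(q: Q): Option[(T, Q)]
  }

  // Instance for immutable lists, written without touching List itself
  implicit def listQueueOps[T]: QueueOps[List[T], T] = new QueueOps[List[T], T] {
    def push(q: List[T], elem: T): List[T] = q :+ elem
    def pop(q: List[T]): Option[(T, List[T])] = q match {
      case head :: tail => Some((head, tail))
      case Nil          => None
    }
  }

  // Generic code asks for the operations, not for a particular class hierarchy
  def pushAll[Q, T](q: Q, elems: List[T])(implicit ops: QueueOps[Q, T]): Q =
    elems.foldLeft(q)(ops.push)

  val demo: List[Int] = pushAll(List(1), List(2, 3))
}
```

Here `demo` is `List(1, 2, 3)`, and supporting another container only requires another implicit instance.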

Page 87: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Type class in Yoyak

trait StdObjectModel[A<:Galois,D<:Galois,This<:StdObjectModel[A,D,This]]
    extends MemDomLike[A,D,This] with ArrayJoinModel[A,D,This] {
  implicit val arithOps : ArithmeticOps[A]
  implicit val boxedOps : LatticeWithTopOps[D]

Use both approaches, each where it is appropriate

Page 88: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Algebraic data type support

Natural way to express an abstract syntax tree of a program

[Figure: abstract syntax tree. The root ";" has two children: "if(x)", with branches "a = 1" and "a = 2", and "println(a)".]

Seq(
  If("x", Assign("a",1), Assign("a",2)),
  Invoke("println", List("a")))

Page 89: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Algebraic data type support

Easy to navigate the abstract syntax tree

def eval(v: Value.t, input: Mem)(implicit context: Context) : (AbsValue[A,D],Mem) = {
  v match {
    case x : Value.Constant => evalConstant(x,input)
    case x : Value.Loc => evalLoc(x,input)
    case x : Value.BinExp => evalBinExp(x,input)
    case Value.This => (AbsRef(Set("$this")),input)
    case Value.CaughtExceptionRef => (AbsRef(Set("$caughtex")),input)
    case Value.CastExp(v, ofTy) => evalLoc(v,input)
    case Value.InstanceOfExp(v, ofTy) => (AbsTop,input)
    case Value.LengthExp(v) => (AbsTop,input)
    case Value.NewExp(ofTy) => input.alloc(context.stmt)
    case Value.NewArrayExp(ofTy, size) => input.alloc(context.stmt)

Page 90: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Native support for mutable and immutable values

Memory

x

y

z

Object

f

g

1

“A”

In some cases, mutability is more important than immutability

Page 91: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Native support for mutable and immutable values

Memory

x

y

z

Object

f

g

1

“A”

NewObject

f

g

2

“A”

memory.filter{_._2 == obj}.foldLeft(memory) { case (m,(k,_)) => m + (k -> newObject) }

O(n)

Page 92: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Native support for mutable and immutable values

Memory

x

y

z

NewObject

f

g

2

“A”

obj.update(newObject)   // O(1)

Page 93: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Native support for mutable and immutable values

Memory

x

y

z

Object

f

g

1

“A”

NewObject

f

g

2

“A”

If we frequently update immutable objects in a big memory, it may result in severe inefficiency
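
The trade-off can be seen in a few lines of Scala. In this sketch (all names hypothetical), the immutable version must visit every memory entry to rebind variables that pointed to the old object, whereas the mutable version updates one shared cell and every variable sees the change.

```scala
object MutabilityDemo {
  // Immutable objects: replacing one means rebinding every variable, O(n) in memory size
  final case class Obj(f: Int, g: String)

  def rebind(memory: Map[String, Obj], old: Obj, fresh: Obj): Map[String, Obj] =
    memory.map { case (k, v) => k -> (if (v == old) fresh else v) }

  val immutableBefore: Map[String, Obj] =
    Map("x" -> Obj(1, "A"), "y" -> Obj(1, "A"), "z" -> Obj(1, "A"))
  val immutableAfter: Map[String, Obj] =
    rebind(immutableBefore, Obj(1, "A"), Obj(2, "A"))

  // Mutable cell shared by all variables: a single in-place update, O(1)
  final class Cell(var f: Int, var g: String)

  val shared = new Cell(1, "A")
  val mutableMemory: Map[String, Cell] = Map("x" -> shared, "y" -> shared, "z" -> shared)
  shared.f = 2
}
```

After initialization, both `immutableAfter("y").f` and `mutableMemory("z").f` are 2, but only the immutable variant had to traverse the whole map.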

Page 94: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Excellent support for parallelization

• Static analysis does not yet take full advantage of today’s scalable computing (multicore machines, big data technologies, cloud computing)

• Scala has an excellent platform for experimenting with parallelization: Akka

• Many fun things to try with Yoyak powered by Akka

Page 95: Yoyak ScalaDays 2015

Yoyak : Scala Experience

• Excellent support for parallelization

Worklist parallelization can be implemented naturally with Akka’s Actor model

Page 96: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Add more built-in abstract domains

• Optimize analysis performance

• Visualize analysis details

• Build Scala compiler plug-in

Page 97: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Add more built-in abstract domains

Interval domain cannot represent the relation between two variables

x = [2,8], y = [1,7] produce 49 combinations of (x,y) pairs

[Figure: plot with X axis and Y axis, each ranging from 0 to 10]
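
A quick enumeration shows how much precision is lost. The sketch assumes, purely for illustration, that the program actually maintains the relation y = x - 1; the interval domain cannot express this relation and therefore admits every pair in the rectangle.

```scala
object RelationDemo {
  val xs: Range = 2 to 8   // x = [2,8]
  val ys: Range = 1 to 7   // y = [1,7]

  // Every (x, y) pair that the two independent intervals allow
  val intervalPairs: Seq[(Int, Int)] = for (x <- xs; y <- ys) yield (x, y)

  // Pairs that satisfy the assumed relation y = x - 1
  val feasiblePairs: Seq[(Int, Int)] = intervalPairs.filter { case (x, y) => y == x - 1 }
}
```

The rectangle contains 49 pairs, while only 7 of them satisfy the assumed relation. A relational domain can rule out many of the remaining 42.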

Page 98: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Add more built-in abstract domains

Octagon domain can represent the relation between two variables

[Figure: plot with X axis and Y axis, each ranging from 0 to 10]

http://www.di.ens.fr/~mine/publi/article-mine-HOSC06.pdf

Page 99: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Add more built-in abstract domains

2-interval domain is more precise than interval domain

[Figure: plot with X axis and Y axis, each ranging from 0 to 10]

Page 100: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Optimize analysis performance

• {Worklist, Method, Class}-level parallelization

• Reduce abstract memory size by removing unused variables (faster join operation for abstract memory)

• Optional faster but unsound analysis

Page 101: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Visualize analysis details

It is hard to know what a static analyzer is doing at a specific moment because…

• A static analyzer’s behavior differs greatly from one input program to another

• One often needs to inspect and compare maps with thousands of entries

• Ordinary Java debuggers cannot show the big picture

Page 102: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Visualize analysis details

Example from SAT solvers

Visualization of the search tree generated by a basic DPLL algorithm

DPVis

Page 103: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Build Scala compiler plug-in

• Programming language researchers foresee that semantic program analyzers will be merged into compiler systems in the near future, as type systems were

Syntactic Analysis Grammar Checking Type System Semantic Analysis

Page 104: Yoyak ScalaDays 2015

Yoyak : Roadmap

• Build Scala compiler plug-in

• The Scala compiler is well modularized and cleanly coded (compared to other compiler systems), so it is an excellent platform for experimenting with new ideas

• Pure Scala code is safe from null, but linked Java libraries are not

• It would be great if the Scala compiler could detect possible null dereferences at compile time and issue a warning

Page 105: Yoyak ScalaDays 2015

Thank you!

Further Questions,

ScalaDays 2015

twitter @heejongl

gmail [email protected]