Precise Program Analysis with Data Structures

Feb 22, 2016

Precise Program Analysis with Data Structures by Designing with the User in Mind. Bor-Yuh Evan Chang, University of California, Berkeley, February-April 2008. Collaborators: George Necula, Xavier Rival (INRIA). PowerPoint presentation.
Transcript
Page 1: Precise Program Analysis  with Data Structures

Precise Program Analysis with Data Structures

Collaborators: George Necula, Xavier Rival (INRIA)

Bor-Yuh Evan Chang, University of California, Berkeley

February-April 2008

Page 2: Precise Program Analysis  with Data Structures

Precise Program Analysis with Data Structures

by Designing with the User in Mind

Collaborators: George Necula, Xavier Rival (INRIA)

Bor-Yuh Evan Chang, University of California, Berkeley

February-April 2008

Page 3: Precise Program Analysis  with Data Structures

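Bor-Yuh Evan Chang, UC Berkeley - Precise Program Analysis with Data Structures

Software errors cost a lot

~$60 billion annually (~0.5% of US GDP)
– 2002 National Institute of Standards and Technology report

[Figure: comparison graphic with the captions “> total annual revenue of” and “>10x annual budget of”; the organizations compared were shown as images and are not recoverable]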

Page 4: Precise Program Analysis  with Data Structures

But there’s hope in program analysis

Microsoft uses and distributes the Static Driver Verifier

Airbus applies the Astrée Static Analyzer

Companies, such as Coverity and Fortify, market static source code analysis tools

Page 5: Precise Program Analysis  with Data Structures

Because program analysis can eliminate entire classes of bugs

For example,
– Reading from a closed file: read( );
– Reacquiring a locked lock: acquire( );

How?
– Systematically examine the program
– Simulate running program on “all inputs”
– “Automated code review”

Page 6: Precise Program Analysis  with Data Structures

Program analysis by example: Checking for double acquires

Simulate running program on “all inputs”

… code …
// x now points to an unlocked lock
acquire(x);
… code …
acquire(x);
… code …

[Figure: analysis state for x]

Page 7: Precise Program Analysis  with Data Structures

Program analysis by example: Checking for double acquires

Simulate running program on “all inputs”

… code …
// x now points to an unlocked lock in a linked list
acquire(x);
… code …

[Figure: ideal analysis state, with x in one list shape or another or another …]

undecidability

Page 8: Precise Program Analysis  with Data Structures

Must abstract

… code …
// x now points to an unlocked lock in a linked list
acquire(x);
… code …

[Figure: ideal analysis state (x in one list shape or another or another …) next to the abstracted analysis state]

For decidability, must abstract: “model all inputs” (e.g., merge objects)

Abstraction too coarse or not precise enough (e.g., loses the fact that x is always unlocked)
– mislabels good code as buggy
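The precision loss from merging objects can be illustrated with a minimal sketch. It assumes a toy abstract domain in which a lock is described by the set of states it may be in; the names `acquire`, `merge`, `UNLOCKED`, and `LOCKED` are hypothetical and are not taken from the talk or from any tool.

```python
# Toy abstract domain (an assumption for illustration): a lock is
# described by the set of concrete states it may currently be in.
UNLOCKED, LOCKED = "unlocked", "locked"


def acquire(state):
    """Abstract transfer function for acquire(x).

    Returns (new_state, warning). The warning is raised when the
    abstract state admits that the lock may already be held.
    """
    warning = LOCKED in state
    return frozenset({LOCKED}), warning


def merge(*states):
    """Merge several objects into one summary by taking the union."""
    return frozenset().union(*states)


# Precise abstraction: the analysis still knows that x is unlocked.
precise_x = frozenset({UNLOCKED})
_, precise_warning = acquire(precise_x)

# Coarse abstraction: x was merged with another list node whose lock
# is held, so the fact "x is always unlocked" is lost.
coarse_x = merge(frozenset({UNLOCKED}), frozenset({LOCKED}))
_, coarse_warning = acquire(coarse_x)

print(precise_warning, coarse_warning)
```

With the precise state no warning is raised, while the merged state triggers a warning on code that is actually correct, which is the false alarm the slide describes.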

Page 9: Precise Program Analysis  with Data Structures

To address the precision challenge

Traditional program analysis mentality:

“Why can’t developers write more specifications for our analysis? Then, we could verify so much more.”

“Since developers won’t write specifications, we will use default abstractions (perhaps coarse) that work hopefully most of the time.”

My approach:

“Can we design program analyses around the user? Developers write testing code. Can we adapt the analysis to use those as specifications?”

Page 10: Precise Program Analysis  with Data Structures

Summary of overview

Challenge in analysis: Finding a good abstraction
– precise enough but not more than necessary

Powerful, generic abstractions
– expensive, hard to use and understand

Built-in, default abstractions
– often not precise enough (e.g., data structures)

My approach: Must involve the user in abstraction
– without expecting the user to be a program analysis expert

Page 11: Precise Program Analysis  with Data Structures

Overview of contributions

Extensible Inductive Shape Analysis [POPL’08, SAS’07]

Precise inference of data structure properties
– Able to check, for instance, the locking example

Targeted to software developers
– Uses data structure checking code for guidance
– Turns testing code into a specification for static analysis

Efficient
– ~10-100x speed-up over generic approaches
– Builds abstraction out of developer-supplied checking code

Page 12: Precise Program Analysis  with Data Structures

Extensible Inductive Shape Analysis

Precise inference of data structure properties

Developer-oriented approach

[POPL’08, SAS’07]

Part 1

Page 13: Precise Program Analysis  with Data Structures

Shape analysis is a fundamental analysis

Data structures are at the core of
– Traditional languages (C, C++, Java)
– Emerging web scripting languages

Improves verifiers that try to
– Eliminate resource usage bugs (locks, file handles)
– Eliminate memory errors (leaks, dangling pointers)
– Eliminate concurrency errors (data races)
– Validate developer assertions

Enables program transformations
– Compile-time garbage collection
– Data structure refactorings

Page 14: Precise Program Analysis  with Data Structures

Shape analysis by example: Removing duplicates

Example/Testing Code:

// l is a sorted doubly-linked list
for each node cur in list l {
  remove cur if duplicate;
}
assert l is sorted, doubly-linked with no duplicates;

Review/Static Analysis:

[Figure: a concrete run on the list l = 2, 2, 4, 4, which becomes 2, 4, 4 and then 2, 4 as cur advances, shown next to the program-specific properties tracked by the analysis: “sorted dl list” for l before the loop and “no duplicates” for l after it]

intermediate state more complicated (figure labels: l, “segment with no duplicates”, cur, “sorted dl list”)

Page 15: Precise Program Analysis  with Data Structures

Shape analysis is not yet practical

Choosing the heap abstraction difficult for precision

Traditional approaches:

TVLA [Sagiv et al.]
Parametric in low-level, analyzer-oriented predicates
+ Very general and expressive
- Hard for non-expert

Space Invader [Distefano et al.]
Built-in high-level predicates
- Hard to extend
+ No additional user effort (if precise enough)

My approach:

Xisa
Parametric in high-level, developer-oriented predicates
+ Extensible
+ Targeted to developers

Page 16: Precise Program Analysis  with Data Structures

Key insight for being developer-friendly and efficient

Utilize “run-time checking code” as specification for static analysis.

assert(sorted_dll(l,…));
for each node cur in list l {
  remove cur if duplicate;
}
assert(sorted_dll_nodup(l,…));

[Figure: abstract states for l and cur before, during, and after the loop]

checker:

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

• p specifies where prev should point

Contribution: Build the abstraction for analysis out of developer-specified checking code

Contribution: Automatically generalize checkers for complicated intermediate states
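As a concrete reading of the dll(h, p) checker, here is a minimal sketch of the same check written as ordinary run-time code. The `Node` class and the `build` helper are hypothetical scaffolding added for illustration; only the body of `dll` mirrors the definition on the slide.

```python
class Node:
    """A doubly-linked list node (hypothetical helper, not from the slides)."""

    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None


def dll(h, p):
    """Run-time checker: h heads a doubly-linked list whose prev must be p."""
    if h is None:
        return True
    return h.prev is p and dll(h.next, h)


def build(values):
    """Build a well-formed doubly-linked list and return its head."""
    head = prev = None
    for v in values:
        node = Node(v)
        node.prev = prev
        if prev is None:
            head = node
        else:
            prev.next = node
        prev = node
    return head


good = build([2, 2, 4, 4])
bad = build([2, 2, 4, 4])
bad.next.next.prev = bad  # corrupt one back pointer

print(dll(good, None), dll(bad, None))
```

The recursion follows the list in the same direction the loop does, which is the property the analysis relies on when it reuses such a checker as an abstraction.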

Page 17: Precise Program Analysis  with Data Structures

Our framework is …

An automated shape analysis with a precise memory abstraction based around invariant checkers.

• Extensible and targeted for developers
  – Parametric in developer-supplied checkers
• Precise yet compact abstraction for efficiency
  – Data structure-specific based on properties of interest to the developer

[Figure: checkers are the input to the shape analyzer]

checkers:

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

Page 18: Precise Program Analysis  with Data Structures

Shape analysis is an abstract interpretation on abstract memory descriptions with …

Splitting of summaries (materialization)
– To reflect updates precisely (strong updates)

And summarizing for termination (widening)

[Figure: sequence of abstract memory states over l and cur]

Page 19: Precise Program Analysis  with Data Structures

Outline

[Figure: checkers go through a type “pre-analysis” on checker definitions and into the shape analyzer, an abstract interpretation with two parts: splitting and interpreting update, and summarizing. The three parts are numbered 1 to 3.]

checkers:

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

type “pre-analysis” on checker definitions: Learn information about the checker to use it as an abstraction

Compare and contrast manual code review and our automated shape analysis

Page 20: Precise Program Analysis  with Data Structures

Overview: Split summaries to interpret updates precisely

Want abstract update to be “exact”, that is, to update one “concrete memory cell”.

The example at a high-level: iterate using cur changing the doubly-linked list from purple to red.

[Figure: list l with cursor cur; split at cur; update cur purple to red]

Challenge: How does the analysis “split” summaries and know where to “split”?

Page 21: Precise Program Analysis  with Data Structures

“Split forward” by unfolding inductive definition

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

get: cur->next

[Figure: the summary dll(cur, p) is unfolded into two cases joined by ∨: the case where cur is null, and the case where cur is a node with prev p whose successor n is summarized by dll(n, cur)]

Analysis doesn’t forget the empty case
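One unfolding step can be sketched as a function that turns a summary into its two cases. The dictionary layout (`pure`, `cells`, `summaries`) and the fresh-name scheme are assumptions made for this illustration; they are not the data structures of the Xisa tool.

```python
import itertools

# Supply of fresh symbolic addresses (hypothetical naming scheme).
_fresh = itertools.count()


def unfold_dll(h, p):
    """Unfold the summary dll(h, p) one step and return both cases."""
    n = f"n{next(_fresh)}"  # fresh symbolic address standing for h->next
    empty_case = {
        "pure": [(h, "=", "null")],   # h is null, so no memory is described
        "cells": [],
        "summaries": [],
    }
    nonempty_case = {
        "pure": [(h, "!=", "null")],
        "cells": [(h, "prev", p), (h, "next", n)],  # materialized cells
        "summaries": [("dll", n, h)],               # rest of the list
    }
    return [empty_case, nonempty_case]


cases = unfold_dll("cur", "p")
for case in cases:
    print(case)
```

Returning both cases keeps the disjunction explicit, so a later read of cur->next is interpreted only in the non-empty case while the empty case is still carried along.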

Page 22: Precise Program Analysis  with Data Structures

“Split backward” also possible and necessary

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

for each node cur in list l {
  remove cur if duplicate;
}
assert l is sorted, doubly-linked with no duplicates;

cur->prev->next = cur->next;

get: cur->prev->next

[Figure: a “dll segment” from l up to cur, followed by dll(n, cur) at n; the segment is unfolded backward from cur into cases joined by ∨, one of which involves null]

Technical Details:
How does the analysis do this unfolding?
Why is this unfolding allowed? (Key: Segments are also inductively defined) [POPL’08]
How does the analysis know to do this unfolding?

Page 23: Precise Program Analysis  with Data Structures

Outline

[Figure: checkers go through a type “pre-analysis” on checker definitions and into the shape analyzer, an abstract interpretation with two parts: splitting and interpreting update, and summarizing. The three parts are numbered 1 to 3.]

How do we decide where to unfold?

type “pre-analysis” on checker definitions: Derives additional information to guide unfolding

Contribution: Turns testing code into specification for static analysis

checkers:

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

Page 24: Precise Program Analysis  with Data Structures

Abstract memory as graphs

Make endpoints and segments explicit, yet high-level

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

[Figure: memory graph. l points to α and cur points to γ. A “dll segment” runs from α, labeled dll(null), to γ, labeled dll(β). γ has a prev edge to β and a next edge to δ, and δ carries the summary dll(γ), that is, dll(δ, γ).]

Legend:
– memory address (value): α, β, γ, δ
– memory cell (points-to: γ->next = δ)
– checker summary (inductive pred)
– segment summary: Some number of memory cells (thin edges)

Contribution: Generalization of checker (Intuitively, dll(α, null) up to dll(γ, β).)

Which summary (thick edge), in what direction, and how far do we unfold to get the edge β->next (cur->prev->next)?

Page 25: Precise Program Analysis  with Data Structures

Types for deciding where to unfold

Checker Definition:

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

h : {next⟨0⟩, prev⟨0⟩}
p : {next⟨-1⟩, prev⟨-1⟩}

Says:
For h->next/h->prev, unfold from h
For p->next/p->prev, unfold before h

Checker “Run” (call tree/derivation):

dll(α, null)
dll(β, α)
dll(γ, β)
dll(δ, γ)
dll(null, δ)

[Figure: Instance: the nodes null, α, β, γ, δ, null in list order. Summary: α labeled dll(null), then summaries labeled dll(β) around γ. Positions in the run are marked with the levels 1, 0, -1, -2.]

If it exists, where is:
γ->next ? 0
β->next ? -1
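The way parameter types steer unfolding can be sketched as a table lookup. The table below copies the levels shown for dll(h, p); the function `unfold_hint`, its return strings, and the treatment of positive levels are assumptions made for illustration rather than the analysis's actual interface.

```python
# Parameter types for dll(h, p), copied from the slide: for each
# parameter, the level at which each of its fields is found,
# relative to the unfolding point h.
DLL_TYPES = {
    "h": {"next": 0, "prev": 0},
    "p": {"next": -1, "prev": -1},
}


def unfold_hint(param, field, types=DLL_TYPES):
    """Return a hint for materializing param->field from a summary dll(h, p)."""
    level = types.get(param, {}).get(field)
    if level is None:
        # The type says nothing about this field at this parameter.
        return "unfolding this summary will not help"
    if level == 0:
        return "unfold forward from h"
    if level < 0:
        return f"unfold backward, {-level} step(s) before h"
    # Positive levels are an assumption; the slide only shows 0 and -1 here.
    return f"unfold forward, {level} step(s) after h"


print(unfold_hint("h", "next"))
print(unfold_hint("p", "next"))
```

For the question on the slide, γ->next is a field of h in dll(γ, β) and is found by unfolding from h, while β->next is a field of p and requires unfolding before h.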

Page 26: Precise Program Analysis  with Data Structures

Types make the analysis robust with respect to how checkers are written

Doubly-linked list checker (as before):

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

h : {next⟨0⟩, prev⟨0⟩}
p : {next⟨-1⟩, prev⟨-1⟩}

[Figure: Instance: α, β, γ, null. Summary: β labeled dll(α), then summaries labeled dll(β) around γ.]

Alternative doubly-linked list checker:

dll′(h) =
  if (h->next = null) then
    true
  else
    h->next->prev = h and dll′(h->next)

h : {next⟨0⟩, prev⟨-1⟩}

[Figure: Instance: β, γ, null. Summary: β and γ with summaries labeled dll′.]

γ->prev ? -1

Different types for different unfolding

Page 27: Precise Program Analysis  with Data Structures

Summary of checker parameter types

Tell where to unfold for which fields

Make analysis robust with respect to how checkers are written

Learn where in summaries unfolding won’t help

Can be inferred automatically with a fixed-point computation on the checker definitions

Page 28: Precise Program Analysis  with Data Structures

Summary of interpreting updates

Splitting of summaries needed for precision

Unfolding checkers is a natural way to do splitting
– When checker traversal matches code traversal

Checker parameter types
– Enable, for example, “back pointer” traversal without blindly guessing where to unfold

Page 29: Precise Program Analysis  with Data Structures

Outline

[Figure: checkers go through a type “pre-analysis” on checker definitions and into the shape analyzer, an abstract interpretation with two parts: splitting and interpreting update, and summarizing. The three parts are numbered 1 to 3.]

checkers:

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

Page 30: Precise Program Analysis  with Data Structures

Summarize by folding into inductive predicates

last = l;
cur = l->next;
while (cur != null) {
  // … cur, last …
  if (…) last = cur;
  cur = cur->next;
}

[Figure: graphs from successive loop iterations, built from next edges and list summaries over l, last, and cur, with one more next edge in each iteration; after the “summarize” step, the chains of next edges are replaced by list summaries between l, last, and cur]

Challenge: Precision (e.g., last, cur separated by at least one step)

Previous approaches guess where to fold for each graph.
Contribution: Determine where by comparing graphs across history
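To make the folding step itself concrete, here is a minimal sketch of a local folding rule on a chain of next edges. It is deliberately simplified: it folds every run of anonymous nodes, which is closer to the guessing strategy than to the history-guided comparison the talk contributes. The function name `fold_chain` and the edge encoding are hypothetical.

```python
def fold_chain(chain, named):
    """Fold a chain of next edges into summarized segments.

    chain: node names in next order, e.g. ["l", "a", "b", "cur"].
    named: nodes pointed to by a program variable; these are kept.
    Returns (src, kind, dst) edges, where kind is "next" for a single
    memory cell and "list" for a summary of two or more cells.
    """
    edges = []
    src = chain[0]
    steps = 0
    for node in chain[1:]:
        steps += 1
        # Stop the current run at named nodes and at the end of the chain.
        if node in named or node == chain[-1]:
            edges.append((src, "next" if steps == 1 else "list", node))
            src, steps = node, 0
    return edges


# Anonymous nodes a and b are folded away between l and cur.
print(fold_chain(["l", "a", "b", "cur"], {"l", "cur"}))
# A single step stays a precise next edge.
print(fold_chain(["l", "cur"], {"l", "cur"}))
```

Keeping a single step as a `next` edge rather than a summary is what preserves facts such as "last and cur are separated by at least one step".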

Page 31: Precise Program Analysis  with Data Structures

Summary: Given checkers, everything is automatic

[Figure: checkers go through a type “pre-analysis” on checker definitions and into the shape analyzer, an abstract interpretation with two parts: splitting and interpreting update, and summarizing]

checkers:

dll(h, p) =
  if (h = null) then
    true
  else
    h->prev = p and dll(h->next, h)

Page 32: Precise Program Analysis  with Data Structures

Results: Performance

Benchmark                                  Max. Num. Graphs    Analysis
                                           at a Program Pt     Time (ms)
singly-linked list reverse                 1                   0.6
doubly-linked list reverse                 1                   1.4
doubly-linked list copy                    2                   5.3
doubly-linked list remove                  5                   6.5
doubly-linked list remove and back         5                   6.8
search tree with parent insert             5                   8.3
search tree with parent insert and back    5                   47.0
two-level skip list rebalance              6                   87.0
Linux scull driver (894 loc)               4                   9710.0
  (char arrays ignored, functions inlined)

Times negligible for data structure operations (often in sec or 1/10 sec)

Expressiveness: Different data structures

Verified shape invariant as given by the checker is preserved across the operation.

Comparison callouts on the slide (the rows they annotate are not recoverable):
TVLA: 850 ms
TVLA: 290 ms
Space Invader only analyzes lists (built-in)

Page 33: Precise Program Analysis  with Data Structures

Demo: Doubly-linked list reversal

http://xisa.cs.berkeley.edu

Body of loop over the elements: Swaps the next and prev fields of curr.

[Figure labels: Already reversed segment; Node whose next and prev fields were swapped; Not yet reversed list]

Page 34: Precise Program Analysis  with Data Structures

Experience with the tool

Checkers are easy to write and try out
– Enlightening (e.g., red-black tree checker in 6 lines)
– Harder to “reverse engineer” for someone else’s code
– Default checkers based on types useful

Future expressiveness and usability improvements
– Pointer arithmetic and arrays
– More generic checkers:
  polymorphic: “element kind unspecified”
  higher-order: parameterized by other predicates

Future evaluation: user study

Page 35: Precise Program Analysis  with Data Structures

Short-term future work: Exploiting common specification framework

Scenario: Code instrumented with lots of checker calls (perhaps automatically with object invariants)

assert( mychecker(x) );
// … operation on x …
assert( mychecker(x) );

• Very slow to execute
• Hard to prove statically (in general)

Can we prove parts statically?
Static Analysis View: Hybrid checking
Testing View: Incrementalize invariant checking

Example: Insert in a sorted list

[Figure: list l containing the adjacent elements u, v, w]

Preservation of sortedness shown statically
Emit run-time check for new element: u ≤ v ≤ w
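The incremental check for the sorted-insert example can be sketched as follows. This is an illustration only: it uses a Python list and the standard `bisect` module in place of the linked list on the slide, and the function name `insert_sorted` is hypothetical. The point is that the residual run-time check touches only the neighbors u and w of the new element v instead of re-running the whole checker.

```python
import bisect


def insert_sorted(values, v):
    """Insert v into an already sorted list, checking sortedness locally."""
    i = bisect.bisect_left(values, v)
    values.insert(i, v)
    # Residual run-time check: only the neighbors of the new element.
    u = values[i - 1] if i > 0 else None
    w = values[i + 1] if i + 1 < len(values) else None
    assert u is None or u <= v
    assert w is None or v <= w
    return values


result = insert_sorted([1, 3, 5], 4)
print(result)
```

A full re-check would cost time linear in the list length on every insert; the local check is constant time once the position is known, under the assumption that the list was sorted beforehand.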

Page 36: Precise Program Analysis  with Data Structures

Summary of Extensible Inductive Shape Analysis

Key Insight: Checkers as specifications
– Developer View: Global, Expressed in a familiar style
– Analysis View: Capture developer intent, Not arbitrary inductive definitions

Constructing the program analysis
– Intermediate states: Generalized segment predicates
  [Figure: segment from α, labeled c(γ), to β, labeled c′(γ′)]
– Splitting: Checker parameter types with levels
  h : {next⟨0⟩, prev⟨0⟩}
  p : {next⟨-1⟩, prev⟨-1⟩}
– Summarizing: History-guided approach
  [Figure: graphs built from next edges and list summaries]

Page 37: Precise Program Analysis  with Data Structures


Are there other kinds of program analysis users?

Page 38: Precise Program Analysis  with Data Structures

Two kinds of users of program analysis

1. software developer
   Wants: Precise program analysis for development tools
   Extensible Inductive Shape Analysis [POPL’08, SAS’07]

2. software end-user
   Wants: Program analysis to certify software is ok
   Analysis of Low-Level Code Using Decompilers [SAS’06, TLDI’05]

Page 39: Precise Program Analysis  with Data Structures

Analysis of Low-Level Code Using Cooperating Decompilers

[SAS’06, TLDI’05]

Part 2

Page 40: Precise Program Analysis  with Data Structures

End-users want low-level code analysis

• Want analyses to check code to be executed is ok
  – E.g., won’t crash, good wrt Static Driver Verifier
• Do not know any details about the program
  – Analysis must be fully automatic
• But can demand additional information from the developer
  – To make analysis automatic

But most program analyses operate at the source-level!

[Figure labels: source code; executable code; end-user; analyzer for low-level code]

Page 41: Precise Program Analysis  with Data Structures

Analyzers for low-level code are more difficult and tedious to build

Porting source-level analyses is error prone
– one statement becomes many instructions
– dependencies between instructions must be carefully tracked

Key Insight: Low-level complexity
– deals with compilation idioms
– mostly independent of the analysis
– can be captured with intermediate languages

[Figure labels: source code; executable code; end-user; analyzer for low-level code (shown twice)]

Page 42: Precise Program Analysis  with Data Structures

Decompile code rather than port analysis

Framework of small, cooperating decompilers that gradually lift the level of the program

Decompilation for program analysis
– Need not get to original, nor be human understandable
– Only concern is safety, not performance
  • Unlike, e.g., Java VM platform (JIT compiler)
– Can use additional meta-data (e.g., source-level types)

[Figure labels: source code; executable code; end-user; analyzer for low-level code]

Page 43: Precise Program Analysis  with Data Structures

Summary of results

Flexibility and usability
– 3 compilers (gcc/C, gcj/Java, coolc/Cool)
– 2 architectures (x86, MIPS)
– With 6 decompiler modules
– Basic Java type-checking for gcj output implemented in 3-4 hours, 500 lines of code

Benefits of modularity
– decompiler-based re-implementation of a low-level analysis uncovered 8 bugs in the original implementation (heavily used, deployed in the classroom)

Applicability of existing source-level tools
– applied C code tools, BLAST and Cqual, on decompiled benchmarks (size: ~10,000 lines of C)

Page 44: Precise Program Analysis  with Data Structures

Future Research

Page 45: Precise Program Analysis  with Data Structures

Long-term and outreach

Theme: Overcome decidability issues in program analysis by tailoring it to the user

• “Programs” are no longer only written to be executed on computers
  – E.g., computational models of biological pathways in systems biology
• Need new “program” analysis tools
  – Validate models (e.g., pathway model produces only expected products)
  – Reason about models
• How do these users work?

Page 46: Precise Program Analysis  with Data Structures

Conclusion

Extensible Inductive Shape Analysis
– precision demanding program analysis improved by novel user interaction
– Developer: Gets results corresponding to intuition
– Analysis: Focused on what’s important to the developer

Cooperating Decompilers
– adapt program analyses to code end-users run

Practical precise tools for better software!

Page 47: Precise Program Analysis  with Data Structures

What can inductive shape analysis do for you?

http://xisa.cs.berkeley.edu

Page 48: Precise Program Analysis  with Data Structures

Bonus Slides: Extensible Inductive Shape Analysis

Page 49: Precise Program Analysis  with Data Structures

Intuition: Checkers and types

Checker: x . sorteddll(…)

l.sorteddll(prev, min) =
  if (l = null) then
    true
  else
    l->prev = prev and min ≤ l->val and
    l->next.sorteddll(l, l->val)

– global specification (i.e., per data structure)
– more precise (typically)
– holds only in “steady-state”
– need generalization

Type: x : Dll

struct Dll {
  int val;
  Dll* prev;
  Dll* next;
};

– global specification (i.e., per data structure)
– less precise (typically)
– holds always
– doesn’t need generalization
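The sorteddll checker can be read as ordinary run-time checking code. The sketch below is one possible transcription; the `Dll` class mirrors the struct on the slide, while `from_values` and the use of negative infinity as the initial `minimum` are assumptions added so the example runs on its own.

```python
import math


class Dll:
    """Node type mirroring the struct on the slide."""

    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None


def sorteddll(l, prev, minimum):
    """Check that l heads a sorted doubly-linked list.

    prev is the node that l.prev must point to, and minimum is a lower
    bound on every value stored from l onward.
    """
    if l is None:
        return True
    return (l.prev is prev
            and minimum <= l.val
            and sorteddll(l.next, l, l.val))


def from_values(values):
    """Hypothetical helper: build a well-linked list from a sequence."""
    head = last = None
    for v in values:
        node = Dll(v)
        node.prev = last
        if last is None:
            head = node
        else:
            last.next = node
        last = node
    return head


print(sorteddll(from_values([2, 2, 4, 4]), None, -math.inf))
print(sorteddll(from_values([2, 4, 3]), None, -math.inf))
```

The type `Dll` alone accepts both lists, whereas the checker distinguishes them, which is the "more precise (typically)" contrast drawn above.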

Page 50: Precise Program Analysis  with Data Structures

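Segments as partial checkers

Checker “Run”:

α.dll(null)
β.dll(α)
γ.dll(β)
δ.dll(γ)
null.dll(δ)

[Figure: Instance: the nodes null, α, β, γ, δ, null in list order, linked by next and prev edges. Summary: a segment starting at α and ending at γ, labeled dll(β).]

[Figure: a general segment from α, labeled c(γ), to β, labeled c′(γ′), covering i levels of the checker run between the calls c(α, γ) and c′(β, γ′)]

Side conditions shown on the slide:
i = 0: c = c′, α = β, γ = γ′
α = γ, β = null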

Page 51: Precise Program Analysis  with Data Structures

To unfold backward, split the segment and then unfold forward

cur = l->next;
while (cur != null) {
  if (cur->prev->val == cur->val) {
    cur = cur->prev;
    remove_after(cur);
  }
  cur = cur->next;
}

[Figure: graphical checker definition. π.dll(ρ) := (emp with π = null) ∨ (∃η. π has a next edge to η and a prev edge to ρ, and η satisfies dll(π))]

materialize: cur->prev->next

[Figure: graph with l at α, a segment labeled dll(null) up to dll(γ), cur at a node with a prev edge to γ and a next edge to ε labeled dll(δ); the segment is split into parts of length 0 and 1 labeled dll(β), then unfolded, giving cases joined by ∨, including the case where l and cur coincide with α = δ and γ = null]

Page 52: Precise Program Analysis  with Data Structures

Backward unfolding by forward unfolding

[Figure: sequence of graph rewrites on a dll segment of length i+1 that starts with dll(null) and ends at β, labeled dll(γ), where γ is reached by a prev edge. The steps are:
– split (lemma): the segment of length i+1 becomes a segment of length i followed by a segment of length 1, meeting at δ
– unfold forward at δ: the segment of length 1 becomes a node δ with a next edge to η and a prev edge, leaving a segment of length 0
– reduce η = β, δ = γ]

Page 53: Precise Program Analysis  with Data Structures

Chang, Rival, Necula - Shape Analysis with Structural Invariant Checkers

History-guided folding

last = l;
cur = l->next;
while (cur != null) {
  if (…) last = cur;
  cur = cur->next;
}

• Match edges to identify where to fold
• Apply local folding rules

[Figure: graphs from successive loop iterations over l, last, and cur, built from next edges and list summaries, are matched edge by edge; the question “list ⊑ list ?” is answered Yes]

Page 54: Precise Program Analysis  with Data Structures

Bonus Slides: Analysis of Low-Level Code

Page 55: Precise Program Analysis  with Data Structures

Porting source-level analyses is error prone

Analyzers for low-level code are more difficult and tedious to build

Example: Java Type Analysis

class C extends P {
  void m() { … }
}

P p = new P();
P c = … ? new C() : p;
…
c.m();

Low-level code:

rc := m[rsp]
if (rc = 0) Lexc
r1 := m[rc]
r1 := m[r1+28]
rsp := rsp - 4
m[rsp] := m[rsp+4] -m[rsp] := rc
icall [r1]

Types tracked by the analysis, in order:

⟨rc : P, …⟩
⟨rc : nonnull P, …⟩
⟨r1 : disp(P), …⟩
⟨r1 : meth(P,28), …⟩
⟨m[rsp] : nonnull P, …⟩

Type analysis intermixed with low-level reasoning (e.g., args on stack)

Page 56: Precise Program Analysis  with Data Structures

Porting source-level analyses is error prone

Analyzers for low-level code are more difficult and tedious to build

Example: Java Type Analysis

class C extends P {
  void m() { … }
}

P p = new P();
P c = … ? new C() : p;
…
c.m();

Low-level code:

rc := m[rsp]
if (rc = 0) Lexc
r1 := m[rc]
r1 := m[r1+28]
rsp := rsp - 4
m[rsp] := rp
icall [r1]

Types tracked by the analysis, in order:

⟨rc : P, …⟩
⟨rc : nonnull P, …⟩
⟨r1 : disp(P), …⟩
⟨r1 : meth(P,28), …⟩
⟨m[rsp] : nonnull P, …⟩

unsound

Dependencies must be carefully tracked

Page 57: Precise Program Analysis  with Data Structures

Framework of small, reusable cooperating decompiler modules

Decompiler modules, in order: Locals (Local Variables), SymEval (Symbolic Evaluation), OO (Dynamic Dispatch), JavaTypes

Low-level input:

f: …
  rc := m[rsp+12]
  if (rc = 0) Lexc
  r1 := m[rc]
  r1 := m[r1+28]
  rsp := rsp - 4
  m[rsp] := m[rsp+16]
  icall [r1]

After Locals:

f(tc):
  rc := tc
  if (rc = 0) Lexc
  r1 := m[rc]
  r1 := m[r1+28]
  t1 := tc
  icall [r1](t1)

After SymEval:

f(c):
  if (c = 0) Lexc
  icall [m[m[c]+28]] (c)

After OO:

f(obj c):
  if (c = 0) Lexc
  invokevirtual [c, 28] ()

After JavaTypes:

f(C c):
  if (c = 0) Lexc
  c.m()

Source:

static void f(C c) { c.m(); }

[Figure label: your analyzer]