Dec 24, 2015

Dynamic Purity Analysis for Java Programs

Haiying Xu, Christopher J.F. Pickett, Clark Verbrugge
School of Computer Science, McGill University

PASTE ’07 Conference, San Diego, CA

Presented by Derek White

CSE 6329

Outline

• Introduction
• Approach and Contributions
• Design: Static Purity Analysis
• Kinds of Dynamic Purity
• Design: Dynamic Purity Analysis
• Memoization
• Experimental Evaluation
• Conclusions

Introduction

• Functional programming emphasizes application of functions and avoids mutable data (side effects)

• Popular functional languages include Scheme, Haskell, F#, OCaml, Scala, etc.

• But you can program in a functional style using other languages

• “Pure” methods are methods that have functional (side-effect-free) behavior
  – Several definitions of purity exist: either no externally visible side effects, or side effects that are limited in extent
  – Constraints may also be placed on the level of dependency on previously available state
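
To make the distinction concrete, here is a minimal sketch contrasting a side-effect-free method with one that has an externally visible side effect. The class and method names are hypothetical and do not come from the paper.

```java
// Hypothetical illustration; names are not from the paper.
final class PurityExamples {
    // Mutable static state that an impure method may touch.
    static int callCount = 0;

    // Side-effect free: the result depends only on the primitive arguments,
    // and nothing outside the method is read or written.
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Impure: writes static state, so every call has a visible side effect
    // even though the returned value is the same as gcd's.
    static int countedGcd(int a, int b) {
        callCount++;
        return gcd(a, b);
    }
}
```

Both methods return the same value for the same arguments; only the second changes state that a caller could later observe.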

Introduction (2)

• Why do we care if a method is pure?

• Helpful in program understanding, allows us to isolate side-effect-free parts

• Verification in model checking

• Can be used to guide compiler optimization
  – Better method purity info allows for less conservative assumptions
  – Caching (memoization) of function calls

Introduction (3)

• Static analysis has allowed large numbers of methods to be classified as pure, but the precise definitions used vary

• Static analysis is conservative with respect to runtime behavior

• It is unclear if some classes of pure methods have any practical value

• So, the authors present a detailed examination of method purity for Java
  – Considering several definitions of purity
  – Investigating both static and dynamic properties

Approach and Contributions

• Extends previous work on static analysis, showing that different forms of purity occur at different frequencies in a dynamic environment

• Design and implementation of a dynamic purity analysis, online and offline
  – Scalable, handles SPECjvm98 at size 100 “with acceptable overhead”

• Support for multiple purity definitions in order to compare against static purity analysis; also identifies forms of purity only observable dynamically

Approach and Contributions (2)

• Three metrics for the evaluation of the extent of dynamic purity
  – Method, invocation, bytecode
  – These are applied to a static analysis as well as to the dynamic purity definitions

• Implementation of memoization on a JVM, a traditional consumer of purity information
  – Doesn’t achieve any speedup, just a functional test module

Design: Static Analysis

• Previous work has found that a large number of methods have weak purity properties; stronger purity properties result in fewer pure methods

• Static work done here considers strong purity
  – A method is “strongly pure” iff it doesn’t depend on or change initial state beyond primitive input values
  – Must always return the same result for the same input

• Specifically, the method may not:
  – Read/write heap or static data
  – Synchronize
  – Allocate objects
  – Invoke native methods
  – Throw exceptions
  – Invoke any non-pure methods
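
The restrictions above can be illustrated with a small sketch: one method that satisfies all of them, and two variants that each break a single rule. The class and its methods are hypothetical examples, not code from the paper or its benchmarks.

```java
// Hypothetical illustration of the strong purity restrictions.
final class StrongPurityChecklist {
    // Satisfies every restriction: primitive inputs only, no heap or static
    // access, no allocation, synchronization, exceptions, or impure callees.
    static int clamp(int value, int lo, int hi) {
        if (value < lo) return lo;
        if (value > hi) return hi;
        return value;
    }

    // Breaks "allocate objects" and "read/write heap": it creates an array
    // and then reads from it, even though the array never leaves the method.
    static int clampViaArray(int value, int lo, int hi) {
        int[] bounds = {lo, hi};
        return clamp(value, bounds[0], bounds[1]);
    }

    // Breaks "throw exceptions": the error path allocates and throws.
    static int clampChecked(int value, int lo, int hi) {
        if (lo > hi) throw new IllegalArgumentException("lo > hi");
        return clamp(value, lo, hi);
    }
}
```

All three compute the same result on valid input, which is why stricter definitions classify fewer methods as pure.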

Design: Static Analysis (2)

• Java class files used as input

• Flow-insensitive analysis done using Soot

[Figure 1. Static analysis framework. Soot stage: Class files, Jimple, Static Analysis, Attribute Generation, Class files + attributes. SableVM stage: Attribute Parser, Dynamic Metrics, Output.]

Design: Static Analysis (3)

• Instructions within a method are scanned; any instruction found to be impure marks the method as impure

• Interprocedural analysis is done next, propagating impurity up from the leaves of a CHA-based call graph

• Assumption is made that exceptions do not propagate up the call stack unchecked

Impurity            Instructions
Native code exec    native INVOKE*
Heap access         NEW, NEWARRAY, ANEWARRAY, MULTIANEWARRAY, GETFIELD, PUTFIELD, *ALOAD, *ASTORE
Static access       GETSTATIC, PUTSTATIC
Synchronization     synchronized INVOKE*, synchronized *RETURN, MONITORENTER, MONITOREXIT
Exceptions          ATHROW
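
The two-step process described on this slide (scan each method, then push impurity from callees to callers) can be sketched as a simple worklist algorithm. This is an illustrative sketch only: it is not Soot's API, methods and opcodes are represented as plain strings, the call graph is handed in as a map rather than built by CHA, and native or synchronized invocations are left out. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the Soot implementation.
final class ImpurityPropagation {
    // Opcode names taken from the impurity table on the slide.
    static final Set<String> IMPURE_OPCODES = Set.of(
        "NEW", "NEWARRAY", "ANEWARRAY", "MULTIANEWARRAY",
        "GETFIELD", "PUTFIELD", "GETSTATIC", "PUTSTATIC",
        "MONITORENTER", "MONITOREXIT", "ATHROW");

    static boolean isImpure(String opcode) {
        // *ALOAD / *ASTORE are the array access forms (IALOAD, AASTORE, ...);
        // plain ALOAD / ASTORE only touch local variables.
        boolean arrayAccess =
            (opcode.endsWith("ALOAD") && !opcode.equals("ALOAD"))
            || (opcode.endsWith("ASTORE") && !opcode.equals("ASTORE"));
        return arrayAccess || IMPURE_OPCODES.contains(opcode);
    }

    // bodies: method name -> opcodes; callees: caller -> methods it may call.
    static Set<String> impureMethods(Map<String, List<String>> bodies,
                                     Map<String, Set<String>> callees) {
        // Reverse the call graph so impurity can flow from callee to caller.
        Map<String, Set<String>> callers = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : callees.entrySet()) {
            for (String callee : e.getValue()) {
                callers.computeIfAbsent(callee, k -> new HashSet<>()).add(e.getKey());
            }
        }
        Set<String> impure = new HashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        // Intraprocedural step: one impure instruction marks the whole method.
        for (Map.Entry<String, List<String>> e : bodies.entrySet()) {
            if (e.getValue().stream().anyMatch(ImpurityPropagation::isImpure)) {
                impure.add(e.getKey());
                worklist.add(e.getKey());
            }
        }
        // Interprocedural step: every caller of an impure method is impure.
        while (!worklist.isEmpty()) {
            String m = worklist.remove();
            for (String caller : callers.getOrDefault(m, Set.of())) {
                if (impure.add(caller)) {
                    worklist.add(caller);
                }
            }
        }
        return impure;
    }
}
```

Any method not in the returned set would be reported as strongly pure under this simplified model.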

Design: Static Analysis (4)

• Easily extended for dynamic evaluation of strong static purity analysis

• Soot writes purity information to class file attributes

• SableVM reads attributes and records:
  – Pure methods reached at runtime
  – Frequency of pure method invocations
  – Percentage of bytecode executed by pure methods

• Provides indications about how static results correlate with dynamic runtime behavior

Design: Dynamic Analysis

• Under the static analysis, a method is determined to be pure for all possible executions or is impure otherwise – may be too conservative

• Methods that were flagged impure by static analysis may execute only pure control flow paths at runtime

• Goal of dynamic analysis is to identify pure methods based on runtime behavior, increasing the number of pure methods found

Design: Dynamic Analysis (2)

Figure 2. Dynamic purity analysis framework

Design: Dynamic Analysis (3)

• Class files read into SableVM, instruction stream is examined for purity

• Purity analysis module uses an online escape analysis tracking writes to locally allocated objects

• Purity information can be used immediately by the VM or written to a file as offline analysis for a later execution

• Offline analysis removes the execution overhead

• Clients of analysis are memoization and metrics used in static analysis

• Four kinds of purity: strong, moderate, weak, once-impure

Kinds of Dynamic Purity: Strong

• Same criteria as strong static purity

• Only executed instructions are considered

• All methods start with unknown status

• Impure method information propagates up the call stack

• As with static, once a method is identified as impure it is conservatively always considered impure
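
The bookkeeping on this slide can be sketched as a small status tracker. This is a hypothetical model, not SableVM code: methods are identified by name, the VM is assumed to call `enter`, `impureInstruction`, and `exit` at the corresponding events, and promoting a method from unknown to pure when an invocation completes cleanly is an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of dynamic strong purity tracking.
final class DynamicStrongPurity {
    enum Status { UNKNOWN, PURE, IMPURE }

    private final Map<String, Status> status = new HashMap<>();
    private final Deque<String> callStack = new ArrayDeque<>();

    // All methods start with unknown status.
    void enter(String method) {
        status.putIfAbsent(method, Status.UNKNOWN);
        callStack.push(method);
    }

    // An executed impure instruction taints every method on the call stack.
    void impureInstruction() {
        for (String m : callStack) {
            status.put(m, Status.IMPURE);
        }
    }

    // A clean invocation promotes UNKNOWN to PURE; IMPURE is never undone.
    void exit() {
        String m = callStack.pop();
        status.computeIfPresent(m, (k, s) -> s == Status.UNKNOWN ? Status.PURE : s);
    }

    Status statusOf(String method) {
        return status.getOrDefault(method, Status.UNKNOWN);
    }
}
```

Because only executed instructions trigger `impureInstruction`, a method whose impure branch is never taken stays pure under this model, which is the difference from the static analysis.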

Kinds of Dynamic Purity: Moderate

• Objects can be created and altered as long as they do not escape the method execution context

• A method may call an impure method as long as the impurity is contained

• Must not change behavior based on heap or global state; behavior is based completely on primitive input arguments

• Methods still cannot:
  – Invoke native methods
  – Read/write existing heap or static objects
  – Perform monitor operations
  – Throw exceptions
  – Call moderately impure methods, unless modified data belongs to and is contained in the caller

• Native System.arraycopy() and Object.clone() are treated as heap access and allocation instructions
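
As an illustration of the first rule, the sketch below shows a method that allocates and mutates an array without letting it escape, next to a variant where the same array escapes through a static field. The class is a hypothetical example, not taken from the paper.

```java
// Hypothetical illustration of contained versus escaping allocation.
final class ModeratePurityExample {
    static int[] lastSquares;

    // Allocation and array writes rule out strong purity, but the array is
    // local to this invocation and the result depends only on n.
    static int sumOfSquares(int n) {
        int[] squares = new int[n];      // NEWARRAY: object local to this frame
        for (int i = 0; i < n; i++) {
            squares[i] = i * i;          // IASTORE into a local object
        }
        int sum = 0;
        for (int s : squares) {          // IALOAD from a local object
            sum += s;
        }
        return sum;
    }

    // Same computation, but the array escapes through a static field, so the
    // side effect is no longer contained in the method.
    static int sumOfSquaresLeaky(int n) {
        int[] squares = new int[n];
        int sum = 0;
        for (int i = 0; i < n; i++) {
            squares[i] = i * i;
            sum += squares[i];
        }
        lastSquares = squares;           // PUTSTATIC: the local object escapes
        return sum;
    }
}
```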

Kinds of Dynamic Purity: Moderate (2)

• Analysis needs to take a closer look at *NEW*, GETFIELD, PUTFIELD, *ALOAD, *ASTORE

• *NEW* instructions used to determine object locality
  – Objects of a method are local if they do not escape the method, or if they escape from a callee
  – Frames in the call stack have an object table storing all currently local objects

• PUTFIELD can allow objects local to the callee to escape to the caller (requires an update to the object table)

• GETFIELD, PUTFIELD, *ALOAD, *ASTORE can be classified depending on a frame’s object table

• Moderately pure methods can only use object parameters for reference comparisons
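
A per-frame object table of the kind described above might look like the following sketch. It is a simplified model under stated assumptions, not the SableVM data structure: objects are tracked by identity, only escape through a return value is modeled (the PUTFIELD case is omitted), and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical sketch of per-frame object tables.
final class FrameObjectTables {
    // One identity-based set of local objects per active frame.
    private final Deque<Set<Object>> frames = new ArrayDeque<>();

    void enterMethod() {
        Set<Object> table = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        frames.push(table);
    }

    // *NEW*: a freshly allocated object is local to the allocating frame.
    void onNew(Object obj) {
        frames.peek().add(obj);
    }

    // PUTFIELD / *ASTORE: the write is contained only if the target is local.
    boolean isLocalWrite(Object target) {
        return frames.peek().contains(target);
    }

    // On return, an object that escapes from the callee becomes local to
    // the caller, so later writes to it in the caller remain contained.
    void exitMethod(Object returned) {
        Set<Object> callee = frames.pop();
        if (returned != null && callee.contains(returned) && !frames.isEmpty()) {
            frames.peek().add(returned);
        }
    }
}
```

Under this model a write to an object that is not in the current frame's table would be classified as impure.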

Kinds of Dynamic Purity: Weak

• Allows heap reads so a method can inspect object parameters

• Maintains the property that the method is a function of its input

• GETFIELD is always safe

• PUTFIELD is still considered in the context of the escape analysis
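
A minimal sketch of a method that fits this description: it inspects the fields of its object parameters but writes nothing. The `Point` class and method name are hypothetical.

```java
// Hypothetical illustration of a method that only reads its object parameters.
final class WeakPurityExample {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Uses GETFIELD on its parameters, which strong and moderate purity do
    // not allow, but performs no heap writes at all.
    static int manhattan(Point a, Point b) {
        int dx = a.x - b.x;
        if (dx < 0) dx = -dx;
        int dy = a.y - b.y;
        if (dy < 0) dy = -dy;
        return dx + dy;
    }
}
```

The result is determined by the contents of the two points rather than by their references, so it remains a function of the input.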

Kinds of Dynamic Purity: Once-Impure

• Observed that some impure methods became weakly pure after a first invocation

• A once-impure method is a weakly pure method that was impure during its first execution
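
The slides do not say what causes this pattern. One plausible source is lazy initialization, sketched below as an assumption: the first call writes to an object that existed before the call, and later calls with the same cache object only read. All names are hypothetical.

```java
// Hypothetical illustration; lazy initialization is an assumed cause.
final class OnceImpureExample {
    static final class SquareTable {
        int[] values;                    // null until first use
    }

    static int square(SquareTable cache, int i) {
        if (cache.values == null) {      // GETFIELD
            int[] t = new int[16];
            for (int k = 0; k < t.length; k++) {
                t[k] = k * k;
            }
            cache.values = t;            // PUTFIELD on a pre-existing object: impure
        }
        return cache.values[i];          // later calls take only this read path
    }
}
```

If the same `SquareTable` is passed on every call, only the first invocation executes the write.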

Memoization: Optimization with Purity

• All forms of purity mentioned previously ensure that there is a unique result for any given input

• All are candidates for memoization

• Memoization caches the argument-to-return-value mapping, allowing the VM to bypass repeated execution of a method with the same arguments

• Benefit from jumping past execution must outweigh cost of looking up the return value in cache

Memoization (2)

• Method must be long enough to be worth optimizing

• After the first invocation, arguments are hashed together, looked up in a hash table, and the stored return value is substituted for the invocation

• Primitive args stored directly, reference args are flattened (gathering type and primitive fields)
  – Done so that garbage collection doesn’t invalidate memo tables

• Direct object reference comparisons cannot be safely memoized, so ACMP_* bytecodes must be considered impure

• Upper bounds on memory consumption limit the number of method invocations that can be cached
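
The lookup-then-execute flow and the memory bound can be sketched at the library level as follows. This is not the VM-level implementation described in the talk: it handles only two primitive `int` arguments, uses a plain `HashMap` with a fixed entry limit, and does not cover flattening of reference arguments. The class name and its fields are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

// Hypothetical library-level sketch of a bounded memo table.
final class BoundedMemoTable {
    private final Map<Long, Integer> cache = new HashMap<>();
    private final IntBinaryOperator method;
    private final int maxEntries;
    int hits = 0;

    BoundedMemoTable(IntBinaryOperator method, int maxEntries) {
        this.method = method;
        this.maxEntries = maxEntries;
    }

    int invoke(int a, int b) {
        // Pack both primitive arguments into a single lookup key.
        long key = ((long) a << 32) | (b & 0xFFFFFFFFL);
        Integer cached = cache.get(key);
        if (cached != null) {
            hits++;
            return cached;               // execution of the method is bypassed
        }
        int result = method.applyAsInt(a, b);
        // The upper bound on memory limits how many results are kept.
        if (cache.size() < maxEntries) {
            cache.put(key, result);
        }
        return result;
    }
}
```

Once the table is full, further distinct argument pairs are simply executed every time, which keeps memory use fixed at the cost of fewer hits.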

Experimental Evaluation

• Experiments conducted using programs from the SPECjvm98 benchmark suite

• Metrics
  – Static method purity - percentage of all methods in the call graph that are pure
  – Dynamic method purity - percentage of methods reached at runtime that are pure
  – Dynamic invocation purity - percentage of method invocations that are pure
  – Dynamic bytecode purity - percentage of executed bytecode stream belonging to pure methods
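
The three dynamic metrics are simple ratios over the methods reached at runtime. The sketch below shows one way to compute them from per-method counters; the `MethodProfile` record and its fields are hypothetical, not the format SableVM uses.

```java
import java.util.List;

// Hypothetical sketch of the three dynamic purity metrics.
final class PurityMetrics {
    // Assumed per-method counters gathered during a run.
    static final class MethodProfile {
        final boolean pure;
        final long invocations;
        final long bytecodesExecuted;

        MethodProfile(boolean pure, long invocations, long bytecodesExecuted) {
            this.pure = pure;
            this.invocations = invocations;
            this.bytecodesExecuted = bytecodesExecuted;
        }
    }

    // Percentage of reached methods that are pure.
    static double methodPurity(List<MethodProfile> reached) {
        long pure = reached.stream().filter(m -> m.pure).count();
        return percent(pure, reached.size());
    }

    // Percentage of invocations that target a pure method.
    static double invocationPurity(List<MethodProfile> reached) {
        long pure = reached.stream().filter(m -> m.pure).mapToLong(m -> m.invocations).sum();
        long all = reached.stream().mapToLong(m -> m.invocations).sum();
        return percent(pure, all);
    }

    // Percentage of executed bytecodes that belong to pure methods.
    static double bytecodePurity(List<MethodProfile> reached) {
        long pure = reached.stream().filter(m -> m.pure).mapToLong(m -> m.bytecodesExecuted).sum();
        long all = reached.stream().mapToLong(m -> m.bytecodesExecuted).sum();
        return percent(pure, all);
    }

    private static double percent(long part, long whole) {
        return whole == 0 ? 0.0 : 100.0 * part / whole;
    }
}
```

The three numbers can differ widely for the same run, since a few small pure methods may be invoked often yet account for little of the bytecode stream.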

Experimental Evaluation: Static

• Experimental analysis includes both application and class library code used

• On average, 13% of methods are found to be strongly pure

• Not all methods are invoked at runtime; dynamically, it is found that 5-6% of reached methods are statically identified as pure

• Many of these methods are small (20 instructions or fewer) or are executed infrequently

Table 2. Strong Static Purity: Static methods row shows percentage of all methods in the call graph identified as statically pure. Dynamic methods row shows percentage of all dynamic method invocations that execute a statically pure method. Bytecode row shows the percentage of the bytecode stream that is executed by a statically pure method

Experimental Evaluation: Dynamic

• Strong dynamic purity is weaker than its static equivalent

• The first rows of Tables 3, 4, and 5 show an improvement over the runtime use of strong static purity in rows 2-4 of Table 2

• Table 3 shows up to 4% more pure methods reached with strong dynamic purity

• Some methods are invoked with significant frequency; Table 4 shows 13% more pure invocations for db

Experimental Evaluation: Dynamic (2)

Table 3. Dynamic method purity: All reached methods

Table 4. Dynamic invocation purity: Invoked methods that are pure for dynamic purity definitions

Table 5. Dynamic bytecode purity: Bytecode instruction streams that are pure for dynamic purity definitions

Experimental Evaluation: Dynamic (3)

• Reasons for impurity

Table 8. Reasons for dynamic impurity

Experimental Evaluation: Memoization

• Once-impure dynamic purity analysis is used; a method is always invoked once prior to memoization

• Only applied to methods meeting cost-effectiveness criteria

Table 11. Memoized/memoizable methods: Minimum method size setting shown in far left column

Experimental Evaluation: Execution

Figure 3. Execution times: Minimum method size for memoization is set to 50

Conclusions

• Dynamic purity analyses identify considerable amounts of purity

• Actual program behavior is not predictable based only on static observations

• Little variation in purity over the benchmark suite

• May be the case that memoization is of limited use for non-functional languages

Questions