Heap Decomposition for Concurrent Shape Analysis. R. Manevich, T. Lev-Ami, M. Sagiv (Tel Aviv University), G. Ramalingam (MSR India), J. Berdine (MSR Cambridge). Dagstuhl 08061, February 7, 2008.

Transcript
Page 1: Heap Decomposition for Concurrent Shape Analysis

Heap Decomposition for Concurrent Shape Analysis

R. Manevich, T. Lev-Ami, M. Sagiv (Tel Aviv University)

G. Ramalingam (MSR India)

J. Berdine (MSR Cambridge)

Dagstuhl 08061, February 7, 2008

Page 2: Heap Decomposition for Concurrent Shape Analysis

Thread-modular analysis for coarse-grained concurrency

E.g., [Qadeer & Flanagan, SPIN'03], [Gotsman et al., PLDI'07], …

With each lock lk associate a subheap h(lk), partitioning the heap: H = h(lk1) * … * h(lkn).
Each lock has a local invariant I(lk), inferred or specified.
When thread t acquires lk, it assumes I(lk); when it releases lk, it ensures I(lk).
Can analyze each thread "separately", avoiding explicit enumeration of all thread interleavings.

Page 3: Heap Decomposition for Concurrent Shape Analysis

Thread-modular analysis for fine-grained concurrency?

[Figure: several threads concurrently attempting CAS (Compare And Swap) operations on a shared data structure.]

No locks means more interference between threads.
No nice heap partitioning.
Still, the idea of reasoning about threads separately is appealing.

Page 4: Heap Decomposition for Concurrent Shape Analysis

Overview

The state space is too large for two reasons:
Unbounded number of objects makes it infinite; apply finitary abstractions to data structures (e.g., abstract away the length of a list).
Exponential in the number of threads.

Observation: threads operate on a part of the state, and correlations between different substates are often irrelevant for proving safety properties.

Our approach: develop an abstraction for substates.
Abstract away correlations between substates of different threads.
Reduce the exponential state space.

Page 5: Heap Decomposition for Concurrent Shape Analysis

Non-blocking stack [Treiber 1986]

[1] void push(Stack *S, data_type v) {
[2]   Node *x = alloc(sizeof(Node));
[3]   x->d = v;
[4]   do {
[5]     Node *t = S->Top;
[6]     x->n = t;
[7]   } while (!CAS(&S->Top,t,x));
[8] }

[9] data_type pop(Stack *S) {
[10]   do {
[11]     Node *t = S->Top;
[12]     if (t == NULL)
[13]       return EMPTY;
[14]     Node *s = t->n;
[15]     data_type r = t->d;
[16]   } while (!CAS(&S->Top,t,s));
[17]   return r;
[18] }

#define EMPTY -1
typedef int data_type;
typedef struct node_t { data_type d; struct node_t *n; } Node;
typedef struct stack_t { struct node_t *Top; } Stack;

Page 6: Heap Decomposition for Concurrent Shape Analysis

Example: successful push

(push code from slide 5)

[Figure: before the CAS, Top and t point to the current head of the list; the new node x has x->n = t.]

Page 7: Heap Decomposition for Concurrent Shape Analysis

Example: successful push (continued)

(push code from slide 5)

[Figure: the CAS succeeds, so Top now points to x, which links to the old head through its n field.]

Page 8: Heap Decomposition for Concurrent Shape Analysis

Example: unsuccessful push

(push code from slide 5)

[Figure: another thread changed Top between lines 5 and 7, so the CAS fails; x->n still points to the stale node t and the loop retries.]

Page 9: Heap Decomposition for Concurrent Shape Analysis

Concrete states with storable threads

[Figure: a full state containing four thread objects: prod1 (pc=7), prod2 (pc=6), cons1 (pc=14), cons2 (pc=16). Their local variables x, t, and s point into a single shared list reachable from Top through n fields.]

Legend: a thread object is a name plus a program location; labeled edges are local variables; n is the next field of the list.

Page 10: Heap Decomposition for Concurrent Shape Analysis

Full state S1

[Figure: the full state from the previous slide, now named S1.]

Page 11: Heap Decomposition for Concurrent Shape Analysis

Decomposition(S1)

[Figure: S1 is split into four substates M1, M2, M3, M4, one per thread. Each substate keeps that thread's object and program location, its local variables (t, x, s), and the portion of the list it can reach from Top.]

Decomposition(S1) = M1, M2, M3, M4

Note that S1 is represented by Decomposition(S1): a substate represents all the full states that contain it.

Decomposition is state-sensitive (it depends on the values of pointers and on heap connectivity).

Page 12: Heap Decomposition for Concurrent Shape Analysis

Full states S1, S2

[Figure: two full states. S2 is S1 with roles exchanged: prod1 and prod2 swap program locations (pc=7 and pc=6), as do cons1 and cons2 (pc=14 and pc=16).]

Page 13: Heap Decomposition for Concurrent Shape Analysis

Decomposition({S1, S2})

[Figure: decomposing S1 yields substates M1..M4 and decomposing S2 yields K1..K4, one per thread as before.]

{S1, S2} is contained in the meaning of Decomposition({S1, S2}): the Cartesian abstraction ignores correlations between substates.

Decomposition({S1, S2}) = {M1, K1} × {M2, K2} × {M3, K3} × {M4, K4}

The state space is exponentially more compact.

Page 14: Heap Decomposition for Concurrent Shape Analysis

Abstraction properties

Substates in each subdomain correspond to a single thread.
Abstracts away correlations between threads: exponential reduction of the state space.
Substates preserve information on the part of the heap relevant to one thread.
Substates may overlap.
Useful for reasoning about programs with fine-grained concurrency: better approximation of interference between threads.

Page 15: Heap Decomposition for Concurrent Shape Analysis

Main results

New parametric abstraction for heaps: heap decomposition + Cartesian abstraction, parametric in the underlying abstraction and the decomposition.
Parametric sound transformers allow balancing efficiency and precision.
Implementation in HeDec (Heap Decomposition + Canonical Abstraction).
Used to prove interesting properties of heap-manipulating programs with fine-grained concurrency, including linearizability.
The analysis scales linearly in the number of threads.

Page 16: Heap Decomposition for Concurrent Shape Analysis

Sound transformers

[Figure: an abstract transformer maps each tuple of input substates, one per subdomain, to a tuple of output substates, such that every concrete successor of a represented full state is again represented.]

Page 17: Heap Decomposition for Concurrent Shape Analysis

Pointwise transformers

[Figure: the transformer is applied to each subdomain independently, without composing substates from different subdomains.]

Efficient, but often too imprecise.

Page 18: Heap Decomposition for Concurrent Shape Analysis

Imprecision example

(push code from slide 5)

[Figure: substate M2 contains only prod2 (pc=6), its locals x and t, and its view of the list.]

The transformer on M2 must account for interference: it schedules prod1 and executes its x->n = t. But where do prod1's x and t point? M2 carries no information about them.

Page 19: Heap Decomposition for Concurrent Shape Analysis

Imprecision example (continued)

(push code from slide 5)

[Figure: in the actual full state, prod1 is at pc=7 with x and t pointing into the list. Applying prod1's x->n = t to M2 without this information forces the transformer to consider arbitrary targets for prod1's x and t.]

False alarm: a possibly cyclic list is reported.

Page 20: Heap Decomposition for Concurrent Shape Analysis

Full composition transformers

[Figure: all substates are composed back into full states, the transformer is applied to every resulting full state, and the outputs are decomposed again.]

Precise, but incurs an exponential space blow-up.

Page 21: Heap Decomposition for Concurrent Shape Analysis

Partial composition

[Figure: each substate is composed only with substates from a small set of other subdomains (here, subdomain 1 paired with each of subdomains 2, 3, and 4), instead of composing all subdomains at once.]

Page 22: Heap Decomposition for Concurrent Shape Analysis

Partial composition (continued)

[Figure: the transformer is applied to each partial composition and the results are decomposed back into their subdomains.]

Efficient and precise.

Page 23: Heap Decomposition for Concurrent Shape Analysis

Partial composition example

[Figure: producer substates M1 (prod1 at pc=7) and M2 (prod2 at pc=6) from S1, and K1 (prod1 at pc=6) and K2 (prod2 at pc=7) from S2, are composed pairwise across the two producer subdomains.]

Page 24: Heap Decomposition for Concurrent Shape Analysis

Partial composition example (continued)

[Figure: composing prod2's substate with prod1's substates recovers where prod1's x and t point before the x->n = t step is applied, so the resulting lists remain acyclic.]

The false alarm is avoided.

Page 25: Heap Decomposition for Concurrent Shape Analysis

Experimental results

List-based fine-grained algorithms:
Non-blocking stack [Treiber 1986]
Non-blocking queue [Doherty and Groves, FORTE'04]
Two-lock queue [Michael and Scott, PODC'96], which has benign data races

Verified absence of null dereferences and memory leaks; verified linearizability.
The analysis is built on top of the existing full-heap analysis of [Amit et al., CAV'07].
Scaled the analysis from 2-3 threads to 20 threads; extended to an unbounded number of threads (different work).

Page 26: Heap Decomposition for Concurrent Shape Analysis

Experimental results (continued)

[Charts: number of states (0 to 250,000) and running time (0 to 4,000 sec.) versus number of threads (0 to 20), comparing the decomposition-based analysis ("Decomp") against the full-heap analysis ("Full") on the non-blocking stack with linearizability checking.]

Exponential time/space reduction.

Page 27: Heap Decomposition for Concurrent Shape Analysis

Related work

Disjoint regions decomposition [TACAS'07]: fixed decomposition scheme; the most precise transformer is FNP-complete.
Partial join [Manevich et al., SAS'04]: orthogonal to decomposition; in HeDec we combine decomposition with partial join.
[Yang et al.]: handling concurrency for an unbounded number of threads.
Thread-modular analysis [Gotsman et al., PLDI'07].
Rely-guarantee [Vafeiadis et al., CAV'07].
Thread quantification (submitted).

Page 28: Heap Decomposition for Concurrent Shape Analysis

More related work

Local transformers: works by Reynolds, O'Hearn, Berdine, Yang, Gotsman, Calcagno.
Heap analysis by separation [Yahav & Ramalingam, PLDI'04], [Hackett & Rugina, POPL'05]: decompose the verification problem itself and conservatively approximate contexts.
Heap decomposition for interprocedural analysis [Rinetzky et al., POPL'05], [Rinetzky et al., SAS'05], [Gotsman et al., SAS'06], [Gotsman et al., PLDI'07]: decompose/compose at procedure boundaries.
Predicate/variable clustering [Clarke et al., CAV'00]: statically determined decomposition.

Page 29: Heap Decomposition for Concurrent Shape Analysis

Conclusion

A parametric framework for shape analysis.
Scales analyses of programs with fine-grained concurrency.
Generalizes thread-modular analysis.
Key idea: state decomposition (also useful for sequential programs).
Used to prove intricate properties like linearizability.
HeDec tool: http://www.cs.tau.ac.il/~tvla#HEDEC

Page 30: Heap Decomposition for Concurrent Shape Analysis

Future/ongoing work

Extended the analysis to an unbounded number of threads via thread quantification: an orthogonal technique, and the two compose very well.
Can we automatically infer good decompositions?
Can we automatically tune transformers?
Can we reuse these ideas in non-shape analyses?

Page 31: Heap Decomposition for Concurrent Shape Analysis

Invited questions

How do you choose a decomposition?
How do you choose transformers?
How does it compare to separation logic?
What is a general principle and what is specific to shape analysis?
Caveats / limitations?

Page 32: Heap Decomposition for Concurrent Shape Analysis

How do you choose a decomposition?

In general this is an open problem; perhaps counterexample refinement can help.
It depends on the property you want to prove: aim at the causes of combinatorial explosion (threads, iterators).

For linearizability we used:
For each thread t: the thread node, the objects referenced by its local variables, and the objects referenced by global variables.
The objects referenced by global variables together with the objects correlated with the sequential execution.
A locks component: for each lock, the thread that acquires it.

Page 33: Heap Decomposition for Concurrent Shape Analysis

How do you choose transformers?

In general a challenging problem: one has to balance efficiency and precision.
We have some heuristics, e.g., core subdomains.

Page 34: Heap Decomposition for Concurrent Shape Analysis

How does it compare to separation logic?

Relevant separating conjunction *r: like * but without the disjointness requirement.

Do you have an analog of the frame rule? For the disjoint regions decomposition [TACAS'07], yes. In general no, but instead we can use transformers of different levels of precision:

τ#(I1 * I2) = τ#precise(I1) * τ#less-precise(I2)

where τ#less-precise is cheap to compute. Perhaps one can find conditions under which τ#(I1 * I2) = τ#precise(I1) * I2 (relativized formulae).

Page 35: Heap Decomposition for Concurrent Shape Analysis

What is a general principle and what is specific to shape analysis?

Decomposing abstract domains is general: substate abstraction + Cartesian product.
Parametric transformers for Cartesian abstractions are general.
Chopping heaps down by heterogeneous abstractions is shape-analysis specific.

Page 36: Heap Decomposition for Concurrent Shape Analysis

Caveats / limitations?

The decomposition and the transformers are defined by the user; they are not specialized to the program/property.
Too much overlap between substates can lead to more expensive analyses.
Too fine a decomposition requires lots of composition, and partial composition is a bottleneck.
We have the theory for finer-grained compositions and incremental transformers, but no implementation.
The framework is instantiated for just one abstraction (canonical abstraction).
Could this be useful for separation-logic-based analyzers?