Summarizing Procedures in Concurrent Programs
Shaz Qadeer, Sriram K. Rajamani, Jakob Rehof
Microsoft Research
Mar 26, 2015
Motivation
• How do you scale program analyses for sequential programs?
  – Summarize at procedure boundaries
• Sharir-Pnueli ’81, Reps-Horwitz-Sagiv ’95
  – Used in compiler dataflow analyses
  – Used in error detection tools
    • SLAM (Ball-Rajamani ’00)
    • ESP (Das-Lerner-Seigle ’02)
Summarization is efficient!
• Boolean program with:
  – g globals
  – n procedures, each with at most m locals
  – |E| = size of the CFG of the program
• Complexity: O(|E| · 2^{O(g+m)})
• Complexity linear in the number of procedures!
Summarization gives termination!
• Possibly recursive boolean programs
• Infinite state systems
• Checking terminates with summarization!
Question
Can summarization help analysis of concurrent programs?
Difficulty
Assertion checking for multithreaded programs is undecidable
– Even if all variables are boolean
– Further, even if there are only two threads!
– Reduce emptiness of intersection of two CFLs to this problem (Ramalingam ’00)
Our work
• New model checking algorithm using summarization
  – useful for concurrent programs
• Summaries provide re-use and efficiency for analyzing concurrent programs
• Enable termination of analysis in a large class of concurrent programs
  – includes programs with recursion, shared variables and concurrency
Difficulties in summarizing concurrent programs
• What is a summary?
  – For sequential programs
    • Summary of procedure P = set of all pre-post state pairs (s,s’) obtained by invoking P
  – This doesn’t work for concurrent programs
    • Does not model concurrent updates by other threads
Insight
• In a well synchronized concurrent program
  – A thread’s computation can be viewed as a sequence of transactions
  – While analyzing a transaction, interleavings with other threads need not be considered
  – Key idea: Summarize transactions!
How do you identify transactions?
Lipton’s theory of reduction
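Four atomicities
• R: right movers
  – lock acquire
• L: left movers
  – lock release
• B: both right + left movers
  – variable access holding lock
• N: non-movers
  – access unprotected variable
[Figures: commuting diagrams. acq(this) followed by another thread's action x (S0, S1, S2) reaches the same state as x followed by acq(this) (S0, T1, S2); rel(this) commutes in the other direction with another thread's action z (S5, S6, S7 versus S5, T6, S7); the protected read r=bal commutes in both directions with other threads' actions x and y (S2, S3, S4 versus S2, T3, S4).]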
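Transaction
Any sequence of actions whose atomicities are in R*(N+ε)L* is a transaction: zero or more right movers, then at most one non-mover, then zero or more left movers.
[Figure: an execution S0 ... S7 with each action labelled by its atomicity (R, N or L); the transaction is divided into a precommit phase and a postcommit phase.]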
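The transaction shape can be checked mechanically. The following is a minimal sketch, not the authors' implementation: it assumes the pattern on the slide is read as "right movers, then at most one non-mover, then left movers", and that a both mover (B) may stand in for either a right or a left mover. The function name `is_transaction` is hypothetical.

```python
import re

# Atomicity letters as on the slides: R (right mover), L (left mover),
# B (both mover), N (non-mover).
# Assumption: B can play the role of R or of L, so the accepted shape is
# (R|B)* followed by at most one N, followed by (L|B)*.
_TRANSACTION = re.compile(r"[RB]*N?[LB]*")


def is_transaction(atomicities: str) -> bool:
    """Return True if the string of atomicity letters forms one transaction."""
    return _TRANSACTION.fullmatch(atomicities) is not None


print(is_transaction("RRNL"))  # True: two acquires, one unprotected access, a release
print(is_transaction("RBL"))   # True: acquire, protected access, release
print(is_transaction("RLR"))   # False: an acquire after a release starts a new transaction
```

Anything rejected by this check has to be split into several transactions, and other threads may be scheduled at the split points.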
Transactions and summaries
Corollary of Lipton’s theorem:
No need to schedule other threads in the middle of a transaction
If a procedure body occurs in a transaction, we can summarize it!
Resource allocator (1)
bool available[N];
mutex m;

int getResource() {
    int i = 0;
L0: acquire(m);
L1: while (i < N) {
L2:     if (available[i]) {
L3:         available[i] = false;
L4:         release(m);
L5:         return i;
        }
L6:     i++;
    }
L7: release(m);
L8: return i;
}

Choose N = 2
Summaries: <pc, i, m, (a[0],a[1])> → <pc’, i’, m’, (a[0]’,a[1]’)>
<L0, 0, 0, (0,0)> → <L8, 2, 0, (0,0)>
<L0, 0, 0, (0,1)> → <L5, 1, 0, (0,0)>
<L0, 0, 0, (1,0)> → <L5, 0, 0, (0,0)>
<L0, 0, 0, (1,1)> → <L5, 0, 0, (0,1)>
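Because the whole body of getResource() runs under the single mutex m, it is one transaction, and its summaries can be computed by running the procedure once per entry state with no interleaving. The sketch below is an illustrative re-encoding in Python (the helper name `get_resource` and the tuple layout are assumptions made here); it reproduces the four summaries listed for N = 2.

```python
from itertools import product

N = 2  # as chosen on the slide


def get_resource(available):
    """Run getResource() to completion from one valuation of available[].

    Returns the exit state (pc, i, m, available'). The mutex m is held for
    the whole body and is free again (0) at every return.
    """
    a = list(available)
    i = 0
    # L0: acquire(m)
    while i < N:              # L1
        if a[i]:              # L2
            a[i] = 0          # L3
            # L4: release(m)
            return ("L5", i, 0, tuple(a))
        i += 1                # L6
    # L7: release(m)
    return ("L8", i, 0, tuple(a))


# One summary per entry state <L0, i=0, m=0, available>.
summaries = {
    ("L0", 0, 0, a): get_resource(a) for a in product((0, 1), repeat=N)
}

for pre, post in sorted(summaries.items()):
    print(pre, "->", post)
```

Each entry maps a pre-state at L0 to the post-state at the return, which is exactly the pre-post pair notion of a summary used for sequential programs.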
What if transaction boundaries and procedure boundaries do not coincide?
Two level model checking algorithm
Two level algorithm
• First level maintains stack
• Second level maintains stack-less summaries
• Summaries can start and end anywhere in a procedure
Resource allocator (2)
bool available[N];
mutex m[N];

int getResource() {
    int i = 0;
L0: while (i < N) {
L1:     acquire(m[i]);
L2:     if (available[i]) {
L3:         available[i] = false;
L4:         release(m[i]);
L5:         return i;
        } else {
L6:         release(m[i]);
        }
L7:     i++;
    }
L8: return i;
}

Choose N = 2
Summaries: <pc, i, (m[0],m[1]), (a[0],a[1])> → <pc’, i’, (m[0]’,m[1]’), (a[0]’,a[1]’)>
<L0, 0, (0,0), (0,0)> → <L1, 1, (0,0), (0,0)>
<L0, 0, (0,0), (0,1)> → <L1, 1, (0,0), (0,1)>
<L0, 0, (0,0), (1,0)> → <L5, 0, (0,0), (0,0)>
<L0, 0, (0,0), (1,1)> → <L5, 0, (0,0), (0,1)>
<L1, 1, (0,0), (0,0)> → <L8, 2, (0,0), (0,0)>
<L1, 1, (0,0), (0,1)> → <L5, 1, (0,0), (0,0)>
<L1, 1, (0,0), (1,0)> → <L8, 2, (0,0), (1,0)>
<L1, 1, (0,0), (1,1)> → <L5, 1, (0,0), (1,0)>
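With one mutex per element, a call to getResource() spans several transactions, so summaries start and end inside the procedure. The sketch below is a rough Python re-encoding (the name `run_transaction` and the state layout are assumptions made for illustration): it runs one thread from a given program point until the current transaction ends, which is taken to be either a return or the point where an acquire would follow a release. Both starting points are paired with every valuation of available[], since other threads may change it between transactions.

```python
from itertools import product

N = 2
FREE = (0,) * N  # every m[i] is released at a transaction boundary


def run_transaction(pc, i, available):
    """Execute from (pc, i, available) up to the end of the current transaction."""
    a = list(available)
    released = False  # True once a release (left mover) has executed
    while True:
        if pc == "L0":                      # loop test on a local: both mover
            pc = "L1" if i < N else "L8"
        elif pc == "L1":                    # acquire(m[i]): right mover
            if released:                    # right mover after a left mover:
                return ("L1", i, tuple(a))  # the next transaction starts here
            pc = "L2"
        elif pc == "L2":                    # read available[i] under the lock
            pc = "L3" if a[i] else "L6"
        elif pc == "L3":                    # available[i] = false
            a[i] = 0
            pc = "L4"
        elif pc in ("L4", "L6"):            # release(m[i]): left mover
            released = True
            pc = "L5" if pc == "L4" else "L7"
        elif pc == "L7":                    # i++
            i += 1
            pc = "L0"
        else:                               # L5 or L8: return
            return (pc, i, tuple(a))


summaries = {}
for pc, i in (("L0", 0), ("L1", 1)):
    for a in product((0, 1), repeat=N):
        end_pc, end_i, end_a = run_transaction(pc, i, a)
        summaries[(pc, i, FREE, a)] = (end_pc, end_i, FREE, end_a)

for pre, post in sorted(summaries.items()):
    print(pre, "->", post)
```

The eight entries agree with the summaries listed above; note that the entries starting at L0 end at L1 when the first slot is unavailable, which is a point in the middle of the procedure.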
Two level model checking algorithm: in pictures
Let’s first review the sequential CFL algorithm…
[Figure: main() calling bar() in the sequential algorithm.]
[Figure: main() calling bar() in two threads T1 and T2, with an "End of transaction" marker.]
Three kinds of summaries:
1. MAX
2. MAXCALL
3. MAXRETURN
[Figure: summary edges labelled MAXCALL, MAXRETURN and MAX.]
Concurrency + recursion
void foo(int r) {
L0: if (r == 0) {
L1:     foo(r);
    } else {
L2:     acquire(m);
L3:     g++;
L4:     release(m);
    }
L5: return;
}

Summaries for foo: <pc,r,m,g> → <pc’,r’,m’,g’>
<L0,1,0,0> → <L5,1,0,1>
<L0,1,0,1> → <L5,1,0,2>

Summaries for main: <pc,q,m,g> → <pc’,q’,m’,g’>
<M0,1,0,0> → <M1,1,0,1>
<M0,1,0,1> → <M1,1,0,2>
<M1,1,0,1> → <M4,1,0,1>
<M1,1,0,2> → <M4,1,0,2>

void main() {
int q = choose({0,1});
M0: foo(q);
M1: acquire(m);
M2: assert(g >= 1);
M3: release(m);
M4: return;
}
P = main() || main()
int g = 0;
mutex m;
What if the same procedure is called from different phases of a transaction?
Instrument the transaction phase into the state of the program
Transactional context
void foo1() {
L0: acquire(n);
L1: gn++;
L2: bar();
L3: release(n);
L4: return;
}
void foo2() {
M0: acquire(n);
M1: gn++;
M2: release(n);
M3: bar();
M4: return;
}
P = foo1() || foo2()
int gm = 0, gn = 0;
mutex m, n;
void bar() {
N0: acquire(m);
N1: gm++;
N2: release(m);
}
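The effect of the calling context can be seen by tracking the transaction phase along each thread's actions. This is a small sketch under stated assumptions, not the algorithm from the talk: acquires are right movers (R), releases are left movers (L), and the increments of gn and gm count as both movers (B) because they happen under the corresponding lock. The names `step` and `analyse` and the trace encoding are hypothetical.

```python
PRE, POST = "pre-commit", "post-commit"


def step(phase, atomicity):
    """Advance the phase by one action.

    Returns (new_phase, starts_new_transaction).
    """
    if atomicity == "B":   # both mover: never changes the phase
        return phase, False
    if atomicity == "R":   # right mover: only allowed before the commit
        return PRE, phase == POST
    if atomicity == "L":   # left mover: moves to (or stays in) post-commit
        return POST, False
    if atomicity == "N":   # non-mover: this is the commit action itself
        return POST, phase == POST
    raise ValueError(f"unknown atomicity {atomicity!r}")


def analyse(trace):
    """Return (phase in which bar() is entered, number of transactions)."""
    phase, count, phase_at_call = PRE, 1, None
    for action, atomicity in trace:
        if action == "call bar":
            phase_at_call = phase
            continue
        phase, new = step(phase, atomicity)
        count += int(new)
    return phase_at_call, count


BAR = [("acquire(m)", "R"), ("gm++", "B"), ("release(m)", "L")]
FOO1 = [("acquire(n)", "R"), ("gn++", "B"), ("call bar", None)] + BAR + [("release(n)", "L")]
FOO2 = [("acquire(n)", "R"), ("gn++", "B"), ("release(n)", "L"), ("call bar", None)] + BAR

print(analyse(FOO1))  # bar() entered in the pre-commit phase
print(analyse(FOO2))  # bar() entered in the post-commit phase
```

Under these assumptions foo1 is a single transaction that contains the whole of bar(), while in foo2 the acquire inside bar() begins a second transaction. Recording the phase in the state lets the two calls to bar() be summarized separately.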
Recap of technical problems
• How do you identify transactions?
  – Using the theory of reduction (Lipton ’75)
• What if transaction boundaries do not coincide with procedure boundaries?
  – Two level model checking algorithm
  – First level maintains stack
  – Second level maintains stack-less summaries
• Procedure can be called from different phases of a transaction
  – Instrument the transaction phase into the state of the program
Termination
• A function is transactional if no transaction ends in the “middle” of its execution (including the execution of all transitive callees)
• Theorem: For concurrent boolean programs, if all recursive functions are transactional, then the algorithm terminates.
Sequential case
• If we feed a sequential program to our algorithm, it functions exactly like the Reps-Horwitz-Sagiv (RHS, POPL ’95) algorithm
• Our algorithm generalizes the RHS algorithm to concurrent programs!
Related work
• Summarizing sequential programs
  – Sharir-Pnueli ’81, Reps-Horwitz-Sagiv ’95, Ball-Rajamani ’00
• Concurrency + procedures
  – Bouajjani-Esparza-Touili ’02
  – Esparza-Podelski ’00
• Reduction
  – Lipton ’75
  – Qadeer-Flanagan ’03
(joint work with Tony Andrews)
[Figure: model checking pipelines from source code through abstraction to a model checker.]
• Finite state machines: source code → abstraction → FSM → model checker
• SLAM: sequential C program (C data structures, pointers, procedure calls, parameter passing, scoping, control flow) → automatic abstraction → boolean program (push down model) → data flow analysis implemented using BDDs
• Zing: device driver (taking concurrency into account), web services code → abstraction → model with rich control constructs (thread creation, function call, exception, objects, dynamic allocation) → model checker
  – Model checking is undecidable!
What is Zing?
• Zing is a framework for software model-checking
  – Language, compiler, runtime, tools
• Supports key software concepts
  – Enables easier extraction of models from code
• Supports research in exploring large state spaces
• Operates seamlessly with the VS.Net design environment
Current status
• Summarization:
  – Theory: to appear in POPL ’04
  – Implementation: in progress
• Zing:
  – Compiler, model checker and conformance checker operational
  – State-delta and transaction-based reduction implemented
  – Plans:
    • Symbolic reasoning
    • Automatic abstraction
Bluetooth demo
BPEL4WS checking
[Figure: BPEL processes (Buyer, Seller, Auction House, Reg Service), Zing model, Zing State Explorer.]