Java Race Finder Checking Java Programs for Sequential Consistency Tuba Yavuz-Kahveci Fall 2013
Feb 25, 2016
Outline
• The Problem: Getting Multithreaded Java Programs Right
• Java Memory Model
• Our Solution: Java Race Finder
  • What is model checking anyway?
  • Representing Happens-before
  • Heuristic-based Search
  • Code Modification Suggestions
What is Sequential Consistency?
• Program statements are executed according to program order
  • Each thread's statements are executed according to the program order in that thread's code
• Write atomicity
  • Each read operation on a variable sees the most recent write operation on that variable
What is a Memory Model?
• Constrains the behavior of memory operations
  • What value can a read operation see?
• Example memory models
  • Sequential Consistency
    • Easy to understand
  • Relaxed Consistency Models
    • Relaxation of program order
    • Relaxation of write atomicity
Who Should Care?
• Programmers
  • Understanding how to achieve sequential consistency, if possible
  • Reasoning about correctness
• Compiler writers
  • Optimizing code within the restrictions of the memory model
Problem: Getting Multi-threaded Java Programs Right
Important questions any Java programmer should ask:
• Is my multithreaded program correctly synchronized?
  • Beware! Sequential consistency is not guaranteed for incorrectly synchronized Java programs.
• If my multithreaded program is not correctly synchronized, how can I fix it?
• If my multithreaded program is not correctly synchronized for a good reason, should I still be worried?
Automated tool support is needed to answer these nontrivial questions.
An Example: Peterson's Mutual Exclusion Algorithm - Version 1
Initialization: flag[0] = flag[1] = turn = shared = 0 /* all non-volatile */

Thread 1
s1: flag[0] = 1;
s2: turn = 1;
s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }
s4: shared++; /* critical section */
s5: flag[0] = 0;

Thread 2
s6: flag[1] = 1;
s7: turn = 0;
s8: while (flag[0] == 1 && turn == 0) { /*spin*/ }
s9: shared++; /* critical section */
s10: flag[1] = 0;
What is the Java Memory Model (JMM)?
• A relaxed memory model
• Sequential consistency is guaranteed only for correctly synchronized programs
  • That is, for programs without data races
• Incorrectly synchronized programs can show extra behavior that is not sequentially consistent
  • They are still subject to some safety rules
Synchronization Rules in Java
Some synchronization actions and their relationship in Java:
Unlocking a monitor lock synchronizes with locking that monitor lock.
Writing a volatile variable synchronizes with reading of that variable.
Starting a thread synchronizes with the first action of that thread.
Final action in a thread synchronizes with any action of a thread that detects termination of that thread.
Initialization of a field synchronizes with the first access to the field in every thread.
In general a release action synchronizes with a matching acquire action.
Happens-Before Relation
An action a1 happens-before action a2, written a1 ≤hb a2, due to one of the following:
• a1 comes before a2 according to program order: a1 ≤po a2.
• a1 synchronizes with a2: a1 ≤sw a2.
• a1 happens-before some a' that happens-before a2: there exists a' such that a1 ≤hb a' and a' ≤hb a2 (transitivity).
Happens-before, ≤hb = (≤po ∪ ≤sw)+, is a partial order on all actions in an execution.
Happens-before Consistency
A read operation r can see the result of a write operation w provided that:
• r does not happen-before w: not (r ≤hb w).
• There is no intervening write operation: not (exists w'. w ≤hb w' ≤hb r).
Anatomy of a Data Race
Definition: If two actions a1 and a2 from different threads access the same memory location loc, the actions are not ordered by happens-before, and at least one of the actions is a write, then there is a data race on loc.
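Example:
Initialization: boolean done = false; /* non-volatile */

Thread 1
result = compute();
done = true;

Thread 2
if (done)
  // use result

Within each thread the statements are ordered by ≤hb (program order), but nothing orders Thread 1's write of done before Thread 2's read of it: race on done!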
A Simple Fix
A write to a volatile variable synchronizes with a read of that variable.
Example:
Initialization: volatile boolean done = false;

Thread 1
result = compute();
done = true;

Thread 2
if (done)
  // use result

The volatile write of done is ordered by ≤hb before the read that sees it, and by transitivity so is the write of result before its use: not in a race.
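The volatile fix can be tried out with a small Java program. This is only a sketch: the class name VolatileFlagDemo, the helper compute(), and the value 42 are made up for illustration, and the reader spins on done (instead of testing it once, as on the slide) so that the outcome is deterministic.

```java
// Illustrative sketch only: class and method names are hypothetical.
class VolatileFlagDemo {
    static int result;                  // non-volatile, published through 'done'
    static volatile boolean done;       // volatile write synchronizes with a read that sees it

    static int compute() { return 42; } // stand-in for the real computation

    static int runOnce() {
        result = 0;
        done = false;
        final int[] seen = new int[1];
        Thread writer = new Thread(() -> {
            result = compute();         // (1) ordinary write
            done = true;                // (2) volatile write, after (1) in program order
        });
        Thread reader = new Thread(() -> {
            while (!done) { Thread.yield(); } // (3) volatile read that eventually sees (2)
            seen[0] = result;           // (4) ordered after (1) by happens-before
        });
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();              // join orders the reader's write to seen[0] before our read
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If the volatile modifier is removed from done, the program has the data race described above, and the reader is no longer guaranteed to ever observe the write.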
Our Solutions/Contributions
• Is my multi-threaded program correctly synchronized?
  • Kim K., Yavuz-Kahveci T., Sanders B. Precise Data Race detection in Relaxed Memory Model using Heuristic based Model Checking [ASE Conf. 2009]
• If my multi-threaded program is not correctly synchronized, how can I fix it?
  • Kim K., Yavuz-Kahveci T., Sanders B. JRF-E: Using Model Checking to give Advice on Eliminating Memory Model-related Bugs [ASE Conf. 2010, ASE Journal 2012]
• If my program is not correctly synchronized for a good reason, should I still be worried?
  • Jin H., Yavuz-Kahveci T., Sanders B. Java Path Relaxer: Extending JPF for JMM-aware model checking [JPF Workshop]
  • Jin H., Yavuz-Kahveci T., Sanders B. Java Memory Model-Aware Model Checking [TACAS 2012]
State/Snapshot of a Running Java Program
[Figure: the Java Virtual Machine holds the bytecode for the Java program together with the program state: values of static fields, heap (objects), and thread states.]
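Model Checking Java Programs
[Figure: starting from a program state (values of static fields, heap objects, thread states), the model checker explores the different thread schedules, for example Main Thread, Thread1, Thread2, Thread3, …; Main Thread, Thread2, Thread1, Thread3, …; Main Thread, Thread3, Thread2, Thread1, ….]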
Model Checking for Sequential Consistency
[Figure: a multi-threaded Java application is given to Java Race Finder (JRF), which is built on Java Path Finder (JPF) and answers the question "Data race?" with yes or no.]
Java Path Finder (JPF):
• a model checker for Java programs
• checks for general correctness properties
• assumes sequential consistency
• explores all possible thread interleavings
Java Race Finder (JRF):
• extends JPF's state representation to detect data races
Our Approach for Detecting Data Races
Algorithm:
for each execution path EPj = <a1, a2, …, an> of program P do
  initialize happens-before relation
  for each action ai, i = 1 to n, do
    let loc be the memory location ai accesses
    if (it is not safe (without a data race) for ai to access loc)
      generate DATA RACE error
    execute ai
    update happens-before relation
Representing Happens-Before
We define an h-function that captures the happens-before relation in an implicit way.
h: SyncAddr ∪ Thread -> 2^Addr
• SyncAddr: volatile variables and locks
• Addr: non-volatile variables
Is it safe for action aj of thread ti to access loc? Check whether h(ti) contains loc.
Which variables can be safely accessed if an acquire on s (with a matching release on s) is executed? Those in h(s).
The h-function
Initialization: at the beginning there is only the main thread:
h0 = λz. if z = main then static(P) else ∅
Update: executing an action updates the h-function: action(t, x) h = h'
• h: the h-function before executing the action
• t: the thread the action belongs to
• x: synchronization variable (volatile or a lock)
• h': the updated h-function
Updating the h-function
action an by thread t | hn+1
write a volatile field v | release(t, v) hn
read a volatile field v | acquire(t, v) hn
lock the lock variable lck | acquire(t, lck) hn
unlock the lock variable lck | release(t, lck) hn
start thread t′ | release(t, t′) hn
join thread t′ | acquire(t, t′) hn
t′.isAlive() returns false | acquire(t, t′) hn
write a non-volatile field x | invalidate(t, x) hn
read a non-volatile field x | hn
instantiate an object containing non-volatile fields fields and volatile fields volatiles | new(t, fields, volatiles) hn
Action Semantics
• release(t, x) h = h[x → h(t) ∪ h(x)]
  Variables that can be safely accessed from thread t are copied to the set for synchronization variable x.
• acquire(t, x) h = h[t → h(t) ∪ h(x)]
  Variables in the set of synchronization variable x can now be safely accessed by thread t.
• invalidate(t, x) h = λz. if (t = z) then h(z) else h(z) \ {x}
  Only thread t, which changed x, can safely access it.
• new(t, fields, volatiles) h = λz. if (t = z) then h(t) ∪ fields else if (z ∈ volatiles) then {} else h(z)
  The non-volatile fields of the newly created object can be safely accessed by the thread that created it. The volatile fields are initialized to refer to empty sets.
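The action semantics can be written down almost literally in Java. The sketch below is not JRF's implementation. It assumes that threads, synchronization variables and memory locations are all named by strings, and HFunction, allocate, isSafe and firstRace are hypothetical names. firstRace replays one interleaving of the done/result example (Thread 1 runs to completion, then Thread 2) and performs the safety check from the detection algorithm before each non-volatile access.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the h-function; not JRF's actual data structures.
class HFunction {
    private final Map<String, Set<String>> h = new HashMap<>();

    // h0 = λz. if z = main then static(P) else ∅
    HFunction(String mainThread, Set<String> staticFields) {
        h.put(mainThread, new HashSet<>(staticFields));
    }

    private Set<String> get(String z) {
        return h.computeIfAbsent(z, k -> new HashSet<>());
    }

    // release(t, x) h = h[x → h(t) ∪ h(x)]
    void release(String t, String x) { get(x).addAll(get(t)); }

    // acquire(t, x) h = h[t → h(t) ∪ h(x)]
    void acquire(String t, String x) { get(t).addAll(get(x)); }

    // invalidate(t, x) h = λz. if (t = z) then h(z) else h(z) \ {x}
    void invalidate(String t, String x) {
        for (Map.Entry<String, Set<String>> e : h.entrySet()) {
            if (!e.getKey().equals(t)) e.getValue().remove(x);
        }
    }

    // new(t, fields, volatiles)
    void allocate(String t, Set<String> fields, Set<String> volatiles) {
        get(t).addAll(fields);
        for (String v : volatiles) h.put(v, new HashSet<>());
    }

    // Is it safe for thread t to access loc? Does h(t) contain loc?
    boolean isSafe(String t, String loc) { return get(t).contains(loc); }

    // Replays: main starts T1 and T2; T1 writes result, then done; T2 reads done, then result.
    // Returns the first location found to be in a race, or null if there is none.
    static String firstRace(boolean doneIsVolatile) {
        Set<String> statics = new HashSet<>();
        statics.add("result");
        if (!doneIsVolatile) statics.add("done");   // Addr holds non-volatile locations only
        HFunction hf = new HFunction("main", statics);
        hf.release("main", "T1");                   // start thread T1
        hf.release("main", "T2");                   // start thread T2

        if (!hf.isSafe("T1", "result")) return "result";
        hf.invalidate("T1", "result");              // T1: result = compute();
        if (doneIsVolatile) {
            hf.release("T1", "done");               // T1: done = true; (volatile write)
        } else {
            if (!hf.isSafe("T1", "done")) return "done";
            hf.invalidate("T1", "done");            // T1: done = true; (ordinary write)
        }
        if (doneIsVolatile) {
            hf.acquire("T2", "done");               // T2: if (done) (volatile read)
        } else if (!hf.isSafe("T2", "done")) {
            return "done";                          // T2: if (done) (ordinary read)
        }
        if (!hf.isSafe("T2", "result")) return "result"; // T2: use result
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstRace(false)); // prints done
        System.out.println(firstRace(true));  // prints null
    }
}
```

With a non-volatile done, Thread 1's write removes done from h(T2), so Thread 2's read is flagged. With a volatile done, the release/acquire pair copies result into h(T2) and no race is reported.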
Implementation of the h-function
How JRF extends JPF
Test Programs
Source | # of examples | # of examples found to have data races
Textbook by Herlihy and Shavit | 65 | 19
Amino Concurrent Building Blocks Library | 10 | 9
Google Concurrent Data Structures Workshop | 12 | 10
Java Grande Forum Benchmark Suite | 10 | 6
Webserver Simulator – Student Projects | 28 | 7
Time Overhead of JRF
[Figure: time in seconds (log scale, 1 to 1000) for JPF and JRF across the test programs.]
Space Overhead of JRF
[Figure: memory in MB (log scale, 1 to 1000) for JPF and JRF across the test programs.]
Finding the data race quickly
[Figure: the state space of a program drawn as a tree rooted at the initial state, with several states marked "race".]
Each path from the initial state to a leaf state represents a separate execution.
Finding the data race using DFS
[Figure: the same state space, highlighting the counter-example path found by DFS.]
Finding the data race using BFS
[Figure: the same state space, highlighting the counter-example path found by BFS.]
Heuristic-Based Data Race Search
• Our goal is to reach a state that has a data race as quickly as possible.
• Assign a traversal priority to each program state based on how close it may be to a racy state.
  • Writes-First (WF): Prefer write statements to read statements
  • Watch-Written (WW): Prefer access to memory locations recently written by another thread
  • Avoid Release/Acquire (ARA): Avoid scheduling threads that perform proper synchronization
  • Acquire-First (AF): Prefer acquire operations that do not have a matching release operation
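Priority-driven traversal of this kind is a best-first search. The toy sketch below uses a PriorityQueue to always expand the state with the highest score. The state graph and the numeric priorities are invented for illustration only; they are not the scores JRF computes for WF, WW, ARA or AF.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Toy best-first search; states and priority values are made up for illustration.
class BestFirstSketch {
    static final class State {
        final String name;
        final int priority;     // hypothetical score: higher means "looks closer to a race"
        final boolean racy;
        final List<State> next = new ArrayList<>();
        State(String name, int priority, boolean racy) {
            this.name = name;
            this.priority = priority;
            this.racy = racy;
        }
    }

    // Returns how many states were expanded before a racy state was reached, or -1.
    static int search(State init) {
        PriorityQueue<State> frontier =
            new PriorityQueue<>((a, b) -> Integer.compare(b.priority, a.priority));
        Set<State> seen = new HashSet<>();
        frontier.add(init);
        seen.add(init);
        int expanded = 0;
        while (!frontier.isEmpty()) {
            State s = frontier.poll();      // highest-priority state first
            expanded++;
            if (s.racy) return expanded;
            for (State n : s.next) {
                if (seen.add(n)) frontier.add(n);
            }
        }
        return -1;
    }

    static State toyGraph() {
        State init = new State("init", 0, false);
        State sync = new State("acquire/release step", 0, false);   // deprioritized, in the spirit of ARA
        State sync2 = new State("more synchronized steps", 0, false);
        State write = new State("write step", 2, false);            // preferred, in the spirit of WF
        State race = new State("racy read", 3, true);               // preferred, in the spirit of WW
        init.next.add(sync);
        init.next.add(write);
        sync.next.add(sync2);
        write.next.add(race);
        return init;
    }

    public static void main(String[] args) {
        System.out.println(search(toyGraph())); // prints 3
    }
}
```

In this toy graph the search expands init, the write step and the racy read, and never visits the properly synchronized branch.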
DFS vs Heuristic Search
DFS search path:
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }
s4: shared++; /* critical section */
s5: flag[0] = 0;
Thread 2
s6: flag[1] = 1;
s7: turn = 0;

Heuristic search path:
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
Thread 2
s6: flag[1] = 1;
s7: turn = 0;

Race! turn is not in h(thread2)!
Experimental Results: Heuristic Search
Columns: Code (lines of code) | Search | State | Length | Time (sec) | Memory (MB)
DisBarrier (232)
DFS, Heuristic, BFS
109 792589
109 39 36
4 3256
53 46644
Moldyn (1252)
DFS, Heuristic, BFS
282118965127
2821
950>574*
231257
1014
579518988
DEQueue (334)
DFS, Heuristic, BFS
331930
2812
9
112
272631
BinaryStaticTreeBarrier (1910)
DFS, Heuristic, BFS
61137
2275
6152
>18*
79
2221
6686
986
*: JPF ran out of memory
What went wrong?
Thread 1
s1: flag[0] = 1;
s2: turn = 1;    <- source statement: removes turn from h(thread2)
Thread 2
s6: flag[1] = 1;
s7: turn = 0;    <- manifest statement: accesses turn when turn is not in h(thread2)
How to fix it?
• Data races are due to the absence of a happens-before relationship
• Suggest code modifications that will create a happens-before relationship between the source and manifest statements
  • Change the variable to volatile
  • Change the array to an atomic array
  • Move the source statement to make use of existing happens-before relationships due to transitivity
  • Perform the same synchronization
  • Change another variable to volatile to create happens-before relationships due to transitivity
Change to atomic array
Peterson's ME Alg.: turn and flag are volatile (declaring the array reference volatile does not make its elements volatile)
Thread 2
s6: flag[1] = 1;    <- source statement: removes flag[1] from h(thread1)
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }    <- manifest statement: accesses flag[1] when flag[1] is not in h(thread1)
Suggestion: change flag to an atomic array
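A sketch of Peterson's algorithm with the atomic-array suggestion applied might look as follows. It assumes java.util.concurrent.atomic.AtomicIntegerArray for flag (its get and set have volatile access semantics per element) and a volatile turn. The class name PetersonFixed and the enter/exit/run helpers are hypothetical.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Sketch of Peterson's algorithm after the suggested fixes; names are illustrative.
class PetersonFixed {
    static final AtomicIntegerArray flag = new AtomicIntegerArray(2); // elements start at 0
    static volatile int turn;
    static int shared;                      // plain field, protected by the algorithm

    static void enter(int me) {
        int other = 1 - me;
        flag.set(me, 1);                    // s1 / s6
        turn = other;                       // s2 / s7
        while (flag.get(other) == 1 && turn == other) { Thread.yield(); } // s3 / s8
    }

    static void exit(int me) {
        flag.set(me, 0);                    // s5 / s10
    }

    static int run(int iterations) {
        shared = 0;
        Thread[] threads = new Thread[2];
        for (int id = 0; id < 2; id++) {
            final int me = id;
            threads[id] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    enter(me);
                    shared++;               // s4 / s9: critical section
                    exit(me);
                }
            });
        }
        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(run(10000));     // prints 20000
    }
}
```

Because every access to flag and turn is now a synchronization action, the increments of shared are ordered by happens-before and no update is lost.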
An Example for move source
Initialization: goFlag = false; volatile Data publish;

Thread 1
s1: r = new Data();
s2: publish = r;
s3: r.setDesc(e);
s4: goFlag = true;

Thread 2
t1: if (publish != null) {
t2:   while (!goFlag);
t3:   String s = publish.getDesc();
t4:   assert(s.equals("e"));
    }

• Thread 1 updates the published object after making the reference visible
• The compiler may reorder s3 and s4
• Thread 2 may use the published object when it is in an inconsistent state
Move source statement
publish is volatile; goFlag is not volatile
Thread 1
s1: r = new Data();
s2: publish = r;
s3: r.setDesc(e);
s4: goFlag = true;    <- source statement: removes goFlag from h(thread2)
Thread 2
t1: if (publish != null) {
t2: while (!goFlag);    <- manifest statement: accesses goFlag when goFlag is not in h(thread2)
Suggestion: move s4 (goFlag = true;) before s2
An Example for perform the same synchronization operation
Initialization: int data; final Object lock = new Object();

Thread 1
s1: print (data);

Thread 2
t1: synchronized (lock) { /*lock*/
t2: data = 1;
t3: } /*unlock*/

• For every non-volatile variable v, acquireHistory(v) stores the set of safe accesses by thread t via a synchronization operation on s.
• Thread 2's safe access on data is noted as an example behavior.
Perform that synchronized block
data is not volatile
Thread 2
t1: synchronized (lock) { /*lock*/
t2: data = 1;    <- source statement: removes data from h(thread1)
t3: } /*unlock*/
Thread 1
s1: print (data);    <- manifest statement: accesses data when data is not in h(thread1)
Suggestion: perform synchronized (lock) to access data
s0: synchronized (lock) { /*lock*/
s1: print (data);
s2: } /*unlock*/
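The suggested change can be sketched as a small Java program in which the reading thread takes the same lock as the writer. SynchronizedAccessDemo and run are hypothetical names, and the value is returned instead of printed so that it can be checked. Which of 0 or 1 is read depends on the schedule; the point is only that both accesses are now ordered by the lock.

```java
// Illustrative sketch only: class and method names are hypothetical.
class SynchronizedAccessDemo {
    static int data;
    static final Object lock = new Object();

    static int run() {
        final int[] read = new int[1];
        data = 0;                             // ordered before both threads by Thread.start()
        Thread thread2 = new Thread(() -> {
            synchronized (lock) {             // t1: lock
                data = 1;                     // t2
            }                                 // t3: unlock
        });
        Thread thread1 = new Thread(() -> {
            synchronized (lock) {             // s0: lock (the suggested addition)
                read[0] = data;               // s1: the access formerly in a race
            }                                 // s2: unlock
        });
        thread1.start();
        thread2.start();
        try {
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return read[0];                       // 0 or 1, depending on who locked first
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```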
An Example for change another to volatile
Initialization: int x; boolean done = false; /* both non-volatile */

Thread 1
s1: x = 1;
s2: done = true;

Thread 2
t1: while (!done);
t2: assert(x == 1);

• Potential data races on both x and done.
• Should we really change both x and done to volatile?
• Can we get away with changing only one?
Change other to volatile
x and done are not volatile
Thread 1
s1: x = 1;    <- source statement: removes x from h(thread2)
s2: done = true;
Thread 2
t1: while (!done);
t2: assert(x == 1);    <- manifest statement: accesses x when x is not in h(thread2)
Suggestion: change done to volatile
JRF-E: Eliminating Data Races
• JRF is configured to produce a threshold number of counter-example paths and write them to a file
• JRF-E works on the output of JRF and analyzes the counter-example paths to generate code modification suggestions
  • For each race, it reports the intersection of the suggestions on all the relevant counter-example paths
  • For each specific code modification suggestion, it reports the frequency
JRF-E RESULT
======================================================
data race #1
jrf.hbset.util.HBDataRaceException . . .
______________________________________________________
analyze counter example
data race source statement : "putstatic" at simple/SimpleRace.java:64 : "x = 1;"
data race manifest statement : "getstatic" at simple/SimpleRace.java:74: "assert (x==1);"
Change the field "simple.SimpleRace.x from INITIALIZER" to volatile.
Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________
advice from acquiring history
NONE
======================================================
data race #2
jrf.hbset.util.HBDataRaceException . . .
______________________________________________________
analyze counter example
data race source statement : "putstatic" at simple/SimpleRace.java:65 : "done = true;"
data race manifest statement : "getstatic" at simple/SimpleRace.java:73: "while(!done) { /*spin*/ }"
Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________
advice from acquiring history
NONE
______________________________________________________
frequency of advice
[1times] Change the field "simple.SimpleRace.x from INITIALIZER" to volatile.
[2times] Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________
statistic
JRF takes 0:0:1 to find 2 equivalent races with 9 counterexample traces.
JRF-E takes 0:0:0 in 9 races analysis.

Annotations on the output:
• How did it happen? The source and manifest statements reported for each race.
• How to fix it? The "Change the field ..." suggestions.
• Feedback on a single race: the "data race #1" section; feedback on another race: the "data race #2" section.
• Feedback on all races: the "frequency of advice" section. How many times has a suggestion been made considering all the races?
JRF-E - Analyzing threshold # of races
[Figure: time in seconds (log scale, 0.1 to 1000) at Threshold=1, Threshold=10 and Threshold=100 for DisBarrier, LockFreeHashSet, OptimisticList, MCSLock*, LinearSenseBarrierVolatile, Iterator_EBDeque, Lufact, sor and WebserverSim.]
In all except MCSLock, the right suggestion was made when Threshold <= 10.
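Suggestions that worked
Columns: Program | Length | Threshold | # of Racy Fields | Change (other) to volatile / Change to atomic array / Use synchronized block
DisBarrier | 40 | 1 | 2 | 1
LockFreeHashSet | 50 | 1 | 4 | 1
OptimisticList | 42 | 1 | 3 | 1
MCSLock | 65 | 100 | 3 | 2
LinearSenseBarrier | 61 | 10 | 2 | 1
Iterator_EBDeque | 11 | 1 | 1 | 1
Lufact | 19 | 1 | 1 | 1
Sor | 44 | 1 | 2 | 1
Webserver Sim. | 68 | 1 | 2 | 1
The last value in each row falls under one of the three suggestion columns.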
Conclusion
• Even experts can benefit from tool support for detecting data races.
• JRF can also analyze synchronization idioms that do not use locking.
• JRF has become an official extension of Java Path Finder: http://babelfish.arc.nasa.gov/trac/jpf
• JRF-E makes working suggestions for most of the data races in our experiments.
• JRF-E can teach programmers the intricacies of the Java Memory Model.
Thank You
Questions?