Sangjin Lee & Debashis Saha eBay Inc.
Agenda
Introduction
Patterns & anti-patterns
  Warm-up: “double-checked locking” on collections
  Many readers, few writers
  Many writers, few readers
Bonus: configuring a ThreadPoolExecutor
Closing...
Introduction
The main goal is two-fold: correctness first, and performance/scalability next
Problems tend to repeat themselves: anti-patterns work as visual “crutches” for spotting bad smells
“Double-checked locking” on collection
Initialize a collection lazily

class Unsafe {
    // not volatile, and published before it is populated
    private Map<String,Object> map = null;

    public void useMap() {
        if (map == null) { // unsynchronized check
            initMap();
        }
        // read the map; get(), iterate, ...
    }

    private synchronized void initMap() {
        if (map == null) {
            map = new HashMap<String,Object>();
            // populate the map with initial data
        }
    }
}
“Double-checked locking” on collection
It’s worse than the real double-checked locking pattern: the map reference is published before the map is populated, so an unsynchronized reader can see a partially filled HashMap
Why would one do this?
  Delay the expensive operation of populating the data
  You don’t want to incur a penalty on reads: once the map is set up, it’s read-only
But is laziness really necessary?
“Double-checked locking” on collection
“Eager” fix

class Safe {
    private final Map<String,Object> map;

    public Safe() {
        map = new HashMap<String,Object>();
        // populate the map with initial data
    }

    public void useMap() {
        // read the map; get(), iterate, ...
    }
}
“Double-checked locking” on collection
Fix using volatile if the data is optional & large

class Safe {
    private volatile Map<String,Object> map = null;

    public void useMap() {
        if (map == null) {
            initMap();
        }
        // read the map; get(), iterate, ...
    }

    private synchronized void initMap() {
        if (map == null) {
            Map<String,Object> temp = new HashMap<String,Object>();
            // populate temp with initial data
            map = temp; // make it available after it’s ready
        }
    }
}
Many readers, few writers
Use cases: data that changes only on demand (e.g. configuration), ...
Implementation choices
1. Synchronized data structure
2. Concurrent collections (e.g. ConcurrentHashMap)
3. ReadWriteLock
4. “Copy-on-write”
Many readers, few writers
Example: using synchronization

class Synchronized {
    private final List<String> list = new ArrayList<String>();

    // the entire iteration must be synchronized
    public synchronized void iterateOnList() {
        for (String s: list) {
            // do something with s
        }
    }

    public synchronized void add(String value) {
        list.add(value);
    }
}
Many readers, few writers
Example: using ReadWriteLock

class UsingReadWriteLock {
    private final List<String> list = new ArrayList<String>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public void iterateOnList() {
        lock.readLock().lock();
        try {
            for (String s: list) {
                // do something with s
            }
        } finally {
            lock.readLock().unlock();
        }
    }

    public void add(String value) {
        lock.writeLock().lock();
        try {
            list.add(value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
Many readers, few writers
Copy-on-write
If writes are truly few and far between, and you want reads to be as fast as possible, copy-on-write is an option
  You copy and replace the entire data on every write
  You eliminate synchronization on reads, and shift the burden to writes
  Writes usually become much more expensive
Example: java.util.concurrent.CopyOnWriteArrayList
Many readers, few writers
Example: using copy-on-write

class CopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    public void iterateOnList() { // no locking needed
        for (String s: list) {
            // do something with s
        }
    }

    public synchronized void add(String value) { // need mutual exclusion
        List<String> copy = new ArrayList<String>(list); // create a copy
        copy.add(value);
        list = copy;
    }
}
Many readers, few writers
What’s wrong with this?

class BadCopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    public void iterateOnList() { // no locking needed
        for (int i = 0; i < list.size(); i++) {
            String s = list.get(i);
            // do something with s
        }
    }

    public synchronized void add(String value) { // need mutual exclusion
        List<String> copy = new ArrayList<String>(list); // create a copy
        copy.add(value);
        list = copy;
    }
}
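One reading of the problem, sketched under assumptions: every `list.size()` and `list.get(i)` in the loop is a separate read of the volatile field, so a writer can swap in a new copy mid-loop and a single iteration ends up spanning two different lists. Reading the field once into a local variable gives the whole loop one snapshot. The class name `SnapshotCopyOnWrite` and the returned element count are illustrative and not from the slides.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotCopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    // Returns how many elements this iteration saw (illustrative only).
    public int iterateOnList() {
        List<String> snapshot = list; // single volatile read
        int seen = 0;
        for (int i = 0; i < snapshot.size(); i++) {
            String s = snapshot.get(i); // always indexes the same list
            // do something with s
            seen++;
        }
        return seen;
    }

    public synchronized void add(String value) { // need mutual exclusion
        List<String> copy = new ArrayList<String>(list); // create a copy
        copy.add(value);
        list = copy; // publish the new snapshot
    }
}
```

The for-each version on the previous slide gets the same effect implicitly, because the iterator is obtained from the field exactly once.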
Many readers, few writers
Of course you can simply use CopyOnWriteArrayList!

class CopyOnWrite2 {
    private final List<String> list = new CopyOnWriteArrayList<String>();

    public void iterateOnList() { // no locking needed
        for (String s: list) {
            // do something with s
        }
    }

    public void add(String value) {
        list.add(value);
    }
}
Many readers, few writers
Type | Concurrency | Staleness behavior
Fully synchronized | Not concurrent | Updates CANNOT occur during read
ConcurrentHashMap | Reads concurrent; writes can be concurrent; read-write can be concurrent | May reflect SOME updates during read
ReadWriteLock | Reads concurrent; writes not concurrent; read-write not concurrent | Updates CANNOT occur during read
Copy-on-write | Reads concurrent; writes not concurrent; read-write concurrent | Reflects NO updates during read
Many readers, few writers
For Maps, copy-on-write is less useful, as ConcurrentHashMap is usually good enough
ReadWriteLock is an option, but it is less concurrent than ConcurrentHashMap and performs worse
Copy-on-write has the best read performance
Many readers, few writers
Copy-on-write: caveats
  The write performance
  The staleness behavior should be acceptable (it usually is)
  The direct reference to the underlying data that is copied should not escape the object
    Stale data
    Memory leaks
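A minimal sketch of one way to keep the underlying data from escaping, assuming callers only need read access: each write publishes an unmodifiable wrapper, so whatever reference a caller obtains cannot be used to mutate shared state. The class name `NonEscapingCopyOnWrite` and the `snapshot()` method are hypothetical. A caller that holds on to a snapshot still sees stale data and keeps that copy reachable, which is the staleness and memory-leak concern from the caveats above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NonEscapingCopyOnWrite {
    // Every published value is an unmodifiable list.
    private volatile List<String> list = Collections.emptyList();

    // Hands out the current read-only snapshot; later writes are not reflected in it.
    public List<String> snapshot() {
        return list;
    }

    public synchronized void add(String value) {
        List<String> copy = new ArrayList<String>(list); // mutable copy stays private
        copy.add(value);
        list = Collections.unmodifiableList(copy);       // only the read-only view escapes
    }
}
```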
Many readers, few writers
What should we use?
If the (read) concurrency is low, synchronization is often good enough
Choose concurrent collections (ConcurrentHashMap, etc.) if applicable
Use copy-on-write if concurrent collections are not applicable and write performance is not a concern
Many readers, few writers
How about copy-on-write on MULTIPLE variables?
Many readers, few writers
Multi-variable example: using synchronization

class Synchronized {
    private Map<String,String> current = new HashMap<String,String>();
    private Map<String,String> previous = null;

    public synchronized void shift() {
        previous = current;
        current = new HashMap<String,String>();
    }

    public synchronized void putValue(String key, String value) {
        current.put(key, value);
    }

    public synchronized String getValue(String key) {
        return current.get(key);
    }
}
Many readers, few writers
Copy-on-write on multiple variables
  Use a container class with those variables
  Do a volatile copy-and-replace with the container object
Many readers, few writers
Multi-variable example: use a container class

class ShiftingWindow {
    final Map<String,String> current;
    final Map<String,String> previous;

    public ShiftingWindow(Map<String,String> c, Map<String,String> p) {
        current = c;
        previous = p;
    }
}
Many readers, few writers
Multi-variable example: use a container class

class CopyOnWrite {
    private volatile ShiftingWindow window =
        new ShiftingWindow(new ConcurrentHashMap<String,String>(), null);

    public synchronized void shift() { // copy on write
        ShiftingWindow newWindow =
            new ShiftingWindow(new ConcurrentHashMap<String,String>(), window.current);
        window = newWindow;
    }

    public void putValue(String key, String value) { // no locking
        window.current.put(key, value);
    }

    public String getValue(String key) { // no locking
        return window.current.get(key);
    }
}
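The container only helps if a reader that needs both variables takes them from the same container object. A minimal sketch, assuming a reader that falls back to the previous map when a key is missing from the current one: the names `Window`, `ConsistentWindowReads`, and `getRecentValue` are hypothetical and not from the slides.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class Window {
    final Map<String,String> current;
    final Map<String,String> previous; // null until the first shift

    Window(Map<String,String> c, Map<String,String> p) {
        current = c;
        previous = p;
    }
}

class ConsistentWindowReads {
    private volatile Window window =
        new Window(new ConcurrentHashMap<String,String>(), null);

    public synchronized void shift() { // copy on write
        window = new Window(new ConcurrentHashMap<String,String>(), window.current);
    }

    public void putValue(String key, String value) { // no locking
        window.current.put(key, value);
    }

    // Looks in current, then previous, using ONE snapshot of the container.
    public String getRecentValue(String key) {
        Window w = window; // single volatile read
        String value = w.current.get(key);
        if (value == null && w.previous != null) {
            value = w.previous.get(key);
        }
        return value;
    }
}
```

Reading `window` twice instead could pair the current map of one window with the previous map of another if `shift()` runs in between. Note also that a `putValue` racing with `shift()` may land in the map that has just become `previous`; whether that is acceptable depends on the use case.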
Many writers, few readers
Use cases: logging, counters, statistics, ...
Produce secondary data (e.g. URL counts) from primary operations (serving URLs)
Many writers: all servlet threads will update the data frequently
Few readers: the data will be read on demand (reporting) or periodically
Impact on the primary operations must be minimized
Many writers, few readers
Implementation choices
1. Synchronized data structure
2. ConcurrentHashMap (for a map or set)
3. Asynchronous (background) processor
Many writers, few readers
1. Synchronized data structure
  Not recommended
  Can induce a hotly contended lock under a high level of concurrency, and turn into a scalability hot spot
2. ConcurrentHashMap
  Normally the best solution
  Scales well under a high level of concurrency
3. Asynchronous (background) processor
  Useful pattern if ConcurrentHashMap is not an option or write operations are serial in nature
Many writers, few readers
Synchronized data structure

class SynchronizedCounter {
    private final Map<String,Integer> map = new HashMap<String,Integer>();

    public synchronized void addCount(String page) {
        Integer value = map.get(page);
        value = (value == null) ? 1 : value+1;
        map.put(page, value);
    }

    public synchronized int getCount(String page) {
        Integer value = map.get(page);
        return (value == null) ? 0 : value;
    }
}
Many writers, few readers
ConcurrentHashMap

class ConcurrentHashMapCounter {
    private final ConcurrentMap<String,AtomicInteger> map =
        new ConcurrentHashMap<String,AtomicInteger>();

    public void addCount(String page) {
        AtomicInteger value = map.get(page);
        if (value == null) {
            value = new AtomicInteger(0);
            AtomicInteger old = map.putIfAbsent(page, value);
            if (old != null) { // another thread won the race; use its counter
                value = old;
            }
        }
        value.incrementAndGet();
    }

    public int getCount(String page) {
        AtomicInteger value = map.get(page);
        return (value == null) ? 0 : value.get();
    }
}
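On Java 8 or later (an assumption; the slides use the older `putIfAbsent` idiom), the same counter can be sketched more compactly with `computeIfAbsent` and `LongAdder`, which is designed for counters that are updated far more often than they are read. The class name `LongAdderCounter` is illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

class LongAdderCounter {
    private final ConcurrentMap<String,LongAdder> map =
        new ConcurrentHashMap<String,LongAdder>();

    public void addCount(String page) {
        // Creates the adder atomically on first use, then increments it.
        map.computeIfAbsent(page, k -> new LongAdder()).increment();
    }

    public long getCount(String page) {
        LongAdder adder = map.get(page);
        // sum() is not an atomic snapshot under concurrent updates,
        // which is usually acceptable for statistics.
        return (adder == null) ? 0L : adder.sum();
    }
}
```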
Many writers, few readers
Asynchronous (background) processor
  A single background processor thread owns the data
  Primary threads produce tasks for the background processor
  Writes and reads are actually done on the background processor thread
Many writers, few readers
Asynchronous (background) processor: benefits
  Latency on the primary threads is minimized
  Contention is greatly reduced: can yield much better throughput than synchronization
  Trivially thread safe: exploits safety via thread confinement
Example: logging to disk/console
Many writers, few readers
Asynchronous (background) processor: caveats
  The data structure should not escape the background thread
  The actual tasks should be thread-agnostic
  Performs worse than a more concurrent solution
  Code becomes a bit more complicated
  You need to manage saturation: tasks may be produced faster than they can be handled by the processor
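One possible way to manage saturation, sketched under assumptions: give the single background thread a bounded queue and decide explicitly what happens when it fills up. Here overflowing updates are dropped and counted, which may be acceptable for statistics but not for every use case. The class name `BoundedBackgroundCounter`, the capacity parameter, and the drop-and-count policy are illustrative choices, not from the slides.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class BoundedBackgroundCounter {
    private final AtomicLong dropped = new AtomicLong();
    // map is exclusively used by the single executor thread
    private final Map<String,Integer> map = new HashMap<String,Integer>();
    private final ThreadPoolExecutor executor;

    BoundedBackgroundCounter(int queueCapacity) {
        // Drop-and-count instead of throwing RejectedExecutionException.
        RejectedExecutionHandler countDrops = new RejectedExecutionHandler() {
            public void rejectedExecution(Runnable task, ThreadPoolExecutor e) {
                dropped.incrementAndGet();
            }
        };
        // One thread (core == max == 1) with a bounded queue.
        executor = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(queueCapacity), countDrops);
    }

    public void addCount(final String page) {
        executor.execute(new Runnable() {
            public void run() {
                Integer value = map.get(page);
                map.put(page, (value == null) ? 1 : value + 1);
            }
        });
    }

    public long droppedCount() {
        return dropped.get();
    }

    public void shutdown() {
        executor.shutdown();
    }
}
```

`ThreadPoolExecutor.CallerRunsPolicy` is deliberately not used in this sketch: it would run the task on the producing thread and break the thread confinement of `map`. A silent-drop handler also means a `Future` returned by `submit()` for a rejected task would never complete, so reads need their own handling.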
Many writers, few readers
Asynchronous (background) processor

class BackgroundCounter {
    // background thread
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    // map is exclusively used by the executor thread
    private final Map<String,Integer> map = new HashMap<String,Integer>();

    public void addCount(String page) {
        executor.execute(new AddTask(page));
    }

    public int getCount(String page) throws InterruptedException, ExecutionException {
        Future<Integer> future = executor.submit(new GetTask(page));
        return future.get(); // blocks until the background thread has run the task
    }

    private class AddTask implements Runnable {
        private final String page;

        AddTask(String page) {
            this.page = page;
        }

        public void run() {
            Integer value = map.get(page);
            value = (value == null) ? 1 : value+1;
            map.put(page, value);
        }
    }

    private class GetTask implements Callable<Integer> {
        private final String page;

        GetTask(String page) {
            this.page = page;
        }

        public Integer call() {
            Integer value = map.get(page);
            return (value == null) ? 0 : value;
        }
    }
}
Configuring a ThreadPoolExecutor
The right configuration, one that fits your use case and demand, is extremely important
Badly configured ThreadPoolExecutors cause exceptions and performance issues
  RejectedExecutionException, anyone?
Configuring a ThreadPoolExecutor
Simple rules for ThreadPoolExecutor behavior
When a task is submitted:
1. If the core size has not been reached, a new thread is always created
2. If the core size is reached, the task is queued
3. If the core size is reached and the queue becomes full, a new thread is created until the max size is reached
4. If the max size is reached and the queue is full, the rejected execution policy kicks in
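The four rules can be observed directly with a deliberately tiny pool. This is a sketch with assumed sizes (core 1, max 2, queue capacity 1) and the default rejection policy; the class name `PoolRulesDemo` is illustrative. Every task blocks on a latch, so no thread becomes free while the tasks are being submitted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolRulesDemo {
    // Returns {pool size after task 1, after task 2, after task 3,
    //          1 if task 4 was rejected, else 0}.
    static int[] run() {
        final CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = new Runnable() {
            public void run() {
                try {
                    release.await(); // keep the worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        int[] observed = new int[4];
        try {
            pool.execute(blocker);             // rule 1: below core size, new thread
            observed[0] = pool.getPoolSize();
            pool.execute(blocker);             // rule 2: core reached, task is queued
            observed[1] = pool.getPoolSize();
            pool.execute(blocker);             // rule 3: queue full, thread up to max
            observed[2] = pool.getPoolSize();
            try {
                pool.execute(blocker);         // rule 4: max reached and queue full
            } catch (RejectedExecutionException e) {
                observed[3] = 1;
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return observed;
    }
}
```

With these sizes the pool grows to 1, stays at 1 while the second task is queued, grows to 2 for the third task, and rejects the fourth.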
Configuring a ThreadPoolExecutor
Importance of core size
ThreadPoolExecutor changes behavior dramatically around the core size
Below core size, threads are always created even if there are idle threads
Above core size, the preferred behavior shifts to queuing
Core size should be big enough to accommodate the anticipated average task throughput demand
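The "threads are always created even if there are idle threads" point can be checked with a small sketch, assuming a core size of 2 and an unbounded queue (the class name `CoreSizeDemo` is illustrative). The first task is allowed to finish before the second is submitted, so its thread has nothing left to do, yet the pool still starts a second thread.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreSizeDemo {
    // Returns {pool size after the first task finished,
    //          pool size right after the second submission}.
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        Runnable noop = new Runnable() {
            public void run() { }
        };
        int[] observed = new int[2];
        try {
            pool.submit(noop).get();          // wait until the first task is done
            observed[0] = pool.getPoolSize(); // one thread, with no work left
            pool.execute(noop);               // still below core size: a new thread is started
            observed[1] = pool.getPoolSize();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return observed;
    }
}
```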
Configuring a ThreadPoolExecutor
Thread pool size and queue size are competing parameters
  Queuing increases latency but conserves resources
  A queued task in general consumes fewer resources than an active task

Goal | Pool size | Queue size
Reduce latency | Larger | Smaller
Conserve resources | Smaller | Larger
Closing...
Power of static analysis
  Whenever we find an issue, we try to turn it into a static analysis rule
  FindBugs already has many useful thread-safety rules
  Intent is the most difficult part of thread-safety analysis: annotations help
Continued training helps as well
Thank you! Questions?
TPE: Cancelling tasks
Cancelling tasks: more complicated than you think
  Cancelling tasks is your job
  Timing out from Future.get() does NOT cancel the task by itself
  Some TPE methods cancel outstanding tasks for you: invokeAll() with timeout, invokeAny()
Cancelling tasks uses interruption: you should write your task to respond to cancellation promptly (i.e. “interruptible”)
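A minimal sketch of the pattern these points describe, under assumed timings: the caller times out on `Future.get()`, then cancels explicitly with `cancel(true)`, and the task cooperates by reacting to the interrupt. The class name `CancelOnTimeoutDemo`, the 100 ms timeout, and the `Thread.sleep(10)` stand-in for real work are illustrative.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CancelOnTimeoutDemo {
    // Returns true if the task noticed the interruption and stopped.
    static boolean run() {
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch stopped = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> future = executor.submit(new Runnable() {
                public void run() {
                    started.countDown();
                    try {
                        while (true) {
                            Thread.sleep(10); // stands in for a unit of interruptible work
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // restore the flag, then exit
                    } finally {
                        stopped.countDown();
                    }
                }
            });
            started.await();
            try {
                future.get(100, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The timeout alone leaves the task running; cancel it explicitly.
                future.cancel(true); // true = interrupt the thread running the task
            }
            return stopped.await(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A task that swallows `InterruptedException` and keeps looping, or that never calls a blocking method or checks the interrupt flag, would ignore `cancel(true)` and keep occupying its thread.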
TPE & UncaughtExceptionHandler
UncaughtExceptionHandler doesn’t mix with ThreadPoolExecutor
TPE & UncaughtExceptionHandler
Multi-threaded test with vanilla thread

class TestWithThreads extends TestCase {
    @Test
    public void test() throws Exception { // join() can throw InterruptedException
        MyHandler h = new MyHandler();
        Thread th = new Thread(someRunnable);
        th.setUncaughtExceptionHandler(h);
        th.start();
        th.join();
        // check MyHandler for any exception on thread th
    }

    private static class MyHandler implements UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            // store the exception
        }
    }
}
TPE & UncaughtExceptionHandler
Multi-threaded test with TPE stops working: why?

class BrokenTestWithExecutor extends TestCase {
    private ExecutorService executor = Executors.newSingleThreadExecutor();

    @Test
    public void test() throws Exception {
        MyHandler h = new MyHandler();
        Thread.setDefaultUncaughtExceptionHandler(h);
        executor.submit(someRunnable).get();
        // check MyHandler for any exception on the executor thread
    }

    private static class MyHandler implements UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            // store the exception
        }
    }
}
TPE & UncaughtExceptionHandler
Remember what UncaughtExceptionHandlers are for!
UncaughtExceptionHandlers are invoked only if the thread is being terminated due to an uncaught exception
Some (not all) TPE methods catch and handle all exceptions
  ThreadPoolExecutor
    execute(): triggers UncaughtExceptionHandlers
    submit(): does not trigger them
  ScheduledThreadPoolExecutor: does not trigger them
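The difference between `execute()` and `submit()` can be seen with a sketch like the following, which installs a handler through a `ThreadFactory` and runs the same failing task both ways. The class name `HandlerDemo` and the counting handler are illustrative, not from the slides.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HandlerDemo {
    // Returns {handler calls after execute(), handler calls after submit(),
    //          1 if the Future reported the original exception, else 0}.
    static int[] run() {
        final AtomicInteger handlerCalls = new AtomicInteger();
        final CountDownLatch handled = new CountDownLatch(1);
        ThreadFactory factory = new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r);
                t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
                    public void uncaughtException(Thread th, Throwable e) {
                        handlerCalls.incrementAndGet();
                        handled.countDown();
                    }
                });
                return t;
            }
        };
        ExecutorService executor = Executors.newSingleThreadExecutor(factory);
        Runnable failing = new Runnable() {
            public void run() {
                throw new IllegalStateException("boom");
            }
        };
        int[] observed = new int[3];
        try {
            executor.execute(failing);           // exception kills the worker thread
            handled.await(5, TimeUnit.SECONDS);
            observed[0] = handlerCalls.get();
            try {
                executor.submit(failing).get();  // exception is captured in the Future
            } catch (ExecutionException e) {
                observed[2] = (e.getCause() instanceof IllegalStateException) ? 1 : 0;
            }
            observed[1] = handlerCalls.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return observed;
    }
}
```

The handler fires once for `execute()` and not again for `submit()`, where the failure surfaces only as an `ExecutionException` from `Future.get()`.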
TPE & UncaughtExceptionHandler
Simply don’t rely on UncaughtExceptionHandlers with TPE
Using Future and ExecutionException is the right way with TPE
TPE & UncaughtExceptionHandler
Multi-threaded test with TPE: correct

class CorrectTestWithExecutor extends TestCase {
    private ExecutorService executor = Executors.newSingleThreadExecutor();

    @Test
    public void test() {
        try {
            executor.submit(someRunnable).get();
        } catch (ExecutionException e) {
            // its cause is the original exception
            Throwable cause = e.getCause();
            // assert failure
        } catch (InterruptedException e2) {
            ...
        }
    }
}