Concurrency grab bag: JavaOne 2010

Sangjin Lee & Debashis Saha eBay Inc.

Agenda
  Introduction
  Patterns & anti-patterns
    Warm-up: “double-checked locking” on collections
    Many readers, few writers
    Many writers, few readers
  Bonus: configuring a ThreadPoolExecutor
  Closing...

2

Introduction
  The main goal is two-fold: correctness first, then performance/scalability
  Problems tend to repeat themselves: anti-patterns work as visual “crutches” for spotting bad smells

3



4

“Double-checked locking” on collection
  Initialize a collection lazily

class Unsafe {
    private Map<String,Object> map = null;

    public void useMap() {
        if (map == null) {
            initMap();
        }
        // read the map; get(), iterate, ...
    }

    private synchronized void initMap() {
        if (map == null) {
            map = new HashMap<String,Object>();
            // populate the map with initial data
        }
    }
}

5

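“Double-checked locking” on collection
  It’s worse than the real double-checked locking pattern
    The map reference is published before the map is populated, so an unsynchronized reader can see a non-null map that is still empty or half-populated
  Why would one do this?
    Delay the expensive operation of populating the data
    You don’t want to incur a penalty on reads: once the map is set up, it’s read-only
  But is laziness really necessary?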

6

“Double-checked locking” on collection
  “Eager” fix

class Safe {
    private final Map<String,Object> map;

    public Safe() {
        map = new HashMap<String,Object>();
        // populate the map with initial data
    }

    public void useMap() {
        // read the map; get(), iterate, ...
    }
}

7

“Double-checked locking” on collection
  Fix using volatile if the data is optional & large

class Safe {
    private volatile Map<String,Object> map = null;

    public void useMap() {
        if (map == null) {
            initMap();
        }
        // read the map; get(), iterate, ...
    }

    private synchronized void initMap() {
        if (map == null) {
            Map<String,Object> temp = new HashMap<String,Object>();
            // populate temp with initial data
            map = temp; // make it available after it’s ready
        }
    }
}

8





9

Many readers, few writers
  Use cases: data that changes only on demand (e.g. configuration), ...
  Implementation choices
    1.  Synchronized data structure
    2.  Concurrent collections (e.g. ConcurrentHashMap)
    3.  ReadWriteLock
    4.  “Copy-on-write”

10

Many readers, few writers
  Example: using synchronization

class Synchronized {
    private final List<String> list = new ArrayList<String>();

    // the entire iteration must be synchronized
    public synchronized void iterateOnList() {
        for (String s: list) {
            // do something with s
        }
    }

    public synchronized void add(String value) {
        list.add(value);
    }
}

11

Many readers, few writers
  Example: using ReadWriteLock

class UsingReadWriteLock {
    private final List<String> list = new ArrayList<String>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public void iterateOnList() {
        lock.readLock().lock();
        try {
            for (String s: list) {
                // do something with s
            }
        } finally {
            lock.readLock().unlock();
        }
    }

    // continued...

12

Many readers, few writers
  Example: using ReadWriteLock

    // continued
    public void add(String value) {
        lock.writeLock().lock();
        try {
            list.add(value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}

13

Many readers, few writers
  Copy-on-write
  If writes are truly few and far between, and you want reads to be as fast as possible, copy-on-write is an option
  You copy and replace the entire data on every write
  You eliminate synchronization on reads, and shift the burden to writes
  Writes usually become much more expensive
  Example: java.util.concurrent.CopyOnWriteArrayList

14

Many readers, few writers
  Example: using copy-on-write

class CopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    public void iterateOnList() { // no locking needed
        for (String s: list) {
            // do something with s
        }
    }

    public synchronized void add(String value) { // need mutual exclusion
        List<String> copy = new ArrayList<String>(list); // create a copy
        copy.add(value);
        list = copy;
    }
}

15

Many readers, few writers
  What’s wrong with this?

class BadCopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    public void iterateOnList() { // no locking needed
        for (int i = 0; i < list.size(); i++) {
            String s = list.get(i);
            // do something with s
        }
    }

    public synchronized void add(String value) { // need mutual exclusion
        List<String> copy = new ArrayList<String>(list); // create a copy
        copy.add(value);
        list = copy;
    }
}
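One plausible answer, sketched here as an assumption about what the slide is pointing at: every `list.size()` and `list.get(i)` call re-reads the volatile field, so a concurrent write can swap the list in the middle of the loop and the loop no longer works on a single snapshot. With the add-only class shown this just mixes snapshots, but as soon as a removal exists, a `size()` taken from the larger list followed by a `get(i)` on the smaller one could throw IndexOutOfBoundsException. The for-each version avoids this because it obtains one iterator from one read of the field. The class name, `totalLength()` and `remove()` below are hypothetical and not from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical repair of BadCopyOnWrite: read the volatile field once.
class SnapshotCopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    // Index-based loop that is safe because it works on one snapshot.
    public int totalLength() {
        List<String> snapshot = list; // single volatile read
        int total = 0;
        for (int i = 0; i < snapshot.size(); i++) {
            total += snapshot.get(i).length();
        }
        return total;
    }

    public synchronized void add(String value) {
        List<String> copy = new ArrayList<String>(list); // create a copy
        copy.add(value);
        list = copy;
    }

    // Assumed extra operation (not in the slides): a shrinking write is
    // what would make the unsnapshotted index loop fail outright.
    public synchronized void remove(String value) {
        List<String> copy = new ArrayList<String>(list);
        copy.remove(value);
        list = copy;
    }
}
```

The only change on the read path is the local variable; writers are untouched.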

16

Many readers, few writers
  Of course you can simply use CopyOnWriteArrayList!

class CopyOnWrite2 {
    private final List<String> list = new CopyOnWriteArrayList<String>();

    public void iterateOnList() { // no locking needed
        for (String s: list) {
            // do something with s
        }
    }

    public void add(String value) {
        list.add(value);
    }
}

17

Many readers, few writers

Type                | Concurrency                                                              | Staleness behavior
Fully synchronized  | Not concurrent                                                           | Updates CANNOT occur during read
ConcurrentHashMap   | Reads concurrent; writes can be concurrent; read-write can be concurrent | May reflect SOME updates during read
ReadWriteLock       | Reads concurrent; writes not concurrent; read-write not concurrent       | Updates CANNOT occur during read
Copy-on-write       | Reads concurrent; writes not concurrent; read-write concurrent           | Reflects NO updates during read
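The copy-on-write row can be illustrated with a small sketch. It relies on the documented snapshot semantics of CopyOnWriteArrayList iterators; the class and method names are made up for the illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not from the slides.
class SnapshotIterationDemo {
    // Counts the elements seen by an iterator created BEFORE a write.
    static int elementsSeenByOldIterator() {
        List<String> list = new CopyOnWriteArrayList<String>();
        list.add("a");
        list.add("b");
        Iterator<String> it = list.iterator(); // snapshot of the current array
        list.add("c");                         // installs a new backing array
        int seen = 0;
        while (it.hasNext()) {
            it.next();
            seen++;
        }
        return seen; // the write made after iterator creation is not visible
    }
}
```

A reader that started before the write finishes on the old data; only readers that start afterwards see the new element.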

18

Many readers, few writers
  For Maps, copy-on-write is less useful, as ConcurrentHashMap is usually good enough
  ReadWriteLock is an option, but it is less concurrent and performs worse than ConcurrentHashMap
  Copy-on-write has the best read performance

19

Many readers, few writers
  Copy-on-write: caveats
  The write performance
  The staleness behavior should be acceptable (it usually is)
  The direct reference to the underlying data that is copied should not escape the object
    Stale data
    Memory leaks
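A minimal sketch of one way to limit the escape caveat, assuming callers only need read access: hand out an unmodifiable view of the current snapshot rather than the raw list. The class and its `snapshot()` method are hypothetical. Note that this only stops callers from mutating the data; a caller that holds on to a returned view still sees stale data and keeps the old copy reachable, which is the stale-data and memory-leak risk listed above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class, not from the slides.
class NonEscapingCopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    // Read-only view of the snapshot that is current at call time.
    public List<String> snapshot() {
        return Collections.unmodifiableList(list);
    }

    public synchronized void add(String value) {
        List<String> copy = new ArrayList<String>(list); // create a copy
        copy.add(value);
        list = copy; // previously returned views keep pointing at the old copy
    }
}
```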

20

Many readers, few writers
  What should we use?
  If the (read) concurrency is low, synchronization is often good enough
  Choose concurrent collections (ConcurrentHashMap, etc.) if applicable
  Use copy-on-write if concurrent collections are not applicable and write performance is not a concern

21

Many readers, few writers
  How about copy-on-write on MULTIPLE variables?

22

Many readers, few writers
  Multi-variable example: using synchronization

class Synchronized {
    private Map<String,String> current = new HashMap<String,String>();
    private Map<String,String> previous = null;

    public synchronized void shift() {
        previous = current;
        current = new HashMap<String,String>();
    }

    public synchronized void putValue(String key, String value) {
        current.put(key, value);
    }

    public synchronized String getValue(String key) {
        return current.get(key);
    }
}

23

Many readers, few writers
  Copy-on-write on multiple variables
  Use a container class with those variables
  Do a volatile copy-and-replace with the container object

24

Many readers, few writers
  Multi-variable example: use a container class

class ShiftingWindow {
    final Map<String,String> current;
    final Map<String,String> previous;

    public ShiftingWindow(Map<String,String> c, Map<String,String> p) {
        current = c;
        previous = p;
    }
}

25

Many readers, few writers
  Multi-variable example: use a container class

class CopyOnWrite {
    private volatile ShiftingWindow window =
        new ShiftingWindow(new ConcurrentHashMap<String,String>(), null);

    public synchronized void shift() { // copy on write
        ShiftingWindow newWindow =
            new ShiftingWindow(new ConcurrentHashMap<String,String>(), window.current);
        window = newWindow;
    }

    public void putValue(String key, String value) { // no locking
        window.current.put(key, value);
    }

    public String getValue(String key) { // no locking
        return window.current.get(key);
    }
}

26





27

Many writers, few readers
  Use cases: logging, counters, statistics, ...
  Produce secondary data (e.g. URL counts) from primary operations (serving URLs)
  Many writers: all servlet threads will update the data frequently
  Few readers: the data will be read on demand (reporting) or periodically
  Impact on the primary operations must be minimized

28

Many writers, few readers
  Implementation choices
    1.  Synchronized data structure
    2.  ConcurrentHashMap (for a map or set)
    3.  Asynchronous (background) processor

29

Many writers, few readers
  1.  Synchronized data structure
      Not recommended
      Can induce a hotly contended lock under a high level of concurrency, and turn into a scalability hot spot
  2.  ConcurrentHashMap
      Normally the best solution
      Scales well under a high level of concurrency
  3.  Asynchronous (background) processor
      Useful pattern if ConcurrentHashMap is not an option or write operations are serial in nature

30

Many writers, few readers
  Synchronized data structure

class SynchronizedCounter {
    private final Map<String,Integer> map = new HashMap<String,Integer>();

    public synchronized void addCount(String page) {
        Integer value = map.get(page);
        value = (value == null) ? 1 : value+1;
        map.put(page, value);
    }

    public synchronized int getCount(String page) {
        Integer value = map.get(page);
        return (value == null) ? 0 : value;
    }
}

31

Many writers, few readers
  ConcurrentHashMap

class ConcurrentHashMapCounter {
    private final ConcurrentMap<String,AtomicInteger> map =
        new ConcurrentHashMap<String,AtomicInteger>();

    public void addCount(String page) {
        AtomicInteger value = map.get(page);
        if (value == null) {
            value = new AtomicInteger(0);
            AtomicInteger old = map.putIfAbsent(page, value);
            if (old != null) {
                value = old;
            }
        }
        value.incrementAndGet();
    }

    // continued...

32

Many writers, few readers
  ConcurrentHashMap

    // continued
    public int getCount(String page) {
        AtomicInteger value = map.get(page);
        return (value == null) ? 0 : value.get();
    }
}

33

Many writers, few readers
  Asynchronous (background) processor
  A single background processor thread owns the data
  Primary threads produce tasks for the background processor
  Writes and reads are actually done on the background processor thread

34

Many writers, few readers
  Asynchronous (background) processor: benefits
  Latency on the primary threads is minimized
  Contention is greatly reduced: can yield much better throughput than synchronization
  Trivially thread safe: exploits safety via thread confinement
  Example: logging to disk/console

35

Many writers, few readers
  Asynchronous (background) processor: caveats
  The data structure should not escape the background thread
  The actual tasks should be thread-agnostic
  Performs poorly against a more concurrent solution
  Code becomes a bit more complicated
  You need to manage saturation: tasks may be produced faster than they can be handled by the processor
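The saturation caveat can be sketched as follows. This is one possible approach, not the one used in the slides: `Executors.newSingleThreadExecutor()` uses an unbounded queue, so the sketch swaps in a single-threaded ThreadPoolExecutor with a bounded queue and reports rejection to the caller. The class name, `offer()` and the choice to return `false` on saturation are assumptions made for the illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical class, not from the slides.
class BoundedBackgroundProcessor {
    private final ThreadPoolExecutor executor;

    BoundedBackgroundProcessor(int queueCapacity) {
        // One worker thread, at most queueCapacity pending tasks.
        executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(queueCapacity),
            new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns false instead of blocking the primary thread when saturated.
    boolean offer(Runnable task) {
        try {
            executor.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false; // caller may drop, count, or retry the task
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

Dropping work is acceptable for statistics-style data; for data that must not be lost, a blocking or caller-runs policy would be the alternative, at the cost of latency on the primary threads.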

36


class BackgroundCounter {
    // background thread
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    // map is exclusively used by the executor thread
    private final Map<String,Integer> map = new HashMap<String,Integer>();

    public void addCount(String page) {
        executor.execute(new AddTask(page));
    }

    public int getCount(String page) {
        Future<Integer> future = executor.submit(new GetTask(page));
        return future.get(); // exception handling omitted
    }
    // continued...

37


    // continued
    private class AddTask implements Runnable {
        private final String page;

        AddTask(String page) {
            this.page = page;
        }

        public void run() {
            Integer value = map.get(page);
            value = (value == null) ? 1 : value+1;
            map.put(page, value);
        }
    }
    // continued...

38


    // continued
    private class GetTask implements Callable<Integer> {
        private final String page;

        GetTask(String page) {
            this.page = page;
        }

        public Integer call() {
            Integer value = map.get(page);
            return (value == null) ? 0 : value;
        }
    }
}

39





40

Configuring a ThreadPoolExecutor
  The right configuration for your use case and demand is extremely important
  Badly configured ThreadPoolExecutors cause exceptions and performance issues
    RejectedExecutionExceptions, anyone?

41

Configuring a ThreadPoolExecutor
  Simple rules for ThreadPoolExecutor behavior
  When a task is submitted:
    1.  If the core size has not been reached, a new thread is always created
    2.  If the core size is reached, the task is queued
    3.  If the core size is reached and the queue becomes full, a new thread is created until the max size is reached
    4.  If the max size is reached and the queue is full, the rejected execution policy kicks in
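The four rules can be observed directly with a deliberately tiny pool. The sketch below assumes core size 1, max size 2, a queue capacity of 1 and the default abort policy; the class name and the blocking task are made up for the illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not from the slides.
class PoolGrowthDemo {
    // Returns {pool size after task 1, queue size after task 2,
    //          pool size after task 3, 1 if task 4 was rejected else 0}.
    static int[] run() {
        final CountDownLatch gate = new CountDownLatch(1);
        Runnable blocker = new Runnable() {
            public void run() {
                try {
                    gate.await(); // keep the worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        int[] observed = new int[4];
        try {
            pool.execute(blocker);                // rule 1: core thread created
            observed[0] = pool.getPoolSize();
            pool.execute(blocker);                // rule 2: core reached, task queued
            observed[1] = pool.getQueue().size();
            pool.execute(blocker);                // rule 3: queue full, extra thread
            observed[2] = pool.getPoolSize();
            try {
                pool.execute(blocker);            // rule 4: max reached, queue full
            } catch (RejectedExecutionException e) {
                observed[3] = 1;
            }
        } finally {
            gate.countDown();
            pool.shutdown();
        }
        return observed;
    }
}
```

Note the order: the second thread appears only after the queue has filled up, which is why an unbounded queue effectively pins the pool at its core size.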

42

Configuring a ThreadPoolExecutor
  Importance of core size
  ThreadPoolExecutor changes behavior dramatically around the core size
    Below core size, threads are always created even if there are idle threads
    Above core size, the preferred behavior shifts to queuing
  Core size should be big enough to accommodate the anticipated average task throughput demand

43

Configuring a ThreadPoolExecutor
  Thread pool size and queue size are competing parameters
  Queuing increases latency but conserves resources
    A queued task in general consumes fewer resources than an active task

  Reduce latency:      larger pool size, smaller queue
  Conserve resources:  smaller pool size, larger queue

44

Closing...
  Power of static analysis
  Whenever we find an issue, we try to turn it into a static analysis rule
  FindBugs already has many useful thread-safety rules
  Intent is the most difficult part of thread-safety analysis: annotations help
  Continued training helps as well

45

Closing...

46

Thank you!   Questions?

47

TPE: Cancelling tasks
  Cancelling tasks: more complicated than you think
  Cancelling tasks is your job
    Timing out from Future.get() does NOT cancel the task by itself
    Some TPE methods cancel outstanding tasks for you: invokeAll() with timeout, invokeAny()
  Cancelling tasks uses interruption: you should write your task to respond to cancellation promptly (i.e. “interruptible”)
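A minimal sketch of the points above, under the assumption that the task's long-running work is an interruptible blocking call (here simulated with `Thread.sleep`): the caller times out on `Future.get()`, then cancels explicitly, and the task reacts to the interrupt. The class name and the latches are made up for the illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not from the slides.
class CancelOnTimeoutDemo {
    // Returns true if the task saw the interrupt after being cancelled.
    static boolean run() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch interrupted = new CountDownLatch(1);
        Future<?> future = executor.submit(new Runnable() {
            public void run() {
                started.countDown();
                try {
                    Thread.sleep(60000L); // stands in for long, interruptible work
                } catch (InterruptedException e) {
                    interrupted.countDown();            // respond promptly
                    Thread.currentThread().interrupt(); // preserve the status
                }
            }
        });
        try {
            started.await();
            try {
                future.get(100L, TimeUnit.MILLISECONDS);
                return false; // the task is not expected to finish this fast
            } catch (TimeoutException e) {
                // The timeout alone leaves the task running: cancel it.
                future.cancel(true);
            }
            return interrupted.await(5L, TimeUnit.SECONDS) && future.isCancelled();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A task that swallows the interrupt or loops without checking it would keep running after `cancel(true)`, which is why the task itself has to be written to be interruptible.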

48

TPE & UncaughtExceptionHandler
  UncaughtExceptionHandler doesn’t mix with ThreadPoolExecutor

49

TPE & UncaughtExceptionHandler
  Multi-threaded test with vanilla thread

class TestWithThreads extends TestCase {
    @Test
    public void test() {
        MyHandler h = new MyHandler();
        Thread th = new Thread(someRunnable);
        th.setUncaughtExceptionHandler(h);
        th.start();
        th.join();
        // check MyHandler for any exception on thread th
    }

    private static class MyHandler implements UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            // store the exception
        }
    }
}

50

TPE & UncaughtExceptionHandler
  Multi-threaded test with TPE stops working: why?

class BrokenTestWithExecutor extends TestCase {
    private ExecutorService executor = Executors.newSingleThreadExecutor();

    @Test
    public void test() {
        MyHandler h = new MyHandler();
        Thread.setDefaultUncaughtExceptionHandler(h);
        executor.submit(someRunnable).get();
        // check MyHandler for any exception on the executor thread
    }

    private static class MyHandler implements UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            // store the exception
        }
    }
}

51

TPE & UncaughtExceptionHandler
  Remember what UncaughtExceptionHandlers are for!
  UncaughtExceptionHandlers are invoked only if the thread is being terminated due to an uncaught exception
  Some (not all) TPE methods catch and handle all exceptions
  ThreadPoolExecutor
    execute(): triggers UncaughtExceptionHandlers
    submit(): does not trigger them
  ScheduledThreadPoolExecutor: does not trigger them
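The difference between execute() and submit() can be checked with a small sketch. It assumes the handler is installed per thread through a ThreadFactory (rather than as the default handler used in the broken test); the class name and the short wait in the submit() branch are choices made for the illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not from the slides.
class HandlerDemo {
    static final Runnable FAILING = new Runnable() {
        public void run() {
            throw new IllegalStateException("boom");
        }
    };

    // Returns true if the UncaughtExceptionHandler was invoked.
    static boolean handlerInvoked(boolean useSubmit) {
        final CountDownLatch handled = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor(new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r);
                t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
                    public void uncaughtException(Thread th, Throwable e) {
                        handled.countDown();
                    }
                });
                return t;
            }
        });
        try {
            if (useSubmit) {
                try {
                    executor.submit(FAILING).get();
                } catch (ExecutionException expected) {
                    // the exception was captured by the Future instead
                }
                // Give a handler call a brief chance to show up; none is expected.
                return handled.await(200L, TimeUnit.MILLISECONDS);
            }
            executor.execute(FAILING); // exception terminates the worker thread
            return handled.await(5L, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdown();
        }
    }
}
```

With submit(), the failure is only visible to whoever calls `get()` on the returned Future; if nobody does, it is silently lost.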

52

TPE & UncaughtExceptionHandler
  Simply don’t rely on UncaughtExceptionHandlers with TPE
  Using Future and ExecutionException is the right way with TPE

53

TPE & UncaughtExceptionHandler
  Multi-threaded test with TPE: correct

class CorrectTestWithExecutor extends TestCase {
    private ExecutorService executor = Executors.newSingleThreadExecutor();

    @Test
    public void test() {
        try {
            executor.submit(someRunnable).get();
        } catch (ExecutionException e) {
            // its cause is the original exception
            Throwable cause = e.getCause();
            // assert failure
        } catch (InterruptedException e2) {
            ...
        }
    }
}

54
