Speculative Locking: Breaking the Scale Barrier (JAOO 2005)

Post on 10-May-2015

DESCRIPTION

This is a 2005 presentation on using transactional memory to run synchronized blocks in parallel while preserving their semantics. The measurements were taken on Azul's Vega hardware, the first commercial hardware to ship with hardware transactional memory (HTM) support. Many lessons have been learned since then, but it remains a good point-in-time reference, and with Intel x86 now supporting similar HTM capabilities, we're sure to see this subject revived.

Transcript

© 2005 Azul Systems, Inc. | Confidential

Speculative Locking: Breaking the Scale Barrier

Gil Tene, VP Technology, CTO

Ivan Posva, Senior Staff Engineer

Azul Systems

www.azulsystems.com

©2005 Azul Systems, Inc. Strictly Confidential. Do not distribute or share any of this information without specific approval from Azul Systems.

New JVM capabilities improve multi-threaded application scalability.

How can this affect the way you code?

Speculative locking reduces effects of Amdahl's law

Multi-threaded Java Apps can Scale

Agenda

Why do we care?

Lock contention vs. Data contention

Transactional synchronized {…}

Measurements

Effects on how you code

Summary

Multithreaded everywhere

Why do we care?

• Java™ applications are naturally multi-threaded
  ─ Thread pools, work queues, shared Collections

• Multi-core CPUs from all major vendors
  ─ 2 or more cores per chip
  ─ 2 or more threads per core
  ─ A commodity 4-chip server will soon have 16 threads
  ─ Heavily multicore/multithreaded chips are here

• Amdahl’s law affects everyone
  ─ Serialized portions of a program limit scale

Serialized portions of program limit scale

Amdahl’s Law

• efficiency = 1/(N*q + (1-q))
  ─ N = # of concurrent threads
  ─ q = fraction of serialized code

Amdahl’s Law Effect on Throughput

Amdahl’s Law Example

• The theoretical limit is usually intuitive
  ─ Assume 10% serialization
  ─ At best you can do 10x the work of 1 CPU

• Efficiency drops are dramatic and may be less intuitive
  ─ Assume 10% serialization
  ─ 10 CPUs will not scale past a speedup of 5.3x (Eff. 0.53)
  ─ 16 CPUs will not scale past a speedup of 6.4x (Eff. 0.40)
  ─ 64 CPUs will not scale past a speedup of 8.8x (Eff. 0.14)
  ─ 99 CPUs will not scale past a speedup of 9.2x (Eff. 0.09)
  ─ …
  ─ It will take a whole lot of inefficient CPUs to [never] reach a 10x
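The speedup and efficiency figures on this slide can be recomputed directly from the efficiency formula on the earlier Amdahl's Law slide. The following is a minimal sketch; the class and method names (`Amdahl`, `efficiency`, `speedup`) are illustrative and not part of the original deck.

```java
// Amdahl's law as stated on the slide: efficiency = 1 / (N*q + (1 - q)).
final class Amdahl {
    /** Per-CPU efficiency for n threads when a fraction q of the work is serialized. */
    static double efficiency(int n, double q) {
        return 1.0 / (n * q + (1.0 - q));
    }

    /** Speedup over a single CPU: n CPUs, each running at the efficiency above. */
    static double speedup(int n, double q) {
        return n * efficiency(n, q);
    }

    public static void main(String[] args) {
        double q = 0.10; // 10% serialization, as assumed in the example
        for (int n : new int[] {10, 16, 64, 99}) {
            System.out.printf("%3d CPUs: speedup %.1fx, efficiency %.2f%n",
                    n, speedup(n, q), efficiency(n, q));
        }
    }
}
```

Because the speedup equals 1/(q + (1-q)/N), it approaches 1/q as N grows but never reaches it, which is the point of the last bullet.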

Lock Contention vs. Data Contention

• Lock contention: An attempt by one thread to acquire a lock when another thread is holding it

• Data contention: An attempt by one thread to atomically access data when another thread expects to manipulate the same data atomically

Data Contention in a Shared Data Structure

• Readers do not contend

• Readers and writers don’t always contend

• Even writers may not contend with other writers
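To make the distinction concrete, here is a hypothetical sketch (not from the deck) in which two writers always contend for the same lock yet never touch the same data. The class name `CoarseCounters` and its slot layout are assumptions made for illustration.

```java
// One monitor guards the whole array. The two writers below update different
// slots, so they contend on the lock but never on the data itself.
final class CoarseCounters {
    private final long[] slots = new long[16];

    synchronized void increment(int slot) {
        slots[slot]++;
    }

    synchronized long get(int slot) {
        return slots[slot];
    }

    /** Runs two writer threads against disjoint slots and returns the shared object. */
    static CoarseCounters runTwoWriters(int perThread) {
        CoarseCounters c = new CoarseCounters();
        Thread a = new Thread(() -> { for (int i = 0; i < perThread; i++) c.increment(0); });
        Thread b = new Thread(() -> { for (int i = 0; i < perThread; i++) c.increment(15); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c;
    }
}
```

With a conventional lock the two threads serialize on every call even though neither ever reads or writes the other's slot.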

Locks are typically very conservative

Synchronization and Locking

• Need synchronization for correct execution
  ─ Critical sections, shared data structures

• Intent is to protect against data contention

• Can’t easily tell in advance
  ─ That’s why we lock…

• Lock contention >= Data contention
  ─ In reality: lock contention >> data contention

The industry has already solved a similar problem

Database Transactions

• Semantics of potential failure exposed to the application

• Transactions: atomic group of DB commands
  ─ All or nothing
  ─ From “BEGIN TRANSACTION” to “COMMIT”

• Data contention results in a rollback
  ─ Leaves no trace

• Application can re-execute until successful

• Optimistic concurrency does scale
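The same optimistic pattern (do the work, commit only if nobody interfered, otherwise discard and re-execute) can be sketched in plain Java with a compare-and-set loop. This is only an analogy to the database case; the `OptimisticCounter` class is hypothetical and is not how a database or the Azul VM implements transactions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Optimistic update: compute on a private snapshot, then try to commit.
final class OptimisticCounter {
    private final AtomicLong value = new AtomicLong();
    private final AtomicLong retries = new AtomicLong();

    /** Adds delta and returns the new value, re-executing on conflict. */
    long add(long delta) {
        while (true) {
            long seen = value.get();          // "BEGIN": take a snapshot
            long updated = seen + delta;      // work on private state only
            if (value.compareAndSet(seen, updated)) {
                return updated;               // "COMMIT" succeeded
            }
            retries.incrementAndGet();        // conflict: discard and retry
        }
    }

    long get() {
        return value.get();
    }

    /** Number of attempts that were thrown away because of a conflict. */
    long retries() {
        return retries.get();
    }
}
```

A failed attempt leaves no trace in `value`, so retrying is always safe; under low contention almost every attempt commits on the first try.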

There is no spoon.

What does synchronized mean?

• It does not actually mean: grab lock, execute block, release lock

• It does mean: execute block atomically in relation to other blocks synchronizing on the same object

• It can be satisfied by the more conservative: execute block atomically in relation to all other threads

• That looks a lot like a transaction

“The Java Language Specification”, “The Java Virtual Machine Specification”, JSR133

Transactional synchronized {…}

• Two basic requirements
  ─ Detect data contention within the block
  ─ Roll back synchronized block on data contention

• synchronized can run concurrently
  ─ Azul uses hardware assist to detect data contention
  ─ Azul VM rolls back synchronized blocks that encounter data contention
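The two requirements (detect contention, then discard the speculative work) have a software analogue in the JDK's `java.util.concurrent.locks.StampedLock`, added in Java 8. The sketch below is an assumption-laden illustration for readers only: it is not Azul's mechanism, which relies on hardware assist and needs no code changes, and the `SpeculativePoint` class is hypothetical.

```java
import java.util.concurrent.locks.StampedLock;

// Readers speculate without locking, validate afterwards, and fall back to a
// real lock only when a writer actually interfered.
final class SpeculativePoint {
    private final StampedLock lock = new StampedLock();
    private double x;
    private double y;

    void move(double dx, double dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead();  // speculate: no lock is taken
        double cx = x;
        double cy = y;
        if (!lock.validate(stamp)) {            // detected interference: discard
            stamp = lock.readLock();            // re-execute under a real lock
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);
    }
}
```

Only readers speculate here; writers still take the lock, whereas the transactional synchronized described on the slide lets writers speculate as well.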

Transactional synchronized {…}

• The Azul VM maintains the semantic meaning of: execute block atomically in relation to all other threads

• Uncontended synchronized blocks run just as fast as before

• Data contended synchronized blocks still serialize execution

• synchronized blocks without data contention can execute in parallel

It’s all transparent

Transactional synchronized {…}

• No changes to Java code
  ─ The VM handles everything

• Nested synchronized blocks
  ─ Roll back to outermost transactional synchronized

• Reduces serialization

• Amdahl’s Law now only reflects data contention
  ─ Desire to reduce data contention

How does it fit in the current locking schemes?

Implementation in a VM

• Thin locks handle uncontended synchronized blocks
  ─ Most common case
  ─ Uses CAS, no OS interaction

• Thick locks handle data contended synchronized blocks
  ─ Blocks in the OS

• Transactional monitors handle contended synchronized blocks that have no data contention
  ─ Execute synchronized blocks in parallel
  ─ Uses HW support
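As a rough illustration of the thin-lock fast path (a single CAS, no OS interaction), here is a toy sketch. It is an assumption for teaching purposes only: real VMs encode the owner in the object header and inflate to a thick lock on contention, and the `ThinLockSketch` class is hypothetical and deliberately non-reentrant.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy thin lock: ownership is claimed with one compare-and-set.
final class ThinLockSketch {
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    /** Uncontended fast path: succeeds only if nobody holds the lock. */
    boolean tryLock() {
        return owner.compareAndSet(null, Thread.currentThread());
    }

    /** Releases the lock; only the owning thread may do so. */
    void unlock() {
        if (!owner.compareAndSet(Thread.currentThread(), null)) {
            throw new IllegalMonitorStateException("current thread is not the owner");
        }
    }

    boolean isHeldByCurrentThread() {
        return owner.get() == Thread.currentThread();
    }
}
```

When `tryLock()` fails, a VM following the scheme on this slide would choose between blocking in the OS (thick lock) and running the block speculatively (transactional monitor).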

Data Contention and Hashtables

• Examples of no data contention in a Hashtable
  ─ 2 readers
  ─ 1 reader, 1 writer, different hash buckets
  ─ 2 writers, different hash buckets

• Examples of data contention in a Hashtable
  ─ 1 reader, 1 writer in same hash bucket
  ─ 2 writers in same hash bucket
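The cases listed above reduce to one rule: two operations contend on data only when they land in the same bucket and at least one of them writes. A small sketch of that rule follows; the `BucketConflict` helper is hypothetical, and it deliberately ignores shared fields such as a size counter.

```java
// Encodes the slide's rule for when two Hashtable operations contend on data.
final class BucketConflict {
    /** Bucket index for a key; floorMod keeps it non-negative. */
    static int bucketOf(Object key, int buckets) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    /** True when the two operations hit the same bucket and at least one writes. */
    static boolean conflicts(Object keyA, boolean writeA,
                             Object keyB, boolean writeB, int buckets) {
        boolean sameBucket = bucketOf(keyA, buckets) == bucketOf(keyB, buckets);
        return sameBucket && (writeA || writeB);
    }
}
```

For example, with 8 buckets the Integer keys 1 and 9 share a bucket, so a reader of 1 and a writer of 9 conflict, while writers of 1 and 2 do not.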

Measurements: Hashtable (0% writes)

Measurements: Hashtable (5% writes)

How to make use of this new reality?

Coding Techniques

• Use coarse grain synchronization
  ─ Simpler data structures, simpler code
  ─ Simplicity equals stability
  ─ Easier to optimize

• Focus on data contention, not on lock contention

• Reduce unavoidable data contention

• wait() and notify() can become the dominant reason for serialized execution
  ─ Stripe queues and other uses of wait()/notify()

Why Coarse Grain Synchronization?

• You can spend effort to reduce lock contention
  ─ Reader/writer lock
  ─ Stripe locks per bucket
  ─ Stripe reader/writer locks per bucket
  ─ How do you grow the table?
  ─ Gets complex fast

• But there is no lock, it’s a synchronized block

• With transactional synchronized
  ─ Keep synchronization coarse
  ─ Focus on data contention

Minimizing Data Contention 1

private Object table[];
private int size;

public synchronized void put(Object key, Object val) {
    …
    // missed, insert into table
    table[idx] = new HashEntry(key, val, table[idx]);
    size++;  // writer data contention
}

public synchronized int size() {
    return size;
}

Minimizing Data Contention 2

private Object table[];
private int sizes[];

public synchronized void put(Object key, Object val) {
    …
    // missed, insert into table
    table[idx] = new HashEntry(key, val, table[idx]);
    sizes[idx]++;  // reduced writer data contention
}

public synchronized int size() {
    int size = 0;
    for (int i = 0; i < sizes.length; i++)
        size += sizes[i];
    return size;
}

Minimizing Data Contention 3

private Object table[];
private int sizes[];
private int cachedSize;

public synchronized void put(Object key, Object val) {
    …
    // missed, insert into table
    table[idx] = new HashEntry(key, val, table[idx]);
    sizes[idx]++;
    cachedSize = -1;  // clear the cache
}

public synchronized int size() {
    if (cachedSize < 0) {  // reduce size recalculation
        cachedSize = 0;
        for (int i = 0; i < sizes.length; i++)
            cachedSize += sizes[i];
    }
    return cachedSize;
}

Minimizing Data Contention 4

private Object table[];
private int sizes[];
private int cachedSize;

public synchronized void put(Object key, Object val) {
    …
    // missed, insert into table
    table[idx] = new HashEntry(key, val, table[idx]);
    sizes[idx]++;
    if (cachedSize >= 0)
        cachedSize = -1;  // avoid contention
}

public synchronized int size() {
    if (cachedSize < 0) {
        cachedSize = 0;
        for (int i = 0; i < sizes.length; i++)
            cachedSize += sizes[i];
    }
    return cachedSize;
}
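For readers who want to run the final variant, here is a self-contained sketch that fills in the parts the slides elide. The lookup loop, the `get` method, the constructor, and the class name `StripedSizeTable` are assumptions added for completeness; only the `sizes[]` and `cachedSize` handling comes from the slides.

```java
// Coarse-grained table whose writers touch only per-bucket state, plus a
// cached total that is invalidated with a write only when it is still valid.
final class StripedSizeTable {
    private static final class HashEntry {
        final Object key;
        Object val;
        final HashEntry next;

        HashEntry(Object key, Object val, HashEntry next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    private final HashEntry[] table;
    private final int[] sizes;      // one counter per bucket
    private int cachedSize = 0;     // -1 means "must be recomputed"

    StripedSizeTable(int buckets) {
        table = new HashEntry[buckets];
        sizes = new int[buckets];
    }

    synchronized void put(Object key, Object val) {
        int idx = Math.floorMod(key.hashCode(), table.length);
        for (HashEntry e = table[idx]; e != null; e = e.next) {
            if (e.key.equals(key)) {        // hit: overwrite in place
                e.val = val;
                return;
            }
        }
        table[idx] = new HashEntry(key, val, table[idx]);  // missed: insert
        sizes[idx]++;
        if (cachedSize >= 0) {
            cachedSize = -1;                // write only when actually needed
        }
    }

    synchronized Object get(Object key) {
        int idx = Math.floorMod(key.hashCode(), table.length);
        for (HashEntry e = table[idx]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.val;
            }
        }
        return null;
    }

    synchronized int size() {
        if (cachedSize < 0) {
            int total = 0;
            for (int s : sizes) {
                total += s;
            }
            cachedSize = total;
        }
        return cachedSize;
    }
}
```

The conditional write in `put` matters because an unconditional `cachedSize = -1` would make every writer modify the same field, which reintroduces exactly the writer data contention that the per-bucket counters removed.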

Singleton pattern

Double Checked Locking Avoided

• Needs to be synchronized at initialization
  ─ Further synchronization seems to be a waste
  ─ Web is full of examples of how wrong you can go

• Transactional synchronized keeps it simple

public class Simple {
    private Helper helper = null;

    public synchronized Helper getHelper() {
        if (helper == null) {
            helper = new Helper();
        }
        return helper;  // no data contention once initialized
    }
    // other functions and members …
}

http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

Unavoidable Data Contention

Accounts

public synchronized void deposit(long amount) {
    balance += amount;
}

Points

public synchronized void translate(int dx, int dy) {
    x += dx;
    y += dy;
}

Example: striping work queues

wait()/notify()

• Stripe work across multiple queues

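Task task = new WorkTask(…);
// floorMod keeps the index non-negative even when hashCode() is negative
Queue queue = queues[Math.floorMod(task.hashCode(), queues.length)];

synchronized (queue) {
    queue.enqueue(task);
    queue.notify();
}

• Workers can be statically assigned to a queue

synchronized (queues) {
    queue = queues[num_workers++ % queues.length];
}

while (true) {
    Task task;  // declared outside the block so execute() can see it
    synchronized (queue) {
        // isEmpty() is assumed here; re-checking guards against missed and spurious wakeups
        while (queue.isEmpty()) {
            queue.wait();
        }
        task = queue.dequeue();
    }
    task.execute();
}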
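A runnable version of the striping idea might look like the sketch below. It substitutes `java.util.ArrayDeque` and `Runnable` for the slide's `Queue` and `Task` types, uses one daemon worker per queue, and adds a `runDemo` helper; all of these names and choices are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Work is striped across several queues so that wait()/notify() traffic
// is spread over several monitors instead of one.
final class StripedWorkQueues {
    private final List<ArrayDeque<Runnable>> queues = new ArrayList<>();

    StripedWorkQueues(int stripes) {
        for (int i = 0; i < stripes; i++) {
            ArrayDeque<Runnable> queue = new ArrayDeque<>();
            queues.add(queue);
            Thread worker = new Thread(() -> drain(queue), "worker-" + i);
            worker.setDaemon(true);   // let the JVM exit while workers wait
            worker.start();
        }
    }

    void submit(Runnable task) {
        int idx = Math.floorMod(task.hashCode(), queues.size());
        ArrayDeque<Runnable> queue = queues.get(idx);
        synchronized (queue) {
            queue.addLast(task);
            queue.notify();           // one worker per queue, so notify() suffices
        }
    }

    private static void drain(ArrayDeque<Runnable> queue) {
        try {
            while (true) {
                Runnable task;
                synchronized (queue) {
                    while (queue.isEmpty()) {
                        queue.wait();
                    }
                    task = queue.removeFirst();
                }
                task.run();           // run outside the monitor
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Submits n trivial tasks and returns how many finished within 5 seconds. */
    static int runDemo(int stripes, int n) {
        StripedWorkQueues pool = new StripedWorkQueues(stripes);
        AtomicInteger done = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            pool.submit(() -> {
                done.incrementAndGet();
                latch.countDown();
            });
        }
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }
}
```

Each enqueue and dequeue modifies its queue's state, so a producer and a consumer on the same queue genuinely contend on data; striping reduces how often any two threads meet on the same queue.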

Summary

• Transparent transactional synchronized() is available

• Simplify data structures, save development time
  ─ Use coarse grain locking
  ─ Let the VM deal with the scaling problem

• Further optimization
  ─ Be aware of data contention
  ─ Stripe stats gathering
  ─ Stripe wait() and notify()

Our Lawyers made us say this…

"Azul Systems, Azul, and the Azul arch logo are trademarks of Azul Systems, Inc. in the United States and other countries. Sun, Sun Microsystems, Java and all Java based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. Other marks are the property of their respective owners and are used here only for identification purposes."

Q & A

Thank you.

gil@azulsystems.com

ivan@azulsystems.com
