Virtual Queues as a Trade Processing Pattern
Uri Cohen @uri1803 | github.com/uric
Head of Product @ GigaSpaces
Feb 23, 2016
Event Processing at Massive Scale: Approaches to Concurrency
This is What It Used to Be Like
That’s What It’s Like Now
Some Numbers
15 Billion Trades / Day on NYSE alone
http://www.nytimes.com/2011/08/27/business/as-trade-volumes-soar-exchanges-cash-in.html
That’s 641K Trades / Second
http://www.nytimes.com/2011/08/27/business/as-trade-volumes-soar-exchanges-cash-in.html
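The per-second figure follows from the daily one if the volume is spread over a regular trading session. A quick check, assuming a 6.5-hour session (the session length is an assumption, not stated on the slide):

```python
TRADES_PER_DAY = 15_000_000_000        # figure quoted on the slide
SESSION_SECONDS = int(6.5 * 60 * 60)   # assumed 6.5-hour trading session

trades_per_second = TRADES_PER_DAY / SESSION_SECONDS
print(f"{trades_per_second:,.0f} trades / second")  # roughly 641K
```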
12 Billion Shares change hands every day
http://www.bloomberg.com/news/2012-01-23/stock-trading-is-lowest-in-u-s-since-2008.html
$4 Million: The cost of 1 millisecond of latency to a broker
http://www.tabbgroup.com/PublicationDetail.aspx?PublicationID=346
The Problem
Massive stream of events
Time is money, literally
Can’t lose a single message
Fairness is a must
Order Book - Simplistic Example
Buy        Sell
50, $12    60, $10
60, $11    100, $11
30, $10    30, $12
Price: $10
Order Book - Simplistic Example
Buy        Sell
60, $11    10, $10
30, $10    100, $11
           30, $12
Price: $10
Order Book - Simplistic Example
Buy        Sell
50, $11    100, $11
30, $10    30, $12
Price: $10
Order Book - Simplistic Example
Buy        Sell
50, $11    100, $11
30, $10    30, $12
Price: $11
Order Book - Simplistic Example
Buy        Sell
30, $10    50, $11
           30, $12
Price: $11
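The sequence of books above can be reproduced with a small matching loop. This is only a sketch: the function name `match` is hypothetical, and it assumes each trade executes at the sell order's price, which is what the "Price" lines in the example suggest. Real matching engines use richer rules.

```python
from collections import deque

def match(buys, sells):
    """Match a book. `buys` is sorted by price descending, `sells` ascending;
    each entry is (quantity, price). Returns (trades, remaining_buys, remaining_sells)."""
    buys = deque([qty, price] for qty, price in buys)
    sells = deque([qty, price] for qty, price in sells)
    trades = []
    # Keep matching while the best buy price covers the best sell price.
    while buys and sells and buys[0][1] >= sells[0][1]:
        qty = min(buys[0][0], sells[0][0])
        price = sells[0][1]            # assumption: trade at the sell price
        trades.append((qty, price))
        buys[0][0] -= qty
        sells[0][0] -= qty
        if buys[0][0] == 0:
            buys.popleft()             # buy order fully filled
        if sells[0][0] == 0:
            sells.popleft()            # sell order fully filled
    return trades, list(buys), list(sells)

trades, buys, sells = match(
    buys=[(50, 12), (60, 11), (30, 10)],
    sells=[(60, 10), (100, 11), (30, 12)],
)
print(trades)  # [(50, 10), (10, 10), (50, 11)]
print(buys)    # [[30, 10]]
print(sells)   # [[50, 11], [30, 12]]
```

The three trades correspond to the three book transitions shown in the example, and the remaining book matches its final state.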
What it Really Means
Low latency: In memory, GC tuning
Scalability: Multi-core, Multi-node
Ordering: By price, order time
Exclusivity
Resiliency
Trading is Just One Use Case
All things FCFS, with a limited stock
Flight booking
Betting
Online Auctions
Cloud Spot Instances
eCommerce
Let’s Talk Solutions
Queue (SEDA/Actor Style)
Not Validated -> [Validator] -> Validated -> [Processor] -> Processed
Queue (SEDA/Actor Style)
The Good: Ordered (Is it fair?)
Multi-threaded
The Bad: Not very scalable
Locking
Context switching
Transient
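A staged queue pipeline of this kind can be sketched as follows. The stage functions, the validation rule, and the sentinel-based shutdown are illustrative assumptions, not taken from the talk. Each stage here has exactly one thread, which keeps the output ordered; adding more threads per stage is where the ordering and locking concerns listed above come in.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down

def stage(inbox, outbox, work):
    """Run one stage: take from inbox, apply work, forward the result."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)   # propagate shutdown downstream
            return
        result = work(item)
        if result is not None:
            outbox.put(result)

def validate(order):
    # Hypothetical rule: drop orders with a non-positive quantity.
    return order if order["qty"] > 0 else None

def process(order):
    return (order["symbol"], order["qty"])

not_validated, validated, processed = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(not_validated, validated, validate)),
    threading.Thread(target=stage, args=(validated, processed, process)),
]
for t in threads:
    t.start()

orders = [
    {"symbol": "GOOG", "qty": 60},
    {"symbol": "GOOG", "qty": 0},    # rejected by the validator
    {"symbol": "AAPL", "qty": 50},
]
for order in orders:
    not_validated.put(order)
not_validated.put(STOP)
for t in threads:
    t.join()

results = []
while True:
    item = processed.get()
    if item is STOP:
        break
    results.append(item)
print(results)
```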
The Cost of Locking
Time to increment a 64-bit counter 500 million times:
Method                             Time in msec
Single Thread                      300
Single Thread w/ Lock              10,000
2 Threads w/ Lock                  224,000
Single Thread w/ CAS               5,700
2 Threads w/ CAS                   30,000
Single Thread w/ Volatile Write    4,700
http://disruptor.googlecode.com/files/Disruptor-1.0.pdf
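The first two rows of the table can be approximated with a rough micro-benchmark. This sketch only measures an uncontended lock in Python, with far fewer iterations than the paper, so the absolute numbers are not comparable to the table; the function names are illustrative.

```python
import threading
import timeit

N = 200_000  # far fewer iterations than the paper's benchmark

def plain():
    """Increment a counter with no synchronization."""
    counter = 0
    for _ in range(N):
        counter += 1
    return counter

def with_lock():
    """Increment a counter, taking an uncontended lock on every step."""
    lock = threading.Lock()
    counter = 0
    for _ in range(N):
        with lock:
            counter += 1
    return counter

plain_s = timeit.timeit(plain, number=5)
locked_s = timeit.timeit(with_lock, number=5)
print(f"plain: {plain_s:.3f}s, with lock: {locked_s:.3f}s")
```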
Queue (Lack of) Fairness
[Diagram: two buy orders (60 and 50) are queued against a sell order of 100; Consumer Thread 1 and Consumer Thread 2 each take one of the buy orders and process them concurrently.]
Can you tell which order will be executed 1st?
Single-Threaded Queue
Validator Processor
Single-Threaded Queue
The Good: Fast, very fast
No contention
No context switches
Always fair
The Bad: Multi-core?
Not fit for intense compute & I/O
Need to be async.
Transient
Single-Threaded Queue
They do it…
Disruptor (LMAX)
Segmented Queue
Symbol=A-H Symbol=I-S Symbol=T-Z
Validator Processor Processor
Processor thread pool per segment
Segmented Queue - Optimization
Single Processor thread pool, pick random segment
Symbol=A-H Symbol=I-S Symbol=T-Z
Processor
Segmented Queue
The Good: Scalable
But segments can get hot
Minimizes contention
The Bad: Not trivial to implement
Still unfair
Is total ordering needed?
Transient
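Routing into segments can be sketched with the symbol ranges from the diagram (A-H, I-S, T-Z). The helper names `segment_for` and `enqueue` are hypothetical. The sample symbols also show how one segment can get hot.

```python
from collections import deque

# Segment boundaries from the slide, keyed on the first letter of the symbol.
SEGMENTS = [("A", "H"), ("I", "S"), ("T", "Z")]

def segment_for(symbol):
    """Return the index of the segment that owns this symbol."""
    first = symbol[0].upper()
    for index, (low, high) in enumerate(SEGMENTS):
        if low <= first <= high:
            return index
    raise ValueError(f"no segment for symbol {symbol!r}")

queues = [deque() for _ in SEGMENTS]

def enqueue(order):
    """Append the order to the queue of its segment."""
    queues[segment_for(order["symbol"])].append(order)

for symbol in ["GOOG", "AAPL", "MSFT", "TSLA", "AMZN"]:
    enqueue({"symbol": symbol})

for (low, high), q in zip(SEGMENTS, queues):
    print(f"{low}-{high}: {[o['symbol'] for o in q]}")
```

With these five symbols, three land in the A-H segment while the other two segments hold one each.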
What about Fairness?
Exclusivity is Key
Process at most one message per segment at any given time
No exclusivity across segments
Implicit Exclusivity
Single processor thread per segment
Symbol=A-H Symbol=I-S Symbol=T-Z
Processor Processor Processor
Explicit Exclusivity
Shared thread pool, mark segments under processing (CAS)
[Diagram sequence: Processor threads pick from Segment 1, Segment 2 and Segment 3. Step 1: no segment is marked. Step 2: Segment 1 is marked (X). Step 3: Segments 1 and 3 are marked (X). Step 4: Segment 1 is released, Segment 3 is still marked (X).]
Explicit Exclusivity
Num. of segments is key
Too few: little concurrency
Too many: wasting memory
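Explicit exclusivity with a shared pool can be sketched as below. Python has no direct compare-and-set primitive, so a non-blocking `Lock.acquire` stands in for the CAS-updated "under processing" mark; the `Segment` class and `worker` function are illustrative assumptions.

```python
import random
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor

class Segment:
    def __init__(self, name, events):
        self.name = name
        self.events = deque(events)
        self.busy = threading.Lock()   # stands in for a CAS-updated flag
        self.processed = []            # only touched while holding `busy`

def worker(segments):
    """Pick random segments until every queue is drained."""
    while any(s.events for s in segments):
        segment = random.choice(segments)
        # Non-blocking acquire plays the role of compare-and-set:
        # if the segment is already marked, move on to another one.
        if not segment.busy.acquire(blocking=False):
            continue
        try:
            if segment.events:
                segment.processed.append(segment.events.popleft())
        finally:
            segment.busy.release()     # unmark the segment

segments = [Segment(f"Segment {i}", range(i * 100, i * 100 + 50)) for i in (1, 2, 3)]
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(4):
        pool.submit(worker, segments)

for s in segments:
    print(s.name, "in order:", s.processed == sorted(s.processed))
```

Because only one thread at a time can hold a segment's mark, events within a segment are processed in arrival order even though four threads share the work.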
Dynamic Segmentation
Segments are created and removed as needed
[Diagram sequence: a “GOOG” event arrives and a GOOG segment is created for it; an “AAPL” event arrives and an AAPL segment is created; while GOOG is marked as under processing (X), an “AMZN” event arrives and an AMZN segment is created.]
Dynamic Segmentation
Segments created as needed
Randomize on segments until available one found
Fast, scalable, fair
We call it “FIFO groups” or “Virtual Queues”
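A minimal in-memory sketch of the idea follows. The class and method names (`VirtualQueues`, `put`, `take`, `done`) are hypothetical and are not the GigaSpaces API. For brevity it picks at random among the free groups under a single lock, rather than retrying random picks until a free one is found.

```python
import random
import threading
from collections import deque

class VirtualQueues:
    """One FIFO per group key, created on first use and removed when drained."""

    def __init__(self):
        self._lock = threading.Lock()
        self._groups = {}    # key -> deque of pending events
        self._busy = set()   # keys currently being processed

    def put(self, key, event):
        with self._lock:
            self._groups.setdefault(key, deque()).append(event)

    def take(self):
        """Return (key, event) from a free group, or None if none is free."""
        with self._lock:
            free = [k for k in self._groups if k not in self._busy]
            if not free:
                return None
            key = random.choice(free)
            self._busy.add(key)          # mark the group as under processing
            return key, self._groups[key].popleft()

    def done(self, key):
        """Release a group; drop it if it has no pending events."""
        with self._lock:
            self._busy.discard(key)
            if not self._groups[key]:
                del self._groups[key]

vq = VirtualQueues()
for key, qty in [("GOOG", 60), ("GOOG", 50), ("AAPL", 30)]:
    vq.put(key, qty)

first = vq.take()
second = vq.take()
third = vq.take()        # None: both groups are under processing
vq.done("GOOG")
fourth = vq.take()       # the next GOOG event, in arrival order
```

The second GOOG order cannot be taken until the first one is released, which is what keeps processing fair within a group while other groups proceed in parallel.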
It Can (and Does) Get Much More Complex
Memory state can get corrupted on errors
It’s not always as simple as “pop off the queue”: limits, priorities, circuit breakers, etc.
Resiliency is always a pain
A Bit about Usability
What you don’t want to do
Implement data structures
Handle concurrency
Handle HA
Handle transactions
What you want to control
Event flow
Grouping attribute (e.g. symbol)
Event handlers
Data Grid as a Foundation
Transactional
Highly available
Supports complex matching
How We Thought of It
Thank You!
References:
http://martinfowler.com/articles/lmax.html
http://www.nytimes.com/2011/08/27/business/as-trade-volumes-soar-exchanges-cash-in.html
http://disruptor.googlecode.com/files/Disruptor-1.0.pdf
http://www.gigaspaces.com/wiki/display/XAP9/FIFO+Grouping