So you think you know pub/sub? Udi Dahan in Particular @udidahan
Agenda
Basics Patterns Distribution
Publish/subscribe basics
Enables one-to-many communication
[Diagram: Pub, Sub1, Sub2, Sub3]
Should really be called “subscribe/publish”
[Diagram: Sub1, Sub2, and Sub3 each send a "sub" to Pub]
Not the same as multicast – it’s more reliable
[Diagram: Publisher and four Subscribers]
Is about logical, not physical data distribution
Each event should be processed once
[Diagram: Publisher, Sub1, Sub2, and Sub3; Sub3 is a load balancer (LB) in front of Sub3_1 and Sub3_2]
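The points on this slide can be sketched with a toy in-memory broker. This is a minimal illustration only, not any real product's API: the `Broker` class, its method names, and the round-robin choice of instance are all assumptions made for the example. It shows that subscribing happens before publishing, and that a logical subscriber with several instances (Sub3_1 and Sub3_2) still processes each event once.

```python
from collections import defaultdict
from itertools import cycle


class Broker:
    """Toy in-memory broker (hypothetical): one delivery per logical subscriber."""

    def __init__(self):
        # event type -> {logical subscriber name -> round-robin picker over instances}
        self._subscriptions = defaultdict(dict)

    def subscribe(self, event_type, subscriber, instances):
        # Subscribe comes first: this is how the publisher learns who to send to.
        self._subscriptions[event_type][subscriber] = cycle(instances)

    def publish(self, event_type, payload):
        # Each LOGICAL subscriber gets the event once, whichever instance handles it.
        for pick_instance in self._subscriptions[event_type].values():
            handler = next(pick_instance)
            handler(payload)


def demo():
    received = []
    broker = Broker()
    broker.subscribe("OrderCancelled", "Sub1",
                     [lambda e: received.append(("Sub1", e))])
    broker.subscribe("OrderCancelled", "Sub3",
                     [lambda e: received.append(("Sub3_1", e)),
                      lambda e: received.append(("Sub3_2", e))])
    broker.publish("OrderCancelled", 1)
    broker.publish("OrderCancelled", 2)
    return received
```

In `demo()`, Sub3 as a whole sees both events, but each one is handled by only one of its two instances.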
But what about data sync?
Keeping in-proc caches synced in a web farm
Use a distributed cache for that (Redis, etc.)
Do not build your own distributed cache
Not unless you absolutely HAVE to
Subscribers can be publishers too
Think peer-to-peer, not client/server
[Diagram: PS1, PS2, PS3, PS4]
Avoid shared resources
Shared databases create tight coupling
[Diagram: PS1, PS2, PS3, PS4, and a shared DB]
Seek out autonomy
But preserve the “single source of truth”
[Diagram: PS1, PS2, PS3, PS4]
Basics
Patterns
Distribution
Events – not commands
Always publish events – not commands
Examples: OrderCancelled, AccountCreated
Something that already happened – a fact
Subscribers can’t invalidate events
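One way to make "an event is a fact" concrete is an immutable, past-tense type. This is only a sketch: the `order_id` and `cancelled_at` fields are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)
class OrderCancelled:
    """Named in the past tense: it describes something that already happened."""
    order_id: str           # hypothetical field
    cancelled_at: datetime  # hypothetical field


def try_to_rewrite(event):
    """Return True if a subscriber managed to change the event (it should not)."""
    try:
        event.order_id = "some-other-order"
        return True
    except FrozenInstanceError:
        # A subscriber can react to the fact, but cannot alter or reject it.
        return False


event = OrderCancelled("order-42", datetime(2024, 1, 1, tzinfo=timezone.utc))
rewritten = try_to_rewrite(event)
```

A command such as CancelOrder could be refused by its receiver. An event such as OrderCancelled cannot be refused, which is why subscribers have nothing to invalidate.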
But what about failures?
Technological failures
Deserialization failures
Move off to “error” queue for admin to handle
Likely to be returned for reprocessing later
Transient failures (deadlocks & other exceptions)
Retry + backoff & escalate to error queue
Process & server crashes
TX processing for complete rollback*
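The first two failure types above can be sketched as a single receive loop. This is a rough illustration under stated assumptions: messages are JSON strings, the error queue is a plain list, and `TransientError`, `process`, and their parameters are names invented for the example.

```python
import json
import time


class TransientError(Exception):
    """Stand-in for deadlocks and other retryable exceptions (hypothetical)."""


def process(raw, handler, error_queue, max_retries=3, base_delay=0.01,
            sleep=time.sleep):
    """Return True if the message was handled, False if it went to the error queue."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        # Retrying cannot fix a malformed message: park it for an admin.
        error_queue.append(raw)
        return False

    for attempt in range(max_retries + 1):
        try:
            handler(message)
            return True
        except TransientError:
            if attempt == max_retries:
                break
            sleep(base_delay * 2 ** attempt)  # exponential backoff between retries

    # Retries exhausted: escalate to the error queue.
    error_queue.append(raw)
    return False
```

The raw message, not the parsed one, goes to the error queue so that it can be returned for reprocessing unchanged.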
Insufficient transactionality risks
[Diagram: DB and two queues (Q); a message carries an Entity ID that is not in the DB]
System gets out of sync
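The risk above is that a message about an entity leaves the system while the entity itself never reaches the database. One possible mitigation, sketched here, is to write the entity and the outgoing message in the same transaction. The assumption that the outgoing queue is a table in the same SQLite database is mine, as are the table and function names. The talk itself only calls for transactional processing with complete rollback.

```python
import sqlite3


def handle_create(conn, entity_id, fail_before_commit=False):
    """Store an entity and queue an event about it in ONE transaction."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute("INSERT INTO entities (id) VALUES (?)", (entity_id,))
            conn.execute("INSERT INTO outgoing (entity_id) VALUES (?)", (entity_id,))
            if fail_before_commit:
                raise RuntimeError("simulated crash before commit")
    except RuntimeError:
        pass  # both inserts were rolled back together


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE outgoing (entity_id TEXT)")
    handle_create(conn, "A")
    handle_create(conn, "B", fail_before_commit=True)
    entities = [r[0] for r in conn.execute("SELECT id FROM entities ORDER BY id")]
    queued = [r[0] for r in conn.execute("SELECT entity_id FROM outgoing ORDER BY entity_id")]
    return entities, queued
```

After `demo()`, entity B is in neither table, so no subscriber can receive an Entity ID that is missing from the DB.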
Careful with XYZ_Updated events
Simple CRUD domains less suitable for implementation on top of pub/sub
In-order event processing usually not guaranteed
Can be mitigated with sequence numbers
… and logic which matches them to entity versions
Consider “Valid-to/from” semantics
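The sequence-number mitigation can be sketched as follows. It assumes each event carries a per-entity sequence number starting at 1, which the subscriber matches against the last version it applied. The `Projection` class and its return values are hypothetical names for illustration.

```python
class Projection:
    """Applies events per entity strictly in sequence order (illustrative only)."""

    def __init__(self):
        self.version = {}  # entity id -> last applied sequence number
        self.state = {}    # entity id -> latest applied data
        self.pending = {}  # entity id -> {sequence number: data} that arrived early

    def handle(self, entity_id, seq, data):
        current = self.version.get(entity_id, 0)
        if seq <= current:
            return "ignored"   # duplicate or stale event
        if seq > current + 1:
            # A gap: hold the event until the missing ones arrive.
            self.pending.setdefault(entity_id, {})[seq] = data
            return "deferred"
        self._apply(entity_id, seq, data)
        # Drain buffered events that are now next in line.
        waiting = self.pending.get(entity_id, {})
        while self.version[entity_id] + 1 in waiting:
            nxt = self.version[entity_id] + 1
            self._apply(entity_id, nxt, waiting.pop(nxt))
        return "applied"

    def _apply(self, entity_id, seq, data):
        self.state[entity_id] = data
        self.version[entity_id] = seq
```

For example, if event 2 for a customer arrives before event 1, it is deferred, and both are applied in order once event 1 shows up.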
Auditing / Journaling
Copy msg to another queue after processing
Supported out-of-the-box by most queues
Extract to longer-term storage
So the queue doesn’t “explode”
A central log of everything that happened
Can be difficult to interpret by itself
Leveraging message headers
[Diagram: Endpoint 1, Endpoint 2, Endpoint 3, and Audit; message labels: Message ID: 1, Conversation ID: 1; Message ID: 2, Conversation ID: 1; Message ID: 3]
Maintain a conversation ID header for cross-endpoint message flows
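A conversation ID header makes the audit log easier to interpret, as the sketch below shows. The header names `message_id` and `conversation_id`, the dict-based message shape, and the helper functions are assumptions for illustration and do not reflect a specific messaging library.

```python
import uuid


def new_message(body, incoming=None):
    """Create a message; copy the conversation ID from the message being handled."""
    headers = {"message_id": str(uuid.uuid4())}
    if incoming is not None:
        # Part of an existing flow: keep the same conversation ID.
        headers["conversation_id"] = incoming["headers"]["conversation_id"]
    else:
        # First message of a new flow: start a new conversation.
        headers["conversation_id"] = str(uuid.uuid4())
    return {"headers": headers, "body": body}


def conversation(audit_log, conversation_id):
    """Pull one cross-endpoint flow out of the central audit log."""
    return [m["body"] for m in audit_log
            if m["headers"]["conversation_id"] == conversation_id]


first = new_message("OrderPlaced")                       # sent by Endpoint 1
second = new_message("PaymentTaken", incoming=first)     # sent by Endpoint 2
third = new_message("OrderShipped", incoming=second)     # sent by Endpoint 3
unrelated = new_message("AccountCreated")
audit_log = [first, unrelated, second, third]
```

Here `conversation(audit_log, first["headers"]["conversation_id"])` returns only the three related bodies, even though the audit log interleaves them with other traffic.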
Basics
Patterns
Distribution
Content-based “pub/sub”
When subscriber-side filtering won’t scale
User defines rules about what’s “interesting”
And that can change at runtime
It’s primarily about physical data distribution
Not logical division of responsibilities
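To make the idea concrete, here is a minimal sketch of content-based routing in which rules are evaluated centrally, before delivery, rather than at each subscriber. The `ContentBasedRouter` class and the `symbol` and `price` message fields are hypothetical.

```python
class ContentBasedRouter:
    """Decides who receives a message by inspecting its content (illustrative)."""

    def __init__(self):
        self._rules = {}  # user -> predicate over message content

    def set_rule(self, user, predicate):
        # Users define what is "interesting"; calling this again replaces
        # the user's earlier rule at runtime.
        self._rules[user] = predicate

    def route(self, message):
        # Only matching users are sent the message, so nothing is filtered
        # on the subscriber side.
        return [user for user, rule in self._rules.items() if rule(message)]


router = ContentBasedRouter()
router.set_rule("alice", lambda m: m["symbol"] == "ABC")
router.set_rule("bob", lambda m: m["price"] > 100)
before = router.route({"symbol": "ABC", "price": 50})

# Alice changes her mind while the system is running.
router.set_rule("alice", lambda m: m["symbol"] == "XYZ")
after = router.route({"symbol": "ABC", "price": 50})
```

The routing decision depends on data values, not on which logical service owns the event, which is why this is about physical data distribution.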
Finance
Subscribing to updates of specific stocks
Industrial / Internet of Things
Subscribing to events about sensor states
Solutions – well, it depends
For small numbers of users (internal employees)
Keep a single distributed cache up to date
Have user machines poll the cache every second
Across multiple sites, have a cache at each site
User machines poll the cache of their site
In short – no real use of pub/sub
“Clicks & mortar” Retail
Distributing price changes / end-of-day orders
“Pub/sub” between geographic sites
Also focused on data distribution
Often want visibility into progress of distribution
Which sites haven’t received the data yet
Geographic sites tend to have business meaning
“Clicks & mortar” Retail
Cross-site distribution done within a SOA service
Not really pub/sub
Summary
Basics
Patterns
Distribution
Q&A
Thank you
Udi Dahan in Particular