Disque: a detailed overview of the distributed implementation - Salvatore Sanfilippo

Jan 17, 2017

Transcript
Page 1: Disque: a detailed overview of the distributed implementation - Salvatore Sanfilippo

Disque: a new distributed message queue

@antirez - Redis Labs

Page 2

Redis roots

• In memory, optional persistence.

• Same protocol.

• BSD license.
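
"Same protocol" means a Disque command travels on the wire as a Redis protocol (RESP) array of bulk strings. A minimal sketch of that framing follows; `resp_encode` is a hypothetical helper written for illustration, not a function from Disque or Redis, and it assumes NUL-terminated arguments (so it is not binary safe).

```c
#include <stdio.h>
#include <string.h>

/* Encode argc arguments as a RESP array of bulk strings into buf.
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if buf is too small. Hypothetical helper for illustration. */
int resp_encode(char *buf, size_t buflen, int argc, const char **argv) {
    /* Array header: "*<number of arguments>\r\n". */
    int n = snprintf(buf, buflen, "*%d\r\n", argc);
    if (n < 0 || (size_t)n >= buflen) return -1;
    size_t off = (size_t)n;

    for (int i = 0; i < argc; i++) {
        /* Each argument: "$<byte length>\r\n<bytes>\r\n". */
        size_t len = strlen(argv[i]);
        n = snprintf(buf + off, buflen - off, "$%zu\r\n%s\r\n", len, argv[i]);
        if (n < 0 || (size_t)n >= buflen - off) return -1;
        off += (size_t)n;
    }
    return (int)off;
}
```

Calling it with the arguments `GETJOB`, `FROM`, `q1` yields the same byte layout a Redis client would produce for a three-argument command.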

Page 3

Asynchronous job execution, microservices bus, distributed timer

Page 4

API

Page 5

ADDJOB queue job <timeout>

Page 6

Disque Job IDs

DI8497c0098d456946843784d3ea41af5525c741bf05a0SQ

Node ID prefix (32 bit)

Unique Message ID (128 bit)

TTL in minutes (16 bit)
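
The three fields can be pulled out of the example ID with plain string slicing. The sketch below assumes the layout suggested by the slide: a literal "DI" prefix, then 8, 32 and 4 hexadecimal characters (32, 128 and 16 bits), then a literal "SQ" suffix. The names `SLIDE_JOB_ID_LEN`, `job_id_fields` and `parse_job_id` are hypothetical, and the offsets are inferred from the example rather than taken from the Disque source.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: "DI" + 8 + 32 + 4 hex chars + "SQ" = 48 chars. */
#define SLIDE_JOB_ID_LEN 48

typedef struct {
    char node_prefix[9];   /* 32-bit node ID prefix, 8 hex chars + NUL. */
    char message_id[33];   /* 128-bit unique message ID, 32 hex chars + NUL. */
    unsigned ttl_minutes;  /* 16-bit TTL in minutes. */
} job_id_fields;

/* Split a job ID into its fields. Returns 0 on success, -1 if the
 * string does not match the assumed layout. */
int parse_job_id(const char *id, job_id_fields *out) {
    if (strlen(id) != SLIDE_JOB_ID_LEN) return -1;
    if (memcmp(id, "DI", 2) != 0 || memcmp(id + 46, "SQ", 2) != 0) return -1;
    for (int i = 2; i < 46; i++)
        if (!isxdigit((unsigned char)id[i])) return -1;

    memcpy(out->node_prefix, id + 2, 8);
    out->node_prefix[8] = '\0';
    memcpy(out->message_id, id + 10, 32);
    out->message_id[32] = '\0';

    char ttl[5];
    memcpy(ttl, id + 42, 4);
    ttl[4] = '\0';
    out->ttl_minutes = (unsigned)strtoul(ttl, NULL, 16);
    return 0;
}
```

Under these assumptions the example ID carries the TTL field `05a0`, which is 1440 minutes, or 24 hours.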

Page 7

GETJOB FROM q1 q2

Page 8

ACKJOB id1 id2 …

Disque is all about explicit acknowledgements.

Page 9

Delivery semantics

• At least once by default.

• At most once also available.

Page 10

ADDJOB queue job 0 RETRY 3600

Page 11

ADDJOB queue job 0 TTL 86400

Page 12

ADDJOB queue job 0 DELAY 3600

Page 13

GUARANTEES

Page 14

Synchronous replication

ADDJOB myqueue task1 0 REPLICATE 3

Page 15

ADDJOB queue job 0 ASYNC
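
The ADDJOB variants shown so far differ only in their trailing options, so a client can assemble them from one option bundle. The sketch below is illustrative: `addjob_opts` and `format_addjob` are hypothetical names, a negative value is an assumed convention meaning "omit this option", and the output is the human-readable command line rather than the wire encoding.

```c
#include <stdio.h>

/* Hypothetical option bundle. A negative numeric option is left out. */
typedef struct {
    long timeout;    /* the <timeout> argument, always sent */
    long replicate;  /* REPLICATE <count> */
    long delay;      /* DELAY <n> */
    long retry;      /* RETRY <n> */
    long ttl;        /* TTL <n> */
    int  async;      /* non-zero appends ASYNC */
} addjob_opts;

/* Append " NAME value" when value >= 0. Returns 0 on success, -1 if full. */
static int append_opt(char *buf, size_t buflen, size_t *off,
                      const char *name, long value) {
    if (value < 0) return 0; /* option not requested */
    int n = snprintf(buf + *off, buflen - *off, " %s %ld", name, value);
    if (n < 0 || (size_t)n >= buflen - *off) return -1;
    *off += (size_t)n;
    return 0;
}

/* Build "ADDJOB queue job timeout [options]". Returns length or -1. */
int format_addjob(char *buf, size_t buflen, const char *queue,
                  const char *job, const addjob_opts *o) {
    int n = snprintf(buf, buflen, "ADDJOB %s %s %ld", queue, job, o->timeout);
    if (n < 0 || (size_t)n >= buflen) return -1;
    size_t off = (size_t)n;

    if (append_opt(buf, buflen, &off, "REPLICATE", o->replicate) ||
        append_opt(buf, buflen, &off, "DELAY", o->delay) ||
        append_opt(buf, buflen, &off, "RETRY", o->retry) ||
        append_opt(buf, buflen, &off, "TTL", o->ttl))
        return -1;

    if (o->async) {
        n = snprintf(buf + off, buflen - off, " ASYNC");
        if (n < 0 || (size_t)n >= buflen - off) return -1;
        off += (size_t)n;
    }
    return (int)off;
}
```

Keeping every option optional mirrors the slides, where each example sets exactly one of them on top of the mandatory queue, job and timeout arguments.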

Page 16

Persistent (optionally)

LOADJOB … data …

DELJOB … id …

Append Only File

Page 17

Disque & CAP

• AP.

• Immutable messages (mostly).

• Converge to ACK state.

• CAP “A” availability (even in a single-node partition).

Page 18

At least once delivery

• Liveness: eventually the message will be delivered.

• Safety: messages not yet delivered at least once will never be evicted from the cluster.

• (Unless the message TTL is reached.)

Page 19

At most once delivery

• Safety: messages already dequeued will never be queued a second time.

• A direct consequence of replicating to just one node and enqueueing just once (retry time set to zero).

Page 20

Federation: all nodes are really the same

Page 21

Best effort ordering

Main Design Sacrifice

Page 22

NACK and retries counters

• Alternative to explicit dead letters.

• Counter consistency is best effort.

• (But it does not matter.)

• GETJOB exposes the two counters.
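
Because the counters come back with the job, a consumer can decide on its own when a job is "poisoned" instead of relying on a dead letter queue. A minimal sketch of such a consumer-side policy follows; `poison_policy`, `looks_like_dead_letter` and the thresholds are hypothetical, and since the counters are best effort the check is only a heuristic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical consumer-side limits; the values are chosen by the caller. */
typedef struct {
    uint16_t max_nacks;       /* give up after this many NACKs */
    uint16_t max_deliveries;  /* give up after this many deliveries */
} poison_policy;

/* Returns true when either counter reported for the job has reached its
 * limit. The counters are best effort, so treat the answer as a hint. */
bool looks_like_dead_letter(uint16_t num_nacks, uint16_t num_deliv,
                            const poison_policy *p) {
    return num_nacks >= p->max_nacks || num_deliv >= p->max_deliveries;
}
```

A consumer that gets `true` might log the job, acknowledge it so it stops being retried, and move on; what to do is an application decision.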

Page 23

Disque tries hard to avoid multiple deliveries.

Page 24

WHY?

• Costly: think of spikes after partitions, or of CP stores used to de-dup.

• In certain uses neither de-dup nor idempotency is needed, if the duplication rate is acceptable.

• Not so hard: worth it.

Page 25

INTERNALS

Page 26

Message states

ACTIVE

QUEUED

ACKED

Page 27

ACTIVE

• Node has a copy.

• Not available for delivery.

• ACTIVE -> QUEUED (On retry timer)

• ACTIVE -> ACKED (On ACK received)

Page 28

QUEUED

• Node has a copy.

• Will deliver via GETJOB.

• QUEUED -> ACTIVE (On delivery)

• QUEUED -> ACKED (On ACK received)

Page 29

ACKED

• Propagate via SETACK!

• Perform Garbage Collection of message.

• ACKED -> EVICTED (on successful GC)
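
The transitions listed on the ACTIVE, QUEUED and ACKED slides form a small state machine. The sketch below encodes exactly those five arrows; the enum and function names are hypothetical, EVICTED is modeled as a terminal state for convenience, and any pair not listed on the slides leaves the state unchanged (an assumption of this sketch).

```c
typedef enum { ST_ACTIVE, ST_QUEUED, ST_ACKED, ST_EVICTED } job_state;

typedef enum {
    EV_RETRY_TIMER,   /* retry timer fired */
    EV_DELIVERED,     /* job handed to a client via GETJOB */
    EV_ACK_RECEIVED,  /* ACK received for the job */
    EV_GC_OK          /* garbage collection completed successfully */
} job_event;

/* Return the state that follows `s` when `e` happens. Pairs that the
 * slides do not list keep the current state. */
job_state job_next_state(job_state s, job_event e) {
    switch (s) {
    case ST_ACTIVE:
        if (e == EV_RETRY_TIMER) return ST_QUEUED;
        if (e == EV_ACK_RECEIVED) return ST_ACKED;
        break;
    case ST_QUEUED:
        if (e == EV_DELIVERED) return ST_ACTIVE;
        if (e == EV_ACK_RECEIVED) return ST_ACKED;
        break;
    case ST_ACKED:
        if (e == EV_GC_OK) return ST_EVICTED;
        break;
    case ST_EVICTED:
        break; /* terminal in this sketch */
    }
    return s;
}
```

Note that an ACK wins from either ACTIVE or QUEUED, which matches the "converge to ACK state" point made earlier in the talk.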

Page 30

WILLQUEUE message

[Diagram labels: QUEUED, ACTIVE, WILLQUEUE, QUEUED]

Sent 500 ms before ACTIVE -> QUEUED
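
The 500 ms lead time can be expressed as a simple window check against the job's next queue time. This is a sketch only: `willqueue_due` and `WILLQUEUE_ADVANCE_MS` are hypothetical names, and times are assumed to be milliseconds on a single clock.

```c
#include <stdbool.h>
#include <stdint.h>

#define WILLQUEUE_ADVANCE_MS 500 /* lead time stated on the slide */

/* qtime_ms is the time at which the job would move ACTIVE -> QUEUED.
 * Returns true while `now_ms` is inside the 500 ms window before it. */
bool willqueue_due(int64_t now_ms, int64_t qtime_ms) {
    return now_ms >= qtime_ms - WILLQUEUE_ADVANCE_MS && now_ms < qtime_ms;
}
```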

Page 31

QUEUED MESSAGE on ACTIVE -> QUEUED state change

[Diagram labels: QUEUED, ACTIVE, ACKED, QUEUED]

• Reset retry timer

• Dequeue if ID1 > ID2

• SETACK
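
One way to read this slide is as a receiver-side dispatch on the local state of the job when a QUEUED message arrives. The sketch below rests on assumptions that the flattened diagram does not confirm: that a receiver holding the job as ACTIVE resets its retry timer, one holding it as QUEUED dequeues when the ID comparison holds, and one holding it as ACKED answers with SETACK. It also assumes ID1 is the sender's node ID, ID2 the receiver's, and that IDs compare lexicographically. All names are hypothetical.

```c
#include <string.h>

typedef enum { ST_ACTIVE, ST_QUEUED, ST_ACKED } job_state;

typedef enum {
    ACT_NONE,
    ACT_RESET_RETRY_TIMER,
    ACT_DEQUEUE,
    ACT_SEND_SETACK
} queued_reaction;

/* Decide what to do when a QUEUED message for a job arrives.
 * Assumption: ID1 = sender_id, ID2 = my_id, compared as strings. */
queued_reaction on_queued_message(job_state local, const char *sender_id,
                                  const char *my_id) {
    switch (local) {
    case ST_ACTIVE:
        return ACT_RESET_RETRY_TIMER;  /* someone else queued it already */
    case ST_QUEUED:
        /* Two nodes queued the same job: one of them backs off. */
        return strcmp(sender_id, my_id) > 0 ? ACT_DEQUEUE : ACT_NONE;
    case ST_ACKED:
        return ACT_SEND_SETACK;        /* tell the sender about the ACK */
    }
    return ACT_NONE;
}
```

Using a deterministic ID comparison means that when two nodes both hold the job as QUEUED, only one side backs off, which fits the stated goal of avoiding multiple deliveries.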

Page 32

NEEDJOBS / YOURJOBS

[Diagram labels: KNOWN SOURCE, ANY OTHER NODE, NEEDJOBS, YOURJOBS]

Exponential delay + Broadcast & ad-hoc
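
The "exponential delay" between NEEDJOBS requests can be sketched as a capped doubling backoff. The slide gives no concrete numbers, so `base_ms` and `max_ms` are illustrative parameters and `needjobs_delay_ms` is a hypothetical name, not Disque's implementation.

```c
#include <stdint.h>

/* Delay before the next NEEDJOBS attempt: base_ms doubled once per
 * previous attempt and capped at max_ms. Parameters are illustrative. */
uint32_t needjobs_delay_ms(unsigned attempt, uint32_t base_ms,
                           uint32_t max_ms) {
    uint64_t d = base_ms; /* 64-bit so doubling cannot overflow */
    for (unsigned i = 0; i < attempt && d < max_ms; i++) d *= 2;
    return d > max_ms ? max_ms : (uint32_t)d;
}
```

With a 50 ms base and a 5000 ms cap, attempts 0, 1 and 3 wait 50, 100 and 400 ms, and later attempts settle at the cap.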

Page 33

NEEDJOBS triggers

• Clients blocked in GETJOB (and queues are empty)

• Queue drops to zero messages (and import rate > 0)
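
The two triggers can be written as predicates over a per-queue snapshot. The struct and function names below are hypothetical, and "import rate" is assumed to mean the recent rate of jobs this node received from other nodes for the queue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-queue snapshot used only for this sketch. */
typedef struct {
    size_t blocked_clients;  /* clients currently blocked in GETJOB */
    size_t queued_jobs;      /* jobs currently in the local queue */
    double import_rate;      /* assumed: jobs/sec imported from other nodes */
} queue_snapshot;

/* Trigger 1: a client is blocked and the queue has nothing to give. */
bool needjobs_on_blocked_client(const queue_snapshot *q) {
    return q->blocked_clients > 0 && q->queued_jobs == 0;
}

/* Trigger 2: the queue just drained to zero while jobs were being
 * imported, so other nodes probably still have some. */
bool needjobs_on_queue_drained(size_t previous_len, const queue_snapshot *q) {
    return previous_len > 0 && q->queued_jobs == 0 && q->import_rate > 0;
}
```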

Page 34

Message owners

Each node has, for each message, a list* of owners

* a possibly inconsistent list

Page 35

Ehm… some C code.

/* Job representation in memory. */

typedef struct job {

char id[JOB_ID_LEN]; /* Job ID. */

unsigned int state:4; /* Job state: one of JOB_STATE_* states. */

unsigned int gc_retry:4; /* GC attempts counter, for exponential delay. */

uint8_t flags; /* Job flags. */

uint16_t repl; /* Replication factor. */

uint32_t etime; /* Job expire time. */

uint64_t ctime; /* Job creation time in ms+counter. */

uint32_t delay; /* Delay before queueing this job for the first time. */

uint32_t retry; /* Job re-queue time. */

uint16_t num_nacks; /* Number of NACKs this node observed. */

uint16_t num_deliv; /* Number of deliveries this node observed. */

Immutable, converging, inconsistent

Page 36

Ehm… some C code.

robj *queue; /* Job queue name. */

sds body; /* Body, or NULL if job is just an ACK. */

dict *nodes_delivered; /* Nodes that may have a copy. */

dict *nodes_confirmed; /* Nodes that confirmed copy or ack. */

mstime_t qtime; /* Next queue time */

mstime_t awakeme; /* Time at which we need to take actions. */

} job;

Page 37

github.com/antirez/disque