The Zen of High Performance Messaging with NATS Waldemar Quevedo / @wallyqs Strange Loop 2016

Jan 22, 2017

Page 1: The Zen of High Performance Messaging with NATS (Strange Loop 2016)

The Zen of High Performance Messaging with NATS

Waldemar Quevedo / @wallyqs

Strange Loop 2016

ABOUT
Waldemar Quevedo / @wallyqs
Software Developer at Apcera in SF
Development of the Apcera Platform
Past: PaaS DevOps at Rakuten in Tokyo
NATS client maintainer (Ruby, Python)

ABOUT THIS TALK
What is NATS
Design from NATS
Building systems with NATS

What is NATS?

NATS
High Performance Messaging System
Created by Derek Collison
First written in Ruby in 2010
  Originally built for Cloud Foundry
Rewritten in Go in 2012
  Better performance
Open Source, MIT License
  https://github.com/nats-io

THANKS GO TEAM

Small binary → Lightweight Docker image

No deployment dependencies ✌

Acts as an always available dial-tone

Performance

single byte message

Around 10M messages/second

MUCH BETTER BENCHMARK

From @tyler_treat's awesome blog (2014)
http://bravenewgeek.com/dissecting-message-queues/

NATS = Performance + Simplicity

Design from NATS

NATS
Design constrained to keep it as operationally simple and reliable as possible while still being both performant and scalable.

Simplicity Matters!

Simplicity buys you opportunity.

- Rich Hickey, Cognitect

Link: https://www.youtube.com/watch?v=rI8tNMsozo0

LESS IS BETTER

Concise feature set (pure pub/sub)
No built-in persistence of messages
No exactly-once-delivery promises either
Those concerns are simplified away from NATS

THOUGHT EXERCISE

What could be the fastest, simplest and most reliable way of writing and reading to a socket to communicate with N nodes?

DESIGN
TCP/IP based
Plain text protocol
Pure pub/sub
  fire and forget
  at most once

PROTOCOL
PUB
SUB
UNSUB
MSG
PING
PONG
INFO
CONNECT
-ERR
+OK

EXAMPLE

Connecting to the public demo server…

telnet demo.nats.io 4222

INFO {"tls_required":false,"max_payload":1048576,...}

Optionally giving a name to the client

    connect {"name":"nats-strangeloop-client"}
    +OK

Pinging the server, we should get a pong back

    ping
    PONG

Not following the ping/pong interval results in the server disconnecting us.

    INFO {"auth_required":false,"max_payload":1048576,...}
    PING
    PING
    -ERR 'Stale Connection'
    Connection closed by foreign host.

Subscribe to the hello subject, identifying it with the arbitrary number 10

    sub hello 10
    +OK

Publishing on hello subject a payload of 5 bytes

    sub hello 10
    +OK
    pub hello 5
    world

Message received!

    telnet demo.nats.io 4222

    sub hello 10
    +OK
    pub hello 5
    world
    MSG hello 10 5
    world

Payload is opaque to the server!

It is just bytes, but could be json, msgpack, etc…
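
The framing used in the telnet session above can be sketched in Go with the standard library only. The helper name formatPub is hypothetical and not part of any NATS client; only the layout (control line with byte count, CRLF, payload, CRLF) follows the slides.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatPub builds the wire frame for a publish:
//   PUB <subject> [reply-to] <#bytes>\r\n<payload>\r\n
// The payload is copied as-is; the server treats it as opaque bytes.
func formatPub(subject, reply string, payload []byte) []byte {
	head := "PUB " + subject
	if reply != "" {
		head += " " + reply
	}
	head += " " + strconv.Itoa(len(payload)) + "\r\n"
	frame := append([]byte(head), payload...)
	return append(frame, '\r', '\n')
}

func main() {
	// Same message as in the telnet session: 5 bytes on subject "hello".
	fmt.Printf("%q\n", formatPub("hello", "", []byte("world")))
}
```

The byte count is what lets the payload carry anything, including CRLF sequences, without the server having to parse it.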

REQUEST/RESPONSE
is also pure pub/sub

PROBLEM
How can we send a request and expect a response back with pure pub/sub?

NATS REQUESTS
Initially, the client making the request creates a subscription with a unique identifier string:

    SUB _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 2
    +OK

NATS client libraries have helpers for generating these:

    nats.NewInbox()
    // => _INBOX.ioL1Ws5aZZf5fyeF6sAdjw

NATS REQUESTS
Then it expresses limited interest in the topic:

    SUB _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 2
    UNSUB 2 1

This tells the server to unsubscribe from the subscription with sid=2 after getting 1 message.

NATS REQUESTS
Then the request is published to a subject (help), tagging it with the ephemeral inbox just for the request to happen:

    SUB _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 2
    UNSUB 2 1
    PUB help _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 6
    please

NATS REQUESTS
Then if there is another subscriber connected and interested in the help subject, it will receive a message with that inbox:

    # Another client interested in the help subject
    SUB help 90

    # Receives from server a message
    MSG help 90 _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 6
    please

    # Can use that inbox to reply back
    PUB _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 11
    I can help!

NATS REQUESTS
Finally, if the client which sent the request is still connected and interested, it will receive that message:

    SUB _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 2
    UNSUB 2 1
    PUB help _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 6
    please
    MSG _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 2 11
    I can help!
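
On the receiving side, a client has to tell a plain message from a request by checking whether the MSG control line carries a reply subject. A minimal sketch, assuming the two layouts shown in the slides (with and without an inbox); the type and function names are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// msgHeader holds the fields of a MSG control line:
//   MSG <subject> <sid> [reply-to] <#bytes>
type msgHeader struct {
	Subject string
	Sid     string
	Reply   string // empty when the publisher set no reply subject
	Size    int    // number of payload bytes that follow the control line
}

// parseMsgHeader splits one control line (without the trailing CRLF).
func parseMsgHeader(line string) (msgHeader, error) {
	f := strings.Fields(line)
	if len(f) < 4 || len(f) > 5 || f[0] != "MSG" {
		return msgHeader{}, errors.New("not a MSG control line")
	}
	size, err := strconv.Atoi(f[len(f)-1])
	if err != nil || size < 0 {
		return msgHeader{}, errors.New("invalid payload size")
	}
	h := msgHeader{Subject: f[1], Sid: f[2], Size: size}
	if len(f) == 5 {
		h.Reply = f[3] // the inbox to publish the response to
	}
	return h, nil
}

func main() {
	h, err := parseMsgHeader("MSG help 90 _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 6")
	fmt.Println(h.Subject, h.Sid, h.Reply, h.Size, err)
}
```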

Simple Protocol == Simple Clients

Given the protocol is simple, NATS client libraries tend to have a very small footprint as well.

CLIENTS

RUBY

    require 'nats/client'

    NATS.start do |nc|
      nc.subscribe("hello") do |msg|
        puts "[Received] #{msg}"
      end

      nc.publish("hello", "world")
    end

GO

    nc, err := nats.Connect(nats.DefaultURL)
    // ...
    nc.Subscribe("hello", func(m *nats.Msg) {
        fmt.Printf("[Received] %s", m.Data)
    })
    nc.Publish("hello", []byte("world"))

MANY MORE AVAILABLE

C C# Java

Python NGINX Spring

Node.js Elixir Rust

Lua Erlang PHP

Haskell Scala Perl

Many thanks to the community!

ASYNCHRONOUS IO
Note: Most clients have asynchronous behavior

    nc, err := nats.Connect(nats.DefaultURL)
    // ...
    nc.Subscribe("hello", func(m *nats.Msg) {
        fmt.Printf("[Received] %s", m.Data)
    })
    for i := 0; i < 1000; i++ {
        nc.Publish("hello", []byte("world"))
    }
    // No guarantees of having sent the bytes yet!
    // They may still just be in the flushing queue.

ASYNCHRONOUS IO
In order to guarantee that the published messages have been processed by the server, we can do an extra ping/pong to confirm they were consumed:

    nc.Subscribe("hello", func(m *nats.Msg) {
        fmt.Printf("[Received] %s", m.Data)
    })
    for i := 0; i < 1000; i++ {
        nc.Publish("hello", []byte("world"))
    }
    // Do a PING/PONG roundtrip with the server.
    nc.Flush()

    SUB hello 1\r\nPUB hello 5\r\nworld\r\n..PING\r\n

Then flush the buffer and wait for PONG from server
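
Why publishes may not have left the client yet can be shown with a buffered writer from the standard library. This is only a sketch of the buffering half: a bytes.Buffer stands in for the TCP socket, publishAndFlush is a hypothetical name, and waiting for the server's PONG is not modelled.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// publishAndFlush queues n publishes in a client-side buffer, then appends
// PING and flushes. It reports how many bytes had reached the "socket"
// before and after the flush.
func publishAndFlush(n int) (before, after int) {
	var wire bytes.Buffer       // stands in for the TCP connection
	w := bufio.NewWriter(&wire) // default buffer size is 4096 bytes
	for i := 0; i < n; i++ {
		w.WriteString("PUB hello 5\r\nworld\r\n")
	}
	before = wire.Len() // small batches are still sitting in the buffer
	w.WriteString("PING\r\n")
	w.Flush()
	after = wire.Len()
	return before, after
}

func main() {
	before, after := publishAndFlush(3)
	fmt.Println("bytes on the wire before flush:", before)
	fmt.Println("bytes on the wire after flush:", after)
}
```

With larger batches the writer flushes on its own whenever its buffer fills, so only the tail of the batch is left waiting.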

ASYNCHRONOUS IO
Worst way of measuring NATS performance

    nc, _ := nats.Connect(nats.DefaultURL)
    msg := []byte("hi")
    nc.Subscribe("hello", func(_ *nats.Msg) {})
    for i := 0; i < 100000000; i++ {
        nc.Publish("hello", msg)
    }

The client is a slow consumer since it is not consuming the messages which the server is sending fast enough.

Whenever the server cannot flush bytes to a client fast enough, it will disconnect the client from the system, as this consuming pace could affect the whole service and the rest of the clients.

NATS Server is protecting itself
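
A toy model of that protection, under loud assumptions: the server is imagined to track a backlog of undelivered bytes per client and to drop the client once a limit is exceeded. The type, the limit and the numbers are invented for this sketch and do not describe how the real server is implemented.

```go
package main

import "fmt"

// outbound models the bytes a server has queued for one client connection.
// maxPending is a hypothetical limit chosen for this sketch.
type outbound struct {
	pending    int // bytes queued but not yet read by the client
	maxPending int
	closed     bool
}

// enqueue queues size bytes for the client. If the backlog would exceed the
// limit, the connection is marked closed and false is returned.
func (o *outbound) enqueue(size int) bool {
	if o.closed {
		return false
	}
	if o.pending+size > o.maxPending {
		o.closed = true // slow consumer: cut it loose to protect everyone else
		return false
	}
	o.pending += size
	return true
}

// drained records that the client has read n bytes from its socket.
func (o *outbound) drained(n int) {
	if n > o.pending {
		n = o.pending
	}
	o.pending -= n
}

func main() {
	c := &outbound{maxPending: 64}
	queued := 0
	for c.enqueue(16) { // the client never reads, so the backlog only grows
		queued++
	}
	fmt.Println("messages queued before disconnect:", queued)
}
```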

NATS = Performance + Simplicity + Resiliency

ALSO INCLUDED
Subject routing with wildcards
  Authorization
Distribution queue groups for balancing
Cluster mode for high availability
  Auto discovery of topology
Secure TLS connections with certificates
/varz monitoring endpoint
  used by nats-top

SUBJECTS ROUTING
Wildcards: *

    SUB foo.*.bar 90
    PUB foo.hello.bar 2
    hi
    MSG foo.hello.bar 90 2
    hi

e.g. subscribe to all NATS requests being made on the demo site:

    telnet demo.nats.io 4222
    INFO {"auth_required":false,"version":"0.9.4",...}

    SUB _INBOX.* 99
    MSG _INBOX.ioL1Ws5aZZf5fyeF6sAdjw 99 11
    I can help!

SUBJECTS ROUTING
Full wildcard: >

    SUB hello.> 90
    PUB hello.world.again 2
    hi
    MSG hello.world.again 90 2
    hi

Subscribe to all subjects and see the whole traffic going through the server:

    telnet demo.nats.io 4222
    INFO {"auth_required":false,"version":"0.9.4",...}
    sub > 1
    +OK
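
The two wildcards can be summarised as a small matching function. This is an illustrative sketch, not the server's routing code: it assumes subjects are dot-separated tokens, that * stands for exactly one token, and that > must be the last token and stands for one or more tokens.

```go
package main

import (
	"fmt"
	"strings"
)

// matchSubject reports whether subject matches a subscription pattern.
func matchSubject(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// Full wildcard: only valid as the last token, and it needs
			// at least one subject token left to match.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false // pattern is longer than the subject
		}
		if tok != "*" && tok != s[i] {
			return false // literal token mismatch
		}
	}
	return len(p) == len(s)
}

func main() {
	fmt.Println(matchSubject("foo.*.bar", "foo.hello.bar"))   // one-token wildcard
	fmt.Println(matchSubject("hello.>", "hello.world.again")) // full wildcard
	fmt.Println(matchSubject("_INBOX.*", "_INBOX.a.b"))       // too many tokens for *
}
```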

SUBJECTS AUTHORIZATION
Clients are not allowed to publish on _SYS for example:

    PUB _SYS.foo 2
    hi
    -ERR 'Permissions Violation for Publish to "_SYS.foo"'

SUBJECTS AUTHORIZATION
Can customize disallowing pub/sub on certain subjects via server config too:

    authorization {
      admin = { publish = ">", subscribe = ">" }
      requestor = {
        publish = ["req.foo", "req.bar"]
        subscribe = "_INBOX.*"
      }

      users = [
        {user: alice, password: foo, permissions: $admin}
        {user: bob, password: bar, permissions: $requestor}
      ]
    }

Building systems with NATS

FAMILIAR SCENARIO
Service A needs to talk to services B and C

FAMILIAR SCENARIO
Horizontally scaled…

COMMUNICATING WITHIN A DISTRIBUTED SYSTEM

Just use HTTP everywhere?
Use some form of point to point RPC?
What about service discovery and load balancing?
What if sub ms latency performance is required?

WHAT NATS GIVES US
publish/subscribe based low latency mechanism for communicating with 1 to 1, 1 to N nodes

An established TCP connection to a server

A dial tone

COMMUNICATING THROUGH NATS
Using NATS for internal communication

HA WITH NATS CLUSTER
Avoid SPOF on NATS by assembling a full mesh cluster

HA WITH NATS CLUSTER
Client reconnect logic is triggered

HA WITH NATS CLUSTER
Connecting to a NATS cluster of 2 nodes explicitly

    srvs := "nats://10.240.0.11:4222,nats://10.240.0.21:4223"
    nc, _ := nats.Connect(srvs)

Bonus: Cluster topology can be discovered dynamically too!

CLUSTER AUTO DISCOVERY

We can start with a single node…

Then have new nodes join the cluster…

As new nodes join, server announces INFO to clients.

Clients auto reconfigure to be aware of new nodes.

Now fully connected!

On failure, clients reconnect to an available node.
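
The client side of auto discovery can be sketched as a pool of known servers that grows when an INFO arrives and rotates on failure. Assumptions made for this sketch: the announced servers arrive in an INFO field named connect_urls, and the pool type and its methods are hypothetical, not the Go client's internals.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverPool tracks the servers a client knows about.
type serverPool struct {
	urls    []string
	current int // index of the server currently in use
}

// add appends servers that are not yet known.
func (p *serverPool) add(urls ...string) {
	for _, u := range urls {
		known := false
		for _, have := range p.urls {
			if have == u {
				known = true
				break
			}
		}
		if !known {
			p.urls = append(p.urls, u)
		}
	}
}

// onInfo merges the servers announced in an INFO payload into the pool.
// The "connect_urls" field name is an assumption of this sketch.
func (p *serverPool) onInfo(payload []byte) error {
	var info struct {
		ConnectURLs []string `json:"connect_urls"`
	}
	if err := json.Unmarshal(payload, &info); err != nil {
		return err
	}
	p.add(info.ConnectURLs...)
	return nil
}

// next returns the server to try after the current one fails (round-robin).
func (p *serverPool) next() string {
	if len(p.urls) == 0 {
		return ""
	}
	p.current = (p.current + 1) % len(p.urls)
	return p.urls[p.current]
}

func main() {
	pool := &serverPool{urls: []string{"10.240.0.11:4222"}}
	pool.onInfo([]byte(`{"connect_urls":["10.240.0.11:4222","10.240.0.21:4223"]}`))
	fmt.Println(pool.urls)
	fmt.Println("reconnect to:", pool.next())
}
```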

COMMUNICATING USING NATS

HEARTBEATS
For announcing liveness, services could publish heartbeats

HEARTBEATS → DISCOVERY
Heartbeats can also help with discovering services via wildcard subscriptions.

    nc, _ := nats.Connect(nats.DefaultURL)
    // SUB service.*.heartbeats 1\r\n
    nc.Subscribe("service.*.heartbeats", func(m *nats.Msg) {
        // Heartbeat from service received
    })

DISTRIBUTION QUEUES
Balance work among nodes randomly

DISTRIBUTION QUEUES
Service A workers subscribe to service.A and create the workers distribution queue group for balancing the work.

    nc, _ := nats.Connect(nats.DefaultURL)
    // SUB service.A workers 1\r\n
    nc.QueueSubscribe("service.A", "workers", func(m *nats.Msg) {
        nc.Publish(m.Reply, []byte("hi!"))
    })

DISTRIBUTION QUEUES
Note: NATS does not assume the audience!

DISTRIBUTION QUEUES
All interested subscribers receive the message
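
The delivery rule from the last few slides can be sketched as a function: every plain subscriber gets the message, and each queue group gets it once, on a randomly picked member. The types and names are hypothetical, and the pick function is injected so the random choice can be replaced.

```go
package main

import (
	"fmt"
	"math/rand"
)

// subscriber is one subscription on a subject; queue is empty for a plain
// subscription and holds the group name for a queue subscription.
type subscriber struct {
	id    string
	queue string
}

// recipients returns who receives one message: all plain subscribers, plus
// one member of each queue group chosen by pick(n), which must return an
// index in [0, n).
func recipients(subs []subscriber, pick func(n int) int) []string {
	var out []string
	groups := map[string][]string{}
	var order []string // queue groups in first-seen order
	for _, s := range subs {
		if s.queue == "" {
			out = append(out, s.id)
			continue
		}
		if _, ok := groups[s.queue]; !ok {
			order = append(order, s.queue)
		}
		groups[s.queue] = append(groups[s.queue], s.id)
	}
	for _, q := range order {
		members := groups[q]
		out = append(out, members[pick(len(members))])
	}
	return out
}

func main() {
	subs := []subscriber{
		{id: "monitor"}, // plain subscription, always receives
		{id: "worker-1", queue: "workers"},
		{id: "worker-2", queue: "workers"},
	}
	fmt.Println(recipients(subs, rand.Intn))
}
```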

LOWEST LATENCY RESPONSE
Service A communicating with fastest node from Service B

LOWEST LATENCY RESPONSE
NATS requests were designed exactly for this

    nc, _ := nats.Connect(nats.DefaultURL)
    t := 250 * time.Millisecond
    // Request sets to AutoUnsubscribe after 1 response
    msg, err := nc.Request("service.B", []byte("help"), t)
    if err == nil {
        fmt.Println(string(msg.Data))
        // => sure!
    }

    nc, _ := nats.Connect(nats.DefaultURL)

    nc.Subscribe("service.B", func(m *nats.Msg) {
        nc.Publish(m.Reply, []byte("sure!"))
    })

UNDERSTANDING NATS TIMEOUT
Note: Making a request involves establishing a client timeout.

    t := 250 * time.Millisecond
    _, err := nc.Request("service.A", []byte("help"), t)
    fmt.Println(err)
    // => nats: timeout

This needs special handling!

UNDERSTANDING NATS TIMEOUT
NATS is fire and forget, so the reason a client times out could be many things:

No one was connected at that time
  service unavailable
Service is actually still processing the request
  service took too long
Service was processing the request but crashed
  service error

CONFIRM AVAILABILITY OF SERVICE NODE WITH REQUEST

Each service node could have its own inbox

A request is sent to service.B to get a single response; the node that answers replies with its own inbox (no payload needed).
If there is no fast reply before the client times out, then most likely the service is unavailable for us at that time.

If there is a response, then use that inbox in a request

SUB _INBOX.123available 90

PUB _INBOX.123available _INBOX.456helpplease...

Summary

NATS is a simple, fast and reliable solution for the internal communication of a distributed system.

It chooses simplicity and reliability over guaranteed delivery.

Though this does not necessarily mean that guarantees of a system are constrained due to NATS!

→ https://en.wikipedia.org/wiki/End-to-end_principle

We can always build strong guarantees on top, but we can't always remove them from below.

Tyler Treat, Simple Solutions to Complex Problems

Replayability can be better than guaranteed delivery

Idempotency can be better than exactly once delivery

Commutativity can be better than ordered delivery

Related NATS project: NATS Streaming
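
One way to read the idempotency point: if applying a message twice has the same effect as applying it once, redelivery stops being a problem. A minimal sketch, assuming every message carries a unique ID chosen by the sender; the ledger type and the IDs are made up for illustration.

```go
package main

import "fmt"

// ledger applies each deposit at most once per message ID, so a message
// that is delivered again after a retry does no harm.
type ledger struct {
	balance int
	seen    map[string]bool
}

func newLedger() *ledger {
	return &ledger{seen: make(map[string]bool)}
}

// apply returns true if the deposit changed the balance and false if the
// message ID had already been processed.
func (l *ledger) apply(msgID string, amount int) bool {
	if l.seen[msgID] {
		return false
	}
	l.seen[msgID] = true
	l.balance += amount
	return true
}

func main() {
	l := newLedger()
	// The sender retries "m1" because it never saw a reply in time.
	for _, id := range []string{"m1", "m1", "m2"} {
		l.apply(id, 10)
	}
	fmt.Println(l.balance)
}
```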

REFERENCES
High Performance Systems in Go (Gophercon 2014), Derek Collison
Dissecting Message Queues (2014), Tyler Treat
Simplicity Matters (RailsConf 2012), Rich Hickey
Simple Solutions for Complex Problems (2016), Tyler Treat
End To End Argument (1984), J.H. Saltzer, D.P. Reed and D.D. Clark

THANKS!
github.com/nats-io / @nats_io

Play with the demo site!

telnet demo.nats.io 4222