@samnewman #geecon Surfing The Event Stream Sam Newman ThoughtWorks Sunday, 21 July 13
May 10, 2015

Operational Data
CPU
Disk IO
Memory Use
Threads

Collection & Display
• sar
• syslog
• collectd
• syslog-ng
• nagios
• ganglia

[Diagram: four servers]

Business Data
Orders Placed
Bounce Rate
Revenue
Fraud Cases

How did we handle them?
• Google Analytics
• Data Warehouse Systems
• Log files!

Something Happened!
What Should We Do?

http://blog.jgc.org/2006/05/what-slashdot-effect-looks-like.html

Fast
And Easy...
At Scale

Aggregation Is Key

Mark McGranaghan: "Logs as Data"
http://blip.tv/clojure/mark-mcgranaghan-logs-as-data-5953857

Paul Ingles: "Users as Data"
http://vimeo.com/45136211

Logstash + Graylog2

Graphite

www01.cpuUsage 42 1286269200
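
The line above is a data point in Graphite's plaintext format: metric path, value, Unix timestamp. A minimal Python sketch of producing and sending such a line follows. The function names are hypothetical, and the host and port are assumptions (2003 is the usual Carbon plaintext port, but check your own setup).

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point as '<path> <value> <unix timestamp>' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())  # default to "now", in whole seconds
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(line, host="localhost", port=2003):
    """Send one line over TCP. Host and port are assumptions for this sketch."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Rebuilds the data point shown on the slide (no network access needed).
print(graphite_line("www01.cpuUsage", 42, 1286269200), end="")
```

Calling `send_to_graphite(graphite_line("www01.cpuUsage", 42))` would push a live reading, assuming a Carbon listener is reachable at that address.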

???

Graphite
[Diagram: Graphite, App Server, Server, collectd, Yammer Metrics]

Volume!

Aggregation!

www01.cpuUsage 42 1286269200

orderplaced 1 1286269200
orderplaced 1 1286269200
orderplaced = 1
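
One reading of the slide above: a time-series store keeps a single value per metric per timestamp, so two orders reported in the same second collapse into one. The Python sketch below only simulates that behaviour with dictionaries; it is not Graphite's code, and both function names are hypothetical.

```python
def store_last_write_wins(lines):
    """Keep one value per (metric, timestamp); a later write replaces an earlier one."""
    series = {}
    for line in lines:
        path, value, timestamp = line.split()
        series[(path, int(timestamp))] = float(value)
    return series


def pre_aggregate(lines):
    """Sum values that share a (metric, timestamp) before they are stored."""
    totals = {}
    for line in lines:
        path, value, timestamp = line.split()
        key = (path, int(timestamp))
        totals[key] = totals.get(key, 0.0) + float(value)
    return totals


two_orders = ["orderplaced 1 1286269200", "orderplaced 1 1286269200"]
print(store_last_write_wins(two_orders))  # one order appears to be missing
print(pre_aggregate(two_orders))          # both orders are counted
```

Summing before storing is the job an aggregating layer takes on in front of the store.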

StatsD

Counters
ordersplaced:1|c

Timings
orderduration:140|ms
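
The two payloads above follow the StatsD line format `<name>:<value>|<type>`, with `c` for counters and `ms` for timings. A minimal Python sketch of building and sending them is shown here. The helper names are hypothetical, and the host and port are assumptions (8125 over UDP is the usual StatsD default).

```python
import socket


def counter(name, value=1):
    """Build a counter payload such as 'ordersplaced:1|c'."""
    return f"{name}:{value}|c"


def timing(name, millis):
    """Build a timing payload such as 'orderduration:140|ms'."""
    return f"{name}:{millis}|ms"


def send_statsd(payload, host="localhost", port=8125):
    """Fire-and-forget UDP send. Host and port are assumptions for this sketch."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload.encode("ascii"), (host, port))
    finally:
        sock.close()


print(counter("ordersplaced"))
print(timing("orderduration", 140))
```

Because the transport is UDP, the client does not wait for the server, which keeps instrumentation cheap in the request path.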

[Diagram: StatsD, two Clients, Graphite]

Riemann
[Diagram: Riemann, two Clients, Graphite]

(where (service "api req") (percentiles 5 [0.5 0.95 0.99] index))
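
The Riemann expression above asks for the 50th, 95th and 99th percentiles of a service's metrics over 5-second periods. The Python sketch below illustrates the idea only. The function name and the `(timestamp, metric)` event shape are hypothetical, and the rank selection used here is an assumption that may differ from Riemann's exact rule.

```python
import math


def window_percentiles(events, now, interval=5, points=(0.5, 0.95, 0.99)):
    """Pick percentile values from the metrics seen in the last `interval` seconds.

    `events` is a list of (timestamp, metric) pairs.
    """
    recent = sorted(metric for ts, metric in events if now - interval < ts <= now)
    if not recent:
        return {}
    n = len(recent)
    # For each point, select the metric at rank floor(n * point), capped at the last item.
    return {p: recent[min(n - 1, math.floor(n * p))] for p in points}


latencies = [(90, 999), (96, 10), (97, 20), (98, 30), (99, 40), (100, 50)]
print(window_percentiles(latencies, now=100))
```

The reading at timestamp 90 falls outside the window and is ignored, which is the point of a windowed aggregate: only recent traffic shapes the result.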
(def tell-ops (rollup 5 3600 (email "[email protected]")))
(streams (where (state "critical") tell-ops))
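
The config above limits alert email: `tell-ops` passes on at most 5 notifications per 3600 seconds and batches the overflow. The Python sketch below is a simplified model of that throttling, not Riemann's implementation. The `Rollup` class and its method are hypothetical, time is passed in explicitly, and held events are flushed when the next event arrives after the interval (an assumption; a real system would more likely flush on a timer).

```python
class Rollup:
    """Call `child` at most `limit` times per `interval` seconds, batching the rest."""

    def __init__(self, limit, interval, child):
        self.limit = limit
        self.interval = interval
        self.child = child          # receives a list of events
        self.window_start = None
        self.sent = 0               # calls made in the current window
        self.held = []              # events waiting for the next window

    def event(self, ev, now):
        if self.window_start is None:
            self.window_start = now
        elif now - self.window_start >= self.interval:
            # New window: reset the budget and flush held events as one batch.
            self.window_start = now
            self.sent = 0
            if self.held:
                batch, self.held = self.held, []
                self.child(batch)
                self.sent = 1
        if self.sent < self.limit:
            self.sent += 1
            self.child([ev])
        else:
            self.held.append(ev)


notifications = []
tell_ops = Rollup(limit=5, interval=3600, child=notifications.append)
for second in range(8):
    tell_ops.event(f"critical-{second}", now=second)
print(len(notifications))  # notifications sent so far; later events are held
```

The effect is that a flapping service produces a handful of messages and then a digest, rather than one email per event.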
(let [client (tcp-client :host "aggregator")] (by [:host :service] (changed :state (forward client))))
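
The config above forwards an event to a downstream "aggregator" only when the state changes for a given host and service pair. The Python sketch below shows that filtering logic under stated assumptions: events are plain dictionaries, the function names are hypothetical, and the first event seen for each pair counts as a change.

```python
def make_state_change_forwarder(forward):
    """Return a handler that calls `forward` only when a (host, service) state changes."""
    last_state = {}

    def handle(event):
        key = (event["host"], event["service"])
        if last_state.get(key) != event["state"]:
            last_state[key] = event["state"]
            forward(event)

    return handle


forwarded = []
handle = make_state_change_forwarder(forwarded.append)
for host, state in [("web1", "ok"), ("web1", "ok"), ("web1", "critical"), ("web2", "ok")]:
    handle({"host": host, "service": "api", "state": state})
print([(e["host"], e["state"]) for e in forwarded])
```

Forwarding transitions rather than every event keeps traffic between servers low while the downstream server still sees each change of state.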

[Diagram: Riemann Servers, each with two Clients]

So What Do We Have?
[Diagram: Servers, Graphite, Graylog 2]
[Diagram: Servers, Graphite, Graylog 2, Dashboards A, B and C]
[Diagram: Servers, StatsD/Riemann, Graylog 2, Graphite, Dashboards A, B and C]

http://shopify.github.io/dashing/

Realtime Aggregator
Data is lost!

Real-time metrics require upfront knowledge

Realtime Aggregator
Lossless Event Store
Hadoop
HBase
Cassandra
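
The contrast on these slides can be sketched in a few lines of Python: an aggregator keeps only the totals it was told to keep, while a lossless store keeps the raw events and can answer questions nobody thought of up front. All class names, function names and event fields below are hypothetical, and an in-memory list stands in for a real store such as the ones named above.

```python
from collections import Counter


class RealtimeAggregator:
    """Keeps running counts per event type; the raw events are discarded."""

    def __init__(self):
        self.counts = Counter()

    def record(self, event):
        self.counts[event["type"]] += 1


class EventStore:
    """Append-only log of raw events that can be replayed later."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(dict(event))

    def replay(self, fold, initial):
        result = initial
        for event in self._events:
            result = fold(result, event)
        return result


def revenue_by_country(totals, event):
    """A question defined after the fact, answered by replaying the log."""
    if event["type"] == "order":
        totals[event["country"]] = totals.get(event["country"], 0) + event["revenue"]
    return totals


aggregator, store = RealtimeAggregator(), EventStore()
for ev in [
    {"type": "order", "country": "PL", "revenue": 30},
    {"type": "order", "country": "UK", "revenue": 20},
    {"type": "order", "country": "PL", "revenue": 5},
]:
    aggregator.record(ev)
    store.append(ev)

print(aggregator.counts["order"])             # the count survives, the detail does not
print(store.replay(revenue_by_country, {}))   # detail recovered from raw events
```

The aggregator answers its one question immediately; the store answers any question, but only after a replay.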

[Diagram: Riemann Server, two Clients, Lossless Event Store]

Event Sourcing

But...

Can I have one view?
Lossless Event Store
Realtime Aggregator

http://nathanmarz.com/

Lossless Event Store
Realtime Aggregator
Lambda Architecture
Unified Query
Consistent, but out of date
Up to date, but only for a small window
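
A unified query in this style combines the two sources: a batch view that is consistent but only current up to some cutoff, and a real-time view that covers the recent window. The Python sketch below is a minimal illustration under stated assumptions. The function name, the dictionary shapes and the idea of a single `batch_cutoff` timestamp are hypothetical, not taken from the talk.

```python
def unified_count(batch_view, realtime_view, metric, batch_cutoff):
    """Total for `metric`: batch result up to the cutoff plus real-time counts after it.

    batch_view:    {metric: total computed by the last batch run}
    realtime_view: {metric: [(timestamp, count), ...]} for a recent window only
    batch_cutoff:  timestamp of the newest event included in the batch run
    """
    batch_total = batch_view.get(metric, 0)
    recent_total = sum(
        count for ts, count in realtime_view.get(metric, []) if ts > batch_cutoff
    )
    return batch_total + recent_total


batch_view = {"orderplaced": 1000}
realtime_view = {"orderplaced": [(1286269100, 3), (1286269260, 2), (1286269320, 4)]}
print(unified_count(batch_view, realtime_view, "orderplaced", batch_cutoff=1286269200))
```

Real-time entries at or before the cutoff are skipped so that events already covered by the batch run are not counted twice.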

The Future?
[Diagram: Servers, Aggregating Relay, Graphite, Graylog 2, Hadoop, Unified Query]

All Your Data
In Realtime

Find and free your data
Start simple
Create different views for different stakeholders
Don’t be scared of real-time!