Monitoring Challenges Monitorama June 2016 Adrian Cockcroft @adrianco

Monitoring Challenges - Monitorama 2016 - Monitoringless

Apr 16, 2017

Page 1: Monitoring Challenges - Monitorama 2016 - Monitoringless

Monitoring Challenges

Monitorama June 2016 Adrian Cockcroft

@adrianco

Page 2: Monitoring Challenges - Monitorama 2016 - Monitoringless

What does @adrianco do?

@adrianco

Technology Due Diligence on Deals
Presentations at Companies and Conferences
Tech and Board Advisor
Support for Portfolio Companies
Consulting and Training
Networking with Interesting People
Tinkering with Technologies
Vendor Relationships

Previously: Netflix, eBay, Sun Microsystems, Cambridge Consultants, City University London - BSc Applied Physics

Page 3: Monitoring Challenges - Monitorama 2016 - Monitoringless

Monitorama 2014…

Page 4: Monitoring Challenges - Monitorama 2016 - Monitoringless

Monitorama 2016

What problems does monitoring address?
Why isn’t this a solved problem already?
Who gets disrupted by what?
Stuff I’ve been tinkering with

Page 5: Monitoring Challenges - Monitorama 2016 - Monitoringless

Measuring business value

Problem detection and diagnosis

Page 6: Monitoring Challenges - Monitorama 2016 - Monitoringless

“Ultimately business value is what the business values, and that is that.”

Mark Schwartz CIO DHS/DCIS

Page 7: Monitoring Challenges - Monitorama 2016 - Monitoringless

Business Value of Monitoring

Customer happiness
Cost efficiency
Safety and security
Compliance

Page 9: Monitoring Challenges - Monitorama 2016 - Monitoringless

Customer Happiness

Time to value
Availability
Response time

Page 10: Monitoring Challenges - Monitorama 2016 - Monitoringless

Cost Efficiency

Utilization
Optimization
Automation

Page 11: Monitoring Challenges - Monitorama 2016 - Monitoringless

Why isn’t this a solved problem already?

Page 13: Monitoring Challenges - Monitorama 2016 - Monitoringless

Why isn’t there one standard for monitoring?

We tried that once, immediately obsoleted by the rise of Windows NT

X/Open Universal Measurement Architecture - 1997 http://pubs.opengroup.org/onlinepubs/009657299/c427-1/front.htm

Page 23: Monitoring Challenges - Monitorama 2016 - Monitoringless

Monitoring Evolution Challenges

1970s Mainframes
1980s Minicomputers
1990s Unix Servers
2000s Windows on x86
2000s Linux on x86
2000s VMware on blades
2010s Public cloud
2010s Containers
2010s Serverless

Platform - Entities - Hierarchy
Interfaces - Metrics - Schema
Scale - Ephemerality

Different vendors and tools in each generation…

Page 24: Monitoring Challenges - Monitorama 2016 - Monitoringless

Why don’t monitoring vendors adapt and survive?

Page 34: Monitoring Challenges - Monitorama 2016 - Monitoringless

Waves of disruption

Cost per node drops
Revenue opportunity decreases

Illustrative order of magnitude costs:
$Millions
$1M
$100K
$10K
$5K
$1K per core
$100s per month
$10s per month
$1s per month

New vendors have new schemas, an order of magnitude lower cost per node, and many more shorter-lived nodes to monitor

Page 35: Monitoring Challenges - Monitorama 2016 - Monitoringless

Vendor Landscape

Page 37: Monitoring Challenges - Monitorama 2016 - Monitoringless

A Tragic Quadrant

[Figure: quadrant plot. Axes: Ability to scale; Ability to handle rapidly changing microservices. Tick labels: 100s, 1,000s, 10,000s, 100,000s; Datacenter, Cloud, Containers, Lambda. Regions: In-house tools at web scale companies; Most current monitoring & APM tools; Next generation APM; Next generation Monitoring.]

Vendors - tell me where you belong on this plot…

Page 38: Monitoring Challenges - Monitorama 2016 - Monitoringless

Tinkering

Page 39: Monitoring Challenges - Monitorama 2016 - Monitoringless

Simulated Microservices

Model and visualize microservices
Simulate interesting architectures
Generate large scale configurations
Stress test real monitoring tools

Code: github.com/adrianco/spigo
Simulate Protocol Interactions in Go
Simian Army Visualizations

ELB Load Balancer
Zuul API Proxy
Karyon Business Logic
Staash Data Access Layer
Priam Cassandra Datastore
Three Availability Zones
Denominator DNS Endpoint

Page 40: Monitoring Challenges - Monitorama 2016 - Monitoringless

Zipkin Trace for one Spigo Flow

Page 41: Monitoring Challenges - Monitorama 2016 - Monitoringless

Response Times

Page 42: Monitoring Challenges - Monitorama 2016 - Monitoringless

Guesstimate

See http://www.getguesstimate.com/models/1307

Page 43: Monitoring Challenges - Monitorama 2016 - Monitoringless

Guesstimate

Hit rates: memcached 40%, mysql 70%

[Figure: Guesstimate model. Cells: memcached hit %, memcached response, mysql response, service cpu time, memcached hit mode, mysql cache hit mode, mysql disk access mode.]
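The three modes named on this slide (memcached hit, mysql cache hit, mysql disk access) can be read as a mixture model. The following is a minimal Monte Carlo sketch of that reading, not the Guesstimate model itself. Only the 40% and 70% hit rates come from the slide; the latency ranges, the function names, and the assumption that the mysql hit rate applies only to memcached misses are hypothetical.

```python
import random


def simulate_response_ms(rng, memcached_hit=0.40, mysql_cache_hit=0.70):
    """Draw one request and return (response time in ms, mode name).

    Hit rates default to the slide's values. All latency ranges below are
    hypothetical placeholders chosen only to make the modes distinct.
    """
    service_cpu = rng.uniform(0.5, 1.5)  # hypothetical service cpu time
    if rng.random() < memcached_hit:
        return service_cpu + rng.uniform(0.2, 1.0), "memcached hit"
    # Assumption: mysql is consulted only when memcached misses.
    if rng.random() < mysql_cache_hit:
        return service_cpu + rng.uniform(1.0, 5.0), "mysql cache hit"
    return service_cpu + rng.uniform(5.0, 50.0), "mysql disk access"


def mode_fractions(samples=100_000, seed=1):
    """Estimate how often each mode occurs across many simulated requests."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(samples):
        _, mode = simulate_response_ms(rng)
        counts[mode] = counts.get(mode, 0) + 1
    return {mode: count / samples for mode, count in counts.items()}


if __name__ == "__main__":
    for mode, fraction in sorted(mode_fractions().items()):
        print(f"{mode}: {fraction:.3f}")
```

Under these assumptions the expected split is 0.40 for memcached hits, 0.60 × 0.70 = 0.42 for mysql cache hits, and 0.60 × 0.30 = 0.18 for mysql disk accesses, so the slowest mode still serves nearly a fifth of requests.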

Page 44: Monitoring Challenges - Monitorama 2016 - Monitoringless

Spigo Histogram Results

Response time distribution measured in nanoseconds using High Dynamic Range Histogram

name: storage.*.*..load00...load.denominator_serv (service time for load generator)
quantiles: [{50 47103} {99 139263}] (median and 99th percentile values)

From    To      Count  Prob    Bar
20480   21503   2      0.0007  :
21504   22527   2      0.0007  |
23552   24575   1      0.0003  :
24576   25599   5      0.0017  |
25600   26623   5      0.0017  |
26624   27647   1      0.0003  |
27648   28671   3      0.0010  |
28672   29695   5      0.0017  |
29696   30719   127    0.0421  |####
30720   31743   126    0.0418  |####
31744   32767   74     0.0246  |##
32768   34815   281    0.0932  |#########
34816   36863   201    0.0667  |######
36864   38911   156    0.0518  |#####
38912   40959   185    0.0614  |######
40960   43007   147    0.0488  |####
43008   45055   161    0.0534  |#####
45056   47103   125    0.0415  |####
47104   49151   135    0.0448  |####
49152   51199   99     0.0328  |###
51200   53247   82     0.0272  |##
53248   55295   77     0.0255  |##
55296   57343   66     0.0219  |##
57344   59391   54     0.0179  |#
59392   61439   37     0.0123  |#
61440   63487   45     0.0149  |#
63488   65535   33     0.0109  |#
65536   69631   63     0.0209  |##
69632   73727   98     0.0325  |###
73728   77823   92     0.0305  |###
77824   81919   112    0.0372  |###
81920   86015   88     0.0292  |##
86016   90111   55     0.0182  |#
90112   94207   38     0.0126  |#
94208   98303   51     0.0169  |#
98304   102399  32     0.0106  |#
102400  106495  35     0.0116  |#
106496  110591  17     0.0056  |
110592  114687  19     0.0063  |
114688  118783  18     0.0060  |
118784  122879  6      0.0020  |
122880  126975  8      0.0027  |

Prob: normalized probability
:# Zero counts skipped
|# Contiguous buckets
Peaks labeled on the slide: Cache hit, Cache miss
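The median on this slide, 47103, coincides with the upper edge of the 45056 to 47103 bucket, which suggests (but does not confirm) that quantiles are reported as bucket upper bounds. The sketch below shows one way to read a quantile from bucketed counts under that assumption. It is not Spigo's code and does not use any HDR Histogram library; the function name and the toy bucket data are hypothetical.

```python
def quantile_upper_bound(buckets, q):
    """Return the upper edge of the bucket where the cumulative count first
    reaches the fraction q of all samples.

    buckets: list of (low, high, count) tuples sorted by low.
    q: quantile as a fraction between 0 and 1.
    """
    total = sum(count for _low, _high, count in buckets)
    if total == 0:
        raise ValueError("empty histogram")
    target = q * total
    cumulative = 0
    for _low, high, count in buckets:
        cumulative += count
        if cumulative >= target:
            return high
    # Only reachable through floating point rounding; fall back to the last edge.
    return buckets[-1][1]


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the slide.
    toy = [(0, 9, 1), (10, 19, 3), (20, 29, 5), (30, 39, 1)]
    print(quantile_upper_bound(toy, 0.50))
    print(quantile_upper_bound(toy, 0.99))
```

Reporting a bucket edge rather than an interpolated value means the answer is only as precise as the bucket width, which is why the wider buckets at higher response times give coarser percentiles.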

Page 45: Monitoring Challenges - Monitorama 2016 - Monitoringless

Serverless

Page 46: Monitoring Challenges - Monitorama 2016 - Monitoringless

Serverless

AWS Lambda - lots of production examples
Google Cloud Functions, Azure Functions alpha launched
IBM OpenWhisk - open source
Startup activity: iron.io, serverless.com, apex.run toolkit

Page 49: Monitoring Challenges - Monitorama 2016 - Monitoringless

Monitorless Architecture

[Diagram components: API Gateway, Kinesis, S3, DynamoDB]

Monitorable entities only exist during an execution trace

Page 50: Monitoring Challenges - Monitorama 2016 - Monitoringless

AWS Lambda Reference Arch http://www.allthingsdistributed.com/2016/05/aws-lambda-serverless-reference-architectures.html

Page 51: Monitoring Challenges - Monitorama 2016 - Monitoringless

Serverless Programming Model

Event driven functions
Role based permissions
Whitelisted API based security
Good for simple single threaded code
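As an illustration of an event driven function, here is a minimal sketch in the style of an AWS Lambda Python handler reacting to an S3 notification. The two-argument `handler(event, context)` shape and the `Records[].s3.bucket.name` / `Records[].s3.object.key` fields follow the S3 event format; the bucket and key values, and what the function does with them, are hypothetical. It runs locally with a synthetic event and needs no AWS SDK.

```python
def handler(event, context):
    """Called once per event; keeps no state between invocations."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Hypothetical work: just record which object triggered the call.
            objects.append(f"{bucket}/{key}")
    return {"processed": len(objects), "objects": objects}


if __name__ == "__main__":
    # Synthetic event for local testing; names are made up.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"},
                    "object": {"key": "uploads/report.csv"}}}
        ]
    }
    print(handler(sample_event, None))
```

The function only exists as a running entity while it handles an event, so anything worth measuring has to be emitted during the call itself.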

Page 52: Monitoring Challenges - Monitorama 2016 - Monitoringless

Serverless Cost Efficiencies

100% useful work, no agents or overheads
100% utilization, no charge between requests
No need for extra capacity for peak traffic
Anecdotal costs ~1% of conventional system
Ideal for low traffic, Corp IT, spiky workloads
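The "no charge between requests" point can be made concrete with a small cost comparison. This is a sketch under stated assumptions: both pricing models are simplified, every price and workload figure is a hypothetical parameter rather than a quoted rate, and the function names are invented for illustration. It does not reproduce the anecdotal ~1% figure.

```python
def serverless_monthly_cost(requests, seconds_per_request, memory_gb,
                            price_per_request, price_per_gb_second):
    """Pay per invocation plus per GB-second of execution; idle time is free."""
    compute_gb_seconds = requests * seconds_per_request * memory_gb
    return requests * price_per_request + compute_gb_seconds * price_per_gb_second


def always_on_monthly_cost(instances, price_per_instance_hour, hours_per_month=730):
    """Pay for every hour an instance exists, whether or not it serves traffic."""
    return instances * price_per_instance_hour * hours_per_month


if __name__ == "__main__":
    # Hypothetical low-traffic workload and hypothetical prices.
    serverless = serverless_monthly_cost(
        requests=1_000_000, seconds_per_request=0.1, memory_gb=0.5,
        price_per_request=0.000001, price_per_gb_second=0.00001)
    always_on = always_on_monthly_cost(instances=2, price_per_instance_hour=0.1)
    print(f"serverless: {serverless:.2f}, always-on: {always_on:.2f}, "
          f"ratio: {serverless / always_on:.3%}")
```

The gap narrows as traffic rises, because the per-request charge grows with load while the always-on cost stays flat until more instances are needed.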

Page 53: Monitoring Challenges - Monitorama 2016 - Monitoringless

Serverless Work in Progress

Tooling for ease of use
Multi-region HA/DR patterns
Debugging and testing frameworks
Monitoring, end to end tracing
Using AWS Lambda to monitor AWS

Page 54: Monitoring Challenges - Monitorama 2016 - Monitoringless

DIY On-Premise Serverless Operating Challenges

Scheduling and startup latency
Execution and monitoring overhead
Charging model
Capacity planning

Page 55: Monitoring Challenges - Monitorama 2016 - Monitoringless

Monitoring Challenges

Too much new stuff
Too ephemeral
Price disruption

Page 57: Monitoring Challenges - Monitorama 2016 - Monitoringless

Thanks!

Also speaking at: Docker Portland Meetup, Wednesday evening @Puppetlabs - Microservices: What's Missing

Page 58: Monitoring Challenges - Monitorama 2016 - Monitoringless

Security
Enterprise IT
Operations & Management
Big Data
Compute
Networking
Storage

Palo Alto Networks

Visit http://www.battery.com/our-companies/ for a full list of all portfolio companies in which all Battery Funds have invested.