Performance monitoring for Docker - Lucerne meetup

Apr 16, 2017

Stijn Polfliet
Transcript
Page 1: Performance monitoring for Docker - Lucerne meetup

Performance Monitoring for Docker environments

Monitoring Docker

Anomaly detection

Live demo

Page 2: Performance monitoring for Docker - Lucerne meetup

About me

@coscale

www.coscale.com

@spolfliet

[email protected]

Page 3: Performance monitoring for Docker - Lucerne meetup
Page 4: Performance monitoring for Docker - Lucerne meetup
Page 5: Performance monitoring for Docker - Lucerne meetup

Container monitoring challenges

• Scale & dynamic behavior:

Number of containers >> number of servers

Containers come and go at a much faster pace

• Diversity:

Different application technologies

Overload of metrics to monitor and alert on

Page 6: Performance monitoring for Docker - Lucerne meetup

Monolithic application monitoring

(Virtualized) OS: System / Infrastructure monitoring

Application: Application performance monitoring (APM)

End user: Real user monitoring (RUM)

Page 7: Performance monitoring for Docker - Lucerne meetup

Microservices monitoring

(Virtualized) OS: System / Infrastructure monitoring

Containers (each running an application component): Container monitoring + In-container application monitoring

End user: Real user monitoring (RUM)

Page 8: Performance monitoring for Docker - Lucerne meetup

What to monitor?

Hosts (CPU, memory, disk)

Orchestrator (services, volumes, replication controllers, …)

Containers (CPU, memory, disk, network, ...)

Container internals (application, database, caching, etc.)

Impact on user and application performance

Lightweight monitoring for a lightweight microservices environment

Page 9: Performance monitoring for Docker - Lucerne meetup

Docker API

Docker stats API

$ docker stats
CONTAINER      CPU %   MEM USAGE / LIMIT     MEM %   NET I/O            BLOCK I/O
1285939c1fd3   0.07%   796 KiB / 64 MiB      1.21%   788 B / 648 B      3.568 MB / 512 KB
9c76f7834ae2   0.07%   2.746 MiB / 64 MiB    4.29%   1.266 KB / 648 B   12.4 MB / 0 B
d1ea048f04e4   0.03%   4.583 MiB / 64 MiB    6.30%   2.854 KB / 648 B   27.7 MB / 0 B
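The percentages that docker stats prints can also be derived from the raw JSON returned by the Docker Engine stats endpoint. A minimal sketch follows, assuming a stats document with the cpu_stats, precpu_stats and memory_stats fields. The function names and the sample values are hypothetical and only illustrate the arithmetic. The memory figure is simplified and does not subtract page cache.

```python
def cpu_percent(stats):
    """CPU usage in percent, from two consecutive samples in one stats document."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if system_delta <= 0 or cpu_delta < 0:
        return 0.0  # first sample or counter reset: nothing meaningful to report
    online_cpus = cpu.get("online_cpus") or 1
    return cpu_delta / system_delta * online_cpus * 100.0


def memory_percent(stats):
    """Memory usage as a percentage of the container limit (cache not subtracted)."""
    mem = stats["memory_stats"]
    return mem["usage"] / mem["limit"] * 100.0


# Hypothetical sample, shaped like the stats JSON; the values are made up.
sample = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 200_000_000},
        "system_cpu_usage": 20_000_000_000,
        "online_cpus": 2,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 100_000_000},
        "system_cpu_usage": 10_000_000_000,
    },
    "memory_stats": {"usage": 796 * 1024, "limit": 64 * 1024 * 1024},
}

print(round(cpu_percent(sample), 2), round(memory_percent(sample), 2))
```

The memory sample reuses the 796 KiB / 64 MiB figures from the first row of the docker stats output, which gives roughly the 1.21% shown there.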

Page 10: Performance monitoring for Docker - Lucerne meetup

cAdvisor

docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --publish=8080:8080 \
  --detach=true \
  --name=cadvisor \
  google/cadvisor:latest

open http://<your-hostname>:8080/

Page 11: Performance monitoring for Docker - Lucerne meetup

Datadog (datadoghq.com)

agent runs in 1 container or on host

container resource usage

basic application monitoring

$15 / month / server

Page 12: Performance monitoring for Docker - Lucerne meetup

Sysdig (sysdig.com)

kernel module captures system calls

container resource usage

basic application monitoring

Page 13: Performance monitoring for Docker - Lucerne meetup

APM vendors

Heavyweight, deep application monitoring

Designed for monolithic applications in a specific programming language

Too many dynamic metrics to handle with static alerts

Putting an agent inside a container is an anti-pattern

$100+ / month / server

Page 14: Performance monitoring for Docker - Lucerne meetup

Prometheus

Open source

● Extra work in setting up, maintaining, and supporting

● Generic tools, no specific container or cluster visualizations

● No Real User Monitoring

● No out-of-the-box anomaly detection and predictive analytics

Page 15: Performance monitoring for Docker - Lucerne meetup

Performance Monitoring for Docker environments

Anomaly detection

Page 16: Performance monitoring for Docker - Lucerne meetup

Anomaly: definition

Page 17: Performance monitoring for Docker - Lucerne meetup

Static alerts

TODO : more realistic business examples!

Page 18: Performance monitoring for Docker - Lucerne meetup

Static alert limitations

seasonality

correlations

changing or dynamic environment

Page 19: Performance monitoring for Docker - Lucerne meetup

Challenges

statistical significance ⇏ relevance

Page 20: Performance monitoring for Docker - Lucerne meetup

Simple technique: 3-σ rule

Exponential smoothing: α=0.03, z=3
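One way to read this slide is as an exponentially weighted mean and variance, with a point flagged when it lies more than z standard deviations from the smoothed mean. The sketch below is an illustration under that assumption, not necessarily the exact procedure behind the slides. The function name, the warmup parameter and the sample series are hypothetical.

```python
import math


def ewma_anomalies(values, alpha=0.03, z=3.0, warmup=30):
    """Flag points further than z smoothed standard deviations from the smoothed mean."""
    mean = values[0]
    var = 0.0
    flags = [False]
    for i in range(1, len(values)):
        x = values[i]
        std = math.sqrt(var)
        # Judge the new point against the model built from earlier points only.
        flags.append(i >= warmup and std > 0 and abs(x - mean) > z * std)
        # Incremental update of the exponentially weighted mean and variance.
        diff = x - mean
        mean += alpha * diff
        var = (1 - alpha) * (var + alpha * diff * diff)
    return flags


# Hypothetical metric: alternates between 10 and 11, then spikes to 50.
series = [10 + (i % 2) for i in range(200)] + [50]
flags = ewma_anomalies(series)
print(flags[-1])
```

Note that this sketch keeps updating the model with flagged points as well; a production detector would probably want to treat those differently.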

Page 21: Performance monitoring for Docker - Lucerne meetup
Page 22: Performance monitoring for Docker - Lucerne meetup
Page 23: Performance monitoring for Docker - Lucerne meetup

Does not work with seasonal data

Page 24: Performance monitoring for Docker - Lucerne meetup

Holt-Winters

● seasonal exponential smoothing

● works quite well on ‘laboratory data’

● calculation of prediction intervals relies on a normal distribution after removal of seasonality

● ⇒ generates too many false positives on our real-world seasonal data
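For reference, a minimal sketch of additive Holt-Winters smoothing (level, trend and one seasonal term per position in the period). The function name, the smoothing constants and the initialization from the first two periods are assumptions chosen for illustration; the slides do not say which variant or parameters were used.

```python
def holt_winters_forecasts(series, period, alpha=0.3, beta=0.05, gamma=0.2):
    """One-step-ahead additive Holt-Winters forecasts for series[period:]."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods to initialize")
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / (period * period)
    season = [x - level for x in first]
    forecasts = []
    for t in range(period, len(series)):
        s = season[t % period]
        forecasts.append(level + trend + s)  # forecast made before seeing series[t]
        prev_level = level
        level = alpha * (series[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % period] = gamma * (series[t] - level) + (1 - gamma) * s
    return forecasts


# Hypothetical, perfectly periodic 'laboratory data' with one injected spike.
pattern = [10, 20, 30, 20]
clean = pattern * 6
spiked = list(clean)
spiked[17] = 100

clean_forecasts = holt_winters_forecasts(clean, period=4)
spiked_forecasts = holt_winters_forecasts(spiked, period=4)
print(spiked[17] - spiked_forecasts[17 - 4])
```

On clean periodic input the forecasts track the data exactly, so the spike stands out as a large residual. Turning residuals into alerts still needs a threshold or prediction interval, which is where the normality assumption mentioned above comes in.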

Page 25: Performance monitoring for Docker - Lucerne meetup

Sliding window approach

model

evaluation of new data

Page 26: Performance monitoring for Docker - Lucerne meetup
Page 27: Performance monitoring for Docker - Lucerne meetup
Page 28: Performance monitoring for Docker - Lucerne meetup
Page 29: Performance monitoring for Docker - Lucerne meetup
Page 30: Performance monitoring for Docker - Lucerne meetup
Page 31: Performance monitoring for Docker - Lucerne meetup

Local outlier factor

Existing instance-based machine learning technique (lazy, ~kNN)

Based on the concept of local density

local outlier factor(A) = (average density of the kNN of point A) / (density at point A)

LOF >> 1 ⇒ outlier

en.wikipedia.org/wiki/Local_outlier_factor
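A small from-scratch sketch of the local outlier factor, following the definition on the Wikipedia page referenced above: local reachability density per point, then the ratio of the neighbours' average density to the point's own density. The function name and sample points are hypothetical. The sketch assumes no duplicate points, takes exactly k neighbours even when distances tie, and uses a brute-force distance matrix, so it is only suitable for small inputs.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, using its k nearest neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbours = []
    k_distance = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        knn = order[:k]
        neighbours.append(knn)
        k_distance.append(dist[i][knn[-1]])  # distance to the k-th neighbour
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: average density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / len(neighbours[i]) / lrd[i]
        for i in range(n)
    ]


# Hypothetical data: a tight cluster plus one far-away point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = lof_scores(points, k=3)
print([round(s, 2) for s in scores])
```

Points inside the cluster score close to 1, while the isolated point scores far above 1.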

Page 32: Performance monitoring for Docker - Lucerne meetup

Load balance detector

Compare multiple signals (mean + variance) in load-balanced environment
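The slide gives no detail on how the signals are compared, so the following is only one possible reading: each instance's mean is compared against the mean and standard deviation of the samples from all its peers. The function name, the threshold z and the instance names are hypothetical.

```python
import statistics


def unbalanced_instances(signals, z=3.0):
    """Names of instances whose mean deviates strongly from their peers' samples."""
    flagged = []
    for name, samples in signals.items():
        # Pool the samples of every other instance behind the load balancer.
        peer_samples = [
            x for other, values in signals.items() if other != name for x in values
        ]
        if len(peer_samples) < 2:
            continue
        peer_mean = statistics.fmean(peer_samples)
        peer_std = statistics.stdev(peer_samples)
        own_mean = statistics.fmean(samples)
        if peer_std > 0 and abs(own_mean - peer_mean) > z * peer_std:
            flagged.append(name)
    return flagged


# Hypothetical CPU % per instance; web-4 receives far more load than the others.
cpu = {
    "web-1": [50, 52, 48, 51],
    "web-2": [49, 51, 50, 50],
    "web-3": [51, 49, 52, 48],
    "web-4": [90, 92, 88, 91],
}
print(unbalanced_instances(cpu))
```

One limitation of this simple version: a strongly deviating instance inflates the peer variance seen by the others, so several simultaneous outliers can mask each other.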

Page 33: Performance monitoring for Docker - Lucerne meetup

Anomaly detection @ service level

Page 34: Performance monitoring for Docker - Lucerne meetup
Page 35: Performance monitoring for Docker - Lucerne meetup
Page 36: Performance monitoring for Docker - Lucerne meetup

CoScale approach

Lightweight agent

• Server metrics from OS

• Container and cluster metrics from Kubernetes and Docker APIs

• Application metrics from log files and management interfaces

• Business & custom metrics from various sources

Contextual events

• Container lifecycle

• Deployments & software releases

• Infrastructure changes

• Custom events

Page 37: Performance monitoring for Docker - Lucerne meetup

Scalable Architecture

Components shown in the architecture diagram:

Agent: log & API parsing

RUM: Boomerang.js

HAProxy: load balancer, HTTPS handling

APP servers, API, RUM

Data workers, analysis workers, alerting workers

PostgreSQL: metadata

Cassandra: metric data

Elasticsearch: event data

Page 38: Performance monitoring for Docker - Lucerne meetup

DEMO

Page 39: Performance monitoring for Docker - Lucerne meetup
Page 40: Performance monitoring for Docker - Lucerne meetup
Page 41: Performance monitoring for Docker - Lucerne meetup
Page 42: Performance monitoring for Docker - Lucerne meetup
Page 43: Performance monitoring for Docker - Lucerne meetup
Page 44: Performance monitoring for Docker - Lucerne meetup
Page 45: Performance monitoring for Docker - Lucerne meetup
Page 46: Performance monitoring for Docker - Lucerne meetup

Questions?

or contact me at [email protected]

@spolfliet

Page 47: Performance monitoring for Docker - Lucerne meetup

Backup slides

Page 48: Performance monitoring for Docker - Lucerne meetup

Local outlier factor, no strong model assumption

heavy process

Page 49: Performance monitoring for Docker - Lucerne meetup

Local outlier factor, no free lunch

Scaling: comparing apples and oranges

scale ⇒ distance ⇒ density ⇒ LOF-score

Autoscaling? (Mahalanobis distance) => enlarges dimensions with low variance
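The effect of automatic scaling on a low-variance dimension can be shown with plain per-dimension standardization (z-scores), used here as a simpler stand-in for a full Mahalanobis distance. The function name and the sample rows are hypothetical.

```python
import statistics


def standardize(rows):
    """Rescale every column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [
        tuple((x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds))
        for row in rows
    ]


# Hypothetical metrics: the first column varies a lot, the second is almost constant.
rows = [(100.0, 0.500), (200.0, 0.501), (300.0, 0.499)]
scaled = standardize(rows)
for row in scaled:
    print(tuple(round(v, 3) for v in row))
```

After scaling, the tiny fluctuations in the second column span the same range as the large swings in the first, so they contribute equally to any distance and therefore to the density and the LOF score.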

“Curse of dimensionality”

dimensionality reduction preprocessing (e.g. PCA), but don’t throw away the anomalies with the bathwater

Choosing cross-sections of data to analyze together, e.g.

different metric on same container

same metric on different containers