Microservice monitoring
Marek Koniew
Reactive manifesto
http://www.reactivemanifesto.org/
Bulkheads
https://github.com/Netflix/Hystrix/wiki
Cascading failures
Circuit breaker, bulkhead pattern, cascading failures
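To make the circuit breaker idea concrete, here is a minimal sketch in plain Java. It is not the Hystrix implementation; the class name `SimpleCircuitBreaker`, the consecutive-failure threshold, and the fixed open window are all assumptions chosen for illustration.

```java
import java.util.function.Supplier;

/** Illustrative circuit breaker (hypothetical class, not Hystrix code). */
class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long calls are short-circuited
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized boolean isOpen() {
        return openedAt >= 0 && System.currentTimeMillis() - openedAt < openMillis;
    }

    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // fail fast, do not touch the failing dependency
        }
        try {
            T result = remoteCall.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        openedAt = -1;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = System.currentTimeMillis(); // open (or re-open) the circuit
        }
    }
}
```

While the circuit is open, callers get the fallback immediately instead of piling up on a dependency that is already failing, which is what stops the failure from cascading.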
Back pressure
https://github.com/Netflix/Hystrix/wiki
Circuit breaker, back pressure, bulkhead pattern, cascading failures, HTTP 429 Too Many Requests
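One way these keywords fit together: a bulkhead caps the number of concurrent calls into a dependency, and when the cap is reached the service pushes back on the caller with HTTP 429. The sketch below uses a plain `java.util.concurrent.Semaphore`; the class `Bulkhead` and its `handle` method are hypothetical names, and a real service would map the returned status onto its HTTP framework.

```java
import java.util.concurrent.Semaphore;

/** Illustrative semaphore bulkhead (hypothetical class, not Hystrix code). */
class Bulkhead {
    static final int OK = 200;
    static final int TOO_MANY_REQUESTS = 429;

    private final Semaphore permits;

    Bulkhead(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the work if a permit is free, otherwise rejects immediately. */
    int handle(Runnable work) {
        if (!permits.tryAcquire()) {
            return TOO_MANY_REQUESTS; // back pressure: tell the caller to slow down
        }
        try {
            work.run();
            return OK;
        } finally {
            permits.release();
        }
    }
}
```

Rejecting immediately keeps one slow dependency from consuming every request thread in the service.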
Hystrix
- Circuit Breaker
- Request Collapsing
- Request Caching
- Monitoring
Hystrix Dashboard
Real-time information. To store information, use Graphite.
Hystrix Command Widget
Hello World
Migrating a Library to Hystrix
- External service call should be wrapped into a HystrixCommand
- Configure thread pools
- Fine grained command names
Circuit breaker, Timeouts, queue size, thread pool size
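What wrapping an external call buys you can be sketched with the JDK alone: the call runs on a dedicated pool, the caller waits at most a fixed timeout, and a fallback is returned on timeout, failure, or rejection. This is not the Hystrix API; `GuardedCall` and `callWithTimeout` are hypothetical names, and Hystrix adds metrics and circuit breaking on top.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative timeout-plus-fallback wrapper (hypothetical class). */
class GuardedCall {
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> call,
                                 long timeoutMillis, T fallback) {
        Future<T> future;
        try {
            future = pool.submit(call);
        } catch (RejectedExecutionException e) {
            return fallback;                 // pool and queue are full
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);             // interrupt the slow call
            return fallback;
        } catch (ExecutionException e) {
            return fallback;                 // the external call failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

Using one pool per external dependency gives the isolation described on the bulkhead slides: a slow dependency can only exhaust its own threads.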
Patterns and anti-patterns: naming convention
https://wiki.hybris.com/display/prodandtech/Hystrix+commands+naming+convention+-+draft
docu-repo-v2.deleteType.db
- Service name: docu-repo
- Version: v2
- Command name: deleteType
- External service: db
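A small helper can keep command names consistent across a code base. The sketch below assumes the separators seen in the example `docu-repo-v2.deleteType.db` (a hyphen between service name and version, dots between the remaining parts); the class `CommandNames` is hypothetical and the draft convention on the wiki remains authoritative.

```java
/** Illustrative builder for command names (hypothetical class; separators are assumed). */
class CommandNames {
    static String commandName(String service, String version,
                              String command, String externalService) {
        // Assumed layout: <service>-<version>.<command>.<external service>
        return service + "-" + version + "." + command + "." + externalService;
    }
}
```

For example, `CommandNames.commandName("docu-repo", "v2", "deleteType", "db")` produces the name shown on the slide.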
Patterns and anti-patterns: thread pools
https://github.com/Netflix/Hystrix/wiki/Configuration#threadpool-properties
- Netflix API has 30+ of its thread pools set at 10, two at 20, and one at 25.
- Queue size affects request times!
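Why queue size affects request times can be shown with simple arithmetic: a request that lands at the back of a full queue has to wait for everything ahead of it. The sketch below uses a plain `ThreadPoolExecutor` rather than Hystrix; `PoolSizing` and both method names are hypothetical, and the estimate assumes every call takes the same time.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative pool sizing helpers (hypothetical class). */
class PoolSizing {
    /**
     * Rough worst-case wait before a queued request starts, assuming all
     * threads have just begun a call of callMillis and the queue is full:
     * the last queued request waits ceil(queueSize / poolSize) rounds.
     */
    static long worstCaseQueueWaitMillis(int poolSize, int queueSize, long callMillis) {
        int rounds = (queueSize + poolSize - 1) / poolSize;
        return rounds * callMillis;
    }

    /** Fixed-size pool with a bounded queue that rejects work once both are full. */
    static ThreadPoolExecutor boundedPool(int poolSize, int queueSize) {
        return new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());
    }
}
```

With hypothetical numbers, a pool of 10 threads, a queue of 100, and 200 ms calls gives a worst-case queue wait of about 2000 ms, ten times the call itself. A small queue with fast rejection keeps latency bounded.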
Hystrix Dashboard Demo
DEMO
Dynatrace
Dynatrace architecture
https://www.youtube.com/watch?v=judLcsVns-s
Memory analysis:
https://www.youtube.com/watch?v=dUn4iBM6Hik
Dynatrace
Wiki:
https://wiki.hybris.com/display/INFRA/OnDemand+performance+testing+setup
Configure the Java agent:
java -agentpath:libdtagent.so=name=app,server=dynatrace.yrd.fra.hybris.com,wait=45,storage=.
Dynatrace – advantages
Direct access to agent machine
Memory dumps
CPU sampling
Much more
JVM and system monitoring
Request tracing across services
Hot sensor placement – class instrumentation can be changed without restart
Dynatrace – disadvantages
Performance overhead
Sensor configuration affects all agents
Weak support for Java 8 (although Dynatrace version 6 is out)
Steep learning curve
Dynatrace patterns and anti-patterns
Do NOT use it for monitoring – utilize its profiling power
Much better tools are available for monitoring (Graphite, Hystrix Dashboard)
Applications designed to be monitored do not need Dynatrace
Only very simple alerts available
Turn off sensors by default to save performance
Configure separate service instance with extended number of sensors on demand
Route part of the traffic through this service (canary testing)
Dynatrace demo scenario
1) Start the application with the agent
2) Observe allocated memory
3) Observe GC suspension times
4) Create a memory dump
5) Analyze the memory dump
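For a quick cross-check of steps 2 and 3 without Dynatrace, the JVM exposes similar figures through the standard `java.lang.management` beans. The sketch below is an assumption about how one might read them, not part of the demo; `JvmSnapshot` is a hypothetical class, and the GC figure is the accumulated collection time reported by the JVM, which only approximates suspension time for concurrent collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Illustrative JVM snapshot via standard management beans (hypothetical class). */
class JvmSnapshot {
    /** Currently used heap, in bytes. */
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Accumulated collection time across all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }
}
```

Sampling both values periodically and plotting them gives a rough equivalent of the allocated-memory and GC graphs used in the demo.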
Dynatrace demo – allocated memory
Dynatrace demo – GC suspensions
Dynatrace demo – application crash
Dynatrace demo – memory dump
Dynatrace demo – analyze memory
Dynatrace demo – root cause
https://wiki.hybris.com/display/prodandtech/2014/08/26/Your+service+is+leaking+memory
Hystrix Dashboard Demo
DEMO
Questions
Workshops agenda
Create a Hystrix command:
- Set command name.
- Set command group.
- Configure thread pool.
Set up the hystrix.stream endpoint.
Set up the Hystrix Dashboard.
Connect the application to Dynatrace.
Create a Dynatrace graph:
- Memory usage.
- GC suspension.
Create a memory dump.