Thank you to our Sponsors A Holistic Approach to Monitoring Melanie Cey – Yardi Systems Inc. Media Sponsor:
Page 1: Holistic Approach To Monitoring

Thank you to our Sponsors

A Holistic Approach to Monitoring
Melanie Cey – Yardi Systems Inc.

Media Sponsor:

Page 2: Holistic Approach To Monitoring

@melaniemj

Systems Analyst in DevOps (Web Operations) @ Yardi

• 5 years Programming
• 3.5 years Team Lead/Project Manager
• 4 years Systems Administration/Analysis

Page 3: Holistic Approach To Monitoring
Page 4: Holistic Approach To Monitoring

Because

• Customers should not alert you to failure
• Business metrics matter
• When something fails you need enough info to know why
• Agile teams release frequently
• No one can afford to be reactive

Page 5: Holistic Approach To Monitoring

When you release code…

Page 6: Holistic Approach To Monitoring

Monitoring Cycles

Page 7: Holistic Approach To Monitoring

Definition: What to measure

• Business Metrics & Events
- Login/logout
- Sign up, buy something
- Sent email

• System Events, Performance and Utilization Metrics
- Web Service Call details (counter / time taken)
- Deployments
- Cache system (e.g. Redis or other) hits / misses
- Environment performance

• Failure Metrics
- Exceptions, segregated by type / app / server of origin
- Number and type of errors that reached customers
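
One way to keep these three groups apart is a dotted naming scheme, since Graphite treats dots as hierarchy separators. The sketch below is only an illustration: the function `metric_path`, the category prefixes, and the example app and host names are assumptions, not something the slides prescribe.

```python
import re

# Hypothetical top-level categories mirroring the three groups on the slide.
CATEGORIES = {"business", "system", "failure"}


def _clean(part: str) -> str:
    """Replace characters (dots, spaces, slashes) that would split a dotted path."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", part.strip())


def metric_path(category: str, app: str, server: str, *names: str) -> str:
    """Build a dotted metric path: <category>.<app>.<server>.<name>[.<name>...]."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    parts = [category, app, server, *names]
    return ".".join(_clean(p) for p in parts)


# Exceptions segregated by app / server of origin / type:
print(metric_path("failure", "shop", "web01.example.com", "exceptions", "TimeoutError"))
```

Putting the server in the path makes it possible to chart one host or, with a wildcard in that position, all hosts together.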

Page 8: Holistic Approach To Monitoring

Code Collection

Page 9: Holistic Approach To Monitoring

Code Collection – Add / Refine Stats

• Developer Friendly Platform
- Stats need to be able to be added ‘without permission’
- Create own dashboards
- Tools with APIs
- Build client library for sending stats
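
A client library for sending stats can be very small. The following is a minimal sketch, assuming a StatsD-compatible daemon listening on UDP port 8125 (the StatsD default); the class name `StatsClient`, its method names, and the `shop` prefix are hypothetical and not taken from the talk.

```python
import socket


class StatsClient:
    """Tiny fire-and-forget StatsD client (illustrative sketch only)."""

    def __init__(self, host: str = "127.0.0.1", port: int = 8125, prefix: str = ""):
        self._addr = (host, port)
        self._prefix = prefix + "." if prefix else ""
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    @staticmethod
    def format(name: str, value, kind: str) -> str:
        # StatsD line format: <name>:<value>|<type>; "c" = counter, "ms" = timer.
        return f"{name}:{value}|{kind}"

    def _send(self, line: str) -> bool:
        try:
            self._sock.sendto(line.encode("utf-8"), self._addr)
            return True
        except OSError:
            # Monitoring must never break the application, so swallow the error.
            return False

    def incr(self, name: str, count: int = 1) -> bool:
        """Count an event, e.g. a login."""
        return self._send(self.format(self._prefix + name, count, "c"))

    def timing(self, name: str, millis: float) -> bool:
        """Record how long something took, in milliseconds."""
        return self._send(self.format(self._prefix + name, millis, "ms"))


stats = StatsClient(prefix="shop")
stats.incr("login")
stats.timing("webservice.get_order", 42)
```

Because UDP is connectionless, a developer can add a new stat with one line and no registration step, which is what makes stats addable 'without permission'.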

Page 10: Holistic Approach To Monitoring

Code Collection – Graphite

• Using Graphite
- StatsD (Etsy, 2011), a Node.js daemon listening on UDP, collects and aggregates stats
- Sends stats (as strings) to Graphite, where they are stored in Whisper (like RRD) files
- Graphite has a web interface, a URL API (with a JSON output option) and a built-in ability to create dashboards
- Can receive stats from anything and is easy to set up
- Open source with lots of industry use
- Plenty of built-in functions to help analyze and visualize data
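
To make "stats as strings" and the URL API concrete, here is a small sketch. It assumes Graphite's plaintext protocol (`<path> <value> <timestamp>` followed by a newline) and the `/render` endpoint with `format=json`. The helper names, the host `graphite.example.com`, and the metric path are hypothetical.

```python
import time
from typing import Optional
from urllib.parse import urlencode


def plaintext_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one data point the way Graphite's plaintext listener expects it."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def render_url(base_url: str, target: str, since: str = "-1h") -> str:
    """Build a render URL that returns the series as JSON instead of an image."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


print(plaintext_line("stats.shop.login", 3, 1400000000), end="")
print(render_url("http://graphite.example.com/", "stats.shop.login"))
```

The JSON output is what lets other tools (alerting scripts, custom dashboards) consume the same data that the built-in web interface draws.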

Page 11: Holistic Approach To Monitoring

Code Collection – Graphite

Page 12: Holistic Approach To Monitoring

Code Collection – Add / Refine Stats

Page 13: Holistic Approach To Monitoring

Code Collection – Graphite Samples

Page 14: Holistic Approach To Monitoring

Code Collection – Logging

• Metrics – what and when
• Logging – how and why

Page 15: Holistic Approach To Monitoring

Code Collection – Add / Refine Logging

• Why Log and what to log?
- Log when you record a statistic

• Logging Best Practices
- Log locally
- Don’t log to your production database server
- Don’t fail if you can’t log
- Log in GMT
- Keep your logs, ship them to a central location
- Aggregate recent data in real time if you can
- Log more than you think you need to
- Use a parse friendly format
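
Several of these practices (log locally, log in GMT, use a parse-friendly format, don't fail if you can't log) can be sketched with Python's standard `logging` module. This is one possible setup, not the speaker's; the names `JsonFormatter` and `build_logger`, the field names, and the choice of one JSON object per line are assumptions.

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line, timestamped in GMT/UTC."""

    converter = time.gmtime  # formatTime() uses this instead of local time

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info:
            entry["exc"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name: str, path: str) -> logging.Logger:
    """Log to a local file; fall back to a no-op handler if the file can't be opened."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    try:
        handler: logging.Handler = logging.FileHandler(path)  # log locally
    except OSError:
        handler = logging.NullHandler()  # don't fail if you can't log
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

A line-per-event JSON file is easy for a log shipper to forward to a central location and easy to aggregate afterwards.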

Page 16: Holistic Approach To Monitoring

Environment Collection

Page 17: Holistic Approach To Monitoring

Environment Collection

• Operating Systems
- CPU, Free Memory, Paging, I/O latency (ms), network utilization

• Database Management Systems
- Transactions, blocks

• Application Containers
- Memory utilization, IIS requests current & queued, restarts, cache statistics etc.
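
As a rough illustration of environment collection, the sketch below samples a few host-level readings using only the Python standard library. It is deliberately limited: paging, I/O latency, database and IIS counters need platform-specific tools that are not shown here. The function name and the reading names are hypothetical.

```python
import os
import shutil


def sample_environment(disk_path: str = "/") -> dict:
    """Collect a few host-level readings without third-party packages."""
    readings = {"cpu.count": os.cpu_count() or 0}

    usage = shutil.disk_usage(disk_path)
    if usage.total:
        readings["disk.used_percent"] = round(100.0 * usage.used / usage.total, 2)

    if hasattr(os, "getloadavg"):  # not available on Windows
        try:
            one, five, fifteen = os.getloadavg()
            readings.update({"load.1min": one, "load.5min": five, "load.15min": fifteen})
        except OSError:
            pass  # load average unobtainable; skip rather than fail

    return readings


for name, value in sorted(sample_environment().items()):
    print(name, value)
```

Readings collected this way can be sent on the same path as application stats, so environment and code metrics end up on the same graphs.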

Page 18: Holistic Approach To Monitoring

Visualization

Page 19: Holistic Approach To Monitoring

Visualization

• Types of Dashboards
- Feature based
- Resource based (server or container)
- Performance
- Anomaly detection
- Correlation
- Root Cause Analysis
- “Overview”

Page 20: Holistic Approach To Monitoring

Visualization – Tasseo

• https://github.com/obfuscurity/tasseo

Page 21: Holistic Approach To Monitoring

Visualization – Cubism

• https://github.com/square/cubism

Page 22: Holistic Approach To Monitoring

Visualization – Cubism

• https://github.com/square/cubism

Page 23: Holistic Approach To Monitoring

Action: Putting inside knowledge to work

Page 24: Holistic Approach To Monitoring

Action

• Useful dashboards help create useful alerts
• Add / refine anomaly detection & alerting
• Know your own boundaries
• A fuzzy threshold is better than no threshold
• Attach graphs to alerts

• Exploit failures
- Add an alert after RCA
- Theorize other possible causes or conditions
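
One reading of "a fuzzy threshold" is a band derived from recent history rather than a hand-picked constant. The sketch below flags a value that lies more than a chosen number of standard deviations from the recent mean. The function name, the default tolerance of 3, and the sample numbers are assumptions, and the talk does not say which detection method was used.

```python
from statistics import mean, pstdev


def is_anomalous(history, latest, tolerance=3.0, min_points=5):
    """Return True when `latest` is more than `tolerance` standard deviations from the mean of `history`."""
    if len(history) < min_points:
        return False  # not enough data to judge; stay quiet
    centre = mean(history)
    spread = pstdev(history)
    if spread == 0:
        return latest != centre  # flat history: any change is notable
    return abs(latest - centre) > tolerance * spread


logins_per_minute = [100, 102, 98, 101, 99]
print(is_anomalous(logins_per_minute, 103))
print(is_anomalous(logins_per_minute, 150))
```

The tolerance is the "know your own boundaries" knob: widen it for noisy metrics, tighten it after an RCA shows that a smaller deviation preceded a failure.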

Page 25: Holistic Approach To Monitoring

Monitoring Cycles

Page 26: Holistic Approach To Monitoring

More?

• http://graphite.readthedocs.org/en/latest/
• http://codeascraft.com/
• http://vimeo.com/monitorama
• Twitter #devops #monitoringlove
• https://github.com/monitoringsucks
• http://www.opsschool.org/en/latest/

Page 27: Holistic Approach To Monitoring