Jan 22, 2018

Page 1: Using Machine Learning to Optimize DevOps Practices

Using Machine Learning to Optimize DevOps Practices

Building Learning into Monitoring and Feedback

Peter Varhol

Page 2: Using Machine Learning to Optimize DevOps Practices

About me

• International speaker and writer
• Degrees in Math, CS, Psychology
• Technology communicator
• Former university professor, tech journalist
• Cat owner and distance runner
• [email protected]

Page 3: Using Machine Learning to Optimize DevOps Practices

Agenda

• What is machine learning?

• How is machine learning applied to DevOps?

• Challenges in training these systems

• What constitutes an issue?

• Summary and conclusions

Page 4: Using Machine Learning to Optimize DevOps Practices

What is Machine Learning?

• Layered algorithms that change parameters based on feedback from known data
  • Can be linear or nonlinear

• Algorithms can be fixed in production or adaptive
  • Fixed – algorithms do not adjust once deployed
  • Adaptive – algorithms continually adjust to new data

• Usually part of a larger system
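
The fixed/adaptive distinction above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the class names, the millisecond units, and the smoothing settings are all assumptions, and a moving-average baseline stands in for a real learning algorithm.

```python
class FixedThreshold:
    """Fixed: the limit is chosen before deployment and never changes."""

    def __init__(self, limit_ms):
        self.limit_ms = limit_ms

    def is_anomaly(self, value_ms):
        return value_ms > self.limit_ms


class AdaptiveThreshold:
    """Adaptive: the baseline keeps adjusting to each new observation."""

    def __init__(self, initial_ms, alpha=0.2, tolerance=1.5):
        self.baseline_ms = initial_ms  # current idea of "normal"
        self.alpha = alpha             # how quickly new data moves the baseline
        self.tolerance = tolerance     # multiple of the baseline that counts as anomalous

    def is_anomaly(self, value_ms):
        anomalous = value_ms > self.baseline_ms * self.tolerance
        # Update after judging, so the baseline tracks gradual drift.
        self.baseline_ms = (1 - self.alpha) * self.baseline_ms + self.alpha * value_ms
        return anomalous
```

When response times drift upward slowly, a fixed detector eventually starts flagging at its frozen limit, while the adaptive one follows the drift and flags only sharp jumps. Which behavior is preferable depends on whether the drift is itself the problem.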

Page 5: Using Machine Learning to Optimize DevOps Practices

Adaptive Systems

• Airline pricing
  • Ticket prices change three times a day based on demand
  • It can cost less to go farther
  • It can cost less later

• Ecommerce systems
  • Recommendations try to discern what else you might want
  • Can I incentivize you to fill up the plane?

Page 6: Using Machine Learning to Optimize DevOps Practices

Why Use Adaptive?

• The “right” result will vary over time

• Trying to optimize a particular result
  • Revenue

• The problem domain is not static

Confidential, Dynatrace LLC

Page 7: Using Machine Learning to Optimize DevOps Practices

How Are Fixed Systems Used?

• Transportation
  • Self-driving cars
  • Aircraft/Drones

• Ecommerce
  • Recommendation engines

• Medical
  • Diagnosis systems

Page 8: Using Machine Learning to Optimize DevOps Practices

Why Use Fixed Machine Learning Systems?

• The problem domain is static

• The expectations remain constant

• The right answer is known under most conditions

• The original algorithms remain valid over a long period of time

Page 9: Using Machine Learning to Optimize DevOps Practices

DevOps Practices Generate Data

• During development
  • Agile metrics, JIRA issues, test case metrics

• During continuous integration
  • System test metrics

• During continuous deployment
  • Quality metrics for deployments

• After deployment and into production
  • Application availability and performance
  • Usage log files

Page 10: Using Machine Learning to Optimize DevOps Practices

Focus on Monitoring

• Ongoing data on availability and performance
  • Real user monitoring (RUM)
  • Synthetic tests
  • Application monitoring

• Monitoring tackles the back end of DevOps
  • Identifies unhealthy trends
  • Diagnoses failures and poor performance
  • Recommends action

• Fixed or adaptive depends on your goals

Page 11: Using Machine Learning to Optimize DevOps Practices

Where Do Predictive Analytics Come In?

• Big data makes it possible to predict future events
  • Are we going to fail?
  • How will we perform with traffic surges?

• As well as to explain past events
  • What went wrong, and how do we fix it?

• We can rely on past data
  • Adaptive systems may not perform as well

• Clear goals needed

Page 12: Using Machine Learning to Optimize DevOps Practices

What Technologies Are Involved?

• Neural networks

• Genetic algorithms

• Rules engines

Page 13: Using Machine Learning to Optimize DevOps Practices

Neural Networks

• Set of layered algorithms whose variables can be adjusted via a learning process

• The learning process involves training with known inputs and outputs

• The algorithms adjust coefficients to converge on the correct answer (or not)

• You freeze the algorithms and coefficients, and deploy
  • Or you optimize on a particular set of characteristics
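
The train-then-freeze cycle described above can be sketched with the smallest possible case: a single artificial neuron rather than a layered network. Everything here is an assumption made for illustration, including the two hypothetical inputs (CPU saturated, error rate elevated), the rule that an alert fires only when both are true, and the learning rate and epoch count.

```python
import math

# Known inputs and outputs: (cpu_saturated, errors_elevated) -> should_alert.
# The data and the "alert only when both are true" rule are hypothetical.
TRAINING_DATA = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(samples, epochs=2000, learning_rate=0.5):
    """Adjust coefficients until the outputs converge on the known answers."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = target - output
            # Nudge each coefficient in the direction that reduces the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Use the frozen coefficients; nothing is adjusted here."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias) >= 0.5


weights, bias = train(TRAINING_DATA)  # after this call the coefficients are frozen
```

A real network stacks many such units in layers and needs far more data, but the cycle is the same: train against known answers, check convergence, then deploy the frozen coefficients.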

Page 14: Using Machine Learning to Optimize DevOps Practices

A Sample Neural Network

Page 15: Using Machine Learning to Optimize DevOps Practices

Genetic Algorithms

• Use the principle of natural selection

• Create a range of possible solutions

• Try out each of them

• Choose and combine two of the better alternatives

• Rinse and repeat as necessary
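
The steps above map onto a short loop. The sketch below is a toy under stated assumptions, not a method from the talk: candidate solutions are 20-bit lists, fitness is simply the number of 1 bits, and the population size, mutation rate, and generation count are arbitrary. The best candidate is carried over unchanged each generation so the best score never gets worse.

```python
import random

GENOME_LENGTH = 20  # hypothetical: each candidate solution is 20 on/off settings


def fitness(genome):
    """Toy objective: more 1 bits is better."""
    return sum(genome)


def crossover(parent_a, parent_b, rng):
    """Combine two parents at a random cut point."""
    cut = rng.randint(1, GENOME_LENGTH - 1)
    return parent_a[:cut] + parent_b[cut:]


def mutate(genome, rng, rate=0.05):
    """Flip each bit with a small probability; returns a new list."""
    return [1 - bit if rng.random() < rate else bit for bit in genome]


def evolve(generations=60, population_size=30, seed=0):
    rng = random.Random(seed)
    # Create a range of possible solutions.
    population = [
        [rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
        for _ in range(population_size)
    ]
    history = []
    for _ in range(generations):
        # Try out each of them and rank by fitness.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: population_size // 2]
        children = [population[0]]  # keep the current best unchanged
        # Choose and combine two of the better alternatives.
        while len(children) < population_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children  # rinse and repeat
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

In a DevOps setting the genome might encode configuration choices and the fitness function a measured outcome, but that mapping is left open here.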

Page 16: Using Machine Learning to Optimize DevOps Practices

Bringing in DevOps

• DevOps has data that can be used to train neural networks
  • Health of the application
  • Trends in application traffic and responsiveness
  • Application failure

Page 17: Using Machine Learning to Optimize DevOps Practices

Machine Learning Helps DevOps

• Decisions are complex
  • Why is the CPU maxed?
  • What is causing disk thrashing?
  • Why did the network slow?
  • Why did the application fail?

• Data is massive
  • Potentially thousands of data points a day

Page 18: Using Machine Learning to Optimize DevOps Practices

How Good Are Decisions?

• Expert versus machine

• Given the same data
  • In many domains they tie

• With additional data, the human can be better

• But machine learning will get better

• But only as good as the data

Page 19: Using Machine Learning to Optimize DevOps Practices

We Want to Do Two Things

• Identify trends that may indicate future problems
  • Increasing response times
  • More page errors

• Diagnose faults once they have happened
  • Why did the application fail?
  • How can we fix it as quickly as possible?
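
The first goal, spotting a trend such as increasing response times, can be illustrated with an ordinary least-squares slope over recent samples. This is a deliberately simple stand-in for a learned model; the function names and the 1 ms-per-sample limit are assumptions.

```python
def slope(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def is_degrading(response_times_ms, max_slope_ms_per_sample=1.0):
    """Flag a window whose response times rise faster than the allowed slope."""
    return slope(response_times_ms) > max_slope_ms_per_sample
```

For example, a window of 100, 102, 104, 106 ms rises by 2 ms per sample and would be flagged under the assumed limit, while 100, 101, 99, 100 ms would not.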

Page 20: Using Machine Learning to Optimize DevOps Practices

Fixed Algorithms Work for Some Problems

• Immediate performance and failure identification

• Diagnosis of failures and performance issues

• These are readily identifiable from known data

Page 21: Using Machine Learning to Optimize DevOps Practices

Adaptive Systems Supplement These Tools

• Predictions of future events
  • Performance
  • Availability

• The target is moving
  • So we need current data to adjust the algorithms

Page 22: Using Machine Learning to Optimize DevOps Practices

The Machine Helps the DevOps Expert

• The machine learning app provides:
  • Early warning on possible performance issues and failures
  • Immediate notification of failure or impending failure
  • Trend analysis of data to predict unhealthy outcomes

• Machine learning is an assistant
  • It can’t fix anything
  • It can’t necessarily identify the root cause

Page 23: Using Machine Learning to Optimize DevOps Practices

What is the Goal?

• We have many ways of monitoring
  • Many of them are represented at this conference

• Each measures something a little different
  • Latency, response time, availability, network, DNS . . .

• Too much data can be no better than no data at all

• Machine learning can correlate across measurements
  • Focus to eliminate false positives

Page 24: Using Machine Learning to Optimize DevOps Practices

Intelligent Systems Are Sometimes Wrong

• The problem domain is ambiguous

• There is no single “right” answer
  • “Close enough” is good

• We don’t know quite why the software responds as it does
  • We can’t easily trace code paths

Page 25: Using Machine Learning to Optimize DevOps Practices

Testing Machine Learning Systems

• Have objective acceptance criteria

• Test with new data

• Don’t count on all results being accurate

• Understand the architecture of the network as a part of the testing process

• Communicate the level of confidence you have in the results to management and users
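
The first two points, objective acceptance criteria and testing with new data, can be sketched as a check against held-out samples. The model, the sample values, and the 90% requirement below are all hypothetical; the only point is that the score comes from data the model was not tuned on and is compared with a criterion set in advance.

```python
def accuracy(model, labeled_samples):
    """Fraction of held-out samples the model labels correctly."""
    correct = sum(1 for features, label in labeled_samples if model(features) == label)
    return correct / len(labeled_samples)


def meets_acceptance(model, holdout, required_accuracy=0.9):
    """Return (passed, score) so the score itself can be reported."""
    score = accuracy(model, holdout)
    return score >= required_accuracy, score


# Hypothetical model: call anything slower than 500 ms a performance issue.
def slow_response_model(response_ms):
    return response_ms > 500


# Hypothetical held-out data: (response time in ms, was it really an issue?)
HOLDOUT = [(120, False), (900, True), (480, False), (650, True), (510, False)]
```

Returning the score along with the pass/fail result supports the last point on the slide: the level of confidence can be reported to management and users rather than hidden behind a single verdict.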

Page 26: Using Machine Learning to Optimize DevOps Practices

A Cautionary Tale

• Not all events are created equal

• AI systems treat events equally
  • A failure of a system during busy season is the same as any other

• DevOps pros know otherwise
  • And can exert additional effort in response
  • And actually fix the problem

• We can’t automate what we don’t understand

• You need the human in the loop


Page 27: Using Machine Learning to Optimize DevOps Practices

Conclusions

• DevOps is a natural environment for machine learning systems
  • Any activity that generates data and requires a decision is fair game

• Monitoring is low-hanging fruit

• Fixed systems for failure and diagnosis, adaptive for trend analysis


Page 28: Using Machine Learning to Optimize DevOps Practices

References

• https://qz.com/989137/when-a-robot-ai-doctor-misdiagnoses-you-whos-to-blame/

• https://pvarhol.wordpress.com/2017/07/22/what-brought-about-our-ai-revolution/

• https://pvarhol.wordpress.com/2017/06/21/analytics-dont-apply-in-the-clutch/
