Data-Driven Operations - Practice realtime data analyse

Jun 23, 2015

Guixing Bai

Grab data from any log or operation in realtime. Gain the power to find problems instantly. And base all operations on data.
Transcript
Page 1: Data-Driven Operations - Practice realtime data analyse

Data-Driven Operations - Practice realtime data analyse

@khsing

Page 2

Who am I

• Currently, I am an operations architect at SINA.

• Focused on automation tools and DevOps methods

Page 3

What kind of data matters for operations?

Page 4

Before we talk about data

Page 5

What does a day in ops look like?

Page 6

• Check the dashboard; everything looks good.

• Start work: write scripts or configurations.

• Suddenly, receive an alert by SMS/email, or a problem reported by CS.

• Start working on the event/problem/outage.

Page 7

You are the Fireman http://www.flickr.com/photos/40699207@N05/3838012090/

Page 8

Find the problem

• Take a look at the dashboard, Nagios, and monitors

• Grep logs from hundreds of hosts

• Watch the network diagram

• Guess what is going wrong

Page 9

Driven by problems

Page 10

Passive

Page 11

Be Active

Page 12

Let’s talk data

Page 13

Data

• Logs

• Access log, error log, exception log, step log

• Configuration change log, release log

• Performance measurement

• Product operations data

Page 14

Logs

• Success entries tell you little.

• Error entries are the useful ones.

Page 15

Process logs

• Realtime or near-realtime processing brings a big benefit

• You can't afford to lose an hour when a problem actually happens

• You have to sense the problem before too many users complain.
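
One way to sense a problem quickly is to count error events over a short sliding window and alert when the count crosses a threshold. The sketch below is illustrative only: the class name, the 60-second window, and the threshold are assumptions, not something from the talk, and it assumes timestamps arrive in order.

```python
from collections import deque


class SlidingErrorCounter:
    """Count error events seen within the last `window_seconds`."""

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp):
        """Record one error event at `timestamp` (seconds, in arrival order)."""
        self._timestamps.append(timestamp)

    def count(self, now):
        """Return the number of errors in the window ending at `now`."""
        cutoff = now - self.window_seconds
        # Drop events that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


def should_alert(counter, now, threshold=5):
    """True when the window holds at least `threshold` errors (assumed rule)."""
    return counter.count(now) >= threshold


counter = SlidingErrorCounter(window_seconds=60.0)
for ts in (0.0, 10.0, 20.0, 70.0, 75.0):
    counter.add(ts)
recent = counter.count(now=80.0)
```

In practice the timestamps would come from a log tailer or a message queue rather than a fixed list.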

Page 16

Process Logs

• Categorise automatically.
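
A common way to categorise logs automatically is to mask the variable parts of each message (IP addresses, numbers, IDs) so that messages with the same shape fall into the same bucket. This is a minimal sketch under that assumption; the masking rules and sample messages are hypothetical, and the talk does not say which method was actually used.

```python
import re
from collections import Counter

# Hypothetical masking rules, applied in order: variable parts of a
# message are replaced with placeholders so similar messages collapse.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def category_of(message):
    """Return the message with its variable parts masked."""
    for pattern, placeholder in _MASKS:
        message = pattern.sub(placeholder, message)
    return message


def categorise(messages):
    """Count how many messages fall into each category."""
    return Counter(category_of(m) for m in messages)


messages = [
    "timeout connecting to 10.0.0.5 after 3000 ms",
    "timeout connecting to 10.0.0.17 after 1500 ms",
    "user 42 not found",
]
counts = categorise(messages)
```

Sorting the resulting counts by frequency gives a short list of categories to read instead of thousands of raw lines.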

Page 17

Normal logs

Page 18

Categorised logs

Page 19

Performance Measurement

• How fast is our website for end users?

• Where do they come from?

• Which datacenter do they reach?

• What is the ratio of slow to fast users?
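
The questions above can be answered from per-request timing samples. As a rough sketch, assuming each sample is a (group, load time in ms) pair, where the group is a region or datacenter name and the 3000 ms "slow" threshold is an arbitrary illustrative choice:

```python
from collections import defaultdict


def slow_share_by_group(samples, slow_threshold_ms=3000):
    """Return the fraction of slow page loads for each group.

    `samples` is an iterable of (group, load_time_ms) pairs; the group
    could be a user region or the datacenter that served the request.
    """
    totals = defaultdict(int)
    slow = defaultdict(int)
    for group, load_time_ms in samples:
        totals[group] += 1
        if load_time_ms > slow_threshold_ms:
            slow[group] += 1
    return {group: slow[group] / totals[group] for group in totals}


# Hypothetical samples for two datacenters.
samples = [
    ("dc-a", 800), ("dc-a", 3500), ("dc-a", 4000), ("dc-a", 1000),
    ("dc-b", 1200), ("dc-b", 900),
]
shares = slow_share_by_group(samples)
```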

Page 20

Product Operations Data

• For example, DAU (daily active users)

• Drops, spikes, and increases are events that need action.
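
To treat a drop or spike as an event, the current value can be compared with a recent baseline. The sketch below uses the mean of recent values and a 20% tolerance; both are assumptions for illustration, and a real setup would likely account for daily and weekly patterns.

```python
def classify_change(history, current, tolerance=0.2):
    """Compare `current` with the mean of `history`.

    Returns "spike", "drop", or "normal" depending on whether the
    relative change exceeds `tolerance` in either direction.
    """
    if not history:
        raise ValueError("history must not be empty")
    baseline = sum(history) / len(history)
    if baseline == 0:
        return "spike" if current > 0 else "normal"
    change = (current - baseline) / baseline
    if change >= tolerance:
        return "spike"
    if change <= -tolerance:
        return "drop"
    return "normal"


# Hypothetical DAU values for the previous four days.
recent_dau = [1000, 1020, 980, 1000]
```

For example, `classify_change(recent_dau, 700)` flags a drop, while a value of 1050 stays within tolerance.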

Page 21

Change/Release log

• Many problems come with a change or release

• You have to watch the data after you make a change or release.

• The change/release log has to be visible on the dashboard.
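
Because many problems follow a change or release, a useful first step when an alert fires is to list what changed shortly before it. A minimal sketch, assuming the change log is available as (time, description) pairs; the 30-minute lookback and the sample entries are hypothetical:

```python
from datetime import datetime, timedelta


def changes_before(changes, alert_time, lookback=timedelta(minutes=30)):
    """Return changes made within `lookback` before `alert_time`.

    `changes` is an iterable of (when, what) pairs. The most recent
    change comes first, since it is the most likely suspect.
    """
    start = alert_time - lookback
    recent = [(when, what) for when, what in changes
              if start <= when <= alert_time]
    return sorted(recent, reverse=True)


# Hypothetical change/release log entries.
changes = [
    (datetime(2015, 6, 23, 9, 0), "release web v1.2"),
    (datetime(2015, 6, 23, 10, 40), "nginx config change"),
    (datetime(2015, 6, 23, 10, 55), "release api v3.1"),
]
alert_time = datetime(2015, 6, 23, 11, 0)
suspects = changes_before(changes, alert_time)
```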

Page 22

Change/Release log

Page 23

Be active

Page 24

Don’t be defensive

Page 25

"Attack is the best form of defence"

–Olbrich Desouza

Page 26

Tools

• Splunk (commercial)

• Logstash, Elasticsearch, Kibana

• Graphite

• StatsD
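
As an illustration of how lightweight feeding these tools can be, the sketch below formats metrics in the StatsD line format (`name:value|type` over UDP, port 8125 by default) and the Graphite plaintext format (`path value timestamp`). The metric names and the helper function names are hypothetical; production code would normally use an existing client library.

```python
import socket


def statsd_counter(name, value=1):
    """Format a StatsD counter increment."""
    return f"{name}:{value}|c"


def statsd_timer(name, millis):
    """Format a StatsD timing sample in milliseconds."""
    return f"{name}:{millis}|ms"


def graphite_line(path, value, timestamp):
    """Format one line of the Graphite plaintext protocol."""
    return f"{path} {value} {int(timestamp)}\n"


def send_udp(payload, host="127.0.0.1", port=8125):
    """Send one metric line to a StatsD daemon over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))


# Hypothetical metric names.
error_metric = statsd_counter("web.errors.5xx")
latency_metric = statsd_timer("web.response_time", 320)
dau_metric = graphite_line("product.dau", 1000, 1435017600)
```

Calling `send_udp(error_metric)` from the log processor would then increment the counter on each matching error line.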

Page 27

Q&A