© 2017 Mesosphere, Inc. All Rights Reserved. 1 Containerized, Cloud-Native Operations for Big Data Analytics SoCal DevOps July 19, 2017 Elizabeth K. Joseph, @pleia2

May 20, 2020


Containerized, Cloud-Native Operations for Big Data Analytics

SoCal DevOps
July 19, 2017

Elizabeth K. Joseph, @pleia2


Elizabeth K. Joseph, Developer Advocate

❏ Developer Advocate at Mesosphere
❏ 15+ years working in open source communities
❏ 10+ years in Linux systems administration and engineering roles
❏ Founder of OpenSourceInfra.org
❏ Author of The Official Ubuntu Book and Common OpenStack Deployments


Cloud-Native Systems

You no longer have a single server with everything running on it.

You have a multi-tier system with various layers and owners down the stack:

❏ Hardware
❏ Network
❏ Resource abstraction
❏ Scheduler
❏ Container
❏ Virtual network
❏ Application
❏ ...


Cloud-native scopes

Application

Container

Host


Cloud-native with DC/OS

Application

Universal Container Runtime (UCR), Docker

DC/OS, Apache Mesos

Your app here!


MODERN APPLICATION -> FAST DATA BUILT-IN

Data Ingestion

Request/Response

Devices

Client

Sensors

Message Queue/Bus

Microservices

Distributed Storage

Analytics (Streaming)

Use Cases:

● Anomaly detection

● Personalization

● IoT Applications

● Predictive Analytics

● Machine Learning
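
As an illustration of the first use case, here is a minimal sketch of anomaly detection on a stream of sensor readings. It is not taken from the talk or from any of the tools mentioned in it: the function name `detect_anomalies`, the window size, and the threshold are all assumptions chosen for the example. It flags a reading when it falls far outside the mean of the readings just before it.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag the reading when it is more than `threshold` standard
            # deviations away from the mean of the previous `window` readings.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
        recent.append(value)


if __name__ == "__main__":
    # Hypothetical temperature readings with one obvious spike.
    sensor_stream = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 35.0, 20.0, 20.2]
    for index, value in detect_anomalies(sensor_stream):
        print(f"reading {index} looks anomalous: {value}")
```

In a real pipeline the readings would arrive from the message queue rather than from a list, and the detection step would typically run in the streaming analytics layer.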


OK, got it!


Now integrate it with the rest of your technology stack

Photo by michael davis-burchat, CC BY-ND 2.0: https://www.flickr.com/photos/curious_e/17108088858/


Unification of tooling

● Integrates into your existing, familiar infrastructure
● Reduces resource consumption (avoids multiple monitoring, logging agents, etc.)
● Simplifies troubleshooting (tracing a problem through the stack)
● Consolidates view for all parties (from operations to app developers)


Anyone can write a deployment tool.

What’s next?

Day 2 Operations


DAY 2 OPERATIONS

Metrics and Monitoring
- Collecting metrics
- Downstream processing
  - Alerting
  - Dashboards
  - Storage (long-term retention)

Logging
- Scopes
- Local vs. centralized
- Security considerations


DAY 2 OPERATIONS

Maintenance
- Cluster Upgrades
- Cluster Resizing
- Capacity Planning
- User & Package Management
- Networking Policies
- Auditing
- Backups & Disaster Recovery

Troubleshooting
- Debugging
  - Services
  - System
- Tracing
- Chaos engineering


METRICS & MONITORING


METRICS CONCEPTS


METRICS TOOLCHAIN

● storage:

a. Elasticsearch

b. Graphite

c. InfluxDB

d. KairosDB/Cassandra

e. OpenTSDB/HBase

f. others, such as a local filesystem, CephFS, HDFS, etc.
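
To make the storage step concrete, here is a small sketch of how a collected sample could be shipped to one of the backends above. It uses Graphite's plaintext protocol, where each sample is one line of the form `path value timestamp`, conventionally sent to TCP port 2003. The metric path and the host name `graphite.example.com` are placeholders invented for this example, not values from the talk.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one sample as a Graphite plaintext line: path, value, timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(lines, host="graphite.example.com", port=2003):
    """Send pre-formatted lines over TCP. The host name is a placeholder."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall("".join(lines).encode("ascii"))


if __name__ == "__main__":
    # Only format and print here, so the sketch runs without a Graphite server.
    print(graphite_line("cluster.agent1.cpu.load", 0.42, 1500422400), end="")
```

The other backends in the list each have their own ingestion format, so the formatting function is the part that would change when swapping storage.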


METRICS TOOLCHAIN

● dashboard:

a. D3

b. Grafana

c. SignalFx

● alerting:

a. BigPanda

b. PagerDuty

c. SignalFx

d. VictorOps
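
Whichever alerting service is used, something has to decide when a metric is bad enough to page someone. The following is a minimal sketch of one common rule: alert only when the last few samples all exceed a threshold, so a single spike does not wake anyone up. The function name `should_alert` and its parameters are assumptions for illustration and do not correspond to the API of any service listed above.

```python
def should_alert(samples, threshold, consecutive=3):
    """Return True when the last `consecutive` samples all exceed `threshold`."""
    if consecutive < 1:
        raise ValueError("consecutive must be at least 1")
    if len(samples) < consecutive:
        # Not enough data yet to make a decision.
        return False
    return all(value > threshold for value in samples[-consecutive:])


if __name__ == "__main__":
    # Hypothetical CPU utilization samples, newest last.
    print(should_alert([0.62, 0.91, 0.95, 0.97], threshold=0.9))
    print(should_alert([0.95, 0.50, 0.97], threshold=0.9))
```

The first call reports a sustained breach and the second does not, because the middle sample dropped back under the threshold.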


INTEGRATED METRICS TOOLCHAIN

● Amazon CloudWatch
● AppDynamics
● Azure Monitor
● Circonus
● DataDog
● dcos/metrics
● Ganglia
● Google Stackdriver
● Hawkular
● Icinga
● Librato
● Nagios
● New Relic
● OpsGenie
● Pingdom
● Prometheus
● Ruxit Dynatrace
● Sensu
● Sysdig
● Zabbix


LOGGING


LOGGING SCOPES

Application

Container

Host
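
One way to keep the three scopes usable after logs are centralized is to tag every log record with all of them. Below is a rough sketch using only Python's standard `logging` module. The field names (`host`, `container`, `app`) and the sample values are assumptions made for this example; DC/OS and Docker each have their own log formats.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            # These attributes are attached by the LoggerAdapter below.
            "host": getattr(record, "host", None),
            "container": getattr(record, "container", None),
            "app": getattr(record, "app", None),
        }, sort_keys=True)


def scoped_logger(stream, host, container, app):
    """Build a logger whose records carry host, container, and app scopes."""
    logger = logging.getLogger("scoped-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    scopes = {"host": host, "container": container, "app": app}
    return logging.LoggerAdapter(logger, scopes)


if __name__ == "__main__":
    buffer = io.StringIO()
    log = scoped_logger(buffer, host="agent-1", container="c-42", app="web")
    log.info("started")
    print(buffer.getvalue(), end="")
```

With the scopes in every record, a centralized store can answer both "what happened on this host?" and "what did this application do across hosts?" from the same data.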


LOGGING TOOLING EXAMPLES (PRIMITIVES)

● DC/OS logging overview
● Docker logging drivers
● systemd's journalctl


TROUBLESHOOTING

Incl. examples with DC/OS


Effective troubleshooting

A high-level view to discover where the error or failure has occurred (preferably a unified view)

Tooling for tracing an error through the stack (systems, networks, etc.)

Team communication and tooling for delegating responsibility for solutions


DEBUGGING 101

● Services: typically specific to the service; use logging (for example, dcos task log) and dcos node ssh or dcos task exec for per-node investigations
● System:
  ○ Simple diagnostics via dcos node diagnostics
  ○ Comprehensive dump via clump
  ○ Services deployment troubleshooting dashboard
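
For scripted investigations, the service-level commands above can be wrapped in a few lines of Python. This is only a sketch: it builds the argument lists for `dcos task log` and `dcos task exec` and prints them as a dry run, so it works without a cluster. The helper names and the task name `my-service` are hypothetical, and only the basic positional forms of the commands are used; check `dcos task --help` on your CLI version for the available options.

```python
import shlex


def task_log_command(task_id):
    """argv for fetching a task's log, as in `dcos task log <task>`."""
    return ["dcos", "task", "log", task_id]


def task_exec_command(task_id, command):
    """argv for running a shell-style `command` string inside a task's container."""
    return ["dcos", "task", "exec", task_id, *shlex.split(command)]


if __name__ == "__main__":
    commands = (
        task_log_command("my-service"),
        task_exec_command("my-service", "ls -l /tmp"),
    )
    for argv in commands:
        # Dry run: print instead of executing, so no cluster is required.
        print(shlex.join(argv))
```

To actually run a command, pass the list to `subprocess.run(argv, check=True)` on a machine where the DC/OS CLI is installed and attached to a cluster.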


Debugging Dashboard


OTHER TROUBLESHOOTING TECHNIQUES

● Tracing
  ○ Idea: identify latency issues and perform root-cause analysis in a distributed setup
  ○ OpenTracing
● Chaos Engineering
  ○ Idea: proactively break (parts of) the system to understand how it reacts
  ○ Chaos Monkey
  ○ DRAX
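
The tracing idea can be sketched without any tracing library. A trace is a tree of timed spans, one per hop, and the hop to investigate is the one that spent the most time in its own work rather than waiting on the services it called. The `Span` class and the function names below are invented for this illustration and are not the OpenTracing API; the sketch also assumes that the children of a span do not overlap in time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    service: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


def self_times(spans):
    """Map span_id to time spent in the span itself, excluding direct children."""
    result = {span.span_id: span.duration_ms for span in spans}
    for span in spans:
        if span.parent_id in result:
            # Time spent in a child is not the parent's own work.
            result[span.parent_id] -= span.duration_ms
    return result


def slowest_hop(spans):
    """Return (service, self_time_ms) for the span with the most own latency."""
    times = self_times(spans)
    by_id = {span.span_id: span for span in spans}
    worst = max(times, key=times.get)
    return by_id[worst].service, times[worst]


if __name__ == "__main__":
    # Hypothetical request: gateway calls orders, which calls the database.
    trace = [
        Span("a", None, "gateway", 0, 120),
        Span("b", "a", "orders", 10, 110),
        Span("c", "b", "db", 20, 95),
    ]
    print(slowest_hop(trace))
```

In this made-up trace the gateway looks slowest by total duration, but most of the latency is the database's own time, which is where root-cause analysis should start.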


MAINTENANCE & BEYOND


Overview

Upgrades
● How to install a new version of X?

Sizing
● When to scale what (service-level vs. nodes)?

User and package management
● Who gets to access/install which services in what way?

Networking
● Is everything getting where it needs to be? Does some traffic need priority?
● What services can talk to each other and in which way?

Auditing
● Who accessed what, when and how?

Disaster Recovery
● How is the continuous operation of the cluster and the services accomplished? What happens when the cluster (or critical infra components like ZK) goes down?
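
The sizing question lends itself to a quick back-of-the-envelope calculation. The sketch below estimates how many agent nodes a cluster needs for a given total CPU request while keeping some headroom free for failover and bursts. The function name, the 20% default headroom, and the numbers in the example are assumptions for illustration, not recommendations from the talk.

```python
import math


def agents_needed(cpus_requested, cpus_per_agent, headroom=0.2):
    """Number of agents needed so that `headroom` of each agent's CPU stays free."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_per_agent = cpus_per_agent * (1 - headroom)
    return math.ceil(cpus_requested / usable_per_agent)


if __name__ == "__main__":
    # Hypothetical cluster: services request 50 CPUs, each agent has 8.
    print(agents_needed(cpus_requested=50, cpus_per_agent=8))
```

If the result is larger than the current node count, adding nodes is the answer; if the cluster has spare capacity and only one service is struggling, scaling that service is usually enough.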


Things will go wrong.

These things can’t be an afterthought.

You must build time into your deployment and maintenance plans.

Planning


Checklist

Cloud-Native Infrastructure “Must Haves”

❏ Metrics collection
❏ Centralized logging
❏ Debugging tools that cover:
  ❏ Host
  ❏ Container
  ❏ Application
❏ Upgrade strategy
❏ Backups
❏ Disaster recovery


Questions? Feedback?

Elizabeth K. Joseph
Twitter: @pleia2

Email: [email protected]

@dcos

[email protected]

/dcos, /dcos/examples, /dcos/demos

chat.dcos.io


Let’s process some data!

https://github.com/dcos/demos/tree/master/fastdata-iot/


The SMACK Stack

Data Ingestion

Request/Response

Devices

Client

Sensors

Use Cases:

● Anomaly detection

● Personalization

● IoT Applications

● Predictive Analytics

● Machine Learning

Message Queue/Bus

Microservices

Distributed Storage

Analytics (Streaming)