
Adam Zegelin & Richard Banks

Jan 06, 2018


Dockerizing Cassandra on Azure (ARC443A)
Transcript
Page 1: Adam Zegelin & Richard Banks

Spark the future.

Page 2: Adam Zegelin & Richard Banks

Adam Zegelin & Richard Banks

Dockerizing Cassandra on Azure

ARC443A

Page 3: Adam Zegelin & Richard Banks

Setting the scene

Page 4: Adam Zegelin & Richard Banks
Page 5: Adam Zegelin & Richard Banks
Page 6: Adam Zegelin & Richard Banks

Before Containers…

Page 7: Adam Zegelin & Richard Banks

Shipping Containers: Invented When?

https://en.wikipedia.org/wiki/Malcom_McLean

In 1956, most cargo was loaded and unloaded by hand. Hand-loading a ship cost $5.86 a ton.

Malcom McLean, born in 1913, developed the modern intermodal shipping container, which revolutionized transport and international trade.

McLean knew "A ship earns money only when she's at sea," and based his business on that efficiency.

Using containers, it cost only 16 cents a ton, a roughly 36-fold savings ($5.86 ÷ $0.16 ≈ 36.6). Containerization also greatly reduced the time to load and unload ships, improving reliability.

Page 8: Adam Zegelin & Richard Banks

Virtual Machine

Start with a VM.

Page 9: Adam Zegelin & Richard Banks

Application

Stuff our application into a single, bloated image.

Page 10: Adam Zegelin & Richard Banks

Application

Everything’s great...

Page 11: Adam Zegelin & Richard Banks

Application

Until the system goes down or traffic goes up.

Page 12: Adam Zegelin & Richard Banks

Two options: scale up, or scale out

Page 13: Adam Zegelin & Richard Banks

Scale out

Application Application Application

3x Resources, (N)x Dev-ops Complexity

Page 14: Adam Zegelin & Richard Banks

Scale out

3x Resources, (N)x Dev-ops Complexity

Web API Web API Web API

Page 15: Adam Zegelin & Richard Banks

But what if I only need to scale parts of my app?

3x Resources, (N)x Dev-ops Complexity

Web API API API

Page 16: Adam Zegelin & Richard Banks

Web

API

1. Docker images
2. Create containers
3. Deploy to VMs

Page 17: Adam Zegelin & Richard Banks

Containers are like lightweight VMs.

Page 18: Adam Zegelin & Richard Banks

Traditional VM approach

Page 19: Adam Zegelin & Richard Banks

Container approach

 Container 1 Container 2 Container 3

Unassigned

Host and VM OS files and libraries

Host

Page 20: Adam Zegelin & Richard Banks

Dependencies: Every application has its own dependencies, which include both software (services, libraries) and hardware (CPU, memory, storage).

Virtualization: A container engine is a lightweight virtualization mechanism that isolates these dependencies per application by packaging them into virtual containers.

Shared host OS: Processes in containers are isolated from other containers in user space, but share the kernel with the host and other containers.

Flexible: Differences in underlying OS and infrastructure are abstracted away, streamlining a “deploy anywhere” approach.

Fast: Containers can be created almost instantly, enabling rapid scale-up and scale-down in response to changes in demand.

Container

App A: Bins/libraries

App B: Bins/libraries

Host OS w/ Container support

Server

Page 21: Adam Zegelin & Richard Banks

Containerized apps run anywhere containers run: Linux and Windows.

Page 22: Adam Zegelin & Richard Banks

Containers: creation, deployment, and management

Developers build and test apps in containers, using their development environment.
Containers are pushed to a central repository.
Operations automates deployment and monitors deployed apps from the central repository.
Operations collaborates with developers to provide app metrics and insights.
Developers update, iterate, and deploy updated containers.

[Diagram: numbered workflow between developers, the central repository, operations, and Physical/Virtual Servers]

Page 23: Adam Zegelin & Richard Banks

A Vibrant Community

Page 24: Adam Zegelin & Richard Banks

Docker integration
Joint strategic investments to drive containers forward

Docker: An open source engine that automates the deployment of any application as a portable, self-sufficient container that can run almost anywhere.
Partnership: Enable the Docker toolset to manage multi-container applications using both Linux and Windows containers, regardless of the hosting environment or cloud provider.

Strategic investments
Investments in the next wave of Windows Server
Open source development of the Docker Engine for Windows Server
Azure support for the Docker Open Orchestration APIs
Federation of Docker Hub images into the Azure Gallery and Portal

[Diagram: Docker; Dockerized app; Windows Server Container; Linux Container; Customer Datacenter; Service Provider; Microsoft Azure; "Run anywhere"]

Page 25: Adam Zegelin & Richard Banks

Docker integration
Joint strategic investments to drive containers forward

Docker Hub in Azure: Huge collection of open and curated applications available for download.
Collaboration: Bring Windows Server containers to the Docker ecosystem to expand the reach of both developer communities.
Docker Engine: Docker Engine for Windows Server containers is being developed under the aegis of the Docker open source project.
Docker client: Windows customers can use the same standard Docker client and interface on multiple development environments.

[Diagram: Docker Client talking to a Docker Engine (Daemon) on Windows Server (Windows Server Container Support) and a Docker Engine (Daemon) on Linux (Linux Container Support, LXC)]

Docker.exe examples: docker run, docker images

Docker Remote API examples: GET images/json, POST containers/create
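The two Remote API calls named on the slide are ordinary HTTP requests. As a rough sketch only: the code below renders them as raw HTTP/1.1 requests and, separately, shows how they could be sent over the daemon's UNIX socket. The socket path, the `Host` value, and the helper names `build_request` and `send` are assumptions made for illustration; the `Image` value reuses the `cassandra:2.1.5` tag that appears later in the deck.

```python
import json
import socket

DOCKER_SOCKET = "/var/run/docker.sock"  # assumption: default local daemon socket


def build_request(method, path, body=None):
    """Render a minimal HTTP/1.1 request for the Docker Remote API."""
    lines = [
        f"{method} {path} HTTP/1.1",
        "Host: localhost",       # HTTP/1.1 requires a Host header
        "Connection: close",
    ]
    payload = b""
    if body is not None:
        payload = json.dumps(body).encode("utf-8")
        lines.append("Content-Type: application/json")
        lines.append(f"Content-Length: {len(payload)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + payload


def send(request, socket_path=DOCKER_SOCKET):
    """Send a raw request to the daemon and return the raw response bytes."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


# The two examples from the slide, rendered but not sent.
list_images = build_request("GET", "/images/json")
create_container = build_request(
    "POST", "/containers/create", {"Image": "cassandra:2.1.5"}
)
```

Calling `send(list_images)` on a host with a running daemon would return the raw HTTP response; nothing is sent when the block is merely loaded.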

Page 26: Adam Zegelin & Richard Banks

Instaclustr

Page 27: Adam Zegelin & Richard Banks

Myself & Instaclustr
Adam Zegelin — Founding Software Engineer & Co-founder of Instaclustr
[email protected] ∙ @zegelin

Managed DataStax Enterprise and Apache Cassandra in the cloud.

Self-service dashboard — create, manage & monitor clusters.
24/7/365 support, on-call engineers, uptime guarantee.
You focus on developing awesome apps — we handle the Cassandra.
Grew from a need for Cassandra in a project.

Page 28: Adam Zegelin & Richard Banks

Cassandra
Open-source NoSQL DBMS for high-performance, high-throughput apps.

Fault-tolerant — data is replicated on multiple nodes.
Scalable — read & write performance scales linearly with node count.
Decentralized — no single point of failure.
Multi-DC capable — replication across data centres/regions.
Tunable consistency — writes “never fail”/quorum/ack from all nodes.
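The "quorum" option in the tunable-consistency bullet comes down to a little arithmetic. A minimal sketch, assuming the usual majority definition of a quorum and the standard overlap rule (a read is guaranteed to see the latest write when read replicas plus write replicas exceed the replication factor); the function names and the replication factor of 3 are illustrative, not taken from the talk.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum: a strict majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(write_acks, read_acks, replication_factor):
    """True when every read set must overlap every write set (R + W > RF)."""
    return write_acks + read_acks > replication_factor


RF = 3  # assumption: three replicas per row

# Quorum writes plus quorum reads overlap on at least one replica.
strong = reads_see_latest_write(quorum(RF), quorum(RF), RF)

# One ack on each side is faster but gives no overlap guarantee.
weak = reads_see_latest_write(1, 1, RF)
```

The trade-off is the one the slide hints at: fewer required acks means higher availability and lower latency, at the cost of possibly reading stale data.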

Page 29: Adam Zegelin & Richard Banks

Cassandra cont’d

3,000 ops/s 6,000 ops/s 9,000 ops/s

Page 30: Adam Zegelin & Richard Banks

Cassandra cont’d
Variety of use cases.
Internet advertising, social media, internet-of-things, online games.

Used by some big names.
Microsoft, Netflix, eBay, Apple, CERN, GitHub, Reddit.

Page 31: Adam Zegelin & Richard Banks

Cassandra in the cloud
Pay for what you use.

Hardware is no longer a concern.

Flexible — respond quickly to changes in capacity requirements.

Page 32: Adam Zegelin & Richard Banks

Ubuntu — The Early Days
Initially we ran a customised Ubuntu image.

Custom cloud-init scripts — RAID disks, fetch config, etc.

Cassandra installed with apt-get install cassandra / dse.
Version upgrades were done in-place.

It worked, but was painful to manage.

Page 33: Adam Zegelin & Richard Banks

Cassandra Nodes — Software Stack
CoreOS — lightweight container runtime OS.
Docker — containerisation of everything.
systemd — service management.
journald — logging.
D-Bus — controlling systemd from Java from inside containers.

Page 34: Adam Zegelin & Richard Banks

CoreOS
One of the first “Docker Operating Systems”.

Small and minimalist — not much userland (not even man!).
Runs systemd (vs. Ubuntu’s at-the-time upstart) & journald & dbus.
In use by some big players (Rackspace, PlayStation, Instaclustr 😀).
Recent funding from Google Ventures.

Page 35: Adam Zegelin & Richard Banks

CoreOS cont’d
CoreOS is responsible for building images for Azure.
One less step in our build process.

In-place updates with rollback on failure.
2 system partitions, USR-A and USR-B.
One is flagged active, the other is inactive.
Updates are installed to the inactive partition and the active flags swapped.
Failed updates are rolled back by swapping the active flag.
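The USR-A/USR-B scheme can be pictured as a two-slot state machine. The following is a toy model only, not CoreOS code: the class name, the version strings, and the `boots_ok` callback are all hypothetical stand-ins for the real update and boot-verification machinery.

```python
from dataclasses import dataclass, field


@dataclass
class DualPartitionHost:
    """Toy model of a host with two system partitions, one active at a time."""
    versions: dict = field(
        default_factory=lambda: {"USR-A": "v1", "USR-B": None}
    )
    active: str = "USR-A"

    @property
    def inactive(self):
        return "USR-B" if self.active == "USR-A" else "USR-A"

    @property
    def running_version(self):
        return self.versions[self.active]

    def apply_update(self, new_version, boots_ok):
        """Install to the inactive partition, swap flags, roll back on failure."""
        target = self.inactive
        self.versions[target] = new_version  # the running system is untouched
        previous = self.active
        self.active = target                 # swap the active flag
        if not boots_ok(new_version):
            self.active = previous           # rollback is just swapping back
            return False
        return True


host = DualPartitionHost()
first = host.apply_update("v2", boots_ok=lambda version: True)    # succeeds
second = host.apply_update("v3", boots_ok=lambda version: False)  # rolled back
```

After the failed second update the host is still running `v2` from USR-B, which is the property that makes the rollback cheap: the previous system was never modified.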

Page 36: Adam Zegelin & Richard Banks

Docker
Container runtime + standardised image distribution & hosting + ecosystem.

Immutable images — Yay! 🎉
Images running in dev, test and production environments are equal.
Software installs, upgrades and uninstalls are clean.
Components are isolated — potentially conflicting components (different library versions, JVM versions, etc.) can co-exist.

Page 37: Adam Zegelin & Richard Banks

Docker cont’d
We containerise everything.
C*, internal services, node management and monitoring apps.

Single, well understood, image build and deploy process.
docker build & docker push

Executed via Makefiles — one Make target per image.
make push-all builds and pushes everything.

Page 38: Adam Zegelin & Richard Banks

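[Diagram: Docker image hierarchy, base image first]

debian:jessie (published by Debian)
common-base (common/utility packages: python, openssl, curl, bzip, etc.)
base-openjdk, base-oraclejre
cassandra-common
apache-cassandra, dse-cassandra
Instaclustr apps

Layer sizes shown on the slide (their pairing with individual layers is not recoverable from this transcript): ≅ 120MB, ≅ 100MB, ≅ 300MB, ≅ 20KB, ≅ 300MB, ≅ 40MB, ≅ 100MB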
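A layered hierarchy like this implies a build order: every parent image has to exist before its children are built. A minimal sketch of that idea follows. The parent/child edges are assumptions read off the flattened diagram (in particular, which JRE base `cassandra-common` derives from is a guess), and the `images/<name>` build directories are hypothetical; only `docker build -t` and `docker push` come from the deck.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Assumed parent of each image. These edges are inferred, not confirmed.
PARENT = {
    "common-base": "debian:jessie",
    "base-openjdk": "common-base",
    "base-oraclejre": "common-base",
    "cassandra-common": "base-oraclejre",  # assumption: could be base-openjdk
    "apache-cassandra": "cassandra-common",
    "dse-cassandra": "cassandra-common",
}


def build_order(parents):
    """Return locally built images ordered so parents come before children."""
    graph = {image: {parent} for image, parent in parents.items()}
    order = TopologicalSorter(graph).static_order()
    # debian:jessie is pulled from a registry rather than built here.
    return [image for image in order if image in parents]


def commands(image):
    """docker build / docker push command lines for one image."""
    return [
        ["docker", "build", "-t", image, f"images/{image}"],  # hypothetical dir
        ["docker", "push", image],
    ]


plan = [cmd for image in build_order(PARENT) for cmd in commands(image)]
```

A `push-all` style target would run the commands in `plan` in sequence; because shared layers are built once, a change to `common-base` only needs to be rebuilt in one place.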

Page 39: Adam Zegelin & Richard Banks

Docker + CoreOS
Docker gives us immutable images for our components without instance replacement.
CoreOS handles the rest (OS-level) via in-place updates.

Docker is provider agnostic.
CoreOS runs on all major cloud providers and bare-metal.

The result ☞ Instaclustr-managed C* can run anywhere!

Page 40: Adam Zegelin & Richard Banks

Cassandra Versioning
We support multiple versions of Cassandra.
2.0.x / 2.1.x / 2.2.x / 3.x
Apache (ASF) vs. DataStax Enterprise

1 Docker image per C* distribution (ASF/DSE).
1 tag per version (e.g., 2.1.5).

Page 41: Adam Zegelin & Richard Banks

Cassandra Versioning cont’d
Compare with old model — image per distribution × version × region.

We currently support 2 distributions, with a total of 13 versions between them, on 3 cloud providers, with a total of 29 regions (each requiring a separate image).

13 versions × 29 provider regions = 377 images! 😳
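The arithmetic behind the two models is easy to check. The sketch below compares them; the counts (13 versions, 29 regions, 2 distributions) are from the slide, while the image names and the specific version strings in `catalogue` are illustrative placeholders rather than Instaclustr's actual catalogue.

```python
def images_old_model(versions, provider_regions):
    """Old model: one VM image per version per provider region."""
    return versions * provider_regions


def image_refs(distribution_versions):
    """Docker model: one image per distribution, one tag per version."""
    return [
        f"{image}:{version}"
        for image, versions in distribution_versions.items()
        for version in versions
    ]


# Placeholder catalogue: names and versions are for illustration only.
catalogue = {
    "apache-cassandra": ["2.1.4", "2.1.5"],
    "dse-cassandra": ["4.7.0"],
}

old_total = images_old_model(versions=13, provider_regions=29)
docker_images = len(catalogue)  # regions no longer multiply the image count
refs = image_refs(catalogue)
```

Under the Docker model the region factor disappears entirely, because the same tagged image is pulled wherever the node happens to run.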

Page 42: Adam Zegelin & Richard Banks

Cassandra Versioning cont’d
Every Instaclustr cluster has a specific C* version.
Selected by user at creation time.

New & replaced nodes must run the exact same version.

Known, sane configuration on every node cluster wide.

Controlled upgrades & rollbacks.

Page 43: Adam Zegelin & Richard Banks

Cassandra Versioning cont’d
Docker = immutable images.

New images per version = clean installs every time.

Prevents the “diverging systems” problem.
In-place upgrade ≠ clean install.

Page 44: Adam Zegelin & Richard Banks

[Diagram: clean images per version vs. in-place upgrades]

Docker model, one clean image per version:
docker run cassandra:2.1.2
docker run cassandra:2.1.3
docker run cassandra:2.1.4
docker run cassandra:2.1.5

vs.

Old model, "ubuntu + cassandra" images for 2.1.2, 2.1.3, 2.1.4 and 2.1.5, each upgraded in place:
apt-get install cassandra 2.1.3
apt-get install cassandra 2.1.4
apt-get install cassandra 2.1.5

Page 45: Adam Zegelin & Richard Banks

Upgrade Rollout
Build Docker image for new Cassandra version.

Deploy to our testing environments.
Perform clean installs and rolling upgrades of test clusters to verify reliability.

Enable in production to select customers for field testing.

Make generally available — new clusters will run new version by default.

Liaise with customers to perform a rolling, cluster-wide upgrade.
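The final step, a rolling cluster-wide upgrade, can be sketched as a loop that touches one node at a time and halts on the first failure. This is an illustration of the general pattern, not Instaclustr's tooling: `restart_with` and `is_healthy` are hypothetical hooks the caller would supply (for example, restarting the node's container with the new image tag and then polling it), and the node names are made up.

```python
def rolling_upgrade(nodes, target_version, restart_with, is_healthy):
    """Upgrade one node at a time; stop at the first failed health check.

    nodes maps node name -> current version and is updated in place.
    Returns (upgraded_node_names, failed_node_name_or_None).
    """
    upgraded = []
    for name in sorted(nodes):
        if nodes[name] == target_version:
            continue                   # already on the target version
        restart_with(name, target_version)
        if not is_healthy(name):
            return upgraded, name      # halt: remaining nodes are untouched
        nodes[name] = target_version
        upgraded.append(name)
    return upgraded, None


cluster = {"node-1": "2.1.4", "node-2": "2.1.4", "node-3": "2.1.5"}
restarts = []
done, failed = rolling_upgrade(
    cluster,
    "2.1.5",
    restart_with=lambda name, version: restarts.append((name, version)),
    is_healthy=lambda name: True,
)
```

Because the remaining replicas keep serving while each node restarts, the cluster stays available throughout; stopping at the first unhealthy node limits the blast radius of a bad image.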

Page 46: Adam Zegelin & Richard Banks

systemd
CoreOS uses systemd for service management.

Has powerful service dependency management.

Controllable via D-Bus — in fact, systemctl communicates with systemd via D-Bus. Anything systemctl does can be done via D-Bus calls.

Page 47: Adam Zegelin & Richard Banks

D-Bus
RPC & notifications between processes.

Socket-based (typically UNIX sockets, but can be TCP).
Accessible inside a container — mount the socket:
docker run -v /run/dbus:/run/dbus -v /run/systemd:/run/systemd …

Multiple language bindings, including Java.

Page 48: Adam Zegelin & Richard Banks

systemd + D-Bus
systemd is controllable via D-Bus.
Control host systemd from inside a Docker container.

No need to fork/exec to run systemctl and co.

Java bindings — dbus-java
systemctl restart cassandra.service ≝ systemdManager.RestartUnit("cassandra.service", "replace");
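The talk uses the Java bindings (dbus-java). For comparison only, here is roughly the same `RestartUnit` call using the third-party dbus-python bindings, which is a substitution and not what the speakers ran. It assumes the host's D-Bus sockets are mounted into the container as described above and that dbus-python is installed; the import is deferred so the block loads without it.

```python
SYSTEMD_BUS_NAME = "org.freedesktop.systemd1"
SYSTEMD_OBJECT_PATH = "/org/freedesktop/systemd1"
SYSTEMD_MANAGER_IFACE = "org.freedesktop.systemd1.Manager"


def restart_unit(unit, mode="replace"):
    """Ask systemd to restart a unit over D-Bus; returns the job object path."""
    # Third-party dependency (dbus-python), imported lazily on purpose.
    import dbus

    bus = dbus.SystemBus()
    systemd = bus.get_object(SYSTEMD_BUS_NAME, SYSTEMD_OBJECT_PATH)
    manager = dbus.Interface(systemd, dbus_interface=SYSTEMD_MANAGER_IFACE)
    # Same call as `systemctl restart <unit>`, without any fork/exec.
    return manager.RestartUnit(unit, mode)
```

Usage would be `restart_unit("cassandra.service")` from a process that can reach the system bus; the `"replace"` mode matches the second argument shown on the slide.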

Page 49: Adam Zegelin & Richard Banks

Cassandra on Docker + systemd
Cassandra runs as PID 1 in the container.
1 primary process per container model — process sandboxing, not an entire userland.

Cassandra runs in foreground mode (-f).
Responds to SIGTERM via docker stop, systemctl stop, etc.

Cassandra data and configuration are persistent on the host.
They survive container restarts.
Cassandra data and configuration directories are mounted from the host:

docker run -v /data/etc/cassandra:/etc/cassandra …

Page 50: Adam Zegelin & Richard Banks

Cassandra on Docker + systemd cont’d
Docker containers managed via systemd.

cassandra.service execs docker run cassandra …
systemctl [start|stop|restart|status|…] cassandra

Cassandra logging configured to write only to stdout.
systemd logging best practice — systemd forwards stdout/stderr to journal.
Message flow: Cassandra ⇢ Docker ⇢ systemd ⇢ journal.

Logs available with journalctl -u cassandra
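To make the "cassandra.service execs docker run" idea concrete, the sketch below renders a minimal unit file that wraps a foreground `docker run`. It is a guess at the general shape, not the speakers' actual unit: the `/usr/bin/docker` path, the `--rm` and `--name` flags, the `Restart=on-failure` policy, and the function names are all assumptions. The volume mount, image tag, and host networking flag are taken from elsewhere in the deck.

```python
def docker_run_argv(image, name, volumes=(), host_network=True):
    """Build a foreground `docker run` command line for a single container."""
    argv = ["/usr/bin/docker", "run", "--rm", "--name", name]
    if host_network:
        argv.append("--net=host")
    for host_path, container_path in volumes:
        argv += ["-v", f"{host_path}:{container_path}"]
    return argv + [image]


def render_unit(image, name, volumes=()):
    """Render a minimal systemd service unit that wraps the container."""
    exec_start = " ".join(docker_run_argv(image, name, volumes))
    return "\n".join([
        "[Unit]",
        f"Description={name} in Docker",
        "After=docker.service",
        "Requires=docker.service",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
        f"ExecStop=/usr/bin/docker stop {name}",
        "Restart=on-failure",
        "",
    ])


unit = render_unit(
    "cassandra:2.1.5",
    "cassandra",
    volumes=[("/data/etc/cassandra", "/etc/cassandra")],
)
```

Because the container stays in the foreground and writes to stdout, systemd captures its output in the journal, which is what makes `journalctl -u cassandra` work without any extra log shipping.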

Page 51: Adam Zegelin & Richard Banks

Docker Tips — PID 1
PID 1 gets special treatment by the kernel — traditionally it is init.
Both on the host and in a container.

Processes get a default signal(7) handler — except PID 1.

PID 1 ∴ ignores all signals for which there hasn’t been a handler set.
SIGTERM, SIGINT and SIGHUP all become no-ops.
SIGKILL (9) still works — it’s uncatchable/uninterruptible.

Page 52: Adam Zegelin & Richard Banks

Docker Tips — PID 1 cont’d
docker stop sends a SIGTERM, waits, then SIGKILL.
systemctl stop sends SIGTERM, waits, then SIGKILL.

SIGKILL is very bad.
It’s a last resort — you don’t want your processes being nuked.

Only affects processes that don’t install their own signal handlers.
bash doesn’t, unless you use trap or exec a process that does.

Watch out for wrapper scripts that don’t exec.
java is safe OOTB — the JVM installs its own handlers. python installs a SIGINT handler OOTB, but needs an explicit handler for SIGTERM.
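A process that may end up as PID 1 should install its own SIGTERM handler so that `docker stop` leads to a clean shutdown instead of a SIGKILL after the timeout. A minimal sketch of that pattern, plus the "wrapper should exec" advice; the names `state`, `serve`, and `exec_real_process` are illustrative only.

```python
import os
import signal
import time

state = {"shutdown": False}


def request_shutdown(signum, frame):
    """Record the request; the main loop decides when it is safe to exit."""
    state["shutdown"] = True


# Without an explicit handler, a PID 1 process would ignore SIGTERM.
signal.signal(signal.SIGTERM, request_shutdown)


def serve(poll_interval=0.05, max_polls=None):
    """Loop until SIGTERM arrives; returns the number of polls performed."""
    polls = 0
    while not state["shutdown"] and (max_polls is None or polls < max_polls):
        time.sleep(poll_interval)
        polls += 1
    return polls


def exec_real_process(argv):
    """Last step of a wrapper: replace this process so the child gets signals."""
    os.execvp(argv[0], argv)  # does not return on success
```

A wrapper that does its setup and then calls `exec_real_process([...])` hands its PID over to the real service, so signals sent by `docker stop` or `systemctl stop` reach the process that knows how to shut down.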

Page 53: Adam Zegelin & Richard Banks

Docker Tips — Streamline apt-get
apt-get caches package source lists — don’t embed these in your images.
They contribute to Docker image size and can cause image build failures due to stale data.

dagi — “docker apt-get install”

#!/bin/bash
# Refresh package lists, install the requested packages, then drop the lists again.
set -e
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get -qy \
    --no-install-recommends ${APT_GET_OPTS:-} install "$@"
rm -rf /var/lib/apt/lists/*

Page 54: Adam Zegelin & Richard Banks

Docker Limitations & Sore Spots
docker run is just a TTY proxy (over HTTP no less — woo!)
Actual container process is under the docker dæmon process/cgroup.
systemd requires startup & watchdog notifications to originate from the started process, a child, or a process in the same cgroup.

docker dæmon crash = all containers go bye-bye

docker [run|push|pull|…] everything, inc. image downloads & builds, runs as root in the dæmon on the host!
We run processes inside containers un-elevated.

Page 55: Adam Zegelin & Richard Banks

Docker Limitations & Sore Spots cont’d
Docker’s default bridge networking is very slow.
We run our containers with host networking (--net="host").
Avoids double-NAT.

AUFS performance is slow.
Mount an EXT4/XFS disk into the container and write to that.
We put C*’s data here, for example.

Page 56: Adam Zegelin & Richard Banks

btrfs
Was the default filesystem on CoreOS (not anymore).

One of many storage drivers for Docker.

My advice: Avoid
Confusing user-space tools
Very painful when you run out of disk space

Page 57: Adam Zegelin & Richard Banks

btrfs cont’d

# df -h
Filesystem Size Used Avail Use% Mounted on
rootfs 5.7G 2.6G 3.1G 47% /
<snip>

# touch /test-file
touch: cannot touch 'test-file': No space left on device
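The session above shows why the reported free space cannot be trusted on its own: the tool reports gigabytes available while file creation fails with ENOSPC. One defensive approach, sketched here with hypothetical helper names, is to probe by actually creating a file rather than relying on the reported numbers.

```python
import errno
import os
import shutil
import tempfile


def reported_free_bytes(path):
    """Free space as reported by the filesystem statistics."""
    return shutil.disk_usage(path).free


def can_create_file(directory):
    """Try to create (and remove) a file; False means 'No space left on device'."""
    try:
        fd, name = tempfile.mkstemp(dir=directory)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False
        raise  # permission problems etc. are a different failure
    os.close(fd)
    os.remove(name)
    return True


def space_report(directory):
    """Combine the reported number with the result of a real write probe."""
    return {
        "reported_free_bytes": reported_free_bytes(directory),
        "can_create_file": can_create_file(directory),
    }
```

A monitoring check built on `space_report` would flag the situation in the transcript, where the first value looks healthy and the second is `False`.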

Page 58: Adam Zegelin & Richard Banks

Q&A
Adam Zegelin - @zegelin
Richard Banks - @rbanks54

Page 59: Adam Zegelin & Richard Banks

Complete your session evaluation on My Ignite for your chance to win one of many daily prizes.

Page 60: Adam Zegelin & Richard Banks

Continue your Ignite learning path
Visit Microsoft Virtual Academy for free online training: https://www.microsoftvirtualacademy.com
Visit Channel 9 to access a wide range of Microsoft training and event recordings: https://channel9.msdn.com/
Head to the TechNet Eval Centre to download trials of the latest Microsoft products: http://Microsoft.com/en-us/evalcenter/

Page 61: Adam Zegelin & Richard Banks

© 2015 Microsoft Corporation. All rights reserved.
Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.