Powering Radical Agility with Docker

Post on 18-Jan-2017

Docker - Powering RA at Zalando

Docker Meetup - Dortmund 7.6.2016 | jan.mussler@zalando.de | @JanMussler

15 countries
3 fulfillment centers
18+ million active customers
3.0+ billion € revenue
135+ million visits per month
1,000+ employees in tech

Europe's Leading Fashion Platform

Visit us: tech.zalando.com

Zalando’s Technology History

Platform

80+ Engineering teams

Platform team

deploy

Server needs

Storage requests

RADICAL AGILITY

AUTONOMY

Compliance Innovation

STUPS

AWS

STUPS

DOCKER DEPLOY

SSH ACCESS

AUDIT REPORTS

FULL AWS ACCESS

STUPS: A PLATFORM ON TOP OF AMAZON WEB SERVICES

Internet

*.abc.example.org *.xyz.example.org

Team ABC Team XYZ

ISOLATED AWS ACCOUNTS

EC2 EC2

ELB ELB

EC2

DEPLOYMENT

IMMUTABLE STACKS

ELB myapp-1

myapp.example.org

EC2+ Docker

EC2+ Docker

EC2+ Docker

IMMUTABLE STACKS

ELB myapp-1

EC2+ Docker

EC2+ Docker

EC2+ Docker

ELB myapp-2

EC2+ Docker

EC2+ Docker

myapp.example.org
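The two stack diagrams above suggest that myapp.example.org first points at myapp-1 only and then at both stack versions. A minimal Python sketch of shifting traffic shares between immutable stacks, assuming weighted routing behind the shared DNS name. `shift_traffic` and the weight dictionary are hypothetical names for illustration, not part of STUPS or Senza.

```python
def shift_traffic(weights: dict, target: str, share: float) -> dict:
    """Give `target` the requested share and scale the other stacks to fit.

    `weights` maps stack name -> fraction of traffic (fractions sum to 1.0).
    """
    if target not in weights:
        raise KeyError(f"unknown stack: {target}")
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0.0 and 1.0")

    others = {name: w for name, w in weights.items() if name != target}
    total_others = sum(others.values())
    if total_others == 0 and share < 1.0:
        raise ValueError("no other stack can take the remaining traffic")

    result = {target: share}
    for name, weight in others.items():
        # Keep the relative proportions among the remaining stacks.
        result[name] = (1.0 - share) * weight / total_others if total_others else 0.0
    return result


# myapp-2 was just created and receives no traffic yet.
weights = {"myapp-1": 1.0, "myapp-2": 0.0}
weights = shift_traffic(weights, "myapp-2", 0.25)
print(weights)
```

Because the old stack is never modified, rolling back in this sketch is just another call that returns the full share to myapp-1.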

● Immutable AMI
● YAML user data
● Docker runtime
● Application logging: LogEntries, Scalyr, CloudWatch Logs
● Prometheus Node Agent for metrics
● KMS encrypted env vars

TAUPAGE AMI

Taupage AMI

SENZA: DEFINITION YAML

SenzaInfo:
  StackName: hello-world
  Parameters:
    - ImageVersion:
        Description: "Docker image version of Hello World."
SenzaComponents:
  - Configuration:
      Type: Senza::StupsAutoConfiguration # auto-detect network setup
  - AppServer: # will create a launch configuration and ASG with scaling triggers
      Type: Senza::TaupageAutoScalingGroup
      InstanceType: t2.micro
      SecurityGroups: [app-hello-world]
      ElasticLoadBalancer: AppLoadBalancer
      TaupageConfig:
        runtime: Docker
        source: "stups/hello-world:{{Arguments.ImageVersion}}"
        ports:
          8080: 8080
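The `{{Arguments.ImageVersion}}` placeholder in the definition is filled in when the stack is created. A minimal Python sketch of that substitution idea, assuming a plain regex replacement. `render_definition` is a hypothetical helper for illustration and not Senza's actual templating code.

```python
import re

# Matches placeholders of the form {{Arguments.Name}}.
PLACEHOLDER = re.compile(r"\{\{\s*Arguments\.(\w+)\s*\}\}")


def render_definition(text: str, arguments: dict) -> str:
    """Replace every {{Arguments.Name}} placeholder with its value."""
    def substitute(match):
        name = match.group(1)
        if name not in arguments:
            raise KeyError(f"missing parameter: {name}")
        return str(arguments[name])

    return PLACEHOLDER.sub(substitute, text)


source = 'source: "stups/hello-world:{{Arguments.ImageVersion}}"'
rendered = render_definition(source, {"ImageVersion": "0.2"})
print(rendered)  # source: "stups/hello-world:0.2"
```

Failing loudly on a missing parameter is a design choice of this sketch: a stack that silently deploys an image tag with an unresolved placeholder would be harder to debug.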

SENZA: STACK DEPLOYMENT

$ senza create hello-world.yaml 1 0.2
Generating Cloud Formation template.. OK
Creating Cloud Formation stack hello-world-1.. OK

$ senza events hello-world.yaml 1
Stack Name │Ver.│Resource Type         │Resource ID   │Status             │Status Reason  │Event Time
hello-world 1    CloudFormation::Stack hello-world-1  CREATE_IN_PROGRESS  User Initiated  10m ago
...
hello-world 1    CloudFormation::Stack hello-world-1  CREATE_COMPLETE                     6m ago

SENZA: MANAGE STACKS

SSH ACCESS

SSH ACCESS: TIME-LIMITED ACCESS TO ANY TEAM SERVER

LOGGING

Automation

GoCD

ThoughtWorks' GoCD in action

GoCD - Pipeline example - configuration overlay

Plan - B

The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service.

- oauth.net

OAUTH 2.0?

● Robustness & resilience ⇒ Cassandra, no SPOF
● Low latency for token validation ⇒ Token Info next to application
● Horizontal scalability ⇒ Cassandra, “stateless” Token Info

PLAN B: GOALS - Build an open source OAuth2 provider

PLAN B: COMPLETE PICTURE

bob

alice

create token

Token Info

validate

Provider

credential storage

Revocation

poll public keys

poll revocation lists

S3

call with Bearer token
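The picture above can be sketched in Python: the Provider creates a signed token, and Token Info validates it locally using keys and revocation lists that it polls in the background. This is a rough sketch under stated assumptions, not Plan B's implementation. The token layout, `create_token`, and the `TokenInfo` class are hypothetical, and HMAC-SHA256 stands in for the EC signature so that the example needs only the standard library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode().rstrip("=")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def create_token(claims: dict, key_id: str, key: bytes) -> str:
    """Provider side: sign the claims (HMAC stands in for an EC signature)."""
    payload = _b64(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(key, f"{key_id}.{payload}".encode(), hashlib.sha256).digest()
    return f"{key_id}.{payload}.{_b64(signature)}"


class TokenInfo:
    """Validates tokens locally; no call to the Provider per request."""

    def __init__(self):
        self.keys = {}        # key id -> key material, refreshed by polling
        self.revoked = set()  # revoked token ids, refreshed by polling

    def validate(self, token: str, now=None) -> dict:
        try:
            key_id, payload, signature = token.split(".")
        except ValueError:
            raise ValueError("malformed token") from None
        key = self.keys.get(key_id)
        if key is None:
            raise ValueError("unknown key")
        expected = hmac.new(key, f"{key_id}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _unb64(signature)):
            raise ValueError("bad signature")
        claims = json.loads(_unb64(payload))
        if claims["exp"] <= (time.time() if now is None else now):
            raise ValueError("token expired")
        if claims["jti"] in self.revoked:
            raise ValueError("token revoked")
        return claims


info = TokenInfo()
info.keys["k1"] = b"demo-secret"  # in the picture this would come from polling
token = create_token({"sub": "alice", "jti": "t-1", "exp": 2000}, "k1", b"demo-secret")
claims = info.validate(token, now=1000)
print(claims["sub"])  # alice
```

Since validation touches only in-memory state, each instance stays stateless in the sense of the goals slide and can be scaled out next to the application.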

Written in Go

~16 MB Docker image

Stateless application

CPU bound, Go 1.6 ~40x speedup for EC verify

EC2 instance start to healthy: 45 sec

Scaling Token Info example

ZMON

Flexible and extendable: Checks & Alerts in Python

Integrate: REST APIs, OAUTH2, AWS Auto Discovery

Fully configurable via UI / API: no restarts required!

Great for teams: team dashboards, alerts inheritance

Fast/scaling metrics: Redis, KairosDB + Grafana 3

Hackweek 2015 - iOS app and Android app ;-)

ZMON - Highlights ;-)
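To illustrate the split between checks and alerts, and the idea of alert inheritance, here is a minimal Python sketch. It assumes a check returns a value per entity and an alert is a condition on that value. All names (`run_check`, `alert_condition`, `BASE_ALERT`, `TEAM_ALERT`) and the entity fields are hypothetical and do not reflect ZMON's actual API.

```python
def run_check(entity: dict) -> dict:
    """Hypothetical check: derive disk usage in percent for one entity."""
    return {"disk_used_pct": 100.0 * entity["disk_used_gb"] / entity["disk_total_gb"]}


def alert_condition(value: dict, alert: dict) -> bool:
    """Hypothetical alert condition evaluated on the check value."""
    return value["disk_used_pct"] > alert["threshold_pct"]


# A team alert inherits the base definition and overrides only the threshold.
BASE_ALERT = {"name": "Disk almost full", "threshold_pct": 90.0}
TEAM_ALERT = {**BASE_ALERT, "threshold_pct": 75.0}

entities = [
    {"id": "myapp-1-a", "disk_used_gb": 40.0, "disk_total_gb": 50.0},  # 80 % used
    {"id": "myapp-1-b", "disk_used_gb": 10.0, "disk_total_gb": 50.0},  # 20 % used
]


def alerting_entities(alert: dict, entities: list) -> list:
    """Return the ids of all entities for which the alert fires."""
    return [e["id"] for e in entities if alert_condition(run_check(e), alert)]


print(alerting_entities(BASE_ALERT, entities))  # []
print(alerting_entities(TEAM_ALERT, entities))  # ['myapp-1-a']
```

The same check result feeds both alerts, so a team can tighten a threshold without duplicating the check.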

Continued ...

Instance Metrics
● Memory usage
● Disk space usage
● CPU usage
● Application logs
● Application metrics

Monitoring instances on AWS

Scalyr Agent: log shipping

Prometheus Node Agent: :9100/metrics

Taupage AMI (Ubuntu base)

Application Container: Go / Spring Boot / Cassandra

Docker runtime: :8080 -> app, :7979 -> metrics
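The node agent exposes its metrics as plain text on :9100/metrics. A minimal Python sketch of reading such text into a dictionary, assuming lines of the form `name{labels} value` without trailing timestamps. `parse_metrics` is a hypothetical helper, and the sample text below is made up for illustration rather than taken from a real instance.

```python
def parse_metrics(text: str) -> dict:
    """Parse Prometheus-style text lines into {series: value}.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    Assumes no timestamp follows the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series.strip()] = float(value)
    return samples


SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_filesystem_avail_bytes{mountpoint="/"} 1.5e+09
"""

metrics = parse_metrics(SAMPLE)
print(metrics["node_load1"])  # 0.42
```

A monitoring check could fetch the text over HTTP from the instance and pass it to a parser like this one before evaluating alert conditions.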

Annotated Metric Data in Grafana

Running same Docker Image everywhere

CLAIR - SQS

CoreOS’ Clair with PierOne - Static vulnerability analysis of images

Learnings?

● AWS terminology and behavior
● OAuth2 + Security + Security Groups
● Ops can be hard -> SaaS?
● CF deployment takes time
● DNS load balancing and switching :-(
  ○ Remember timeout config …!!
  ○ ELB so-so ...
● Great flexibility and power though

A lot of input to cover ...

Zalando on GitHub: https://github.com/zalando

STUPS online: https://stups.io

ZMON Demo: https://demo.zmon.io

Zalando Tech: https://tech.zalando.com
