
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen, @sebgoa

Jul 15, 2015


Radhika Nair
Page 1: On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen, @sebgoa

Sebastien Goasguen, @sebgoa

On CloudStack, Docker, Kubernetes and Big Data…Oh my !

Page 2:

Who am I ?

• Joined Citrix OSS team in July 2012

• Associate professor at Clemson University prior

• High Performance Computing, Grid computing

• At CERN summer 2009/2010, built their first cloud on OpenNebula

• http://sebgoa.blogspot.com

@sebgoa

Page 3:

What do I do ?

• Apache CloudStack and Libcloud committer + PMC member

• Looking at techs and how they work together

• Half dev, half community manager, + half event planner

Page 4:

Today’s talk

Page 5:

CloudStack

Page 6:

IaaS Landscape

Page 8:

CloudStack clouds

Page 9:

Data Center Orchestrator

• Compute, Storage, Network

• Load Balancers, FWs & VPNs

• Dashboard, Identity Mgmt., Image Mgmt.

• Metering, API (EC2 & CS), Self-service Portal

Page 10:

Interfaces and standards

Page 11:

The Apache Software Foundation

Page 12:

Apache Software Foundation

Page 13:

35 projects in incubation:

• 12 Hadoop related

• ~30% Big Data related

• Spark

117 top level projects:

• ~16 cloud or bigdata +10%

• Deltacloud, Libcloud, Whirr, jclouds

• Hadoop, couchdb, cassandra, mesos, aurora

• Spark

• Bigtop, accumulo, lucene, UIMA

• CloudStack

Page 14:

IaaS History

Page 15:

1998 – VMware

2003 – Xen

2005 – HW assisted Virt

2006 – EC2

2008 – OpenNebula, Eucalyptus

2010 – CloudStack

2010 – OpenStack

2012 – GCE

Page 16:

Goals

• Utility computing

• Elasticity of the infrastructure

• On-demand

• Pay as you go

• Multi-tenant

• Programmable access

Page 17:

So what…

Let’s assume this is solved.

What is not solved:

- Application deployment

- Application scalability

- Application portability

- Application composability

Page 18:

Docker

Page 19:

• Linux container (LXC +)

• Application deployment

• PaaS

• Portability

• Image sharing via DockerHub

• Ease of packaging applications

Page 20:

Building docker images

Fair use from http://blog.octo.com/en/docker-registry-first-steps/

Page 21:

+ config mgmt

Page 22:

“Bake vs. Fry and what do we do with configuration management tools ?”

Page 23:

CoreOS

Page 24:

Good job

Page 25:

Similar projects

Page 26:

CoreOS and CloudStack

Page 27:

CoreOS

• Linux distribution

• Rolling upgrades

• Minimal OS

• Docker support

• etcd and fleet tools to manage distributed applications based on containers.

• Cloud-init support

• Systemd units

Page 28:

Building CoreOS

See Brian Harrington’s video.

Page 29:

CoreOS “OEM”

http://github.com/coreos/coreos-overlay

Page 30:

CoreOS “OEM”

http://github.com/coreos/coreos-overlay

Page 31:

The cloud-init magic

Page 32:

CoreOS on exoscale

Page 33:

Starting containers

#cloud-config

coreos:
  units:
    - name: docker.service
      command: start
    - name: es.service
      command: start
      content: |
        [Unit]
        After=docker.service
        Requires=docker.service
        Description=starts ElasticSearch container

        [Service]
        TimeoutStartSec=0
        ExecStartPre=/usr/bin/docker pull dockerfile/elasticsearch
        ExecStart=/usr/bin/docker run -d -p 9200:9200 -p 9300:9300 dockerfile/elasticsearch
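
The same kind of cloud-config can be generated for any image rather than written by hand. The sketch below is a hypothetical helper, not part of the talk: the function names `render_cloud_config` and `encode_userdata` are made up for illustration. It renders the unit shown on this slide and base64-encodes the result, since the CloudStack `deployVirtualMachine` call expects its `userdata` parameter base64-encoded.

```python
import base64

# Template mirroring the slide: docker.service plus one container unit.
UNIT_TEMPLATE = """\
#cloud-config

coreos:
  units:
    - name: docker.service
      command: start
    - name: {name}.service
      command: start
      content: |
        [Unit]
        After=docker.service
        Requires=docker.service
        Description=starts {name} container

        [Service]
        TimeoutStartSec=0
        ExecStartPre=/usr/bin/docker pull {image}
        ExecStart=/usr/bin/docker run -d {ports} {image}
"""


def render_cloud_config(name, image, ports):
    """Render a cloud-config that starts `image`, publishing each port 1:1."""
    port_flags = " ".join("-p {0}:{0}".format(p) for p in ports)
    return UNIT_TEMPLATE.format(name=name, image=image, ports=port_flags)


def encode_userdata(cloud_config):
    """Base64-encode the cloud-config so it can be passed as VM user data."""
    return base64.b64encode(cloud_config.encode("utf-8")).decode("ascii")


config = render_cloud_config("es", "dockerfile/elasticsearch", [9200, 9300])
userdata = encode_userdata(config)
print(config)
```

The `userdata` string is what would be handed to the cloud API when the CoreOS instance is created; the instance then pulls and starts the container on first boot.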

Page 34:

Openvm.eu

http://dl.openvm.eu/cloudstack/coreos/x86_64/

Page 35:

DEMO ?

Page 36:

CoreOS clustering

etcd: HA key value store

• Raft election algorithm

• A write succeeds once a majority of the cluster has committed the update

• e.g. 5 nodes tolerate 2 node failures

fleet: distributed init system (schedules systemd units in a cluster)

• Submits systemd units cluster wide

• Affinity, anti-affinity, global “scheduling”
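
The "5 nodes, 2 failures" figure follows directly from the majority rule. A minimal sketch of the arithmetic (the function names are illustrative, not etcd APIs):

```python
def quorum(cluster_size):
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size):
    """Members that can fail while a majority is still reachable."""
    return cluster_size - quorum(cluster_size)


for n in (1, 3, 4, 5, 7):
    print("nodes=%d quorum=%d tolerates=%d" % (n, quorum(n), tolerated_failures(n)))
```

Note that a 4-node cluster tolerates only one failure, the same as a 3-node cluster, which is why odd cluster sizes are the usual choice.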

Page 37:

CoreOS Cluster

Page 38:

“CoreOS is the first cloud OS dedicated to docker based application workloads”

Page 39:

“Where are you going to run CoreOS ?”

“Where are you going to run Docker ?”

Page 40:

- Bare metal cluster

- Public Clouds

- Private Clouds

Page 41:

“How are you going to manage containers running on multiple Docker hosts ?”

Page 42:

Docker schedulers

• Docker Swarm

• Citadel

• CoreOS Fleet

• Lattice from CF incubator

• Clocker (via blueprints)

• …

• Kubernetes

Page 43:

Kubernetes on CloudStack

Page 44:

Kubernetes

• Docker application orchestration

• Google GCE, Rackspace, Azure providers

• Deployable on CoreOS

• Container replication

• HA services

Page 45:

Page 46:

Kubernetes API

Page 47:

Kubernetes Pod

{
  "id": "redis-master-2",
  "kind": "Pod",
  "apiVersion": "v1beta1",
  "desiredState": {
    "manifest": {
      "version": "v1beta1",
      "id": "redis-master-2",
      "containers": [{
        "name": "master",
        "image": "dockerfile/redis",
        "ports": [{
          "containerPort": 6379,
          "hostPort": 6379
        }]
      }]
    }
  },
  "labels": {
    "name": "redis-master"
  }
}
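
A pod definition like this one is just data, so it can be built programmatically before being sent to the API. The sketch below assumes the `v1beta1` layout shown on the slide, with `labels` at the top level; `make_pod` is a hypothetical helper, and nothing is sent over the network.

```python
import json


def make_pod(pod_id, container_name, image, port, labels):
    """Build a single-container pod in the v1beta1 shape used on the slide."""
    return {
        "id": pod_id,
        "kind": "Pod",
        "apiVersion": "v1beta1",
        "desiredState": {
            "manifest": {
                "version": "v1beta1",
                "id": pod_id,
                "containers": [{
                    "name": container_name,
                    "image": image,
                    # The slide maps the container port to the same host port.
                    "ports": [{"containerPort": port, "hostPort": port}],
                }],
            }
        },
        "labels": labels,
    }


pod = make_pod("redis-master-2", "master", "dockerfile/redis", 6379,
               {"name": "redis-master"})
body = json.dumps(pod, indent=2)
print(body)
```

The resulting `body` string is what a client would submit to the Kubernetes API server to create the pod.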

Page 48:

Standardizing on pod

Look at the differences between:

- k8s pod

- AWS ECS task

- Ansible Docker playbook

- Fig file

Page 49:

- hosts: wordpress
  tasks:

    - name: Run mysql container
      docker:
        name=mysql
        image=mysql
        detach=true
        env="MYSQL_ROOT_PASSWORD=wordpressdocker,MYSQL_DATABASE=wordpress, \
             MYSQL_USER=wordpress,MYSQL_PASSWORD=wordpresspwd"

    - name: Run wordpress container
      docker:
        image=wordpress
        env="WORDPRESS_DB_NAME=wordpress,WORDPRESS_DB_USER=wordpress, \
             WORDPRESS_DB_PASSWORD=wordpresspwd"
        ports="80:80"
        detach=true
        links="mysql:mysql"

Page 50:

wordpress:
  image: wordpress
  links:
    - mysql
  ports:
    - "80:80"
  environment:
    - WORDPRESS_DB_NAME=wordpress
    - WORDPRESS_DB_USER=wordpress
    - WORDPRESS_DB_PASSWORD=wordpresspwd
mysql:
  image: mysql
  volumes:
    - /home/docker/mysql:/var/lib/mysql
  environment:
    - MYSQL_ROOT_PASSWORD=wordpressdocker
    - MYSQL_DATABASE=wordpress
    - MYSQL_USER=wordpress
    - MYSQL_PASSWORD=wordpresspwd

Page 51:

apiVersion: v1beta1
id: wordpress
desiredState:
  manifest:
    version: v1beta1
    id: wordpress
    containers:
      - name: wordpress
        image: wordpress
        ports:
          - containerPort: 80
        volumeMounts:
            # name must match the volume name below
          - name: wordpress-persistent-storage
            # mount path within the container
            mountPath: /var/www/html
        env:
          - name: WORDPRESS_DB_PASSWORD
            # change this - must match mysql.yaml password
            value: yourpassword
    volumes:
      - name: wordpress-persistent-storage
        source:
          # emptyDir: {}
          persistentDisk:
            # This GCE PD must already exist.
            pdName: wordpress-disk
            fsType: ext4
labels:
  name: wpfrontend
kind: Pod

Page 52:

[
  {
    "image": "wordpress",
    "name": "wordpress",
    "cpu": 10,
    "memory": 200,
    "essential": true,
    "links": [
      "mysql"
    ],
    "portMappings": [
      {
        "containerPort": 80,
        "hostPort": 80
      }
    ],
    "environment": [
      {
        "name": "WORDPRESS_DB_NAME",
        "value": "wordpress"
      },
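
The four formats describe nearly the same thing (image, links, ports, environment) with different spellings. To make the overlap concrete, here is a rough sketch that maps one Fig-style service onto an ECS-style container definition. `fig_to_ecs` is a hypothetical helper; the `cpu` and `memory` defaults are simply the values from the slide, since Fig has no equivalent fields.

```python
def fig_to_ecs(name, service, cpu=10, memory=200):
    """Translate one Fig service (a dict) into an ECS-style container dict."""
    container = {
        "name": name,
        "image": service["image"],
        "cpu": cpu,          # assumed default, not present in the Fig file
        "memory": memory,    # assumed default, not present in the Fig file
        "essential": True,
    }
    if "links" in service:
        container["links"] = list(service["links"])

    # Fig writes ports as "host:container"; ECS uses explicit mappings.
    mappings = []
    for mapping in service.get("ports", []):
        host_port, container_port = mapping.split(":")
        mappings.append({"containerPort": int(container_port),
                         "hostPort": int(host_port)})
    if mappings:
        container["portMappings"] = mappings

    # Fig writes environment as "KEY=value"; ECS uses name/value pairs.
    environment = []
    for entry in service.get("environment", []):
        key, value = entry.split("=", 1)
        environment.append({"name": key, "value": value})
    if environment:
        container["environment"] = environment
    return container


wordpress = {
    "image": "wordpress",
    "links": ["mysql"],
    "ports": ["80:80"],
    "environment": ["WORDPRESS_DB_NAME=wordpress",
                    "WORDPRESS_DB_USER=wordpress"],
}
ecs_container = fig_to_ecs("wordpress", wordpress)
print(ecs_container)
```

That such a translation is almost mechanical is the argument for standardizing on a single description, such as the pod.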

Page 53:

Kubernetes on CloudStack

Find a CloudStack cloud that supports CoreOS

Then use:

https://github.com/runseb/ansible-kubernetes

Based on the Ansible cloudstack module

Page 54:

OLD WAY

• Cloud API: Libcloud startup scripts

• Etcd cluster, 5 nodes: discovery service to bootstrap

• Kubernetes cluster, 5 nodes: start Kube* services via fleet

• Run guestbook example

PR welcome: https://github.com/runseb/kubernetes-exoscale

Page 55:

[Diagram: a cloud (e.g. CloudStack based, such as exoscale) hosting three CoreOS instances; each instance runs the Kubernetes (K*) services and Docker containers, driven by API calls to the Kubernetes API]

Page 56:

Demo ?

Page 57:

A view on Big Data

Page 58:

Demo …

https://github.com/runseb/ansible-mesos-cloudstack

Page 59:

SKA

Page 60:

http://www.economist.com/node/15557443?story_id=15557443

Page 61:

Page 62:

Page 63:

Page 64:

How did we get there ?

Page 65:

A natural evolution

Page 66:

New Distributed systems for:

Large scale datasets

• From scientific instruments

• From Web apps logs

Complex datasets

• Not necessarily large.

Object stores

• S3 clones

Page 67:

BigData and map-reduce

• BigData is often associated with HDFS, but Map-Reduce is the algorithm used to parallelize the data processing.

• BigData ≠ Map-Reduce ≠ HDFS

• Map-reduce is a way to express embarrassingly parallel work easily.

• You can do Map-Reduce without HDFS.

• e.g. Basho map-reduce on riakCS
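
To illustrate that map-reduce is a programming pattern independent of HDFS, here is a toy word count over in-memory log lines. It is only a sketch of the map, shuffle and reduce steps in plain Python; the sample `logs` are made up.

```python
from collections import defaultdict
from functools import reduce


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine the values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(records):
    # Each record is mapped independently: this is the embarrassingly
    # parallel part that a real framework would spread across workers.
    mapped = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


logs = ["GET /index GET", "POST /login", "get /index"]
counts = word_count(logs)
print(counts)
```

The storage layer only decides where `records` come from; the same three steps apply whether the data sits in HDFS, an object store, or a key value store.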

Page 68:

A really quick view on Clouds

Page 69:

Page 70:

Page 71:

Today

Page 72:

BigData at peak

Page 73:

History

2003 – Google File System

2005 – Hadoop

2006 – Hadoop enters ASF incubator (Feb)

2006 – S3 launched

2007 – Paper on Amazon Dynamo

2009 – EMR launched

2013 – CloudStack as an ASF TLP (March)

2013 – Spark/Mesos enters ASF incubator

Page 74:

The Apache Software Foundation

Page 75:

Apache Software Foundation

Page 76:

35 projects in incubation:

• 12 Hadoop related

• ~30% Big Data related

• Spark

117 top level projects:

• ~16 cloud or bigdata +10%

• Deltacloud, Libcloud, Whirr, jclouds

• Hadoop, couchdb, cassandra, mesos

• Bigtop, accumulo, lucene, UIMA

• CloudStack

Page 77:

Hadoop Ecosystem

+ Up-coming next generation BD systems

Page 78:

Stay on top…

Big Data and Cloud

Page 79:

Big Data and Cloud (Stack)s

Page 80:

Clouds and BigData

• Object store + compute IaaS to build EC2+S3 clone

• BigData solutions as storage backends for image catalogue and large scale instance storage.

• BigData solutions as workloads to CloudStack-based clouds.

Page 81:

EC2, S3 clone

• An open source IaaS with an EC2 wrapper, e.g. OpenNebula, CloudStack

• Deploy an S3 compatible object store separately, e.g. riakCS

• Two independent distributed systems deployed

Cloud = EC2 + S3

Page 82:

Big Data as IaaS backend

“Big Data” solutions can be used as secondary storage in CloudStack.

Page 83:

Example

• Open source IaaS + EC2 wrapper, e.g. CloudStack

• Deploy S3 compatible object store, e.g. riakCS or Ceph or glusterFS

• Use S3 as image store

• Your EC2 service is a customer to your S3 service

+ Logstash + elasticsearch for logs/monitoring

Page 84:

Even use Bare Metal

Page 85:

A note on Scheduling

• Core problem of computer science

• Knapsack is NP-complete

• Central scheduling has been used for a long time in HPC

• Optimizing the cluster utilization requires multi-level scheduling (e.g. backfill, preemption, etc.)

• Google Omega paper 2013

• Mesos 2009/2011, ASF Dec 2011
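
The knapsack remark can be made concrete with a toy version of the packing problem a scheduler faces: choose a subset of jobs that fills one node as fully as possible. The sketch below is a textbook dynamic program for that special case, not how Mesos or Omega schedule; it assumes positive integer job sizes, and its cost grows with the capacity value, so it only suits small numbers. `best_fill` is an illustrative name.

```python
def best_fill(job_sizes, capacity):
    """Return (used, job_indices) for the fullest packing within capacity."""
    # reachable[c] holds one list of job indices whose sizes sum to exactly c,
    # or None if no subset seen so far reaches c.
    reachable = [None] * (capacity + 1)
    reachable[0] = []
    for index, size in enumerate(job_sizes):
        # Walk downwards so each job is used at most once.
        for used in range(capacity, size - 1, -1):
            if reachable[used] is None and reachable[used - size] is not None:
                reachable[used] = reachable[used - size] + [index]
    # The largest reachable total is the best utilization.
    for used in range(capacity, -1, -1):
        if reachable[used] is not None:
            return used, reachable[used]


print(best_fill([3, 4, 5], 8))
print(best_fill([6, 6], 10))
```

Real schedulers add priorities, multiple resource dimensions and arrival over time, which is where heuristics such as backfill and preemption come in.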

Page 86:

Past: BOINC/Condor Backfill

Page 87:

Food for thought

Mesos Framework for managing VMs ?

Workload sharing in your data-center:

• Big Data

• VM

• Services

Cloud and BigData

Page 88:

Conclusions

• Big Data is “catching up”

• Tackle the “big three” head on:

• BigData, Cloud and DevOps

• Add a big data backend to your cloud from the start

• Provide Big Data services on your cloud

Page 89:

Still behind !

Page 90:

Get Involved with Apache CloudStack

Web: http://cloudstack.apache.org/

Mailing Lists: cloudstack.apache.org/mailing-lists.html

IRC: irc.freenode.net: 6667 #cloudstack #cloudstack-dev

Twitter: @cloudstack

LinkedIn: www.linkedin.com/groups/CloudStack-Users-Group-3144859

If it didn’t happen on the mailing list, it didn’t happen.

Page 91:

The Velocity Conference, Santa Clara, May 27-29

• 2 days of keynotes & sessions

• 1 day of tutorials

• New full-day trainings

• Amazing presenters – Jez Humble, Patrick Meenan, Mesosphere, Fastly & more

Use discount code CLOUDSTACK20 during registration for 20% off

http://velocityconf.com/velocity2015

Page 92:

Free O’Reilly Resources

Get 50% off ebooks & 40% off print books at oreilly.com with coupon code DSUG