Alternatives to layer-based image distribution: using CERN filesystem for images

Feb 11, 2017

George Lestaris
Transcript
Page 1: Alternatives to layer-based image distribution: using CERN filesystem for images

Alternatives to layer-based image distribution

Using a CERN filesystem for distributing container images

George Lestaris (@glestaris)

Page 2: Alternatives to layer-based image distribution: using CERN filesystem for images

About me

• Software engineer at Pivotal

• Working on Cloud Foundry GrootFS

• github.com/cloudfoundry/grootfs

• ex-CERNois

Page 3: Alternatives to layer-based image distribution: using CERN filesystem for images

docker run elasticsearch:2.3.5

Container image

Page 4: Alternatives to layer-based image distribution: using CERN filesystem for images

Container image format

• Container image formats are specified by Docker, AppC (ACI) and the OCI Image spec

• In general, an image is the combination of:

  • a set of layers

  • metadata

Page 5: Alternatives to layer-based image distribution: using CERN filesystem for images

Building an image

FROM python:3.5
ADD . /myapp
RUN pip install \
    -r /myapp/requirements.txt
ENTRYPOINT python /myapp/manage.py \
    runserver 0.0.0.0:8000

Page 6: Alternatives to layer-based image distribution: using CERN filesystem for images

Container images are composed of layers

A layer is a set of files and directories

Page 7: Alternatives to layer-based image distribution: using CERN filesystem for images

FROM python:3.5

Layers help us to inherit images

Page 8: Alternatives to layer-based image distribution: using CERN filesystem for images

FROM python:3.5
ADD . /myapp
RUN pip install …
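The layering idea from the preceding slides can be sketched in a few lines of Python. This is a toy model, not how Docker stores layers: each layer is a plain dictionary from path to contents, the file names and contents are made up for illustration, and deletions between layers are ignored.

```python
from typing import Dict, List

# A layer is modelled as a mapping from file path to file contents.
Layer = Dict[str, str]


def flatten(layers: List[Layer]) -> Layer:
    """Merge layers bottom-up; files in later layers shadow earlier ones."""
    rootfs: Layer = {}
    for layer in layers:
        rootfs.update(layer)
    return rootfs


# Hypothetical contents, one layer per Dockerfile instruction.
base = {"/usr/bin/python": "python 3.5 binary"}            # FROM python:3.5
app = {"/myapp/manage.py": "app code",
       "/myapp/requirements.txt": "django"}                # ADD . /myapp
deps = {"/usr/lib/python3.5/site-packages/django": "pkg"}  # RUN pip install ...

image = flatten([base, app, deps])
print(sorted(image))
```

Any image that starts with the same FROM line can reuse the `base` layer unchanged, which is the inheritance the slides refer to.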

Page 9: Alternatives to layer-based image distribution: using CERN filesystem for images

Container image distribution

• Different image formats use different distribution mechanisms

• Docker: download layers over HTTP connections from a registry

  • Helps reuse the layers of base images

  • Fetches container images efficiently by parallelizing the downloads
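A minimal sketch of the two properties mentioned on this slide, layer reuse and parallel downloads, might look as follows. The function names, layer IDs and the `fake_fetch` stand-in are assumptions made for illustration; a real client would issue HTTP requests to a registry instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def pull(layer_ids: List[str],
         cache: Dict[str, bytes],
         fetch: Callable[[str], bytes]) -> List[str]:
    """Download missing layers in parallel; return the IDs that were fetched."""
    # Layers already in the local cache (e.g. a shared base image) are skipped.
    missing = [lid for lid in layer_ids if lid not in cache]
    with ThreadPoolExecutor(max_workers=4) as pool:
        for lid, blob in zip(missing, pool.map(fetch, missing)):
            cache[lid] = blob
    return missing


def fake_fetch(layer_id: str) -> bytes:
    # Stand-in for an HTTP GET against a registry (hypothetical).
    return ("contents of " + layer_id).encode()


cache = {"base-layer": b"already here"}
fetched = pull(["base-layer", "app-layer", "deps-layer"], cache, fake_fetch)
print(fetched)
```

Only the layers that are not cached are transferred, but each of them is still transferred whole, which is the cost the later slides examine.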

Page 10: Alternatives to layer-based image distribution: using CERN filesystem for images

[Diagram: a registry and four clients, with the labels "New image", "From cached base" and "Update dependencies"]

Page 11: Alternatives to layer-based image distribution: using CERN filesystem for images

Distributing software in HEP

Page 12: Alternatives to layer-based image distribution: using CERN filesystem for images
Page 13: Alternatives to layer-based image distribution: using CERN filesystem for images

[Figure: repeated "Data" blocks and a "Frequent releases" label, next to repeated copies of a software stack: simulation engine, analysis framework, experiment geometry, experiment software, dependencies]

Page 14: Alternatives to layer-based image distribution: using CERN filesystem for images

WLCG (Worldwide LHC Computing Grid)

170 computing centres in 42 countries

Page 15: Alternatives to layer-based image distribution: using CERN filesystem for images

CernVM-FS

Page 16: Alternatives to layer-based image distribution: using CERN filesystem for images

Using a network filesystem

• Network file system

  • No packages or layers: just files and directories

• Implemented as a FUSE filesystem

• Lazily downloads only the files that are used

• Deduplication: downloaded files are cached in content-addressable storage

Page 17: Alternatives to layer-based image distribution: using CERN filesystem for images

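[Diagram: reading a file through CernVM-FS]

Components: User application, VFS, FUSE kernel module, CernVM-FS (FUSE), Cache, CernVM-FS service

Operations shown:

• open /dir/file

• GET catalog

• catalog: /dir/file -> sha256:…

• Cache: stat sha256:…

• GET /blob/sha256:…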
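The flow in this diagram (catalog lookup, cache check, blob download on a miss) can be sketched as a toy model. This is not CernVM-FS code: the class, the in-memory `blobs` dictionary standing in for the server, and the file paths are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict


class LazyStore:
    """Toy model of a lazily fetched, content-addressed file tree."""

    def __init__(self, catalog: Dict[str, str],
                 fetch_blob: Callable[[str], bytes]):
        self.catalog = catalog        # path -> content hash
        self.fetch_blob = fetch_blob  # stand-in for GET /blob/<hash>
        self.cache: Dict[str, bytes] = {}
        self.downloads = 0

    def open(self, path: str) -> bytes:
        digest = self.catalog[path]   # catalog lookup
        if digest not in self.cache:  # cache miss: download once
            self.cache[digest] = self.fetch_blob(digest)
            self.downloads += 1
        return self.cache[digest]


def digest_of(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Hypothetical server-side blobs; two paths share identical content.
blobs = {digest_of(b"libc"): b"libc", digest_of(b"app"): b"app"}
catalog = {
    "/lib/libc.so": digest_of(b"libc"),
    "/opt/copy/libc.so": digest_of(b"libc"),
    "/bin/app": digest_of(b"app"),
}

store = LazyStore(catalog, blobs.__getitem__)
store.open("/lib/libc.so")
store.open("/opt/copy/libc.so")  # same hash, served from the cache
print(store.downloads)
```

Files that are never opened (here `/bin/app`) are never downloaded, and identical content reachable under two paths is downloaded only once.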

Page 18: Alternatives to layer-based image distribution: using CERN filesystem for images

Similarities between HEP software and container images

Page 19: Alternatives to layer-based image distribution: using CERN filesystem for images

Applications use a small fragment of the image

• Most images are based on a Linux distribution

• redis 3.2.3

  • Image size: 190 MB (compressed: 74 MB)

  • Used to boot: 11 MB (5.7 %)

• node 6.5.0: 5.4 %

• nginx 1.11: 3.1 %

Page 20: Alternatives to layer-based image distribution: using CERN filesystem for images

Small changes between versions

• nginx 1.10 to 1.11:

  • Real changes: 4.02 MB

  • Layer changes: 58 MB (two of the three layers)

  • The layers are 14.4 times the size of the diff

• nginx 1.9 to 1.10: 4.8 times the size of the diff
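The 14.4 figure for nginx 1.10 to 1.11 follows directly from the two sizes on the slide. A quick check, using only those numbers:

```python
real_changes_mb = 4.02   # data that actually differs between nginx 1.10 and 1.11
layer_changes_mb = 58.0  # size of the layers that have to be downloaded again

overhead = layer_changes_mb / real_changes_mb
print(round(overhead, 1))
```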

Page 21: Alternatives to layer-based image distribution: using CERN filesystem for images

Demo

CernVM-FS and runC

Page 22: Alternatives to layer-based image distribution: using CERN filesystem for images

runC

• Small tool for creating containers

• Low-level interface, not intended as a container runtime itself

• Used internally by container runtimes (Docker, Garden)
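The demo itself is not captured in the transcript. As a rough sketch of how the two pieces could fit together, runC reads a `config.json` whose `root.path` names the container's root filesystem, so that path could in principle point into a mounted CernVM-FS repository. The repository name and directory below are hypothetical, and the fragment leaves out most of what a working bundle needs (mounts, namespaces and so on); in practice one would start from the template that `runc spec` generates.

```python
import json

# Hypothetical path: assumes a CernVM-FS repository mounted under /cvmfs
# that holds an unpacked redis root filesystem.
ROOTFS = "/cvmfs/images.example.org/redis-3.2.3"

# Partial OCI runtime configuration; only a few fields are shown.
config = {
    "ociVersion": "1.0.0",
    "process": {
        "args": ["redis-server"],
        "cwd": "/",
    },
    "root": {
        "path": ROOTFS,
        "readonly": True,
    },
}

config_json = json.dumps(config, indent=2)
print(config_json)
```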

Page 23: Alternatives to layer-based image distribution: using CERN filesystem for images

Performance comparison

Page 24: Alternatives to layer-based image distribution: using CERN filesystem for images

Experiment setup

• http://github.com/glestaris/container-camp

• Used iCE (see PyCon UK 2015)

• 20 AWS VMs (m4.large) in eu-west

• 1 CernVM-FS server on an AWS VM (m4.large) in eu-central

• Docker Hub

Page 25: Alternatives to layer-based image distribution: using CERN filesystem for images

Scenario

• All VMs create a redis:3.2.3 container in parallel

• Compares runC, Docker, and Docker with a warm cache

• Run the server and ping it (wait for the server to come up)

redis-server --daemonize yes
while ! redis-cli ping; do
  echo 'retrying'
done

Page 26: Alternatives to layer-based image distribution: using CERN filesystem for images
Page 27: Alternatives to layer-based image distribution: using CERN filesystem for images

Other approaches

• IPFS: InterPlanetary File System

  • Deduplication: content-addressed storage for objects

  • History: versioned objects

  • Decentralized: P2P transfers

  • Objects are files, directories or changes (commits)

Page 28: Alternatives to layer-based image distribution: using CERN filesystem for images

Use cases

• CI server

• Large clusters that fetch images in parallel

• Network contention

• Maintaining a private registry

• Serverless (?)