Smuggling Multi-Cloud Support into Cloud-native Applications using Elastic Container Platforms
Prof. Dr. rer. nat. Nane Kratzke
Computer Science and Business Information Systems
Apr 12, 2017
The next 30 minutes are about ...
• What are Cloud-native Applications?
• Elastic Container Platforms and why they should be considered for multi-cloud research.
• A control loop to scale Elastic Container Platforms across Cloud Service Providers
• Some data of our evaluation
• 7 Lessons Learned and Conclusion
Presentation URL
Paper URL
Cloud Application Maturity Model (CAMM)
Maturity levels and criteria:
3 Cloud Native
• Application can dynamically migrate across infrastructure providers without interruption of service.
• Application can elastically scale out/in appropriately based on stimuli.
2 Cloud Resilient
• Services are stateless.
• Application is unaware and unaffected by failure of dependent services.
• Application is infrastructure agnostic and can run anywhere.
1 Cloud Friendly
• Application is composed of loosely coupled services.
• Application services are discoverable by name.
• Application deployment units are designed according to cloud patterns (e.g. 12-factor app principles).
• Application compute and storage are separated.
• Application consumes one or more cloud services: compute, storage, network.
0 Cloud Ready
• Application runs on virtualized infrastructure.
• Application can be instantiated from an image or script.
According to OPEN DATA CENTER ALLIANCE Best Practices (Architecting Cloud-Aware Applications), 2014, with add-ons by practitioner Mario-Leander Reimer (QAWare)
Covered by a lot of SOA and cloud deployment approaches.
This contribution's focus ...
Research Surveillance of Practitioners
Docker Swarm
Swarm Mode (since Docker 1.12) „copies“ the idea of Kubernetes-like control processes but integrates them in just one component. Secure by default (control and data plane). Hides operation complexity.
Google (Kubernetes)
Control processes that continuously drive the current state of container based applications towards an intended desired state. Makes Google‘s experience of running large scale production workloads available as open source (especially from the Google internal Borg system).
Mesosphere
Apache Mesos based datacenter operating system for fine grained resource allocation. Frameworks to operate containers and data services. Datacenter focused. Mesos has successfully operated large scale datacenters for years (Twitter, Netflix, ...)
Practitioners ask for simple solutions (elastic platforms) ...
The very basic idea ...
Operate application on current provider.
Scale cluster into prospective provider.
Shut down nodes on current provider. Cluster reschedules lost containers.
Migration finished.
Quint, P.-C., & Kratzke, N. (2016). Overcome Vendor Lock-In by Integrating Already Available Container Technologies - Towards Transferability in Cloud Computing for SMEs. In Proceedings of CLOUD COMPUTING 2016 (7th International Conference on Cloud Computing, GRIDS and Virtualization).
Avoiding Vendor Lock-In:
• Make use of elastic container platforms to operate elastic services that are deployable to any IaaS cloud infrastructure.
• Transfer of these services from one private or public cloud infrastructure to another would be possible at runtime.
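The migration steps on this slide can be read as a short sequence of intended worker counts per provider. The following is a minimal sketch under that reading; the function name `migration_states` and the dictionary layout are assumptions made for illustration, not part of the presented tooling.

```python
def migration_states(workers: int, current: str, prospective: str) -> list:
    """Return the intended worker count per provider for each migration step."""
    return [
        {current: workers, prospective: 0},        # operate application on current provider
        {current: workers, prospective: workers},  # scale cluster into prospective provider
        {current: 0, prospective: workers},        # shut down nodes on current provider
    ]

# Illustrative values: nine workers moving from gce-europe to aws-europe.
states = migration_states(9, "gce-europe", "aws-europe")
for step, state in enumerate(states, start=1):
    print(step, state)
```

Holding the cluster at the second state, instead of continuing to the third, leaves a cluster that spans both providers.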
But the idea provides more options ...
Simply stop „a transfer“ somewhere in between and you get ...
One Control Loop for All
Operate application on current provider.
Scale cluster into prospective provider.
Shut down nodes on current provider. Cluster reschedules lost containers.
Migration finished.
Control Loop: Example to deploy a cluster
Definition of an intended state.
{
  "type": "cluster",
  "platform": "Swarm",
  "deployments": [
    { "district": "gce-europe", "flavor": "small", "role": "master", "quantity": 1 },
    { "district": "gce-europe", "flavor": "small", "role": "worker", "quantity": 9 },
    { "district": "aws-europe", "flavor": "small", "role": "worker", "quantity": 0 }
  ]
}
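Because the intended state is plain JSON, it can be loaded and summarized in a few lines. The sketch below only illustrates the data structure; the helper `nodes_per_district` is a hypothetical name and not part of the presented control loop.

```python
import json
from collections import Counter

# Intended state as shown on the slide.
INTENDED_STATE = """
{
  "type": "cluster",
  "platform": "Swarm",
  "deployments": [
    {"district": "gce-europe", "flavor": "small", "role": "master", "quantity": 1},
    {"district": "gce-europe", "flavor": "small", "role": "worker", "quantity": 9},
    {"district": "aws-europe", "flavor": "small", "role": "worker", "quantity": 0}
  ]
}
"""

def nodes_per_district(state: dict) -> Counter:
    """Count intended nodes per (district, role) pair (hypothetical helper)."""
    counts = Counter()
    for deployment in state["deployments"]:
        counts[(deployment["district"], deployment["role"])] += deployment["quantity"]
    return counts

state = json.loads(INTENDED_STATE)
counts = nodes_per_district(state)
total = sum(counts.values())
print(state["platform"], total, dict(counts))
```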
Control Loop: Example to deploy a cluster
Derive a prioritized action list.
|| Create secgroup for gce-europe
-- Create master in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
Legend: || executed in parallel, -- executed sequentially
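One way to derive such an action list is to compare intended and current node counts. The following sketch reproduces the list shown on the slide; the function `plan` and its ordering rules (security groups first, masters sequentially, workers in parallel, deletions last) are assumptions inferred from the slide examples, not the authors' planner.

```python
from collections import Counter

def plan(current: Counter, intended: Counter, secgroups: set) -> list:
    """Derive an ordered list of (mode, action) pairs.

    Keys of both counters are (district, role) pairs. "||" marks actions that
    may run in parallel, "--" marks actions that run sequentially.
    """
    actions = []
    # 1. Security groups for districts that will host nodes but lack a group.
    for district in sorted({d for (d, _), n in intended.items() if n > 0}):
        if district not in secgroups:
            actions.append(("||", f"Create secgroup for {district}"))
    # 2. Missing masters (sequential), then missing workers (parallel).
    for role, mode in (("master", "--"), ("worker", "||")):
        for (district, r), wanted in sorted(intended.items()):
            missing = wanted - current[(district, r)] if r == role else 0
            actions += [(mode, f"Create {role} in {district}")] * max(missing, 0)
    # 3. Surplus nodes are deleted last, one after another.
    for (district, role), have in sorted(current.items()):
        surplus = have - intended[(district, role)]
        actions += [("--", f"Delete {role} in {district}")] * max(surplus, 0)
    return actions

intended = Counter({
    ("gce-europe", "master"): 1,
    ("gce-europe", "worker"): 9,
    ("aws-europe", "worker"): 0,
})
actions = plan(Counter(), intended, secgroups=set())
for mode, action in actions:
    print(mode, action)
```

Starting from an empty current state, the sketch yields one security group, one master and nine workers, in the order shown on the slide.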
Control Loop: Example to deploy a cluster
Updated resources.
- Secgroup for gce-europe
- Master node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
All detail data like IP addresses, identifiers, etc. omitted for better readability.
Control Loop: Example: Transfer of five worker nodes
{
  "type": "cluster",
  "platform": "Swarm",
  "deployments": [
    { "district": "gce-europe", "flavor": "small", "role": "master", "quantity": 1 },
    { "district": "gce-europe", "flavor": "small", "role": "worker", "quantity": 9 },
    { "district": "aws-europe", "flavor": "small", "role": "worker", "quantity": 0 }
  ]
}
Changed quantities for the transfer: gce-europe workers 9 → 4, aws-europe workers 0 → 5.
|| Create secgroup for aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe
Legend: || executed in parallel, -- executed sequentially
- Secgroup for gce-europe
- Secgroup for aws-europe
- Master node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in aws-europe
- Worker node in aws-europe
- Worker node in aws-europe
- Worker node in aws-europe
- Worker node in aws-europe
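The difference between parallel and sequential actions can be sketched with a thread pool. In this illustration the in-memory `resources` list stands in for a real provider driver, and treating each line group of the action list as a step that completes before the next one starts is an assumption; `apply` and `execute` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Resources before the transfer: one secgroup, one master, nine workers on GCE.
resources = [("secgroup", "gce-europe"), ("master", "gce-europe")]
resources += [("worker", "gce-europe")] * 9
lock = Lock()

def apply(description: str) -> None:
    """Apply one action such as 'Create worker in aws-europe' to the resource list."""
    verb, kind, _, district = description.split()
    with lock:  # parallel actions share the list
        if verb == "Create":
            resources.append((kind, district))
        else:
            resources.remove((kind, district))

def execute(steps: list) -> None:
    """Run each step in order; '||' steps concurrently, '--' steps one by one."""
    for mode, batch in steps:
        if mode == "||":
            with ThreadPoolExecutor(max_workers=len(batch)) as pool:
                list(pool.map(apply, batch))  # wait until the whole batch is done
        else:
            for description in batch:
                apply(description)

steps = [
    ("||", ["Create secgroup for aws-europe"]),
    ("||", ["Create worker in aws-europe"] * 5),
    ("--", ["Delete worker in gce-europe"] * 5),
]
execute(steps)
print(resources.count(("worker", "gce-europe")), resources.count(("worker", "aws-europe")))
```

After the create steps and before the delete step, this model holds nine GCE workers and five AWS workers at the same time.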
Resulting Architecture (Domain Model)
Extension point for elastic platforms
Currently supported: Kubernetes, Swarm
Extension point for IaaS infrastructures
Currently supported: AWS, GCE, Azure, OpenStack
Evaluation: 5 Experiments (with a 1 Master and 9 Worker Cluster)
Providers and machine types: OpenStack, Google Compute Engine (GCE, n1-standard-2), Elastic Compute Cloud (EC2, m3.large)
[Figure: experiments E1 to E5 mapped onto the providers]
The same experiments have been done with OpenStack as well.
E1: Launch a 10 node cluster.
E2: Terminate a 10 node cluster.
E3: Transfer one node of the cluster.
E4: Transfer 5 nodes of the cluster.
E5: Transfer all nodes of the cluster.
Cluster was Docker Swarm (operated a Sock Shop Reference Application and a Redis-based Guestbook)
Kubernetes
Different elastic container platforms had no significant impact on the runtimes. Therefore data is only presented for Docker Swarm.
Docker Swarm
Evaluation (Single Cloud): Deploying and terminating clusters
[Figure: runtimes of Experiment E1 and Experiment E2; annotation: "10 times longer ???"]
Evaluation (Multi-Cloud): Transfer GCE ⇠⇢ AWS
[Figure: runtimes of Experiment E3, Experiment E4 and Experiment E5]
Comparable with a shutdown.
Node termination times seem to dominate the transfer times massively.
Why these (dramatic) differences?
Analysis turned out:
1. GCE API works synchronously (a node termination call blocks until termination is completed)
2. AWS API works asynchronously (a node termination call does not block until termination is completed; fire and forget)
3. GCE SDN related processing times take far longer than AWS SDN related processing times.
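The effect of points 1 and 2 on sequentially executed deletions can be made explicit with a small cost model. This is a sketch only: the function name is hypothetical and the latency values are placeholders chosen for illustration, not measurements from the evaluation.

```python
def sequential_termination_time(nodes: int, call_latency: float,
                                termination_time: float, blocking: bool) -> float:
    """Time until the control loop has issued all termination calls, one after another.

    A blocking (synchronous) API makes the caller wait for each termination to
    complete; a fire-and-forget (asynchronous) API only costs the call latency.
    """
    per_node = call_latency + (termination_time if blocking else 0.0)
    return nodes * per_node

# Placeholder values in seconds (assumptions, not measured data).
blocking_total = sequential_termination_time(5, 1.0, 30.0, blocking=True)
async_total = sequential_termination_time(5, 1.0, 30.0, blocking=False)
print(blocking_total, async_total)
```

Under this model the blocking variant grows with the per-node termination time, while the fire-and-forget variant grows only with the call latency.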
Conclusion
• Elastic container platforms provide often overlooked multi-cloud opportunities.
• We could successfully demonstrate multi-cloud transfers between AWS, GCE, Azure and OpenStack using a simple control loop (scaling Kubernetes and Docker Swarm Mode).
• The control loop is designed to be integrable into a MAPE loop as the execution phase.
• A cybernetic understanding (intended state vs. current state) makes a lot of multi-cloud workflows easier.
• On the downside: The solution is limited to container-based applications (CNMM Level 3) and services (but that seems to become a dominating architectural style).
• New research opportunities and future research directions:
  • Making the solution available as Open Source
  • P2P-based elastic platforms would make deployments even easier (no worker/master roles)
  • There is room for improvements (e.g. resource efficient action planning)
Acknowledgement
This research is funded by German Federal Ministry of Education and Research (03FH021PX4). I would like to thank Peter Quint, Christian Stüben, and Arne Salveter for their hard work and their contributions to the Project Cloud TRANSIT.
Picture Reference
• Elastic Straps: Pixabay (CC0 Public Domain, PublicDomainPictures)
• Definition: Pixabay (CC0 Public Domain, PDPics)
• Class room: Pixabay (CC0 Public Domain, Unsplash)
• Railway: Pixabay (CC0 Public Domain, Fotoworkshop4You)
• Air Transport: Pixabay (CC0 Public Domain, WikiImages)
About
Nane Kratzke
CoSA: http://cosa.fh-luebeck.de/en/contact/people/n-kratzke
Blog: http://www.nkode.io
Twitter: @NaneKratzke
GooglePlus: +NaneKratzke
LinkedIn: https://de.linkedin.com/in/nanekratzke
GitHub: https://github.com/nkratzke
ResearchGate: https://www.researchgate.net/profile/Nane_Kratzke
SlideShare: http://de.slideshare.net/i21aneka
Elastic Platforms and Multi-cloud requirements
Multi-Cloud Requirements and Contributing Platform concepts
• Transferability: Integration of nodes into one logical cluster. Designed for failure. Cross-provider deployable.
• Data location awareness: Pod concept (Kubernetes). Volume orchestrator (Flocker for Docker).
• Geolocation awareness, Pricing awareness, Legislation/policy awareness, Local resources awareness: Tagging of nodes with geolocation, pricing, policy or on-premise information. Platform schedulers have selectors (Swarm) / affinities (Kubernetes) / constraints (Mesos/Marathon) to evaluate these taggings.
• Security requirements: Encrypted data / control plane (Swarm). Encrypted overlay networks (e.g. Weave for Kubernetes).
Several transferability, awareness and security requirements come along with multi-cloud approaches. Already existing elastic container platforms contribute to fulfilling these requirements.
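The tagging idea can be illustrated without any platform API: nodes carry key/value tags, and a placement request names the tags it requires. The sketch below is platform-neutral; the tag names, node names and the function `select_nodes` are assumptions for illustration and do not mirror the selector, affinity or constraint syntax of Swarm, Kubernetes or Marathon.

```python
def select_nodes(nodes: list, **required) -> list:
    """Return the names of nodes whose tags match all required key/value pairs."""
    return [
        node["name"]
        for node in nodes
        if all(node["tags"].get(key) == value for key, value in required.items())
    ]

# Hypothetical cluster nodes tagged with geolocation and provider information.
NODES = [
    {"name": "node-1", "tags": {"region": "eu", "provider": "gce"}},
    {"name": "node-2", "tags": {"region": "eu", "provider": "aws"}},
    {"name": "node-3", "tags": {"region": "us", "provider": "aws"}},
]

eu_nodes = select_nodes(NODES, region="eu")
eu_aws_nodes = select_nodes(NODES, region="eu", provider="aws")
print(eu_nodes, eu_aws_nodes)
```

The same matching scheme would cover pricing, policy or on-premise tags by adding further keys.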
Cloud-native Application
What? Be IDEAL
• Isolated State
• Distributed
• Elastic
• Automated management
• Loosely coupled
Why? There is a need for ...
• Speed (delivery)
• Safety (fault tolerance, design for failure)
• Scalability
• Client diversity
How? Integrate ...
• (Micro)service oriented architectures (M)SOA
• Use API-based collaboration
• Consider cloud-focused pattern catalogues
• Use self-service agile platforms
C. Fehling, F. Leymann, R. Retter, W. Schupeck, and P. Arbitter, Cloud Computing Patterns: Fundamentals to Design, Build, and Manage Cloud Applications. Springer, 2014.
M. Stine, Migrating to Cloud-Native Application Architectures. O’Reilly, 2015
A. Balalaie, A. Heydarnoori, and P. Jamshidi, “Migrating to Cloud-Native Architectures Using Microservices”, CloudWay 2015, Taormina, Italy
S. Newman, Building Microservices. O’Reilly, 2015.
Often heard by practitioners: „A cloud-native application is an application intentionally designed for the cloud.“ True, but helpful?
Cloud-native Application Definition
[KQ2017a] Kratzke, N., & Quint, P.-C. (2017). Understanding Cloud-native Applications after 10 Years of Cloud Computing - A Systematic Mapping Study. Journal of Systems and Software, 126 (April).
We need some guidance ...
ClouNS – Cloud-native Application Reference Model
[KP2016] Kratzke, N., & Peinl, R. (2016). ClouNS - a Cloud-Native Application Reference Model for Enterprise Architects. In 2016 IEEE 20th International Enterprise Distributed Object Computing Workshop (EDOCW) (pp. 1–10).
Did you know?
[Chart: relation of considered services (0% to 100%) per year, 2006 to 2016. Legend: considered by CIMI, OCCI, CDMI, OVF, OCI, TOSCA; not considered. Data labels in extraction order: 2 2 2 4 6 7 7 7 7 11 11 and 1 1 2 4 7 10 14 21 26 42 44.]
Cloud standards improved over the last 10 years. However, cloud standardization coverage decreased (in relation to all available services).
Analyzed using over 2300 official release notes of Amazon Web Services (AWS). Data for other providers like Google, Azure, Rackspace, etc. not presented. Basic conclusions for these providers are the same.
[KQP+2016] Kratzke, N., Quint, P.-C., Palme, D., & Reimers, D. (2016). Project Cloud TRANSIT - Or to Simplify Cloud-native Application Provisioning for SMEs by Integrating Already Available Container Technologies. In V. Kantere & B. Koch (Eds.), European Project Space on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges.
Research Methodology
Main focus of this contribution
CNA == Cloud-native Application
Evaluation: Virtual Machine Type Selection
[KQ2015] Kratzke, N., & Quint, P.-C. (2015). About Automatic Benchmarking of IaaS Cloud Service Providers for a World of Container Clusters. Journal of Cloud Computing Research, 1(1), 16–34.
We searched for the most similar machine types of different public cloud service providers. The similarity indicator maps processing, memory, network, and disk I/O performance to just one similarity value (1 means identical, 0 means no similarity at all).
This reference model guides our research
Developing a description language for cloud-native applications.
Developing a standardized way of deploying a clustered container runtime environment for cloud-native applications
(CNMM Level 3 conform deploying/operation)
Make use of commodity services of public cloud service providers only (IaaS).
Research Surveillance of Practitioners
Practitioners often prefer layer-based reference models ...
Jason Lavigne, ”Don’t let aPaaS you by - What is aPaaS and why Microsoft is excited about it”, see https://atjasonunderscorelavigne.wordpress.com/2014/01/27/dont-let-apaas-you-by/ (last access 4th August 2016)
Johann den Haan, ”Categorizing and Comparing the Cloud Landscape”, see http://www.theenterprisearchitect.eu/blog/categorize-compare-cloud-vendors/ (accessed 4th August 2016)
Josef Adersberger, Andreas Zitzelsberger, Mario-Leander Reimer, ”Der Cloud-Native-Stack: Mesos, Kubernetes und Spring Cloud”, see http://www.qaware.de/fileadmin/user_upload/QAware-Cloud-Native-Artikelserie-Java_Magazin-1.pdf (accessed 4th August 2016)
MEKUNS Cloud Landscape Model