#vmworld
Deep Dive: Run Kubernetes in Production with PKS
James Webb: T-Mobile MTS, Platform Engineering
Merlin Glynn: VMware, PKS Product Management
#CNA1674BE
VMworld 2018 Content: Not for publication or distribution
Disclaimer
©2018 VMware, Inc.
This presentation may contain product features or functionality that are currently under development.
This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new features/functionality/technology discussed or presented, have not been determined.
Agenda
Table of Contents
1. PKS Episodes I-III (The Prequel): PKS Design (10 mins)
2. PKS Episode IV, Day 0: Architecting for Production (15 mins)
3. PKS Episode V, Day 1: Developer Onboarding & Self Service (10 mins)
4. PKS Episode VI, Day 2 (15 mins)
  ○ Networking & Security
  ○ Persistent Storage
  ○ Monitoring & Logging
  ○ Top 3 Real World Challenges to Look Out for
5. PKS Episodes VII-IX: Q&A (10 mins)
PKS Episodes I-III
A PKS Prequel Story: PKS Design
PKS & PAS & Functions
Development Teams

1. PAS: "Here is my source code. Run it on the cloud for me; I do not care how."
What                          Who
Writes Code                   Developer
Builds Image                  Platform
Defines How it is Exposed     Platform
Diagram labels: Code, Buildpack, AI

2. PKS: "Here is my built code. Run it on the cloud for me; I will tell you how."
What                          Who
Writes Code                   Developer
Builds Image                  Developer & Pipeline
Defines How it is Exposed     Developer & Pipeline
Diagram labels: Image, POD

3. Functions: "Here is what I need. Run it on the cloud for me; stop it when it's done."
What                          Who
Calls a Function              Developer
Diagram label: API Rqst
Who is PKS Built For?
● Developers (Development Teams)
  – Write code; code is deployed using CI/CD
  – Focus on business problems and innovation
  – Develop, Deploy, Scale, Monitor Apps
  – Innovation of Business Capability as Cloud Native Apps
● Application Dev/Ops Owner
  – Automate Everything
  – Agile
  – Serve developers
  – Create K8s clusters, scale clusters and maintain their health
  – Provide developer access to the cluster
● Platform Reliability Engineer (IT Operator)
  – PRE (Platform Reliability Engineering)
  – Platform is Reliable
  – Capacity is planned for
  – Platform is Secured & Controlled
  – Platform is Auditable
  – Deploy, Scale, Operate PKS
  – Physical Infrastructure is Operated
  – Network & Security Control Policy is defined
PKS Design Overview: BOSH
PKS Design Overview: A PKS Prequel Story
● It all Starts with an IaaS
● Multi Cloud is a Key ‘Theme’ of PKS
○ Common Ops across clouds
○ Azure Coming Soon
PKS Design Overview: Control Plane Design
PRE
○ Deploys PKS Control Plane
PKS Design Overview: Deploy A Cluster
ADO
○ Create Cluster w/ NSX-T
PKS Design Overview: K8s & PKS
Developer or CD
○ Uses Cluster w/ NSX-T
PKS Design Overview: K8s & PKS
Developer or CD
○ Uses Cluster w/out NSX-T
PKS Episode IV
Day 0: Architecting for Production: Real World w/ T-Mobile
Background
Who we are - T-Mobile Platform Engineering
● 25 member team supporting customer facing platforms
  ○ Pivotal Application Service (PAS)
  ○ Pivotal Container Service (PKS)
  ○ Open Source K8S
  ○ BOSH
● Part of a larger organization supporting all IT infrastructure for T-Mobile
Where we were - Jan 2018
● IaaS - 30,000+ VMs
● PaaS - 22,000+ Pivotal Application Service (PAS) Containers
● CaaS - ~300 Containers running in PAS
● Goal: Evaluate and build on-premise K8S offering
CaaS Gap
DevOps teams looking for a place to run Docker containers on-premise
● No standard on-premise offering
● Docker in PAS is not an ideal experience
  ○ Upgrades not seamless
  ○ No persistent storage
  ○ TCP Routing - good but not great for all use cases
● DevOps teams often running their own Docker platforms on VMs
On-Prem CaaS Requirements
Platform Team:
● Highly Available
  ○ Control Plane (etcd/API)
  ○ Worker Nodes
  ○ Authn/Authz
● Scalable
  ○ Control Plane (API)
  ○ Worker Nodes
● Automated Deployment
  ○ Control Plane (OpsMan/Bosh)
  ○ Cluster builds
● No Downtime Lifecycle Management
  ○ K8S Upgrades
  ○ OS Patching
  ○ Infrastructure Maintenance
● LDAP Integration
● API Configurability
DevOps Teams:
● Native K8S Experience
  ○ Container Orchestration
● Persistent Storage
  ○ Single AZ
  ○ Intra-AZ Replication
  ○ Cross-Region Replication
● PAS-like HTTPS experience
  ○ Certificate
  ○ DNS
  ○ Load Balancing
● TCP Ingress
  ○ Load Balancing
HL Physical Architecture
Region
● 3 AZs
  ○ Network
  ○ Compute
  ○ Storage
● High Bandwidth/Low Latency East/West Networking
Data Center
● Multiple Regions per Data Center
  ○ Isolated network & power
● Near/Near/Far Availability Strategy
PKS – Architecture Challenges
Thermal Exhaust Ports:
● Authn - PKS CLI/UAA is not HA yet …
  ○ AZ failure results in no new LDAP auth until resolved
  ○ Clear recoverability process into new AZ not yet well defined
● API Configurability
  ○ Need access to more API flags to support cluster customization
    ■ PodPresets
    ■ PodSecurityPolicy
    ■ ...
● Scalability
  ○ Worker scale up available, scale down coming
  ○ 200 Node Worker Limit is tested scale
  ○ Cannot scale K8s API nodes independently of etcd nodes
PKS Episode V
Day 1: Developer Onboarding
Who Does What?
[Diagram. Actors: PRE, ADO, Developer. Actions: Operates PKS, Set PKS RBAC, PKS Create-Cluster, Set K8s RBAC, Access K8s. Components: AD/LDAP, UAA, PKS API, OIDC, Kubernetes Master (kube-api), four Namespaces.]
Are they the Same Person?
PKS – Control Plane RBAC Basics
UAA Scopes
● pks.clusters.admin: Can Access All Clusters
● pks.clusters.manage: Can Only Access Clusters They Create
[Diagram: Application Dev/Ops Owners “Fred” and “Ethel” reach the PKS API through UAA, one with the manage scope and one with the admin scope. Clusters shown: Fred’s K8s Cluster, Ethel’s K8s Cluster, Rick’s K8s Cluster.]
K8s – OIDC / RBAC Basics
[Diagram: AD/LDAP, UAA, PKS API, OIDC, Master kube-api, Namespaces A-D. Labels: ADO “Lamont”, kubectl create, K8s Role, K8s RoleBinding, Developer “Fred”.]

K8s Role

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: namespace-a # "Namespace A" in the diagram; K8s names must be lowercase DNS labels
  name: pod-reader
rules:
- apiGroups: [""] # "" is the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

K8s RoleBinding

kind: RoleBinding # Can also apply at Cluster level
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
  namespace: namespace-a
subjects:
- kind: User # Can support LDAP Groups as well
  name: Fred # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role # this must be Role or ClusterRole
  name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io
Putting it Together …
[Diagram: AD/LDAP, UAA, PKS API, OIDC, Master kube-api; Kubernetes namespaces dev, B, C, D.]

1. PRE: “James”
  1. pks login (James has the pks.clusters.manage role)
  2. pks create-cluster omni-app
  3. pks get-credentials omni-app
  4. kubectl create -f cluster-role-binding.yaml
    a. Bind Cluster admin role on Cluster to LDAP group “CN=omni-app-admins” (Lamont is a member)
2. ADO: “Lamont”
  1. get kubeconfig (jwt token from UAA/OIDC)
  2. kubectl create namespace dev
  3. kubectl create -f role-binding.yaml
    a. Bind NamespaceAdmin role on namespace dev to LDAP group “CN=omni-app-devteam” (Fred is a member)
3. Developer: “Fred”
  1. get kubeconfig (jwt token from UAA/OIDC)
  2. kubectl create -f my-app.yaml -n dev
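The two binding files named in the steps above are not shown on the slide. A minimal sketch of what they might contain follows, under several assumptions: the binding names are hypothetical, the group names are assumed to reach Kubernetes as plain strings through the OIDC groups claim (the exact form depends on how UAA/OIDC is configured), and "NamespaceAdmin" is assumed to map to the built-in `admin` ClusterRole bound at namespace scope.

```yaml
# cluster-role-binding.yaml (sketch): created by the PRE, "James"
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: omni-app-admins-cluster-admin   # hypothetical binding name
subjects:
- kind: Group
  name: omni-app-admins                 # assumed form of the LDAP group claim
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin                   # built-in ClusterRole
  apiGroup: rbac.authorization.k8s.io
---
# role-binding.yaml (sketch): created by the ADO, "Lamont"
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: omni-app-devteam-admin          # hypothetical binding name
  namespace: dev
subjects:
- kind: Group
  name: omni-app-devteam                # assumed form of the LDAP group claim
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: admin                           # built-in role, assumed stand-in for "NamespaceAdmin"
  apiGroup: rbac.authorization.k8s.io
```

Binding to groups rather than individual users means access follows LDAP membership, so adding a developer to the team needs no change inside the cluster.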
Putting it Together for Production:
Are the PRE & ADO the Same Person?
● Q: Why didn’t James (PRE) grant pks.clusters.admin or pks.clusters.manage to Lamont (ADO), i.e. self-service creation of the K8s cluster?
● A: James needs some way to limit what Lamont can create and to enable Lamont’s team to perform certain actions on the cluster
  ■ Resource Quotas
  ■ Tenant / Group Ownership of Clusters
Quotas & Tenancy Hierarchy
PKS Episode VI
Day 2: The Challenges (Hard Stuff)
Automation
Challenge: Automate all the things
Solution: Concourse (turtles all the way down)
● Bootstrap side-car BOSH environment (via Concourse)
● Deploy Concourse to support environment pipelines
● Deploy Opsman (PCF Pipelines)
● Deploy PKS (PCF Pipelines)
● Deploy PKS clusters (custom)
● Post cluster install configuration (custom)
  ○ Front-End LBs
  ○ Ingress
  ○ Monitoring
  ○ Persistent Storage
  ○ Logging
  ○ ...
Cluster Ownership
Challenge: Enable (but don’t burden) DevOps customers
Solution: Managed Clusters
● Platform team manages:
  ○ Infrastructure (Compute, Network, Storage)
  ○ Cluster install/upgrades
  ○ Base cluster tooling (monitoring, logging, ingress, persistent storage, …)
● Multi-tenant clusters
  ○ Economies of scale
  ○ Fewer objects to manage
● Single tenant clusters where it makes sense
  ○ Sensitive environments
  ○ Advanced customers who need more control
User Management & Access
Challenge: Efficiently managing a lot of users and teams (with audit trails)
Solution: GitOps
● Adapted from PAS management tooling
● Namespace & user management in source control
  ○ Namespace quotas & configuration
  ○ DevOps team leaders control who has access to their namespace
  ○ User management pipelined for self-service
  ○ Quota/config changes generate a pull request to CaaS Platform Team for review
● LDAP Integration via UAA and PKS CLI
● User token generation/management cumbersome – better tooling in the works
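As an illustration of the "namespace quotas & configuration" kept in source control, a namespace quota could be a standard Kubernetes ResourceQuota manifest that a pipeline applies after the pull request is approved. This is a sketch only: the file path, object name, namespace, and every limit value are hypothetical and are not taken from T-Mobile's repository.

```yaml
# Hypothetical file in the GitOps repo, e.g. namespaces/dev/quota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota                # hypothetical name
  namespace: dev
spec:
  hard:
    requests.cpu: "8"            # total CPU requested by all pods in the namespace
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"                   # maximum number of pods
    persistentvolumeclaims: "10" # maximum number of PVCs
```

Because the quota is a plain file, a change to any limit shows up as a reviewable diff, which is what provides the audit trail the challenge calls for.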
Observability
Challenge: See all the things
Solution: Prometheus/Grafana
● Applying existing PAS tooling to the PKS Opsman/BOSH framework
● Prometheus at every layer
  ○ Proactive monitoring and alarming
  ○ Metrics dashboards
  ○ Capacity planning
  ○ Data available for export to aggregation engines
● Cluster wide APM service being evaluated
  ○ Currently bring your own APM
● Pod logging by default
  ○ DevOps teams can customize as needed
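For the "proactive monitoring and alarming" point, a Prometheus alerting rule is one way such an alarm could be expressed. The sketch below assumes kube-state-metrics is deployed in the cluster (it provides the `kube_node_status_condition` metric); the group name, alert name, severity label, and the 5 minute threshold are hypothetical choices, not settings from the talk.

```yaml
# Prometheus rule file (sketch); assumes kube-state-metrics is being scraped
groups:
- name: pks-cluster-health            # hypothetical group name
  rules:
  - alert: KubernetesNodeNotReady     # hypothetical alert name
    # Matches nodes whose Ready condition is not currently true
    expr: kube_node_status_condition{condition="Ready",status="true"} == 0
    for: 5m                           # must hold for 5 minutes before firing
    labels:
      severity: page                  # hypothetical routing label
    annotations:
      summary: "Node {{ $labels.node }} has not been Ready for 5 minutes"
```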
Traffic Management – (LB, Ingress, …)
Challenge: HTTPS/TCP/UDP/IP Spaghetti
Solution: It’s complicated
● Chose not to use NSX
● External LTMs direct traffic to clusters
  ○ Per cluster configuration of LTM (automation needed)
  ○ Provide wildcard DNS & certificate for HTTPS ingress
  ○ Support bring your own certificate as well
● Evaluating TCP ingress solutions
● Evaluating Envoy & Istio
  ○ mTLS & egress routing solve a lot of problems
● Smattering of NodePort
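To show how an app team might consume the wildcard DNS and certificate for HTTPS ingress, here is a minimal Ingress sketch. It assumes an ingress controller is already running in the cluster (the talk does not name one), and the host name, Secret name, Service name, and port are all hypothetical. It uses the current `networking.k8s.io/v1` API; clusters from the 2018 timeframe would have used an older beta Ingress API.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  namespace: dev
spec:
  tls:
  - hosts:
    - my-app.apps.example.com        # hypothetical host under the wildcard DNS entry
    secretName: wildcard-apps-tls    # hypothetical Secret holding the wildcard certificate
  rules:
  - host: my-app.apps.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app             # hypothetical Service in the dev namespace
            port:
              number: 80
```

A team bringing its own certificate would point `secretName` at its own TLS Secret instead.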
Persistent Storage
Challenge: Replicated Volumes across AZs
Solution: Software
● AZ local storage
  ○ VMware does not support (coming in k8s 1.14?)
● SDS layer
  ○ Local storage presented to worker nodes via VMDK
  ○ Single-AZ storage class (data can be lost or is replicated by application)
  ○ Multi-AZ storage class (SW replicated, 2 or 3 RF)
  ○ Replicated storage at PVC layer: easy button for app teams
  ○ Pod scheduling optimizes location (if possible)
● Evaluating CSI drivers for iSCSI storage devices
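The "easy button" at the PVC layer can be pictured as a storage class that hides replication from the app team. The sketch below is illustrative only: the talk does not name the SDS product, so the provisioner string and the `replicationFactor` parameter are hypothetical placeholders, as are the object names and the requested size.

```yaml
# Multi-AZ storage class (sketch); provisioner and parameters are hypothetical
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: multi-az-replicated
provisioner: sds.example.com/volume   # placeholder for the SDS vendor's provisioner
parameters:
  replicationFactor: "3"              # hypothetical parameter: 2 or 3 RF per the slide
---
# An app team asks for replicated storage just by naming the class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
  namespace: dev
spec:
  storageClassName: multi-az-replicated
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```

A single-AZ class would look the same apart from its name and a lower replication setting, leaving replication to the application.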
PKS Episode VII-X
https://maps.t-mobile.com
Q & A
THANK YOU!
#vmworld #CNA1674BE