Dependability in Hybrid Clouds: Practitioner Insights IFIP 10.4 Work Group Meeting – Winter 2018, Goa, India Sreekrishnan Venkateswaran, Distinguished Engineer, IBM Cloud Center of Excellence
Contents
Today’s Hybrid Cloud Landscape & Dependability NFRs
Availability Fulfilment Approach: Statistics from 50 Client Deals
Case Studies from the Field:
  I: Engineering “Just Enough” HA on a Private Cloud
  II: Hosting a Clustered Appliance on a Public Cloud
  III: Application/MW HA on a Shared Private Cloud
Concluding Thoughts: Trends on Dependability in Hybrid Clouds
§ IT hosting philosophies can be plotted across two axes: cost and volume
§ At one end of the spectrum lie the high-volume, low-cost public clouds
§ At the other end lie the low-volume, high-cost single-tenant environments, cloud or legacy
§ For enterprise clients, there is a sweet spot in this landscape in terms of price and services: a managed, enterprise-grade, multi-tenant cloud
§ It delivers value close to traditional/private IT by providing management above the hypervisor, enhanced isolation & production SLAs
§ It reaches price points close to those of public clouds via standardization, virtualization and automation
[Figure: hosting philosophies plotted on a cost axis (least → reasonable → expensive) vs. a volume axis (individual → selective → mass). Public clouds sit at the high-volume, low-cost corner; traditional IT and private clouds at the low-volume, high-cost corner. The managed cloud occupies the “value for money” region, moving from a “trusted supplier” toward an “innovation partner” posture, and from basic to advanced management practice, via standardization and automation.]
Cloud Landscape: Building Blocks for Hybrid Hosting in Today’s Client Deals
Cloud Broker
Private Clouds: OpenStack, VMware vRealize
Hosted Private Clouds: VMware vRealize on Bluemix, Bluebox
IBM Cloud Managed Services (CMS)
AWS, Azure, Google Cloud, IBM Bluemix
Availability Capabilities Across Cloud Categories
[Figure: the same cost-vs-volume landscape, annotated with representative offerings:
o Traditional IT and private clouds: IBM Cloud Orchestrator, OpenStack, VMware vRealize
o Hosted private clouds: Bluebox, VMware Cloud on SoftLayer
o Managed cloud: IBM Cloud Managed Services (CMS), CMS4SAP
o Public clouds: SoftLayer, AWS, Azure, Google Cloud]
Cloud Category → Availability Philosophy

On-premise Private / Hosted Private / Traditional IT
o Custom design
(Example: Case Study 1)

Public Clouds
o Provider offers VM-level availability SLAs
o Provider offers IaaS-level HA on bare metal
(Example: Case Study 2)

Managed Multi-tenant (or “shared private”) Clouds
o In addition to OS-level availability, introduces clustering to provide a more highly available environment
o HA clusters: customers can specify anti-collocation of the virtual workload onto separate servers for fault containment
o Connects clusters to shared storage for shared-disk HA topologies
(Example: Case Study 3)
Non-Functional Requirements in Cloud Deals: A Recent Example

NFR# | Category | Requirement | Deliverables
NFR01 | Availability | Cloud management software should be highly available | OpenStack and virtualization components will have an active-active configuration; handle server overload vs. server going down
NFR02 | Availability | Hardware management | Workload running on a failed host will be restarted on another host in the resource pool, for both AIX and VMware
NFR03 | Business Continuity | Backup & restore | Support backup policies pertaining to NetBackup
NFR04 | Monitoring & Event Mgt | Host monitoring required; guest monitoring required (managed); dashboards for utilization monitoring | Monitor both hosts and guests; OS agents to be initially deployed manually during post-provisioning
NFR05 | Image Management | Standard images and patterns to be maintained | Maintain a standard image catalogue; create application patterns; manage a standardized catalogue of patterns
NFR06 | Security | Follow client’s security guidelines | TBD
NFR07 | Disaster Recovery | RPO of 30 minutes, RTO of 4 hours | Support failover of DR-sensitive workloads within RTO/RPO
NFR08 | Security | Network isolation | Segregation using VLANs (for LPARs) and VxLANs (for x86, optional)
SLA & SLO Requirements in Cloud Deals: A Recent Example

o Availability SLA: 99.5 to 99.9
o Provisioning request fulfilment SLO: 15 minutes to 24 hours
o DR SLA: RPO/RTO of 15 min / 4 h
o Incident resolution SLO: see table
o On-boarding time SLO: 1 day
o Time to build a pod: 8 months

Provisioning Time SLOs by Instance Type:
Bronze: 15 minutes | Silver: 30 minutes | Gold: 60 minutes | Platinum: 24 hours

Incident Resolution SLAs
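An availability SLA translates mechanically into a yearly downtime budget, which is what the rest of the deck engineers against. A minimal sketch of that arithmetic (function name is mine):

```python
def downtime_budget_hours(sla_percent, period_hours=365 * 24):
    # Allowed downtime, in hours, for a given availability SLA
    # over a period (default: one non-leap year of 8760 hours).
    return (1 - sla_percent / 100) * period_hours

# The 99.5-99.9 SLA band quoted above:
downtime_budget_hours(99.5)  # 43.8 h/year
downtime_budget_hours(99.9)  # 8.76 h/year
```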
Availability Solution Approach Data from Cloud Deals: Recent Statistics (1/2)

[Figure: bar charts of availability approaches across the 50 deals (some deals had no HA at all):
o Compute HA: hypervisor-level, OS-level, clustered MW (e.g. DB server), app-level, other
o Data HA: DB-level, clustered file systems, HW RAID, SDS replicas, other
o Network HA: GW (Vyatta), SDN (NSX), other
o Cloud stack: HA, DR]
Availability Solution Approach Data from Cloud Deals: Recent Statistics (2/2)
# | Conclusion | Improvement Strategy
1 | Most solutions do not map SLA requirements to the level of HA needed across constituent components. They merely follow rules of thumb such as “triple-replicate storage since this airline client needs 99.95” | Simple uptime modeling (focus of Field Case Study 1)
2 | For managed cloud deals, the managed service provider usually offers only two system availability options, one corresponding to HA and another to non-HA | How to convince managed service provider teams to offer the HA required by the client?
Client Case Study 1: Engineering “Just-Enough” HA on a Private Cloud

Problem Statement
1. A global client required a 99.90 uptime SLA (9 h/y downtime) at the OS level
2. But the managed service provider (MSP) offered only an uptime SLA of 99.0 (3 d/y downtime) by default
3. However, the MSP allowed a 99.5 SLA (≈2 d/y, i.e. ≈7 min/day, downtime) on client IaaS enabled with “Basic HA”:
   - Tier-4 data center (99.995 at site level)
   - V7K storage with redundant HVACs
   - System-P/AIX
   - 24x7 hands & feet in the DC

[Figure: client infrastructure hosted in a Tier-4 DC with “Basic HA” — 2 Power-8 frames/AIX, V7000 storage, network elements]

Solution Approach that was Followed
1. Assume that the managed service provider offers only 99.0 with “Basic HA”
2. Engineer additional HA on the IaaS
3. Model the ensuing redundancy and establish to the MSP team that the additional HA can increase the SLA from 99.0 to 99.95 (4 h/y) without additional risk to the MSP
Additional HA Engineered in the Proposed System

The following is the proposed design for the client (in the primary data center):

[Figure: proposed HA design for the client’s environment — clusters C1,* to C3,*, each cluster Ci annotated with (Pi, ti, fi)]

C1,* = Compute
  C1,1 = System-P Frame-1
  C1,2 = System-P Frame-2
  C1,3 = System-P Frame-3 (redundant node: PowerHA)
C2,* = Storage
  C2,1 = V7000 storage volumes
  C2,2 = Redundant RAID-10
C3,* = Network
  C3,1 = Network elements
  C3,2 = Switches/routers/FWs/CPEs in dual-redundant mode
Uptime Modeling [1/3]

We model a cloud-hosted system S as a serial combination of n clusters. Let each cluster Ci be composed of Ki nodes, each denoted Ci,k (k = 1 … Ki).

The overall downtime probability of S can be expressed as

  Ds = Bs + Fs ------ [1]

where
  Bs = system downtime due to non-recoverable failures (breakdown of one or more clusters), and
  Fs = system downtime due to recoverable failures (outage while clusters recover from node failures).

Bs and Fs are mutually exclusive if we disregard the possibility of an unrecoverable failure during cluster failover.

  Pi = probability that a node in cluster Ci is down (= 1%, from the MSP’s assumption that 99% can be offered without HA)
  fi = average yearly failures for a node in cluster Ci (from cloud broker data lakes)
  ti = failover latency with the chosen HA algorithm (from empirical observations)
  ḱi < Ki = maximum number of failed nodes that can be tolerated by the clustering algorithm of Ci
  (If the level of redundancy in a cluster is N+η, then ḱi is η.)

Probability that cluster Ci is up = Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j)

Probability that all clusters in the system are up = Π_{i=1}^{n} [ Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j) ]

Downtime probability of system S due to breakdowns:
  Bs = 1 − Π_{i=1}^{n} [ Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j) ] ------- [2]
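The binomial sum in equation [2] translates directly into code. A minimal Python sketch (function names are mine):

```python
from math import comb

def cluster_up_probability(K, k_tol, p_node_down):
    # P(cluster Ci up): at least K - k_tol of its K nodes must be up.
    # k_tol is the deck's k-acute_i, the number of failed nodes the
    # clustering algorithm tolerates (eta, for N+eta redundancy).
    return sum(comb(K, j) * (1 - p_node_down) ** j * p_node_down ** (K - j)
               for j in range(K - k_tol, K + 1))

def breakdown_downtime_probability(clusters):
    # Equation [2]: Bs = 1 - product over all clusters of P(Ci up),
    # with each cluster given as a (K, k_tol, p_node_down) tuple.
    product = 1.0
    for K, k_tol, p in clusters:
        product *= cluster_up_probability(K, k_tol, p)
    return 1.0 - product

# A 2+1 PowerHA compute cluster at the MSP's 99% per-node uptime:
p_up = cluster_up_probability(3, 1, 0.01)  # 0.999702
```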
Uptime Modeling: Calculating Ds [2/3]

Let ti be the time (in minutes) to fail over if a node in cluster Ci goes down. Let fi be the average number of failures experienced by a node in cluster Ci in a year.

Failover time ti is the sum of (i) the time to detect that the currently active node in cluster Ci is down, i.e. the time before a heartbeat miss is detected, (ii) the time to bring up the failover node if it is on standby, and (iii) the time for the failover node to take over from the primary node.

Since Pi is the probability that a node in cluster Ci is down, it is also the probability that the currently active node in cluster Ci is down. Downtime due to failover transactions in cluster Ci = fi · ti.

[Figure: cloud-hosted clustered IaaS architecture of a system S — clusters C1,* … Cn,*, each cluster Ci of Ki nodes annotated with (Pi, ti, fi)]
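The three-part decomposition of ti can be illustrated with a small sketch; the parameter values below are illustrative assumptions, not measurements from the deal:

```python
def failover_latency(heartbeat_interval, missed_beats, standby_bringup, takeover):
    # t_i = (i) detection time (heartbeat misses before the active node
    # is declared dead) + (ii) time to bring up a standby node +
    # (iii) time for the failover node to take over. All in minutes.
    detection = heartbeat_interval * missed_beats
    return detection + standby_bringup + takeover

# e.g. 10-second heartbeats, node declared dead after 3 misses, warm standby:
t_i = failover_latency(10 / 60, 3, 0.0, 0.2)  # 0.7 minutes
f_i = 4                                       # assumed failures per node-year
downtime_per_year = f_i * t_i                 # the slide's f_i * t_i, in minutes
```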
Uptime Modeling: Calculating Ds [3/3]

However, multiple clusters might be simultaneously experiencing failover transactions, and that time cannot be double-counted. Downtime due to failover transactions in cluster Ci, when no other cluster is experiencing failover transactions, is fi · ti · (Ki − ḱi) · P(X1), where X1 is the event that only cluster Ci in the system is experiencing a failover event and

  P(X1) = Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj)

Note: we ignore the error of counting intra-cluster node failover times when more than ḱi nodes in Ci fail simultaneously. We also disregard the possibility of an unrecoverable error during cluster failover.

Thus, downtime due to failover transactions in cluster Ci when there are no other simultaneous failovers in any other cluster
  = fi · ti · (Ki − ḱi) · Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj) minutes

Downtime due to failover transactions across all clusters = Σ_{i=1..n} fi · ti · (Ki − ḱi) · P(X1) minutes

Downtime probability due to all failover transactions across clusters (525600 being the minutes in a year):
  Fs = Σ_{i=1..n} [ (fi · ti · (Ki − ḱi)) / 525600 ] · Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj) ----- [3]

Applying equations [2] and [3] to equation [1], we get the total downtime probability

  Ds = ( 1 − Π_{i=1..n} [ Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j) ] )
     + ( Σ_{i=1..n} [ (fi · ti · (Ki − ḱi)) / 525600 ] · Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj) ) ----- [4]

Total uptime probability Us = 1 − Ds ----- [5]
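Equations [4] and [5] can be checked numerically. The sketch below applies them to the case-study parameters; the ḱi values (N+1 redundancy for every cluster) and function names are my assumptions:

```python
from math import comb

MINUTES_PER_YEAR = 525600

def up_prob(K, k_tol, p):
    # Inner sum of equation [2]: P(cluster up).
    return sum(comb(K, j) * (1 - p) ** j * p ** (K - j)
               for j in range(K - k_tol, K + 1))

def downtime_probability(clusters):
    # Equation [4]: Ds = Bs + Fs, with clusters given as
    # (K, k_tol, P, f, t) tuples and failover time t in minutes.
    Bs = 1.0
    for K, k_tol, p, _, _ in clusters:
        Bs *= up_prob(K, k_tol, p)
    Bs = 1.0 - Bs
    Fs = 0.0
    for i, (K, k_tol, _, f, t) in enumerate(clusters):
        no_other_failover = 1.0  # P(X1): no simultaneous failover elsewhere
        for j, (Kj, kj, pj, _, _) in enumerate(clusters):
            if j != i:
                no_other_failover *= (1 - pj) ** (Kj - kj)
        Fs += (f * t * (K - k_tol) / MINUTES_PER_YEAR) * no_other_failover
    return Bs + Fs

# Case study 1: compute (PowerHA 2+1), storage (RAID-10), network (dual node)
system = [
    (3, 1, 0.01, 1, 30.0),     # 30-minute failover
    (2, 1, 0.01, 2, 10 / 60),  # 10-second failover
    (2, 1, 0.01, 1, 10 / 60),  # 10-second failover
]
Us = 1 - downtime_probability(system)  # ~0.9994, near the deck's 99.945%
```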
Estimating New Uptime with the Additionally Engineered HA

Component # | Uptime without HA (Pi) (estimate, given MSP’s posture) | Average yearly failures (fi) (estimates from a cloud broker) | Failover latency in HA mode (ti) (empirical, from experience) | Proposed HA method
1 | 99% (3 d/y downtime) | 1 | 30 minutes | PowerHA (2+1)
2 | 99% | 2 | 10 seconds | RAID-10
3 | 99% | 1 | 10 seconds | Dual-node cluster

System uptime with this architecture (applying equation [5]): Us = 99.945% (≈ 289 min/y downtime)
[Figure: client environment — client users reach, via the Internet and the MSP gateway (GW), a load-balancer virtual appliance fronting a web tier, two app tiers and a DB tier over a 1 Gbps network, with:
  Component 1: Power-VC virtualized physical nodes (2+1 HA)
  Component 2: 60 TB V7K disks (RAID-10 for HA)
  Component 3: Network elements (dual-node cluster for HA)]
Client Case Study 2: Hosting a Clustered Appliance on a Public Cloud

Problem Statement
1. A global e-commerce retailer who wished to adopt a “Cloud-First” strategy ran specialized HA appliances (Oracle RAC in this case) in the backend for HA (“below the shopping cart”). Migrating the client to a public cloud was ruled out without RAC support
2. RAC clustering of Oracle database servers is not cloud-friendly and hence is not supported on mainstream public clouds, because:
   o Dedicated network links between cluster servers are needed to carry heartbeats; missed heartbeats can generate false negatives and can be disastrous
   o Layer-2 adjacency is required between cluster servers
   o Redundant storage is needed

Solution Approach that was Followed
1. Un-bond the two public cloud interfaces on physical bare-metal hosts; provision heartbeat VLANs on this interface
2. Provisioning bare-metal servers on the same physical rack implies they are L2-adjacent, since switches in a rack are trunked (link aggregation). If cluster servers fall on different racks, open a ticket to get the corresponding switches trunked
3. Use redundant SDS storage (VSAN or Ceph)
4. Make this a “pseudo-standard” building pattern in certain public cloud data centers
Client Case Study 3: Application & MW HA on a Shared Private Cloud

Problem Statement
1. A research facility in Mexico offers “classroom virtualization”, i.e. e-delivery of courses over the public Internet
2. A highly scalable web service is connected to a scalable application tier and a database tier (IBM DB2 in this case), with all three tiers hosted on a “shared private” managed cloud. The DB tier needs to operate in HA mode
3. In general, DB HA is of limited value at VM level, since both database server instances could end up on the same physical host

Solution Approach that was Followed
1. The database servers (IBM DB2) were deployed in HA mode in anti-collocated VMs
2. The database itself was hosted on a shared disk
3. The clustered database could thus tolerate single physical host failures
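On an OpenStack-based shared cloud (one of the private cloud stacks named earlier), the anti-collocation in step 1 can be expressed as an anti-affinity server group. A CLI/config sketch, with assumed image, flavor and VM names:

```shell
# Create an anti-affinity server group so the scheduler places the two
# DB2 VMs on different physical hosts (all names are illustrative).
openstack server group create --policy anti-affinity db2-ha-group
GROUP_ID=$(openstack server group show db2-ha-group -f value -c id)

# Boot each DB2 VM into the group; the scheduler enforces anti-collocation.
openstack server create --image rhel7 --flavor m1.large \
    --hint group="$GROUP_ID" db2-node-1
openstack server create --image rhel7 --flavor m1.large \
    --hint group="$GROUP_ID" db2-node-2
```

The shared disk of step 2 would additionally be a volume attached to both VMs, which is a separate storage-side configuration.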
Concluding Thoughts: Trends on Dependability in Hybrid Clouds
§ In the “Cloud-First” hosting model that enterprises are increasingly adopting, workload dependability fulfilment is more about solution composition than about engineering effort, if the target hosting is on modern-day public and “shared private” clouds.
§ On private clouds, dependability translates to redundancy engineering, but mathematical modeling is almost always needed for a “just enough” design that maps as closely as possible to SLAs.
§ There are also non-technical aspects that influence dependability, such as the standard SLA catalogues of managed service providers and application support teams.
Summary
We discussed:
§ Today’s hybrid cloud landscape & dependability NFRs
§ Availability fulfilment approach statistics from 50 client deals
§ Three client case studies where the required dependability was improvised/engineered in the face of constraints on the cloud
§ Trends on dependability in hybrid clouds