Dependability in Hybrid Clouds: Practitioner Insights IFIP 10.4 Work Group Meeting – Winter 2018, Goa, India Sreekrishnan Venkateswaran, Distinguished Engineer, IBM Cloud Center of Excellence
Contents
Today’s Hybrid Cloud Landscape & Dependability NFRs
Availability Fulfilment Approach: Statistics from 50 Client Deals
Case Studies from the Field:
  I: Engineering “Just Enough” HA on a Private Cloud
  II: Hosting a Clustered Appliance on a Public Cloud
  III: Application/MW HA on a Shared Private Cloud
Concluding Thoughts: Trends on Dependability in Hybrid Clouds
§ IT hosting philosophies can be plotted across two axes: cost and volume
§ At one end of the spectrum lie the high-volume, low-cost public clouds
§ At the other end lie the low-volume, high-cost single-tenant environments, cloud or legacy
§ For enterprise clients, there is a sweet spot in this landscape in terms of price and services: a managed, enterprise-grade, multi-tenant cloud
§ It delivers value close to traditional/private IT by providing management above the hypervisor, enhanced isolation & production SLAs
§ It reaches price points close to those of public clouds via standardization, virtualization and automation
[Figure: hosting philosophies plotted on a cost axis (least → reasonable → expensive) vs. a volume axis (individual → selective → mass). Public clouds sit at the high-volume, low-cost corner; traditional IT and private clouds at the low-volume, high-cost corner. The managed cloud occupies the “value for money” region, moving from a “trusted supplier” toward an “innovation partner” posture, and from basic to advanced management practice, via standardization and automation.]
Cloud Landscape: Building Blocks for Hybrid Hosting in Today’s Client Deals
Cloud Broker
Private Clouds: OpenStack, VMware vRealize
Hosted Private Clouds: VMware vRealize on Bluemix, Bluebox
IBM Cloud Managed Services (CMS)
AWS, Azure, Google Cloud, IBM Bluemix
Availability Capabilities Across Cloud Categories
[Figure: the same cost-vs-volume landscape, annotated with representative offerings:
o Traditional IT and private clouds: IBM Cloud Orchestrator, OpenStack, VMware vRealize
o Hosted private clouds: Bluebox, VMware Cloud on SoftLayer
o Managed cloud: IBM Cloud Managed Services (CMS), CMS4SAP
o Public clouds: SoftLayer, AWS, Azure, Google Cloud]
Cloud Category → Availability Philosophy

On-premise Private / Hosted Private / Traditional IT
o Custom design
(Example: Case Study 1)

Public Clouds
o Provider offers VM-level availability SLAs
o Provider offers IaaS-level HA on bare metal
(Example: Case Study 2)

Managed Multi-tenant (or “shared private”) Clouds
o In addition to OS-level availability, introduces clustering to provide a more highly available environment
o HA clusters: customers can specify anti-collocation of the virtual workload onto separate servers for fault containment
o Connects clusters to shared storage for shared-disk HA topologies
(Example: Case Study 3)
Non-Functional Requirements in Cloud Deals: A Recent Example

NFR# | Category | Requirement | Deliverables
NFR01 | Availability | Cloud management software should be highly available | OpenStack and virtualization components will have an active-active configuration; handle server overload vs. server going down
NFR02 | Availability | Hardware management | Workload running on a failed host will be restarted on another host in the resource pool, for both AIX and VMware
NFR03 | Business Continuity | Backup & restore | Support backup policies pertaining to NetBackup
NFR04 | Monitoring & Event Mgt | Host monitoring required; guest monitoring required (managed); dashboards for utilization monitoring | Monitor both hosts and guests; OS agents to be initially deployed manually during post-provisioning
NFR05 | Image Management | Standard images and patterns to be maintained | Maintain a standard image catalogue; create application patterns; manage a standardized catalogue of patterns
NFR06 | Security | Follow client’s security guidelines | TBD
NFR07 | Disaster Recovery | RPO of 30 minutes, RTO of 4 hours | Support failover of DR-sensitive workloads within RTO/RPO
NFR08 | Security | Network isolation | Segregation using VLANs (for LPARs) and VxLANs (for x86, optional)
SLA & SLO Requirements in Cloud Deals: A Recent Example

o Availability SLA: 99.5 to 99.9
o Provisioning request fulfilment SLO: 15 minutes to 24 hours
o DR SLA: RPO/RTO of 15 min / 4 h
o Incident resolution SLO: see table
o On-boarding time SLO: 1 day
o Time to build a pod: 8 months

Provisioning Time SLOs by Instance Type:
Bronze: 15 minutes | Silver: 30 minutes | Gold: 60 minutes | Platinum: 24 hours

Incident Resolution SLAs
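An availability SLA translates mechanically into a yearly downtime budget, which is what the rest of the deck engineers against. A minimal sketch of that arithmetic (function name is mine):

```python
def downtime_budget_hours(sla_percent, period_hours=365 * 24):
    # Allowed downtime, in hours, for a given availability SLA
    # over a period (default: one non-leap year of 8760 hours).
    return (1 - sla_percent / 100) * period_hours

# The 99.5-99.9 SLA band quoted above:
downtime_budget_hours(99.5)  # 43.8 h/year
downtime_budget_hours(99.9)  # 8.76 h/year
```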
Availability Solution Approach Data from Cloud Deals: Recent Statistics (1/2)

[Figure: bar charts of availability approaches across the 50 deals (some deals had no HA at all):
o Compute HA: hypervisor-level, OS-level, clustered MW (e.g. DB server), app-level, other
o Data HA: DB-level, clustered file systems, HW RAID, SDS replicas, other
o Network HA: GW (Vyatta), SDN (NSX), other
o Cloud stack: HA, DR]
Availability Solution Approach Data from Cloud Deals: Recent Statistics (2/2)
# | Conclusion | Improvement Strategy
1 | Most solutions do not map SLA requirements to the level of HA needed across constituent components. They merely follow rules of thumb such as “triple-replicate storage since this airline client needs 99.95” | Simple uptime modeling (focus of Field Case Study 1)
2 | For managed cloud deals, the managed service provider usually offers only two system availability options, one corresponding to HA and another to non-HA | How to convince managed service provider teams to offer the HA required by the client?
Client Case Study 1: Engineering “Just-Enough” HA on a Private Cloud

Problem Statement
1. A global client required a 99.90 uptime SLA (9 h/y downtime) at the OS level
2. But the managed service provider (MSP) offered only an uptime SLA of 99.0 (3 d/y downtime) by default
3. However, the MSP allowed a 99.5 SLA (≈2 d/y, i.e. ≈7 min/day, downtime) on client IaaS enabled with “Basic HA”:
   - Tier-4 data center (99.995 at site level)
   - V7K storage with redundant HVACs
   - System-P/AIX
   - 24x7 hands & feet in the DC

[Figure: client infrastructure hosted in a Tier-4 DC with “Basic HA” — 2 Power-8 frames/AIX, V7000 storage, network elements]

Solution Approach that was Followed
1. Assume that the managed service provider offers only 99.0 with “Basic HA”
2. Engineer additional HA on the IaaS
3. Model the ensuing redundancy and establish to the MSP team that the additional HA can increase the SLA from 99.0 to 99.95 (4 h/y) without additional risk to the MSP
Additional HA Engineered in the Proposed System

The following is the proposed design for the client (in the primary data center):

[Figure: proposed HA design for the client’s environment — clusters C1,* to C3,*, each cluster Ci annotated with (Pi, ti, fi)]

C1,* = Compute
  C1,1 = System-P Frame-1
  C1,2 = System-P Frame-2
  C1,3 = System-P Frame-3 (redundant node: PowerHA)
C2,* = Storage
  C2,1 = V7000 storage volumes
  C2,2 = Redundant RAID-10
C3,* = Network
  C3,1 = Network elements
  C3,2 = Switches/routers/FWs/CPEs in dual-redundant mode
Uptime Modeling [1/3]

We model a cloud-hosted system S as a serial combination of n clusters. Let each cluster Ci be composed of Ki nodes, each denoted Ci,k (k = 1 … Ki).

The overall downtime probability of S can be expressed as

  Ds = Bs + Fs ------ [1]

where
  Bs = system downtime due to non-recoverable failures (breakdown of one or more clusters), and
  Fs = system downtime due to recoverable failures (outage while clusters recover from node failures).

Bs and Fs are mutually exclusive if we disregard the possibility of an unrecoverable failure during cluster failover.

  Pi = probability that a node in cluster Ci is down (= 1%, from the MSP’s assumption that 99% can be offered without HA)
  fi = average yearly failures for a node in cluster Ci (from cloud broker data lakes)
  ti = failover latency with the chosen HA algorithm (from empirical observations)
  ḱi < Ki = maximum number of failed nodes that can be tolerated by the clustering algorithm of Ci
  (If the level of redundancy in a cluster is N+η, then ḱi is η.)

Probability that cluster Ci is up = Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j)

Probability that all clusters in the system are up = Π_{i=1}^{n} [ Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j) ]

Downtime probability of system S due to breakdowns:
  Bs = 1 − Π_{i=1}^{n} [ Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j) ] ------- [2]
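The binomial sum in equation [2] translates directly into code. A minimal Python sketch (function names are mine):

```python
from math import comb

def cluster_up_probability(K, k_tol, p_node_down):
    # P(cluster Ci up): at least K - k_tol of its K nodes must be up.
    # k_tol is the deck's k-acute_i, the number of failed nodes the
    # clustering algorithm tolerates (eta, for N+eta redundancy).
    return sum(comb(K, j) * (1 - p_node_down) ** j * p_node_down ** (K - j)
               for j in range(K - k_tol, K + 1))

def breakdown_downtime_probability(clusters):
    # Equation [2]: Bs = 1 - product over all clusters of P(Ci up),
    # with each cluster given as a (K, k_tol, p_node_down) tuple.
    product = 1.0
    for K, k_tol, p in clusters:
        product *= cluster_up_probability(K, k_tol, p)
    return 1.0 - product

# A 2+1 PowerHA compute cluster at the MSP's 99% per-node uptime:
p_up = cluster_up_probability(3, 1, 0.01)  # 0.999702
```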
Uptime Modeling: Calculating Ds [2/3]

Let ti be the time (in minutes) to fail over if a node in cluster Ci goes down. Let fi be the average number of failures experienced by a node in cluster Ci in a year.

Failover time ti is the sum of (i) the time to detect that the currently active node in cluster Ci is down, i.e. the time before a heartbeat miss is detected, (ii) the time to bring up the failover node if it is on standby, and (iii) the time for the failover node to take over from the primary node.

Since Pi is the probability that a node in cluster Ci is down, it is also the probability that the currently active node in cluster Ci is down. Downtime due to failover transactions in cluster Ci = fi · ti.

[Figure: cloud-hosted clustered IaaS architecture of a system S — clusters C1,* … Cn,*, each cluster Ci of Ki nodes annotated with (Pi, ti, fi)]
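The three-part decomposition of ti can be illustrated with a small sketch; the parameter values below are illustrative assumptions, not measurements from the deal:

```python
def failover_latency(heartbeat_interval, missed_beats, standby_bringup, takeover):
    # t_i = (i) detection time (heartbeat misses before the active node
    # is declared dead) + (ii) time to bring up a standby node +
    # (iii) time for the failover node to take over. All in minutes.
    detection = heartbeat_interval * missed_beats
    return detection + standby_bringup + takeover

# e.g. 10-second heartbeats, node declared dead after 3 misses, warm standby:
t_i = failover_latency(10 / 60, 3, 0.0, 0.2)  # 0.7 minutes
f_i = 4                                       # assumed failures per node-year
downtime_per_year = f_i * t_i                 # the slide's f_i * t_i, in minutes
```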
Uptime Modeling: Calculating Ds [3/3]

However, multiple clusters might be simultaneously experiencing failover transactions, and that time cannot be double-counted. Downtime due to failover transactions in cluster Ci, when no other cluster is experiencing failover transactions, is fi · ti · (Ki − ḱi) · P(X1), where X1 is the event that only cluster Ci in the system is experiencing a failover event and

  P(X1) = Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj)

Note: we ignore the error of counting intra-cluster node failover times when more than ḱi nodes in Ci fail simultaneously. We also disregard the possibility of an unrecoverable error during cluster failover.

Thus, downtime due to failover transactions in cluster Ci when there are no other simultaneous failovers in any other cluster
  = fi · ti · (Ki − ḱi) · Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj) minutes

Downtime due to failover transactions across all clusters = Σ_{i=1..n} fi · ti · (Ki − ḱi) · P(X1) minutes

Downtime probability due to all failover transactions across clusters (525600 being the minutes in a year):
  Fs = Σ_{i=1..n} [ (fi · ti · (Ki − ḱi)) / 525600 ] · Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj) ----- [3]

Applying equations [2] and [3] to equation [1], we get the total downtime probability

  Ds = ( 1 − Π_{i=1..n} [ Σ_{j=Ki−ḱi}^{Ki} C(Ki, j) · (1−Pi)^j · Pi^(Ki−j) ] )
     + ( Σ_{i=1..n} [ (fi · ti · (Ki − ḱi)) / 525600 ] · Π_{j=1..n, j≠i} (1−Pj)^(Kj−ḱj) ) ----- [4]

Total uptime probability Us = 1 − Ds ----- [5]
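Equations [4] and [5] can be checked numerically. The sketch below applies them to the case-study parameters; the ḱi values (N+1 redundancy for every cluster) and function names are my assumptions:

```python
from math import comb

MINUTES_PER_YEAR = 525600

def up_prob(K, k_tol, p):
    # Inner sum of equation [2]: P(cluster up).
    return sum(comb(K, j) * (1 - p) ** j * p ** (K - j)
               for j in range(K - k_tol, K + 1))

def downtime_probability(clusters):
    # Equation [4]: Ds = Bs + Fs, with clusters given as
    # (K, k_tol, P, f, t) tuples and failover time t in minutes.
    Bs = 1.0
    for K, k_tol, p, _, _ in clusters:
        Bs *= up_prob(K, k_tol, p)
    Bs = 1.0 - Bs
    Fs = 0.0
    for i, (K, k_tol, _, f, t) in enumerate(clusters):
        no_other_failover = 1.0  # P(X1): no simultaneous failover elsewhere
        for j, (Kj, kj, pj, _, _) in enumerate(clusters):
            if j != i:
                no_other_failover *= (1 - pj) ** (Kj - kj)
        Fs += (f * t * (K - k_tol) / MINUTES_PER_YEAR) * no_other_failover
    return Bs + Fs

# Case study 1: compute (PowerHA 2+1), storage (RAID-10), network (dual node)
system = [
    (3, 1, 0.01, 1, 30.0),     # 30-minute failover
    (2, 1, 0.01, 2, 10 / 60),  # 10-second failover
    (2, 1, 0.01, 1, 10 / 60),  # 10-second failover
]
Us = 1 - downtime_probability(system)  # ~0.9994, near the deck's 99.945%
```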
Estimating New Uptime with the Additionally Engineered HA

Component # | Uptime without HA (Pi) (estimate, given MSP’s posture) | Average yearly failures (fi) (estimates from a cloud broker) | Failover latency in HA mode (ti) (empirical, from experience) | Proposed HA method
1 | 99% (3 d/y downtime) | 1 | 30 minutes | PowerHA (2+1)
2 | 99% | 2 | 10 seconds | RAID-10
3 | 99% | 1 | 10 seconds | Dual-node cluster

System uptime with this architecture (applying equation [5]): Us = 99.945% (≈ 289 min/y downtime)
[Figure: client environment — client users reach, via the Internet and the MSP gateway (GW), a load-balancer virtual appliance fronting a web tier, two app tiers and a DB tier over a 1 Gbps network, with:
  Component 1: Power-VC virtualized physical nodes (2+1 HA)
  Component 2: 60 TB V7K disks (RAID-10 for HA)
  Component 3: Network elements (dual-node cluster for HA)]
Client Case Study 2: Hosting a Clustered Appliance on a Public Cloud

Problem Statement
1. A global e-commerce retailer who wished to adopt a “Cloud-First” strategy ran specialized HA appliances (Oracle RAC in this case) in the backend for HA (“below the shopping cart”). Migrating the client to a public cloud was ruled out without RAC support
2. RAC clustering of Oracle database servers is not cloud-friendly and hence is not supported on mainstream public clouds, because:
   o Dedicated network links between cluster servers are needed to carry heartbeats; missed heartbeats can generate false negatives and can be disastrous
   o Layer-2 adjacency is required between cluster servers
   o Redundant storage is needed

Solution Approach that was Followed
1. Un-bond the two public cloud interfaces on physical bare-metal hosts; provision heartbeat VLANs on this interface
2. Provisioning bare-metal servers on the same physical rack implies they are L2-adjacent, since switches in a rack are trunked (link aggregation). If cluster servers fall on different racks, open a ticket to get the corresponding switches trunked
3. Use redundant SDS storage (VSAN or Ceph)
4. Make this a “pseudo-standard” building pattern in certain public cloud data centers
Client Case Study 3: Application & MW HA on a Shared Private Cloud

Problem Statement
1. A research facility in Mexico offers “classroom virtualization”, i.e. e-delivery of courses over the public Internet
2. A highly scalable web service is connected to a scalable application tier and a database tier (IBM DB2 in this case), with all three tiers hosted on a “shared private” managed cloud. The DB tier needs to operate in HA mode
3. In general, DB HA is of limited value at VM level, since both database server instances could end up on the same physical host

Solution Approach that was Followed
1. The database servers (IBM DB2) were deployed in HA mode in anti-collocated VMs
2. The database itself was hosted on a shared disk
3. The clustered database could thus tolerate single physical host failures
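On an OpenStack-based shared cloud (one of the private cloud stacks named earlier), the anti-collocation in step 1 can be expressed as an anti-affinity server group. A CLI/config sketch, with assumed image, flavor and VM names:

```shell
# Create an anti-affinity server group so the scheduler places the two
# DB2 VMs on different physical hosts (all names are illustrative).
openstack server group create --policy anti-affinity db2-ha-group
GROUP_ID=$(openstack server group show db2-ha-group -f value -c id)

# Boot each DB2 VM into the group; the scheduler enforces anti-collocation.
openstack server create --image rhel7 --flavor m1.large \
    --hint group="$GROUP_ID" db2-node-1
openstack server create --image rhel7 --flavor m1.large \
    --hint group="$GROUP_ID" db2-node-2
```

The shared disk of step 2 would additionally be a volume attached to both VMs, which is a separate storage-side configuration.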
Concluding Thoughts: Trends on Dependability in Hybrid Clouds
§ In the “Cloud-First” hosting model that enterprises are increasingly adopting, workload dependability fulfilment is more about solution composition than about engineering effort, if the target hosting is on modern-day public and “shared private” clouds.
§ On private clouds, dependability translates to redundancy engineering, but mathematical modeling is almost always needed for a “just enough” design that maps as closely as possible to SLAs.
§ There are also non-technical aspects that influence dependability, such as the standard SLA catalogues of managed service providers and application support teams.
Summary
We discussed:
§ Today’s hybrid cloud landscape & dependability NFRs
§ Availability fulfilment approach statistics from 50 client deals
§ Three client case studies where the required dependability was improvised/engineered in the face of constraints on the cloud
§ Trends on dependability in hybrid clouds