Cloud Services Innovation Platform (CSIP)
Performance Modeling to Support Multi-Tier Application Deployment to Infrastructure-as-a-Service Clouds
Wes Lloyd, Shrideep Pallickara, Olaf David, James Lyon, Mazdak Arabi, Ken Rojas
November 6, 2012
Colorado State University, Fort Collins, Colorado, USA
UCC 2012: 5th IEEE/ACM International Conference on Utility and Cloud Computing

Outline
- Background
- Research Problem
- Research Questions
- Experimental Setup
- Experimental Results
- Conclusions

Nov 6, 2012 IEEE/ACM UCC 2012 Performance Modeling to Support Multi-Tier Application Deployment to Infrastructure-as-a-Service Clouds

Background

Traditional Application Deployment
[Diagram: object store, single server.]

Application Deployment
[Diagram: an application deployed to an IaaS cloud, with components including object store, geospatial DB, rDBMS, NoSQL DB, file server, services, distributed cache, logging server, and Apache Tomcat.]

Application Component Deployment
[Diagram: the application components of an application stack (app server, rDBMS r/o, rDBMS write, file server, log server, load balancer, distributed cache) are deployed across virtual machine (VM) images (Image 1, Image 2, ..., Image n); the component deployment relates to PERFORMANCE.]

Application Deployments
- n = # components; k = # components per set
- Permutations
- Combinations
- But neither describes partitions of a set!

Bell's Number
Number of ways a set of n elements can be partitioned into non-empty subsets
- n = # components; k = # configs
- 1 VM : 1..n components
[Diagram: the application stack components Model (M), Database (D), File Server (F), Log Server (L), ... are grouped onto VMs in different component deployments (config 1, config 2, ..., config n).]

# of Configurations
n      k
4      15
5      52
6      203
7      877
8      4,140
9      21,147
n      . . .
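
The counts in the table can be reproduced with a short sketch. This is not taken from the slides: the function names `bell_number` and `set_partitions` are illustrative, and the code simply applies the standard Bell triangle and a recursive enumeration of set partitions to the four component labels M, D, F, and L.

```python
from typing import Iterator, List, Sequence


def bell_number(n: int) -> int:
    """Return the n-th Bell number using the Bell triangle."""
    row = [1]  # first row of the triangle; its first entry is B(0) = 1
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


def set_partitions(items: Sequence[str]) -> Iterator[List[List[str]]]:
    """Yield every way to split items into non-empty groups (one group per VM)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a VM of its own.
        yield [[first]] + partition


# Number of configurations for n = 4..9 components.
counts = [bell_number(n) for n in range(4, 10)]

# Every deployment of the four components onto one to four VMs.
deployments = list(set_partitions(["M", "D", "F", "L"]))
for deployment in deployments:
    print(" | ".join("".join(vm) for vm in deployment))
```

Each printed line is one candidate deployment, with `|` separating VMs. The number of lines equals the Bell number for n = 4.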

Research Problem

Problem Statement
How should application components be deployed to VMs?
- Provide high throughput (requests/sec)
- With low resource costs (# of VMs)
- To guide VM image composition
- Avoid resource contention from interfering components

[Diagram: VMs placed on physical machine (PM) resources, relating PERFORMANCE to resource contention and resource surplus.]

Resource Utilization Statistics
CPU
- CPU time
- CPU time in user mode
- CPU time in kernel mode
- CPU idle time
- # of context switches
- CPU time waiting for I/O
- CPU time serving soft interrupts
- Load average (# proc / 60 secs)

Disk
- Disk sector reads
- Disk sector reads completed
- Merged adjacent disk reads
- Time spent reading from disk
- Disk sector writes
- Disk sector writes completed
- Merged adjacent disk writes
- Time spent writing to disk

Network
- Network bytes sent
- Network bytes received

[Diagram: physical machines (PM) hosting VMs.]
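
The slides do not say how these counters were read. Most of them match values that Linux exposes in `/proc/stat`, `/proc/diskstats`, and `/proc/net/dev`, so the following is a minimal sketch under that assumption. The function names, the sample text, and the device and interface names are all hypothetical. The parsers take file contents as strings, so the sketch runs on any platform.

```python
from typing import Dict

# Column order of the aggregate "cpu" line in /proc/stat (units: clock ticks).
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")

# Columns that follow the device name on each /proc/diskstats line.
DISK_FIELDS = (
    "reads_completed", "reads_merged", "sectors_read", "ms_reading",
    "writes_completed", "writes_merged", "sectors_written", "ms_writing",
)


def parse_cpu(stat_text: str) -> Dict[str, int]:
    """Extract CPU counters and the context-switch count from /proc/stat text."""
    stats: Dict[str, int] = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "cpu":
            stats.update(zip(CPU_FIELDS, map(int, parts[1:])))
        elif parts[0] == "ctxt":
            stats["context_switches"] = int(parts[1])
    return stats


def parse_disk(diskstats_text: str, device: str) -> Dict[str, int]:
    """Extract read/write counters for one block device from /proc/diskstats text."""
    for line in diskstats_text.splitlines():
        parts = line.split()
        if len(parts) >= 11 and parts[2] == device:
            return dict(zip(DISK_FIELDS, map(int, parts[3:11])))
    raise KeyError(device)


def parse_net(net_dev_text: str, iface: str) -> Dict[str, int]:
    """Extract byte counters for one interface from /proc/net/dev text."""
    for line in net_dev_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == iface:
            cols = rest.split()
            # 8 receive columns come first; transmit bytes is the 9th column.
            return {"bytes_received": int(cols[0]), "bytes_sent": int(cols[8])}
    raise KeyError(iface)


def delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Counters are cumulative, so usage for a run is after minus before."""
    return {key: after[key] - before[key] for key in before}


# Made-up sample text in the format of each file.
STAT = "cpu  100 0 50 800 20 0 5 0 0 0\ncpu0 100 0 50 800 20 0 5 0 0 0\nctxt 12345\n"
DISK = "   8       0 sda 10 2 300 40 5 1 160 30 0 50 70\n"
NET = "Inter-|   Receive  |  Transmit\n  eth0: 900 9 0 0 0 0 0 0 400 4 0 0 0 0 0 0\n"

cpu = parse_cpu(STAT)
disk = parse_disk(DISK, "sda")
net = parse_net(NET, "eth0")
```

On a Linux host the same functions could be fed the real file contents before and after a model run, with `delta` giving the usage attributable to that run.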

Can Resource Utilization Statistics Model Application Performance?

Research Questions
- RQ1) Which resource utilization statistics are the best predictors?
- RQ2) How should resource utilization data be treated for use in models?
- RQ3) Which modeling techniques are best for predicting application performance and ranking performance of service compositions?

Experimental Setup

RUSLE2 Model
- Revised Universal Soil Loss Equation
- Combines empirical and process-based science
- Prediction of rill and interrill soil erosion resulting from rainfall and runoff
- USDA-NRCS agency standard model
- Used by 3,000+ field offices
- Helps inventory erosion rates
- Sediment delivery estimation
- Conservation planning tool

RUSLE2 Web Service
- Multi-tier client/server application
- RESTful, JAX-RS/Java using JSON objects
- Surrogate for common architectures
[Diagram: OMS3, RUSLE2, PostgreSQL, PostGIS; 1.7+ million shapes; 57k XML files, 305 MB.]

Eucalyptus 2.0 Private Cloud
- (9) Sun X6270 blade servers
- Dual Intel Xeon 4-core 2.8 GHz CPUs
- 24 GB RAM, 146 GB 15k rpm HDDs
- CentOS 5.6 x86_64 (host OS)
- Ubuntu 9.10 x86_64 (guest OS)
- Eucalyptus 2.0
- Amazon EC2 API support
- 8 Nodes (NC), 1 Cloud Controller (CLC, CC, SC)
- Managed mode networking with private VLANs
- Xen hypervisor v3.4.3, paravirtualization

RUSLE2 Components
Virtual Machine: Description
- M (Model): Apache Tomcat 6.0.20, Wine 1.0.1, RUSLE2 Model, Object Modeling System (OMS 3.0)
- D (Database): PostgreSQL 8.4 and PostGIS 1.4.0-2
  - soil data: 1.7 million shapes, 167 million points
  - management data: 98 shapes, 489k points
  - climate data: 31k shapes, 3 million points
  - 4.6 GB for the state of TN
- F (File Server): nginx HTTP server 0.7.62; 57,185 XML files totaling 305 MB
- L (Logger): Codebeamer 5.5 running on 32-bit Apache Tomcat 6.0; custom REST/JSON logging service as wrapper

(15) Tested Component Deployments
[Figure: the 15 tested component deployments, SC1 to SC15, each placing the components M, D, F, and L on one or more VMs.]
- Each VM deployed to separate physical machine
- All components installed on composite image
- Script enabled/disabled components to achieve configs

RUSLE2 Application Variants
Variant    Model   Database   File I/O   Overhead   Logging
D-bound    21%     77%        .75%       1%         .1%
M-bound    73%     1%         18%        8%         1%
- D-bound: join w/ a nested query
- M-bound: standard model

Resource Utilization Variance for Component Deployments
[Figure: for deployments SC1 to SC15, boxes for CPU time, disk sector reads, disk sector writes, net bytes rcvd, and net bytes sent.]
- Boxes represent absolute deviation from mean
- Magnitude of variance for deployments

Tested Resource Utilization Variables
Network
- Network bytes sent (nbr)
- Network bytes received (nbs)

CPU
- CPU time
- CPU time in user mode (cpu usr)
- CPU time in kernel mode (cpu krn)
- CPU idle time (cpu_idle)
- # of context switches (contextsw)
- CPU time waiting for I/O (cpu_io_wait)
- CPU time serving soft interrupts (cpu_sint_time)
- Load average (loadavg) (# proc / 60 secs)

Disk
- Disk sector reads (dsr)
- Disk sector reads completed (dsreads)
- Merged adjacent disk reads (drm)
- Time spent reading from disk (readtime)
- Disk sector writes (dsw)
- Disk sector writes completed (dswrites)
- Merged adjacent disk writes (dwm)
- Time spent writing to disk (writetime)

Experimental Data Collection
[Diagram: 20x ensembles of 100 random runs (JSON objects) are sent to the (15) RUSLE2 deployments; a script captures the resource utilization data.]
- 1st run: training dataset
- 2nd run: test dataset

Experimental Results

RQ1 Which are the best predictors? VM Variables
[Figure: VM variables grouped as CPU, Disk I/O, and Network I/O.]

RQ1 Which are the best predictors? PM Variables
[Figure: PM variables grouped as CPU and Network I/O.]

RQ2 How should VM resource utilization data be used by performance models?
- Combination: RU_data = RU_M + RU_D + RU_F + RU_L
- Used individually: RU_data = {RU_M; RU_D; RU_F; RU_L}
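
The two treatments can be sketched as follows. This assumes that "+" means an element-wise sum of each statistic across the four VMs, and that "used individually" means each VM's statistics form their own predictor set. The slides do not spell out either detail. The function names and the sample numbers are made up for illustration.

```python
from typing import Dict

Stats = Dict[str, float]

# Hypothetical resource utilization for one run, keyed by the VM hosting each component.
ru_by_vm: Dict[str, Stats] = {
    "M": {"cpu_usr": 420.0, "dsr": 16.0, "nbs": 300.0},
    "D": {"cpu_usr": 180.0, "dsr": 960.0, "nbs": 120.0},
    "F": {"cpu_usr": 12.0, "dsr": 64.0, "nbs": 40.0},
    "L": {"cpu_usr": 8.0, "dsr": 0.0, "nbs": 20.0},
}


def combine(per_vm: Dict[str, Stats]) -> Stats:
    """RU_data = RU_M + RU_D + RU_F + RU_L: sum each statistic over all VMs."""
    total: Stats = {}
    for stats in per_vm.values():
        for name, value in stats.items():
            total[name] = total.get(name, 0.0) + value
    return total


def individual_feature_sets(per_vm: Dict[str, Stats]) -> Dict[str, Stats]:
    """RU_data = {RU_M; RU_D; RU_F; RU_L}: keep each VM's statistics separate."""
    return {f"RU_{vm}": dict(stats) for vm, stats in per_vm.items()}


combined = combine(ru_by_vm)                 # one predictor set, RU_MDFL
separate = individual_feature_sets(ru_by_vm)  # four predictor sets
```

With the combined treatment a model sees one row of totals per run. With the individual treatment a model can be trained on any single set, for example `separate["RU_M"]`.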

RQ2 How should VM resource utilization data be used by performance models?
[Figure: four panels: D-bound separate, D-bound combined, M-bound separate, M-bound combined.]
- Treating VM data separately was better for D-bound!
- RU_M or RU_MDFL was better for M-bound!
- Note the larger RMSE for D-bound RU_MDFL!

RQ3 Which modeling techniques were best?
- Multiple Linear Regression (MLR)
- Stepwise Multiple Linear Regression (MLR-step)
- Multivariate Adaptive Regression Splines (MARS)
- Artificial Neural Networks (ANNs)
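
As an illustration of the simplest of these techniques, the sketch below fits a multiple linear regression by solving the normal equations in plain Python. It is not the authors' implementation, and the slides do not name the software they used. The function names and the toy data are hypothetical, with two made-up predictors standing in for resource utilization statistics.

```python
from typing import List, Sequence


def fit_mlr(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares fit of y = b0 + b1*x1 + ... + bp*xp via the normal equations."""
    X = [[1.0, *row] for row in rows]  # prepend the intercept column
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coeffs: Sequence[float], row: Sequence[float]) -> float:
    """Evaluate the fitted linear model for one observation."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


# Toy data generated from y = 2 + 3*x1 - 1*x2, so the fit should recover those values.
train_x = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 5.0)]
train_y = [5.0, 1.0, 4.0, 7.0, 6.0]
coeffs = fit_mlr(train_x, train_y)
```

Stepwise MLR, MARS, and ANNs would replace `fit_mlr` with a different fitting procedure while keeping the same train-then-predict shape.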

RQ3 Which modeling techniques were best?
[Figure: four panels: Multiple Linear Regression, Stepwise MLR, Multivariate Adaptive Regression Splines, Artificial Neural Network.]
- RU_MDFL data used to compare models.
- Had high RMSE_test error for D-bound (32% avg)
- Model performance did not vary much

Best vs. Worst
              D-bound   M-bound
RMSE_train    .11%      .08%
RMSE_test     .89%      .08%
rank err      .40       .66
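
The two kinds of error reported here can be sketched in a few lines. RMSE is the standard root-mean-square error. The slides do not define the rank error, so the version below, the mean absolute difference between the observed and predicted rank of each deployment, is an assumption. The function names and sample values are hypothetical.

```python
import math
from typing import List, Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def ranks(values: Sequence[float]) -> List[int]:
    """Rank each value, 1 = smallest; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def mean_rank_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Assumed rank error: average distance between observed and predicted ranks."""
    pairs = zip(ranks(observed), ranks(predicted))
    return sum(abs(a - b) for a, b in pairs) / len(observed)


# Made-up performance values for four deployments.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 31.0]

error = rmse(observed, predicted)
rank_error = mean_rank_error(observed, predicted)
```

In this toy case the model swaps the order of the last two deployments, so two of the four ranks are off by one position.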

Conclusions
- RQ1) CPU statistics were the best predictors
- RQ2) The best treatment of resource utilization statistics was model specific.
  - (RU_MDFL) best for M-Bound RUSLE2 (more I/O)
  - Individual VM stats (e.g. RU_M) best for D-Bound RUSLE2 (more CPU)
- RQ3) ANN and MARS provided lower RMS error. All models adequately predicted performance and ranks

Questions

Extra Slides

Approaches & Gaps
Gaps in Related Work
Existing approaches do not consider
- VM image composition
- Complementary component placements
- Interference among components
- Minimization of resources (# VMs)
- Load balancing of physical resources
Performance models ignore
- Disk I/O
- Network I/O
- VM and component location

Problems & Challenges
[Diagram labels: service requests, load balancers, application servers, noSQL data stores, rDBMS, distributed cache.]
Infrastructure Management
- Scale Services
- Tune Application Parameters
- Tune Virtualization Parameters

Problems & Challenges: Provisioning Variation
[Diagram: request(s) to launch VMs result in an ambiguous mapping of VMs onto physical hosts, which affects PERFORMANCE.]
- VMs reserve PM memory blocks
- VMs share PM CPU / disk / network

Application Profiling Variables: Predictive Power

Application Deployment Challenges
- VM image composition
- Service isolation vs. scalability
- Resource contention among components
- Provisioning variation across physical hardware

Resource Utilization Statistics
- VMs reserve PM memory
- VMs share CPU, disk, and network I/O resources
- VM application performance reflects the quality of load balancing of shared resources
- Resource contention leads to performance degradation
- Resource surplus gives good performance at higher cost

Resource Utilization Variables
P = physical machine (PM), V = virtual machine (VM)

P/V   Statistic       Description
P/V   CPU time        CPU time in ms
P/V   cpu usr         CPU time in user mode in ms
P/V   cpu krn         CPU time in kernel mode in ms
P/V   cpu_idle        CPU idle time in ms
P/V   contextsw       Number of context switches
P/V   cpu_io_wait     CPU time waiting for I/O to complete
P/V   cpu_sint_time   CPU time servicing soft interrupts
V     dsr             Disk sector reads (1 sector = 512 bytes)
V     dsreads         Number of completed disk reads
V     drm             Number of adjacent disk reads merged
V     readtime        Time in ms spent reading from disk
V     dsw             Disk sector writes (1 sector = 512 bytes)
V     dswrites        Number of completed disk writes
V     dwm             Number of adjacent disk writes merged
V     writetime       Time in ms spent writing to disk
P/V   nbr             Network bytes sent
P/V   nbs             Network bytes received
P/V   loadavg         Avg # of running processes in last 60 sec

Experimental Data
- Script captured resource utilization stats
  - Virtual machines
  - Physical machines
- Training data: first complete run
  - 20 different ensembles of 100 model runs
  - 15 component configurations
  - 30,000 model runs (20 x 100 x 15)
- Test data: second complete run
  - 30,000 model runs