Columbia University COMS W6998-6: Migration to Cloud
Kay Sripanidkulchai, IBM T.J. Watson Research Center
November 10, 2010
© 2010 IBM Corporation

Migration Technologies and Process Steps
[Figure: migration technologies classified along three axes. Location change: Same DC, Different DC. Live-ness: Cold, Live. Virtualization status: P2P, P2V, V2V, Onto Cloud.]
Live Migration in a LAN
• VMWare VMotion
• Xen Live Migration (NSDI ’05 [1])
• KVM Live Migration, KVM Block Migration
• IBM System p Live Partition Mobility
• Hyper-V
Live Migration Across WANs
• VEE ’07 [2]
• CCGRID ’09
• VIDC ’09
• INM ’07 [4]
• Cisco/VMWare White Paper [3]
Migration to Cloud
• Sigcomm ’10 [5]
• P2V Conversion (VMWare vCenter Converter, PlateSpin Migrate)
Process steps: Discover, Plan and Design, Test, Migrate

Recap of Live Migration
• Demo: migrate the memory, register state, and configuration files of a VM from one hypervisor to another while the VM is running.
[Figure: a VM moves from hypervisor1 to hypervisor2. The two hypervisors are connected by a mgmt/migration network and a production network, and the VM's volume sits on shared SAN.]

Outline “Migration to Cloud”
• Planning migrations, focusing on performance and SLA requirements
  – What to migrate?
  – Which cloud?
• Executing migrations
  – P2V conversions
  – Migrating to EC2

Enterprise vs. individual customers have different requirements [LADIS 09]
Typical Enterprise Application Architecture (layers, top to bottom):
• ITIL System Management Eco-system
• Security and Network Components
• Scalable/High-Availability/DR Architectures
• Enterprise-Class Application Building Blocks (3-Tiered + Messaging + etc.)
• Enterprise-Class Hardware
Typical Small/Individual Application Architecture (layers, top to bottom):
• ? ? ?
• Application Building Blocks (3-Tiered)
• Commodity Hardware

Enterprise Applications
E.g., payroll, travel and expense reimbursement, customer relationship management, etc.
3-tier Application Structure:
• Front End (FE)
• Business Logic (BL)
• Back End (BE)
[Figure: example deployment with front-end servers FE1 and FE2 and two rows of servers labeled BL1 to BL5.]

There are gaps in service availability requirements for enterprise users [LADIS 09]
[Figure: chart of availability (%), y-axis from 96 to 100, measured in 2007 and 2008 for onkelborg.com, www.karlsborg.se, www.matematikersamfundet.org.se, www.navyfcu.org, www.tobaksfakta.org, search.yahoo.com, www.amazon.com, www.cnn.com, www.ebay.com, and www.walmart.com.]
• Individual/Small: 99.368% (~55 hours downtime/year)
• Enterprise: 99.987% (~1 hour downtime/year)
• State-of-the-art cloud SLA: 99.95%, or ~4 hours downtime/year
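The downtime figures quoted with each availability level follow from simple arithmetic. A minimal sketch, assuming a 365-day year (the function name is illustrative, not from the cited paper):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; leap years are ignored


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the hours of downtime per year implied by an availability figure."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be a percentage in [0, 100]")
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


if __name__ == "__main__":
    levels = [
        ("Individual/Small", 99.368),
        ("Enterprise", 99.987),
        ("Cloud SLA", 99.95),
    ]
    for label, pct in levels:
        print(f"{label}: {downtime_hours_per_year(pct):.1f} hours/year")
```

The three levels work out to roughly 55.4, 1.1, and 4.4 hours per year, which matches the rounded values on the slide.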

Our focus #1: Planning hybrid cloud layouts
• Cost savings, application response times, bandwidth costs
• Scale and complexity of enterprise applications
[Figure: an application whose front-end and back-end components all run in the local data center, next to a hybrid layout in which front-end and back-end components are split between the local data center and the cloud across the Internet. An ACL is marked at a back end.]

Abstracting the planning problem
[Figure: an enterprise running App1 and App2, drawn as a graph of components C0 to C5 (in general Ci, Cj, Ck), with internal (I) and external (E) users.]

Abstracting the planning problem
Given:
• Ni = number of servers in component Ci
• Tij = number of transactions per second along (i,j)
• Sij = average size of transactions along (i,j)
To determine:
• mi = number of servers of component Ci to migrate to the cloud (mi ≤ Ni)

Formulating the planning problem
[Figure: a hybrid layout spanning the local data center and the cloud, with front-end components, back-end components, and a back-end holding sensitive databases.]
• Objective: maximize cost savings on migration
  – Benefits due to hosting servers in the cloud
  – Cost increase/savings related to wide-area Internet communication
• Constraints:
  – Policy constraints
  – Bounds on increase in transaction delay
• Future work:
  – Application availability
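To make the shape of the problem concrete, here is a minimal sketch that searches every candidate plan exhaustively. It is a brute-force illustration, not the optimization method of the SIGCOMM paper. The `savings` and `delay_increase` callbacks, the per-component policy caps, and all numbers in the example are hypothetical.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple

Plan = Tuple[int, ...]  # m_i for each component C_i


def best_plan(
    n_servers: Sequence[int],                 # N_i for each component
    max_migratable: Sequence[int],            # policy cap on m_i (0 pins C_i locally)
    savings: Callable[[Plan], float],         # objective to maximize
    delay_increase: Callable[[Plan], float],  # increase in transaction delay
    delay_bound: float,                       # largest acceptable increase
) -> Optional[Plan]:
    """Try every plan with 0 <= m_i <= min(N_i, cap_i); return the best feasible one."""
    ranges = [range(min(n, cap) + 1) for n, cap in zip(n_servers, max_migratable)]
    best: Optional[Plan] = None
    best_value = float("-inf")
    for plan in product(*ranges):
        if delay_increase(plan) > delay_bound:
            continue  # violates the bound on transaction delay
        value = savings(plan)
        if value > best_value:
            best, best_value = plan, value
    return best


# Hypothetical example: components are (FE, BL, BE); BE holds sensitive
# databases, so policy pins it in the local data center.
def example_savings(plan: Plan) -> float:
    return 100 * plan[0] + 120 * plan[1] + 80 * plan[2]


def example_delay_increase(plan: Plan) -> float:
    return 2 * plan[0] + 3 * plan[1]  # made-up percentage points per server


if __name__ == "__main__":
    print(best_plan((2, 5, 3), (2, 5, 0), example_savings, example_delay_increase, 10))
```

Exhaustive search is only workable for a handful of components, since the number of plans grows as the product of the (Ni + 1) terms.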
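
Partitioning requests after migration
[Figure: before migration, components Ci and Cj in the local DC exchange traffic Ti,j. After migration, each component is split into a local part (CiL, CjL) and a remote part in the cloud (CiR, CjR), with traffic T'iL,jL, T'iL,jR, T'iR,jL, and T'iR,jR between the parts.]
(1) Location-sensitive routing
(2) Location-independent routing
• Split in proportion to the number of servers in CjL and CjR
• Introduces non-linearity in constraints.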
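A minimal sketch of location-independent routing shows where the non-linearity comes from. It assumes, beyond what the slide states, that the servers of Ci also originate transactions in proportion to their number. The function and key names are illustrative.

```python
from typing import Dict


def split_traffic(t_ij: float, n_i: int, m_i: int, n_j: int, m_j: int) -> Dict[str, float]:
    """Split T_ij among the four (local/remote) x (local/remote) server groups.

    m_i of the n_i servers of C_i, and m_j of the n_j servers of C_j, are in the cloud.
    """
    if n_i <= 0 or n_j <= 0:
        raise ValueError("each component needs at least one server")
    if not (0 <= m_i <= n_i and 0 <= m_j <= n_j):
        raise ValueError("m must satisfy 0 <= m <= N")
    src_remote = m_i / n_i        # assumed share of requests issued from the cloud
    dst_remote = m_j / n_j        # share of requests routed to the cloud part of C_j
    src_local = 1.0 - src_remote
    dst_local = 1.0 - dst_remote
    return {
        "iL,jL": t_ij * src_local * dst_local,
        "iL,jR": t_ij * src_local * dst_remote,   # crosses the Internet
        "iR,jL": t_ij * src_remote * dst_local,   # crosses the Internet
        "iR,jR": t_ij * src_remote * dst_remote,
    }


if __name__ == "__main__":
    print(split_traffic(100.0, n_i=2, m_i=1, n_j=5, m_j=3))
```

Each term contains a product of the decision variables mi and mj, so traffic across the Internet is no longer a linear function of the plan.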

Modeling Approach
Model complexity vs. practicality of data collection
Fine-grained models:
• Potentially more accurate
• Model parameters harder to collect
Our Approach:
• Use easily available information (e.g., computation times of components and communication times on links)
• Empirical experience to drive iterative model refinements

Modeling user response times
• Ideally, desirable to bound increase in:
  – Mean response time
  – Response time variations (e.g., 95th-percentile response times)
• Bounding changes to mean delay is relatively easier
  – Linearity of expectations
• Bounding delay variations is harder
  – E.g., need distribution of component service times
  – Feasible to bound changes to variance of response times
    • By conditioning on path taken by transactions
    • Assuming independence of individual component response times, etc.
    • Can be extended to applications with non-path-like transactions
  – Conservative bounds on changes to delay percentiles feasible
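The two ideas above, linearity of expectations for the mean and conditioning on the path for the variance, can be sketched as follows. This is an illustration using the law of total variance under the stated independence assumption, not the paper's exact model. The component names and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Path = Tuple[float, List[str]]  # (probability of the path, elements visited)


def response_time_stats(
    paths: List[Path],
    mean: Dict[str, float],      # mean time spent in each element
    variance: Dict[str, float],  # variance of time spent in each element
) -> Tuple[float, float]:
    """Return (mean, variance) of response time over a mix of transaction paths."""
    if abs(sum(p for p, _ in paths) - 1.0) > 1e-9:
        raise ValueError("path probabilities must sum to 1")
    # Per path: means add by linearity of expectations, and variances add
    # because element times are assumed independent.
    per_path = [
        (p, sum(mean[e] for e in elems), sum(variance[e] for e in elems))
        for p, elems in paths
    ]
    overall_mean = sum(p * m for p, m, _ in per_path)
    # Law of total variance: E[Var(R | path)] + Var(E[R | path]).
    within = sum(p * v for p, _, v in per_path)
    between = sum(p * (m - overall_mean) ** 2 for p, m, _ in per_path)
    return overall_mean, within + between


if __name__ == "__main__":
    means = {"FE": 10.0, "BL": 20.0, "BE": 30.0}
    variances = {"FE": 4.0, "BL": 9.0, "BE": 16.0}
    paths = [(0.5, ["FE", "BL"]), (0.5, ["FE", "BL", "BE"])]
    print(response_time_stats(paths, means, variances))
```

To estimate the effect of a migration plan, one could add a wide-area link element to every path that crosses between the local data center and the cloud, then compare the result with the pre-migration values.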

Benefits/costs on migration
• Benefits due to hosting servers in the cloud
  – Economies of scale, lowered operational expenses
  – Benefit estimates from Armbrust et al. (Berkeley TR, 2009)
  – Benefits dependent on compute or storage servers
• Costs related to Internet communication
  – Linear cost model
  – Matches charging model of EC2, Azure, etc.
• Future extensions:
  – One-time costs of executing migrations
  – Savings due to not provisioning enterprises for peaks
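Under a linear cost model, the Internet communication cost of one edge (i,j) is proportional to the bytes it carries across the wide area. A minimal sketch, assuming a 30-day month and decimal gigabytes. The price and traffic figures in the example are hypothetical, not actual EC2 or Azure rates.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month
BYTES_PER_GB = 1e9                  # assumes decimal gigabytes


def monthly_transfer_cost(tx_per_sec: float, avg_tx_bytes: float, price_per_gb: float) -> float:
    """Cost per month of carrying T_ij transactions/s of average size S_ij over the WAN."""
    if tx_per_sec < 0 or avg_tx_bytes < 0 or price_per_gb < 0:
        raise ValueError("inputs must be non-negative")
    gigabytes = tx_per_sec * avg_tx_bytes * SECONDS_PER_MONTH / BYTES_PER_GB
    return gigabytes * price_per_gb


if __name__ == "__main__":
    # Hypothetical edge: 10 transactions/s of 50 KB each at $0.10 per GB.
    print(f"${monthly_transfer_cost(10, 50_000, 0.10):.2f} per month")
```

Because the model is linear, doubling either the transaction rate or the transaction size doubles the cost.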

Evaluation Goals and Case Studies
• Evaluation goals:
  – Are there scenarios where a hybrid approach makes sense?
  – What are the cost savings associated with going to the cloud?
  – How effective are coarse-grained planning models?
• Case studies:
  – Windows Azure SDK application
  – Campus Enterprise Resource Planning (ERP) application

Experiments on cloud test-bed
• Thumbnail example application
• Two Azure data centers (DCs) represent the local and remote sites
• Internal users: hosts on campus close to the internal DC
• External users: PlanetLab
• Re-engineer application for hybrid cloud deployment

Results
• Plan requirements: increase in mean delay less than 10%, increase in variance less than 50%
• Algorithm recommendation: migrate 1 FE and 3 BL servers
• Observed: 17% increase in mean, 12% increase in variance

Conclusions [SIGCOMM 10]
• Hybrid cloud models often make sense
  – Enable cost savings while meeting enterprise policies and application response time requirements
• Planned approach to migration important and feasible
  – Algorithms for hybrid cloud layouts
  – Algorithms for correct reconfiguration of security policies
• Future work
  – Exploring model complexity and performance inaccuracy
  – Wider range of application case studies
  – Take workload and network dynamics into account

Outline “Migration to Cloud”
• Planning migrations, focusing on performance and SLA requirements
  – What to migrate?
  – Which cloud?
• Executing migrations
  – P2V conversions
  – Migrating to EC2

Which cloud provider is best suited for my application? [HotCloud 10]
• Reason #1: clouds have different service models
  – Infrastructure-as-a-Service
  – Platform-as-a-Service
  – A mixture of both
• Reason #2: clouds offer different charging schemes
  – Pay per instance-hour
  – Pay per CPU cycle
• Reason #3: applications have different characteristics
  – Storage intensive
  – Computation intensive
  – Network latency sensitive
• Reason #4: high overhead to port application to clouds
  – Different and incompatible APIs
  – Configuration and data migration

How does CloudCmp work?
6/22/2010 HotCloud 2010, Boston
• Step 1: identify the common cloud services
• Step 2: benchmark the services
[Figure: a web application and the common services it relies on: computation service, storage service, intra-cloud network, and wide-area network.]

How does CloudCmp work?
• Step 3: capture realistic application workload
  – Extract the execution path of each request
• Step 4: estimate the performance and costs
  – Combine benchmarking results and workload information
[Figure: a request flows from the frontend to the database and returns as a response, yielding an estimated processing time and an estimated cost.]
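Step 4 can be pictured as summing per-operation benchmark results along a request's execution path. The sketch below assumes a strictly sequential path, which ignores multi-threaded applications. The operation names, provider names, and every number are hypothetical, and this is not CloudCmp's actual implementation.

```python
from typing import Dict, List, Tuple

# Benchmark result per operation: (latency in ms, cost in dollars).
Benchmark = Dict[str, Tuple[float, float]]


def estimate(path: List[str], bench: Benchmark) -> Tuple[float, float]:
    """Sum benchmarked latency and cost over one request's execution path."""
    latency = sum(bench[op][0] for op in path)
    cost = sum(bench[op][1] for op in path)
    return latency, cost


def rank_by_latency(path: List[str], benchmarks: Dict[str, Benchmark]) -> List[str]:
    """Order providers from lowest to highest estimated processing time."""
    return sorted(benchmarks, key=lambda name: estimate(path, benchmarks[name])[0])


# Hypothetical execution path: frontend work, one database read, frontend work.
PATH = ["compute", "storage_get", "compute"]

# Made-up benchmark numbers for two unnamed providers.
BENCHMARKS: Dict[str, Benchmark] = {
    "provider_x": {"compute": (5.0, 0.00001), "storage_get": (40.0, 0.00002)},
    "provider_y": {"compute": (8.0, 0.00001), "storage_get": (15.0, 0.00003)},
}

if __name__ == "__main__":
    for name in rank_by_latency(PATH, BENCHMARKS):
        latency_ms, dollars = estimate(PATH, BENCHMARKS[name])
        print(f"{name}: {latency_ms:.1f} ms, ${dollars:.5f} per request")
```

With these made-up numbers, the provider with faster computation is slower overall because the request is dominated by the storage read.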

Challenges
• How to design the benchmarking tasks?
  – Fair and representative
• How to accurately capture the execution path of a request?
  – An execution path can be complex, across multiple machines
• How to estimate the overall processing time of an application?
  – Applications can be multi-threaded

Results: storage
• Despite X’s good performance in computation, its storage service can be slower than the others
• A cloud may not ace all services

Outline “Migration to Cloud”
• Planning migrations, focusing on performance and SLA requirements
  – What to migrate?
  – Which cloud?
• Executing migrations
  – P2V conversions
  – Migrating to EC2

Executing migrations
[Figure: migration pipeline from a physical server to a virtual machine running in the cloud.]
• Physical server → P2V → virtual machine image
  – Tools: VMWare vCenter Converter, Quest vConverter
• Virtual machine image → convert to cloud-supported format (i.e., AMI)* → virtual machine image
• Virtual machine image → bundle, upload (to S3), register, and launch instance using cloud APIs* → virtual machine running in cloud
• Tools shown for the conversion and upload stages: qemu-img, ec2-bundle-img, ec2-bundle-vol, ec2-upload-bundle, ec2-register
* http://thewebfellas.com/blog/2008/9/1/creating-an-new-ec2-ami-from-within-vmware-or-from-vmdk-files

Reference Material
1. Mohammad Hajjat, Xin Sun, Yu-Wei Sung, Dave Maltz, Sanjay Rao, Kunwadee Sripanidkulchai, and Mohit Tawarmalani. Cloudward Bound: Planning for Beneficial Migration of Enterprise Applications to the Cloud. SIGCOMM 2010.
2. Ang Li, Xiaowei Yang, Srikanth Kandula, and Ming Zhang. CloudCmp: Shopping for a Cloud Made Easy. HotCloud 2010.
3. Timothy Wood, Prashant Shenoy, Arun Venkataramani, and Mazin Yousif. Black-box and Gray-box Strategies for Virtual Machine Migration. NSDI 2007.
4. Kunwadee Sripanidkulchai, Sambit Sahu, Yaoping Ruan, Anees Shaikh, and Chitra Dorai. Are Clouds Ready for Large Distributed Applications? LADIS 2009.

Migration Project Ideas
• Planning live migration within the LAN
  – Algorithms for when to migrate, what to migrate, where to migrate
    • VMWare: Build 2 ESXi hypervisors, run vSphere Enterprise (or above), understand how DRS works, design an algorithm to automate live migration, emulate resource contention to trigger migration, and evaluate the algorithm
    • KVM or Xen: Improve KVM or Xen’s management capabilities to automate live migration by implementing capabilities similar to VMWare’s DRS in libvirt
    • Look at reference [3] for examples of algorithms for inspiration
• Migration to cloud
  – Fast migration of instances from local data center to EC2
    • Build new migration capabilities to migrate virtual machines from your local data center (in whichever image format you like: VMWare, Xen, etc.) to EC2. Look at how to use S3 and image conversion technologies for AMI. See if you can optimize migration performance using caching, deduplication, etc.