Live Container Migration: Opportunities and Challenges
Niroj Pokhrel
Agenda
● Introduction and Background
● Different Migration methods
● Case Study: OpenVZ and Docker
● Conclusion
Introduction
● Powerful CPUs
● Minimal utilization
● Run multiple VMs in same server
Container Vs Virtual Machine
Conventional Cold Migration
● Stop a container
● Copy filesystem to destination server.
● Start the container at destination
★ Involves a downtime
★ Prior Planning Required
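The three cold-migration steps above can be sketched as a toy model. This is only an illustration: `Host` and `cold_migrate` are hypothetical names, hosts are plain in-memory objects, and no real container runtime is involved. Note that only the filesystem travels, so in-memory state is lost and the container is down between "stop" and "start".

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for a server; maps container id -> container record."""
    name: str
    containers: dict = field(default_factory=dict)


def cold_migrate(cid, src, dst):
    """Stop, copy the filesystem, start. Returns the ordered steps performed."""
    steps = []
    container = src.containers[cid]

    container["running"] = False              # 1. stop: downtime begins
    steps.append("stop")

    # 2. copy only the filesystem; memory contents are not preserved
    dst.containers[cid] = {"fs": dict(container["fs"]), "running": False}
    del src.containers[cid]
    steps.append("copy")

    dst.containers[cid]["running"] = True     # 3. start: downtime ends
    steps.append("start")
    return steps


source = Host("source", {"web": {"fs": {"/etc/app.conf": "x=1"}, "running": True}})
destination = Host("destination")
performed = cold_migrate("web", source, destination)
```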
Live Migration
● Move a running container from one server to another without a reboot
● Transparent to user, container source and container destination
Why Live Migration?
● Load Balancing
● Update Kernel/OS
● Replace or Maintain Hardware
● High availability
● Fault tolerance
Live Migration
● Memory Migration
○ Precopy
○ Postcopy
● Network Migration
● Disk Migration
Precopy Migration
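A rough sketch of the iterative idea behind precopy, under simplifying assumptions: memory is counted in pages, the dirty rate and bandwidth are constant, and the function name and parameters are hypothetical. Each round resends the pages dirtied during the previous round; the container is stopped only for the final, small remainder.

```python
def precopy_rounds(total_pages, dirty_rate, bandwidth, stop_threshold, max_rounds=30):
    """Simulate precopy with integer page counts.

    dirty_rate and bandwidth are in pages per second (assumed constant).
    Returns (rounds, pages sent while running, pages left for the stop-and-copy phase).
    """
    to_send = total_pages
    sent_live = 0
    rounds = 0
    while to_send > stop_threshold and rounds < max_rounds:
        sent_live += to_send
        # Pages dirtied while this round was on the wire must be sent again.
        to_send = min(total_pages, dirty_rate * to_send // bandwidth)
        rounds += 1
    return rounds, sent_live, to_send


# Mostly-read workload: the dirty set shrinks every round.
read_heavy = precopy_rounds(10_000, dirty_rate=1_000, bandwidth=10_000, stop_threshold=50)

# Write-heavy workload: pages are dirtied faster than they can be sent,
# so the loop never converges and gives up at max_rounds.
write_heavy = precopy_rounds(10_000, dirty_rate=12_000, bandwidth=10_000, stop_threshold=50)
```

In this model the read-heavy run converges after a few rounds, while the write-heavy run keeps resending the whole memory, which illustrates why write-intensive applications are a poor fit for precopy.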
Postcopy Migration
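A minimal sketch of the postcopy idea, assuming memory is a dictionary of pages and that a missing page on the destination is fetched from the source on first access. `PostcopyDestination` and its methods are hypothetical names, not part of any real tool.

```python
class PostcopyDestination:
    """Toy model: the container already runs here, its pages arrive later."""

    def __init__(self, source_pages):
        self.source = source_pages   # pages still held by the source host
        self.local = {}              # pages that have arrived at the destination
        self.faults = 0              # number of on-demand fetches over the network

    def read(self, page_no):
        if page_no not in self.local:
            # Page fault: pull the page from the source on demand.
            self.faults += 1
            self.local[page_no] = self.source.pop(page_no)
        return self.local[page_no]

    def background_push(self, count):
        """Source proactively pushes up to `count` of the remaining pages."""
        for page_no in list(self.source)[:count]:
            self.local[page_no] = self.source.pop(page_no)


source_memory = {0: b"a", 1: b"b", 2: b"c"}
dst = PostcopyDestination(source_memory)
first = dst.read(1)        # not local yet: one network fault
again = dst.read(1)        # already local: no new fault
dst.background_push(10)    # remaining pages arrive in the background
```

Until the background push finishes, the up-to-date state is split between the two hosts, which is one way to read the destination-failure weakness of postcopy.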
Comparison
                             Precopy   Postcopy
Destination Node Failure     ++        --
Downtime                     --        ++
Up state after migration     ++        --
Write Intensive Application  --
Read Intensive Application   ++
Suspend/Resume Migration
● More secure
● Destination host is inactive during transfer
● Network connections dropped and reestablished at destination
● Disk transfer optimization with disk deltas
● Disconnected operations
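The "disk deltas" optimization can be illustrated with a block-level diff: only blocks that differ from a base image already present at the destination are transferred. This is a sketch under stated assumptions (fixed-size blocks, an image that does not shrink); the function names and the tiny block size are hypothetical.

```python
def split_blocks(image, block_size):
    """Cut a disk image (bytes) into fixed-size blocks."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]


def disk_delta(base, current, block_size=4):
    """Return {block index: new block} for every block that changed or was appended."""
    base_blocks = split_blocks(base, block_size)
    current_blocks = split_blocks(current, block_size)
    return {
        i: block
        for i, block in enumerate(current_blocks)
        if i >= len(base_blocks) or base_blocks[i] != block
    }


def apply_delta(base, delta, block_size=4):
    """Rebuild the current image at the destination from the base plus the delta."""
    blocks = split_blocks(base, block_size)
    for i in sorted(delta):
        if i < len(blocks):
            blocks[i] = delta[i]
        else:
            blocks.append(delta[i])
    return b"".join(blocks)


base_image = b"aaaabbbbcccc"
current_image = b"aaaaBBBBcccc"
delta = disk_delta(base_image, current_image)      # only the middle block differs
rebuilt = apply_delta(base_image, delta)
```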
Record/Replay Migration
● Used for recovering state
● Repeat events from log to get to desired state
● Log only non-deterministic events
● Compute deterministic events on rerun
➢ Maximizing trace completeness
➢ Reduce performance overhead
➢ Reduce log file size
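A small sketch of the record/replay idea: only the non-deterministic inputs are logged, and the deterministic part of the computation is simply rerun. The `run` function and its arithmetic are made up for illustration; a real system would log events such as interrupts, input, or timing.

```python
import random


def run(steps, next_input, log=None):
    """Deterministic computation driven by non-deterministic inputs."""
    state = 0
    for i in range(steps):
        value = next_input()            # non-deterministic event
        if log is not None:
            log.append(value)           # record only this event
        state = state * 31 + value + i  # deterministic: recomputed on replay
    return state


# Record: run once with real randomness and keep the event log.
log = []
rng = random.Random()
original = run(5, lambda: rng.randint(0, 9), log)

# Replay: feed the logged events back in and recompute everything else.
events = iter(log)
replayed = run(5, lambda: next(events))
```

The log holds one small entry per non-deterministic event rather than the full state, which is the lever for keeping log size and recording overhead low.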
Case study: OpenVZ
● Checkpointing and Restoring
● Container is an isolated entity
● Complete state can be saved on disk
Checkpointing and Restoring (C/R)
Source → Destination
● Copy file system
● Checkpoint and save to disk
● Transfer checkpoint
● Restart
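The four C/R steps above (copy file system, checkpoint to disk, transfer checkpoint, restart) can be sketched with ordinary files. This is an assumption-heavy toy: the container state is a JSON-serializable dictionary, the two hosts are two local directories, and all function names are hypothetical. It does not reflect how OpenVZ stores its images.

```python
import json
import shutil
import tempfile
from pathlib import Path


def checkpoint(state, image_path):
    """Save the complete (toy) container state to disk."""
    image_path.write_text(json.dumps(state))


def restore(image_path):
    """Rebuild the container state from a checkpoint image."""
    return json.loads(image_path.read_text())


def migrate(state, src_dir, dst_dir):
    shutil.copytree(src_dir / "rootfs", dst_dir / "rootfs")  # 1. copy file system
    image = src_dir / "checkpoint.json"
    checkpoint(state, image)                                 # 2. checkpoint and save to disk
    shutil.copy(image, dst_dir / "checkpoint.json")          # 3. transfer checkpoint
    return restore(dst_dir / "checkpoint.json")              # 4. restart at destination


with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "src", Path(tmp) / "dst"
    (src / "rootfs").mkdir(parents=True)
    dst.mkdir()
    (src / "rootfs" / "app.conf").write_text("x=1")

    running_state = {"pid": 1, "counter": 42}
    restored_state = migrate(running_state, src, dst)
    copied_conf = (dst / "rootfs" / "app.conf").read_text()
```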
Prerequisites for C/R
● PID Virtualization
● Process group isolation
● Network isolation and virtualization
● Resource virtualization
Important Notes on C/R
● First step in Checkpointing and last step in Restoring is process freezing
● Process freezing is done by setting the TIF_FREEZE thread flag
● Different dependencies should be saved
● hook() is added on top of process stack for restoring
● Roll back possible
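One way to picture "roll back possible": because the source processes are only frozen, not destroyed, a failed transfer or restore can be undone by thawing them on the source. The sketch below is a hypothetical model; `Container`, `dump`, and `transfer_and_restore` are illustrative placeholders, not OpenVZ interfaces.

```python
class Container:
    """Toy container with just the two flags this sketch needs."""

    def __init__(self):
        self.frozen = False
        self.alive = True


def migrate_with_rollback(container, dump, transfer_and_restore):
    container.frozen = True            # freezing is the first checkpointing step
    try:
        image = dump(container)
        transfer_and_restore(image)
    except Exception:
        container.frozen = False       # roll back: resume on the source
        return "rolled back"
    container.alive = False            # success: the source copy can be destroyed
    return "migrated"


def failing_restore(image):
    raise RuntimeError("destination unreachable")


failed = Container()
failed_result = migrate_with_rollback(failed, lambda c: {"mem": "..."}, failing_restore)

moved = Container()
moved_result = migrate_with_rollback(moved, lambda c: {"mem": "..."}, lambda image: None)
```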
Implementation in Linux
● Save and Restore state
● Memory Precopy or Postcopy
● Perform checks
● Implement C/R steps
● Deal with filesystems
Case study: Docker
● CRIU and P.Haul for live migration
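For orientation, the sketch below builds the command lines for a basic CRIU dump and restore of a process tree. It only constructs argument lists and does not run anything: actually invoking CRIU requires the `criu` binary and sufficient privileges. The helper names, PID, and image directory are hypothetical, and the iterative memory transfer that P.Haul orchestrates on top of CRIU is not shown.

```python
def criu_dump_cmd(pid, images_dir, leave_running=False):
    """Argument list for checkpointing the process tree rooted at `pid`."""
    cmd = ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir, "--shell-job"]
    if leave_running:
        # Keep the source processes running after the dump.
        cmd.append("--leave-running")
    return cmd


def criu_restore_cmd(images_dir):
    """Argument list for restoring from the images in `images_dir`."""
    return ["criu", "restore", "--images-dir", images_dir, "--shell-job"]


dump_cmd = criu_dump_cmd(1234, "/tmp/ckpt")      # hypothetical PID and directory
restore_cmd = criu_restore_cmd("/tmp/ckpt")
# On a host with CRIU installed these lists could be passed to subprocess.run(...),
# with the image directory copied to the destination between the two calls.
```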
Conclusion
● Live migration essential for high availability and load balancing
● Many live migration methods present
● Different methods have different opportunities and challenges
● Precopy and postcopy prominent memory migration techniques
● OpenVZ and Docker use C/R technique
References
● Burns, Brendan; Grant, Brian; Oppenheimer, David; Brewer, Eric; Wilkes, John; "Borg, Omega, and Kubernetes", Communications of the ACM 59(5):50-57, April 2016
● Medina, Violeta; Garcia, Juan Manuel; "A Survey of Migration Mechanisms of Virtual Machines", ACM Computing Surveys, Fall 2014, Vol. 46(3), p. 30(33)
● Felter, Wes; Ferreira, Alexandre; Rajamony, Ram; Rubio, Juan; "An Updated Performance Comparison of Virtual Machines and Linux Containers", 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 29-31, 2015, pp. 171-172
● Bussmann, Jens; Grzadkowski, Filip; "Containers with Google: from Borg to Kubernetes", available at http://www.redhatonline.com/pl/wp-content/uploads/2016/05/RH-GOOG_WAW_JensBussmann.pdf
● Emelyanov, Pavel; "Live migrating a container: pros, cons and gotchas", available at http://www.slideshare.net/Docker/live-migrating-a-container-pros-cons-and-gotchas
● Mirkin, Andrey; Kuznetsov, Alexey; Kolyshkin, Kir; "Containers checkpointing and live migration", available at https://landley.net/kdocs/ols/2008/ols2008v2-pages-85-90.pdf
Questions?
Container Features
● Namespace
● Control Groups (cgroups)
● Layered filesystem (Docker specific)