Lessons Learned On Upgrades
The importance of HA and automation tools
Frédéric Lepied, Engineering Manager, [email protected]
Emilien Macchi, Senior Software Engineer, [email protected]
Aug 07, 2015
Red Hat Cloud Innovation Practice Engineering
Frédéric Lepied: RCIP Engineering manager
Emilien Macchi: installer team / Puppet PTL
Disclaimer
The examples are taken from former eNovance products and not Red Hat ones.
OpenStack is a wonderful place, but upgrades are not easy.
What is a successful upgrade?
• No new hardware needed
• As little interruption as possible
• Minor and major upgrade support
• Efficient, fast, reproducible process
Roadmap
• Redundant architecture
• Enough free capacity
• Image based deployment
• Automation tooling
Redundant Architecture
Enough free capacity
• Have enough compute resources to migrate instances
• Keep some spare capacity in case of failure
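A rough way to check the free-capacity rule before evacuating a hypervisor is sketched below. The `Hypervisor` class, its fields, and `can_evacuate` are hypothetical names for illustration, and the check only compares aggregate vCPU counts; it ignores memory, disk, and whether each instance actually fits on a single host.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypervisor:
    """Hypothetical inventory record for one compute node."""
    name: str
    total_vcpus: int
    used_vcpus: int

    @property
    def free_vcpus(self) -> int:
        return self.total_vcpus - self.used_vcpus


def can_evacuate(target: Hypervisor, others: List[Hypervisor],
                 spare_vcpus: int = 0) -> bool:
    """Return True if the other hosts can absorb the target's load
    and still keep `spare_vcpus` free in case of failure."""
    free_elsewhere = sum(h.free_vcpus for h in others)
    return free_elsewhere - target.used_vcpus >= spare_vcpus
```

Calling `can_evacuate(node, remaining_nodes, spare_vcpus=4)` before each node upgrade would flag a cluster that is too full to migrate instances safely.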
Image based workflow (recommended)
• Build your images once
• Install using your images
• Upgrade using your images
Build and archive your images
• Build your images in CI
• Use packaging tools (yum, apt, …)
• Compress and archive
• Stamp with a version
• Use Cloud Storage (Swift, Ceph)
Image based deployment
Limit the number of images
• More images = more pain
• Single image with:
  • all packages installed
  • all services disabled at boot
Prohibit packaging tools
• Keep systems:
  • consistent
  • reproducible
  • auditable
• Speed up configuration management
• Allow re-enabling the tools
Upgrade your system with a tool
• APT / YUM:
  • too slow at scale (~20 min / node)
  • need to manage your repositories
• Using eDeploy:
  • very fast at scale (~20 s / node)
  • allows rollbacks
Automation tooling
• Control system upgrade
• Configuration management
• Orchestration
• Automate the workflow
Control system upgrade
We need:
• one command to upgrade one system
• no service restarted or reloaded
• the ability to roll back
What we use:
• eDeploy: a tool to upgrade images with rsync
Configuration management
• Puppet, Chef, Ansible, whatever you like
• “The best tool is the one you already use.”
• But:
  • … you need to update your config
  • … do not manage packages
Orchestrator
• Puppet and Chef are good for configuration
• But you need to orchestrate multiple systems:
  • restart services in the right order
  • upgrade the system at the right time
Upgrade workflow
• Pre-upgrade actions
• Resources evacuation
• Stop OpenStack services
• Stop Infra / system services
• Upgrade packages
• Start Infra / system services
• Start OpenStack services
• Post-upgrade actions
Example: upgrade a compute node
• evacuate virtual machines
• disable nova compute service
• system upgrade
• update config
• service libvirtd restart
• service openstack-nova-compute restart
• enable nova-compute service
• test the service
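The compute node steps above only work if they run strictly in order and stop as soon as one fails. A minimal sketch of that idea follows; `run_workflow` and `COMPUTE_NODE_STEPS` are hypothetical names, and each action is a placeholder callable that you would replace with the real command for that step.

```python
from typing import Callable, List, Optional, Tuple

# Step names taken from the compute node example; the actions are up to you.
COMPUTE_NODE_STEPS = [
    "evacuate virtual machines",
    "disable nova-compute service",
    "system upgrade",
    "update config",
    "restart libvirtd",
    "restart openstack-nova-compute",
    "enable nova-compute service",
    "test the service",
]

Step = Tuple[str, Callable[[], bool]]


def run_workflow(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first one that returns False.

    Returns the names of the completed steps and the name of the
    failed step (None when everything succeeded)."""
    done: List[str] = []
    for name, action in steps:
        if not action():
            return done, name
        done.append(name)
    return done, None
```

Knowing which step failed and which ones completed makes it possible to resume or roll back from the right place instead of restarting the whole node upgrade.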
Ansible snippet example (hypervisor)

- name: evacuate compute node
  script: evacuate-compute.sh
  tags: 2

- name: restart nova-compute
  service: name={{ item }} state=restarted
  with_items:
    - "{{ libvirt }}"
    - "{{ nova_compute }}"
  tags: 8

- name: enable nova-compute service
  script: enable-compute.sh
  tags: 9
Automate the workflow
• Upgrades are repetitive
• Prepare an upgrade without effort
• Prepare Ansible Playbooks with snippets
• Compose Playbooks by computing:
  • what is upgraded in the image
  • which service is running on a node
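One way to picture the composition step: keep only the snippets whose package changed in the new image and whose service runs on the node, then sort them by their ordering tag. The sketch below assumes a hypothetical `Snippet` record and `compose_playbook` function; how the package list and the running services are actually computed is not covered here.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Snippet:
    """Hypothetical description of one reusable Ansible snippet."""
    service: str   # service the snippet restarts or upgrades
    package: str   # package whose upgrade makes the snippet necessary
    order: int     # ordering tag, lower runs first
    path: str      # file holding the snippet


def compose_playbook(snippets: List[Snippet],
                     upgraded_packages: Set[str],
                     running_services: Set[str]) -> List[Snippet]:
    """Select the snippets relevant to this node and order them by tag."""
    selected = [
        s for s in snippets
        if s.package in upgraded_packages and s.service in running_services
    ]
    return sorted(selected, key=lambda s: s.order)
```

A node that does not run a service, or an image that did not touch its package, simply gets a shorter playbook.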
Ansible best practices
• Use tags in snippets to define ordering
• Run HA nodes in serial
• Run compute nodes in parallel
• Use a script for hypervisor evacuation
• Allow playbooks to keep rolling after a failure
• Write a snippet for each service to upgrade
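The serial versus parallel rule can be encoded when a play is generated for each role. The sketch below builds a play as a plain dictionary using Ansible's `serial` play keyword; `HA_ROLES`, `build_play`, and the role names are assumptions for illustration, and writing the result out as YAML is left to the serializer of your choice.

```python
from typing import Any, Dict, List

# Hypothetical role names; adapt to your own inventory groups.
HA_ROLES = {"controller", "loadbalancer"}


def build_play(role: str, tasks: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build one Ansible play for a role.

    HA roles get `serial: 1` so nodes are upgraded one at a time and the
    cluster keeps serving; other roles (e.g. compute) omit it, so Ansible
    runs the hosts of the group in parallel."""
    play: Dict[str, Any] = {"hosts": role, "tasks": list(tasks)}
    if role in HA_ROLES:
        play["serial"] = 1
    return play
```

Generating the play per role keeps the ordering policy in one place rather than scattered across hand-written playbooks.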
Generate Ansible playbooks per role
Conclusion
• Your OpenStack needs HA
• Make sure you have free capacity
• Image based upgrade is a good option
• Orchestration and Configuration Management are key
Thank you!
http://tinyurl.com/ansible-snippets
@EmilienMacchi
@flepied