OpenNebula at Harvard University FAS Research Computing
John Noss, April 22, 2016
Agenda
About FAS RC
Our OpenNebula setup:
- OpenNebula and Ceph hardware
- Network setup
Our configuration with puppet:
- opennebula-puppet-module
- roles/profiles
- Config within OpenNebula
Context scripts / load testing
Use cases for OpenNebula at RC
Things we’d love to see
Overview of Odyssey
•150 racks spanning 3 data centers across 100 miles using 1 MW power
•60k CPU cores, 1M+ GPU Cores
•25 PB (Lustre, NFS, Isilon, Gluster)
•10 miles of cat 5/6 + IB cabling
•300k lines of Puppet code
•300+ VMs
•2015: 25.7 million jobs, 240 million CPU hours
Agenda
About FAS RC
Our OpenNebula setup:
- OpenNebula and Ceph hardware
- Network setup
Our configuration with puppet:
- opennebula-puppet-module
- roles/profiles
- Config within OpenNebula
Context scripts / load testing
Use cases for OpenNebula at RC
Things we’d love to see
Where we’re coming from
● Previous kvm infrastructure:
  ○ One datacenter
  ○ 4 C6145s (8 blades, 48 core / 64 core, 256GB ram)
  ○ 2 10GbE switches, but active-passive rather than 802.3ad LACP
  ○ 2 R515s with replicated gluster
● VM provisioning process very manual:
  ○ add to dns
  ○ add to cobbler for dhcp
  ○ edit in cobbler web GUI if changing disk, ram, or cpu
  ○ run virt-builder script to provision on a hypervisor (manually selected for load-balancing)
    ■ Full OS install and puppet run from scratch - takes a long time
● Issues:
  ○ Storage issues with gluster: client-side heals (on the kvm hypervisors), VMs going read-only
  ○ Management very manual - changing capacity is manual, etc.
Hardware Setup - OpenNebula
● Hypervisors (nodes):
  ○ 8 Dell R815
    ■ 4 each in 2 datacenters
  ○ 64 core, 256GB ram
  ○ Intel X520 2-port 10GbE, LACP
● Controller:
  ○ Currently one node is serving as controller as well as hypervisor, but the controller function can be moved to a different node manually if the db is on replicated mysql (tested using galera)
Hardware Setup - Ceph
● OSDs:
  ○ 10 Dell R515
    ■ 5 each in 2 primary datacenters
  ○ 16 core, 32GB ram
  ○ 12x 4TB
  ○ Intel X520 2-port 10GbE, LACP
● Mon:
  ○ 5 Dell R415
    ■ 2 each in 2 primary datacenters
    ■ 1 in a 3rd datacenter as a tie-breaker
  ○ 8 core, 32GB ram
  ○ 2x 120GB SSD, raid1 for mon data device
  ○ Intel X520 2-port 10GbE, LACP
● MDS:
  ○ Currently using cephfs for the opennebula system datastore mount
  ○ MDS running on one of the mons
Network Setup
2x Dell Force10 S4810 10GbE switches in each of the 2 primary datacenters (with 2x 10Gb links between datacenters)
2x twinax (one from each switch) to each of the opennebula and ceph nodes, bonded LACP (802.3ad)
Tagged 802.1q vlans for:
1. Admin (ssh, opennebula communication, sunstone, puppet, nagios monitoring, etc; MTU 1500)
2. Ceph-client network (used by clients, i.e. the opennebula hypervisors, to access ceph; routes only to the ceph-client vlans in the other datacenters; MTU 9000)
3. Ceph-cluster network (backend ceph network; routes only to the ceph-cluster vlans in the other datacenters; only on ceph OSDs; MTU 9000)
4. Opennebula guest vm networks
   a. Some in one datacenter only, some span both datacenters
Note that vlan (1) must be tagged in order to have a normal MTU of 1500, because the bond itself has MTU 9000 so that (2) and (3) can use MTU 9000.
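A rough puppet sketch of that bond/vlan MTU arrangement (illustrative only; a Network::Interface type is referenced in the static-routes hiera on the next slide, but the parameter names here are our assumptions rather than any particular module's documented API):

# Sketch: the bond carries MTU 9000; the admin vlan rides on top of it as a tagged interface with MTU 1500
network::interface { 'bond0':
  mtu         => '9000',
  bond_slaves => ['em1', 'em2'],   # assumed slave NIC names
  bond_mode   => '802.3ad',
}
network::interface { 'bond0.100':  # admin vlan, tagged
  mtu       => '1500',
  vlan      => true,
  ipaddress => '172.16.10.5',      # hypothetical address
  netmask   => '255.255.255.0',
}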
Network setup: static routes
profiles::network::datacenter_routes::routes_hash:
  datacenter1:
    ceph-client-datacenter2:
      network: 172.16.20.0/24
      gateway_ip: 172.16.10.1
      gateway_dev: bond0.100
      require: Network::Interface[bond0.100]
    ceph-client-datacenter3:
      network: 172.16.30.0/24
      gateway_ip: 172.16.10.1
      gateway_dev: bond0.100
      require: Network::Interface[bond0.100]
  datacenter2:
    ceph-client-datacenter1:
      network: 172.16.10.0/24
      gateway_ip: 172.16.20.1
      gateway_dev: bond0.200
      require: Network::Interface[bond0.200]
    ceph-client-datacenter3:
      network: 172.16.30.0/24
      gateway_ip: 172.16.20.1
      gateway_dev: bond0.200
      require: Network::Interface[bond0.200]
Resulting routes on a datacenter1 node:
172.16.20.0/24 via 172.16.10.1 dev bond0.100
172.16.30.0/24 via 172.16.10.1 dev bond0.100
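The profile that consumes routes_hash is not shown in the deck; a minimal sketch (the network::route defined type and the datacenter fact are assumptions) could look like:

class profiles::network::datacenter_routes (
  $routes_hash = {},
  $datacenter  = $::datacenter,   # assumed custom fact naming this node's datacenter
) {
  validate_hash($routes_hash)
  # declare only the routes defined for this node's datacenter
  if has_key($routes_hash, $datacenter) {
    create_resources('network::route', $routes_hash[$datacenter])
  }
}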
Agenda
About FAS RC
Our OpenNebula setup:
- OpenNebula and Ceph hardware
- Network setup
Our configuration with puppet:
- opennebula-puppet-module
- roles/profiles
- Config within OpenNebula
Context scripts / load testing
Use cases for OpenNebula at RC
Things we’d love to see
Configuring OpenNebula with puppet
Installation:
● PXE boot - OS installation, runs puppet
● Puppet - bond configuration, tagged vlans, yum repos, opennebula and sunstone passenger installation and configuration
  ○ Combination of local modules and upstream (mysql, apache, galera, opennebula)
● Puppetdb - exported resources to add newly-built hypervisors as onehosts on the controller, and, if using nfs for the system datastore, to add to /etc/exports on the controller and to pick up the mount of /one
Ongoing config management:
● Puppet - adding vnets, addressranges, security groups, datastores (for various ceph pools, etc)
● Can also create onetemplates and onevms (see the sketch below)
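As an illustration of that last point (ours, not from the deck; the hiera key name and the idea of keeping template definitions in hiera are assumptions), the same hiera-plus-create_resources pattern used later for vnets and datastores extends to templates:

# Illustrative only: drive onetemplate resources from hiera the same way the
# controller profile drives onevnets/onedatastores; check the
# opennebula-puppet-module docs for the onetemplate type's real parameters.
$onetemplates = hiera('profiles::opennebula::controller::onetemplates', {})
validate_hash($onetemplates)
create_resources(onetemplate, $onetemplates)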
OpenNebula puppet module
Source: https://github.com/epost-dev/opennebula-puppet-module
Or: https://forge.puppet.com/epostdev/one (not up-to-date currently)
(Deutsche Post E-Post Development)
Puppet module to install and manage opennebula:
● Installs and configures opennebula controller and hypervisors
  ○ Takes care of package installs
  ○ Takes care of adding each hypervisor as a onehost on the controller (using puppetdb)
● Can also be used for ongoing configuration management of resources inside opennebula: onevnets, onesecgroups, etc.
Minimum code to setup opennebula with puppet:
package {'rubygem-nokogiri':
ensure => installed,
} ->
class { '::one':
oned => true,
sunstone => true,
sunstone_listen_ip => '0.0.0.0',
one_version => '4.14',
ssh_priv_key_param => '-----BEGIN RSA PRIVATE KEY-----...',
ssh_pub_key => 'ssh-rsa...',
} ->
onehost { $::fqdn :
im_mad => 'kvm',
vm_mad => 'kvm',
vn_mad => '802.1Q',
}
Notes: the onehost resource is only needed if not using puppetdb; the ssh private key can be encrypted using eyaml if passed via hiera.
Hiera for opennebula config
one::one_version: '4.14.2'
one::enable_opennebula_repo: 'false'
one::ntype: '802.1Q'
one::vtype: 'kvm'
one::puppetdb: true
one::oneid: opennebula_cluster1
one::oned: true
one::oned_port: 2634
one::oneflow: true
one::sunstone: true
one::sunstone_passenger: true
one::sunstone_novnc: true
one::oned::sunstone_sessions: 'memcache'
one::oned::sunstone_logo_png: 'puppet:///modules/profiles/logo.png'
one::oned::sunstone_logo_small_png: 'puppet:///modules/profiles/logo.png'
one::ldap: true
one::backend: mysql
one::oned::db: opennebula
one::oned::db_user: oneadmin
...
one::sched_interval: 10
one::sched_max_host: 10
one::sched_live_rescheds: 1
one::inherit_datastore_attrs:
- DRIVER
one::vnc_proxy_support_wss: 'only'
one::vnc_proxy_cert: "/etc/pki/tls/certs/%{hiera('one::oneid')}_vnc.cer"
one::vnc_proxy_key: "/etc/pki/tls/private/%{hiera('one::oneid')}_vnc.key"
one::kvm_driver_emulator: '/usr/libexec/qemu-kvm'
one::kvm_driver_nic_attrs: '[ filter = "clean-traffic", model="virtio" ]'
...
Puppet Roles/Profiles
Puppet roles/profiles provide a framework to group technology-specific configuration (modules, groups of modules, etc) into profiles, and then combine profiles to make a role for each server or type of server.
- http://www.craigdunn.org/2012/05/239/
- http://garylarizza.com/blog/2014/02/17/puppet-workflow-part-2/
- https://puppet.com/podcasts/podcast-getting-organized-roles-and-profiles
OpenNebula roles
# opennebula base role
class roles::opennebula::base inherits roles::base {
include ::profiles::storage::ceph::client
include ::profiles::opennebula::base
}
# opennebula hypervisor node
class roles::opennebula::hypervisor inherits roles::opennebula::base {
include ::profiles::opennebula::hypervisor
include ::profiles::opennebula::hypervisor::nfs_mount
}
# opennebula controller node
class roles::opennebula::controller inherits roles::opennebula::base {
include ::profiles::opennebula::controller
include ::profiles::opennebula::controller::nfs_export
include ::profiles::opennebula::controller::local_mysql
include ::profiles::opennebula::controller::mysql_db
include ::profiles::opennebula::controller::sunstone_passenger
}
OpenNebula profiles
site/profiles/manifests/opennebula
├── base.pp
├── controller
│ ├── local_mysql.pp
│ ├── mysql_db.pp
│ ├── nfs_export.pp
│ └── sunstone_passenger.pp
├── controller.pp
├── hypervisor
│ ├── nfs_mount.pp
│ └── virsh_secret.pp
└── hypervisor.pp
OpenNebula profiles: NFS mount on hypervisors
class profiles::opennebula::hypervisor::nfs_mount (
$oneid = $::one::oneid,
$puppetdb = $::one::puppetdb,
) {
# exported resource to add myself to /etc/exports on the controller
@@concat::fragment { "export_${oneid}_to_${::fqdn}":
tag => $oneid,
target => '/etc/exports',
content => "/one ${::fqdn}(rw,sync,no_subtree_check,root_squash)\n",
}
# set up mount /one from head node
if $::one::oned == true {
} else {
# not on the head node so mount it
# pull in the mount that the head node exported
Mount <<| tag == $oneid and title == "${oneid}_one_mount" |>>
}
}
Notes: the @@concat::fragment is exported to the controller (to be collected into /etc/exports there); the Mount <<| |>> collects the mount exported by the controller. This has a 2-run dependence before completing successfully, but puppet will continue past the error on the first run.
OpenNebula profiles: NFS export on controller node
class profiles::opennebula::controller::nfs_export (
$oneid = $::one::oneid,
){
concat { '/etc/exports':
ensure => present,
owner => root,
group => root,
require => File['/one'],
notify => Exec['exportfs'],
}
# collect the fragments that have been exported by the hypervisors
Concat::Fragment <<| tag == $oneid and target == '/etc/exports' |>>
# export a mount that the hypervisors will pick up
@@mount { "${oneid}_one_mount":
ensure => 'mounted',
name => '/one',
tag => $oneid,
device => "${::fqdn}:/one",
fstype => 'nfs',
options => 'soft,intr,rsize=8192,wsize=8192',
atboot => true,
require => File['/one'],
}
}
Notes: the Concat::Fragment <<| |>> collects the export lines exported by the hypervisors; the @@mount is exported for the hypervisors to collect.
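The Exec['exportfs'] notified above is not shown in the deck; a minimal equivalent (our assumption) would be:

# Re-export all NFS shares whenever /etc/exports changes; only runs when notified.
exec { 'exportfs':
  command     => '/usr/sbin/exportfs -ra',
  refreshonly => true,
}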
OpenNebula profiles: Cephfs
class profiles::storage::ceph::client (
$fsid = hiera('profiles::storage::ceph::fsid',{}),
$keyrings = {},
$cephfs_keys = {},
$cephfs_kernel_mounts = {},
$mon_hash = hiera('profiles::storage::ceph::mon_hash',{}),
$network_hash = hiera('profiles::storage::ceph::network_hash', {}),
) inherits profiles::storage::ceph::base {
...
create_resources(profiles::storage::ceph::keyring, $keyrings)
create_resources(profiles::storage::ceph::cephfs_key, $cephfs_keys)
create_resources(profiles::storage::ceph::cephfs_kernel_mount, $cephfs_kernel_mounts )
}
[opennebula-node01]# df -h /one
Filesystem Size Used Avail Use% Mounted on
172.16.10.10:6789,172.16.10.11:6789:/one 327T 910G 326T 1% /one
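A profiles::storage::ceph::cephfs_kernel_mount instance presumably boils down to something like the following sketch (the defined type itself is not shown in the deck; the client name, secret file path, and monitor addresses below are illustrative):

# Sketch of a cephfs kernel mount expressed as a plain mount resource
mount { '/one':
  ensure  => mounted,
  device  => '172.16.10.10:6789,172.16.10.11:6789:/one',
  fstype  => 'ceph',
  options => 'name=opennebula,secretfile=/etc/ceph/opennebula.secret,_netdev',
  atboot  => true,
}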
OpenNebula profiles: Local mysql
class profiles::opennebula::controller::local_mysql (
) {
  include ::mysql::server
  # disable PrivateTmp - causes issues with OpenNebula
  file_line { "${::mysql::server::service_name}_disable_privatetmp":
    ensure => present,
    path   => "/usr/lib/systemd/system/${::mysql::server::service_name}.service",
    line   => 'PrivateTmp=false',
    match  => 'PrivateTmp=true',
    notify => [
      Exec['systemctl-daemon-reload'],
      Service['mysqld'],
    ],
  }
}
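Exec['systemctl-daemon-reload'] is assumed to be provided elsewhere (e.g. by the systemd module included on the sunstone slide); if not, a minimal equivalent is:

# Reload systemd unit files after the PrivateTmp edit above; only runs when notified.
exec { 'systemctl-daemon-reload':
  command     => '/usr/bin/systemctl daemon-reload',
  refreshonly => true,
}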
class profiles::opennebula::controller::mysql_db (
$oned_db = hiera('one::oned::db', 'oned'),
$oned_db_user = hiera('one::oned::db_user', 'oned'),
$oned_db_password = hiera('one::oned::db_password', 'oned'),
$oned_db_host = hiera('one::oned::db_host', 'localhost'),
) {
# setup mysql server, local currently, on the master
mysql::db { $oned_db:
user => $oned_db_user,
password => $oned_db_password,
host => $oned_db_host,
grant => ['ALL'],
}
}
OpenNebula profiles: Sunstone passenger
class profiles::opennebula::controller::sunstone_passenger (
$web_ssl_key = 'undef',
$web_ssl_cert = 'undef',
$vnc_ssl_key = 'undef',
$vnc_ssl_cert = 'undef',
) inherits profiles::opennebula::base {
include ::profiles::web::apache
include ::apache::mod::passenger
include ::systemd
# disable PrivateTmp - causes issues with sunstone image uploads
file_line { "${::apache::params::service_name}_disable_privatetmp":
ensure => present,
path => "/usr/lib/systemd/system/${::apache::params::service_name}.service",
line => 'PrivateTmp=false',
match => 'PrivateTmp=true',
notify => [
Exec['systemctl-daemon-reload'],
Service['httpd'],
]
}
...
OpenNebula profiles: Sunstone passenger hiera
one::sunstone: true
one::sunstone_passenger: true
one::sunstone_novnc: true
one::oned::sunstone_sessions: 'memcache'
profiles::opennebula::percentliteral: '%'
profiles::web::apache::vhosts:
  opennebula01:
    vhost_name: <fqdn>
    custom_fragment: 'PassengerUser oneadmin'
    docroot: /usr/lib/one/sunstone/public/
    directories:
      -
        path: /usr/lib/one/sunstone/public/
        override: all
        options: '-MultiViews'
    port: 443
    ssl: true
    ssl_cert: "/etc/pki/tls/certs/%{hiera('one::oneid')}_web_cert.cer"
    ssl_key: "/etc/pki/tls/private/%{hiera('one::oneid')}_web.key"
...
OpenNebula profiles: Sunstone passenger hiera cont.
...
  opennebula01-80to443:
    vhost_name: <fqdn>
    docroot: /var/www/html
    port: 80
    rewrite_rule: "^.*$ https://%{hiera('profiles::opennebula::percentliteral')}{HTTP_HOST}%{hiera('profiles::opennebula::percentliteral')}{REQUEST_URI} [R=301,L]"
apache::mod::passenger::passenger_high_performance: on
apache::mod::passenger::passenger_max_pool_size: 128
apache::mod::passenger::passenger_pool_idle_time: 600
apache::mod::passenger::passenger_max_requests: 1000
apache::mod::passenger::passenger_use_global_queue: 'on'
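These keys are automatic parameter lookups for the apache::mod::passenger class pulled in by the profile; declared as a resource-like class instead, the equivalent would be roughly the following (parameter names should be verified against the installed puppetlabs-apache release):

class { '::apache::mod::passenger':
  passenger_high_performance => 'on',
  passenger_max_pool_size    => 128,
  passenger_pool_idle_time   => 600,
  passenger_max_requests     => 1000,
  passenger_use_global_queue => 'on',
}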
Other puppetized configs: XMLRPC SSL
one::oned_port: 2634
profiles::web::apache::vhosts:
  opennebula-xmlrpc-proxy:
    vhost_name: <fqdn>
    docroot: /var/www/html/ # doesn’t matter, just needs to be there for the vhost
    port: 2633
    ssl: true
    ssl_cert: "/etc/pki/tls/certs/%{hiera('one::oneid')}_xmlrpc_cert.cer"
    ssl_key: "/etc/pki/tls/private/%{hiera('one::oneid')}_xmlrpc.key"
    proxy_pass:
      path: '/'
      url: 'http://localhost:2634/'
file { '/var/lib/one/.one/one_endpoint':
ensure => file,
owner => 'oneadmin',
group => 'oneadmin',
mode => '0644',
content => "http://localhost:${oned_port}/RPC2\n", # localhost doesn't use the ssl port
require => Package['opennebula-server'],
before => Class['one::oned::service'],
}
ONE_XMLRPC=https://<fqdn of controller>:2633/RPC2 # for end user CLI access
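For CLI users the SSL endpoint can be published however is convenient; one option (our own sketch, with a placeholder FQDN) is a profile.d snippet managed by puppet:

# Sketch only: export the SSL XML-RPC endpoint for interactive CLI users
file { '/etc/profile.d/one_xmlrpc.sh':
  ensure  => file,
  mode    => '0644',
  content => "export ONE_XMLRPC=https://opennebula-controller.example.org:2633/RPC2\n",
}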
Agenda
About FAS RC
Our OpenNebula setup:
- OpenNebula and Ceph hardware
- Network setup
Our configuration with puppet:
- opennebula-puppet-module
- roles/profiles
- Config within OpenNebula
Context scripts / load testing
Use cases for OpenNebula at RC
Things we’d love to see
Configuration inside OpenNebula once it’s running
Types provided by opennebula-puppet-module:
onecluster
onedatastore
onehost
oneimage
onesecgroup
onetemplate
onevm
onevnet
onevnet_addressrange
Add vnets, datastores, etc:
profiles::opennebula::controller::onevnets:
  vlan100:
    ensure: present
    bridge: 'br101'
    phydev: 'bond0'
    dnsservers: ['172.16.99.10','172.16.99.11']
    gateway: '172.16.100.1'
    vlanid: '101'
    netmask: '255.255.255.0'
    network_address: '172.16.100.0'
    mtu: '1500'
profiles::opennebula::controller::onevnet_addressranges:
  vlan100iprange:
    ensure: present
    onevnet_name: 'vlan100'
    ar_id: '1' # read only value
    protocol: 'ip4'
    ip_size: '250'
    ip_start: '172.16.100.5'
profiles::opennebula::controller::onesecgroups:
  onesecgroup100:
    description: 'description'
    rules:
      -
        protocol: TCP
        rule_type: OUTBOUND
      -
        protocol: TCP
        rule_type: INBOUND
        ip: '172.16.100.0'
        size: '255'
        range: '22,1024:65535'
profiles::opennebula::controller::onedatastores:
  ceph_datastore:
    ensure: 'present'
    type: 'IMAGE_DS'
    ds_mad: 'ceph'
    tm_mad: 'ceph'
    driver: 'raw'
    disk_type: 'rbd'
    ceph_host: 'ceph-mon1 ceph-mon2'
    ceph_user: 'libvirt-opennebula'
    ceph_secret: '<uuid_name_for_libvirt_secret>'
    pool_name: 'opennebula_pool'
    bridge_list: 'opennebula_controller01'
Create_resources on controller
class profiles::opennebula::controller (
$onevnets = {},
$onevnet_addressranges = {},
$onesecgroups = {},
$onedatastores = {},
$oneid = $::one::oneid,
){
validate_hash($onevnets)
create_resources(onevnet, $onevnets)
validate_hash($onevnet_addressranges)
create_resources(onevnet_addressrange, $onevnet_addressranges)
validate_hash($onesecgroups)
create_resources(onesecgroup, $onesecgroups)
validate_hash($onedatastores)
create_resources(onedatastore, $onedatastores)
...
}
Agenda
About FAS RC
Our OpenNebula setup:
- OpenNebula and Ceph hardware
- Network setup
Our configuration with puppet:
- opennebula-puppet-module
- roles/profiles
- Config within OpenNebula
Context scripts / load testing
Use cases for OpenNebula at RC
Things we’d love to see
Context script to configure diamond and load tests
#!/bin/bash
source /mnt/context.sh
cd /root
yum install -y puppet
puppet module install garethr-diamond
puppet module install stahnma-epel
...
cat > diamond.pp <<EOF
class { 'diamond':
graphite_host => "$GRAPHITE_HOST",
...
EOF
puppet apply diamond.pp
diamond
if echo "$LOAD_TESTS" | grep -q dd ; then
dd if=/dev/urandom of=/tmp/random_file bs=$DD_BLOCKSIZE count=$DD_COUNT
for i in $(seq 1 $DD_REPEATS); do
date >> ddlog
sync; { time { time dd if=/tmp/random_file of=/tmp/random_file_copy ; sync ; } ; } 2>> ddlog
...
Onetemplate context variables & instantiation
Onetemplate update (or in Sunstone):
CONTEXT=[ LOAD_TESTS="$LOAD_TESTS", GRAPHITE_HOST="$GRAPHITE_HOST"...
Instantiate with:
onetemplate instantiate 19 --raw "$( cat paramfile )" --name vmname-%i -m 4
Using paramfile with newline-separated contents:
LOAD_TESTS=dd
GRAPHITE_HOST=172.16.100.12
VAR_NAME2=var_value2
...
Context script to install graphite and grafana
#!/bin/bash
source /mnt/context.sh
MY_HOSTNAME=$(nslookup $ETH0_IP | grep name|sed -e 's/.* //' -e 's/\.$//')
cd /root
yum install -y puppet
puppet module install dwerder-graphite
yum install -y git
git clone https://github.com/bfraser/puppet-grafana.git /etc/puppet/modules/grafana
puppet module install puppetlabs-apache
mkdir /opt/graphite
cat > grafana.pp <<EOF
class {'::apache':
default_vhost => false,
}
apache::vhost { '$MY_HOSTNAME-graphite':
port => '8080',
servername => '$MY_HOSTNAME',
docroot => '/opt/graphite/webapp',
wsgi_application_group => '%{GLOBAL}',
wsgi_daemon_process => 'graphite',
wsgi_daemon_process_options => {
processes => '5',
...
Agenda
About FAS RC
Our OpenNebula setup:
- OpenNebula and Ceph hardware
- Network setup
Our configuration with puppet:
- opennebula-puppet-module
- roles/profiles
- Config within OpenNebula
Context scripts / load testing
Use cases for OpenNebula at RC
Things we’d love to see
Use cases in RC
● Streamlining and centralizing management of VMs
● Creating testing vms: with OpenNebula it is much easier to create and manage the one-off vms needed to test something out (making it less likely that something has to be tested in production)
● Automatically spinning up vms to test code: when making a change in puppet, have a git hook do a test run in temporary opennebula vms for each category of system we have
● Oneflow templates, and HA for client applications by leveraging two datacenters
● Elastic HPC: spin up and down compute nodes as needed
Agenda
About FAS RC
Our OpenNebula setup:
- OpenNebula and Ceph hardware
- Network setup
Our configuration with puppet:
- opennebula-puppet-module
- roles/profiles
- Config within OpenNebula
Context scripts / load testing
Use cases for OpenNebula at RC
Things we’d love to see
Things we’d love to see
● Confining certain vlans to certain hosts without segmenting into clusters (vlans and datastores can be in multiple clusters in 5.0)
● Folders or other groupings on vm list, template list, security groups, etc, to organize large numbers of them in sunstone view (labels coming in 5.0)
● Image resize, not just when launching a VM (coming in 5.0)
● Oneimage upload from CLI - not just specifying a path local to the frontend
● Onefile update from CLI
● Dynamic security groups with auto commit (coming in 5.0)
● Private vlan / router handling (with certain 802.1q vlan ids trunked to hypervisors; coming in 5.0)
● Changelog on onetemplates, onevm actions, etc (it’s possible to see the user in oned.log but not the changes)
● Sunstone: show the VM name, not just the ID, when taking an action such as shutdown
● Sunstone: change the name of “shutdown” to describe what will actually happen for non-persistent VMs
● Sunstone: show the eth0 IP on the vm info page, or add a copy button for the IP on the vm list page
● Move the Ctrl-Alt-Del button away from the X button that closes VNC (or prompt for confirmation)