General Bare-metal
Provisioning Framework
Mikyung Kang, USC/ISI David Kang, USC/ISI
Ken Igarashi, NTT docomo Mana Kaneko, NTT docomo
Hiromichi Ito, Virtual Tech Japan Arata Notsu, Virtual Tech Japan
Overview
• Why General Bare-metal Provisioning? (USC/ISI)
  - Why Bare-metal provisioning?
  - OpenStack Bare-metal History: Essex – Folsom – Grizzly
  - Bare-metal Provisioning Framework
  - Bare-metal Release Plan
• Bare-metal provisioning support? (NTT docomo)
  - Instance Request
  - Nova-compute Selection
  - Image Provisioning
  - Network Isolation
  - Nova-volume Attachment
  - VNC Access
  - Snapshot
General Bare-Metal Provisioning Framework (Speaker Session)
Why Bare-metal Provisioning?
• Manage Bare-metal Machines using OpenStack
  - Real-time Analysis
  - Various CPU support
[Figure: Virtual Machines and Bare-Metal Machines, both under "Management using OpenStack"]
Why Bare-metal Provisioning?
• Difference between VM and Bare-metal Machines
  - Virtual Machines
    - A hypervisor sits between physical resources and virtual machines
    - It covers image provisioning, VM power management, volume isolation (iSCSI), console access (VNC), and VM snapshots
  - Bare-metal Machines
    - There is no hypervisor
    - A bare-metal machine can access physical resources freely
    - Need to achieve the same security level as virtual environments
[Figure: Virtual Machine: runs on a hypervisor (OpenStack) and host OS over hardware (CPU, MEM, HDD, NIC), with network (VLAN), storage (iSCSI), and OS image DB. Bare-Metal Machine: the same network (VLAN), storage (iSCSI), and OS image DB, with no hypervisor layer.]
Why Bare-metal Provisioning?
• Virtual machine vs. Bare-metal machine instances
[Figure: Bare-metal machine: Nova-Compute with the bare-metal driver aggregates CPU, MEM, and HDD resources and offers bm1.tiny and bm1.medium. Virtual machine: Nova-Compute (virtual) on a host OS and hypervisor over hardware (CPU, MEM, HDD, NIC) offers m1.tiny, m1.medium, and m1.large.]
OpenStack Bare-metal History
Essex Release: April 2012
• Non-PXE Tilera multi-core bare-metal machines
Folsom Release: Sept. 2012
• Non-PXE Tilera multi-core bare-metal machines
• Pending review: PXE support & bare-metal MySQL DB
Grizzly Release: April 2013
• Finish review → merge to upstream: basic functions
• New features including fault-tolerance and security enhancement as well as scheduler changes
OpenStack Bare-metal History
• Initial design for Tilera (Non-PXE) Image Provisioning (TFTP/NFS)
Essex
Folsom
Bare-metal Provisioning Framework
Compute Node w/ bare-metal plugin
nova/virt/
  libvirt/
    driver.py        LibvirtDriver (nova/virt/libvirt/driver.py)
  baremetal/
    driver.py        BareMetalDriver (nova/virt/baremetal/driver.py)
    tilera.py        Non-PXE (TILERA) (nova/virt/baremetal/tilera.py)
    tilera_pdu.py    PDU (nova/virt/baremetal/tilera_pdu.py)
    pxe.py           PXE (nova/virt/baremetal/pxe.py)
    ipmi.py          IPMI (nova/virt/baremetal/ipmi.py)
1. compute_driver = baremetal.driver.BareMetalDriver
2. baremetal_driver = {baremetal.tilera.TILERA | baremetal.pxe.PXE}
3. power_manager = {baremetal.tilera_pdu.Pdu | baremetal.ipmi.Ipmi}
4. instance_type_extra_specs = cpu_arch:xxx (Tilepro64, ARM, x86_64)
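The baremetal_driver and power_manager settings on this slide behave like a string-keyed plug-in choice. The following is only a minimal sketch of that idea: the four classes and the REGISTRY table are hypothetical stand-ins written for illustration, not the Nova code under nova/virt/baremetal/. Only the flag names and dotted values are taken from the slide.

```python
class Tilera:      # hypothetical stand-in for baremetal.tilera.TILERA
    boot = "non-PXE"


class Pxe:         # hypothetical stand-in for baremetal.pxe.PXE
    boot = "PXE"


class Pdu:         # hypothetical stand-in for baremetal.tilera_pdu.Pdu
    power = "PDU"


class Ipmi:        # hypothetical stand-in for baremetal.ipmi.Ipmi
    power = "IPMI"


# Lookup table keyed by the dotted names shown on the slide (assumption:
# a plain dict replaces whatever import mechanism Nova really uses).
REGISTRY = {
    "baremetal.tilera.TILERA": Tilera,
    "baremetal.pxe.PXE": Pxe,
    "baremetal.tilera_pdu.Pdu": Pdu,
    "baremetal.ipmi.Ipmi": Ipmi,
}


def load_plugins(flags):
    """Return (provisioning back-end, power manager) instances for the flags."""
    try:
        backend = REGISTRY[flags["baremetal_driver"]]()
        power = REGISTRY[flags["power_manager"]]()
    except KeyError as exc:
        raise ValueError(f"unknown or missing plug-in: {exc}") from exc
    return backend, power


backend, power = load_plugins({
    "baremetal_driver": "baremetal.pxe.PXE",
    "power_manager": "baremetal.ipmi.Ipmi",
})
print(backend.boot, power.power)
```

Swapping the two flag values for the Tilera and PDU names selects the non-PXE path without touching the calling code, which is the point of the plug-in layout.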
Grizzly
Bare-metal Provisioning Framework
• Bare-metal nova-compute vs. back-end machines
[Figure: Nova-scheduler and Nova-compute w/ bare-metal plug-in. The plug-in registers bare-metal resources and reports bare-metal node information (cpu_arch=*, hypervisor_type=baremetal); labels: Maximum Capability, Homogeneous Capability. Bare-metal back-end: x86_64 BM farm (PXE), ARM BM farm (PXE), Tilera BM farm (non-PXE).]
Bare-metal Provisioning Framework
[Figure (Essex, Folsom): Nova-Compute with the bare-metal driver aggregates CPU, MEM, and HDD resources (bm1.tiny, bm1.medium) and registers bare-metal resources. Nova-Scheduler applies the bare-metal filter on cpu_arch and hypervisor_type.]
  - TEXT, including total number of bare-metal machines
  - Bare-metal MySQL DB: baremetal_sql_connection = mysql://$ID:$Password@$IP/nova_bm
  - Multiple Capabilities
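The bare-metal filter on cpu_arch and hypervisor_type can be pictured as a simple match between the requested extra specs and each host's reported capabilities. This is a sketch under stated assumptions: the host names, the dictionary layout, and the helper names are hypothetical, and exact string equality is assumed as the matching rule.

```python
def host_passes(capabilities, extra_specs):
    """True when every requested key is offered with the same value."""
    return all(capabilities.get(key) == wanted
               for key, wanted in extra_specs.items())


def filter_hosts(hosts, extra_specs):
    """Names of the hosts whose capabilities satisfy the request."""
    return [host["name"] for host in hosts
            if host_passes(host["capabilities"], extra_specs)]


# Hypothetical inventory: two bare-metal back-ends and one virtual host.
HOSTS = [
    {"name": "bm-x86",
     "capabilities": {"cpu_arch": "x86_64", "hypervisor_type": "baremetal"}},
    {"name": "bm-tilera",
     "capabilities": {"cpu_arch": "tilepro64", "hypervisor_type": "baremetal"}},
    {"name": "kvm-1",
     "capabilities": {"cpu_arch": "x86_64", "hypervisor_type": "QEMU"}},
]

request = {"cpu_arch": "tilepro64", "hypervisor_type": "baremetal"}
print(filter_hosts(HOSTS, request))
```

Filtering on both keys keeps a request for x86_64 bare-metal from landing on a virtual x86_64 host that happens to share the architecture.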
Bare-metal Provisioning Framework
[Figure (Grizzly): Nova-Compute with the bare-metal driver aggregates CPU, MEM, and HDD resources (bm1.tiny, bm1.medium). Nova-Scheduler applies the bare-metal filter on cpu_arch and hypervisor_type.]
Bare-metal Release Plan
Grizzly-1: Nov. 22nd
Grizzly-3: Feb. 21st
Copyright©2011 NTT DOCOMO, INC. All rights reserved.
General Bare-Metal Provisioning Framework
Ken Igarashi Mana Kaneko
(NTT docomo Inc.)
Benchmarking
• CPU (Coremark)
• TCP Throughput (Netperf)
• Context Switch (LMBench)
• Ping
[Figure: Coremark score, Baremetal vs. Virtual (axis 0 to 180000)]
[Figure: TCP throughput [Mbps], transmit and receive, for Baremetal, Virtual, and SR-IOV (axis 0 to 10000)]
[Figure: Context switch time [µs] vs. number of processes (2 to 96), Baremetal vs. Virtual]
[Figure: Ping latency [ms] vs. packet size [bytes] (64, 1024, 1500) for Baremetal, SR-IOV, and Virtual]
VM Provisioning Procedure in Nova
[Figure, built up one step per slide: Nova-API, Nova-Scheduler, Glance, three Nova-Compute hosts (hypervisor on host OS) running VMs, and Nova-Volume storage holding USER1 Vol-13, USER1 Vol-14, USER2 Vol-11, and USER2 Vol-12]
1. Instance Request
2. Choose Nova-Compute
3. Image Provisioning
4. Network Isolation
5. Nova-Volume Attachment
6. VNC Access
7. Snapshot (AMI stored in Glance)
Bare-Metal Provisioning Functions
• We need to implement the same functions for bare-metal provisioning
  1. Instance Request – Description for bare-metal machine instances
  2. Choose Nova-Compute – Scheduler for bare-metal machines
  3. Image Provisioning – Turn on/off and deploy images to bare-metal machines
  4. Network Isolation – Create private LAN among bare-metal machines
  5. Nova-Volume Attachment – Provide secure iSCSI access
  6. VNC Access – Provide console access to bare-metal servers
  7. Snapshot – Create new AMI from a running VM
• How to achieve those functions without a hypervisor?
  - Keep compatibility (same API)
  - Less impact to Nova
1. Instance Request
• Create instance types for bare-metal machines
• Bare-metal machine instances have “instance_type_extra_specs”
  - euca-run-instances -t m1.tiny -> Create virtual instance
  - euca-run-instances -t b1.tiny -> Create bare-metal instance
Instance types:
  Name       Id  memory_mb  VCPUS  local_gb
  m1.tiny    1   512        1      40
  m1.medium  2   4096       2      80
  b1.tiny    3   512        1      40
  b1.medium  4   4096       2      80
instance_type_extra_specs:
  Id  key       value
  3   cpu_arch  tilepro64
  4   cpu_arch  x86_64
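The two tables on this slide join on the instance type Id. As a rough illustration, the lookup below treats an instance type as bare-metal when it has a cpu_arch extra spec; that rule, the dictionary layout, and the describe() helper are assumptions made for the sketch, while the rows are copied from the slide.

```python
# Rows copied from the instance type table on the slide.
INSTANCE_TYPES = {
    "m1.tiny":   {"id": 1, "memory_mb": 512,  "vcpus": 1, "local_gb": 40},
    "m1.medium": {"id": 2, "memory_mb": 4096, "vcpus": 2, "local_gb": 80},
    "b1.tiny":   {"id": 3, "memory_mb": 512,  "vcpus": 1, "local_gb": 40},
    "b1.medium": {"id": 4, "memory_mb": 4096, "vcpus": 2, "local_gb": 80},
}

# Rows copied from the extra specs table, keyed by instance type Id.
EXTRA_SPECS = {
    3: {"cpu_arch": "tilepro64"},
    4: {"cpu_arch": "x86_64"},
}


def describe(name):
    """Return (kind, extra_specs) for an instance type name.

    Assumption: a cpu_arch extra spec marks the type as bare-metal.
    """
    instance_type = INSTANCE_TYPES[name]
    specs = EXTRA_SPECS.get(instance_type["id"], {})
    kind = "bare-metal" if "cpu_arch" in specs else "virtual"
    return kind, specs


for type_name in ("m1.tiny", "b1.tiny", "b1.medium"):
    print(type_name, describe(type_name))
```

Because the distinction lives in the extra specs rather than in the API call, the same euca-run-instances request shape serves both kinds of instance.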
2. Choose Nova-Compute (Scheduler)
• Create pseudo Nova-Computes for bare-metal machines
• Filter scheduler can classify virtual and bare-metal machines
[Figure: Nova-API passes requests (b1.tiny, m1.medium, m1.large) to Nova-Scheduler, whose filter scheduler separates bare-metal from virtual. Bare-metal: Nova-Compute with the bare-metal driver aggregates CPU, MEM, and HDD and offers b1.tiny and b1.medium. Virtual: Nova-Compute (virtual) hosts (hypervisor on host OS over CPU, MEM, HDD, NIC) offer m1.tiny, m1.medium, and m1.large.]
3. Image Provisioning (x86_64)
0. Preparation
• Create “kernel + ramdisk” with “baremetal-mkinitrd.sh”, and register them to glance (AKI, ARI)
• Edit nova.conf on nova-compute
    # Specify nova-compute type
    compute_driver=nova.virt.baremetal.driver.BareMetalDriver
    # Driver for nova-compute and power manager
    baremetal_driver=nova.virt.baremetal.pxe.PXE
    power_manager=nova.virt.baremetal.ipmi.Ipmi
    # AKI and ARI for 1st boot
    baremetal_deploy_ramdisk = 843adb6d-e0f8-452d-9a60-d8c883a0983c
    baremetal_deploy_kernel = 7dfd792c-fc85-480e-8d07-7d9b20d58c24
• Run bare-metal deployment servers
  - dnsmasq (PXE server)
  - bm_deploy_server
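Before starting nova-compute it can help to check that all five flags from this slide are present. The sketch below parses the flag lines with Python's standard configparser; placing them under a [DEFAULT] section and the read_flags() helper are assumptions for illustration, and the UUIDs are the example values from the slide.

```python
import configparser

# Flag lines as shown on the slide.
NOVA_CONF = """\
compute_driver=nova.virt.baremetal.driver.BareMetalDriver
baremetal_driver=nova.virt.baremetal.pxe.PXE
power_manager=nova.virt.baremetal.ipmi.Ipmi
baremetal_deploy_ramdisk = 843adb6d-e0f8-452d-9a60-d8c883a0983c
baremetal_deploy_kernel = 7dfd792c-fc85-480e-8d07-7d9b20d58c24
"""

REQUIRED = (
    "compute_driver",
    "baremetal_driver",
    "power_manager",
    "baremetal_deploy_ramdisk",
    "baremetal_deploy_kernel",
)


def read_flags(text):
    """Parse the flag lines and fail loudly if a required one is missing."""
    parser = configparser.ConfigParser(interpolation=None)
    # Assumption: the flags sit in a [DEFAULT] section.
    parser.read_string("[DEFAULT]\n" + text)
    flags = dict(parser.defaults())
    missing = [key for key in REQUIRED if key not in flags]
    if missing:
        raise ValueError("missing flags: " + ", ".join(missing))
    return flags


flags = read_flags(NOVA_CONF)
print(flags["baremetal_driver"], flags["power_manager"])
```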
3. Image Provisioning (x86_64)
1. 1st Boot
2. System Setup
• Request: euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
  (Nova-API -> Nova-Scheduler -> b1.tiny)
• 1st boot: PXE boot from nova-compute/PXE server, using the kernel/ramdisk for the deployment (AKI (deploy), ARI (deploy))
• nova-compute/bm_deploy_server sends the AMI (AMI-bare) to the bare-metal machine via iSCSI
• System setup: read configuration (MAC and IP address) from Nova-Network, then
  1. Create file system (SWAP)
  2. Configure MAC and IP address
  3. Setup PXE for 2nd boot
  4. Reboot
3. Image Provisioning (x86_64)
3. 2nd Boot
• Request: euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
• PXE boot from nova-compute/PXE server, using the kernel/ramdisk for the provisioning (aki-Bare, ari-Bare)
• Boot from local HDD (AMI-bare) as a bare-metal instance
4. Network Isolation
• Virtual Machines
  - Hypervisor checks addresses (IP and MAC), and puts VLAN tag
• Bare-Metal Machines
  - User can change address and VLAN tag freely
[Figure: Virtual machines on a hypervisor: IP address spoofing (pretending to be others) is checked by the hypervisor (OK). Bare-metal machines: MAC, IP address, and VLAN spoofing (pretending to be others) is possible (NG).]
4. Network Isolation (β version)
• Use Quantum – NEC’s Trema + OpenFlow Switch
  - Protect against address spoofing (MAC and IP)
  - Create a private network among instances
• Rules installed via Nova-Compute (bare-metal), Quantum, and the OpenFlow Controller (Trema from NEC):
    of_in_port=<switch’s port> src_mac != <Instance's MAC> -> DROP
    of_in_port=<switch’s port> src_ip != <Instance's IP> -> DROP
    of_in_port=* dst_ip=<Instance's IP> protocol and dst_port allowed by security group -> ALLOW
    of_in_port=* dst_ip=<BROADCAST> protocol and dst_port allowed by security group -> ALLOW
[Figure: bare-metal machines in Security Group A and Security Group B attached to the OpenFlow Switch]
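The four OpenFlow rules on this slide can be read as a small decision procedure. The sketch below evaluates them in Python purely for illustration; it is not Trema or Quantum code. The Packet and Instance types, the use of 255.255.255.255 as the broadcast address, and the default-deny fallthrough are all assumptions not stated on the slide.

```python
from dataclasses import dataclass

BROADCAST = "255.255.255.255"  # assumption: limited broadcast address


@dataclass(frozen=True)
class Packet:
    in_port: int
    src_mac: str
    src_ip: str
    dst_ip: str
    protocol: str
    dst_port: int


@dataclass(frozen=True)
class Instance:
    switch_port: int
    mac: str
    ip: str
    allowed: frozenset  # (protocol, dst_port) pairs from the security group


def decide(pkt, instances):
    """Apply the slide's rules; anything unmatched is dropped (assumption)."""
    # Rules 1 and 2: on an instance's own port, source MAC and IP must match.
    for inst in instances:
        if pkt.in_port == inst.switch_port and (
                pkt.src_mac != inst.mac or pkt.src_ip != inst.ip):
            return "DROP"
    # Rules 3 and 4: unicast or broadcast traffic allowed by security group.
    for inst in instances:
        if pkt.dst_ip in (inst.ip, BROADCAST) and (
                (pkt.protocol, pkt.dst_port) in inst.allowed):
            return "ALLOW"
    return "DROP"


# Hypothetical instances on switch ports 1 and 2, both allowing SSH.
A = Instance(1, "aa:aa:aa:aa:aa:aa", "10.0.0.1", frozenset({("tcp", 22)}))
B = Instance(2, "bb:bb:bb:bb:bb:bb", "10.0.0.2", frozenset({("tcp", 22)}))

spoofed = Packet(1, A.mac, "10.0.0.9", B.ip, "tcp", 22)
print(decide(spoofed, [A, B]))
```

Checking the source fields against the ingress port is what replaces the hypervisor's address check: the switch, not the machine, is the trusted enforcement point.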
5. Nova-Volume Attachment
• Virtual Machines
  - Nova-Volume is transparent to users
• Bare-Metal Machines
  - User can see all Nova-Volumes
[Figure: Nova-Volume storage holds USER2 Vol-13, USER1 Vol-14, USER4 Vol-11, and USER3 Vol-12. Running "iscsiadm -m discovery" inside a virtual machine on a hypervisor: "Don’t work!". Running it on a bare-metal machine: "Can see all the volumes".]
5. Nova-Volume Attachment (β version)
• Use Nova-Compute as a proxy of Nova-Volume
  - Separate Nova-Volume network and provide ACL using CHAP
  1. Isolate iSCSI network
  2. Provide ACL for each bare-metal machine
[Figure: Nova-Volume storage (USER1 Vol-13, USER2 Vol-14, USER3 Vol-11, USER4 Vol-12) on the Nova-Volume Network; Servers A to D reach it through an OpenFlow Switch over the Bare-Metal Nova Volume Network]
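To make the CHAP-based ACL concrete, the sketch below computes a CHAP response as defined in RFC 1994 (MD5 over the identifier byte, the shared secret, and the challenge) and uses it to gate access to a volume. It illustrates CHAP in general and is not the deck's implementation: the ACL table, node names, secrets, and the may_attach() helper are hypothetical.

```python
import hashlib
import hmac
import os

# Hypothetical per-node ACL: which volume a node may attach, and its secret.
ACL = {
    "server-a": {"volume": "Vol-13", "secret": b"secret-for-a"},
    "server-b": {"volume": "Vol-14", "secret": b"secret-for-b"},
}


def chap_response(identifier, secret, challenge):
    """RFC 1994 CHAP response: MD5(identifier || secret || challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def may_attach(node, volume, identifier, challenge, response):
    """Allow only the node's own volume, and only with a valid response."""
    entry = ACL.get(node)
    if entry is None or entry["volume"] != volume:
        return False
    expected = chap_response(identifier, entry["secret"], challenge)
    return hmac.compare_digest(expected, response)


# The target issues a random challenge; the initiator answers with its secret.
challenge = os.urandom(16)
answer = chap_response(7, b"secret-for-a", challenge)
print(may_attach("server-a", "Vol-13", 7, challenge, answer))
```

The secret itself never crosses the wire, so a bare-metal machine that can sniff the isolated iSCSI network still cannot replay a response against a fresh challenge.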
6. VNC Access (β version)
• Provide console access by Serial over LAN (SOL)
• Use Ajax Console (shellinabox)
  - http://code.google.com/p/shellinabox/
[Figure: Nova-Compute reaches the bare-metal machine's serial console via SOL]
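One way a Serial over LAN session could be opened is through the ipmitool command-line client. The slide does not say which IPMI client is used, so treat ipmitool as an assumption, along with the sol_command() helper and the example address. The sketch only builds the argument list and does not run it.

```python
def sol_command(bmc_host, user):
    """Build argv for an ipmitool Serial over LAN session.

    Assumption: ipmitool is the IPMI client. With -E, ipmitool reads the
    password from the IPMI_PASSWORD environment variable, which keeps it
    off the command line.
    """
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 LAN interface, needed for SOL
        "-H", bmc_host,    # address of the machine's management controller
        "-U", user,
        "-E",
        "sol", "activate",
    ]


# Hypothetical management address and user name.
print(" ".join(sol_command("192.0.2.10", "admin")))
```

A web terminal such as shellinabox could then wrap a command like this so that the console is reachable from a browser.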
Bare-Metal Provisioning
1. Instance Request
   - Create new instance type with “extra_specs = bare-metal”
2. Choose Nova-Compute
   - Create new scheduler called “Heterogeneous Scheduler”
3. Image Provisioning
   - Use Intel vPro and IPMI to turn on/off bare-metal machines
4. Network Isolation
   - Use Quantum (OpenFlow) to protect against address spoofing and create a private LAN within a security group
5. Nova Volume Attachment
   - Network ACL (VLAN and CHAP)
6. VNC Access
   - Serial over LAN
7. Snapshot
   - TBD
Libvirt and Bare-Metal Driver
• Compare operations supported by Horizon
  Category  Operation        Libvirt  Bare-Metal
  Instance  Activate         O        O (IPMI)
            Reboot           O        O (IPMI)
            Suspend          O        X
            Terminate        O        O (IPMI)
            MAC/IP Address   O        O (Deploy Ramdisk)
            Floating IP      O        O
            Snapshot         O        X
  Security  Security Groups  O        O (OpenFlow)
            Keypair          O        O
            Console          O (VNC)  △ (SOL)
Scaling the Nova-Compute using Zabbix
General Bare-Metal
Provisioning Framework
Bare-Metal Machine Provisioning
• Manage Bare-Metal Machines same as Virtual Machines
  - Run an instance through OpenStack API
    - euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
• Utilize all the ecosystem created on top of OpenStack
  - Auto-Scaling
[Figure: Virtual Machines and Bare-Metal Machines, both under "Management using OpenStack"]
Auto-Scaling of the Nova-Compute
• Change resources dynamically based on load
[Figure: Common Computing Pool]
How Does Zabbix Scale a Nova-Compute?
[Figure: Zabbix monitoring a Nova-Compute host and its VMs]
• ITEM (Item1, Item2)
  - “Item1” = Total CPUs: host system information (Total CPUs, Total Memory, Total Disk, etc.) from the Zabbix agent
  - “Item2” = Total vCPUs: VM management information (VM's CPU load, Total vCPUs, VM’s Memory, VM’s Disk, etc.) from Libvirt via the Collectd Libvirt Plugin and the Zabbix Plugin
• TRIGGER
  - Scale-out Trigger, Scale-in Trigger
• ACTION
  - Scale-out Action, Scale-in Action
Trigger & Action for scaling the Nova-Compute
Item List
  Item1  Total CPUs
  Item2  Total vCPUs
Trigger List
  Trigger    Expression                                                     Value
  Scale-out  Total vCPUs.ave(60) > Total CPUs                               True: PROBLEM, False: OK
  Scale-in   Total vCPUs.ave(180) < Total CPUs - number of CPUs per server  True: PROBLEM, False: OK
Action List
  Action     Value Status  Operation
  Scale-out  PROBLEM       Execute “euca-run-instances~” command to Nova-api
  Scale-in   PROBLEM       Execute “euca-terminate-instances~” command to Nova-api
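The two trigger expressions reduce to comparing a windowed average of Total vCPUs with the number of physical CPUs. The sketch below mirrors that logic outside Zabbix; the sample format, the helper names, and the reading of ave(60) and ave(180) as averages over the last 60 and 180 seconds are assumptions.

```python
from statistics import mean


def average_over(samples, now, window):
    """Average of values whose timestamp lies in (now - window, now]."""
    recent = [value for stamp, value in samples if now - window < stamp <= now]
    return mean(recent) if recent else None


def evaluate(vcpu_samples, total_cpus, cpus_per_server, now):
    """Return 'scale-out', 'scale-in', or 'ok' for the current samples."""
    out_avg = average_over(vcpu_samples, now, 60)
    in_avg = average_over(vcpu_samples, now, 180)
    # Scale-out: Total vCPUs.ave(60) > Total CPUs
    if out_avg is not None and out_avg > total_cpus:
        return "scale-out"
    # Scale-in: Total vCPUs.ave(180) < Total CPUs - number of CPUs per server
    if in_avg is not None and in_avg < total_cpus - cpus_per_server:
        return "scale-in"
    return "ok"


# Hypothetical samples: (timestamp in seconds, Total vCPUs), every 30 s.
busy = [(t, 40) for t in range(0, 181, 30)]
print(evaluate(busy, total_cpus=32, cpus_per_server=8, now=180))
```

The longer window and the per-server margin on the scale-in side give some hysteresis, so a server is not terminated just after it was added.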
Bare-metal codes for submission
• Updated scheduler and compute for multiple bare-metal capabilities
  - https://review.openstack.org/13920
• Added separate bare-metal MySQL DB
  - https://review.openstack.org/10726
• A script for bare-metal node management
  - https://review.openstack.org/#/c/11366/
• Updated bare-metal provisioning framework
  - https://review.openstack.org/11354
• Added PXE back-end bare-metal
  - https://review.openstack.org/11088
• Added bare-metal host manager
  - https://review.openstack.org/11357
Bare-metal docs
• OpenStack Wiki
  - http://wiki.openstack.org/GeneralBareMetalProvisioningFramework
• OpenStack Source
  - nova/virt/baremetal/docs/*.rst
  - README and installation documents
• The Latest Github branch
  - https://github.com/NTTdocomo-openstack/nova/