General Bare-metal Provisioning Framework Mikyung Kang, USC/ISI David Kang, USC/ISI Ken Igarashi, NTT Docomo Mana Kaneko, NTT Docomo Hiromichi Ito, Virtual Tech Japan Arata Notsu, Virtual Tech Japan
Page 1: 2012 Fall OpenStack Bare-metal Summit Session

General Bare-metal Provisioning Framework

Mikyung Kang, USC/ISI David Kang, USC/ISI

Ken Igarashi, NTT Docomo Mana Kaneko, NTT Docomo

Hiromichi Ito, Virtual Tech Japan Arata Notsu, Virtual Tech Japan

Page 2: 2012 Fall OpenStack Bare-metal Summit Session

Agenda

o  Speakers/Contributors
o  USC/ISI + NTT Docomo + Virtual Tech Japan: nova-scheduler
o  HP: scalability and CI process
o  NEC: security enhancement for network/volume isolation
o  Calxeda: deployment/scalability
o  USC/ISI: fault-tolerance

General Bare-Metal Provisioning Framework (Summit Session)


Page 3: 2012 Fall OpenStack Bare-metal Summit Session

Copyright©2011 NTT DOCOMO, INC. All rights reserved.

New features on bare-metal provisioning framework

Page 4: 2012 Fall OpenStack Bare-metal Summit Session


Difference between Virtual and Bare-Metal Machines

Ryota Mibu, Ken Igarashi

o  Virtual Machines
   Ø  Hypervisor exists between physical resources and virtual machines
o  Role of Hypervisor
   Ø  Image Provisioning
   Ø  VM's Power Management
   Ø  Network Isolation (VLAN)
   Ø  Volume Isolation (iSCSI)
   Ø  Console Access (VNC)
   Ø  VM's Snapshot
o  Bare-Metal Machines
   Ø  There is no Hypervisor
   Ø  Bare-Metal Machine can access physical resources freely

Need to achieve same security level as virtual environments

[Figure: Virtual Machine (HW: CPU, MEM, HDD, NIC; Host OS; Hypervisor (OpenStack)) and Bare-Metal Machine (HW: CPU, MEM, HDD, NIC), each attached to NW VLAN, NW Storage iSCSI, OS image, and DB]

Page 5: 2012 Fall OpenStack Bare-metal Summit Session


Why Bare-Metal Provisioning?

o  Virtual machine vs. Bare-metal Machine

[Figure: Nova-Compute (Virtual) runs a Hypervisor on a Host OS over HW (CPU, MEM, HDD, NIC) and hosts virtual machines with flavors m1.tiny, m1.medium, m1.large. Nova-Compute (Bare-Metal) reaches bare-metal machines (each with its own CPU, MEM, HDD) over IPMI and PXE; the machines are labeled m1.large, b1.medium, b1.tiny.]

[Figure: database rows. Nova DB: Id : vcpus : memory_m : cpus_used … BM DB: Id : vcpus : memory_m : cpus_used … ipmi_address : ipmi_user : ipmi_password : pxe_mac_address]

Page 6: 2012 Fall OpenStack Bare-metal Summit Session


Legacy Scheduler and Bare-Metal Provisioning

o  A Nova-Compute can report only one set of resources and capabilities to the scheduler
o  Which information should be reported to the scheduler?
   Ø  Aggregated resources:
      ü  Bare-Metal nodes must be homogeneous (CPU arch, # of CPUs, amount of memory)
      ü  Instance must be smaller than the bare-metal machine
   Ø  Maximum resource: report one bare-metal machine's resources
      ü  Cannot provision multiple instances at a time
      ü  Cannot use resources efficiently

[Figure: Nova-Compute (Bare-Metal) reports Resources and Capability to Nova-Scheduler for three bare-metal machines (each with CPU, MEM, HDD), either as Aggregated Resources (CPU, MEM, HDD of all three machines) or as Maximum Resources (CPU, MEM, HDD of one machine)]
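The trade-off between the two reporting options can be illustrated with a small sketch. It is not Nova code: the `Resources` class and the `aggregated` and `maximum` helpers are hypothetical names introduced only for this example, and "maximum" is assumed to mean the largest machine ordered by CPUs, then memory, then disk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource record for one bare-metal machine or one request."""
    cpus: int
    memory_mb: int
    disk_gb: int

    def fits_in(self, other: "Resources") -> bool:
        # A request fits only if every dimension is within the advertised amount.
        return (self.cpus <= other.cpus
                and self.memory_mb <= other.memory_mb
                and self.disk_gb <= other.disk_gb)


def aggregated(nodes):
    """Option 1: advertise the sum of all machines as one pool."""
    return Resources(
        cpus=sum(n.cpus for n in nodes),
        memory_mb=sum(n.memory_mb for n in nodes),
        disk_gb=sum(n.disk_gb for n in nodes),
    )


def maximum(nodes):
    """Option 2: advertise a single machine (assumed here: the largest one)."""
    return max(nodes, key=lambda n: (n.cpus, n.memory_mb, n.disk_gb))


# Three heterogeneous machines and a request larger than any single one.
nodes = [Resources(4, 8192, 100), Resources(4, 8192, 100), Resources(16, 65536, 500)]
request = Resources(20, 70000, 600)

# The aggregated view makes the request look schedulable...
print(request.fits_in(aggregated(nodes)))
# ...although no individual machine can actually host it.
print(any(request.fits_in(n) for n in nodes))
# The maximum view is safe, but hides the two smaller machines from the scheduler.
print(maximum(nodes))
```

With heterogeneous machines the aggregated view over-promises, which is why the slide requires homogeneous nodes for that option; the maximum view is accurate for one machine but exposes only that machine at a time.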

Page 7: 2012 Fall OpenStack Bare-metal Summit Session


Simple Solution

o  Put one Nova-Compute for one Bare-Metal machine
o  1:1 = Nova-Compute : Bare-Metal machine ⇒ Should not configure like this.

[Figure: Nova-Compute1 is paired with BM-machine-1 and Nova-Compute2 with BM-machine-2 (Nova-Compute1 : BM-Machine-1, Nova-Compute2 : BM-Machine-2, …). Labels: "gets all BM-machines information" (BM DB), "capability via messaging" (MQ), Scheduler, Nova DB]

Page 8: 2012 Fall OpenStack Bare-metal Summit Session


Expose BM-Machines to a scheduler

o  Nova-Compute creates/updates "compute_nodes" entries for each BM-Machine
o  Nova-Compute sends a list of capabilities for each BM-Machine
o  Scheduler can choose an appropriate BM-Machine since all BM-Machines are exposed to the scheduler

[Figure: Nova-Compute gets all BM-machines' information from the BM DB and exposes it to the Scheduler two ways: via the compute_nodes table in the Nova DB ((cpu,mem,disk)x3) and via capability messages over the MQ ((extra specs)x3). For an instance request, Scheduler and Nova-Compute communicate over the MQ with "instance_system_metadata Node : ID", followed by PXE boot]

Page 9: 2012 Fall OpenStack Bare-metal Summit Session


Changes in Capability Management

o  ComputeDriver passes a list of capabilities to ComputeManager with nodenames
   Ø  before: {'cpu_arch': 'x86_64', ...}
   Ø  after: [{'node': 'node-1', 'cpu_arch': 'x86_64', ...}, {'node': 'node-2', 'cpu_arch': 'tilera', ...}, ...]
o  ComputeManager sends them to scheduler
   Ø  We can reuse almost all code related to passing capability
o  Scheduler holds the capabilities of the nodes
   Ø  the dictionary keys are changed: "$host" -> "$host/$node"
   Ø  before: {'host-1': {capability}, ...}
   Ø  after: {'host-1/node-1': {capability}, 'host-1/node-2': {capability}, ...}
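The key change on the scheduler side can be sketched as follows. This is an illustration only, not the Nova implementation: `key_capabilities_by_node` is a hypothetical helper, and accepting the old single-dict form alongside the new list form is an assumption made to show both shapes from the slide.

```python
def key_capabilities_by_node(host, capabilities):
    """Return scheduler-side capabilities keyed as on the slide.

    Hypothetical helper. A plain dict (the "before" form) is keyed by "$host";
    a list of per-node dicts (the "after" form) is keyed by "$host/$node".
    """
    if isinstance(capabilities, dict):
        # Old behaviour: one capability dict per compute host.
        return {host: dict(capabilities)}

    keyed = {}
    for caps in capabilities:
        # Each entry names the node it describes under the 'node' key.
        keyed[f"{host}/{caps['node']}"] = dict(caps)
    return keyed


# "after" form: one entry per bare-metal node behind the same compute host.
reported = [
    {'node': 'node-1', 'cpu_arch': 'x86_64'},
    {'node': 'node-2', 'cpu_arch': 'tilera'},
]
for key, caps in key_capabilities_by_node('host-1', reported).items():
    print(key, caps['cpu_arch'])
```

Because only the key format changes, code that passes capability dictionaries around can stay largely as it is, which matches the slide's remark about reuse.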

Page 10: 2012 Fall OpenStack Bare-metal Summit Session


Changes in Resource Tracking

o  ComputeManager holds multiple ResourceTrackers corresponding to the nodes
   Ø  In a dictionary: {'node-1': tracker1, 'node-2': tracker2, ...}
o  ResourceTracker is aware of node
   Ø  holds the name of the node
   Ø  creates/updates an entry in compute_nodes table corresponding to the node
      ü  To store the name, use hypervisor_hostname? or add a new column?
o  Compatibility
   Ø  Under non-bare-metal drivers, the current behavior is unchanged
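A minimal sketch of the per-node tracker arrangement, under stated assumptions: `NodeResourceTracker` and `ToyComputeManager` are hypothetical stand-ins rather than the Nova classes, and the compute_nodes table is modeled as a plain dictionary keyed by (host, nodename), which sidesteps the open question of which column stores the node name.

```python
class NodeResourceTracker:
    """Hypothetical tracker that knows which node it is responsible for."""

    def __init__(self, host, nodename, compute_nodes):
        self.host = host
        self.nodename = nodename
        self._compute_nodes = compute_nodes  # stand-in for the compute_nodes table

    def update(self, resources):
        # Create the node's entry on first use, then refresh its values.
        entry = self._compute_nodes.setdefault((self.host, self.nodename), {})
        entry.update(resources)
        return entry


class ToyComputeManager:
    """Hypothetical manager holding one tracker per node in a dictionary."""

    def __init__(self, host, compute_nodes):
        self.host = host
        self._compute_nodes = compute_nodes
        self._trackers = {}  # {'node-1': tracker1, 'node-2': tracker2, ...}

    def tracker_for(self, nodename):
        # Trackers are created lazily and reused on later calls.
        if nodename not in self._trackers:
            self._trackers[nodename] = NodeResourceTracker(
                self.host, nodename, self._compute_nodes)
        return self._trackers[nodename]

    def update_available_resource(self, per_node_resources):
        for nodename, resources in per_node_resources.items():
            self.tracker_for(nodename).update(resources)


table = {}
manager = ToyComputeManager('host-1', table)
manager.update_available_resource({
    'node-1': {'vcpus': 8, 'memory_mb': 16384},
    'node-2': {'vcpus': 64, 'memory_mb': 4096},
})
print(sorted(table))
```

A driver that manages a single node would simply end up with a one-entry dictionary, which is one way to keep the behavior unchanged for non-bare-metal drivers.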

Page 11: 2012 Fall OpenStack Bare-metal Summit Session


Other Topics

o  ResourceTracker for bare-metal
   Ø  calculates resources in an all-or-nothing manner
   Ø  a custom RT class can be specified by driver or by flag
      ü  under review
o  Best-match scheduling
   Ø  choose the best-suited node rather than the largest one

o  ...
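The two ideas above can be sketched together. This is an assumption-laden illustration, not the code under review: `BareMetalNode`, `free_resources`, and `best_match` are hypothetical names, and "best-suited" is interpreted here as the fitting node with the least leftover capacity, compared by CPUs first, then memory, then disk.

```python
from dataclasses import dataclass


@dataclass
class BareMetalNode:
    """Hypothetical record for one bare-metal machine."""
    name: str
    cpus: int
    memory_mb: int
    disk_gb: int
    in_use: bool = False


def free_resources(node):
    """All-or-nothing accounting: a node in use has nothing left to offer."""
    if node.in_use:
        return (0, 0, 0)
    return (node.cpus, node.memory_mb, node.disk_gb)


def best_match(nodes, cpus, memory_mb, disk_gb):
    """Pick the free node that fits the request with the least leftover."""
    request = (cpus, memory_mb, disk_gb)
    candidates = [
        n for n in nodes
        if not n.in_use
        and all(need <= free for need, free in zip(request, free_resources(n)))
    ]
    if not candidates:
        return None
    # Smallest surplus wins, so large machines stay available for large requests.
    return min(
        candidates,
        key=lambda n: tuple(free - need
                            for need, free in zip(request, free_resources(n))),
    )


nodes = [
    BareMetalNode('small', 2, 4096, 50),
    BareMetalNode('mid', 8, 16384, 200),
    BareMetalNode('big', 32, 131072, 1000),
]
chosen = best_match(nodes, cpus=4, memory_mb=8192, disk_gb=100)
print(chosen.name)
```

Choosing the largest node instead would hand `big` to a request that `mid` can serve, leaving nothing for a later request that only `big` could satisfy.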