Storing VMs with Cinder and Ceph RBD

Dec 04, 2014

Page 1: Storing VMs with Cinder and Ceph RBD.pdf

Storing VMs with Cinder and Ceph RBD

Page 2: Storing VMs with Cinder and Ceph RBD.pdf

Growing With Hardware Appliances

First PB

•  Proprietary storage hardware

• Well-known storage vendor

$14 b’zillion

Second PB

•  Proprietary storage hardware

•  Same storage vendor

Another $14 b’zillion


Page 3: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: grid of DC icons with the label "C++"]

Page 4: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: grid of DC icons with the label "C++ X"]

Page 5: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: grid of DC icons with the label "HUMAN [DEVELOPER] !!"]

Page 6: Storing VMs with Cinder and Ceph RBD.pdf

Hard Drives Are Tiny Record Players and They Fail Often

(Photo: jon_a_ross, Flickr / CC BY 2.0)

Page 7: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: one disk (D) x 1 MILLION = a failed disk (D) 55 times / day]
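The slide's numbers can be read backwards to get the failure rate they imply. This is a minimal sketch, assuming the figure means 55 drive failures per day across a fleet of one million drives; the slide does not state the underlying per-drive rate, so the result is only what these two numbers imply.

```python
# Assumption: "55 times / day" is the number of drive failures per day
# across the 1 million drives shown on the slide.
DRIVES = 1_000_000
FAILURES_PER_DAY = 55

# Scale the daily figure to a year, then express it per drive.
failures_per_year = FAILURES_PER_DAY * 365
implied_annual_failure_rate = failures_per_year / DRIVES

print(f"{failures_per_year} failures per year")
print(f"{implied_annual_failure_rate:.1%} of the fleet per year")
```

At that scale a failed drive is a routine event rather than an emergency, which is the motivation for the self-healing design on the following slides.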

Page 8: Storing VMs with Cinder and Ceph RBD.pdf


Page 9: Storing VMs with Cinder and Ceph RBD.pdf

philosophy
• OPEN SOURCE
• COMMUNITY-FOCUSED

design
• SCALABLE
• NO SINGLE POINT OF FAILURE
• SOFTWARE BASED
• SELF-MANAGING

Page 10: Storing VMs with Cinder and Ceph RBD.pdf

RADOS: A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS: A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD: A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS: A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW: A bucket-based REST gateway, compatible with S3 and Swift

Users shown above the stack: APP, APP, HOST/VM, CLIENT

Page 11: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: five OSDs, each on its own file system (FS: btrfs, xfs, or ext4) on its own DISK, alongside three monitors (M)]

Page 12: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: three monitors (M) and a HUMAN]

Page 13: Storing VMs with Cinder and Ceph RBD.pdf

Monitors (M):
• Maintain the cluster map
• Provide consensus for distributed decision-making
• Should be deployed in odd numbers (a majority must agree to form a quorum)
• Do not serve stored objects to clients

OSDs:
• One per disk (recommended)
• At least three in a cluster
• Serve stored objects to clients
• Intelligently peer to perform replication tasks
• Support object classes
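The odd-number guidance for monitors follows from majority voting. The sketch below is plain arithmetic, not Ceph code, and the helper names are made up for illustration. It shows that adding a fourth monitor to a three-monitor cluster does not increase the number of monitor failures the cluster can survive.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors`."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be lost while a majority still remains."""
    return monitors - quorum_size(monitors)


# Compare cluster sizes: even counts add a machine without adding tolerance.
for n in range(1, 6):
    print(f"{n} monitors: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```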

Page 14: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: APP?? in front of a grid of DC icons]

Page 15: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: APP in front of a grid of DC icons]

Page 16: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: APP in front of a grid of DC icons divided by name range (A-G, H-N, O-T, U-Z), with the example name F*]

Page 17: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: objects shown as bit strings]

hash(object name) % num pg

CRUSH(pg, cluster state, rule set)
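The first of the two steps above, mapping an object name to a placement group (PG), can be sketched in a few lines. This is an illustration only: `zlib.crc32` stands in for whatever hash Ceph actually uses, and the object names and PG count are made up.

```python
import zlib
from collections import Counter


def placement_group(object_name: str, num_pg: int) -> int:
    """hash(object name) % num pg, with CRC32 as a stand-in hash."""
    return zlib.crc32(object_name.encode("utf-8")) % num_pg


# Any client can compute the same answer without asking a lookup service.
NUM_PG = 8  # hypothetical pool size
names = [f"object-{i}" for i in range(1000)]
spread = Counter(placement_group(name, NUM_PG) for name in names)

for pg in sorted(spread):
    print(f"pg {pg}: {spread[pg]} objects")
```

The second step, CRUSH(pg, cluster state, rule set), then decides which OSDs hold that PG.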

Page 18: Storing VMs with Cinder and Ceph RBD.pdf


Page 19: Storing VMs with Cinder and Ceph RBD.pdf

CRUSH
•  Pseudo-random placement algorithm
•  Ensures even distribution
•  Repeatable, deterministic
•  Rule-based configuration
    •  Replica count
    •  Infrastructure topology
    •  Weighting
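The properties listed above can be demonstrated with a much simpler technique than CRUSH itself. The sketch below is not CRUSH: it uses weighted rendezvous (highest-random-weight) hashing as a stand-in, and the cluster layout, OSD names, and function names are all hypothetical. It shows a placement that is pseudo-random, deterministic, weighted, and aware of a replica count and of one level of topology (no two replicas on the same host).

```python
import hashlib
import math

# Hypothetical cluster description: OSD name -> (host, weight).
OSDS = {
    "osd.0": ("host-a", 1.0),
    "osd.1": ("host-a", 1.0),
    "osd.2": ("host-b", 2.0),
    "osd.3": ("host-b", 1.0),
    "osd.4": ("host-c", 1.0),
    "osd.5": ("host-c", 1.0),
}


def _unit_hash(pg: int, osd: str) -> float:
    """Deterministic pseudo-random number strictly between 0 and 1."""
    digest = hashlib.sha256(f"{pg}:{osd}".encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (value + 0.5) / 2**52


def place(pg: int, osds: dict, replicas: int) -> list:
    """Pick `replicas` OSDs for a PG, at most one per host."""
    # Weighted rendezvous score: heavier OSDs win proportionally more often.
    ranked = sorted(
        osds,
        key=lambda osd: -osds[osd][1] / math.log(_unit_hash(pg, osd)),
        reverse=True,
    )
    chosen, used_hosts = [], set()
    for osd in ranked:
        host = osds[osd][0]
        if host in used_hosts:
            continue  # topology rule: spread replicas across hosts
        chosen.append(osd)
        used_hosts.add(host)
        if len(chosen) == replicas:
            break
    return chosen


for pg in range(4):
    print(pg, place(pg, OSDS, replicas=3))
```

Because the result depends only on the PG number and the cluster description, every client that holds the same cluster description computes the same placement.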

Page 20: Storing VMs with Cinder and Ceph RBD.pdf


CLIENT

??

Page 21: Storing VMs with Cinder and Ceph RBD.pdf


Page 22: Storing VMs with Cinder and Ceph RBD.pdf


CLIENT

??

Page 23: Storing VMs with Cinder and Ceph RBD.pdf


Page 24: Storing VMs with Cinder and Ceph RBD.pdf

[Architecture overview repeated from Page 10]

Page 25: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: an APP linked with LIBRADOS speaks the native protocol to the cluster and its three monitors (M)]

Page 26: Storing VMs with Cinder and Ceph RBD.pdf


LIBRADOS

•  Provides direct access to RADOS for applications

•  C, C++, Python, PHP, Java

• No HTTP overhead
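A minimal sketch of direct access through the Python binding follows. It assumes the `rados` module that ships with Ceph is installed and that a cluster is reachable; the pool name `data`, the object name, and the configuration path are placeholders, and the helper names `roundtrip` and `main` are made up for this example.

```python
def roundtrip(ioctx, name: str, data: bytes) -> bytes:
    """Write one object and read it back through an open I/O context."""
    ioctx.write_full(name, data)
    return ioctx.read(name, len(data))


def main(pool: str = "data", conffile: str = "/etc/ceph/ceph.conf") -> None:
    # Imported here so the helper above can be used without Ceph installed.
    import rados

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            print(roundtrip(ioctx, "hello-object", b"hello from librados"))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Calling `main()` on a host with a valid configuration and keyring stores and retrieves the object with no gateway or HTTP layer in between.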

Page 27: Storing VMs with Cinder and Ceph RBD.pdf

[Architecture overview repeated from Page 10]

Page 28: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: APPs speak REST to RADOSGW instances, each of which uses LIBRADOS to speak the native protocol to the cluster and its three monitors (M)]

Page 29: Storing VMs with Cinder and Ceph RBD.pdf


RADOS Gateway:

•  REST-based interface to RADOS

•  Supports buckets, accounting

•  Compatible with S3 and Swift applications

Page 30: Storing VMs with Cinder and Ceph RBD.pdf

[Architecture overview repeated from Page 10]

Page 31: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: a VM in a VIRTUALIZATION CONTAINER that uses LIBRBD and LIBRADOS to reach the cluster and its three monitors (M)]

Page 32: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: two CONTAINERs, each with LIBRBD and LIBRADOS, one VM, and the cluster with its three monitors (M)]

Page 33: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: a HOST with KRBD (KERNEL MODULE), LIBRADOS, and the cluster with its three monitors (M)]

Page 34: Storing VMs with Cinder and Ceph RBD.pdf

RADOS Block Device:
• Storage of virtual disks in RADOS
• Allows decoupling of VMs and containers
• Live migration!
• Images are striped across the cluster
• Thin-provisioning
• Snapshots and cloning
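Striping and thin provisioning can be pictured with a small in-memory model. This is a sketch rather than RBD code: it assumes the simplest layout, in which an image is cut into fixed-size objects (4 MiB here, RBD's default object size) and an object only comes into existence when something is written to it. The class name `SparseImage` is made up for illustration.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MiB


class SparseImage:
    """A virtual disk stored as fixed-size objects that are created lazily."""

    def __init__(self, size: int, object_size: int = OBJECT_SIZE):
        self.size = size
        self.object_size = object_size
        self.objects = {}  # object index -> bytearray, created on first write

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the image")
        pos = 0
        while pos < len(data):
            # Which object holds this byte, and where inside that object?
            index, within = divmod(offset + pos, self.object_size)
            chunk = data[pos:pos + self.object_size - within]
            obj = self.objects.setdefault(index, bytearray(self.object_size))
            obj[within:within + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while len(out) < length:
            index, within = divmod(offset + len(out), self.object_size)
            take = min(self.object_size - within, length - len(out))
            obj = self.objects.get(index)
            # Objects that were never written read back as zeros.
            out += obj[within:within + take] if obj is not None else bytes(take)
        return bytes(out)

    def allocated_bytes(self) -> int:
        return len(self.objects) * self.object_size


image = SparseImage(size=10 * 1024**3)       # a "10 GiB" disk
image.write(OBJECT_SIZE - 2, b"abcd")        # straddles objects 0 and 1
print(sorted(image.objects), image.allocated_bytes())
```

In a real cluster each of those objects is placed independently, which is what spreads one image's I/O across many disks.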

Page 35: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: a VM in a VIRTUALIZATION CONTAINER that uses LIBRBD and LIBRADOS to reach the cluster and its three monitors (M)]

Page 36: Storing VMs with Cinder and Ceph RBD.pdf

HOW DO YOU SPIN UP THOUSANDS OF VMs INSTANTLY AND EFFICIENTLY?

Page 37: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: instant copy. A parent image of 144 and four copies of 0 each, total = 144]

Page 38: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: the CLIENT writes to a copy, which grows to 4 while the parent stays at 144, total = 148]

Page 39: Storing VMs with Cinder and Ceph RBD.pdf

[Diagram: the CLIENT reads from both the copy (4) and the parent (144), total = 148]
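The three diagrams above describe copy-on-write cloning, which can be modeled in a few lines. This is a toy sketch, not the RBD implementation: the `Image` class is hypothetical, the unit behind the slide's 144 and 4 is not stated so the code simply counts blocks, and the parent is assumed never to change after it has been cloned.

```python
class Image:
    """A block image that may share unmodified blocks with a parent."""

    def __init__(self, blocks=None, parent=None):
        self.blocks = dict(blocks or {})  # only the blocks this image owns
        self.parent = parent

    def clone(self) -> "Image":
        # Instant: nothing is copied, the clone just remembers its parent.
        return Image(parent=self)

    def write(self, index: int, data: str) -> None:
        self.blocks[index] = data  # new data lands in the clone only

    def read(self, index: int):
        if index in self.blocks:
            return self.blocks[index]
        if self.parent is not None:
            return self.parent.read(index)  # fall through to shared data
        return None

    def used(self) -> int:
        return len(self.blocks)


parent = Image({i: f"golden-{i}" for i in range(144)})
clones = [parent.clone() for _ in range(4)]
total_after_clone = parent.used() + sum(c.used() for c in clones)

for i in range(4):
    clones[0].write(i, f"private-{i}")
total_after_writes = parent.used() + sum(c.used() for c in clones)

print(total_after_clone, total_after_writes)
```

Reads of untouched blocks are served from the parent and reads of rewritten blocks from the clone, so the clone behaves like a full copy while storing only its differences.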

Page 40: Storing VMs with Cinder and Ceph RBD.pdf

Old-style VM image creation

[Diagram: Nova compute reads template X from Glance (templates) and writes a copy X' to local disk (VM images)]

● ephemeral
● expensive to create

Page 41: Storing VMs with Cinder and Ceph RBD.pdf

Why use block storage?

• Persistent
    • More familiar to users
• Not tied to a single host
    • Decouples compute and storage
    • Enables live migration
• Extra capabilities of storage system
    • Efficient snapshots
    • Different types of storage available
    • Cloning for fast restore or scaling

Page 42: Storing VMs with Cinder and Ceph RBD.pdf

Cinder volume creation

[Diagram: Cinder API asks Cinder volume to "create image from X". The volume driver asks Glance (templates) to "locate X" and receives the "location of X", then issues "read X" to produce X' and returns a "reference to X'".]

flexibility in where VM images are stored

Page 43: Storing VMs with Cinder and Ceph RBD.pdf

Efficient volume creation

[Diagram: Cinder API asks Cinder volume to "create image from X". The volume driver asks Glance (templates) to "locate X" and receives the "location of X", then issues "clone X to X'", receives "X' complete", and returns a "reference to X'".]

fast CoW clone
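The difference between the two flows comes down to one decision in the volume driver: if the image already lives in the same storage cluster, clone it, otherwise read and copy it. The sketch below illustrates that decision only. It is not the Cinder driver's code, the function names are hypothetical, and it assumes Glance reports RBD-backed images with a location of the form `rbd://<cluster fsid>/<pool>/<image>/<snapshot>`.

```python
from urllib.parse import urlparse


def parse_rbd_location(url: str):
    """Return (fsid, pool, image, snapshot), or None for any other location."""
    parsed = urlparse(url)
    if parsed.scheme != "rbd":
        return None
    parts = [parsed.netloc] + parsed.path.strip("/").split("/")
    if len(parts) != 4 or not all(parts):
        return None
    return tuple(parts)


def plan_volume_creation(image_location: str, cluster_fsid: str) -> str:
    """Choose "clone" when the image is in our cluster, else "copy"."""
    location = parse_rbd_location(image_location)
    if location is not None and location[0] == cluster_fsid:
        return "clone"  # fast CoW clone, no data is moved
    return "copy"       # read X and write a full copy X'


# Hypothetical identifiers, for illustration only.
print(plan_volume_creation("rbd://fsid-1/images/X/snap", "fsid-1"))
print(plan_volume_creation("file:///var/lib/glance/images/X", "fsid-1"))
```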

Page 44: Storing VMs with Cinder and Ceph RBD.pdf

Questions?

Josh Durgin

[email protected]

jdurgin on freenode

inktank.com | ceph.com