
Ceph Day Beijing: Experience Sharing about OpenStack and Ceph Integration

Jul 28, 2015


Ceph Community
Transcript
Page 1

Experience sharing about OpenStack and Ceph Integration

吴德新 武宇亭

AWcloud

Page 2

Agenda

AWcloud introduction

Ceph benefits and problems

Performance tuning case study

High concurrency

Cinder backup

Page 3

AWcloud Introduction

An OpenStack startup company founded in 2012

Founder and core team come from Red Hat and IBM

Providing enterprise-grade OpenStack products and services

Core contributors for the OpenStack distributed message broker (ZeroMQ) integration

285 reviews and 34 commits in Kilo release

Page 4

Integration with OpenStack

[Diagram: Ceph and OpenStack integration (http://www.inktank.com/wp-content/uploads/2013/03/Diagram_v3.0_CEEPHOPENTSTACK11-1024x455.png)]

Most widely deployed Cinder backend

Seamless integration

Used in ~40% of deployments per the OpenStack user survey

Large scale deployments observed

The most-used Cinder backend in AWcloud OpenStack deployments

Page 5

Benefit

Decouple VM storage from compute node

Easy live migration, evacuation, and resize

No need to size local disks for each hypervisor

Genuine SDS

Controlled via CRUSH rules

• Infrastructure topology aware

• Adjustable replication

• Weighting

Exposed via Cinder volume types (see the sketch below)
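As a sketch only: one way to surface two CRUSH-ruled pools as Cinder volume types is a multi-backend cinder.conf plus type-key mappings. The backend names, pool names, and type names below are assumptions for illustration, not the configuration described in the talk.

  # cinder.conf: two RBD backends, each bound to a pool whose CRUSH rule targets a disk class
  [DEFAULT]
  enabled_backends = rbd-ssd,rbd-hdd

  [rbd-ssd]
  volume_driver = cinder.volume.drivers.rbd.RBDDriver
  rbd_pool = volumes-ssd
  volume_backend_name = RBD_SSD

  [rbd-hdd]
  volume_driver = cinder.volume.drivers.rbd.RBDDriver
  rbd_pool = volumes-hdd
  volume_backend_name = RBD_HDD

  # expose the backends as user-visible volume types
  cinder type-create ssd
  cinder type-key ssd set volume_backend_name=RBD_SSD
  cinder type-create hdd
  cinder type-key hdd set volume_backend_name=RBD_HDD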

Page 6

Benefit (cont'd)

Copy-on-write clone (illustrated after this list)

Efficient; fast VM provisioning

No concurrent clone limitation -- good for golden images

Light-weight snapshots

Better support for continuous backups

Incremental volume backup

Greater storage efficiency: thin provisioning

Discard support for disk space reclamation

Beyond VM block storage – unified storage for containers, bare metal, object, and file systems
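A minimal sketch of the copy-on-write clone and lightweight snapshot flow at the rbd command level, with made-up pool and image names; the Glance and Cinder RBD drivers perform equivalent steps under the hood.

  # snapshot a golden image once, protect it, then clone it repeatedly without copying data
  rbd snap create images/golden@base
  rbd snap protect images/golden@base

  # each new volume is a copy-on-write clone of the protected snapshot
  rbd clone images/golden@base volumes/vm001-disk
  rbd clone images/golden@base volumes/vm002-disk

  # lightweight snapshot of a volume, the starting point for incremental backup
  rbd snap create volumes/vm001-disk@snap1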

Page 7

Problem and Suggestion

Can't fully utilize high performance disks

Noticeably improved in Hammer, but still not enough

Not a turnkey project

Steep learning curve

Operating large-scale deployments is hard

Feature request

QoS among whole cluster

Geo-replication for DR (ongoing in community)

Per-image configuration

Improve out-of-box performance

Optimize the default configuration

Collect and publish benchmarks

Provide tooling for per-deployment self-tuning

Page 8

Ceph tuning case study

Production environment

~100 OSDs

Three different types of disks:

• HDD • SSD • FC-SAN

Initial cluster performance

~ 2000 IOPS (4k randwrite)
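The slide does not say how the 4k randwrite figure was collected; one common way to measure it, shown here only as a sketch with assumed pool and image names, is fio's rbd engine:

  # create a throwaway test image, then drive 4K random writes through librbd
  rbd create volumes/fio-test --size 10240
  fio --name=4k-randwrite --ioengine=rbd --clientname=admin --pool=volumes \
      --rbdname=fio-test --rw=randwrite --bs=4k --iodepth=32 --runtime=60 \
      --time_based --group_reporting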

Page 9

Ceph tuning case study

Edit the Ceph cluster configuration file

Turn off debug
• debug osd = 0/0
• debug throttle = 0/0

OSD parameters
• filestore xattr use omap = true
• filestore queue
• osd op/disk
• journal

RBD client
• rbd cache = true
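A sketch of what such a ceph.conf could look like; the option names come from the slide's bullets, while the concrete values are illustrative assumptions that need tuning per deployment.

  [global]
  # turn off debug logging on the hot paths
  debug osd = 0/0
  debug throttle = 0/0

  [osd]
  filestore xattr use omap = true
  # filestore queue (example values)
  filestore queue max ops = 500
  filestore queue max bytes = 104857600
  # osd op/disk threads (example values)
  osd op threads = 8
  osd disk threads = 4
  # journal (example values)
  journal max write bytes = 1073741824
  journal max write entries = 1000

  [client]
  rbd cache = true
  # commonly enabled alongside rbd cache; not mentioned on the slide
  rbd cache writethrough until flush = true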

Page 10

Ceph tuning case study

Customizing a cluster layout
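As a sketch of what customizing a CRUSH layout for mixed disk types (HDD / SSD / FC-SAN, as listed earlier) can involve, separate roots and rules can be created and pools pinned to them. The bucket, rule, pool names, and OSD id below are assumptions, not the case study's actual layout.

  # build a separate CRUSH hierarchy for the SSD OSDs
  ceph osd crush add-bucket ssd-root root
  ceph osd crush add-bucket node1-ssd host
  ceph osd crush move node1-ssd root=ssd-root
  ceph osd crush create-or-move osd.10 1.0 host=node1-ssd

  # rule that keeps replicas under the SSD root, and a pool that uses it
  ceph osd crush rule create-simple ssd-rule ssd-root host
  ceph osd pool create volumes-ssd 512 512 replicated ssd-rule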

Page 11

Ceph tuning case study

Remove the worst OSD

ceph osd perf

osd   fs_commit_latency(ms)   fs_apply_latency(ms)
0     14                      17
1     14                      16
2     10                      11
3     4                       5
4     13                      15
5     17                      20
6     15                      18
7     14                      16
8     299                     329
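osd.8 is clearly the outlier in the numbers above. A sketch of the usual removal sequence (the exact procedure used in the case study is not shown on the slide):

  # drain the slow OSD, then take it out of CRUSH, auth, and the OSD map
  ceph osd out 8
  # stop the osd.8 daemon on its host, then:
  ceph osd crush remove osd.8
  ceph auth del osd.8
  ceph osd rm 8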

Page 12

Ceph tuning case study

[Figure: cluster IOPS after each tuning step (initial value, optimized configuration, customized layout, worst OSD removed), improving from roughly 2000 to roughly 9000 IOPS]

Fig. Cluster IOPS at each tuning step

Page 13

RBD feature support

Snapshot

Rollback snapshot

Download from snapshot

Pool capacity report

Report the pool's capacity instead of the whole cluster (illustrated below)
https://review.openstack.org/#/c/166164/

Bug fix: resize after clone
https://review.openstack.org/#/c/148185/
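For context only (this is not how the linked patch is implemented): ceph df already distinguishes the cluster-wide GLOBAL capacity from each pool's usage and MAX AVAIL, and the per-pool figure is what a Cinder backend bound to a single pool should report to the scheduler.

  # GLOBAL section: raw capacity of the whole cluster
  # POOLS section: usage and MAX AVAIL per pool, accounting for its replication and CRUSH rule
  ceph df
  ceph df detail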

Page 14

High concurrency workload

Concurrent RBD client operations are limited

Create/delete volume takes a long time

High CPU utilization on cinder-volume and ceph-osd

cinder-volume cannot keep up with the message queue

Short-term Band-Aid solution

Use more cinder-volume workers

https://review.openstack.org/#/c/135795/
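A sketch of one way to run several cinder-volume workers against the same RBD backend so they consume from a shared queue, using the standard backend_host option; this is not necessarily what the linked review implements, and sharing a backend across processes has known coordination caveats.

  # cinder.conf fragment used on every node that runs an extra cinder-volume service
  [rbd-1]
  volume_driver = cinder.volume.drivers.rbd.RBDDriver
  rbd_pool = volumes
  volume_backend_name = RBD
  # all workers report the same backend host, so they listen on the same
  # RPC topic and pull create/delete requests for this backend in parallel
  backend_host = rbd-cluster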

Page 15

High concurrency workload

Workers   Clone volume time (s)   Delete volume time (s)   c-vol %CPU   OSD %CPU
1         126                     508                      200%         100%
2         54                      470                      70%          120%
4         46                      474                      40%          140%

Table. Time to clone and delete 80 volumes with different numbers of cinder-volume workers

Page 16

RBD backup to RBD

[Diagram: RBD-to-RBD backup and restore flow. Backup: snapshots snap1 and snap2 are taken on the source image, and their diffs are exported and imported into a backup image as backup1 and backup2. Restore: the diffs are replayed onto a blank image to reproduce changed image 1 and changed image 2.]
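A sketch of the flow in the diagram using stock rbd commands, with assumed pool, image, and snapshot names: a full export-diff seeds the backup image, later diffs between consecutive snapshots are streamed in, and a restore replays the same diffs onto a blank image.

  # --- backup ---
  # first pass: snapshot the source and seed a backup image of the same size
  rbd snap create volumes/vm-disk@snap1
  rbd create backups/vm-disk --size 10240
  rbd export-diff volumes/vm-disk@snap1 - | rbd import-diff - backups/vm-disk

  # later passes: ship only the changes between consecutive snapshots
  # (import-diff also creates the end snapshot on the target, so the chain applies cleanly)
  rbd snap create volumes/vm-disk@snap2
  rbd export-diff --from-snap snap1 volumes/vm-disk@snap2 - | rbd import-diff - backups/vm-disk

  # --- restore ---
  # replay the diffs onto a blank image of the same size
  rbd create volumes/vm-disk-restore --size 10240
  rbd export-diff backups/vm-disk@snap1 - | rbd import-diff - volumes/vm-disk-restore
  rbd export-diff --from-snap snap1 backups/vm-disk@snap2 - | rbd import-diff - volumes/vm-disk-restore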

Page 17

Thank you