SSD for Your Database, Peter Zaitsev (Percona)

Post on 21-Jun-2015

DESCRIPTION

Peter Zaitsev's talk at HighLoad++ 2014.

Transcript

Peter Zaitsev, CEO, Percona

November 1, 2014

Highload++ 2014

Moscow, Russia

SSD/Flash for Modern Databases

www.percona.com

Percona

We love Open Source Software

• Percona Server
• Percona XtraBackup
• Percona XtraDB Cluster
• Percona Toolkit

We want to help you succeed with MySQL and Beyond

• Consulting
• Support
• Managed Services

In this Presentation

Flash technology overview

Review some of the available technology

What does this mean for databases?

Specific opportunities for MySQL

Before SSDs

There were HDDs

Good at Sequential Read/Writes

RT=Seek Time + Rotation Latency

Reads/Writes: Similar Latency

No Specific Write Limits

Retain data for a long time

One IO Request at a Time (No Parallelism)

Low cost per GB
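
The response-time formula on this slide can be turned into a quick estimate. The sketch below assumes that the average rotational latency is half a revolution; the 7200 RPM speed, the 8.5 ms average seek, and the function name are illustrative assumptions, not figures from the talk.

```python
def hdd_response_time_ms(avg_seek_ms: float, rpm: int) -> float:
    """Estimate average random-IO response time: seek time + rotational latency."""
    # Duration of one full platter revolution in milliseconds.
    revolution_ms = 60_000 / rpm
    # Assumption: on average the head waits half a revolution for the sector.
    rotational_latency_ms = revolution_ms / 2
    return avg_seek_ms + rotational_latency_ms


if __name__ == "__main__":
    # Illustrative drive parameters, not taken from the slides.
    rt = hdd_response_time_ms(avg_seek_ms=8.5, rpm=7200)
    print(f"Estimated response time: {rt:.2f} ms")
    print(f"Approximate random IOs per second: {1000 / rt:.0f}")
```

Because a single spindle serves one request at a time, the reciprocal of this response time is roughly the ceiling on random IOs per second for that drive.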

RAID and SAN

Using Many HDDs together

Caching Reads

Buffering Writes (Writeback Cache)

Better Sequential Read/Write speed

Better throughput at high concurrency

Higher IO latencies for uncached IO

Flash Revolution

Use Flash chips instead of platters

No moving parts

No seeks

NAND Flash

Cell

Page/Read Block

Erase Block

Write but no overwrite

Wears with writes (erases)

Writing to the Flash

Erase
• Set all bits to “1111111…”

Write
• Set some of the bits to 0: “0100111..”

Change Zero to One
• Impossible. Do Erase, then Write
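
The erase and write rules on this slide can be modeled in a few lines. This is a toy sketch, not how any real controller is implemented: the `EraseBlock` class, its page count, and its 8-bit pages are hypothetical and chosen only to keep the example readable.

```python
class EraseBlock:
    """Toy model of a NAND erase block: programming can only clear bits (1 -> 0)."""

    def __init__(self, pages: int = 4, page_bits: int = 8):
        self.mask = (1 << page_bits) - 1
        # Freshly erased pages read back as all ones.
        self.pages = [self.mask] * pages

    def erase(self) -> None:
        """Reset every page in the block to all ones (the only way to get 1s back)."""
        self.pages = [self.mask] * len(self.pages)

    def program(self, page: int, value: int) -> None:
        """Write a value; refuse if it would need to flip any 0 back to 1."""
        current = self.pages[page]
        if value & ~current & self.mask:
            raise ValueError("cannot change 0 to 1 without erasing the block")
        self.pages[page] = current & value


if __name__ == "__main__":
    block = EraseBlock()
    block.program(0, 0b01001111)      # clearing bits on an erased page works
    try:
        block.program(0, 0b11001111)  # would need a 0 -> 1 change: rejected
    except ValueError as exc:
        print("rejected:", exc)
    block.erase()                     # erase resets the whole block, not one page
    block.program(0, 0b11001111)      # after the erase the write succeeds
    print(format(block.pages[0], "08b"))
```

Note that `erase()` wipes every page in the block, which is why an in-place update on real flash turns into copying live pages elsewhere before the block is erased.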

Types of NAND Flash

From AnandTech:

Flash Storage Design

Cache

Battery/Super Capacitor

Controller + Complex Firmware

Built-in Parallelism

Flash Controller Tasks

Write wear leveling

Garbage collection

Error correction

Bad block mapping

Read scrubbing

Read disturb management

Encryption

Flash Properties

Lots of IOs per device! (100K+)

Less random IO penalty

Writes more expensive than reads (but can be faster)

Limited by amount of writes

Limited retention

Concurrent execution on single device

Fast write acknowledgement (safe or not)

Can burst writes

Flash Interface Designs

DIMM

PCI-E

SFF-8639

SATA/SAS

FC and Network

Transitioning

AHCI → NVMe

AHCI vs NVMe

• Source: AnandTech.com

Sandisk ULLtraDIMM

HGST Virident

Sandisk FusionIO

Intel P3700

Intel 730 (SATA)

mSATA

M.2 Interface

Violin Memory

“Consumer” vs “Enterprise”

Performance

Endurance

Durability

Retention

Encryption

Not your HDD

All HDDs are the same; All SSDs are different

Evaluation

Performance changes over time

Empty Space Matters

Complex internals

Watch stability carefully

How Flash Fails

End of life (EOL) is clearly defined by the amount of data written (but devices can often handle a lot more)

One day… it’s gone

“Power Loss Protection”

Internal ECC and redundancy

To RAID or not to RAID?

More valuable for consumer grade

Watch for good Flash support

RAID controller logic may slow things down

Use a redundant array of inexpensive servers instead?

Redundancy

Device internal redundancy

Hardware RAID

Software RAID

Filesystem “RAID”

OS Support

Flash support is actively being improved

TRIM

Sparse Files

Flash And Databases

Database History

Most were designed in the HDD era

Optimize for sequential IO

Count on cheap sequential writes

RAID, BBU to improve performance

It’s time for Flash

Your OLTP Database should live on Flash

But What Flash?

Pick a flash type that is right for your application

IO vs Memory

Warmup

Much faster warmup times

Even if the database fits in memory, SSD might be justified

Tolerate more IO bound load

HDD
• 5ms per IO
• Can do 20 IOs within a 100ms response time (non-parallel)

Flash
• 0.1ms per IO
• Can do 1000 IOs within a 100ms response time (non-parallel)
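
The arithmetic behind this slide is a simple budget division. The sketch below uses integer microseconds to avoid floating-point rounding; the function name is hypothetical, while the 5 ms, 0.1 ms, and 100 ms figures are the ones from the slide.

```python
def serial_ios_within_budget(io_latency_us: int, budget_us: int) -> int:
    """Number of back-to-back (non-parallel) IOs that fit into a response-time budget."""
    return budget_us // io_latency_us


if __name__ == "__main__":
    BUDGET_US = 100_000  # 100 ms response-time target
    for device, latency_us in (("HDD", 5_000), ("Flash", 100)):
        ios = serial_ios_within_budget(latency_us, BUDGET_US)
        print(f"{device}: {ios} serial IOs within 100 ms")
```

In other words, a query that has to fetch a few hundred uncached pages one after another can still meet a 100 ms target on flash, but not on a single HDD.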

Endurance

Might be a top consideration

Endurance Math

HGST FlashMax III 2200GB
• 4400GB/day over 5 years
• 1400MB/sec peak writes
• 66 days at peak write throughput

Crucial M500 960GB
• 72TB total lifetime writes
• 400MB/sec write
• 52 hours at peak write throughput
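
The endurance figures on this slide can be reproduced with a short calculation. This is a sketch under assumptions: the helper name is hypothetical, and the choice between decimal and binary units is a guess. With decimal units the HGST figure comes out at about 66 days; for the Crucial drive decimal units give 50 hours, while binary units (1 TB = 1024 × 1024 MB) give about 52 hours, which appears to be how the slide's number was derived.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_HOUR = 3_600


def seconds_at_peak(total_writes_mb: float, peak_mb_per_sec: float) -> float:
    """Time needed to consume the rated write endurance at peak write throughput."""
    return total_writes_mb / peak_mb_per_sec


# HGST FlashMax III 2200GB: 4400 GB/day for 5 years at 1400 MB/s (decimal units assumed).
hgst_total_mb = 4400 * 365 * 5 * 1000
hgst_days = seconds_at_peak(hgst_total_mb, 1400) / SECONDS_PER_DAY

# Crucial M500 960GB: 72 TB of total writes at 400 MB/s, in both unit conventions.
crucial_hours_decimal = seconds_at_peak(72 * 1000 * 1000, 400) / SECONDS_PER_HOUR
crucial_hours_binary = seconds_at_peak(72 * 1024 * 1024, 400) / SECONDS_PER_HOUR

if __name__ == "__main__":
    print(f"HGST FlashMax III: {hgst_days:.1f} days at peak write throughput")
    print(f"Crucial M500 (decimal units): {crucial_hours_decimal:.1f} hours")
    print(f"Crucial M500 (binary units):  {crucial_hours_binary:.1f} hours")
```

Either way, the gap between weeks and hours of sustained peak writes is the point of the comparison.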

Databases and Flash

How do we optimize databases to use Flash best?

“Torn Page” problem

Flash can avoid this with little cost due to internal design

FusionIO NVMFS (Atomic Writes)

Copy-on-Write File Systems
• ZFS
• BTRFS

Filesystem-level data journaling is less preferred
• data=journal for EXT4

skip-innodb-doublewrite
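
To make the torn page problem concrete, the sketch below simulates a crash in the middle of a page write and detects the damage with a checksum. Everything here is an illustrative assumption: the page layout, the 4 KB atomic write unit, and the helper names are hypothetical, and `zlib.crc32` stands in for a page checksum; InnoDB's actual page format and checksum implementation differ.

```python
import zlib

PAGE_SIZE = 16 * 1024    # database page size used in this sketch
SECTOR_SIZE = 4 * 1024   # assumed atomic write unit of the device


def make_page(payload: bytes) -> bytes:
    """Build a page: 4-byte CRC-32 header followed by the zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return zlib.crc32(body).to_bytes(4, "big") + body


def is_torn(page: bytes) -> bool:
    """Treat a page as torn when the stored checksum does not match its body."""
    stored = int.from_bytes(page[:4], "big")
    return stored != zlib.crc32(page[4:])


def partial_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first `sectors_written` sectors hit the media."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


if __name__ == "__main__":
    old_page = make_page(b"A" * 10_000)
    new_page = make_page(b"B" * 10_000)
    on_disk = partial_write(old_page, new_page, sectors_written=1)
    print("old page torn:", is_torn(old_page))
    print("page after interrupted write torn:", is_torn(on_disk))
```

A checksum only detects the torn page; recovering it needs an intact copy, which is what the doublewrite buffer provides and what atomic writes or copy-on-write file systems make unnecessary.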

Fast IO Path

Bypass Caching with O_DIRECT

Native Asynchronous IO

Efficient Checksumming

innodb_checksum_algorithm=crc32

innodb_flush_method=O_DIRECT
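
For reference, the two settings on this slide can be rendered as an option-file section. The helper below is a hypothetical sketch that uses Python's standard `configparser` to produce a `[mysqld]` block; whether these values suit a particular server and MySQL version is something to verify by testing.

```python
import configparser
import io


def render_mysqld_section(settings: dict[str, str]) -> str:
    """Render settings as a [mysqld] option-file section."""
    parser = configparser.ConfigParser()
    # Keep option names exactly as given instead of lower-casing them.
    parser.optionxform = str
    parser["mysqld"] = settings
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# The two settings named on the slide.
FAST_IO_SETTINGS = {
    "innodb_flush_method": "O_DIRECT",
    "innodb_checksum_algorithm": "crc32",
}

if __name__ == "__main__":
    print(render_mysqld_section(FAST_IO_SETTINGS))
```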

IO Cost Accounting

Sequential vs Random IO balance

IO vs CPU Balance

Smaller page sizes might make sense
• innodb_page_size=4K
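
One way to reason about page size is read amplification: how many bytes the device transfers for each byte the query actually needs. The sketch below is a rough illustration only; the 200-byte row is an assumed example, the function name is hypothetical, and it ignores caching and index traversal.

```python
def read_amplification(row_bytes: int, page_size_bytes: int) -> float:
    """Bytes read from storage per useful byte when one row needs one page read."""
    return page_size_bytes / row_bytes


if __name__ == "__main__":
    ROW_BYTES = 200  # assumed row size, for illustration only
    for page_size in (16 * 1024, 4 * 1024):
        ratio = read_amplification(ROW_BYTES, page_size)
        print(f"{page_size // 1024}K page: {ratio:.1f}x read amplification")
```

On HDDs the extra bytes were nearly free once the seek was paid; on flash, where random IO is cheap, transferring less data per lookup can matter more.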

Less Pre-fetching

Pre-fetching pays off only if most pre-fetched data is actually used

Often best to try it out

Less merging on flushing

Do not assume that flushing several adjacent dirty pages costs the same as flushing one

innodb_flush_neighbors=0

Less Space on Disk

InnoDB Compression (2x typical)

TokuDB Compression (5-10x typical)

Archiving data off OLTP System

Fewer Writes on Flash

Hybrid Flash/HDD System

Transactional logs and other logs on HDD with RAID and BBU

Small temporary objects on tmpfs

innodb_log_file_size=<LARGE>

Logs on RAID can be fast

Single Intel 730 Sysbench

IOPS

Consistency

• Graph by http://cloud.percona.com

Is Flash Too Fast?

• Multiple instances might scale better

Other Thoughts

Host hardware and OS matter, especially with high end flash

Virtualization has higher relative overhead

Network has higher relative overhead

Peter Zaitsev

pz@percona.com

@PeterZaitsev

https://www.linkedin.com/in/peterzaitsev

Thank You!
