Compression in Open Source Databases Peter Zaitsev CEO, Percona MySQL Central @ OOW 26 Oct 2015

Compression in Open Source Databases

Peter Zaitsev

CEO, Percona MySQL Central @ OOW

26 Oct 2015


Few Words About Percona

Your Partner in MySQL and

MongoDB Success

100% Open Source Software

“No Lock in Required”

Solutions and Services

We work with MySQL, MariaDB,

MongoDB, Amazon RDS and Aurora


About the Talk

A bit of the History

Approaches to Data Compression

What some of the popular systems implement


Let's Define the Term

Compression - Any Technique to make data size smaller


A bit of History

Early Computers were too slow to compress data in Software

Hardware Compression (e.g. Tape)

Compression first appears for non-performance-critical data


We did not need it much for space…

Welcome to the modern age

Data Growth outpaces HDD improvements

Powerful CPUs

Flash

Cloud

Data we store now


Exponential Data Size Growth

Powerful CPUs

High Performance

Multiple Cores


Can Compress and Decompress Fast!

Snappy, LZ4

• Up to 1GB/sec compression

• Up to 2GB/sec decompression


Flash

Disk space is more costly than for HDDs

Write Endurance is expensive

Want to write less data

Decent at handling fragmentation


Cloud

Pay for Space

Pay for IOPS

More limited Storage Performance

Network Performance may be limited


Data we store in Databases

Modern Data Compresses Well!

• Text

• JSON

• XML


COMPRESSION BASICS

Introduction to ways of making your data smaller


Lossy and Lossless

Databases generally use Lossless Compression

Lossy compression is done at the application level


Some ways of getting data smaller

Layout Optimizations

“Encoding”

Dictionary Compression

Block Compression


Layout Optimizations

Column Store versus Row Store

Hybrid Formats

Variable Block Sizes


Encoding

Depends on Data Type and Domain

Delta Encoding, Run Length Encoding (RLE)

Can be faster than reading uncompressed data

UTF8 (strings) and VLQ (Integers)

Index Prefix Compression
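The encodings named above can be sketched in a few lines. This is only an illustration of the ideas, not the on-disk format of any particular database; the function names are made up for the example.

```python
def delta_encode(values):
    """Keep the first value, then store each difference from the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse runs of equal values into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def vlq_encode(n):
    """Encode a non-negative integer as a VLQ: 7 bits per byte, most
    significant group first, high bit set on every byte except the last."""
    if n < 0:
        raise ValueError("VLQ as sketched here handles non-negative integers only")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))
```

Delta encoding pays off on sorted or slowly changing values such as timestamps and auto-increment ids, RLE on long runs of repeated values, and VLQ on integers that are usually small.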


Dictionary Compression

Replacing frequent values with Dictionary Pointers

Kind of like STL String

ENUM type in MySQL
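A minimal sketch of the idea, assuming a single column of repeated string values. The helper names are hypothetical and this is not how MySQL stores ENUM internally; it only shows values being replaced by small integer pointers into a dictionary.

```python
def dict_encode(values):
    """Replace each value with an integer code pointing into a dictionary."""
    dictionary = []        # code -> value
    code_by_value = {}     # value -> code
    codes = []
    for v in values:
        code = code_by_value.get(v)
        if code is None:
            code = len(dictionary)
            code_by_value[v] = code
            dictionary.append(v)
        codes.append(code)
    return dictionary, codes


def dict_decode(dictionary, codes):
    """Follow each pointer back to the original value."""
    return [dictionary[c] for c in codes]
```

The fewer distinct values a column has, the smaller the dictionary and the narrower the codes can be.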


Block Compression

Compress “block” of data so it is smaller for storage

Finding Patterns in Data and Efficiently encoding them

Many Algorithms Exist: Snappy, Zlib, LZ4, LZMA


Block Compression Details

Compression rate highly depends on data

Compression rate depends on block size

Speed depends on block size and data
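The block size dependence is easy to observe with any general purpose compressor. The sketch below uses zlib from the Python standard library on made-up, repetitive JSON rows; the data set and the block sizes are assumptions chosen for illustration, and real ratios depend entirely on your data.

```python
import json
import zlib


def make_rows(n):
    """Build n synthetic JSON rows (hypothetical data, highly repetitive)."""
    rows = [json.dumps({"id": i, "status": "active", "country": "US"}) for i in range(n)]
    return "\n".join(rows).encode("utf-8")


def compressed_size(data, block_size):
    """Total bytes when every block_size chunk is compressed independently."""
    return sum(
        len(zlib.compress(data[i:i + block_size]))
        for i in range(0, len(data), block_size)
    )


data = make_rows(5000)
for block_size in (1024, 4096, 16384, 65536):
    size = compressed_size(data, block_size)
    print(f"{block_size:>6} byte blocks: ratio {len(data) / size:.2f}")
```

Each block is compressed on its own, so small blocks pay the per-block overhead more often and have less history in which to find repeated patterns.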


Block Size Dependence (by Leif Walsh)

There is no one size fits all

Typically the Compression Algorithm can be selected

Often with additional settings


WHERE AND HOW

Where do we compress data and how do we do that


Where to Compress Data

In Memory ?

In the Database Data Store ?

As Part of File System ?

Storage Hardware ?

Application ?


Compression in Memory

Reduce the amount of memory needed for the same working set

Reduce IO for a Fixed amount of Memory

Typically an in-Memory Performance Hit

Encoding/Dictionary Compression are a good fit


Database Data Store

Reduce Database Size on Disk

Works with all file systems and storage

With OS cache can be used as an In-Memory compression variant

Dealing with fragmentation is a common issue


Compression on File System Level

Works with all Databases/Storage Engines

Performance Impact can be significant

Logical Space on disk is not reduced

ZFS


Compression on Storage Hardware

Hardware Dependent

Does not reduce space on disk

Can result in Performance Gains rather than free space (SSD)

Can become a choke point


By Application

No Database Support needed

Reduce Database Load and Network Traffic

Application may know more about data

More Complexity

Give up many DBMS features (search, index)
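The trade-off above can be sketched with an in-memory SQLite database standing in for the DBMS. The table name, column names and helper functions are hypothetical; the point is only that the application compresses before the write, so the database sees opaque bytes.

```python
import json
import sqlite3
import zlib

# In-memory SQLite stands in for the real database (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload BLOB)")


def store_event(event_id, event):
    """Compress in the application, then store the opaque bytes."""
    blob = zlib.compress(json.dumps(event).encode("utf-8"))
    conn.execute("INSERT INTO events (id, payload) VALUES (?, ?)", (event_id, blob))


def load_event(event_id):
    """Fetch the bytes and decompress them in the application."""
    row = conn.execute("SELECT payload FROM events WHERE id = ?", (event_id,)).fetchone()
    return json.loads(zlib.decompress(row[0]).decode("utf-8"))


def search_in_database(text):
    """Count rows whose stored bytes contain text; the database only sees compressed data."""
    row = conn.execute(
        "SELECT COUNT(*) FROM events WHERE instr(payload, ?) > 0",
        (text.encode("utf-8"),),
    ).fetchone()
    return row[0]
```

A round trip through `store_event` and `load_event` returns the original document, while a server-side search for a word inside the payload does not match: the database can no longer search or index the content.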


DESIGN CONSIDERATIONS

What makes a database system do well with compression


The Goal

Minimize Negative Impact for User Operations (Reads and Writes)


Design Principles

Fast Decompression

Compression in Background

Parallel Compression/Decompression

Reduce need of Re-Compression on Update


Choosing Block Size

Large Blocks

• Most efficient for compression

• Bulky Reads and Writes

Small Blocks

• Fastest to Decompress

• Best for point lookups
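Why small blocks suit point lookups: to read one record, the whole block that holds it has to be decompressed. A minimal sketch, assuming fixed-size records and zlib; the record layout and function names are invented for the example.

```python
import zlib

RECORD_SIZE = 100  # hypothetical fixed record size keeps the offset math simple


def build_blocks(records, records_per_block):
    """Group records into blocks and compress each block independently."""
    return [
        zlib.compress(b"".join(records[i:i + records_per_block]))
        for i in range(0, len(records), records_per_block)
    ]


def point_lookup(blocks, records_per_block, index):
    """Return (record, bytes_decompressed) for a single record read."""
    block = zlib.decompress(blocks[index // records_per_block])
    offset = (index % records_per_block) * RECORD_SIZE
    return block[offset:offset + RECORD_SIZE], len(block)


records = [f"user-{i:06d}".encode("ascii").ljust(RECORD_SIZE, b".") for i in range(1000)]
small_blocks = build_blocks(records, 10)    # 1,000 bytes per block
large_blocks = build_blocks(records, 500)   # 50,000 bytes per block
```

Looking up record 777 decompresses 1,000 bytes with the small blocks and 50,000 bytes with the large ones, for the same 100-byte answer.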


IMPLEMENTATION EXAMPLES

What Database systems Really do with Compression


MySQL “Packed” MyISAM

Compress table “offline” with myisampack

Table Becomes Read Only

A variety of compression methods is used

Only data is compressed, not indexes

Note: MyISAM supports index prefix compression for all indexes
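Index prefix compression in general can be sketched as follows. This illustrates the technique only and is not MyISAM's actual index format: in a sorted list of keys, each key stores how many leading characters it shares with the previous key, plus the remaining suffix.

```python
def prefix_encode(sorted_keys):
    """Encode each key as (shared_prefix_length, suffix) relative to the previous key."""
    encoded, prev = [], ""
    for key in sorted_keys:
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the keys by reusing the shared prefix of the previous key."""
    keys, prev = [], ""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys
```

Because index entries are kept in sorted order, neighbouring keys tend to share long prefixes, which is what makes this effective.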


MySQL Archive Storage Engine

Does not support indexes

Essentially a file of rows with sequential access

Uses zlib compression


InnoDB Table Compression

Available Since MySQL 5.1

Pages compressed using zlib

Compressed page target (1K, 4K, 8K) has to be set

Both Compressed and Uncompressed pages can be cached in Buffer Pool

Per Page “log” of changes to avoid recompression

Externally Stored BLOBs are compressed as a single entity


InnoDB Transparent Page Compression

Available in MySQL 5.7

Zlib and LZ4 Compression

Compresses pages as they are written to disk

Free space on the page is given back using “hole punching”

Originally designed to work with FusionIO NVMFS

Can cause problems for current filesystems due to the very high number of holes
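A back-of-the-envelope sketch of what hole punching can give back, assuming the freed space is released in whole file system blocks. The 16KB page size and 4KB file system block size are assumptions for the example, and the function names are made up.

```python
def on_disk_size(compressed_len, fs_block_size=4096):
    """Bytes still allocated for a page after its unused tail is punched out.
    Space is released in whole file system blocks, so round up."""
    blocks = -(-compressed_len // fs_block_size)  # ceiling division
    return blocks * fs_block_size


def pages_with_holes(compressed_lens, page_size=16384, fs_block_size=4096):
    """Count pages that end up with a hole, i.e. occupy less than a full page."""
    return sum(1 for n in compressed_lens if on_disk_size(n, fs_block_size) < page_size)
```

For example, a 16KB page that compresses to 5,000 bytes still occupies 8,192 bytes on a 4KB file system, and a page that compresses to 16,000 bytes saves nothing. Every page that does save space leaves one hole behind, which is where the very high hole count comes from.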


Disk usage (Linkbench data set by Sunny Bains)

Performance on Fast SSD (FusionIO NVMFS)

Results on Slower SSD (Intel 730*2, EXT4)

Fractal Trees Compression

Available as Storage Engine for MySQL and MongoDB

Can use many compression libraries

Tunable Compression Block Size

Reduces Re-Compression thanks to message buffering


Can get a lot of compression

MongoDB WiredTiger Storage Engine

Engine has many compression settings

Indexes use Index Prefix Compression

Data Pages can be compressed using zlib or Snappy


Compression Size (results by Asya Kamsky)

Compression in RocksDB

RocksDB – LSM Based Storage Engine for MongoDB and MySQL

LSM works very well with compression

Supports zlib, lz4, bzip2 compression

Can use different compression methods for different Levels in LSM
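The per-level idea can be sketched with a hypothetical policy. This is not RocksDB's configuration format or API; the level thresholds and codec names are assumptions. Upper levels are small and rewritten often, so they stay uncompressed; the last level holds most of the data, so it gets the strongest (and slowest) codec.

```python
import bz2
import zlib


def codec_for_level(level, num_levels):
    """Hypothetical policy: pick a codec name for a given LSM level."""
    if level < 2:
        return "none"      # hot, short-lived data: skip the CPU cost
    if level == num_levels - 1:
        return "bzip2"     # coldest and largest level: favour size
    return "zlib"          # middle levels: a balance of speed and size


COMPRESSORS = {"none": lambda b: b, "zlib": zlib.compress, "bzip2": bz2.compress}
DECOMPRESSORS = {"none": lambda b: b, "zlib": zlib.decompress, "bzip2": bz2.decompress}


def write_block(level, num_levels, payload):
    """Compress a block with the codec chosen for its level; return (codec, bytes)."""
    codec = codec_for_level(level, num_levels)
    return codec, COMPRESSORS[codec](payload)


def read_block(codec, stored):
    """Decompress a block using the codec recorded when it was written."""
    return DECOMPRESSORS[codec](stored)
```

The codec name is stored alongside each block so that a reader does not need to know which level the block came from.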


Compression results from Mike Kania

PostgreSQL

Uses compression by default with TOAST

2KB (default) or longer Strings, BLOBs

Unlike InnoDB, External Storage is not required for Compression

Recommended to use File system compression, e.g. ZFS, if Compression is Desired


Summary

Compression is Important in Modern Age

Consider it for your system

Many different techniques are used to make data smaller by databases

Compression support is rapidly changing and improving


Want More ?

I’m talking about MySQL Replication Options

Free (as in Beer) Moscow MySQL Users Group meetup November 6th, hosted by Mail.ru

http://www.meetup.com/moscowmysql/


Percona Live 2016 Call for Papers is Open

Call for Papers Open until November 29, 2016

MySQL, MongoDB, NoSQL, Data in The Cloud

Anything to make Data Happy!

http://bit.ly/PL16Call


Thank You! Peter Zaitsev

[email protected] https://www.linkedin.com/in/peterzaitsev