
PostgreSQL on EXT4, XFS, BTRFS and ZFS

Jul 19, 2015


Tomas Vondra
Page 1: PostgreSQL on EXT4, XFS, BTRFS and ZFS

PostgreSQL on EXT3/4, XFS, BTRFS and ZFS

comparing modern (Linux) file systems

Tomas Vondra

Page 2: PostgreSQL on EXT4, XFS, BTRFS and ZFS

Linux file systems

● plenty of choices, with different

– goals, features, tuning options

– maturity level, reliability

● ext3/4, XFS

– traditional, design from the 90s

– improving over time, reasonably “modern”

● BTRFS, ZFS

– next-generation, new architecture / design

● other (not included in this talk)

– log-structured file systems, distributed, clustered, ...

Page 3: PostgreSQL on EXT4, XFS, BTRFS and ZFS

EXT3, EXT4, XFS

Page 4: PostgreSQL on EXT4, XFS, BTRFS and ZFS

EXT3, EXT4, XFS - history

● ext3 (2001) / ext4 (2008)

– evolution of original Linux filesystem (ext, ext2, ...)

– continuous improvements / fixes

● XFS (2002)

– originally from SGI Irix 5.3 (1994)

– 2000 released under GPL

– 2002 merged into Linux 2.5.36

● both are

– reliable journaling file systems

– proven by time on many deployments

Page 5: PostgreSQL on EXT4, XFS, BTRFS and ZFS

EXT3, EXT4, XFS - features

● traditional design with journal

● not handling

– multiple devices

– volume management

– snapshots

– ...

● need additional layers for those things

– hardware RAID

– software RAID (dm)

– LVM / LVM2

Page 6: PostgreSQL on EXT4, XFS, BTRFS and ZFS

EXT3, EXT4, XFS - evolution

● conceived in times of rotational storage

– mostly work with SSD

– stop-gap for future storage (NVRAM, ...)

● evolution, not a revolution (mostly)

– fixing bugs (some real, some imaginary)

– adding features (e.g. TRIM, barriers, ...)

– scalability improvements (metadata, ...)

– be careful when reading old articles / benchmarks

– be wary of anecdotal evidence (without context)

– synthetic benchmarks are misleading

Page 7: PostgreSQL on EXT4, XFS, BTRFS and ZFS

EXT3, EXT4, XFS - sources

● Linux Filesystems: Where did they come from? (Dave Chinner @ linux.conf.au 2014) https://www.youtube.com/watch?v=SMcVdZk7wV8

● Ted Ts'o on the ext4 Filesystem (Ted Ts'o, NYLUG, 2013) https://www.youtube.com/watch?v=2mYDFr5T4tY

● XFS: There and Back … and There Again? (Dave Chinner @ Vault 2015) https://lwn.net/Articles/638546/

● XFS: Recent and Future Adventures in Filesystem Scalability (Dave Chinner, linux.conf.au 2012) https://www.youtube.com/watch?v=FegjLbCnoBw

● XFS: the filesystem of the future? (Jonathan Corbet, Dave Chinner, LWN, 2012) http://lwn.net/Articles/476263/

Page 8: PostgreSQL on EXT4, XFS, BTRFS and ZFS

BTRFS, ZFS

Page 9: PostgreSQL on EXT4, XFS, BTRFS and ZFS

BTRFS, ZFS - goals

● ideas

– integrate the layers

– design for commodity hardware (expect failures)

– design for huge data volumes

● so that we get …

– flexible management

– built-in snapshotting

– compression, deduplication

– checksums

– ...

Page 10: PostgreSQL on EXT4, XFS, BTRFS and ZFS

BTRFS, ZFS - history

● BTRFS

– merged in 2009, but considered “experimental”

– on-disk format “stable” (1.0)

– some claim it’s “stable” but I doubt that …

– (What are the criteria for filesystem to be “stable”?)

● ZFS

– originally from Solaris, but got Oracled :-(

– today a bit fragmented development

– also available on BSD systems (FreeBSD)

– “ZFS on Linux” project (CDDL vs. GPL)

Page 11: PostgreSQL on EXT4, XFS, BTRFS and ZFS

Tuning options

Page 12: PostgreSQL on EXT4, XFS, BTRFS and ZFS

Generic tuning options

● TRIM (discard)

– enable / disable TRIM on SSDs

– impacts garbage collection / wear leveling

● write barriers

– prevent disk from optimizing order of writes

– still may lose data, but no filesystem corruption

– write cache + battery => disable barriers

● SSD alignment

– alignment on SSDs matters (pages, blocks, …)

– no dedicated tuning options (can use stripe unit / width)

Page 13: PostgreSQL on EXT4, XFS, BTRFS and ZFS

BTRFS, ZFS tuning options

● nodatacow (BTRFS)

– disable copy on write

– still can do snapshots (will do necessary COW)

– disables checksums (needs full COW)

● zfs_arc_max (ZFS)

– limit the size of ARC cache

– should be released automatically, but ...

Page 14: PostgreSQL on EXT4, XFS, BTRFS and ZFS

ZFS tuning options

● recordsize=8kB

– match the fs page with PostgreSQL page

● ashift=13 (8kB)

– align the writes to SSD pages

● primarycache=metadata

– prevent double buffering (shared buffers)

http://open-zfs.org/wiki/Performance_tuning
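The ashift value is the base-2 logarithm of the block size ZFS assumes for the device, so ashift=13 corresponds to 2^13 = 8192 bytes, the same as the 8kB PostgreSQL page. A minimal sketch of that arithmetic (the helper name `ashift_for` is made up for illustration and is not part of any ZFS tooling):

```python
def ashift_for(block_bytes: int) -> int:
    """Return log2(block_bytes); reject sizes that are not a power of two."""
    if block_bytes <= 0 or block_bytes & (block_bytes - 1):
        raise ValueError("block size must be a positive power of two")
    return block_bytes.bit_length() - 1


PG_PAGE_BYTES = 8192  # PostgreSQL page size used in the talk (8kB)

if __name__ == "__main__":
    # 8kB blocks give the ashift used on the slide; 4kB is shown for comparison.
    print("ashift for 8kB:", ashift_for(PG_PAGE_BYTES))
    print("ashift for 4kB:", ashift_for(4096))
```

The same 8192 value is what recordsize=8k expresses at the dataset level, which is why the two settings are tuned together here.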

Page 15: PostgreSQL on EXT4, XFS, BTRFS and ZFS

file systems

Page 16: PostgreSQL on EXT4, XFS, BTRFS and ZFS

● ext3 (default)

– default

● ext4

– default

– discard, nobarrier, stripe-width

● xfs

– default

– LVM

– LVM + snapshot

– discard, nobarrier

– discard, nobarrier, agcount, sunit/swidth

Page 17: PostgreSQL on EXT4, XFS, BTRFS and ZFS

● btrfs

– default

– nodatacow

– nodiscard (+fstrim)

● zfs

– default

– recordsize=8k, ashift=13, primarycache=metadata (open-zfs)

– recordsize=8k, ashift=13, zfs_arc_max=5GB (custom)
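The custom ZFS configuration caps the ARC at 5GB. On ZFS on Linux the zfs_arc_max module parameter takes a value in bytes, so the limit has to be converted first. A small sketch, assuming "5GB" means 5 GiB and that the option is set through a modprobe configuration line (the helper name and the choice of GiB are assumptions, not details from the talk):

```python
GIB = 1024 ** 3  # bytes per GiB


def arc_max_option(limit_gib: int) -> str:
    """Build a modprobe-style option line capping the ZFS ARC at limit_gib GiB."""
    if limit_gib <= 0:
        raise ValueError("limit must be positive")
    return f"options zfs zfs_arc_max={limit_gib * GIB}"


if __name__ == "__main__":
    # Line that would typically go into a file such as /etc/modprobe.d/zfs.conf
    print(arc_max_option(5))
```

On the 8GB test machine this leaves roughly 3GB for shared buffers, the page cache and everything else.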

Page 18: PostgreSQL on EXT4, XFS, BTRFS and ZFS

benchmarks

Page 19: PostgreSQL on EXT4, XFS, BTRFS and ZFS

pgbench (TPC-B)

● transactional benchmark

– small queries (access by PK, ...)

● modes

– read-only

– read-write

● scales

– small (~200MB)

– medium (~50% RAM)

– large (~200% RAM)
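The three data sizes translate into pgbench scale factors (the -s option used at initialization). A rough sketch of that conversion, assuming about 15MB of data per scale unit (a common rule of thumb, not a figure from the talk), the 8GB of RAM of the test machine, and the 150MB "small" size shown on the charts; the function name is hypothetical:

```python
MB_PER_SCALE_UNIT = 15   # assumption: approximate database size per pgbench scale unit
RAM_MB = 8 * 1024        # test machine in the talk has 8GB of RAM


def scale_for_size_mb(target_mb: float) -> int:
    """Approximate pgbench scale factor producing a database of target_mb megabytes."""
    return max(1, round(target_mb / MB_PER_SCALE_UNIT))


if __name__ == "__main__":
    targets = {
        "small": 150,            # size shown on the charts
        "medium": 0.5 * RAM_MB,  # ~50% of RAM
        "large": 2.0 * RAM_MB,   # ~200% of RAM
    }
    for name, size_mb in targets.items():
        print(f"{name}: pgbench -i -s {scale_for_size_mb(size_mb)}")
```

The actual size per scale unit depends on the PostgreSQL version and fill factor, so the resulting database size should be checked after initialization.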

Page 20: PostgreSQL on EXT4, XFS, BTRFS and ZFS

TPC-DS

● warehouse, analytical

– large amounts of data

– queries processing a lot of data

● complex queries

– aggregations

– joins

– CTEs

– …

● successor to TPC-H

– more elaborate / realistic

Page 21: PostgreSQL on EXT4, XFS, BTRFS and ZFS

System

● PostgreSQL 9.4.1

● Gentoo with kernel 3.17

● CPU: Intel i5-2500k

– 4 cores @ 3.3 GHz (3.7 GHz)

– 6MB cache

– 2011-2013

● 8GB RAM (DDR3 1333)

● SSD Intel S3500 100GB (SATA)

Page 22: PostgreSQL on EXT4, XFS, BTRFS and ZFS

pgbench read-only

Page 23: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: pgbench / small (150MB) / read-only. Bar chart of transactions per second (axis 0 to 60000) for the configurations btrfs, btrfs-nodatacow, btrfs-nodiscard-fstrim, ext3, ext4, ext4-discard-nobarrier-stripe, xfs, xfs-discard-lvm-snapshot, xfs-discard-nobarrier, xfs-lvm, xfs-tuned-agcount-su-sw, zfs, zfs-tuned, zfs-tuned-2. Bar values are not preserved in the transcript.]

Page 24: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: pgbench / medium (50% RAM) / read-only. Bar chart of transactions per second (axis 0 to 60000) for the configurations btrfs, btrfs-nodatacow, btrfs-nodiscard-fstrim, ext3, ext4, ext4-discard-nobarrier-stripe, xfs, xfs-discard-lvm-snapshot, xfs-discard-nobarrier, xfs-lvm, xfs-tuned-agcount-su-sw, zfs, zfs-tuned, zfs-tuned-2. Bar values are not preserved in the transcript.]

Page 25: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: pgbench / large (200% RAM) / read-only. Bar chart of transactions per second (axis 0 to 45000) for the configurations btrfs, btrfs-nodatacow, btrfs-nodiscard-fstrim, ext3, ext4, ext4-discard-lvm-snapshot, ext4-discard-nobarrier-stripe, xfs, xfs-discard-lvm-snapshot, xfs-discard-nobarrier, xfs-lvm, xfs-tuned-agcount-su-sw, zfs, zfs-tuned, zfs-tuned-2. Bar values are not preserved in the transcript.]

Page 26: PostgreSQL on EXT4, XFS, BTRFS and ZFS

pgbench read-write

Page 27: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: pgbench / small (150MB) / read-write. Bar chart of transactions per second (axis 0 to 8000) for the configurations btrfs, btrfs-nodatacow, btrfs-nodiscard-fstrim, ext3, ext4, ext4-discard-nobarrier-stripe, xfs, xfs-discard-lvm-snapshot, xfs-discard-nobarrier, xfs-lvm, xfs-tuned-agcount-su-sw, zfs, zfs-tuned, zfs-tuned-2. Bar values are not preserved in the transcript.]

Page 28: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: pgbench / medium (50% RAM) / read-write. Bar chart of transactions per second (axis 0 to 6000) for the configurations btrfs, btrfs-nodatacow, btrfs-nodiscard-fstrim, ext3, ext4, ext4-discard-nobarrier-stripe, xfs, xfs-discard-lvm-snapshot, xfs-discard-nobarrier, xfs-lvm, xfs-tuned-agcount-su-sw, zfs, zfs-tuned, zfs-tuned-2. Bar values are not preserved in the transcript.]

Page 29: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: pgbench / large (200% RAM) / read-write. Bar chart of transactions per second (axis 0 to 5000) for the configurations btrfs, btrfs-nodatacow, btrfs-nodiscard-fstrim, ext3, ext4, ext4-discard-lvm-snapshot, ext4-discard-nobarrier-stripe, xfs, xfs-discard-lvm-snapshot, xfs-discard-nobarrier, xfs-lvm, xfs-tuned-agcount-su-sw, zfs, zfs-tuned, zfs-tuned-2. Bar values are not preserved in the transcript.]

Page 30: PostgreSQL on EXT4, XFS, BTRFS and ZFS

performance variability

[Pages 31 to 38: slides without transcribed text]

Page 39: PostgreSQL on EXT4, XFS, BTRFS and ZFS

EXT / XFS conclusions

EXT4

● good “default” choice

● disable barriers (with protected write cache)

● tune alignment to match the SSD

● very “smooth” results

XFS

● does not outperform ext4 (in this test)

● not much worse, if properly tuned

● disable write barriers, tune alignment to SSD

● more anomalies than ext4 (sudden performance drops, ...)

Page 40: PostgreSQL on EXT4, XFS, BTRFS and ZFS

BTRFS & ZFS

[Pages 41 to 47: slides without transcribed text]

Page 48: PostgreSQL on EXT4, XFS, BTRFS and ZFS

TPC-DS

Page 49: PostgreSQL on EXT4, XFS, BTRFS and ZFS

mkfs / mount options

● ext4, xfs

– mkfs.ext4 -E stripe-width=256 /dev/sda1

– mkfs.xfs -d su=512k,sw=1 -l su=512k -f /dev/sda1

– mount: defaults,noatime,discard,nobarrier

● btrfs

– mkfs.btrfs -l 8192 -L pgdata /dev/sda1

– mount: defaults,noatime,ssd,discard,nobarrier [compress=lzo]

● zfs

– zpool create pgpool /dev/sda1

– zfs create pgpool/pgdata

– zfs set recordsize=8k pgpool/pgdata

– zfs set atime=off pgpool/pgdata
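The alignment options above are given in different units: ext4's stripe-width counts filesystem blocks, while XFS takes a stripe unit in bytes (su) multiplied by a number of units (sw). A short sketch converting both to bytes, assuming the ext4 filesystem uses 4KiB blocks (a typical default, not stated in the talk); the function names are illustrative only:

```python
KIB = 1024


def ext4_stripe_bytes(stripe_width_blocks: int, block_size: int = 4 * KIB) -> int:
    """Full stripe size in bytes for ext4: number of blocks times block size."""
    return stripe_width_blocks * block_size


def xfs_stripe_bytes(su_bytes: int, sw: int) -> int:
    """Full stripe size in bytes for XFS: stripe unit times stripe width multiplier."""
    return su_bytes * sw


if __name__ == "__main__":
    ext4 = ext4_stripe_bytes(256)         # mkfs.ext4 -E stripe-width=256
    xfs = xfs_stripe_bytes(512 * KIB, 1)  # mkfs.xfs -d su=512k,sw=1
    print(f"ext4 stripe: {ext4 // KIB} KiB")
    print(f"xfs stripe:  {xfs // KIB} KiB")
```

Under the 4KiB assumption the ext4 setting works out to 1024KiB and the XFS setting to 512KiB; either way the goal is to keep writes aligned to the SSD's internal page and block boundaries.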

Page 50: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: TPC-DS load duration on EXT4, XFS, BTRFS and ZFS. Bars split into data and indexes for ext4, xfs, btrfs, btrfs (lzo), zfs, zfs (lz4); y-axis: duration [seconds], 0 to 6000. Bar values are not preserved in the transcript.]

Page 51: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: TPC-DS query performance on EXT4, XFS, BTRFS and ZFS. Bars for ext4, xfs, btrfs, btrfs (lzo), zfs, zfs (lz4); y-axis: duration [seconds], 0 to 700. Bar values are not preserved in the transcript.]

Page 52: PostgreSQL on EXT4, XFS, BTRFS and ZFS

[Figure: TPC-DS space used on EXT4, XFS, BTRFS and ZFS. Bars for ext4, xfs, btrfs, btrfs (lzo), zfs, zfs (lz4); y-axis: size [GB], 0 to 70. Bar values are not preserved in the transcript.]

Page 53: PostgreSQL on EXT4, XFS, BTRFS and ZFS

TPC-DS summary

● EXT4, XFS, BTRFS

– about the same performance

● compression is nice

– uncompressed: 60GB

– compressed: ~30GB

● mostly storage capacity, queries not faster

● ZFS much slower :-(
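The compression figures in the summary work out to roughly a 2:1 ratio. A tiny sketch of the arithmetic, using the 60GB and ~30GB sizes from the slide (the function names are made up for illustration):

```python
def compression_ratio(uncompressed_gb: float, compressed_gb: float) -> float:
    """Ratio of uncompressed to compressed size (2.0 means half the space)."""
    return uncompressed_gb / compressed_gb


def space_saved_pct(uncompressed_gb: float, compressed_gb: float) -> float:
    """Percentage of storage saved by compression."""
    return 100.0 * (1 - compressed_gb / uncompressed_gb)


if __name__ == "__main__":
    ratio = compression_ratio(60, 30)
    saved = space_saved_pct(60, 30)
    print(f"ratio {ratio:.1f}:1, saved {saved:.0f}%")
```

Since the compressed size is only approximate on the slide, the ratio should be read as "about 2:1" rather than an exact measurement.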