Using Compression filters in HDF5 · Eugen Wintersberger · ICALEPCS 2017, 8.10.2017 · HDF5's new external filter interface in action

Using Compression filters in HDF5

Eugen Wintersberger

ICALEPCS 2017, 8.10.2017

HDF5's new external filter interface in action

Eugen Wintersberger | Using compression filters | 8.10.2017 | Page 2

Motivation

Applying different compression algorithms to individual datasets is one of the key features of HDF5.

➔ Apply compression only where feasible

➔ Other data can be read and written without any performance penalty

➔ We can pick the optimum algorithm for each dataset

Performance key figures for a compression algorithm:

➔ Throughput (MByte/s)

➔ Compression ratio

Both depend on the nature of the data passed to the algorithm!
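As a rough illustration of how both key figures depend on the data, the following minimal sketch measures throughput and compression ratio for two synthetic buffers. It is not taken from the talk: zlib and bz2 from Python's standard library stand in for the filters discussed here, and the names `benchmark`, `samples`, `codecs` and `results` are hypothetical.

```python
import bz2
import os
import time
import zlib


def benchmark(compress, data, repeats=3):
    """Return (throughput in MByte/s, compression ratio) for one codec."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        packed = compress(data)
        best = min(best, time.perf_counter() - start)
    throughput = len(data) / 1e6 / best   # uncompressed MByte per second
    ratio = len(data) / len(packed)       # > 1 means the data got smaller
    return throughput, ratio


# Two synthetic 1 MByte buffers (assumption: stand-ins for real detector data).
samples = {
    "constant": bytes(1_000_000),      # all zeros, highly compressible
    "random": os.urandom(1_000_000),   # essentially incompressible
}
codecs = {"zlib": zlib.compress, "bz2": bz2.compress}

results = {}
for data_name, data in samples.items():
    for codec_name, compress in codecs.items():
        results[(data_name, codec_name)] = benchmark(compress, data)

for (data_name, codec_name), (mbps, ratio) in results.items():
    print(f"{data_name:>8} {codec_name:>4}: {mbps:8.1f} MByte/s, ratio {ratio:7.1f}")
```

The same codec gives very different ratios on the two buffers, which is why picking the algorithm per dataset matters.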

The situation before HDF5 1.8.11

Two issues

➔ Need to change source code

➔ Not possible for commercial applications!

#define H5Z_FILTER_BZIP2 305

/* declare a filter function */
size_t H5Z_filter_bzip2(unsigned flags, size_t cd_nelmts,
                        const unsigned cd_values[], size_t nbytes,
                        size_t *buf_size, void **buf);

const H5Z_class2_t H5Z_BZIP2[1] = {{
    H5Z_CLASS_T_VERS,               /* H5Z_class_t version */
    (H5Z_filter_t)H5Z_FILTER_BZIP2, /* Filter id number */
    1,                              /* encoder_present flag (set to true) */
    1,                              /* decoder_present flag (set to true) */
    "bzip2",                        /* Filter name for debugging */
    NULL,                           /* The "can apply" callback */
    NULL,                           /* The "set local" callback */
    (H5Z_func_t)H5Z_filter_bzip2,   /* The actual filter function */
}};

/* somewhere in the code */
status = H5Zregister(H5Z_BZIP2);

Currently used

➔ Eiger detector

➔ PyTables

➔ h5py

Could use custom filter algorithms for reading and writing

New approach since HDF5 1.8.11

[Diagram: Application -> HDF5 library -> filter plugins (libLZ4.so, libbitshuffle.so, libBZ2.so), located via HDF5_PLUGIN_PATH=... and selected by FilterID]

The library looks for the appropriate filter by itself using the ID of the filter!
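To check what the library could find on a given machine, the search directories can be inspected by hand. The sketch below is an assumption-laden helper, not part of HDF5 or of the talk: the function names are hypothetical, the default directory is the one documented for Unix-like systems, and matching a filter ID to a library still happens inside the HDF5 library itself.

```python
import os
import sys

# Default search directory documented for HDF5 on Unix-like systems;
# used here only when HDF5_PLUGIN_PATH is not set.
DEFAULT_PLUGIN_DIR = "/usr/local/hdf5/lib/plugin"


def plugin_search_dirs(environ=os.environ):
    """Return the directories that would be searched for filter plugins."""
    raw = environ.get("HDF5_PLUGIN_PATH", "")
    separator = ";" if sys.platform == "win32" else ":"
    dirs = [d for d in raw.split(separator) if d]
    return dirs or [DEFAULT_PLUGIN_DIR]


def list_plugin_candidates(environ=os.environ):
    """List shared libraries found in the plugin search directories."""
    suffixes = (".so", ".dylib", ".dll")
    found = []
    for directory in plugin_search_dirs(environ):
        if not os.path.isdir(directory):
            continue  # silently skip directories that do not exist
        for name in sorted(os.listdir(directory)):
            if name.endswith(suffixes):
                found.append(os.path.join(directory, name))
    return found


if __name__ == "__main__":
    for path in list_plugin_candidates():
        print(path)
```

An empty listing usually means HDF5_PLUGIN_PATH does not point at the directory where the plugins were installed.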

Where to get the filter plugins?

Supported platforms

➔ Windows

➔ Linux

➔ macOS

Installing the filters – on Windows

Install the filters – on Linux (Debian)

Add repository key and sources list

$ wget -q -O - http://repos.pni-hdri.de/debian_repo.pub.gpg | apt-key add -

$ cd /etc/apt/sources.list.d

$ wget http://repos.pni-hdri.de/jessie-pni-hdri.list

Install the package

$ apt-get update

$ apt-get install hdf5-plugin-lz4

Install the filters – on Linux (Ubuntu)

Supported versions

➔ Ubuntu 14.04 (Trusty Tahr)

➔ Ubuntu 16.04 (Xenial Xerus)

Install the filters – on macOS

Installing the dependencies

$ brew install cmake
$ brew install git
$ brew install hdf5
$ brew install lz4

Build the code

$ git clone https://github.com/nexusformat/HDF5-External-Filter-Plugins.git
$ cd HDF5-External-Filter-Plugins
$ git checkout new_build
$ cmake -DENABLE_LZ4_PLUGIN=ON -DENABLE_BITSHUFFLE_PLUGIN=ON \
    -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local/opt/hdf5
$ make
$ make test
$ make install

Make installation available

Using the filter plugins (from Python)

> Reading – there is nothing you have to do

> Writing

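import h5py

f = h5py.File("bitshuffle_file.h5", "w")
filter_id = 32008  # bitshuffle filter
# options: block size 0 (filter default), 2 = compress with LZ4
d1 = f.create_dataset("with_lz4", (100, 100), compression=filter_id, compression_opts=(0, 2))
# no options: bitshuffle only, no LZ4 compression
d2 = f.create_dataset("without_lz4", (100, 100), compression=filter_id)
f.close()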

➔ No additional packages need to be imported

➔ You need to know

The filter's ID

The compression options accepted by the filter
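Because the ID and the options are bare numbers, it can help to keep them in one place and to check that the plugin is visible before writing. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the values 32008 and (0, 2) are the ones used in the example above, and `h5py.h5z.filter_avail` is h5py's low-level availability check.

```python
BITSHUFFLE_ID = 32008      # filter ID from the example above
BITSHUFFLE_OPTS = (0, 2)   # options from the example above (with LZ4)


def bitshuffle_kwargs(use_lz4=True):
    """Build create_dataset keyword arguments for the bitshuffle filter."""
    kwargs = {"compression": BITSHUFFLE_ID}
    if use_lz4:
        kwargs["compression_opts"] = BITSHUFFLE_OPTS
    return kwargs


def filter_available(filter_id=BITSHUFFLE_ID):
    """Ask HDF5 (through h5py) whether it can find the filter plugin."""
    import h5py  # imported lazily so the helper above works without h5py

    return bool(h5py.h5z.filter_avail(filter_id))


def write_example(path="bitshuffle_file.h5"):
    """Write one compressed dataset, failing early if the plugin is missing."""
    import h5py

    if not filter_available():
        raise RuntimeError("bitshuffle plugin not found; check HDF5_PLUGIN_PATH")
    with h5py.File(path, "w") as f:
        f.create_dataset("with_lz4", (100, 100), **bitshuffle_kwargs())
```

Calling `write_example()` requires h5py and an installed plugin; `bitshuffle_kwargs()` alone has no dependencies.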

Current status

➔ Included filters:

BZIP2

LZ4

LZ4+bitshuffle

➔ Installation packages for:

Windows (VS2015),

Linux (Debian, Ubuntu)

➔ Simplified build for Windows using Conan

Todos

➔ Create GitHub pages

➔ Update the documentation

➔ Review of the LZ4 API calls for the new LZ4 1.4 version

➔ BLOSC filter is still missing

➔ Installation packages for

macOS

RPM based Linux distributions (RedHat, CentOS, …)

Update Debian packages

Thank you for your attention!

Questions?