
www.hdfgroup.org

The HDF Group

HDF5 Overview

Elena Pourmal [email protected]

The HDF Group

1 10/17/15 ICALEPCS 2015


Outline

•  The HDF Group company
•  Products and services
•  Overview of HDF5
•  What is coming in HDF5 1.10.0 release?
•  Future directions


THE HDF GROUP COMPANY


Champaign, Illinois, USA


The HDF Group

•  Not-for-profit company (since 2006), formerly part of NCSA at the University of Illinois
•  Offices in 5 states
•  About 40 employees (more than 50% growth in the past 9 years)
   - Core software developers
   - Domain specialists
   - Documentation team
   - Technical support
•  Mission-driven


The HDF Group Mission

To ensure long-term accessibility of HDF data through sustainable development and support of HDF technologies.


The HDF Group philosophy

•  Committed to Open Source
•  HDF software is free
•  BSD type of license
•  Community involvement
   •  Testing
   •  Patches
   •  New features (e.g., CMake support)
•  Serving diverse user base
   •  Remote sensing, HPC, non-destructive testing, medical records, scientific modeling, etc.


Revenue by Source

[Pie chart, 2014. Legend: Earth science; Finance; General; HPC; Oil & gas; Particle science; National Labs; Light Sources; NASA, NOAA. Slice labels: 62%, 0%, 3%, 28%, 0%, 3%, 4%]


Revenue by Project Type

Revenues by type of project:

•  Consulting: 8%
•  Development: 24%
•  Enterprise support: 45%
•  Premium support: 1%
•  R&D: 22%
•  Training and other outreach: 0%


PRODUCTS AND SERVICES


The HDF Group products

•  Main product: HDF Technology Suite
   - For managing high-volume, complex, heterogeneous data
   - Flagship: HDF5 data store
   - Flexible and efficient storage and I/O
   - Portable
   - Highly customizable
   - Misc. tools
   - Specialized software and tools (e.g., JPSS)


HDF5 IN 5 MINUTES

Data challenges addressed by HDF5


HDF5 Technology Platform

•  HDF5 Abstract Data Model
   •  Defines the “building blocks” for data organization and specification
   •  Files, Groups, Links, Datasets, Attributes, Datatypes, Dataspaces
•  HDF5 Software
   •  Tools
   •  Language Interfaces (C, Fortran, C++, Java)
   •  HDF5 Library
•  HDF5 Binary File Format
   •  Bit-level organization of HDF5 file
   •  Defined by HDF5 File Format Specification
•  HDF5 Ecosystem
   •  Tools and services (h5py, MATLAB, IDL, OPeNDAP, etc.)
   •  Communities (Earth Sciences, medical imaging, modeling and visualization)
   •  Community standards (NeXus, HDF-EOS5, h5part, CGNS)
   •  Institutional support and endorsement (NASA, NOAA, DOE)


Members of the HDF community


Success stories

•  Petabytes of NASA remote sensing data in HDF4 and HDF5 file formats
•  New NASA/JPSS missions chose HDF5 format for data archiving

Need to organize complex collections of data
Long-term data preservation
Efficient, scalable storage and access

lat | lon | temp
----|-----|-----
 12 |  23 |  3.1
 15 |  24 |  4.2
 17 |  21 |  3.6


Success story: Trillion Particle Simulation

•  Plasma physics simulation on the Cray XE6 at NERSC
•  Simulation ran on 120,000 cores using
   - 80% of computing resources
   - 90% of available memory
   - 50% of Lustre scratch system
•  Wrote 10 one-trillion-particle dumps of 30-42 TB each in HDF5 files; sustained ~27 GB/sec; 350 TB total in HDF5


The HDF Group services

•  Helpdesk and mailing lists
   - [email protected]
   - [email protected]
   - Open to all users of HDF
•  HDF5 Documentation: https://www.hdfgroup.org/HDF5/doc/index.html
•  HDF Examples (C, Fortran, C++, Java, Python, MATLAB): https://www.hdfgroup.org/HDF5/examples/


The HDF Group services

•  Standard support
   •  Assistance in general areas of HDF usage
•  Premium support
   •  Access to our consulting and training resources
   •  Limited consulting hours are included
•  Enterprise support
   •  Help with developing common strategies for managing HDF data within organization
   •  Organization shares consulting/troubleshooting services
•  Training
•  Consulting, custom development and support


HDF5 1.10.0 RELEASE

New Upcoming Features


PERSISTENT FILE FREE SPACE TRACKING

Reusing free file space in a file


Unused space in HDF5 file

•  HDF5 library currently only tracks free space while file is open
   •  Space from deleted objects
   •  Space from resized compressed chunks
•  Free space in the file is “lost” after file is closed
   •  h5repack is used to remove “holes” in the file
•  New function H5Pset_file_space
   •  Sets a property to track free space in the file that can be reused when file is reopened
   •  Allows fine tuning space tracking
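
The idea of remembering holes so that later allocations can reuse them can be sketched in plain C. This is only a conceptual illustration, not the HDF5 free-space manager: the `hole_t` and `file_space_t` types and the `fs_free` and `fs_alloc` functions are hypothetical names, and the first-fit policy is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a file with a small list of free "holes". */
#define MAX_HOLES 16

typedef struct {
    unsigned long offset; /* start of the hole in the file */
    unsigned long size;   /* length of the hole in bytes   */
} hole_t;

typedef struct {
    hole_t        holes[MAX_HOLES];
    size_t        nholes;
    unsigned long eof;    /* current end of file */
} file_space_t;

/* Record a freed region, e.g. the space of a deleted object. */
static void fs_free(file_space_t *fs, unsigned long offset, unsigned long size)
{
    assert(fs->nholes < MAX_HOLES);
    fs->holes[fs->nholes].offset = offset;
    fs->holes[fs->nholes].size   = size;
    fs->nholes++;
}

/* First-fit allocation: reuse a hole when one is large enough,
 * otherwise extend the file. Returns the offset of the new block. */
static unsigned long fs_alloc(file_space_t *fs, unsigned long size)
{
    for (size_t i = 0; i < fs->nholes; i++) {
        if (fs->holes[i].size >= size) {
            unsigned long offset = fs->holes[i].offset;
            fs->holes[i].offset += size; /* shrink the hole from the front */
            fs->holes[i].size   -= size;
            return offset;
        }
    }
    unsigned long offset = fs->eof;      /* no hole fits: grow the file */
    fs->eof += size;
    return offset;
}
```

If the hole list is discarded when the file is closed, every later allocation falls through to the end-of-file branch and the file only grows; keeping the list across close and reopen is what lets the holes be reused.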


SCALABLE CHUNK INDEXING

Improving performance and saving space


Optimizing chunking storage and performance

•  HDF5 can add more data to existing datasets (data arrays)
   •  Special storage mechanism – chunked storage
   •  B-trees are used to index chunks in the file
   •  O(log n) lookup time
•  HDF5 takes advantage of the access pattern and properties of the datasets
   •  O(1) lookup time
   •  File space savings when storing HDF5 metadata


Optimizing chunking storage and performance

•  B-tree implementation was reworked to use less space in the file
   •  Used for datasets with more than one unlimited dimension
•  New indexing structures were introduced to achieve O(1) performance and storage savings in special cases


Optimizing chunking storage and performance

•  Examples of O(1) lookup access:
   •  Fixed-size chunked dataset with no compression filters
      •  Algorithmic lookup
   •  Fixed-size chunked dataset with compression filters
      •  Array to index chunks
   •  Fixed-size dataset stored in one chunk (i.e., we now allow compression for contiguous dataset)
      •  No index
   •  Dataset with one unlimited dimension
      •  Extensible array to index chunks
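
To see why the first case needs no stored index, here is a minimal sketch of an algorithmic lookup. It assumes that all chunks have the same byte size (no filters) and are laid out back to back in row-major chunk order starting at a known base address. That layout, and the names `chunk_linear_index` and `chunk_address`, are illustrative assumptions rather than the library's actual internals.

```c
#include <assert.h>
#include <stddef.h>

/* Linear index (row-major over chunks) of the chunk holding element `coord`.
 * dims:       dataset extent in each dimension
 * chunk_dims: chunk extent in each dimension */
static size_t chunk_linear_index(size_t rank, const size_t *dims,
                                 const size_t *chunk_dims, const size_t *coord)
{
    size_t index = 0;
    for (size_t d = 0; d < rank; d++) {
        assert(chunk_dims[d] > 0 && coord[d] < dims[d]);
        /* chunks along dimension d, rounding up for partial edge chunks */
        size_t nchunks = (dims[d] + chunk_dims[d] - 1) / chunk_dims[d];
        index = index * nchunks + coord[d] / chunk_dims[d];
    }
    return index;
}

/* With equally sized chunks stored contiguously, the address is pure
 * arithmetic: no tree or array has to be consulted. */
static unsigned long long chunk_address(unsigned long long base,
                                        size_t chunk_bytes, size_t linear_index)
{
    return base + (unsigned long long)chunk_bytes * linear_index;
}
```

For a 100 x 60 dataset with 10 x 20 chunks, element (25, 45) falls in chunk (2, 2), which is linear index 2 * 3 + 2 = 8. Once compression is applied, chunk sizes differ, so the address can no longer be computed and an array of chunk addresses is used instead.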


CONCURRENCY: SINGLE-WRITER/MULTIPLE-READER


Concurrent Access to Data

[Diagram: a Writer and a Reader accessing the same HDF5 file]

New data elements are added to a dataset in the file, which can be read by a reader, with no IPC necessary.


VIRTUAL DATASET (VDS)

Managing data stored across HDF5 files


VDS Use Case with NPP satellite data

4 granules in 9 GMODO-SVM07… files

[Figure: visualization with IDV]


VDS Use Case with NPP satellite data

One virtual dataset with 36 granules stored in one file

[Figure: visualization with IDV]


VDS use case: Percival detector

[Diagram: four writers store a series of images in Dataset A, Dataset B, Dataset C and Dataset D, held in files a.h5, b.h5, c.h5 and d.h5. Images arrive at times t1, t2, t3, t4, …, t1+4k, …, t3+4k. A reader accesses them through VDS.h5.]

Virtual Dataset VDS has images A, B, C and D interleaved
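
The interleaving in this use case amounts to a simple index mapping, sketched below under the assumption that images are numbered from 0 and distributed round-robin over the four source datasets (A, B, C, D, A, …). The type `source_frame_t` and the functions `vds_to_source` and `source_to_vds` are hypothetical names for illustration and are not part of the HDF5 API.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SOURCES 4 /* datasets A, B, C and D */

typedef struct {
    size_t source; /* 0 = A, 1 = B, 2 = C, 3 = D          */
    size_t frame;  /* image index inside that source file */
} source_frame_t;

/* Map an image index in the virtual dataset to the source dataset
 * that holds it and to its position within that source. */
static source_frame_t vds_to_source(size_t vds_frame)
{
    source_frame_t sf;
    sf.source = vds_frame % NUM_SOURCES;
    sf.frame  = vds_frame / NUM_SOURCES;
    return sf;
}

/* Inverse mapping: where does image `frame` of `source` appear in the VDS? */
static size_t source_to_vds(size_t source, size_t frame)
{
    assert(source < NUM_SOURCES);
    return frame * NUM_SOURCES + source;
}
```

A reader that opens the virtual dataset sees one continuous image series and never performs this arithmetic itself; the mapping is stored with the virtual dataset when it is created.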


VDS: Conceptual View


METADATA CACHE IMAGE

Performance boost when opening and closing HDF5 files


Problem: Metadata Cache Image

•  HDF5 metadata is typically small and scattered throughout the file.

•  The resulting many small I/Os are a major problem for parallel file systems.

•  The metadata cache minimizes this during normal operation, but the cache must still be populated on file open and flushed on file close.

•  This is a problem if files are opened and closed often.


Solution: Metadata Cache Image

•  Store the contents of the metadata cache in a single block at file close, and then populate the cache with the stored entries on file open.

•  If the access pattern is similar across close and reopen, this should save a significant number of small I/O operations.

•  This solution is implemented in the metadata cache image feature.


Metadata Cache Image

•  To enable, set cache image FAPL property on file create or open:

    /* Request that a cache image be written when the file is closed. */
    H5AC_cache_image_config_t cache_image_config =
        {H5AC__CURR_CACHE_IMAGE_CONFIG_VERSION, TRUE, 0};

    fapl_id = H5Pcreate(H5P_FILE_ACCESS);
    /* Use the latest file format. */
    H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
    H5Pset_mdc_image_config(fapl_id, &cache_image_config);

•  Then create or open file as usual.


Metadata Cache Image

•  Metadata cache image is read and deleted automatically on file open.

•  The cache image FAPL property must be set again if a new cache image is desired on file close.

•  Earlier versions of HDF5 that don't understand the cache image will refuse to open the file.

•  A lightweight utility can remove the cache image, making the file compatible with 1.8.

•  Prototype implementation showed an order of magnitude speedup on parallel systems.


DATA AGGREGATION AND PAGE BUFFERING

Performance improvements


Page buffering/ Data aggregation


Aggregate and align metadata and small data, perform I/O in aligned pages


Data and Metadata Aggregators

The new aggregators pack small raw data and metadata allocations into aligned blocks which work with the page buffer.

[Diagram: small allocations of metadata and data packed into aligned blocks in the HDF5 file]


HDF5 Page Buffering

[Diagram: page buffer in front of the HDF5 file]

Page buffer contains MD pages (L2 cache)

Metadata blocks are multiples of 64K

Metadata blocks are aligned
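
The arithmetic behind aligned, page-sized I/O can be sketched as follows. The 64K page size comes from the slide; the constant `PAGE_BYTES` and the functions `align_up` and `pages_spanned` are hypothetical helpers for illustration, not HDF5 functions.

```c
#include <assert.h>

#define PAGE_BYTES 65536ULL /* 64K pages, as on the slide */

/* Round a size or offset up to the next multiple of the page size. */
static unsigned long long align_up(unsigned long long n)
{
    return (n + PAGE_BYTES - 1) / PAGE_BYTES * PAGE_BYTES;
}

/* Number of whole pages that must be transferred to serve the
 * byte range [offset, offset + size). */
static unsigned long long pages_spanned(unsigned long long offset,
                                        unsigned long long size)
{
    assert(size > 0);
    unsigned long long first = offset / PAGE_BYTES;
    unsigned long long last  = (offset + size - 1) / PAGE_BYTES;
    return last - first + 1;
}
```

Because blocks are aligned and sized in whole pages, a small metadata access that stays inside one block maps to a single page, which the page buffer can then serve from memory on later accesses.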


IMPROVEMENTS FOR PARALLEL ACCESS

HDF5 Parallel


Problems We Solved for PHDF5

•  Slowness on opening and closing HDF5 files
   •  Metadata Cache Optimizations
      - Avoiding the Metadata Read Storm
      - Collective Metadata Writes
   •  Avoid Truncate Feature
•  Writing/reading multiple variables
   •  Collective I/O on multiple datasets or Multi-Dataset I/O
•  I/O on selections bigger than 2GB with MPICH 3.1.4
•  Page Buffering
   •  Page Buffering - a layer under the VFD to capture small I/Os and cache them for larger paged size I/Os.


Metadata reads with CGNS and netCDF-4

[Plots: CGNS reads on Blue Gene, GPFS; netCDF-4 reads on Cray XE6, GPFS]


Collective I/O on multiple datasets

•  Two new routines H5Dread_multi() and H5Dwrite_multi()

The plot shows the performance difference between calling H5Dwrite() multiple times and calling H5Dwrite_multi() once on 30 chunked datasets on a Cray XE6 with the Lustre file system (Hopper).


BACKWARD/FORWARD COMPATIBILITY ISSUES

HDF5 1.10.0


Backward/Forward compatibility issues

•  HDF5 1.10.0 will always read files created by earlier versions

•  HDF5 1.10.0 by default will create files that can be read by HDF5 1.8.*

•  HDF5 1.10.0 will create files incompatible with version 1.8 if new features are used

•  Tools to “downgrade” the file created by HDF5 1.10.0
   •  h5format_convert (SWMR files; doesn’t rewrite raw data)
   •  h5repack (VDS, SWMR and other; does rewrite data)


EXPLORING NEW DIRECTIONS

Examples


HDF5 ODBC Driver

•  Open DataBase Connectivity (ODBC)

•  Industry standard middleware API for accessing database management systems

•  All analytics apps have an ODBC client

•  HiFive – ODBC driver for HDF5
   •  Windows, [Linux, MacOS X]
   •  Client & Client/Server
   •  Accessing HDF5 files from Excel & R

Thanks to Gerd Heber, THG


HDF5 for the Web

•  Can I access HDF5 files remotely?
•  API? My (mobile) client speaks HTTP!
•  What is a file system? Who uses files anymore?
•  Cloud computing w/ HDF5

Thanks to John Readey, THG


Emerging Trends in Exascale I/O

•  Characteristics of Exascale Application I/O
   •  Application I/O will be object-oriented, not file-based
   •  Application I/O will be asynchronous
   •  Applications responsible for managing I/O conflicts
   •  Applications use transactional I/O model
•  The HDF Group has been working with Intel and others on the Fast Forward Project to investigate and contribute to those trends


HDF5 role in the Fast Forward Storage Stack

•  Object storage
   •  Virtual Object Layer (VOL)
•  Data Integrity/Fault Tolerance
   •  Transaction
   •  End-to-end checksums
•  Data Analysis Extensions
   •  Query/View/Index APIs
   •  Analysis Shipping
•  AIO (some prototyping was done in the past)


HDF5 as an interface to non-HDF5 storage

https://wiki.hpdd.intel.com/display/PUB/Fast+Forward+Storage+and+IO+Program+Documents


HDF5 as an interface to non-HDF5 storage


•  Different File Formats plugins:


DATA INDEXING Features we are investigating


Indexing and HDF5

•  New APIs for indexing and querying of both structure and contents of HDF5 file
•  H5Q API defines query to apply to a file
   •  Create/combine queries (OR, AND)
   •  Basic operators supported (≤, ≥, =, ≠) on either dataset/attribute values, link/attribute names
•  HDF5V API retrieves data
•  HDF5X API adds third-party indexing plugins
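
The notion of building simple comparison queries and combining them with AND/OR can be illustrated with a small stand-alone sketch. This is not the H5Q API: the `query_t` type, the `Q_*` operator names and the functions `query_matches` and `query_count` are hypothetical, and the sketch only evaluates queries against an in-memory array of values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical query node: either a comparison against a value,
 * or an AND/OR combination of two other queries. */
typedef enum { Q_LE, Q_GE, Q_EQ, Q_NE, Q_AND, Q_OR } query_op_t;

typedef struct query {
    query_op_t          op;
    double              value; /* used by comparison nodes  */
    const struct query *left;  /* used by AND/OR nodes      */
    const struct query *right; /* used by AND/OR nodes      */
} query_t;

/* Evaluate a (possibly combined) query against one data value. */
static int query_matches(const query_t *q, double x)
{
    assert(q != NULL);
    switch (q->op) {
    case Q_LE:  return x <= q->value;
    case Q_GE:  return x >= q->value;
    case Q_EQ:  return x == q->value;
    case Q_NE:  return x != q->value;
    case Q_AND: return query_matches(q->left, x) && query_matches(q->right, x);
    case Q_OR:  return query_matches(q->left, x) || query_matches(q->right, x);
    }
    return 0;
}

/* Count the elements of `data` selected by the query. */
static size_t query_count(const query_t *q, const double *data, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (size_t)query_matches(q, data[i]);
    return hits;
}
```

A combined query such as "value ≥ 3.5 AND value ≤ 4.2" is then a `Q_AND` node whose children are a `Q_GE` node and a `Q_LE` node. An indexing plugin would aim to answer the same question without scanning every element, which this linear sketch does not attempt.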


Example: Combined query


The HDF Group

Thank You!

Questions?
