October 15, 2008 HDF and HDF-EOS Workshop XII 1 What will be new in HDF5?

Jan 03, 2016


Alexia Day

What will be new in HDF5?


HDF5 Road Map


Performance

Robustness

Ease of use

Innovation


Outline

• Performance improvements
• Fortran 2003 features
• HDF5 file recovery (metadata journaling)


Performance Improvements


Performance Improvements in HDF5

• Examples of completed work:
  • New implementation of metadata cache to improve I/O performance and memory usage when accessing many objects (HDF5 1.8.0)
  • Faster, more scalable storage and access for large groups (HDF5 1.8.0)


Performance Improvements in HDF5

• Work in progress
  • New implementation of free-space management
    • Affects “dynamic” applications that add/delete/modify existing objects
    • When creating HDF5 objects, space is allocated from available space tracked by the free-space manager
    • When deleting objects, unused space is added to the free-space pool via the free-space manager
    • Current implementation uses O(N²) operations for every N allocations or frees of space
    • New implementation: O(log2N)
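As a rough illustration of why the choice of data structure matters here, the sketch below is a toy free-space manager, not the HDF5 implementation. It keeps free sections sorted by size, so finding the best fit is a binary search instead of a linear scan. The class and method names are hypothetical, and inserting into a Python list is still linear in the worst case; a balanced tree or skip list would be needed for truly logarithmic updates.

```python
import bisect


class ToyFreeSpaceManager:
    """Toy free-space pool: (size, address) pairs kept sorted by size."""

    def __init__(self):
        self._sections = []      # sorted list of (size, address)
        self._end_of_file = 0    # next never-used address

    def free(self, address, size):
        """Return an unused section to the pool."""
        bisect.insort(self._sections, (size, address))

    def allocate(self, size):
        """Best fit: use the smallest free section that is large enough."""
        # Addresses are never negative, so (size, -1) sorts before any
        # real section of the same size.
        i = bisect.bisect_left(self._sections, (size, -1))
        if i == len(self._sections):
            # Nothing fits: extend the file instead.
            address = self._end_of_file
            self._end_of_file += size
            return address
        found_size, address = self._sections.pop(i)
        if found_size > size:
            # Put the unused tail back into the pool.
            self.free(address + size, found_size - size)
        return address


fsm = ToyFreeSpaceManager()
a = fsm.allocate(100)   # pool is empty, so the file grows
b = fsm.allocate(50)    # the file grows again
fsm.free(a, 100)        # "delete" the first object
c = fsm.allocate(40)    # reuses the start of the freed section
print(a, b, c)
```

The split-on-allocate step is what lets a later, smaller object reuse space left behind by a deleted one, which is the add/delete/modify pattern the slide calls "dynamic".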


Free-space Management

• Test: creating/deleting attributes
  • Create first set of attributes
  • Delete odd-numbered attributes from the first set
  • Create second set of attributes
  • Delete all attributes from the second set


Number of attributes   Old implementation   New implementation   Improvement ratio
500                    786.5 sec            68.2 sec             11.5x
1000                   11000 sec            289 sec              38x
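The improvement ratios can be rechecked from the timings, and the same numbers show how each implementation scales when the attribute count doubles. This is only arithmetic on the four measurements reported above; the variable names are ours.

```python
# Timings in seconds, copied from the measurements above.
old = {500: 786.5, 1000: 11000.0}
new = {500: 68.2, 1000: 289.0}

for n in (500, 1000):
    print(f"{n} attributes: {old[n] / new[n]:.1f}x faster")

# Growth in run time when the number of attributes doubles.
old_growth = old[1000] / old[500]
new_growth = new[1000] / new[500]
print(f"old implementation: run time grows {old_growth:.1f}x from 500 to 1000")
print(f"new implementation: run time grows {new_growth:.1f}x from 500 to 1000")
```

With only two data points per implementation, these growth factors are indicative rather than a measured complexity.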


Performance Improvements in HDF5

• Work in progress
  • Fast data append (along slowest changing dimension)
• Future areas of interest
  • Efficient chunking cache implementation (NetCDF4)
  • Efficient handling of the variable-length data including compression (will affect sizes of NPOESS files)


Fortran 2003 features


Status of the HDF5 Fortran Library

• HDF5 Fortran library is a part of standard HDF5 distribution
• First release goes back to 1999
• Implemented as Fortran90 wrappers on top of the HDF5 C library
• Supported on Linux, Windows, Mac Intel, Solaris, VMS, clusters, etc.
• Compilers
  • Open source: gfortran, g95
  • Vendors (SUN, Intel, PGI, Absoft)
  • 32- and 64-bit versions


HDF5 Fortran Library

• Mimics HDF5 C APIs
• Fortran 90 features used
  • Modules
  • Function overloading
  • Function interfaces
  • Dynamic memory allocation
  • Optional parameters
• Many Fortran APIs are simpler than their C counterparts


Current Limitations

• Supports only “native” Fortran types such as
  • INTEGER
  • REAL
  • CHARACTER
  • DOUBLE PRECISION (obsolete Fortran feature)
• No support for INTEGER*1, INTEGER*2, INTEGER*8, INTEGER*16, REAL*8, REAL*16
  • Cannot write/read buffers of those types
• Fortran types have to match C types
  • No support for -r8 and -r16 flags (Fortran real ≠ C float)


Current Limitations

• No support for INTEGER(KIND=n) and REAL(KIND=m)
  • Integers n and m are called KIND parameters
  • Returned by the intrinsic functions
      selected_int_kind(r)
        covers all integers i with -10^r < i < 10^r
      selected_real_kind(p, r)
        p – precision
        r – decimal exponent range
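To make the range rule concrete, the following Python sketch computes the smallest two's-complement integer size, in bytes, that can hold every value i with -10^r < i < 10^r. It mimics the idea behind selected_int_kind but is not the Fortran intrinsic; the function name and the list of candidate sizes are assumptions. Actual KIND values are compiler-specific, although many compilers use the size in bytes as the KIND value.

```python
def smallest_int_bytes(r, candidate_sizes=(1, 2, 4, 8, 16)):
    """Smallest size in bytes that holds all i with -10**r < i < 10**r.

    Returns -1 when no candidate is wide enough, mirroring the way
    selected_int_kind signals that no suitable kind exists.
    """
    largest_needed = 10**r - 1
    for size in candidate_sizes:
        largest_representable = 2 ** (8 * size - 1) - 1
        if largest_needed <= largest_representable:
            return size
    return -1


for r in (2, 4, 9, 18, 38, 39):
    print(r, smallest_int_bytes(r))
```

An application that asks for nine decimal digits therefore gets a 4-byte integer under these assumptions, while ten digits already requires 8 bytes; a library restricted to the default INTEGER cannot follow such requests.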


Current Limitations

• Limited support for derived types (compare with C structures and HDF5 compound datatypes)
  • Supports derived types with “native” fields only
  • Doesn’t support complex HDF5 datatypes (e.g., with array member, or nested compound)
  • Writes/reads HDF5 compound datasets by fields only
  • Cannot be used in parallel applications
• No support for enum types
• No support for callback functions


Current Limitations - Summary

• Any application written according to the Fortran 95/2003 standard will struggle to use the HDF5 Fortran Library

• Many HDF5 features are not available to Fortran applications


Fortran 2003 Features

• Fortran 2003 provides a standardized mechanism for interoperating with C
  • Module ISO_C_BINDING for interoperability of intrinsic types
  • C_PTR and C_FUNPTR for interoperability with C pointers
  • C_LOC(x) and C_FUNLOC(x) inquiry functions for getting addresses of variables and procedures
  • BIND attribute for interoperability of derived types and C structures and enumerated types


Fortran 2003 Features and HDF5

• New 2003 features allowed us to support
  • Any Fortran INTEGER and REAL type data in HDF5 files
  • Fortran derived types and HDF5 compound datatypes
  • Fortran enumerated types and HDF5 enumerated types
  • HDF5 APIs with callbacks


Fortran Compilers with 2003 Features

Compiler        Current versions and status of HDF5                                      Future versions
Intel           Versions 10.1 and 11; all F2003 functionality works in HDF5              Version 11 will have a fix that will allow us to remove common blocks
g95             August 2008 and later; all Fortran 2003 functionality works in HDF5
gfortran        Version 4.4; limited support for C interoperability; HDF5 doesn’t work   No plans from compiler developers to improve the support
SUN compilers   Express-July 2008 build; HDF5 doesn’t work                               No timeline for fixes; may be in a year
PGI             Version 7.2.1; HDF5 doesn’t work                                         Fixes will be available in Version 8.


Example


HDF5 "SDScompound.h5" {
GROUP "/" {
   DATASET "ArrayOfStructures" {
      DATATYPE  H5T_COMPOUND {
         H5T_ARRAY { [13] H5T_STRING {
            STRSIZE 1;
            STRPAD H5T_STR_SPACEPAD;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } } "chr_name";
         H5T_STD_I8LE "a_name";
         H5T_IEEE_F64LE "c_name";
         H5T_IEEE_F32LE "b_name";
      }
      …..
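The compound datatype in this dump has four members: an array of 13 one-character strings, an 8-bit integer, a 64-bit float and a 32-bit float, all little-endian. As a sketch of what one such record looks like when packed without padding, the Python standard struct module can build and decode one. The field values are made up for illustration, and the real member offsets in the file depend on how the datatype was created, which the truncated dump does not show.

```python
import struct

# '<' = little-endian, no padding
# 13s = 13 characters, b = int8, d = float64, f = float32
RECORD = struct.Struct("<13sbdf")

# Hypothetical values; the name is space-padded to match H5T_STR_SPACEPAD.
packed = RECORD.pack(b"hello".ljust(13), 7, 2.5, 1.5)
chr_name, a_name, c_name, b_name = RECORD.unpack(packed)

print(RECORD.size)   # 13 + 1 + 8 + 4 bytes
print(chr_name.rstrip(), a_name, c_name, b_name)
```

A Fortran derived type with the BIND(C) attribute plays the same role on the Fortran side: it gives the compiler a fixed, C-compatible layout that an HDF5 compound datatype can describe as a whole.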


Information


HDF5 file recovery or Surviving a System Failure

through Metadata Journaling


Surviving a System Failure in HDF5

• Problem:
  • Data in an open HDF5 file is susceptible to corruption in the event of an application or system crash.
  • Corruption is possible if an open HDF5 file has been updated when the crash occurs.
• Initial Objective:
  • Guarantee that an HDF5 file with consistent metadata can be reconstructed in the event of a crash.
  • No guarantee on the state of raw data: it contains whatever made it to disk prior to the crash.


Crash Survivability in HDF5

• Approach: Metadata Journaling
  • When an HDF5 file is opened with Metadata Journaling enabled, a companion Journal file is created.
  • When an HDF5 API function that modifies metadata is completed, a transaction is recorded in the Journal file.
  • If the application crashes, a recovery program can replay the journal by applying in order all metadata writes until the end of the last completed transaction written to the journal file.
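The replay rule can be sketched with a toy journal. This is not the HDF5 journal file format or the h5recover algorithm; the record layout (begin, write and end tuples) and all names are hypothetical. The point is only that writes belonging to a transaction with no end record are discarded.

```python
def replay(journal, metadata):
    """Apply journaled metadata writes, in order, for completed transactions only."""
    pending = {}                      # transaction id -> buffered writes
    for record in journal:
        kind, txn = record[0], record[1]
        if kind == "begin":
            pending[txn] = []
        elif kind == "write":
            _, _, address, value = record
            pending[txn].append((address, value))
        elif kind == "end":
            # Transaction completed: its writes are safe to apply.
            for address, value in pending.pop(txn):
                metadata[address] = value
    return metadata                   # writes still in `pending` are dropped


journal = [
    ("begin", 1), ("write", 1, 0x100, "group header v2"), ("end", 1),
    ("begin", 2), ("write", 2, 0x100, "group header v3"),
    ("write", 2, 0x200, "new attribute"),
    # crash here: transaction 2 never recorded its "end"
]
recovered = replay(journal, {0x100: "group header v1"})
print(recovered)
```

The recovered metadata reflects transaction 1 only, so it is consistent even though the second API call was interrupted. Raw data is outside the journal, which matches the stated objective of guaranteeing metadata consistency only.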


HDF5 Metadata Journaling Recovery

[Diagram: after the application crashes, the h5recover tool takes the corrupted HDF5 file and its companion journal file and produces a restored HDF5 file]


Implementation Status

• Serial HDF5 with synchronous write mode
  • Alpha1 released August 2008
  • User interface (API definition and h5recover tool) and file format may change


Metadata Journaling Plans

• Serial HDF5 with synchronous write mode
  • Finalize user interface definitions and file format
• Serial HDF5 with asynchronous write mode
  • To improve Journal file write speed
• More features (need funding)
  • Make raw data operations atomic
  • Allow "super-transactions" to be created by applications
  • Enable journaling for Parallel HDF5


Questions?


Acknowledgement

• This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Awards NNX06AC83A and NNX08AO77A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.
