© 2006 The MathWorks, Inc. Implementing HDF5 in MATLAB Jeff Mather & Alec Rogers The MathWorks, Inc. 29 November 2006

Implementing HDF5 in MATLAB

Dec 18, 2014

The MathWorks introduced MATLAB support for HDF5 in 2002 via three high-level functions: HDF5INFO, HDF5READ, and HDF5WRITE. These functions worked well for their purpose (providing simple interfaces to a complicated file format), but MATLAB users requested finer control over their HDF5 files and the HDF5 library. MATLAB 7.3 (R2006b) adds this finer level of control for version 1.6.5 of the HDF5 library via a close mapping of the HDF5 C API to MATLAB function calls.

This presentation will briefly introduce the earlier, high-level HDF5 interface (and its limitations) before showing in detail the low-level HDF5 functions. It will show how to interact with the HDF5 library and files using the thirteen classes of functions in MATLAB, which encapsulate groupings of functionality found in the HDF5 C API. But because MATLAB is itself a higher-level language than C, we will also present MATLAB's extensions and modifications of the HDF5 C API that make it more MATLAB-like, work with defined values, and perform ID and memory management.

Wrapping a library like HDF5 requires a great deal of effort and design, and we will briefly present a general-purpose mechanism for creating close mappings between library interfaces and an application like MATLAB. One of our goals in this presentation is to facilitate communication with The HDF Group about how The MathWorks builds our HDF5 interfaces in order to ease adoption of future versions of the HDF5 library in large, general-purpose applications.
Transcript
Page 1: Implementing HDF5 in MATLAB

© 2006 The MathWorks, Inc.

Implementing HDF5 in MATLAB

Jeff Mather & Alec Rogers

The MathWorks, Inc.

29 November 2006

Page 2: Implementing HDF5 in MATLAB

HDF4

1-1 mapping of C API first. (1998)

Customer requests for high-level functions.

HDFREAD, HDFWRITE, HDFINFO. (2000)

Page 3: Implementing HDF5 in MATLAB

HDF5

High-level first. (2003)

Customer requests for lower-level functionality.

1-1 mapping of C API. (2006)

Page 4: Implementing HDF5 in MATLAB

The World of HDF Applications

[Diagram. Labels: HDF4 / HDF5 APIs; API Supported by MATLAB; High-level access functions; Customer application (four instances).]

Page 5: Implementing HDF5 in MATLAB

HDF5READ

DATA = HDF5READ(FILENAME,DATASETNAME) returns in the variable DATA all data from the file FILENAME for the data set named DATASETNAME.

DATA has to be extremely general because of the wide variety of datatypes that HDF5 accommodates.

Simple access only:
● No subsetting.
● Limited datatype control.

More control needed to match the uniqueness of customer datasets and files.

Page 6: Implementing HDF5 in MATLAB

HDF5INFO

FILEINFO = HDF5INFO(FILENAME) returns a structure whose fields contain information about the contents of an HDF5 file. FILENAME is a string that specifies the name of the HDF5 file.

Page 7: Implementing HDF5 in MATLAB

HDF5WRITE

HDF5WRITE(FILENAME, LOCATION, DATASET) adds the data in DATASET to the HDF5 file named FILENAME. LOCATION defines where to write the DATASET in the file and resembles a Unix-style path. The data in DATASET is mapped to HDF5 datatypes using the rules below. . . .

HDF5WRITE is completely symmetric with HDF5READ.

The values in DATASET are cumbersome for non-native MATLAB types (e.g., arrays, compound, and references).

Objects disambiguate datatypes.

Page 8: Implementing HDF5 in MATLAB

Customer HDF5 Requests

● Library upgrades (1.4.5, 1.6.4, 1.6.5, 1.8)
● Better support for large data
● Hyperselection, chunking
● New platform support (Solaris 64, MacIntel)
● GZIP, SZIP compression
● HDF5 file interrogation
● Bitfield, date/time datatypes
● Data translators: HDF5 --> MATLAB

Page 9: Implementing HDF5 in MATLAB

Use Cases

Read parts of an HDF5 dataset (a hyperslab).
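
To make the hyperslab idea concrete, here is a minimal C++ sketch of how a start/stride/count/block selection picks elements out of a two-dimensional, row-major array. It does not call the HDF5 library: SlabDim, slab_indices, and read_hyperslab are hypothetical names invented for this illustration, and the four parameters only mirror the ones that the HDF5 C function H5Sselect_hyperslab accepts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HDF5 or MATLAB): describes the selection
// along one dimension the way a hyperslab does.
struct SlabDim {
    std::size_t start;   // index of the first selected element
    std::size_t stride;  // distance between the starts of consecutive blocks
    std::size_t count;   // number of blocks
    std::size_t block;   // number of elements in each block
};

// Expand one dimension's selection into explicit element indices.
std::vector<std::size_t> slab_indices(const SlabDim& d, std::size_t extent) {
    std::vector<std::size_t> idx;
    for (std::size_t i = 0; i < d.count; ++i) {
        for (std::size_t j = 0; j < d.block; ++j) {
            const std::size_t k = d.start + i * d.stride + j;
            assert(k < extent && "selection exceeds the dataset extent");
            idx.push_back(k);
        }
    }
    return idx;
}

// Copy the selected part of a row-major rows x cols array into a dense buffer.
std::vector<double> read_hyperslab(const std::vector<double>& data,
                                   std::size_t rows, std::size_t cols,
                                   const SlabDim& row_sel,
                                   const SlabDim& col_sel) {
    assert(data.size() == rows * cols);
    const std::vector<std::size_t> r = slab_indices(row_sel, rows);
    const std::vector<std::size_t> c = slab_indices(col_sel, cols);
    std::vector<double> out;
    out.reserve(r.size() * c.size());
    for (std::size_t ri : r) {
        for (std::size_t ci : c) {
            out.push_back(data[ri * cols + ci]);
        }
    }
    return out;
}
```

For example, on a 4 x 5 array a row selection of SlabDim{1, 2, 2, 1} picks rows 1 and 3, and a column selection of SlabDim{0, 1, 1, 3} picks columns 0 through 2, so read_hyperslab returns six values without touching the rest of the array.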

Page 10: Implementing HDF5 in MATLAB

Use Cases

Read complicated datatypes without the overhead of MATLAB objects for datasets.

mydata(1).Data(1).Data(1)

Page 11: Implementing HDF5 in MATLAB

Use Cases

Allow users to extend our HDF5 functionality without waiting for us.

Page 12: Implementing HDF5 in MATLAB

Use Cases

Be able to drop in new versions of the HDF5 library when they become available.

HDF5 1.8

Page 13: Implementing HDF5 in MATLAB

Use Cases

Use a variety of esoteric HDF5 features at once:

“I'm trying to use HDF5 files [with] grouping features like compound data types, group links, and reference data types.”

Page 14: Implementing HDF5 in MATLAB

Schedule

First draft specification: Sept. 2005

Feature complete: May 2006

Final specification: Feb. 2006

Internal design reviews

March 2006

Iterative development

Page 15: Implementing HDF5 in MATLAB

MATLAB is not C

MATLAB: [out1, out2] = function(in1, in2);

C: status = function(in1, in2, &out1, &out2);

Hmm . . . I might throw an error.
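
The difference between the two calling conventions can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the code MATLAB uses: c_style_divide is a made-up stand-in for a C library routine that reports a status and writes its results through pointers, and matlab_style_divide is a hypothetical wrapper that returns the outputs as values and turns a negative status into an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical C-style routine: results come back through pointers, and the
// return value only reports success (>= 0) or failure (< 0).
int c_style_divide(int num, int den, int* quotient, int* remainder) {
    assert(quotient != nullptr && remainder != nullptr);
    if (den == 0) {
        return -1;  // failure is signalled by the status code
    }
    *quotient = num / den;
    *remainder = num % den;
    return 0;
}

// MATLAB-style wrapper: the outputs are returned as values, and a negative
// status becomes an exception instead of a code the caller has to check.
std::pair<int, int> matlab_style_divide(int num, int den) {
    int q = 0;
    int r = 0;
    const int status = c_style_divide(num, den, &q, &r);
    if (status < 0) {
        throw std::runtime_error("c_style_divide failed");
    }
    return std::make_pair(q, r);
}
```

With this shape the caller never sees the status value: a call either yields both outputs or raises an error, which matches the "[out1, out2] = function(in1, in2)" form on the slide.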

Page 16: Implementing HDF5 in MATLAB

MATLAB is not C

mxArray

void * p_real
void * p_complex
size_t dims[]
size_t ndims
mxCLASS_ID type

...

Page 17: Implementing HDF5 in MATLAB

The Interface

C API: H5Xfcn

MATLAB API: H5X.fcn

X = A, D, F, G, S, T, ... (× 12), plus ML

hid_t identifiers

Identifier objects

Exceptions
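
One plausible reading of "identifier objects" is an object that owns a raw identifier and releases it automatically. The C++ sketch below shows that idea only; it is not MATLAB's implementation. fake_id_t, fake_open, and fake_close are hypothetical stand-ins for hid_t and a pair of open/close library calls, and g_open_count exists purely so that the effect can be observed.

```cpp
#include <cassert>
#include <utility>

using fake_id_t = long;       // hypothetical stand-in for hid_t

static int g_open_count = 0;  // number of identifiers the fake library has open

// Hypothetical open/close pair standing in for real library calls.
fake_id_t fake_open() { ++g_open_count; return 100 + g_open_count; }
int fake_close(fake_id_t) { --g_open_count; return 0; }

// Owns one identifier and closes it when the object is destroyed.
class Identifier {
public:
    explicit Identifier(fake_id_t id) : id_(id) { assert(id >= 0); }
    ~Identifier() { reset(); }

    // Copying is disabled so that an identifier is never closed twice.
    Identifier(const Identifier&) = delete;
    Identifier& operator=(const Identifier&) = delete;

    // Moving transfers ownership and leaves the source empty (-1).
    Identifier(Identifier&& other) noexcept : id_(other.id_) { other.id_ = -1; }
    Identifier& operator=(Identifier&& other) noexcept {
        if (this != &other) {
            reset();
            id_ = other.id_;
            other.id_ = -1;
        }
        return *this;
    }

    fake_id_t get() const { return id_; }

private:
    void reset() {
        if (id_ >= 0) {
            fake_close(id_);
            id_ = -1;
        }
    }
    fake_id_t id_;
};
```

The design choice is that user code never has to remember a matching close call: when the last owner goes out of scope, the identifier is released.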

Page 18: Implementing HDF5 in MATLAB

Special MATLAB Functions

H5ML.compare_values

H5ML.get_constant_names

H5ML.get_constant_value

H5ML.get_function_names

H5ML.get_mem_datatype

H5ML.hoffset

H5ML.sizeof

Page 19: Implementing HDF5 in MATLAB

Library Model

HDF5 Library

Library Constants

Library Procedures

Procedures Parameters

Datatype Conversions

Lefthand to righthand mapping

Page 20: Implementing HDF5 in MATLAB

Implementing the HDF5 Library

Step 1: Determine auto vs. manual conversion
Step 2: Convert .h to .xml
Step 3: Convert XML to C++
Step 4: Code manual functions
Step 5: Integrate
Step 6: Test

Page 21: Implementing HDF5 in MATLAB

The conversion process

Page 22: Implementing HDF5 in MATLAB

Converting XML to C++

// Definition
#define ADD_PROCEDURE_1_5(name,pfn,ret,a1,a2,a3,a4,a5) \
    addMethod(new LibraryProcedure_1_5< LibraryParameter_T<ret>, \
                                        LibraryParameter_T<a1>, \
                                        LibraryParameter_T<a2>, \
                                        LibraryParameter_T<a3>, \
                                        LibraryParameter_T<a4>, \
                                        LibraryParameter_T<a5> > \
              (name, atts, pfn));

Page 23: Implementing HDF5 in MATLAB

Converting XML to C++

// Usage (x ~220 functions)
atts.init(0,1,5,5);
atts.setParamFlags(0, ParameterAttributes::OUTPUT, 1);
atts.setParamFlags(1, ParameterAttributes::INPUT | ParameterAttributes::STRING_CONVERT, 1);
atts.setParamFlags(2, ParameterAttributes::INPUT, 1);
atts.setParamFlags(3, ParameterAttributes::INPUT | ParameterAttributes::STRING_CONVERT, 1);
atts.setParamFlags(4, ParameterAttributes::INPUT | ParameterAttributes::STRING_CONVERT, 1);
atts.setParamFlags(5, ParameterAttributes::INPUT | ParameterAttributes::STRING_CONVERT, 1);
ADD_PROCEDURE_1_5("H5Acreate", H5Acreate, hid_t, hid_t, const char *, hid_t, hid_t, hid_t);
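
The registration pattern on these two slides (one table entry per library function, typed by its return value and parameters) can be sketched with C++11 variadic templates. This is a swapped-in technique for illustration, not the authors' implementation, which uses fixed-arity macros and classes such as LibraryProcedure_1_5. ProcedureBase, Procedure, Registry, and fake_create are hypothetical names, and fake_create merely has the same five-argument shape as the slide's example rather than being a real HDF5 call.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class: what a dispatcher needs to know about any entry.
struct ProcedureBase {
    virtual ~ProcedureBase() = default;
    virtual std::size_t inputCount() const = 0;
};

// One template covers every arity, so no per-arity macro is needed.
template <typename Ret, typename... Args>
struct Procedure : ProcedureBase {
    using Fn = Ret (*)(Args...);
    explicit Procedure(Fn f) : fn(f) { assert(f != nullptr); }
    std::size_t inputCount() const override { return sizeof...(Args); }
    Fn fn;  // the wrapped library function
};

// Name-to-procedure table, filled once at start-up.
class Registry {
public:
    template <typename Ret, typename... Args>
    void add(const std::string& name, Ret (*fn)(Args...)) {
        table_[name] =
            std::unique_ptr<ProcedureBase>(new Procedure<Ret, Args...>(fn));
    }

    // Returns nullptr when no procedure is registered under that name.
    const ProcedureBase* find(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::string, std::unique_ptr<ProcedureBase>> table_;
};

// Stand-in with five parameters; not a real HDF5 function.
long fake_create(long loc, const char* name, long type, long space, long plist) {
    (void)name;
    return loc + type + space + plist;
}
```

A generator that walks an XML description of the header could then emit one reg.add("name", function) line per library function, with the compiler deducing the return and parameter types from the function's signature.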

Page 24: Implementing HDF5 in MATLAB

The HDF Group and The MathWorks

● Continue to communicate future directions.
● Don't change the existing API functions.
● Communicate API functionality changes.
● Produce a machine-parsable version of hdf5.h.

Page 25: Implementing HDF5 in MATLAB

The Future

HDF5 MAT-File

● 64-bits for large arrays
● Data subsetting on load
● Type conversion on load
● Parallel I/O?