
HDF 1 HDF5 Advanced Topics Object’s Properties Storage Methods and Filters Datatypes HDF and HDF-EOS Workshop VIII October 26, 2004.

Dec 27, 2015


Horatio Byrd
Transcript
Page 1: HDF5 Advanced Topics

Object’s Properties
Storage Methods and Filters
Datatypes

HDF and HDF-EOS Workshop VIII
October 26, 2004

Page 2: Topics

• General introduction to HDF5 properties
• HDF5 dataset properties
  – I/O and storage properties (filters)
• HDF5 file properties
  – I/O and storage properties (drivers)
• Datatypes
  – Compound
  – Variable length
  – Reference to object and dataset region

Page 3: General Introduction to HDF5 Properties

Page 4: Properties: Definition

• Mechanism to control different features of HDF5 objects
  – Implemented via the H5P interface (“property lists”)
  – The HDF5 library sets objects’ default features
  – HDF5 property lists modify the default features
    • At object creation time (creation properties)
    • At object access time (access or transfer properties)

Page 5: Properties: Definitions

• A property list is a list of name-value pairs
  – Values may be of any datatype
• A property list is passed as an optional parameter to the HDF5 APIs
• Each layer of the library uses or ignores a property list, as needed
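The idea of a property list as a set of name-value pairs can be sketched in plain C. This is only an illustration of the concept, not how the H5P interface is implemented; the names `plist_t`, `plist_set`, and `plist_get` and the 16-byte value limit are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative only: a tiny name-value list, NOT the HDF5 H5P implementation. */
#define MAX_PROPS 8
#define MAX_VALUE_BYTES 16

typedef struct {
    const char *name;                      /* property name */
    size_t size;                           /* size of the stored value in bytes */
    unsigned char value[MAX_VALUE_BYTES];  /* raw value bytes, any small type */
} prop_t;

typedef struct {
    prop_t props[MAX_PROPS];
    int count;
} plist_t;

/* Set (or overwrite) a property. Returns 0 on success, -1 on failure. */
int plist_set(plist_t *pl, const char *name, const void *value, size_t size)
{
    if (size > MAX_VALUE_BYTES)
        return -1;
    for (int i = 0; i < pl->count; i++) {
        if (strcmp(pl->props[i].name, name) == 0) {
            memcpy(pl->props[i].value, value, size);
            pl->props[i].size = size;
            return 0;
        }
    }
    if (pl->count == MAX_PROPS)
        return -1;
    pl->props[pl->count].name = name;
    pl->props[pl->count].size = size;
    memcpy(pl->props[pl->count].value, value, size);
    pl->count++;
    return 0;
}

/* Copy a property value into out. Returns 0 if found with a matching size, -1 otherwise. */
int plist_get(const plist_t *pl, const char *name, void *out, size_t size)
{
    for (int i = 0; i < pl->count; i++) {
        if (strcmp(pl->props[i].name, name) == 0 && pl->props[i].size == size) {
            memcpy(out, pl->props[i].value, size);
            return 0;
        }
    }
    return -1;
}
```

A caller would set only the properties it wants to change and pass the list along; a layer that does not recognize a name simply never asks for it.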

Page 6: Types of Properties

• Predefined and user-defined property lists
• Predefined:
  – File creation
  – File access
  – Dataset creation
  – Dataset access
• Will cover each of these

Page 7: Properties (Example): HDF5 File

• H5Fcreate(…, creation_prop_id, …)
• Creation properties (how is the file created?)
  – Library defaults
    • No user block
    • Predefined sizes of offsets and addresses of the objects in the file (64-bit for DEC Alpha, 32-bit on Windows)
  – User settings
    • User block
    • 32-bit sizes on a 64-bit platform
    • Control over B-trees for chunked storage (split factor)

Page 8: Properties (Example): HDF5 File

• H5Fcreate(…, access_prop_id)
• Access properties or drivers (How is the file accessed? What is the physical layout on disk?)
  – Library defaults
    • STDIO library (UNIX fwrite, fread)
  – User defined
    • MPI I/O for parallel access
    • Family of files (a 100 GB HDF5 file represented by 50 UNIX files of 2 GB each)
    • Size of the chunk cache

Page 9: Properties (Example): HDF5 Dataset

• H5Dcreate(…, creation_prop_id)
• Creation properties (how is the dataset created?)
  – Library defaults
    • Storage: contiguous
    • Compression: none
    • Space is allocated when data is first written
    • No fill value is written
  – User settings
    • Storage: compact, chunked, or external
    • Compression
    • Fill value
    • Control over space allocation in the file for raw data
      – at creation time
      – at write time

Page 10: Properties (Example): HDF5 Dataset

• H5Dwrite(…, access_prop_id) and H5Dread(…, access_prop_id)
• Access (transfer) properties
  – Library defaults
    • 1 MB conversion buffer
    • Error detection on read (if it was set during write)
    • MPI independent I/O for parallel access
  – User defined
    • MPI collective I/O for parallel access
    • Size of the datatype conversion buffer
    • Control over partial I/O to improve performance

Page 11: Properties: Programming Model

• Use a predefined property type
  – H5P_FILE_CREATE
  – H5P_FILE_ACCESS
  – H5P_DATASET_CREATE
  – H5P_DATASET_ACCESS
• Create a new property instance
  – H5Pcreate
  – H5Pcopy
  – H5*get_access_plist; H5*get_create_plist
• Modify the property (see H5P APIs)
• Use the property to modify an object feature
• Close the property when done
  – H5Pclose

Page 12: Properties: Programming Model

• General model of usage: get a property list, set values, pass it to the library

    /* Schematic only: X and foo stand for an interface and a property name. */
    hid_t plist = H5Pcreate(predefined_plist);   /* or H5Pcopy(existing_plist) */
    /* OR */
    hid_t plist = H5Xget_create_plist(…);        /* or H5Xget_access_plist(…) */

    H5Pset_foo(plist, vals);
    H5Xdo_something(Xid, …, plist);
    H5Pclose(plist);

Page 13: HDF5 Dataset Creation Properties and Predefined Filters

Page 14: Dataset Creation Properties

• Storage
  – Contiguous (default)
  – Compact
  – Chunked
  – External
• Filters applied to raw data
  – Compression
  – Checksum
• Fill value
• Space allocation for raw data in the file

Page 15: Dataset Creation Properties: Storage Layouts

• Storage layout is important for I/O performance and for the size of HDF5 files
• Contiguous (default)
  – Used when data will be written/read at once
  – H5Dcreate(…, H5P_DEFAULT)
• Compact
  – Used for small datasets (on the order of bytes) for better I/O
  – Raw data is written/read when the dataset is opened
  – File is less fragmented
  – To create a compact dataset, follow the “Properties programming model”

Page 16: Creating a Compact Dataset

• Create a dataset creation property list
• Set the property list to use the compact storage layout
• Create the dataset with the above property list

    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_layout(plist, H5D_COMPACT);
    dset_id = H5Dcreate(…, "Compact", …, plist);
    H5Pclose(plist);

Page 17: Creating a Chunked Dataset

• Chunked layout is needed for
  – Extendible datasets
  – Compression and other filters
  – Improving partial I/O for big datasets

[Figure: chunked layout. Better subsetting access time; extendible. Only two chunks will be written/read.]
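Why a subset touches only a few chunks can be checked with a little arithmetic. The sketch below counts the chunks that a rectangular selection intersects; the function name `chunks_touched` and the example dimensions are assumptions for illustration, not part of the HDF5 API.

```c
#include <stddef.h>

/* Count the chunks intersected by the selection [start, start + count)
 * in each dimension, for a dataset tiled by chunks of size chunk[]. */
size_t chunks_touched(int rank, const size_t start[], const size_t count[],
                      const size_t chunk[])
{
    size_t total = 1;
    for (int d = 0; d < rank; d++) {
        if (count[d] == 0)
            return 0;                        /* empty selection */
        size_t first = start[d] / chunk[d];  /* index of first chunk hit */
        size_t last = (start[d] + count[d] - 1) / chunk[d];
        total *= last - first + 1;
    }
    return total;
}
```

For instance, with 100x100 chunks, a 100x100 selection starting at row 50, column 0 straddles one chunk boundary and so touches two chunks; with contiguous storage the same read would span many more rows of the file.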

Page 18: Creating a Chunked Dataset

• Create a dataset creation property list
• Set the property list to use the chunked storage layout
• Create the dataset with the above property list

    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(plist, rank, ch_dims);
    dset_id = H5Dcreate(…, "Chunked", …, plist);
    H5Pclose(plist);

Page 19: Dataset Creation Properties: Compression and Other I/O Pipeline Filters

• HDF5 provides a mechanism (“I/O filters”) to manipulate data while transferring it between memory and disk
• H5Z and H5P interfaces
• HDF5 predefined filters (H5P interface)
  – Compression (gzip, szip)
  – Shuffling and checksum filters
• User-defined filters (H5Z and H5P interfaces)
  – Example: Bzip2 compression
    http://hdf.ncsa.uiuc.edu/HDF5/papers/bzip2

Page 20: Compression and Other I/O Pipeline Filters (continued)

• Currently used only with chunked datasets
• Filters can be combined
  – GZIP + shuffle + checksum filters
  – Checksum filter + user-defined encryption filter
• Filters are called in the order they are defined on writing and in the reverse order on reading
• The user is responsible for “filter pipeline sanity”
  – GZIP + SZIP + shuffle doesn’t make sense
  – Shuffle + SZIP does
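The ordering rule (forward on write, reverse on read) can be sketched with two toy filters that do not commute. The filter functions and the `filter_t` type below are invented for the illustration and stand in for real filters such as shuffle or GZIP; this is not the H5Z interface.

```c
#include <stddef.h>

typedef void (*filter_fn)(unsigned char *buf, size_t n);

/* Each pipeline stage has a forward (write) and an inverse (read) step. */
typedef struct {
    filter_fn forward;
    filter_fn inverse;
} filter_t;

/* Toy filter 1: add 1 to every byte (wraps at 256). */
void add_one(unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) buf[i] = (unsigned char)(buf[i] + 1);
}
void sub_one(unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) buf[i] = (unsigned char)(buf[i] - 1);
}

/* Toy filter 2: rotate every byte left by one bit. */
void rot_left(unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (unsigned char)((buf[i] << 1) | (buf[i] >> 7));
}
void rot_right(unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (unsigned char)((buf[i] >> 1) | (buf[i] << 7));
}

/* On write: apply filters in the order they were defined. */
void pipeline_write(const filter_t *f, size_t nf, unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < nf; i++) f[i].forward(buf, n);
}

/* On read: undo them in the reverse order. */
void pipeline_read(const filter_t *f, size_t nf, unsigned char *buf, size_t n)
{
    for (size_t i = nf; i > 0; i--) f[i - 1].inverse(buf, n);
}
```

Undoing the stages in the same order as they were applied would not restore the data here, which is why the read path must walk the pipeline backwards.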

Page 21: Creating a Compressed Dataset

• Compression
  – Improves transmission speed
  – Improves storage efficiency
  – Requires chunking
  – May increase the CPU time needed for compression

[Figure: data in memory; compressed data in the file.]

Page 22: Creating Compressed Datasets

• Create a dataset creation property list
• Set chunking (and specify chunk dimensions)
• Set the compression method
• Create the dataset with the above property list

    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(plist, ndims, chkdims);

    H5Pset_deflate(plist, level);                  /* GZIP */
    /* OR */
    H5Pset_szip(plist, options_mask, numpixels);   /* SZIP */

    dset_id = H5Dcreate(file_id, "comp-data", H5T_NATIVE_FLOAT, space_id, plist);

Page 23: Creating an External Dataset

• The dataset’s raw data is stored in an external file
• Easy to include existing data in an HDF5 file
• Easy to export raw data if an application needs it
• Disadvantage: the user has to keep track of additional files to preserve the integrity of the HDF5 file

[Figure: the HDF5 file holds dataset “A” and the metadata for “A”; the raw data for “A” is stored in an external file.]

Page 24: Creating an External Dataset

• Create a dataset creation property list
• Set the property list to use external storage
• Create the dataset with the above property list

    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_external(plist, "raw_data.ext", offset, size);
    dset_id = H5Dcreate(…, "External", …, plist);
    H5Pclose(plist);

Page 25: Example of External Files

This example shows how a contiguous, one-dimensional dataset is partitioned into three parts, each of which is stored in a segment of an external file.

    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_external(plist, "raw.data", 3000, 1000);
    H5Pset_external(plist, "raw.data", 0, 2500);
    H5Pset_external(plist, "raw.data", 4500, 1500);
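To make the partitioning concrete, the sketch below maps a byte position in the dataset's raw data to the external segment and file offset that hold it, assuming segments are concatenated in the order they were added. The `ext_segment` type and `locate_byte` function are hypothetical helpers written for this example, not HDF5 calls.

```c
#include <stddef.h>

/* One (file, offset, size) triple as passed to H5Pset_external. */
typedef struct {
    const char *name;  /* external file name */
    long offset;       /* byte offset of the segment within that file */
    long size;         /* number of bytes in the segment */
} ext_segment;

/* Find which segment holds dataset byte `pos`.
 * Returns the segment index and stores the file offset in *file_offset,
 * or returns -1 if pos lies outside the segments. */
int locate_byte(const ext_segment *segs, int nsegs, long pos, long *file_offset)
{
    if (pos < 0)
        return -1;
    for (int i = 0; i < nsegs; i++) {
        if (pos < segs[i].size) {
            *file_offset = segs[i].offset + pos;
            return i;
        }
        pos -= segs[i].size;  /* skip past this segment */
    }
    return -1;
}
```

With the three segments above, dataset bytes 0 to 999 come from file offsets 3000 to 3999, bytes 1000 to 3499 from offsets 0 to 2499, and bytes 3500 to 4999 from offsets 4500 to 5999.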

Page 26: Checksum Filter

• HDF5 includes the Fletcher32 checksum algorithm for error detection
• It is automatically included in HDF5
• To use this filter you must add it to the filter pipeline with H5Pset_filter

[Figure: data in memory with its checksum value.]
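For readers unfamiliar with the algorithm, here is a textbook Fletcher-32 in C. It is a sketch of the general technique under stated assumptions (16-bit words assembled low byte first, an odd trailing byte padded with zero, sums taken modulo 65535) and may not be bit-for-bit identical to the checksum the HDF5 library stores.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook Fletcher-32: two running sums over 16-bit words, modulo 65535. */
uint32_t fletcher32(const unsigned char *data, size_t len)
{
    uint32_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i += 2) {
        uint32_t word = data[i];                   /* low byte */
        if (i + 1 < len)
            word |= (uint32_t)data[i + 1] << 8;    /* high byte, if present */
        sum1 = (sum1 + word) % 65535;
        sum2 = (sum2 + sum1) % 65535;
    }
    return (sum2 << 16) | sum1;
}
```

Because the second sum depends on the position of each word, reordered data produces a different checksum, which a plain sum of bytes would miss.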

Page 27: Enabling the Checksum Filter

• Create a dataset creation property list
• Set chunking (and specify chunk dimensions)
• Add the filter to the pipeline
• Create your dataset specifying this property list
• Close the property list

    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(plist, ndims, chkdims);
    H5Pset_filter(plist, H5Z_FILTER_FLETCHER32, 0, 0, NULL);
    H5Dcreate(…, "Checksum", …, plist);
    H5Pclose(plist);

Page 28: Shuffling Filter

• Predefined HDF5 filter
• Not a compression; it changes the byte order in a stream of data
• Example
  – 1 23 43
• Hexadecimal form
  – 0x01 0x17 0x2B
• As 4-byte integers on a big-endian machine
  – 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x17 0x00 0x00 0x00 0x2B
• After shuffling
  – 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 0x17 0x2B
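The rearrangement in this example can be reproduced with a short byte transposition: all first bytes of every element, then all second bytes, and so on. The function names below are chosen for this sketch; it illustrates the technique rather than reproducing the library's own shuffle code.

```c
#include <stddef.h>

/* Shuffle n elements of elem_size bytes each: out receives byte 0 of every
 * element, followed by byte 1 of every element, and so on. */
void shuffle_bytes(const unsigned char *in, unsigned char *out,
                   size_t n, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t i = 0; i < n; i++)
            out[b * n + i] = in[i * elem_size + b];
}

/* Inverse operation: restore the original element-by-element layout. */
void unshuffle_bytes(const unsigned char *in, unsigned char *out,
                     size_t n, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t i = 0; i < n; i++)
            out[i * elem_size + b] = in[b * n + i];
}
```

Grouping the mostly-zero high-order bytes together creates long runs of identical values, which is what lets a compressor applied afterwards do better.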

Page 29: [Figure: the bytes 00 00 00 01 00 00 00 17 00 00 00 2B are rearranged by shuffling into 00 00 00 00 00 00 00 00 00 01 17 2B.]

Page 30: Enabling the Shuffling Filter

• Create a dataset creation property list
• Set chunking (and specify chunk dimensions)
• Add the shuffle filter to the pipeline
• Define the compression filter
• Create your dataset specifying this property list
• Close the property list

    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(plist, ndims, chkdims);
    H5Pset_shuffle(plist);
    H5Pset_deflate(plist, level);
    H5Dcreate(…, "BetterComp", …, plist);
    H5Pclose(plist);

Page 31: Effect of Data Shuffling (H5Pset_shuffle + H5Pset_deflate)

                File size    Total time    Write time
    No shuffle  102.9 MB     671.049       629.45
    Shuffle     67.34 MB     83.353        78.268

Test setup:
• Write a 4-byte integer dataset of 256x256x1024 elements (256 MB)
• Using chunks of 256x16x1024 elements (16 MB)
• Values: random integers between 0 and 255

Compression combined with shuffling provides
• Better compression ratio
• Better I/O performance

Page 32: HDF5 Dataset Access (Transfer) Properties

Page 33: Dataset Access/Transfer Properties

• Improve performance
• H5Pset_buffer
  – Sets the size of the datatype conversion buffer used during I/O
  – The size should be large enough to hold the slice along the slowest-changing dimension
  – Example: hyperslab 100x200x300, buffer 200x300
• H5Pset_hyper_vector_size
  – Sets the number of hyperslab offset and length pairs
  – Improves performance for partial I/O
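The buffer-size guideline for H5Pset_buffer reduces to a product of dimensions. The helper below is a hypothetical convenience for computing that number in bytes, assuming the first dimension is the slowest-changing one (C order); the resulting value is what one would pass as the size argument.

```c
#include <stddef.h>

/* Bytes needed to hold one slice along the slowest-changing (first)
 * dimension: the product of all remaining dimensions times the element size. */
size_t conversion_buffer_bytes(int rank, const size_t dims[], size_t elem_size)
{
    size_t bytes = elem_size;
    for (int d = 1; d < rank; d++)
        bytes *= dims[d];
    return bytes;
}
```

For the 100x200x300 hyperslab with 4-byte elements this gives 200 x 300 x 4 = 240,000 bytes, which fits within a 1 MB buffer; larger slices would call for raising the buffer size.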

Page 34: Dataset Access/Transfer Properties

• H5Pset_edc_check
  – For datasets created with the error detection filter enabled
  – Enables error checking during read operations
  – H5Z_ENABLE_EDC (default)
  – H5Z_DISABLE_EDC
• H5Pset_dxpl_mpio
  – Sets the data transfer mode for parallel I/O
  – H5FD_MPIO_INDEPENDENT (default)
  – H5FD_MPIO_COLLECTIVE

Page 35: User-defined Filters

Page 36: Standard Interface for User-defined Filters

• H5Zregister: registers a filter so that HDF5 knows about it
• H5Zunregister: unregisters a filter
• H5Pset_filter: adds a filter to the filter pipeline
• H5Pget_filter: returns information about a filter in the pipeline
• H5Zfilter_avail: checks whether a filter is available

Page 37: File Creation Properties

Page 38: File Creation Properties

• H5Pset_userblock
  – The user block stores user-defined information (e.g. ASCII text describing the file) at the beginning of the file
  – cat my.txt hdf5.h5 > myhdf5.h5
  – Sets the size of the user block: 512 bytes, 1024 bytes, 2^N
• H5Pset_sizes
  – Sets the byte size of the offsets and lengths used to address objects in the file
• H5Pset_sym_k
  – Controls the rank of the B-trees for groups
  – Default is 16
• H5Pset_istore_k
  – Controls the rank of the B-trees for chunked datasets
  – Default is 32
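Since the user block size is restricted to 512, 1024, and further powers of two, choosing a size for a given header text is a small rounding exercise. The helper name below is hypothetical, and overflow for extremely large inputs is not handled in this sketch.

```c
#include <stddef.h>

/* Smallest allowed user block size (0, or a power of two >= 512)
 * that can hold nbytes of user data. */
size_t userblock_size_for(size_t nbytes)
{
    if (nbytes == 0)
        return 0;             /* no user block needed */
    size_t size = 512;
    while (size < nbytes)
        size *= 2;
    return size;
}
```

A 3000-byte text file, for example, would need a 4096-byte user block; the result is the value one would hand to H5Pset_userblock before creating the file.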

Page 39: File Access Properties

Page 40: File Access Properties (Performance)

• H5Pset_cache
  – Sets metadata cache and raw data chunk cache parameters
  – An improper size will degrade performance
• H5Pset_meta_block_size
  – Reduces the number of small objects in the file
  – A block of metadata is written in a single I/O operation (default 2 KB)
  – The VFL driver has to set H5FD_FEAT_AGGREGATE_METADATA
• H5Pset_sieve_buf_size
  – Improves partial I/O
  – Need a picture
• VFL layer: file drivers

Page 41: File Access Properties (Physical Storage and Usage of Low-level I/O Libraries)

• VFL layer: file drivers
• Define physical storage of the HDF5 file
  – Memory driver (HDF5 file in the application’s memory)
  – Stream driver (HDF5 file written to a socket)
  – Split (multi) files driver
  – Family driver
• Define the low-level I/O library
  – MPI I/O driver for parallel access
  – STDIO vs. SEC2

Page 42: Files needn’t be files - Virtual File Layer

VFL: a public API for writing I/O drivers

[Figure: a “file” handle (hid_t) sits on top of the VFL (Virtual File I/O Layer); its I/O drivers (stdio, mpio, memory, network, split, family, SRB) map to “storage”: files, memory, network, SRB repository.]

Page 43: Split Files

• Allows you to split metadata and data into separate files
• The files may reside on different file systems for better I/O
• Disadvantage: the user has to keep track of the files

[Figure: an HDF5 file split into a metadata file (datasets “A” and “B”) and a raw data file (data A, data B).]

Page 44: Creating Split Files

• Create a file access property list
• Set up the file access property list to use split files
• Create the file with this property list
• Close the property list

    plist = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_split(plist, ".met", H5P_DEFAULT, ".dat", H5P_DEFAULT);
    file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist);
    H5Pclose(plist);

Page 45: File Families

• Allows you to access files larger than 2 GB on file systems that don't support large files
• Any HDF5 file can be split into a family of files and vice versa
• A family member size must be a power of two
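How a logical file address maps onto family members is simple integer arithmetic. The sketch below assumes every member has the same fixed size and that addresses fill members in order; the `family_loc` type and both function names are illustrative, not part of the HDF5 API.

```c
#include <stdint.h>

/* Location of a logical file address inside a family of equal-sized members. */
typedef struct {
    uint64_t member;  /* zero-based index of the member file */
    uint64_t offset;  /* byte offset within that member */
} family_loc;

/* Split a logical address into (member index, offset within member). */
family_loc family_locate(uint64_t addr, uint64_t member_size)
{
    family_loc loc = { addr / member_size, addr % member_size };
    return loc;
}

/* Number of member files needed to hold total_size bytes (rounded up). */
uint64_t family_members_needed(uint64_t total_size, uint64_t member_size)
{
    return (total_size + member_size - 1) / member_size;
}
```

With 2 GB members, a 100 GB logical file needs 50 members, matching the earlier family-of-files example, and a byte just past the 5 GB mark lands in the third member.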

Page 46: Creating a File Family

• Create a file access property list
• Set up the file access property list to use a file family
• Create the file with this property list

    plist = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_family(plist, family_size, H5P_DEFAULT);
    file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist);
    H5Pclose(plist);

Page 47: HDF5 Datatypes

Page 48: Datatypes

• A datatype is
  – A classification specifying the interpretation of a data element
  – Specifies for a given data element
    • the set of possible values it can have
    • the operations that can be performed
    • how the values of that type are stored
  – May be shared between different datasets in one file

Page 49: HDF5 Datatypes

• Atomic types
  – standard integer and float
  – user-definable scalars (e.g. 13-bit integer)
  – bitfields
  – variable-length types (e.g. strings)
  – pointers: references to objects/dataset regions
  – enumeration: names mapped to integers

Page 50: General Operations on HDF5 Datatypes

• Create
  – H5Tcreate creates a datatype of the H5T_COMPOUND, H5T_OPAQUE, and H5T_ENUM classes
• Copy
  – H5Tcopy creates another instance of the datatype; can be applied to any datatype
• Commit
  – H5Tcommit creates a datatype object in the HDF5 file; a committed datatype can be shared between different datasets
• Open
  – H5Topen opens a datatype stored in the file
• Close
  – H5Tclose closes a datatype object

Page 51: Programming Model for HDF5 Datatypes

• Use predefined HDF5 types
  – No need to close
• OR
  – Create
    • Create a datatype (by copying an existing one or by creating one of the H5T_COMPOUND, H5T_ENUM, or H5T_OPAQUE classes)
    • Create a datatype by querying the datatype of a dataset
  – Open a committed datatype from the file
• (Optional) Discover datatype properties (size, precision, members, etc.)
• Use the datatype to create a dataset/attribute, to write/read a dataset/attribute, or to set a fill value
• (Optional) Save the datatype in the file
• Close

Page 52: HDF5 Compound Datatypes

• Compound types
  – Comparable to C structs
  – Members can be atomic or compound types
  – Members can be multidimensional
  – Can be written/read by a field or a set of fields
  – Not all data filters can be applied (shuffling, SZIP)
  – H5Tcreate(H5T_COMPOUND) and H5Tinsert calls create a compound datatype
  – See the H5Tget_member* functions for discovering properties of an HDF5 compound datatype
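Because a compound type mirrors a C struct, the information each H5Tinsert call needs (member name, byte offset, member size) can be gathered with `offsetof` and `sizeof`. The struct `sample_t`, its fields, and the `member_desc` table below are invented for illustration; the sketch deliberately stops short of calling the library.

```c
#include <stddef.h>

/* A hypothetical record with an atomic, a floating-point,
 * and a small array (multidimensional-style) member. */
typedef struct {
    int    id;
    double temperature;
    float  position[3];
} sample_t;

/* Description of one struct member: what a compound-type insert would need. */
typedef struct {
    const char *name;
    size_t offset;  /* byte offset of the member within sample_t */
    size_t size;    /* size of the member in bytes */
} member_desc;

static const member_desc sample_members[] = {
    { "id",          offsetof(sample_t, id),          sizeof(int) },
    { "temperature", offsetof(sample_t, temperature), sizeof(double) },
    { "position",    offsetof(sample_t, position),    3 * sizeof(float) },
};

/* Total size of the record, including any compiler padding. */
static const size_t sample_size = sizeof(sample_t);
```

Using `offsetof` rather than hand-computed offsets matters because the compiler may insert padding between members, so the struct size can exceed the sum of the member sizes.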

Page 53: HDF5 Fixed and Variable Length Array Storage

[Figure: data records along a time axis, shown for fixed-length and for variable-length array storage.]

Page 54: HDF5 Variable Length Datatypes: Programming Issues

• Each element is represented by a C struct

    typedef struct {
        size_t len;   /* number of base-type elements */
        void  *p;     /* pointer to the element data */
    } hvl_t;

• The base type can be any HDF5 type
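The length-plus-pointer layout means the application owns one separately allocated buffer per element. The sketch below builds and frees such a ragged array using a local struct, `vl_elem`, that mirrors the layout shown above so the example compiles without the HDF5 headers; the function names and the fill pattern are assumptions for illustration.

```c
#include <stdlib.h>

/* Local mirror of the variable-length element layout (length + pointer). */
typedef struct {
    size_t len;  /* number of ints in this element */
    void  *p;    /* heap buffer holding them */
} vl_elem;

/* Free every element buffer and reset the descriptors. */
void vl_free(vl_elem *elems, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        free(elems[i].p);
        elems[i].p = NULL;
        elems[i].len = 0;
    }
}

/* Fill n elements so that element i holds i + 1 ints (a ragged array).
 * Returns 0 on success, -1 if an allocation fails. */
int vl_fill(vl_elem *elems, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int *row = malloc((i + 1) * sizeof *row);
        if (row == NULL) {
            vl_free(elems, i);  /* release what was already allocated */
            return -1;
        }
        for (size_t j = 0; j <= i; j++)
            row[j] = (int)(i * 10 + j);
        elems[i].len = i + 1;
        elems[i].p = row;
    }
    return 0;
}
```

An array of such descriptors is what a write of variable-length data would be handed, and every buffer must eventually be released by the application after use.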

Page 55: HDF5 Variable Length Datatypes

[Figure: a dataset with a variable-length datatype; raw data; global heap.]

Page 56: HDF Information

• HDF Information Center
  – http://hdf.ncsa.uiuc.edu/
• HDF Help email address
  – [email protected]
• HDF users mailing list
  – [email protected]