HDF5 Advanced Topics: Object’s Properties, Storage Methods and Filters, Datatypes
HDF and HDF-EOS Workshop VIII, October 26, 2004
• Storage layout is important for I/O performance and for the size of HDF5 files
• Contiguous (default)
– Used when data will be written/read at once
– H5Dcreate(…,H5P_DEFAULT)
• Compact
– Used for small datasets (on the order of bytes) for better I/O
– Raw data is written/read at the time the dataset is opened
– File is less fragmented
– To create a compact dataset, follow the ‘Properties programming model’
Creating Compact Dataset
• Create a dataset creation property list
• Set property list to use compact storage layout
• Create dataset with the above property list
• Create a dataset creation property list
• Set chunking (and specify chunk dimensions)
• Set compression method
• Create dataset with the above property list
• Dataset’s raw data is stored in an external file
• Easy to include existing data into HDF5 file
• Easy to export raw data if application needs it
• Disadvantage: user has to keep track of additional files
Example of External Files
This example shows how a contiguous, one-dimensional dataset is partitioned into three parts and each of those parts is stored in a segment of an external file.
for error detection.
• It is automatically included in HDF5
• To use this filter you must add it to the filter pipeline with H5Pset_filter.
[Figure: diagram with the labels “Checksum value” and “Memory”]
Enabling Checksum Filter
• Create a dataset creation property list
• Set chunking (and specify chunk dimensions)
• Add the filter to the pipeline
• Create your dataset specifying this property list
• Close property list
• Create a dataset creation property list
• Set chunking (and specify chunk dimensions)
• Add the filter to the pipeline
• Define compression filter
• Create your dataset specifying this property list
• Close property list
• H5Pset_edc_check
– For datasets created with error detection filter enabled
– Enables error checking during read operation
– H5Z_ENABLE_EDC (default)
– H5Z_DISABLE_EDC
• H5Pset_dxpl_mpio
– Sets data transfer mode for parallel I/O
– H5FD_MPIO_INDEPENDENT (default)
– H5FD_MPIO_COLLECTIVE
User-defined Filters
Standard Interface for User-defined Filters
• H5Zregister: Register filter so that HDF5 knows about it
• H5Zunregister: Unregister a filter
• H5Pset_filter: Adds a filter to the filter pipeline
• H5Pget_filter: Returns information about a filter in the pipeline
• H5Zfilter_avail: Check if filter is available
File Creation Properties
• H5Pset_userblock
– User block stores user-defined information (e.g., ASCII text describing the file) at the beginning of the file
– cat my.txt hdf5.h5 > myhdf5.h5
– Sets the size of the user block: 512 bytes, 1024 bytes, or any larger power of two (2^N)
• H5Pset_sizes
– Sets the byte size of the offsets and lengths used to address objects in the file
• H5Pset_sym_k
– Controls the rank of B-trees for groups
– Default is 16
• H5Pset_istore_k
– Controls the rank of B-trees for chunked datasets
– Default is 32
• H5Pset_cache
– Sets metadata cache and raw data chunk cache parameters
– Improper size will degrade performance
• H5Pset_meta_block_size
– Reduces the number of small objects in the file
– Block of metadata is written in a single I/O operation (default 2K)
– VFL driver has to set H5FD_FEAT_AGGREGATE_METADATA
• H5Pset_sieve_buf_size
– Improves partial I/O
– Need a picture
File Access Properties (Physical storage and Usage of Low-level I/O Libraries)
• VFL layer: file drivers
• Define physical storage of the HDF5 file
– Memory driver (HDF5 file in the application’s memory)
• Use datatype to create a dataset/attribute, to write/read dataset/attribute, to set fill value
• (Optional) Save datatype in the file
• Close
HDF5 Compound Datatypes
• Compound types
– Comparable to C structs
– Members can be atomic or compound types
– Members can be multidimensional
– Can be written/read by a field or set of fields
– Not all data filters can be applied (shuffling, SZIP)
– H5Tcreate(H5T_COMPOUND), H5Tinsert calls to create a compound datatype
– See H5Tget_member* functions for discovering properties of the HDF5 compound datatype
HDF5 Fixed and Variable length array storage
[Figure: diagram with repeated “Data” blocks and “Time” labels]