02/18/14 HDF and HDF-EOS Workshop X, Landover, MD
HDF5 Tools Update
Peter Cao
The HDF Group
[email protected]
November 28, 2006
This report is based upon work supported in part by a Cooperative Agreement with NASA under NASA NNG05GC60A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.
h5dump
Dumps the content of an HDF5 file to stdout and, optionally, to the following types of files:
• ASCII text file
• XML file
• Binary file (new feature)
h5dump -H SDS.h5
HDF5 "SDS.h5" {
GROUP "/" {
   GROUP "Floats" {
      DATASET "FloatArray" {
         DATATYPE H5T_IEEE_F32LE
         DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
      }
   }
   DATASET "IntArray" {
      DATATYPE H5T_STD_I32LE
      DATASPACE SIMPLE { ( 5, 6 ) / ( 5, 6 ) }
   }
}
}
h5dump -d /Floats/FloatArray SDS.h5
HDF5 "SDS.h5" {
DATASET "/Floats/FloatArray" {
   DATATYPE H5T_IEEE_F32LE
   DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
   DATA {
   (0,0): 0.01, 0.02, 0.03,
   (1,0): 0.1, 0.2, 0.3,
   (2,0): 1, 2, 3,
   (3,0): 10, 20, 30
   }
}
}
h5dump -x SDS.h5
h5dump Binary Output
-b F, --binary=F
The form of the binary output (F):
• MEMORY -- for memory type
• FILE -- for the disk file type
• LE -- for pre-defined little endian type
• BE -- for pre-defined big endian type
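To make the LE form concrete, here is a minimal Python sketch of how a consumer might decode such a file. It assumes, and this is an assumption rather than something stated on the slide, that the binary output holds only the raw element values of one dataset in row-major order, with no header. The function name decode_le_float32 is hypothetical, and the sample bytes are a stand-in built from the FloatArray values shown earlier.

```python
import struct


def decode_le_float32(raw, rows, cols):
    """Decode raw little-endian 32-bit floats into a list of rows (row-major)."""
    count = rows * cols
    if len(raw) != 4 * count:
        raise ValueError("expected %d bytes, got %d" % (4 * count, len(raw)))
    values = struct.unpack("<%df" % count, raw)
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]


# Stand-in for the bytes of a binary dump of the 4 x 3 dataset /Floats/FloatArray.
sample = struct.pack("<12f", 0.01, 0.02, 0.03, 0.1, 0.2, 0.3, 1, 2, 3, 10, 20, 30)
table = decode_le_float32(sample, 4, 3)
print(table[2])  # the row that h5dump labels (2,0)
```

For the BE form the same sketch would use the ">" byte-order prefix instead of "<".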
h5repack
Copies an HDF5 file to a new file, with or without compression and chunking
• Remove unused space
• Apply compression filter
• Apply layout
h5repack: new filters
-f FILTER
• GZIP, to apply GZIP compression
• SZIP, to apply SZIP compression
• SHUF, to apply the HDF5 shuffle filter
• FLET, to apply the HDF5 checksum filter
• NBIT, to apply NBIT compression
• SOFF, to apply the HDF5 Scale/Offset filter
• NONE, to remove all filters
For example:
h5repack -i SDS2.h5 -o SDS2_compressed.h5 -f /IntArray:GZIP=9
h5repack: data layout
-l LAYOUT
• CHUNK, to apply chunking layout
• COMPA, to apply compact layout
• CONTI, to apply contiguous layout
For example:
h5repack -i SDS.h5 -o SDS_chunk.h5 -l /Floats/FloatArray,/IntArray:CHUNK=2x3
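As a rough illustration of what CHUNK=2x3 means for the two datasets in SDS.h5, the Python sketch below counts the chunks needed to cover a dataset of a given shape. It is plain ceiling-division arithmetic and does not call HDF5; the function names are hypothetical, and the shapes (4 x 3 and 5 x 6) are taken from the h5dump -H listing.

```python
def chunk_grid(shape, chunk):
    """Return the number of chunks along each dimension (ceiling division)."""
    if len(shape) != len(chunk):
        raise ValueError("shape and chunk must have the same rank")
    return tuple(-(-n // c) for n, c in zip(shape, chunk))


def chunk_count(shape, chunk):
    """Return the total number of chunks needed to cover the dataset."""
    total = 1
    for n in chunk_grid(shape, chunk):
        total *= n
    return total


print(chunk_grid((4, 3), (2, 3)), chunk_count((4, 3), (2, 3)))  # FloatArray
print(chunk_grid((5, 6), (2, 3)), chunk_count((5, 6), (2, 3)))  # IntArray
```

Because 5 is not a multiple of 2, the last row of chunks in IntArray is only partly filled with dataset elements.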
new h5repack: using H5Ocopy()
[Bar chart: New h5repack vs. Old h5repack, vertical axis from 0.00 to 90.00, for the files attrs20k.h5, dataset_comp_80000k.h5, float16kx16k_chunk512x512_deflate9.h5, float16kx16k_chunk512x512.h5, groups10k.h5, and int16kx16k.h5. Bar labels in extraction order (assignment to files and series not recoverable): 0.27, 13.52, 0.03, 8.32, 17.30, 18.83, 80.97, 15.34, 30.18, 41.62, 48.39, 32.06.]
h5repart
Repartitions a file or family of files
For example:
h5repart -m 200m int16kx16k.h5 part200m%d.h5
int16kx16k.h5 (977 MB) is split into:
200 MB part200m0.h5
200 MB part200m1.h5
200 MB part200m2.h5
200 MB part200m3.h5
177 MB part200m4.h5
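The member sizes above follow from simple division, which the Python sketch below reproduces. It is only arithmetic on the figures from the slide (977 MB total, 200 MB per member) and does not call h5repart; the function names are hypothetical, and the %d in the output name is expanded with member numbers starting at 0.

```python
def member_sizes(total, member):
    """Split a total size into full members plus one smaller final member."""
    full, rest = divmod(total, member)
    sizes = [member] * full
    if rest:
        sizes.append(rest)
    return sizes


def member_names(pattern, count):
    """Expand a printf-style pattern such as 'part200m%d.h5' for members 0..count-1."""
    return [pattern % i for i in range(count)]


sizes = member_sizes(977, 200)
names = member_names("part200m%d.h5", len(sizes))
for size, name in zip(sizes, names):
    print(size, "MB", name)
```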
h5import
Imports binary/ASCII data into an HDF5 file
• h5import infile -c config_file [infile -c config_file2 ...] -outfile outfile
For example:
h5import float5x4x2.txt -c First_set.conf -o First_set.h5
h5copy
usage: h5copy [OPTIONS] [OBJECTS...]
   -i, --input        input file name
   -o, --output       output file name
   -s, --source       source object name
   -d, --destination  destination object name
   -f, --flag
      shallow  Copy only immediate members for groups
      soft     Expand soft links into new objects
      ext      Expand external links into new objects
      ref      Copy objects that are pointed to by references
      noattr   Copy object without copying attributes
h5copy
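For example:
h5copy -i SDS.h5 -o SDS_cp.h5 -s /Floats/FloatArray -d /FloatArray
[Diagram: SDS.h5 has the root group / containing Floats and IntArray, with FloatArray inside Floats. SDS_cp.h5 has the root group / containing FloatArray.]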
h5copy -f shallow
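[Diagram: three file trees. The source tree has the nodes /, floats, integers, 64-bit, f32, f1, f2, i1, and i2. One copy has /, floats, 64-bit, f32, f1, and f2. The copy labeled -f shallow has only /, floats, 64-bit, and f32.]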
h5copy -f soft
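[Diagram: three file trees illustrating -f soft. Node labels in the figure: /, f1, dset_SL, and /f1/f1. Two of the trees contain f1, dset_SL, and /f1/f1; the third contains only dset_SL and /f1/f1.]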
h5copy -f ref
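[Diagram: three file trees illustrating -f ref. The first holds d1, d2, and dset_ref, with the values 1895 and 763. The second holds d1, d2, and dset_ref, with the values 679 and 1287. The third holds only dset_ref, with the values 0 and 0.]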
h5copy todo
• Fix references embedded in compound datatype
• Follow external links
• Test functionalities
• Test performance
h5stat
• Prints various statistics about an HDF5 file
• Helps
   • To troubleshoot size overhead in HDF5 files
   • To choose specific objects' properties and storage strategies
h5check
A validation tool that verifies if an HDF5 file is encoded according to the HDF5 File Format Specification
Why is it needed?
• Verifies that a file complies with the File Format, to ensure data model integrity and long-term compatibility between evolving versions of the HDF5 library
• Serves as the verification tool required by the application for the HDF5 File Format to become an ANSI standard
• Serves as a watchdog that checks that the HDF5 library implementation complies with the File Format
What does it do?
Given a file, it scans through the encoded content and checks it against the defined File Format
• If it finds any non-compliance, it prints out the error and the reason for the non-compliance.
• After finding any non-compliance, it tries to continue scanning the file if possible.
• Eventually, it exits with a non-zero status.
• If it does not find any non-compliance, it prints out an approval statement at the end and exits with zero.
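Because the outcome is reported through the exit status, a script can act on it without reading the printed messages. The Python sketch below shows one way a caller might do that. The function name passes_validation is hypothetical, and the two commands are stand-ins (small Python one-liners that exit with 0 and 1) so that the sketch runs without h5check installed.

```python
import subprocess
import sys


def passes_validation(command):
    """Run a validator command and report whether it exited with status zero."""
    completed = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    return completed.returncode == 0


# Stand-in commands: one exits with zero, the other with a non-zero status.
ok_command = [sys.executable, "-c", "raise SystemExit(0)"]
bad_command = [sys.executable, "-c", "raise SystemExit(1)"]
print(passes_validation(ok_command), passes_validation(bad_command))
```

With the tool on the search path, the command list would instead name h5check and the file to validate.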
How is it implemented? (1/2)
• The tool is coded from scratch and does not use the formal HDF5 library API calls
• It does not link with the HDF5 library at all
• It may borrow code, including algorithms or data structures, from the HDF5 library source code, but only after close verification that they comply with the File Format.
How is it implemented? (2/2)
• It links with the external libraries that the HDF5 library uses, e.g.,
   • zlib
   • szlib
How to use it?
• h5check [-vn] <filename>
   -vn verbosity mode
   n=0 Terse: only prints whether the file is compliant or not
   n=1 Default: prints its progress and all errors found
   n=2 Verbose: prints everything it knows, usually for debugging
Example: a compliant file
% h5check example1.h5
VALIDATING example1.h5
FOUND super block signature
VALIDATING the super block at 0...
VALIDATING the object header at 928...
VALIDATING the btree at 384...
FOUND btree signature.
VALIDATING the local heap at 96...
FOUND local heap signature.
…
Result: File is in compliance.
Example: a non-compliant file
% h5check invalid2.h5
FOUND super block signature
VALIDATING the super block at 0...
VALIDATING the object header at 928...
VALIDATING the btree at 384...
FOUND btree signature.
VALIDATING the SNOD at 1248...
FOUND SNOD signature.
VALIDATING the object header at 976...
check_sym(at 1248): Errors from check_obj_header()
decode_validate_messages(): Failure in type->decode().
H5O_sdspace_decode(): Bad version number in simple dataspace message.
VALIDATING the local heap at 96...
FOUND local heap signature.
Main(): Errors from check_obj_header().
decode_validate_messages(): Failure in type->decode().
H5O_attr_decode(): Can't decode attribute dataspace.
H5O_sdspace_decode(): Bad version number in simple dataspace message.
…
Result: File is not in compliance.
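Both example reports end with a "Result:" line, so a captured report can be reduced to a single verdict. The Python sketch below does this for text in the form shown in the two examples. It assumes the wording of the "Result:" lines matches the slides exactly, which may not hold for a released version of the tool; the function name read_result and the shortened sample reports are hypothetical.

```python
def read_result(report):
    """Return True or False from the last 'Result:' line, or None if absent."""
    verdict = None
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("Result:"):
            verdict = line[len("Result:"):].strip()
    if verdict is None:
        return None
    return verdict == "File is in compliance."


# Shortened stand-ins for the two reports shown in the examples.
good_report = "FOUND super block signature\nResult: File is in compliance.\n"
bad_report = (
    "FOUND super block signature\n"
    "H5O_sdspace_decode(): Bad version number in simple dataspace message.\n"
    "Result: File is not in compliance.\n"
)
print(read_result(good_report), read_result(bad_report))
```

The exit status remains the more robust signal; parsing the text is mainly useful when only a saved log is available.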
Implementation Status
• All basic File Format components are implemented
• Recognition of HDF5 files created by non-default Virtual File Drivers, such as the Multi-File format, is being coded
• Alpha release planned for December 2006
h5ub
• Combines nub, h5jam and h5unjam
• nub -- NPOESS user block tool for HDF5 files