HDF4 and HDF5 Performance Preliminary Results Elena Pourmal IV HDF-EOS Workshop September 19 - 21 2000

Dec 21, 2015

Bruce Foster

HDF4 and HDF5 Performance: Preliminary Results

Elena Pourmal

IV HDF-EOS Workshop

September 19 - 21 2000

Why compare?

• HDF5 emerges as a new standard
  – proved to be robust
  – most of the planned features have been implemented in HDF5-1.2.2
  – has a lot of new features compared to HDF4
  – time for performance study and tuning
• Users move their data and applications to HDF5
• HDF4 is not “bad,” but has limited capabilities

HDF5 | HDF4
Files over 2GB | Files less than 2GB
Unlimited number of objects | Maximum of 20000 objects
One data model (multidimensional array of structures) | Different data models for SD, GR, RI, Vdatas
Parallel support | N/A
Thread safe | N/A
Mounting files | N/A
Diversity of datatypes (compound, VL, opaque) and operations (create, write, read, delete, shared) | Only predefined datatypes such as float32, int16, char8
“Native” file is portable | “Native” file is not portable
Modifiable I/O pipeline (registration of compression methods) | N/A
Selections (unions and regular blocks) | Selections (simple regular subsampling)

What to compare? (short list of common features)

• File I/O operations
  – plain read and write
  – hyperslab selections
  – regular subsampling
  – access to large number of objects
  – storage overhead
• Data organization in the file and access to it
  – Vdata vs compound datasets
• Chunking, unlimited dimensions, compression

Benchmark Environment

• 440 MHz UltraSPARC-IIi
  – 1 GB memory
  – SunOS 5.7
  – gettimeofday()
• 2 × 550 MHz Pentium III Xeon
  – 1 GB memory
  – RedHat 6.2
  – clock()
• Each measurement was taken 10 times; average and best times were collected
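
The two platforms were timed with different calls: gettimeofday() reports wall-clock time, while clock() reports processor time, so the numbers may not be directly comparable across machines. The repeat-and-summarize procedure can be sketched in Python as follows. This is a minimal illustration, not the benchmark code: `write_buffer` is a hypothetical stand-in for the measured I/O call, and `time.perf_counter` and `time.process_time` play the roles of the wall-clock and CPU-time clocks.

```python
import time


def time_runs(operation, repeats=10, clock=time.perf_counter):
    """Run `operation` `repeats` times; return (average, best) elapsed time."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    elapsed = []
    for _ in range(repeats):
        start = clock()
        operation()
        elapsed.append(clock() - start)
    return sum(elapsed) / len(elapsed), min(elapsed)


def write_buffer():
    # Hypothetical stand-in for the call being measured.
    bytes(8 * 1024)


# Wall-clock timing (comparable in spirit to gettimeofday()).
average, best = time_runs(write_buffer)

# CPU-time timing (comparable in spirit to clock()).
cpu_average, cpu_best = time_runs(write_buffer, clock=time.process_time)
```

Reporting the best time alongside the average helps separate the cost of the call itself from interference by other system activity.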

Benchmarks

• Writing 1Dim and 2Dim datasets of integers
• Reading 2Dim contiguous hyperslabs of integers
• Reading 2Dim contiguous hyperslabs of integers with subsampling
• Reading fixed size hyperslabs of integers from different locations in the dataset
• Writing and reading Vdatas and Compound Datasets
• CERES data

Writing 1Dim and 2Dim Datasets

Writing 1Dim Datasets

• In this test we created one-dimensional arrays of integers with sizes varying from 8 Kbytes to 8000 Kbytes in steps of 8 Kbytes. We measured the average and best times for writing these arrays into HDF4 and HDF5 files.

• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.

Writing 1Dim dataset (best time)

[Figure: time (seconds) versus dataset size (Kbytes); series: HDF4, HDF5.]

Writing 1Dim Datasets

HDF5 performs about 8 times better than HDF4. System activity affects timing results.

Writing 2Dim Datasets

• In this test we created two-dimensional arrays with sizes varying from 40 X 40 bytes to 4000 X 4000 bytes in steps of 40 bytes for each dimension. We measured the average and best times for writing these arrays into HDF4 and HDF5 files. The graphs were plotted by averaging the values obtained for the same array size, without considering the shape of the array.

• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
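
The averaging step described above (one value per total array size, regardless of shape) can be sketched in Python. This is an illustration under assumptions: the timings in `sample` are made up, and `average_by_size` is a hypothetical helper, not part of the benchmark.

```python
from collections import defaultdict


def average_by_size(measurements):
    """Map total size (rows * cols) to the mean time over all shapes of that size."""
    grouped = defaultdict(list)
    for (rows, cols), seconds in measurements.items():
        grouped[rows * cols].append(seconds)
    return {size: sum(times) / len(times) for size, times in sorted(grouped.items())}


# Shapes in the test: each dimension runs from 40 to 4000 in steps of 40.
dims = range(40, 4001, 40)
shapes = [(rows, cols) for rows in dims for cols in dims]

# Made-up timings for three shapes; two of them hold the same number of bytes.
sample = {(40, 80): 0.010, (80, 40): 0.014, (40, 40): 0.005}
averaged = average_by_size(sample)
```

Here the 40 X 80 and 80 X 40 arrays collapse into a single point at 3200 bytes, which is why the plot has one value per size.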

Writing 2Dim Datasets (best time)

[Figure: time (microseconds) versus dataset size (Kbytes); series: HDF4, HDF5.]

Writing 2Dim Datasets

HDF4 shows nonlinear growth. HDF5 performs about 10 times better than HDF4.

Reading 2Dim Contiguous Hyperslabs

Reading Contiguous Hyperslabs

• In this test we created a file with a 1000 X 1000 array of integers. Subsequently, we read hyperslabs of different sizes starting from a fixed position in the array, and the read measurements were averaged over 10 runs. HDF5-1.2.2, HDF5-1.2.2-patched and HDF5 development libraries were tested.

• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
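
For readers unfamiliar with the term, a contiguous hyperslab is a rectangular block of the array given by a start position and a count per dimension. The sketch below shows the idea in plain Python on a row-major array held in a flat list; it does not use the HDF4 or HDF5 APIs, and `read_hyperslab` is a hypothetical helper.

```python
def read_hyperslab(flat, shape, start, count):
    """Copy a rectangular block out of a row-major 2-D array stored as a flat list."""
    n_rows, n_cols = shape
    row0, col0 = start
    rows, cols = count
    if row0 + rows > n_rows or col0 + cols > n_cols:
        raise ValueError("hyperslab extends past the array bounds")
    block = []
    for r in range(row0, row0 + rows):
        offset = r * n_cols + col0  # first element of this row's contiguous run
        block.append(flat[offset:offset + cols])
    return block


shape = (1000, 1000)
flat = list(range(shape[0] * shape[1]))  # element value equals its flat index
slab = read_hyperslab(flat, shape, start=(2, 3), count=(2, 4))
```

Each row of the hyperslab is one contiguous run in storage, so a block with more rows generally means more separate runs to fetch.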

Hyperslab selection, best time HDF5-1.2.2

[Figure: time (microseconds) versus size of hyperslab (number of elements); series: HDF4, HDF5.]

Reading Hyperslabs

For hyperslabs > 1MB, HDF5 becomes more than 3 times slower than HDF4. It also shows nonlinear growth.

Hyperslab selection, best time HDF5 development branch

[Figure: time (microseconds) versus size of hyperslab (number of elements); series: HDF4, HDF5.]

Reading Hyperslabs (latest version of the HDF5 development branch)

For hyperslabs > 2MB, HDF5 becomes about 1.5 times slower than HDF4. It still shows nonlinear growth.

Reading contiguous hyperslabs (fixed size)

• In this test, the size of the hyperslab was fixed to 100x100 elements. The hyperslab was moved, first along the X axis, then along the Y axis, and finally along the diagonal, and the read performance was measured.

• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
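
One way to picture the moving hyperslab is as a list of start offsets. The Python sketch below rests on assumptions the slides do not state: the array is taken to be 1000x1000, the step between positions is taken to equal the slab size, and X is taken to be the column direction. `slab_positions` is a hypothetical helper.

```python
def slab_positions(array_size=1000, slab_size=100, step=100):
    """Return (row, col) start offsets for a slab moved along X, Y, and the diagonal."""
    offsets = range(0, array_size - slab_size + 1, step)
    along_x = [(0, k) for k in offsets]   # fixed row, moving column
    along_y = [(k, 0) for k in offsets]   # fixed column, moving row
    diagonal = [(k, k) for k in offsets]  # row and column move together
    return along_x, along_y, diagonal


along_x, along_y, diagonal = slab_positions()
```

With these assumed values each direction yields 10 positions, and every slab stays inside the array.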

Selection of 100x100 hyperslab (best time)

[Figure: time (microseconds) for events 1 to 10; series: HDF4, HDF5-1.2.2, HDF5-1.2.2-patched, HDF5 development.]

Reading 100x100 Hyperslabs from Different Locations

For small hyperslabs HDF5 performs about 3 times better than HDF4.

Reading Hyperslabs with Subsampling

Subsampling Hyperslabs

• In this test we created a file with a 1000x1000 array of integers. Subsequently, we read every second element of hyperslabs of different sizes starting from a fixed position in the array, and the read measurements were averaged over 10 runs. HDF5-1.2.2 and HDF5 development libraries were tested.

• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
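
Subsampling a hyperslab amounts to applying a stride to the selection. A minimal Python sketch on a nested list follows; it assumes the stride of 2 applies in both dimensions, which the slides do not spell out, and `subsample` is a hypothetical helper rather than a library call.

```python
def subsample(data, start, count, stride=2):
    """Take every `stride`-th element, in both dimensions, of a rectangular block."""
    row0, col0 = start
    rows, cols = count
    return [
        row[col0:col0 + cols:stride]
        for row in data[row0:row0 + rows:stride]
    ]


# Small 10x10 example where each value encodes its position (row * 10 + col).
data = [[r * 10 + c for c in range(10)] for r in range(10)]
picked = subsample(data, start=(1, 1), count=(4, 6))
```

The selected elements are no longer adjacent in storage, so a library has to either issue many small reads or read the enclosing block and discard the unwanted elements.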

Hyperslabs with subsampling each second element (best time)

[Figure: time (seconds) versus size of hyperslab (number of elements); series: HDF4, HDF5.]

Reading Each Second Element of the Hyperslabs

HDF5 shows nonlinear growth. HDF4 performs about 3 times better for hyperslabs with size > 0.5MB.

Hyperslabs with selection (best time)

[Figure: time (minutes) versus size of hyperslab; series: HDF4, HDF5, HDF5 (latest).]

First Attempt to Improve the Performance

HDF4 still performs 2 times better for the hyperslabs > 2MB. HDF5 shows nonlinear growth.

Hyperslab with selection (best time)

[Figure: time (seconds) versus hyperslab size (number of elements); series: HDF4, HDF5.]

Current Behavior (HDF5 development branch)

HDF5 growth is linear, and HDF5 performs about 10 times better than HDF4.

Vdatas vs Compound Datasets

Vdatas and Compound Datasets

• In this test we created HDF4 files with a Vdata and HDF5 files with a compound dataset, with sizes from 1000 to 1000000 records. Each record has the fields:

• float a; short b; float c[3]; char d;

• Write, write with data packing, and partial read operations were tested.

• The test was performed on Linux platforms. We also looked into data conversion issues.
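
The record layout above takes 19 bytes when the fields are stored back to back, while a C compiler will usually pad the in-memory structure for alignment. Python's `struct` module can illustrate the difference; this is a sketch of the layout only, not of the Vdata or H5T calls, and the native size depends on the platform.

```python
import struct

# Packed layout: fields back to back, no padding ("<" also fixes little-endian order).
PACKED = struct.Struct("<fh3fc")

# Native layout: C-style alignment; the trailing "0f" pads the end of the
# record to a float boundary, as a C struct would be padded in an array.
NATIVE = struct.Struct("@fh3fc0f")

# Pack one record (a, b, c[0], c[1], c[2], d) and read it back.
record = PACKED.pack(1.5, 7, 0.1, 0.2, 0.3, b"x")
a, b, c0, c1, c2, d = PACKED.unpack(record)

packed_size = PACKED.size   # 4 + 2 + 3*4 + 1 bytes
native_size = NATIVE.size   # platform dependent, never smaller than packed_size
```

Packing trades the padding bytes for a copy step between the in-memory layout and the stored layout, which is the cost the packing tests measure.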

Writing Vdatas and Compound Datasets (average time)

[Figure: time (in seconds) versus number of records (19 bytes each); series: HDF4 native, HDF4 with conversion, HDF5 native, HDF5 with conversion.]

Conversion does not affect HDF4 performance. It does affect HDF5 (by more than 15 times).

Writing Data (VSwrite and H5Dwrite)

Writing Vdatas and Compound Datasets: Effect of data packing in HDF4 and HDF5 (average time)

[Figure: time (seconds) versus number of records; series: HDF4, HDF4 with packing, HDF5, HDF5 with packing.]

Data packing was added to the previous test. For HDF5 the effect is very small.

Writing Data (timing includes packing: VSpack and H5Tpack)

Reading Vdatas and Compound datasets: Native read (average time)

[Figure: time (seconds) versus number of records; series: HDF4, HDF4 without unpacking data, HDF5.]

Reading Two Fields

Unpacking slows down HDF4 significantly (about 8 times). HDF5 was reading packed data in this test.

CERES Data File

Structure of CERES file

[Diagram: Vgroup CERES_ES8 at the top, with Vgroup Geolocation Fields and Vgroup Data Fields below it. The leaf objects are labeled SDS, SDS, Vdata, Vdata, with counts 18, 19, 2, 1.]

CERES File

• Used H4toH5 converter to create an HDF5 version of the file
  – 81MB (HDF4), 80MB (HDF5)
  – 1 min 55 sec on Linux
  – 3 min 56 sec on Solaris
• Benchmarks
  – read up to 14 datasets (2148x660 floats)
  – subsampling: read two columns from the same datasets
• Benchmark was run on Solaris and Linux platforms
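
To put the benchmark sizes in perspective, the arithmetic below estimates how many bytes each read touches. It assumes 4-byte floats, which the slides do not state explicitly.

```python
ROWS, COLS = 2148, 660
FLOAT_BYTES = 4          # assumption: 32-bit floats
DATASETS = 14

# Bytes in one full dataset and in all 14 datasets.
dataset_bytes = ROWS * COLS * FLOAT_BYTES
all_datasets_bytes = dataset_bytes * DATASETS

# Bytes actually requested when only two columns of one dataset are read.
two_columns_bytes = ROWS * 2 * FLOAT_BYTES

# Fraction of a dataset that the two-column selection represents.
selected_fraction = two_columns_bytes / dataset_bytes
```

Under that assumption one dataset is about 5.7 million bytes, while the two-column selection asks for only about 17 thousand, so the subsetting benchmark mostly measures selection overhead rather than raw transfer.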

Reading CERES data on different platforms (best times)

[Figure: time (seconds) versus number of 2148x660 datasets read (1 to 14); series: HDF4 (LE), HDF5 (LE), HDF4 (BE), HDF5 (BE).]

Reading CERES data on big- and little-endian machines

On the Solaris platform, HDF5 was twice as fast as HDF4. On Linux (with data conversion on), HDF4 was about 1.3-1.5 times faster.
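
The data conversion mentioned for Linux is a byte-order conversion: values stored in one byte order must be swapped when read on a machine with the other. A minimal Python sketch follows, assuming the stored values are big-endian 32-bit floats; the sample values are made up.

```python
import struct

# Three floats as they would be stored in big-endian byte order.
big_endian = struct.pack(">3f", 1.0, 2.5, -4.0)

# Decoding with the correct byte order recovers the values.
values = struct.unpack(">3f", big_endian)

# Re-encoding as little-endian reverses the bytes within each 4-byte value.
little_endian = struct.pack("<3f", *values)
```

On a machine whose native order matches the file, this step is skipped entirely, which is one reason the same read can cost more on one platform than on another.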

Selection of two columns from 2148x660 CERES dataset (best times)

[Figure: time (seconds) versus number of datasets (1 to 14); series: HDF4, HDF5, HDF5 tuned.]

Subsetting CERES Data

Current version of HDF5 shows about 3 times better performance.

Conclusion

• Goal: tune HDF5 and give our users recommendations on its efficient usage
• Continue to study HDF4 and HDF5 performance
  – try more platforms: O2K, NT/Windows
  – try other features (e.g. chunking, compression)
  – specific HDF5 features (e.g. writing/reading big files, VL datatypes, compound datatypes, selections)
• User input is necessary: send us the access patterns you use!
• Results will be available at http://hdf.ncsa.uiuc.edu