Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Dec 18, 2014

The Hierarchical Data Format (HDF) has been a data format standard in NASA's Earth Observing System Data and Information System (EOSDIS) since the 1990s. Its rich structure, platform independence, full-featured Application Programming Interface (API), and internal compression make it very useful for archiving science data and utilizing them with a rich set of software tools. However, a key drawback for long-term archiving is the complex internal byte layout of HDF files, requiring one to use the API to access HDF data. This makes the long-term readability of HDF data for a given version dependent on long-term allocation of resources to support that version.

The majority of the data from NASA's Earth Observing System (EOS) have been archived in HDF Version 4 (HDF4) format. To address the long-term archival issues for these data, a collaborative study between The HDF Group and NASA's EOSDIS data centers is underway. One of the first activities undertaken has been an assessment of the range of HDF4-formatted data held by NASA to determine which capabilities of the HDF format have been used in practice. Based on the results of this assessment, methods for producing a map of the layout of the HDF4 files held by NASA will be prototyped using a markup-language-based HDF tool. The resulting maps should allow a separate program to read the file without recourse to the HDF API. To verify this, two independent tools based solely on the map files will be developed and tested with a variety of data products archived by NASA.
Transcript
Page 1: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Towards Long-Term Archiving of NASA HDF-EOS and HDF Data

Data Maps and the Use of Mark-Up Language

Ruth Duerr, Mike Folk, Muqun Yang, Chris Lynnes, Peter Cao

Page 2: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Presented at the HDF and HDF-EOS Workshop XI - Nov. 6-8, 2007, Landover, Maryland

Outline

• Background

• Data Mapping Project Description

• Plans and Early Results

Page 4: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

A Concern

• The majority of the data from NASA’s Earth Observing System (EOS) have been archived in HDF Version 4 (HDF4) or HDF-EOS 2 format.

• HDF files have a complex internal byte layout, requiring one to use the API to access HDF data

• Long-term readability of HDF data depends on long-term allocation of resources to support the API
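
To make the byte-layout concern concrete, here is a minimal sketch of what reading an HDF4 file without the API starts with. It assumes the layout described in the published HDF4 specification: a four-byte magic number, followed by linked blocks of 12-byte data descriptors (tag, ref, offset, length), all big-endian. The function names and the synthetic demo buffer are illustrative assumptions, not part of any HDF tool, and the sketch stops at listing descriptors; interpreting each tagged element is where most of the complexity lies.

```python
import struct

# Assumption: first four bytes of an HDF4 file, per the HDF4 specification.
HDF4_MAGIC = b"\x0e\x03\x13\x01"


def read_dd_block(buf, pos):
    """Parse one data descriptor (DD) block starting at byte offset pos.

    Returns the descriptors in the block and the offset of the next block
    (assumed to be 0 when there is no further block).
    """
    ndds, next_block = struct.unpack_from(">HI", buf, pos)
    pos += 6
    dds = []
    for _ in range(ndds):
        tag, ref, offset, length = struct.unpack_from(">HHII", buf, pos)
        dds.append({"tag": tag, "ref": ref, "offset": offset, "length": length})
        pos += 12
    return dds, next_block


def list_descriptors(buf):
    """Follow the chain of DD blocks and return every descriptor found."""
    if buf[:4] != HDF4_MAGIC:
        raise ValueError("not an HDF4 file")
    all_dds, pos = [], 4
    while True:
        dds, next_block = read_dd_block(buf, pos)
        all_dds.extend(dds)
        if next_block == 0:
            return all_dds
        pos = next_block


# Synthetic buffer with arbitrary tag/ref values, for illustration only.
demo = (
    HDF4_MAGIC
    + struct.pack(">HI", 2, 0)            # one block, two descriptors, no next block
    + struct.pack(">HHII", 720, 1, 34, 4)
    + struct.pack(">HHII", 702, 1, 38, 8)
    + bytes(12)                           # placeholder data elements
)
for dd in list_descriptors(demo):
    print(dd)
```

Even this first step depends on knowing the descriptor layout in advance, which is the kind of knowledge a map would need to carry forward.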

Page 5: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

A Proposal from the Workshop Last Year

• Chris Lynnes noted that:

  • What was needed was a map to the contents of an HDF file

  • The output of the HDF4 tools (e.g., hdfls, hdp, etc.) already provides much of the information needed

  • Extending these tools to create a map to the contents of the file might be feasible

Page 7: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Data Mapping Project Description

• Assess and categorize NASA holdings of HDF4 data

• Investigate methods of mapping HDF4 files

• Develop requirements for tools to create maps of HDF4 files

• Create a prototype tool to create maps

• Test the utility of these maps by developing 2 independent tools that use the maps to read real data

Page 8: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Data Mapping Project Description (continued)

• Assess the utility of this approach

• Document our findings

• Present results and options for proceeding to the user community

• Evaluate the effort required for a full solution that meets community needs

• Submit a proposal for that effort

Page 10: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Assess and Categorize NASA Holdings

While the volume of NASA data stored in HDF4/HDF-EOS2 format is measured in petabytes (PB), the fraction of the total number of NASA data sets archived in HDF4/HDF-EOS2 is “small”

• NASA provided a starter list of data sets held

• NASA data centers were requested to provide a list at a project briefing

• Results from each DAAC are being compared to the ECHO assessment of data sets using a .hdf extension

Page 11: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Assess and Categorize NASA Holdings (continued)

• Examples of each of the HDF4 data sets have been obtained and examined*

• Information kept is summarized below:

  • Product id/name
  • Data Center
  • Product Version
  • Multi-file product?
  • HDF-EOS info (if any)
    • HDF-EOS version
    • Point info
    • Swath info
    • Grid info
  • HDF info
    • Version
    • Raster image info
    • Palette
    • SDS info
    • Vdata info
    • Annotation

* For the most part

Page 12: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Assess and Categorize NASA Holdings (continued)

• Very preliminary findings

  • Roughly 50/50 split between HDF-EOS and plain HDF

  • Point data is relatively rare and, when found, is not accompanied by swath or grid data

  • No indexes yet

  • While a few products use the image types, there are no palettes yet

Page 13: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Investigate Methods of Mapping HDF4 Files

• NSIDC and GES-DISC have provided sample data files to The HDF Group (THG)

• Preliminary priorities for capabilities to tackle:

  • Contiguous SDS
  • Contiguous SDS with unlimited dimension
  • Chunked SDS
  • Compressed SDS
  • Chunked and compressed SDS
  • SDS and attributes
  • Vdata and attributes
  • Annotation
  • Vgroup
  • Raster image and attributes
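
The first few priorities differ mainly in how the bytes of an array are laid out in the file. The sketch below is one way a map-based reader might handle them, under loud assumptions: the map entry is a hypothetical dictionary (not the project's format) that lists byte extents in the order they should be concatenated, values are big-endian, and compressed pieces are plain zlib streams. Real HDF4 chunked and compressed elements carry additional header information, and real chunks need index arithmetic to be placed in the array, neither of which is modeled here.

```python
import io
import struct
import zlib


def read_values(f, entry):
    """Read array values using only the byte ranges listed in a map entry.

    entry is a hypothetical map record with keys:
      fmt         struct format character for one value, e.g. "h" for int16
      count       total number of values
      extents     list of (offset, length) byte ranges, in concatenation order
      compression None, or "deflate" if each extent is a separate zlib stream
    """
    pieces = []
    for offset, length in entry["extents"]:
        f.seek(offset)
        piece = f.read(length)
        if entry.get("compression") == "deflate":
            piece = zlib.decompress(piece)
        pieces.append(piece)
    raw = b"".join(pieces)
    # Assumption: values are stored big-endian.
    return struct.unpack(">%d%s" % (entry["count"], entry["fmt"]), raw)


# Synthetic "file": 10 bytes of padding, then four int16 values stored
# once uncompressed and once as a zlib stream.
values = (1, 2, 3, 4)
raw = struct.pack(">4h", *values)
packed = zlib.compress(raw)
demo_file = io.BytesIO(bytes(10) + raw + packed)

contiguous = {"fmt": "h", "count": 4, "extents": [(10, 8)], "compression": None}
compressed = {"fmt": "h", "count": 4, "extents": [(18, len(packed))],
              "compression": "deflate"}
print(read_values(demo_file, contiguous))
print(read_values(demo_file, compressed))
```

Treating every case as a list of extents keeps the reader small: a contiguous SDS is simply the one-extent case, and a chunked SDS is the many-extent case.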

Page 15: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Develop Requirements for Tools to Create Maps

• Maps will be XML-based

• A draft of a map format specification has been started
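
Since the map format specification is still a draft, the following is only a sketch of what an XML-based map and a reader for it could look like. Every element and attribute name here (hdf4map, dataset, dimension, extent, and so on) is a hypothetical placeholder, not the project's schema. The point is that a program needs nothing beyond a standard XML parser to learn where a data set's bytes live.

```python
import xml.etree.ElementTree as ET

# Hypothetical map for one data set: names, types and offsets are invented.
EXAMPLE_MAP = """\
<hdf4map file="example.hdf">
  <dataset name="temperature" datatype="int16" byteorder="big">
    <dimension size="2"/>
    <dimension size="3"/>
    <extent offset="294" length="12"/>
  </dataset>
</hdf4map>
"""


def parse_map(xml_text):
    """Return a dict of data set descriptions keyed by data set name."""
    root = ET.fromstring(xml_text)
    datasets = {}
    for ds in root.findall("dataset"):
        datasets[ds.get("name")] = {
            "datatype": ds.get("datatype"),
            "byteorder": ds.get("byteorder"),
            "shape": tuple(int(d.get("size")) for d in ds.findall("dimension")),
            "extents": [
                (int(e.get("offset")), int(e.get("length")))
                for e in ds.findall("extent")
            ],
        }
    return datasets


info = parse_map(EXAMPLE_MAP)["temperature"]
print(info["shape"], info["extents"])
```

A plain-text, self-describing map like this could be archived alongside the data file and read by tools written in any language.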

Page 16: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Create a Prototype Tool to Create Maps

• An iterative process is being used to create the prototype

• Each iteration adds the next capability from the prioritized list shown earlier

• At this point, the tool just creates a text description

Page 17: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Communications Plan

• Bi-weekly telecons with our sponsors (may move to monthly)

• Briefing to NASA Data Center managers held; we expect to provide periodic updates

• Brief the community at the HDF Workshop and other relevant meetings (e.g., AGU)

• Submit a paper to the special issue of IEEE Transactions on Geoscience and Remote Sensing devoted to Data Archiving and Distribution

• Public wiki established but not yet populated

Page 18: Towards Long-Term Archiving of NASA HDF-EOS and HDF Data - Data Maps and the Use of Mark-Up Language

Summary

• We’ve started a project to assess and prototype the ability to create maps to the contents of HDF4 files that allow programmers to develop code to read data without using the HDF APIs

• We welcome community involvement