A Multi-Discipline Metadata Registry for Science Interoperability J. Steven Hughes/JPL - [email protected] Daniel J. Crichton/JPL - [email protected] Jason J. Hyon/JPL - [email protected] Sean C. Kelly/UTA - [email protected] Open Forum on Metadata Registries January 17-21, 2000 Santa Fe, New Mexico

Jan 14, 2016

Transcript
Page 1: A Multi-Discipline Metadata Registry  for Science Interoperability

A Multi-Discipline Metadata Registry for Science Interoperability

J. Steven Hughes/JPL
Daniel J. Crichton/JPL
Jason J. Hyon/JPL
Sean C. Kelly/UTA

Open Forum on Metadata Registries
January 17-21, 2000
Santa Fe, New Mexico

Page 2: A Multi-Discipline Metadata Registry  for Science Interoperability

A Multi-Discipline Metadata Registry for Science Interoperability

•Background

•Problem Statement

•System Overview

•Profile Development

•Conclusion and Issues

Page 3: A Multi-Discipline Metadata Registry  for Science Interoperability

Background

•NASA’s Office of Space Science
  •Planetary Science
    •Planetary Data System (PDS)
      •5 Science discipline nodes - 2 Support nodes - 1 Central node
      •Heterogeneous domains - short term missions
  •Astrophysics
    •Astrophysics Data System
      •100s to 1000s of nodes
      •Homogeneous domains - long term missions
  •Space Physics
    •Space Physics Data System*
      •Several identified nodes

Page 4: A Multi-Discipline Metadata Registry  for Science Interoperability

Background

•Planetary Data System (PDS)

•Archives essentially all science data from solar system exploration missions

•Prototype - 1986, Operational - 1990

•Publishes archive quality products

•Well defined standards architecture

Page 5: A Multi-Discipline Metadata Registry  for Science Interoperability

Background: Planetary Science Standards Architecture

[Diagram. Labels: Volume Organization Stds.; Product Labels; Catalog Templates; Planetary Science Data Dictionary; Object Description Language; Standard Archive Product Architecture; Standard Description; Standard Vocabulary; Standard Grammar; + Peer Review; Archive Quality Product]

Page 6: A Multi-Discipline Metadata Registry  for Science Interoperability

Background

•Planetary Science Data Dictionary
  •1000+ Data Elements spanning Planetary Science disciplines
  •Nomenclature Standard
  •Meaning, type, ranges, enumerated values

•Planetary Science Data Model
  •Developed as Planetary Science enterprise E/R model
    •Planetary Science Entities - Spacecraft, Instruments
    •Science Data Entities - Data Products, Projections, ...
    •Data Organization Entities - Volumes
    •Management Entities - Nodes, Personnel
  •Implemented as the PDS Data Set Catalog in an RDBMS
  •Distributed in Object Description Language

Page 7: A Multi-Discipline Metadata Registry  for Science Interoperability

Background

•Challenge

•Develop single interface for locating space science data.

•Provide data system interoperability.

•Support correlative Science.

Page 8: A Multi-Discipline Metadata Registry  for Science Interoperability

Problem Statement

Space scientists cannot easily locate or use data across the hundreds, if not thousands, of autonomous, heterogeneous, and distributed data systems currently in the Space Science community.

•Heterogeneous Systems
  •Data Management - RDBMS, ODBMS, HomeGrownDBMS, BinaryFiles
  •Platforms - UNIX, LINUX, WIN3.x/9x/NT, Mac, VMS, …
  •Interfaces - Web, Windows, Command Line
  •Data Formats - HDF, CDF, NetCDF, PDS, FITS, VICR, ASCII, ...
  •Data Volume - KiloBytes to TeraBytes

•Heterogeneous Disciplines
  •Moving targets and stationary targets
  •Multiple coordinate systems
  •Multiple data object types (images, cubes, time series, spectra, tables, binary, document)
  •Multiple interpretations of single object types
  •Multiple software solutions to the same problem
  •Incompatible and/or missing metadata

Page 9: A Multi-Discipline Metadata Registry  for Science Interoperability

Proposed Solution

•Encapsulate individual data systems. (Hide uniqueness.)

•Communicate using metadata that describe resources
  •Data (e.g. data sets, images)
  •non-Data (e.g. catalogs, services)

•Enable interoperability based on metadata compatibility.

•Refocus problem on metadata development.

Page 10: A Multi-Discipline Metadata Registry  for Science Interoperability

Proposed Solution (cont)

• Object Oriented Data Technology Task (OODT)
  – Domain independent data management infrastructure

• Domain independent data structures
  – XML - Standard interchange language
  – Metadata management
    • Resource profile
  – Message passing

• Domain independent system infrastructure
  – CORBA for interoperability between computer systems and languages
  – Message passing to simplify interface design
  – Standardized reusable server components

Page 11: A Multi-Discipline Metadata Registry  for Science Interoperability

System Overview: Object Oriented Data Technology Framework

[Architecture diagram. Labels: Scientist; Web Server; OODT Server; Query Server; Profile Server (several); Product Server; Archive Server; SeaWinds Staging; PDS Staging; PTI Staging; PDS Systems; Sybase; Oracle; Prof (several)]

Page 12: A Multi-Discipline Metadata Registry  for Science Interoperability

System Overview: Profile Service

• Profile describes a resource
  – Available datasets and products
  – Types of resources and where they’re located

• Optionally reference other profile servers

[Diagram. Labels: Profile Server (several); Prof; Data system 1; Data system 2]

Page 13: A Multi-Discipline Metadata Registry  for Science Interoperability

System Overview: Query Service

• Knows how to “crawl” through servers to produce a result
  – Crawls through profiles to discover other profiles and product servers
  – Crawls through product servers to display available products

• Accessible through CORBA API or through web browser
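
The crawl described on this slide can be pictured as a breadth-first traversal over profile servers that reference one another. The sketch below is illustrative only: it models servers as in-memory objects rather than CORBA services, and the names `ProfileServer`, `crawl`, the server names, and the `SEAWINDS_EXAMPLE` resource id are hypothetical, not part of OODT.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ProfileServer:
    """Hypothetical stand-in for a profile server (not the OODT API)."""
    name: str
    # Resource ids described by the profiles this server holds.
    resources: list = field(default_factory=list)
    # Names of other profile servers that this server references.
    references: list = field(default_factory=list)


def crawl(servers, start):
    """Breadth-first walk from `start`, collecting resource ids in visit order."""
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        server = servers[queue.popleft()]
        found.extend(server.resources)
        for ref in server.references:
            # Skip unknown or already visited servers so cycles terminate.
            if ref in servers and ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return found


servers = {
    "root": ProfileServer("root", [], ["pds", "seawinds"]),
    "pds": ProfileServer("pds", ["PDS_DIS_V1.3.n"], ["root"]),
    "seawinds": ProfileServer("seawinds", ["SEAWINDS_EXAMPLE"], []),
}
print(crawl(servers, "root"))
```

The `seen` set matters because profile servers may reference each other in a loop, as `pds` and `root` do here; without it the crawl would not terminate.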

Page 14: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Objective

•Objective
  •Design and develop a domain generic structure that will capture the metadata necessary for identifying and locating science data resources across distributed heterogeneous data systems.

•Result
  •Profile - A resource description (subset of meta-model) sufficient to determine if the resource might resolve a query.

Page 15: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Approach

•Choose a common interchange format.

•Develop a domain generic language.

•Implement domain specific instances.
  •Model the domain.
  •Capture the meta-data.

•Develop system to manage the results.

Page 16: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Choose a common interchange format

•XML
  •eXtensible Markup Language
  •More expressive than HTML
  •Simpler than SGML

•A meta-language used to define domain languages.
  •XSIL - eXtensible Scientific Interchange Language.
  •XIL - Instrument control language.

•Wide acceptance as an interchange format.
  •Electronic data interchange (EDI) standard.

Page 17: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Develop a domain generic language

•Define a generic structure (XML DTD) that can describe heterogeneous domain-specific resources.

•Profile - A resource description with sufficient information to determine if the resource satisfies a query.

•Profile elements
  •name, syntax, unit, value_instance, meaning, alias, …
  •encodes selected domain attributes and their values specific to this resource

•Resource attributes - id, title, discipline, location_id, …

•Profile attributes - id, title, desc, type, data_dictionary_id, …

Page 18: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Develop a domain generic language

prof.dtd

<!ELEMENT PROFILES (PROFILE+)>

<!ELEMENT PROFILE (PROFILE_ATTRIBUTES, RESOURCE)>

<!ATTLIST PROFILE PROFILE_ID CDATA #REQUIRED >

<!ELEMENT PROFILE_ATTRIBUTES (ID, TITLE*, DESC*, TYPE*, STATUS_ID*, SECURITY_TYPE*, PARENT_ID*, CHILD_ID*, REVISION_NOTE*, DATA_DICTIONARY_ID*)>

<!ELEMENT RESOURCE (RESOURCE_ATTRIBUTES, PROFILE_ELEMENT*)>

<!ELEMENT RESOURCE_ATTRIBUTES (RESOURCE_ID, RESOURCE_TITLE, RESOURCE_DISCIPLINE, RESOURCE_AGGREGATION, RESOURCE_CLASS, RESOURCE_LOCATION_ID, RESULT_MIME_TYPE)>

<!ELEMENT PROFILE_ELEMENT (ELEMENT_NAME, ELEMENT_MEANING*, ELEMENT_ALIAS*, VALUE_SYNTAX*, VALUE_UNIT*, (VALUE_INSTANCE | (MINIMUM_VALUE, MAXIMUM_VALUE))*)>
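
A document that follows the element names in prof.dtd can be read with any XML parser. As a minimal sketch, assuming Python's standard `xml.etree.ElementTree` (which parses but does not validate against the DTD), the code below reads a hand-written sample into a plain dictionary. The sample values (`EXAMPLE_PROFILE`, `EXAMPLE_RESOURCE`, the example.org URL) and the helper `read_profiles` are invented for illustration and do not come from the presentation.

```python
import xml.etree.ElementTree as ET

# Hand-written sample using the element names declared in prof.dtd.
SAMPLE = """
<PROFILES>
  <PROFILE PROFILE_ID="EXAMPLE_PROFILE">
    <PROFILE_ATTRIBUTES>
      <ID> EXAMPLE_PROFILE </ID>
      <TYPE> PROFILE </TYPE>
    </PROFILE_ATTRIBUTES>
    <RESOURCE>
      <RESOURCE_ATTRIBUTES>
        <RESOURCE_ID> EXAMPLE_RESOURCE </RESOURCE_ID>
        <RESOURCE_TITLE> Example resource </RESOURCE_TITLE>
        <RESOURCE_DISCIPLINE> PDS </RESOURCE_DISCIPLINE>
        <RESOURCE_AGGREGATION> GRANULE+ </RESOURCE_AGGREGATION>
        <RESOURCE_CLASS> INVENTORY </RESOURCE_CLASS>
        <RESOURCE_LOCATION_ID> http://example.org/ </RESOURCE_LOCATION_ID>
        <RESULT_MIME_TYPE> text/html </RESULT_MIME_TYPE>
      </RESOURCE_ATTRIBUTES>
      <PROFILE_ELEMENT>
        <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
        <VALUE_INSTANCE> MARS </VALUE_INSTANCE>
        <VALUE_INSTANCE> PHOBOS </VALUE_INSTANCE>
      </PROFILE_ELEMENT>
    </RESOURCE>
  </PROFILE>
</PROFILES>
"""


def read_profiles(xml_text):
    """Return {profile id: {element name: [value instances]}}."""
    root = ET.fromstring(xml_text)
    result = {}
    for profile in root.findall("PROFILE"):
        elements = {}
        for pe in profile.findall("RESOURCE/PROFILE_ELEMENT"):
            name = pe.findtext("ELEMENT_NAME").strip()
            # Values are padded with spaces in the slides, so strip them.
            elements[name] = [v.text.strip() for v in pe.findall("VALUE_INSTANCE")]
        result[profile.get("PROFILE_ID")] = elements
    return result


profiles = read_profiles(SAMPLE)
print(profiles)
```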

Page 19: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Profile Example - PDS Distributed Inventory System

<PROFILE PROFILE_ID = "PROFILE_PDS_DIS_V1.3.n" >
  <PROFILE_ATTRIBUTES>
    <ID> PROFILE_PDS_DIS_V1.3.n </ID>
    <TITLE> Planetary Data System - Distributed Inventory System - Profile V1.0 </TITLE>
    <DESC> This profile describes the Planetary Data System (PDS) Distributed Inventory System (DIS) ...
    <TYPE> PROFILE </TYPE>
    <DATA_DICTIONARY_ID> OODT_PDS_DATA_SET_DD_V1.0 </DATA_DICTIONARY_ID>
  </PROFILE_ATTRIBUTES>
  <RESOURCE>
    <RESOURCE_ATTRIBUTES>
      <RESOURCE_ID> PDS_DIS_V1.3.n </RESOURCE_ID>
      <RESOURCE_TITLE> Planetary Data System - Distributed Inventory System </RESOURCE_TITLE>
      <RESOURCE_DISCIPLINE> PDS </RESOURCE_DISCIPLINE>
      <RESOURCE_AGGREGATION> GRANULE+ </RESOURCE_AGGREGATION>
      <RESOURCE_CLASS> INVENTORY </RESOURCE_CLASS>
      <RESOURCE_LOCATION_ID> http://pds.jpl.nasa.gov/pdsbrows.htm </RESOURCE_LOCATION_ID>
      <RESULT_MIME_TYPE> text/html </RESULT_MIME_TYPE>
    </RESOURCE_ATTRIBUTES>

...

Page 20: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Profile Example (cont) - PDS Distributed Inventory System

…
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> DATA_OBJECT_TYPE </ELEMENT_NAME>
      <ELEMENT_MEANING> The data_object_type element provides the type ...
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> IMAGE </VALUE_INSTANCE>
      ...
    </PROFILE_ELEMENT>
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> DATA_SET_NAME </ELEMENT_NAME>
      <ELEMENT_MEANING> The data_set_name element identifies a PDS data set. -- example ...
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL ...
      <VALUE_INSTANCE> VO2 MARS RADIO SCIENCE SUBSYSTEM RESAMPLED LOS …
      ...
    </PROFILE_ELEMENT>
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
      <ELEMENT_MEANING> The target_name element provides the names of the targets ...
      <ELEMENT_ALIAS> ADS.OBJECT_ID </ELEMENT_ALIAS>
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> IDA </VALUE_INSTANCE>
      <VALUE_INSTANCE> JUPITER </VALUE_INSTANCE>
      ...
    </PROFILE_ELEMENT>
  </RESOURCE>
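
The stated purpose of a profile is to decide whether a resource might resolve a query. One way this test could work, sketched here under assumptions, is to compare each queried element against the profile's enumerated `VALUE_INSTANCE` values or its `MINIMUM_VALUE`/`MAXIMUM_VALUE` range. The function names, the dictionary layout, the rule that every queried element must match, and the `EXAMPLE_LATITUDE` element are all hypothetical. Only the values visible on the slide are used, so the elided values are not represented.

```python
def element_matches(element, wanted):
    """True if `wanted` is an enumerated value or lies inside the range."""
    if "values" in element:
        return wanted in element["values"]
    return element["min"] <= wanted <= element["max"]


def might_resolve(profile_elements, query):
    """A resource is a candidate only if every queried element matches.

    Treating an element that the profile does not describe as a
    non-match is an assumption of this sketch.
    """
    for name, wanted in query.items():
        element = profile_elements.get(name)
        if element is None or not element_matches(element, wanted):
            return False
    return True


pds_dis = {
    "DATA_OBJECT_TYPE": {"values": {"IMAGE"}},
    "TARGET_NAME": {"values": {"IDA", "JUPITER"}},
    # Hypothetical ranged element; not taken from the slides.
    "EXAMPLE_LATITUDE": {"min": -90.0, "max": 90.0},
}

print(might_resolve(pds_dis, {"TARGET_NAME": "JUPITER", "DATA_OBJECT_TYPE": "IMAGE"}))
print(might_resolve(pds_dis, {"TARGET_NAME": "VENUS"}))
```

A positive answer only means the resource is worth querying; the profile is a summary, so the resource itself still has to produce the actual result.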

Page 21: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Develop a domain generic language

•Specialize the profile class
  •Profile - One profile to one resource (e.g. inventory)
  •Inventory - One profile to many resources (e.g. data set, image)
    •Minimize profile element attributes
      •no meanings
      •subsets of preferred values
  •Dictionary - One profile to one discipline
    •Maximize profile element attributes
      •aliases, meanings
      •union of all preferred values

Page 22: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Develop a domain generic language

•Profile element hierarchy
  •Dictionary - Planetary Science Data Dictionary
    •data elements - union of all data elements in all profiles
    •preferred values - union of all data element values
    •e.g. TARGET_NAME = {ADRASTEA, …, VENUS}
  •Profile - Planetary Image Atlas - Viking, Galileo, MPF, ...
    •data elements - union of all data elements for all entities managed by resource
    •preferred values - union of data element values
    •e.g. TARGET_NAME = {MARS, DEIMOS, PHOBOS, JUPITER, ...}
  •Inventory - Viking Orbiter Image Catalog
    •data elements - data elements associated with inventory item
    •preferred values - data element values for inventory item
    •e.g. TARGET_NAME = {MARS, DEIMOS, PHOBOS}
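
The hierarchy on this slide amounts to taking set unions upward: each level's preferred values are the union of the values one level below. A small sketch of that roll-up follows, assuming each level is represented as a mapping from element name to a set of values. The helper `union_values` and the `galileo_inventory` entry are illustrative assumptions; only the Viking `TARGET_NAME` values come from the slide.

```python
def union_values(children):
    """Merge element -> value-set mappings from several lower-level profiles."""
    merged = {}
    for child in children:
        for name, values in child.items():
            merged.setdefault(name, set()).update(values)
    return merged


# Inventory level: values for one catalog (from the slide).
viking_inventory = {"TARGET_NAME": {"MARS", "DEIMOS", "PHOBOS"}}
# Assumed second inventory, for illustration only.
galileo_inventory = {"TARGET_NAME": {"JUPITER"}}

# Profile level: union over the inventories the resource manages.
atlas_profile = union_values([viking_inventory, galileo_inventory])
# Dictionary level: union over all profiles (only one in this sketch).
dictionary = union_values([atlas_profile])

print(sorted(atlas_profile["TARGET_NAME"]))
```

Under this reading, every inventory's values are a subset of its profile's values, which in turn are a subset of the dictionary's values.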

Page 23: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Implement domain specific instances

•Apply domain generic language to specific domain. E.g. Space/Earth Science data and other resources.

•Model the domain
  •Data Dictionary
  •Data Model

•Capture the meta-data
  •Extracted from domain metadata repository

Page 24: A Multi-Discipline Metadata Registry  for Science Interoperability

Profile Development: Implement domain specific instances

Inventory Example - PDS Data Set

<RESOURCE>
  <RESOURCE_ATTRIBUTES>
    <RESOURCE_ID> VO1/VO2-M-VIS-5-DIM-V1.0 </RESOURCE_ID>
    <RESOURCE_TITLE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL ...
    <RESOURCE_DISCIPLINE> PDS </RESOURCE_DISCIPLINE>
    <RESOURCE_AGGREGATION> GRANULE+ </RESOURCE_AGGREGATION>
    <RESOURCE_CLASS> DATA </RESOURCE_CLASS>
    <RESOURCE_LOCATION_ID> http://pds.jpl.nasa.gov/cgi-bin/pdsserv.pl?OBJECT_ID=PDS100676 ...
    <RESULT_MIME_TYPE> text/html </RESULT_MIME_TYPE>
  </RESOURCE_ATTRIBUTES>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> DATA_SET_NAME </ELEMENT_NAME>
    <VALUE_INSTANCE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL ...
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> DATA_OBJECT_TYPE </ELEMENT_NAME>
    <VALUE_INSTANCE> IMAGE </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
    <VALUE_INSTANCE> MARS </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> VOLUME_ID </ELEMENT_NAME>
    <VALUE_INSTANCE> VO_2001 </VALUE_INSTANCE>
    ...
    <VALUE_INSTANCE> VO_2014 </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
</RESOURCE>
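
An inventory entry like the one above is described as being extracted from a domain metadata repository. As a rough sketch of that step, assuming the repository row has already been loaded into a Python dictionary, the code below emits a `RESOURCE` element with the children in the order prof.dtd requires. The record contents (`EXAMPLE_DATA_SET_V1.0`, the example.org URL), the record layout, and the helper `build_resource` are hypothetical; this is not the tooling used by the authors.

```python
import xml.etree.ElementTree as ET

# Child order of RESOURCE_ATTRIBUTES as declared in prof.dtd.
ATTRIBUTE_ORDER = (
    "RESOURCE_ID", "RESOURCE_TITLE", "RESOURCE_DISCIPLINE",
    "RESOURCE_AGGREGATION", "RESOURCE_CLASS",
    "RESOURCE_LOCATION_ID", "RESULT_MIME_TYPE",
)


def build_resource(record):
    """Build a RESOURCE element from a metadata record (assumed layout)."""
    resource = ET.Element("RESOURCE")
    attrs = ET.SubElement(resource, "RESOURCE_ATTRIBUTES")
    for tag in ATTRIBUTE_ORDER:
        ET.SubElement(attrs, tag).text = record["attributes"][tag]
    for name, values in record["elements"].items():
        pe = ET.SubElement(resource, "PROFILE_ELEMENT")
        ET.SubElement(pe, "ELEMENT_NAME").text = name
        for value in values:
            ET.SubElement(pe, "VALUE_INSTANCE").text = value
    return resource


# Hypothetical record standing in for a row from a metadata repository.
record = {
    "attributes": {
        "RESOURCE_ID": "EXAMPLE_DATA_SET_V1.0",
        "RESOURCE_TITLE": "Example data set",
        "RESOURCE_DISCIPLINE": "PDS",
        "RESOURCE_AGGREGATION": "GRANULE+",
        "RESOURCE_CLASS": "DATA",
        "RESOURCE_LOCATION_ID": "http://example.org/dataset",
        "RESULT_MIME_TYPE": "text/html",
    },
    "elements": {
        "TARGET_NAME": ["MARS"],
        "DATA_OBJECT_TYPE": ["IMAGE"],
    },
}

resource = build_resource(record)
print(ET.tostring(resource, encoding="unicode"))
```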

Page 25: A Multi-Discipline Metadata Registry  for Science Interoperability

Conclusion: Profile Development - Review

•Choose a common interchange format. (XML)

•Develop a domain generic language. (X2PL) (XML eXtensible Profile Language)

•Implement domain specific instances. (Resource Profiles)

•Develop system to manage the profiles. (Profile Servers)

Page 26: A Multi-Discipline Metadata Registry  for Science Interoperability

Conclusion: Issues

•Develop space science metadata registry
  •~10 high level concepts - “Anchor Points”
  •Complete development of discipline registries

•Determine management policy
  •Design meta-model and mandate conformance
  •Evolve meta-model through voluntary conformance

•Determine space science metadata standards
  •NASA Data Entity Dictionary Specification Language (DEDSL - XML syntax) currently being used