Jan 05, 2016
M. Benno Blumenthal and John del Corral
International Research Institute for Climate and Society
OpenDAP 2007
http://iridl.ldeo.columbia.edu/ontologies/
Using Resource Description Framework (RDF) to carry metadata for datasets
RDF is important for OpenDAP because
• By embedding OpenDAP in an RDF document, metadata (a.k.a. attributes) not understood by OpenDAP code are easily carried in a semantically-valid way
• Explicit relationships between OpenDAP variables can cleanly solve netcdf common name vs OpenDAP GRID/MAP structures, while avoiding retransmission of common independent variables
• Explicit mapping between the different data models of the different OpenDAP APIs
• Support for different languages can be built on top of RDF object support, e.g. Ruby ActiveRDF
Why RDF?
Web-based system for interoperating semantics
A key part of the Semantic Web
RDF/OWL is an interesting technology, but it is even more interesting when it is clear that it can help solve our problems
Standard Metadata
[Diagram labels: Users; Datasets; Tools; Standard Metadata Schema/Data Services]
Many Data Communities
[Diagram labels: five community blocks, each with Tools, Users, Datasets, and a Standard Metadata Schema]
Super Schema
[Diagram labels: the same five community blocks, plus a shared "Standard metadata schema"]
Super Schema: direct
[Diagram labels: the same five community blocks, plus a shared "Standard metadata schema/data service"]
Flaws
• A lot of work
• Super Schema/Service is the Lowest-Common-Denominator
• Science keeps evolving, so that standards either fall behind or constantly change
RDF Standard Data Model Exchange
[Diagram labels: five community blocks (Tools, Users, Datasets, Standard Metadata Schema); RDF attached to each block; a shared "Standard metadata schema"]
RDF Data Model Exchange
[Diagram labels: community blocks (Tools, Users, Datasets, Standard Metadata Schema); RDF attached to each block; a shared "Standard metadata schema"]
RDF Architecture
[Diagram labels: RDF (repeated for multiple sources); Virtual (derived) RDF; queries]
Why is this better?
• Maps the original dataset metadata into a standard format that can be transported and manipulated
• Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but
• When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and “late semantic binding” means everyone can use the new semantic mapping
• Can use enhanced mappings between models that have common concepts beyond the least-common-denominator
• EASIER – tools to enhance the mapping process, mappings build on other mappings
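A minimal Python sketch of the "late semantic binding" point: the original attributes are stored untouched, and a mapping is applied only when the metadata is read. The attribute names, the two mappings, and the `view` helper are illustrative assumptions, not part of the IRI implementation.

```python
# Original, nonstandard attributes are stored exactly as the dataset provides them.
# These attribute names and values are made up for illustration.
original = {
    "long_name": "Sea Surface Temperature",
    "units": "degC",
    "src_code": "T0",
}

# Mappings are data, not code, so a better mapping can be added later.
# Both mappings are hypothetical.
mapping_v1 = {"long_name": "dc:title"}
mapping_v2 = {"long_name": "dc:title", "units": "cfatt:units"}


def view(attributes, mapping):
    """Apply a mapping at read time; unmapped attributes are skipped, not deleted."""
    return {mapping[key]: value for key, value in attributes.items() if key in mapping}


early = view(original, mapping_v1)
later = view(original, mapping_v2)
print(early)
print(later)
```

Because `original` is never rewritten, switching from `mapping_v1` to `mapping_v2` needs no change to the stored metadata.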
[Diagram labels: CF attributes; SWEET Ontologies; Search Terms; CF Standard Names; IRIDL Terms; NC basic attributes; IRIDL attributes; SWEET as Terms; CF Standard Names As Terms; Gazetteer Terms]
Sample Tool: Faceted Search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
Distinctive Features of the search
• Search terms are interrelated
• Terms that describe the set of returns are displayed (spanning and not)
• Returned items also have structure (sub-items and superseded items are not shown)
Architectural Features of the search
• Multiple search structures possible
• Multiple languages possible
• Search structure is kept in the database, not in the code
http://iridl.ldeo.columbia.edu/ontologies/query2.pl
RDF: framework for writing connections
Triplets of
• Subject
• Property (or Predicate)
• Object
URIs identify things, i.e. most of the above
Namespaces are used as a convenient shorthand for the URIs
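As a rough illustration of namespaces as shorthand, the Python sketch below expands a prefixed name into a full URI and builds one triple. The `rdf` and `dc` namespace URIs are the published ones; the `iridl` URI, the `example.org` subject and object URIs, and the `expand` helper are placeholders assumed for this example.

```python
# Namespace prefixes are shorthand for full URIs.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "iridl": "http://example.org/iridl#",  # hypothetical, for illustration only
}


def expand(name):
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown namespace prefix: {prefix!r}")
    return PREFIXES[prefix] + local


# A triple is (subject, property, object); each position can hold a URI.
triple = (
    "http://example.org/data/WOA",           # hypothetical subject URI
    expand("iridl:isContainerOf"),           # property
    "http://example.org/data/WOA/Grid-1x1",  # hypothetical object URI
)

print(expand("dc:title"))
```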
Datatype Properties
{WOA} dc:title “NOAA NODC WOA01”
{WOA} dc:description “NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: [0 m,5500 m]; Time: [Jan,Dec]; monthly”
Object Properties
{WOA} iridl:isContainerOf {Grid-1x1},
{Grid-1x1} iridl:isContainerOf {Monthly}
WOA01 diagram
Standard Properties
{WOA} dcterm:hasPart {Grid-1x1},
{Grid-1x1} dcterm:hasPart {Monthly}
Alternatively
{WOA} iridl:isContainerOf {Grid-1x1},
{iridl:isContainerOf} rdfs:subPropertyOf {dcterm:hasPart}
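The subproperty statement above lets software derive the standard dcterm:hasPart triples from the iridl:isContainerOf ones. The Python sketch below shows that derivation on plain tuples; it is a simplified stand-in for what an RDFS-aware store would do, and the `infer_subproperties` helper is an assumption made for this example.

```python
SUB_PROPERTY_OF = "rdfs:subPropertyOf"

# The stated triples, written as (subject, property, object).
triples = {
    ("WOA", "iridl:isContainerOf", "Grid-1x1"),
    ("Grid-1x1", "iridl:isContainerOf", "Monthly"),
    ("iridl:isContainerOf", SUB_PROPERTY_OF, "dcterm:hasPart"),
}


def infer_subproperties(graph):
    """Return the graph plus every triple implied by rdfs:subPropertyOf."""
    result = set(graph)
    while True:
        # Collect (child property, parent property) pairs.
        parents = {(s, o) for (s, p, o) in result if p == SUB_PROPERTY_OF}
        derived = {
            (s, parent, o)
            for (s, p, o) in result
            for (child, parent) in parents
            if p == child
        }
        if derived <= result:  # nothing new, so a fixed point is reached
            return result
        result |= derived


# Only the derived ("virtual") triples.
virtual = infer_subproperties(triples) - triples
print(sorted(virtual))
```

Looping to a fixed point means chains of subproperties are followed as well, not only a single step.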
netcdf/CF in RDF
{SST} rdf:type {cfatt:non_coordinate_variable},
{SST} cfatt:standard_name {cf:sea_surface_temperature},
{SST} netcdf:hasDimension {longitude}
Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. the otherwise vague meaning of nesting is made explicit
Properties also can be related, since they are objects too
RDF Tools
• Transport/Exchange (RDF/XML)
• Storage
• RDF APIs (Redland, Jena, Sesame)
• Query (SPARQL, SeRQL, …)
• Basic Semantics
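To give a feel for what the query layer does, here is a toy triple-pattern matcher in plain Python. It is not SPARQL or SeRQL and does not use any of the libraries named above; the `match` helper is an assumption for illustration, and the data reuses the SST triples from the netcdf/CF example.

```python
def match(graph, subject=None, prop=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in graph
        if subject in (None, s) and prop in (None, p) and obj in (None, o)
    )


graph = {
    ("SST", "rdf:type", "cfatt:non_coordinate_variable"),
    ("SST", "cfatt:standard_name", "cf:sea_surface_temperature"),
    ("SST", "netcdf:hasDimension", "longitude"),
}

# "What is the standard name of SST?"
names = match(graph, subject="SST", prop="cfatt:standard_name")
print(names)
```

A real query language adds variables shared across several patterns, which is what lets one query join statements about different objects.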
Search Interface Term
• http://iri.columbia.edu/~benno/sampleterm.pdf
Ontologies
Use Conventions to connect concepts to established sets of concepts
Generate additional “virtual” triples from the original set and semantics
RDFS – some property/class semantics
OWL – additional property/class semantics: more sophisticated (ontological) relationships
OWL
Language for expressing ontologies, i.e. the semantics are very important. However, even without a reasoner to generate the implied RDF statements, OWL classes and properties represent a sophistication of the RDF Schema
However, there is a serious split in world view from what we have been talking about: concepts as classes vs concepts as individuals
Faceted Search Explicated
Search Interface
• Items (datasets/maps)
• Terms
• Facets
• Taxa
Search Interface Semantic API
{item} dc:title, dc:description, rss:link, iridl:icon
{item} dcterm:isPartOf {item2}
{item} dcterm:isReplacedBy {item2}
{item} trm:isDescribedBy {term}
{term} a {facet} of {taxa} of {trm:Term}
{facet} a {trm:Facet}
{taxa} a {trm:Taxa}
{term} trm:directlyImplies {term2}
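One way to read trm:isDescribedBy and trm:directlyImplies together is that an item matches a search term if it is described by that term or by any term that implies it. The Python sketch below follows that reading; the item names, term names, and both helper functions are hypothetical, and the real interface may resolve implications differently.

```python
# {term} trm:directlyImplies {term2}, as a dict of sets (hypothetical terms).
directly_implies = {
    "sea_surface_temperature": {"temperature", "ocean"},
    "temperature": {"physical_quantity"},
}

# {item} trm:isDescribedBy {term} (hypothetical items).
described_by = {
    "WOA01 SST": {"sea_surface_temperature"},
    "Station air temperature": {"temperature"},
}


def implied_terms(terms):
    """All terms reachable through directlyImplies, including the starting terms."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(directly_implies.get(term, ()))
    return seen


def search(selected):
    """Items whose implied term set contains every selected term."""
    wanted = set(selected)
    return sorted(
        item
        for item, terms in described_by.items()
        if wanted <= implied_terms(terms)
    )


print(search(["temperature"]))
print(search(["ocean", "temperature"]))
```

Since the implication links live in data rather than in the search code, adding a new term or facet changes the results without changing the functions.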
Faceted Search w/Queries
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
RDF Architecture
[Diagram labels: RDF (repeated for multiple sources); Virtual (derived) RDF; queries]
IRI RDF Architecture
[Diagram labels: Data Servers; Ontologies; MMI; JPL; Standards Organizations; Start Point; RDF Crawler; RDFS Semantics; OWL Semantics; SWRL Rules; SeRQL CONSTRUCT; Search Queries; Location Canonicalizer; Time Canonicalizer; Sesame; Search Interface; bibliography]
[Diagram labels: CF attributes; SWEET Ontologies; Search Terms; CF Standard Names; IRIDL Terms; NC basic attributes; IRIDL attributes; SWEET as Terms; CF Standard Names As Terms; Gazetteer Terms]
RDF is important for OpenDAP because
• By embedding OpenDAP in an RDF document, metadata (a.k.a. attributes) not understood by OpenDAP code are easily carried in a semantically-valid way
• Explicit relationships between OpenDAP variables can cleanly solve netcdf common name vs OpenDAP GRID/MAP structures, while avoiding retransmission of common independent variables
• Explicit mapping between the different data models of the different OpenDAP APIs
• Build on language support of RDF objects
Embedded OpenDAP Ontology
Topics/Issues
• OpenDAP and RDF: can we transport data semantics without fixing the entire schema?
• netcdf/HDF and RDF: do we need non-contextual modeling in our metadata transport/storage?
• Concepts as classes vs concepts as individuals
• Sub-classes vs sub-categories