Opportunities for earth science data interoperability through coordinated semantic development, using a shared model for observations and measurements. Mark Schildhauer, Director of Computing, NCEAS. Logan, Utah: CUAHSI Conference on Hydrologic Data and Information Systems, June 2011. SONet

Dec 31, 2015

Page 1: Opportunities for earth science data interoperability through coordinated semantic development, using a shared model for observations and measurements.

Opportunities for earth science data interoperability through coordinated semantic development, using a shared model for observations and measurements

Mark Schildhauer

Director of Computing, NCEAS

Logan Utah:

CUAHSI Conference on Hydrologic Data and Information Systems

June 2011

SONet

Page 2

Integrative Environmental Research

• Analyses require a wide range of data

– Broad scales: geospatial, temporal, biological (micro-macro)

– Diverse topics: abiotic and biotic phenomena

• Predicting impact of invasive insect species on crop production

• Documenting effects of climate change on forest composition

• Large amounts of relevant data…

– E.g., over 25,000 data sets are available in the Knowledge Network for Biocomplexity repository (KNB, http://knb.ecoinformatics.org)

• But researchers struggle to…

– Discover relevant datasets for a study

– And combine these into an integrated product to analyze

Page 3

SHARED KNOWLEDGE MODELS for Discovery, Access, Interpretation, Re-use:

• Need consistency and rigor in terminology

• Standardized protocols, methods when possible

• Interoperability (syntax)

• Comparability (semantics)

• Minimally, need a “shared community vocabulary”

• For hydrologists: WaterML?

• For broader, integrative environmental science: ?

Page 4

SHARED KNOWLEDGE MODELS

• Metadata and keywords are a good start, but not enough: ambiguous, idiosyncratic, hard to parse

• Controlled vocabularies: an improvement, but can do more with today’s technology

Page 5

SHARED KNOWLEDGE MODELS

– Ontologies provide a “shared vocabulary”

• Common “external” definitions (namespaces)

• for explicating relationships among terms

• describing data schemas (observations)

• for machine-assisted discovery, reasoning, integration

– Standard technologies for creating and operating on ontologies:

– Syntaxes: RDF, SKOS, OWL

– FOSS applications and frameworks: Jena, Protégé

– Standard Reasoners: Pellet, FaCT++, Racer

Page 6

Another Opportunity: Observational data

Environmental and earth science data often consists of “observations”

• Data sets are often stored in tables (e.g., flat files, spreadsheets)

• Represent collections of associated measurements

• Highly heterogeneous (format, content, semantics)

• (Cell) values represent measurements

Page 7

Examples of “raw” observational data

Page 8

Several prospective observation models…

Project | Domain | Observational data model

VSTO | Atmospheric sciences | Ontologies for interoperability among different meteorological metadata standards and other atmospheric measurements

SERONTO | Socioecological research | Ontology for integrating socio-ecological data

OGC’s O&M | Geospatial | Observations and Measurements standard for enhancing sensor data interoperability

SEEK’s OBOE | Ecology | Extensible Observation Ontology for describing data as observations and measurements

PATO’s EQ | Phenotype/Evolution | Underlying model for describing phenotypic traits to link with genomic data

Page 9

Observational Data Models

• High degree of similarity across independently derived models

• Opportunity to enable enhanced data interoperability and uniform access

– Domain-neutral “foundational” template

– Abstracts away underlying format issues

– Domain ontologies “extend” core concepts, to formalize semantics of terms used to describe measurements

Page 10

Observational Data Model

• Implemented as an OWL-DL ontology

– Provides basic concepts for describing observations

– Specific “extension points” for domain-specific terms

[Diagram: observational data model with classes Entity, Characteristic, Observation, Measurement, Protocol, Standard, and Value; attributes precision : decimal and method : anyType; associations labeled Context and ObservedEntity, with multiplicities 1..1, 0..1, and *.]

Page 11

Observational Data Model

Observations are of entities (e.g., River, Water, Sample, …)

– An observation can have multiple measurements

– Each measurement is taken of the observed entity

[Diagram: observational data model with classes Entity, Characteristic, Observation, Measurement, Protocol, Standard, and Value; attributes precision : decimal and method : anyType; associations labeled Context and ObservedEntity, with multiplicities 1..1, 0..1, and *.]

Page 12

Observational Data Model

A measurement consists of

– The characteristic measured (e.g., Ammonium concentration)

– The standard used (e.g., unit, coding scheme)

– The measurement protocol

– The measurement value

[Diagram: observational data model with classes Entity, Characteristic, Observation, Measurement, Protocol, Standard, and Value; attributes precision : decimal and method : anyType; associations labeled Context and ObservedEntity, with multiplicities 1..1, 0..1, and *.]

Page 13

Observational Data Model

Observations can have context

– E.g. geographic, temporal, or biotic/abiotic environment in which some measurement was taken

– Context is an observation too (entity + characteristic)

– Context is transitive

[Diagram: observational data model with classes Entity, Characteristic, Observation, Measurement, Protocol, Standard, and Value; attributes precision : decimal and method : anyType; associations labeled Context and ObservedEntity, with multiplicities 1..1, 0..1, and *.]
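
The model described on Pages 10 to 13 can be sketched in plain Python. This is only an illustration, not the OBOE ontology itself: the class and field names follow the slide vocabulary, while the sample entities, units, and the `all_context` helper are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Measurement:
    characteristic: str              # what was measured, e.g. "AmmoniumConcentration"
    standard: str                    # unit or coding scheme
    value: Any                       # the measured value
    protocol: Optional[str] = None   # how the measurement was taken
    precision: Optional[float] = None


@dataclass
class Observation:
    entity: str                                            # the observed entity
    measurements: List[Measurement] = field(default_factory=list)
    context: List[Observation] = field(default_factory=list)

    def all_context(self) -> List[Observation]:
        """Follow context links transitively (context of context, and so on)."""
        seen: List[Observation] = []
        stack = list(self.context)
        while stack:
            obs = stack.pop()
            if not any(obs is s for s in seen):
                seen.append(obs)
                stack.extend(obs.context)
        return seen


# Hypothetical example: water taken from a sample that came from a river.
river = Observation("River")
sample = Observation("WaterSample", context=[river])
water = Observation(
    "Water",
    measurements=[Measurement("AmmoniumConcentration", "MilligramPerLiter", 0.12)],
    context=[sample],
)

print([obs.entity for obs in water.all_context()])
```

Because context is itself an observation, the river is reachable from the water observation even though it is only declared as the context of the sample.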

Page 14

Similarities among Observational Data Models

OGC’s Observations and Measurements (O&M)

[Diagram. Classes: FeatureOfInterest, ObservationContext, ObservedProperty, OM_Observation, OM_Process, Result. Properties: carrierOfCharacteristic, forProperty, relatedContextObservation, hasResult, usesProcedure, ofFeature.]

Page 15

Similarities among Observational Data Models

SEEK/Semtools Extensible Observation Ontology (OBOE)

[Diagram. Classes: Entity, Characteristic, Observation, Measurement, Standard, Protocol, Precision, Context (other Observation). Properties: hasCharacteristic, hasMeasurement, ofEntity, hasContext, usesStandard, usesProtocol, hasPrecision, ofCharacteristic, hasValue.]

Page 16

Similarities among Observational Data Models

Seronto basic classes

[Diagram. Classes: value_set, physical_thing, parameter_method, parameter, method, selection_description, scale, unit, value_nominal, value_float. Properties: hasParameterMethod, hasInvestigationItem, hasValue, hasSample, hasMethod, hasParameter, hasScale, hasUnit.]

Page 17

SHARED KNOWLEDGE

MODELS

NSF INTEROP program: foster communication among domains to enable greater interoperability

• Scientific Observations Network, SONet

• Many earth and life science domains participating

• Advanced conceptual modeling

• Unifying abstraction of ‘observation’

• Semantic web & ontologies

• Domain scientists & knowledge engineers

SONet

Page 18

Developing a core model (SONet project)

Identify the key observational models in the earth and environmental sciences

Are these various observational models easily reconciled and/or harmonized?

Are there special capabilities and features enabled by some observational approaches?

What services should be developed around these observational models?

Page 19

Similarities among Observational Data Models

OBOE | O&M

Entity | FeatureOfInterest

Characteristic | ObservedProperty

Measurement | OM_Observation

Protocol | OM_Process

Standard, Value, Precision | Result

Context | ObservationContext
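
The pairing of OBOE and O&M terms on this page could be captured as a simple crosswalk. The sketch below is an assumption-laden illustration: the dictionary `OBOE_TO_OM` and the function `translate` are hypothetical names, and grouping Standard, Value, and Precision under Result is one reading of the slide layout rather than a published mapping.

```python
# Hypothetical crosswalk from OBOE terms to O&M terms, read off the slide.
OBOE_TO_OM = {
    "Entity": "FeatureOfInterest",
    "Characteristic": "ObservedProperty",
    "Measurement": "OM_Observation",
    "Protocol": "OM_Process",
    "Standard": "Result",      # assumption: folded into the O&M Result
    "Value": "Result",         # assumption: folded into the O&M Result
    "Precision": "Result",     # assumption: folded into the O&M Result
    "Context": "ObservationContext",
}


def translate(record: dict) -> dict:
    """Regroup an OBOE-style record under O&M-style keys."""
    out: dict = {}
    for key, val in record.items():
        om_key = OBOE_TO_OM[key]
        if om_key == "Result":
            # Several OBOE terms share one O&M term, so keep them nested.
            out.setdefault("Result", {})[key] = val
        else:
            out[om_key] = val
    return out


oboe_record = {
    "Entity": "Water",
    "Characteristic": "AmmoniumConcentration",
    "Value": 0.12,
    "Standard": "MilligramPerLiter",
}
print(translate(oboe_record))
```

A real harmonization would need to address differences in semantics between the two models, which a key-renaming table like this one does not.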

Page 20

SONet/Semtools Semantic Approach

• Data -> metadata -> annotations -> ontologies

• Annotations link EML metadata elements to concepts in ontology through Observation Ontology

• EML metadata describe data and its structures

Page 21

Linking data values to concepts through observations

• Link data (or metadata) through observational data model to terms from domain-specific ontologies

• Context can inter-relate values in a tuple

• Can provide clarification of semantics of data set as a whole, not just “independent” measurements
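
A minimal sketch of this kind of linking, assuming a two-column table and hand-written annotations. The column names, sample row, concept names, and the `annotate` function are all made up for illustration; real annotations would point at terms in domain ontologies and at EML attribute definitions.

```python
import csv
import io

# Hypothetical raw data: one row per water sample.
RAW = "site,nh4\nLogan River,0.12\n"

# Hypothetical annotations: each column is tied to an entity, a
# characteristic, and a standard; "context" names the column whose
# observation provides context for this one.
ANNOTATIONS = {
    "site": {"entity": "River", "characteristic": "Name", "standard": "Text"},
    "nh4": {
        "entity": "Water",
        "characteristic": "AmmoniumConcentration",
        "standard": "MilligramPerLiter",
        "context": "site",
    },
}


def annotate(raw_text: str) -> list:
    """Turn each table row into a dict of annotated observations."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        observations = {}
        for column, ann in ANNOTATIONS.items():
            observations[column] = {
                "entity": ann["entity"],
                "characteristic": ann["characteristic"],
                "standard": ann["standard"],
                "value": row[column],
                "context": ann.get("context"),
            }
        rows.append(observations)
    return rows


annotated = annotate(RAW)
print(annotated[0]["nh4"])
```

The `context` entry is what inter-relates the values in a tuple: the ammonium value is recorded as belonging to water observed within the river named in the same row.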

Page 22

Semantic annotation

Marburg 2011

Attribute mappings

Page 23

How to use observational data models…

Marburg 2011

Page 24

linking observational data models to data…

Marburg 2011

Page 25

Special Mojo of OWL Ontologies

• Class hierarchies

– Parent, sibling, child class relationships

• Object properties

– To relate instances between classes

– Reflexive, symmetric, transitive

– Specify domain and ranges

• Datatype properties

– To relate instances to values

• Cardinality

• Polyhierarchies

Page 26

Special Mojo of OWL Ontologies

• Reasoning offers axioms such as:

– Disjointness: e.g. can’t be both X and Y

• inSediment AND inWaterColumn

– Equivalence (classes) or Same_as (instances)

• Synonymy across namespaces

– Properties for mereology

• Composite of

• Contained in

• Connected to

– Reasoner can infer relationships, determine inconsistencies in assertions
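
Two of these ideas, inference over a transitive property and detection of disjointness violations, can be illustrated with a toy sketch. This is not how an OWL reasoner such as Pellet works internally; the instance names, the `contained_in` pairs, and both functions are assumptions chosen only to show the kind of conclusion a reasoner could draw.

```python
from itertools import product


def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are both present."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


# Hypothetical "contained in" assertions.
contained_in = {("Sample1", "PondA"), ("PondA", "WatershedX")}
inferred = transitive_closure(contained_in)

# Classes declared disjoint: no instance may belong to both.
DISJOINT = [("inSediment", "inWaterColumn")]


def inconsistencies(assertions):
    """assertions maps an instance name to the set of classes asserted for it."""
    problems = []
    for instance, classes in assertions.items():
        for x, y in DISJOINT:
            if x in classes and y in classes:
                problems.append((instance, x, y))
    return problems


print(("Sample1", "WatershedX") in inferred)
print(inconsistencies({"s1": {"inSediment", "inWaterColumn"}, "s2": {"inSediment"}}))
```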

Page 27

Special Mojo of Observations

• Enables faceted discovery along entity & characteristic hierarchies

• Economical use of concepts: don’t need to have a “red-colored eye”, or “red-colored wing”

– Instead re-use concept of “red” with variety of entities

• Express whether observations taken from the same instance or not (tuple explication)

– E.g., multiple chemical concentrations measured from single water sample

• Use of equivalence class (measurement types) to apply to realized measurements in data
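
Faceted discovery along separate entity and characteristic hierarchies might look like the following sketch. The hierarchies, dataset names, and the `search` function are hypothetical; the point is only that “Red” is defined once and combined with any entity, rather than minting a “red-colored eye” concept.

```python
# Hypothetical subclass links for entities and characteristics.
SUBCLASS_OF = {
    "Eye": "BodyPart",
    "Wing": "BodyPart",
    "BodyPart": "Entity",
    "Red": "Color",
    "Blue": "Color",
    "Color": "Characteristic",
}


def is_a(term, ancestor):
    """True if term equals ancestor or lies below it in the hierarchy."""
    while term is not None:
        if term == ancestor:
            return True
        term = SUBCLASS_OF.get(term)
    return False


# Hypothetical datasets, each annotated with (entity, characteristic) pairs.
DATASETS = {
    "ds1": [("Eye", "Red")],
    "ds2": [("Wing", "Red"), ("Wing", "Blue")],
    "ds3": [("Eye", "Blue")],
}


def search(entity="Entity", characteristic="Characteristic"):
    """Return datasets with a measurement matching both facets."""
    return sorted(
        name
        for name, pairs in DATASETS.items()
        if any(is_a(e, entity) and is_a(c, characteristic) for e, c in pairs)
    )


print(search(characteristic="Red"))
print(search(entity="Wing"))
print(search("BodyPart", "Color"))
```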

Page 28

Ontology Design Pattern

Page 29

Ontology Design Pattern

ThesauForm: LaPorte, Huguenot & Garnier

TraitNet: Bunker, Ahrestani, Naeem

Page 30

Acknowledgements

Mark Schildhauer*, Matthew B. Jones, Ben Leinfelder: NCEAS, Santa Barbara CA, USA

Luis Bermudez: Open Geospatial Consortium Inc., Wayland MA, USA

Shawn Bowers: Gonzaga University, Spokane WA, USA

Phillip C. Dibner: OGCii, Berkeley CA, USA

Corinna Gries: University of Wisconsin, Madison WI, USA

Deborah L. McGuinness: Rensselaer Polytechnic Institute, Troy NY, USA

Margaret O’Brien: UCSB, Santa Barbara CA, USA

Huiping Cao: New Mexico State University, Las Cruces NM, USA

Simon J.D. Cox: Earth Science & Resource Engrg, CSIRO, Bentley WA, AUS

Steve Kelling, Carl Lagoze: Cornell University, Ithaca NY, USA

Hilmar Lapp: NESCent, Durham NC, USA

Joshua Madin: Macquarie University, Sydney NSW, AUS

* presenter

SONet

This material is based upon work supported by the National Science Foundation under Grant Numbers 0743429, 0753144.

Page 31

FIN

“How many fingers, Winston?”

Orwell, 1984