
TDWG 2007 Bratislava SPM from an SDD perspective: Generality and extensibility Gregor Hagedorn Federal Biological Research Center, Berlin, Germany.

Dec 18, 2015

Page 1

TDWG 2007 Bratislava

SPM from an SDD perspective: Generality and extensibility

Gregor Hagedorn
Federal Biological Research Center, Berlin, Germany

Page 2

SDD Purpose was

(From SDD Charter:)

Develop standard computer-based mechanisms for expressing and transferring descriptive information about biological organisms or taxa (as well as similar entities such as diseases), including terminologies, ontologies, descriptions, identification tools and associated resources.

Page 3

SPM vs. SDD

SPM: SpeciesProfileModel
SDD: CodedDescription|NaturalLanguageDescription

SPM: aboutTaxon: The taxon this information is about.
SDD: Scope/TaxonName

SPM: associatedTaxon: Another taxon associated with this taxon and this piece of information e.g. a parasite or prey
SDD: Scope/TaxonName

SPM: context: A string representation of when this information is valid.
SDD: (Categorical|Quantitative|Text)/Notes

SPM: contextOccurrence: An indication of when this information is valid according to a geospatial data.
SDD: Scope/GeographicArea

SPM: contextValue: An indication of when this information is valid according to a controlled vocabulary.
SDD: (Categorical|Quantitative|Text)/Modifier

SPM: hasContent: A information about a taxon in the form of a string. Should be interpreted in combination with the type of the InfoItem
SDD: (Categorical|Quantitative|Text)/Content

SPM: hasValue: A information about a taxon in the form of a controlled vocabulary term.
SDD: (Categorical|Quantitative|Text)/State
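
The correspondence on this slide can be read as a simple lookup table. The following is a minimal sketch in Python; the names SPM_TO_SDD, sdd_equivalent and is_scope are invented for illustration and belong to neither standard.

```python
# Lookup of the SPM-to-SDD correspondences listed on this slide.
# SPM_TO_SDD, sdd_equivalent and is_scope are hypothetical names for this sketch.
SPM_TO_SDD = {
    "aboutTaxon": "Scope/TaxonName",
    "associatedTaxon": "Scope/TaxonName",
    "context": "(Categorical|Quantitative|Text)/Notes",
    "contextOccurrence": "Scope/GeographicArea",
    "contextValue": "(Categorical|Quantitative|Text)/Modifier",
    "hasContent": "(Categorical|Quantitative|Text)/Content",
    "hasValue": "(Categorical|Quantitative|Text)/State",
}


def sdd_equivalent(spm_property: str) -> str:
    """Return the SDD element path listed for an SPM property name."""
    return SPM_TO_SDD[spm_property]


def is_scope(spm_property: str) -> bool:
    """True if the SPM property corresponds to an SDD Scope element."""
    return sdd_equivalent(spm_property).startswith("Scope/")
```

Grouping the properties this way suggests that three of the seven SPM properties correspond to SDD scopes, while the other four correspond to parts of a single categorical, quantitative or text statement.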

Page 4

Richness & Atomization

[Diagram: SPM and SDD structures side by side]

SPM: InfoItem; aboutTaxon; …

SDD: Character Data; Representation; SummaryData; RevisionData; SampleData; …

Representation: Labels, Definitions, MediaObjects (multilingual)

Scopes: Taxa, Specimens, Observ., Publications, Parts, Stage, Sex, etc.

Page 5

Naming differences

Perhaps consider whether SPM: “context” is a good paradigm:

A measurement can be made in the context of a study, and perhaps in the context of a season

But is “geographical location”, “frequently”, “sex”, “above 1000 m” a context?

SDD distinguishes between Scope of a description = criteria by which data have been aggregated (taxon, specimen, geolocation, season, publication source, etc.) and Modifiers that modify/qualify a statement
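
The distinction between Scope and Modifier can be sketched as a small data structure. This is an illustration only: the class and field names are invented, the example values are placeholders, and sorting “sex” and “geolocation” under scope and “frequently” under modifiers is one reading of the slide, not the SDD schema itself.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One descriptive statement with the modifiers that qualify it."""
    character: str
    state: str
    modifiers: list[str] = field(default_factory=list)


@dataclass
class Description:
    """A description whose scope records how the data were aggregated."""
    scope: dict[str, str]
    statements: list[Statement] = field(default_factory=list)


# Placeholder data: the scope says what the description is about,
# the modifier qualifies a single statement inside it.
desc = Description(
    scope={"taxon": "Genus spec1", "geolocation": "ITA", "sex": "female"},
    statements=[Statement("leaf colour", "green", modifiers=["frequently"])],
)
```

In this sketch, changing the scope changes which data were aggregated into the description, whereas changing a modifier only changes how one statement is to be read.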

Page 6

Naming differences

“Value” for categorical measurements is OK in principle, but may affect extensibility to quantitative data.

Publication references would be needed for source of information being aggregated, or citations therein

Page 7

Occurrence / Distribution

Occurrence is used in contextOccurrence, with Distribution as the content term around it.

Is the order of information reversed?

spm:contextValue = “An indication of when this information is valid according to a controlled vocabulary.”

<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ITA"/>
  </spmi:Distribution>
</spm:hasInformation>

→ perhaps:

<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;GeographicRegion#ITA"/>
    <spm:contextValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
  </spmi:Distribution>
</spm:hasInformation>

Perhaps use a special type here?

Page 8

Cardinality?

<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#DoubtfullyNative"/>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ITA"/>
  </spmi:Distribution>
</spm:hasInformation>

<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#DoubtfullyNative"/>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ITA"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ALB"/>
  </spmi:Distribution>
</spm:hasInformation>
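
The cardinality question can be made concrete with a short Python sketch. The slide only raises the question and does not say which reading SPM intends. The sketch assumes that repeated hasValue and contextValue properties on one item carry no pairing information, and the per-region data in the second half is invented purely for contrast.

```python
from itertools import product

# Flat reading of the second fragment: two status values and two regions
# attached to one Distribution item, with nothing linking them pairwise.
statuses = ["DoubtfullyNative", "Extinct"]
regions = ["ITA", "ALB"]

# One possible interpretation: every status holds in every region.
all_pairs = list(product(statuses, regions))

# Alternative (an assumption of this sketch, not an SPM rule): one item per
# region, each listing only the statuses meant for it. The data are invented.
per_region = {
    "ITA": ["DoubtfullyNative", "Extinct"],
    "ALB": ["Extinct"],
}
explicit_pairs = [
    (status, region)
    for region, region_statuses in per_region.items()
    for status in region_statuses
]
```

Under the flat reading a consumer cannot tell whether “DoubtfullyNative” is meant to apply to ALB; with one item per region that pairing is either stated or absent.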

Page 9

SPM Concepts

[Diagram of concepts: Biology, Cytology, Physiology, Ecology, MolecularBiology, Evolution, Conservation, Distribution, Use, Description, Size]

Page 10

Overlap!

[Diagram of the SPM concepts: Biology, Description, Size, Cytology, Physiology, MolecularBiology, Ecology, Evolution, Conservation, Distribution, Use. Ecology, Distribution, Evolution and Biology appear a second time in the figure.]

Page 11

Conclusive?

[Diagram of the SPM concepts: Biology, Description, Size, Cytology, Physiology, MolecularBiology, Ecology, Evolution, Conservation, Distribution, Use. Ecology, Distribution, Evolution and Biology appear a second time in the figure. Added on this slide: Anatomy, Biochemistry, Morphology, Secondary metabolites.]

Page 12

[Diagram of the SPM concepts: Biology, Description, Size, Cytology, Physiology, MolecularBiology, Ecology, Evolution, Conservation, Distribution, Use. Ecology, Distribution, Evolution and Biology appear a second time in the figure. Also shown: Anatomy, Biochemistry, Morphology, Secondary metabolites. Added on this slide: LifeExpectancy, LookAlikes, DiagnosticDescription, LifeCycle, PopulationBiology, Behavior, Associations, Weight. Legend: “SPM Version 2007-08-15” and “Earlier terms used in SPM example files”.]

Page 13

Number of characters

[Figure labels: Size, LifeExpectancy]

LIAS has 987 “characters”, incl. ca. 30 “pseudo-characters”

GrassBase has 1090 characters

SDD decided to separate character standardization from structural standardization

SDD is waiting for an exchange of existing definitions, and for patterns to arise from it, rather than settling definitions at a round table

Page 14

SPM content vocabulary

A concise “major concept headings” vocabulary like SPM is certainly desirable

But definitions are needed!

Human-readable definitions should be developed

OWL/RDF currently provides a single piece of semantic information: Size is subclass of Description

Provision of general abstract data structures (content, value, contextXXX) should perhaps be separated from definition of biological concepts

Page 15

Ontologies 1 (Descriptive Terms)

[Diagram of descriptive terms: Leaf, Green leaf, Petal, Cladode (= stem looking like leaf), Leaflike structure, Stem, Flower]

Coded Summary Descriptions
Taxon 1: Green leaf: Length 7 cm
Taxon 2: Green leaf: Length 5 cm
Taxon 3: Cladode: Length 8 cm
Taxon 4: Cladode: Length 2 cm

Identification: Which species have leaf-like structures on the stem between 7 and 10 cm long?
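
The identification question on this slide can be sketched in Python as a query over a term hierarchy. The taxon data are taken from the slide, but the links between terms are an assumption, because the diagram's arrows are not recoverable from the transcript. Terms not needed for the query are omitted, and all names in the code are illustrative.

```python
# Assumed "is a kind of" links between descriptive terms. The slide's diagram
# is not recoverable from the transcript, so this hierarchy is a guess.
PARENTS = {
    "Green leaf": {"Leaf", "Leaflike structure"},
    "Cladode": {"Stem", "Leaflike structure"},
}


def is_a(term: str, ancestor: str) -> bool:
    """True if term equals ancestor or reaches it by following PARENTS links."""
    if term == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in PARENTS.get(term, ()))


# Coded summary descriptions from the slide: (taxon, structure, length in cm).
DESCRIPTIONS = [
    ("Taxon 1", "Green leaf", 7),
    ("Taxon 2", "Green leaf", 5),
    ("Taxon 3", "Cladode", 8),
    ("Taxon 4", "Cladode", 2),
]


def identify(structure: str, min_cm: float, max_cm: float) -> list[str]:
    """Taxa with a structure of the given kind whose length lies in the range."""
    return [
        taxon
        for taxon, term, length in DESCRIPTIONS
        if is_a(term, structure) and min_cm <= length <= max_cm
    ]
```

Under these assumed links, identify("Leaflike structure", 7, 10) selects Taxon 1 and Taxon 3, whereas a query for "Leaf" alone would miss the cladode of Taxon 3. The query therefore depends on the term ontology, not only on the coded data.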

Page 16

Ontologies 2 (Taxonomic Classes)

Taxon concepts are a natural ontology with multiple inheritance from within taxon concept classes and Rank classes.

[Diagram: ThisFamily contains two genera, each labelled “Genus”, each with the species “Genus spec1” and “Genus spec2”. Taxonomic Rank: Family, Genus, Species]

Identification: Which family has species with leaf-like structures on the stem between 7 and 10 cm long?
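
The family-level question can be sketched in the same style. Only “ThisFamily” and the rank names come from the slide. The genus and species names, the lengths, and all function names are placeholders invented for this sketch.

```python
from typing import Optional

# Hypothetical classification: each taxon points to its parent taxon.
PARENT_TAXON = {
    "GenusA spec1": "GenusA",
    "GenusA spec2": "GenusA",
    "GenusB spec1": "GenusB",
    "GenusB spec2": "GenusB",
    "GenusA": "ThisFamily",
    "GenusB": "ThisFamily",
}

# Second classification of the same taxa, by rank.
RANK = {"ThisFamily": "Family", "GenusA": "Genus", "GenusB": "Genus"}

# Invented lengths (cm) of a leaf-like structure for each species.
LENGTH_CM = {
    "GenusA spec1": 7,
    "GenusA spec2": 5,
    "GenusB spec1": 8,
    "GenusB spec2": 2,
}


def ancestor_at_rank(taxon: str, rank: str) -> Optional[str]:
    """Walk up the classification until a taxon of the requested rank is found."""
    current = PARENT_TAXON.get(taxon)
    while current is not None:
        if RANK.get(current) == rank:
            return current
        current = PARENT_TAXON.get(current)
    return None


def taxa_at_rank_with_length(rank: str, min_cm: float, max_cm: float) -> set[str]:
    """Higher taxa of the given rank that contain a species within the range."""
    hits = set()
    for species, length in LENGTH_CM.items():
        if min_cm <= length <= max_cm:
            ancestor = ancestor_at_rank(species, rank)
            if ancestor is not None:
                hits.add(ancestor)
    return hits
```

The two dictionaries stand in for the two kinds of class membership named on the slide: PARENT_TAXON places a taxon within the taxon concept hierarchy, and RANK places it within a rank class. Answering “which family” requires both.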

Page 17

Breakdown of communication?

SDD was designed for the purpose SPM has been developed for

SDD and SPM are strongly analogous

SDD has invested much time in trying to find an application profile supporting rich editing applications in a way consistent with simple identification keys and taxon-page creation software.

SDD structures and terms have not been evaluated for SPM

Page 18

Interest Group: “Structured Descriptive Data”

→ Interest Group: “Biological Descriptions”

→ TG SDD-Schema

→ TG SPM

→ TG SDD/RDF???

Page 19

Conveners?

Page 20

Thank you:

For volunteering your personal time in discussions, implementation and testing!

Projects and companies for testing and implementing!

GBIF, TDWG, and BMBF for traveling and workshop support!

TDWG-IP for financing an SDD primer!
