TDWG 2007 Bratislava
SPM from an SDD perspective: Generality and extensibility
Gregor Hagedorn
Federal Biological Research Center, Berlin, Germany
Dec 18, 2015
SDD Purpose was
(From SDD Charter:)
Develop standard computer-based mechanisms for expressing and transferring descriptive information about biological organisms or taxa (as well as similar entities such as diseases), including terminologies, ontologies, descriptions, identification tools and associated resources.
SPM vs. SDD
SpeciesProfileModel property, with the corresponding SDD element (CodedDescription|NaturalLanguageDescription) on the line below:
aboutTaxon: The taxon this information is about.
    SDD: Scope/TaxonName
associatedTaxon: Another taxon associated with this taxon and this piece of information, e.g. a parasite or prey.
    SDD: Scope/TaxonName
context: A string representation of when this information is valid.
    SDD: (Categorical|Quantitative|Text)/Notes
contextOccurrence: An indication of when this information is valid according to geospatial data.
    SDD: Scope/GeographicArea
contextValue: An indication of when this information is valid according to a controlled vocabulary.
    SDD: (Categorical|Quantitative|Text)/Modifier
hasContent: Information about a taxon in the form of a string. Should be interpreted in combination with the type of the InfoItem.
    SDD: (Categorical|Quantitative|Text)/Content
hasValue: Information about a taxon in the form of a controlled vocabulary term.
    SDD: (Categorical|Quantitative|Text)/State
Richness & Atomization
[Diagram comparing SPM and SDD. SPM: InfoItem with aboutTaxon, … SDD: Character Data, SummaryData, SampleData, …, RevisionData; Scopes (Taxa, Specimens, Observ., Publications, Parts, Stage, Sex, etc.); Representation (Labels, Definitions, MediaObjects; multilingual).]
Naming differences
Perhaps consider whether SPM “context” is a good paradigm:
A measurement can be made in the context of a study, and perhaps in the context of a season.
But is “geographical location”, “frequently”, “sex”, “above 1000 m” a context?
SDD distinguishes between the Scope of a description (the criteria by which data have been aggregated: taxon, specimen, geolocation, season, publication source, etc.) and Modifiers that modify or qualify a statement.
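The Scope/Modifier distinction can be pictured with a small data structure. This is only a sketch, not SDD's actual schema: the class names `Description` and `Statement`, the character "flower colour", and all values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    """One descriptive statement; modifiers qualify this statement only."""
    character: str
    value: str
    modifiers: List[str] = field(default_factory=list)


@dataclass
class Description:
    """A description; the scope holds the criteria used to aggregate the data."""
    scope: Dict[str, str]
    statements: List[Statement] = field(default_factory=list)


# Hypothetical example: scope says what the data were aggregated over,
# the modifier says how often the stated value applies.
desc = Description(
    scope={"taxon": "Genus spec1", "geolocation": "ITA", "sex": "female"},
    statements=[Statement("flower colour", "white", modifiers=["frequently"])],
)

for s in desc.statements:
    print(desc.scope, s.character, s.value, s.modifiers)
```

In this reading, "geographical location" and "sex" sit in the scope, while "frequently" attaches to a single statement, so neither has to be squeezed into one generic "context" slot.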
Naming differences
“Value” for categorical measurements is OK in principle, but may affect extensibility to quantitative data.
Publication references would be needed for source of information being aggregated, or citations therein
Occurrence / Distribution
Occurrence is used in contextOccurrence, with Distribution as the content term around it.
Is the order of information reversed?
spm:contextValue = “An indication of when this information is valid according to a controlled vocabulary.”
<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ITA"/>
  </spmi:Distribution>
</spm:hasInformation>
→ perhaps:
<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;GeographicRegion#ITA"/>
    <spm:contextValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
  </spmi:Distribution>
</spm:hasInformation>
Perhaps use a special type here?
Cardinality?
<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#DoubtfullyNative"/>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ITA"/>
  </spmi:Distribution>
</spm:hasInformation>
<spm:hasInformation>
  <spmi:Distribution>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#DoubtfullyNative"/>
    <spm:hasValue rdf:resource="&tv;OccurrenceStatusTerm#Extinct"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ITA"/>
    <spm:contextValue rdf:resource="&tv;GeographicRegion#ALB"/>
  </spmi:Distribution>
</spm:hasInformation>
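Why cardinality matters can be made concrete with a short sketch. It takes the two hasValue terms and two contextValue terms from the second example and shows two possible readings; neither reading is claimed to be what SPM intends, and the variable names are made up for illustration.

```python
from itertools import product

# Terms taken from the second example above.
has_values = ["DoubtfullyNative", "Extinct"]
context_values = ["ITA", "ALB"]

# Reading 1: every status applies in every region (cross product).
cross = list(product(has_values, context_values))

# Reading 2: statuses and regions are paired by position.
# Note that an RDF graph is a set of triples without order, so this
# pairing could not be recovered from the RDF alone.
paired = list(zip(has_values, context_values))

print(len(cross), "statements under reading 1:", cross)
print(len(paired), "statements under reading 2:", paired)
```

The two readings give four and two statements respectively, which is the ambiguity the slide's question points at.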
SPM Concepts
[Diagram built up over four slides; term groups listed in order of appearance.]
Biology, Cytology, Physiology, Ecology, MolecularBiology, Evolution, Conservation, Distribution, Use, Description, Size
Overlap! Ecology, Distribution, Evolution, Biology
Conclusive? Anatomy, Biochemistry, Morphology, Secondary metabolites
LifeExpectancy, LookAlikes, DiagnosticDescription, LifeCycle, PopulationBiology, Behavior, Associations
Legend: SPM Version 2007-08-15; Earlier terms used in SPM example files
Weight
Number of characters
Size
LIAS has 987 “characters”, incl. ca. 30 “pseudo-characters”
GrassBase has 1090 characters
LifeExpectancy
SDD decided to separate character standardization from structural standardization.
Waiting for the exchange of existing definitions, and for patterns to arise from it, rather than for a round table.
SPM content vocabulary
A concise “major concept headings” vocabulary like SPM is certainly desirable.
But definitions are needed! Human-readable definitions should be developed.
OWL/RDF currently provides a single piece of semantic information: Size is a subclass of Description.
Provision of general abstract data structures (content, value, contextXXX) should perhaps be separated from the definition of biological concepts.
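A rough sketch of the point about definitions: if the only machine-readable fact is a subclass link, every term still lacks a human-readable definition. The `Concept` class and the `missing_definitions` helper are hypothetical; only the Size/Description relation comes from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Concept:
    name: str
    subclass_of: Optional[str] = None   # the only semantics available so far
    definition: Optional[str] = None    # human-readable text, None if unwritten


VOCABULARY: Dict[str, Concept] = {
    "Description": Concept("Description"),
    "Size": Concept("Size", subclass_of="Description"),
}


def missing_definitions(vocab: Dict[str, Concept]) -> List[str]:
    """Return the names of all concepts that have no definition yet."""
    return sorted(name for name, c in vocab.items() if c.definition is None)


print(missing_definitions(VOCABULARY))
```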
Ontologies 1 (Descriptive Terms)
[Diagram: ontology of descriptive terms with the nodes Leaf, Green leaf, Petal, Cladode (= stem looking like leaf), Leaflike structure, Stem, Flower.]
Coded Summary Descriptions
Taxon 1: Green leaf: Length 7 cm
Taxon 2: Green leaf: Length 5 cm
Taxon 3: Cladode: Length 8 cm
Taxon 4: Cladode: Length 2 cm
Identification: Which species have leaf-like structures on the stem between 7 and 10 cm long?
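The identification question can be answered mechanically once the descriptive terms are linked in an ontology. The sketch below uses the four coded descriptions from the slide; the is-a links in `PARENTS` are an assumption, since the diagram's exact structure is not recoverable here, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed is-a links (illustration only, not taken from the diagram).
PARENTS: Dict[str, List[str]] = {
    "Green leaf": ["Leaf"],
    "Leaf": ["Leaflike structure"],
    "Cladode": ["Leaflike structure", "Stem"],  # multiple inheritance
}

# Coded summary descriptions from the slide: (taxon, structure, length in cm).
DESCRIPTIONS: List[Tuple[str, str, float]] = [
    ("Taxon 1", "Green leaf", 7.0),
    ("Taxon 2", "Green leaf", 5.0),
    ("Taxon 3", "Cladode", 8.0),
    ("Taxon 4", "Cladode", 2.0),
]


def is_a(term: str, ancestor: str) -> bool:
    """True if term equals ancestor or inherits from it via PARENTS."""
    if term == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in PARENTS.get(term, []))


def taxa_with(structure: str, low: float, high: float) -> List[str]:
    """Taxa having the given kind of structure with low <= length <= high."""
    return [
        taxon
        for taxon, term, length in DESCRIPTIONS
        if is_a(term, structure) and low <= length <= high
    ]


print(taxa_with("Leaflike structure", 7, 10))  # ['Taxon 1', 'Taxon 3']
```

Without the ontology, a query for "leaf" would miss Taxon 3, whose structure is coded as a cladode.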
Ontologies 2 (Taxonomic Classes)
Taxon concepts are a natural ontology with multiple inheritance from within taxon concept classes and Rank classes.
Identification: Which family has species with leaf-like structures on the stem between 7 and 10 cm long?
[Diagram: ThisFamily containing two genera, each with Genus spec1 and Genus spec2; Taxonomic Rank classes: Family, Genus, Species.]
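The family-level question can be sketched by treating each taxon as belonging both to its parent taxon and to a rank class. Everything concrete here is hypothetical: the genus names "Genus A" and "Genus B", the assignment of lengths to species, and the `Taxon` class itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Taxon:
    name: str
    rank: str                         # rank class: "Family", "Genus", "Species"
    parent: Optional["Taxon"] = None  # enclosing taxon concept

    def ancestor_at(self, rank: str) -> Optional["Taxon"]:
        """Walk up the hierarchy to the taxon of the requested rank."""
        node = self
        while node is not None:
            if node.rank == rank:
                return node
            node = node.parent
        return None


family = Taxon("ThisFamily", "Family")
genus_a = Taxon("Genus A", "Genus", family)
genus_b = Taxon("Genus B", "Genus", family)
a1 = Taxon("Genus A spec1", "Species", genus_a)
a2 = Taxon("Genus A spec2", "Species", genus_a)
b1 = Taxon("Genus B spec1", "Species", genus_b)
b2 = Taxon("Genus B spec2", "Species", genus_b)

# Assumed lengths (cm) of leaf-like structures per species.
LENGTHS_CM: List[Tuple[Taxon, float]] = [(a1, 7.0), (a2, 5.0), (b1, 8.0), (b2, 2.0)]


def families_with_length(low: float, high: float) -> List[str]:
    """Families containing at least one species with low <= length <= high."""
    hits = set()
    for species, length in LENGTHS_CM:
        fam = species.ancestor_at("Family")
        if fam is not None and low <= length <= high:
            hits.add(fam.name)
    return sorted(hits)


print(families_with_length(7, 10))
```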
Breakdown of communication?
SDD was designed for the purpose SPM has been developed for.
SDD and SPM are strongly analogous.
SDD has invested much time in trying to find an application profile supporting rich editing applications in a way consistent with simple identification keys and taxon-page creation software.
SDD structures and terms have not been evaluated for SPM.
Interest Group: “Structured Descriptive Data”
→ Interest Group: “Biological Descriptions”
    → TG SDD-Schema
    → TG SPM
    → TG SDD/RDF???
Thank you:
For volunteering your personal time in discussions, implementation and testing!
Projects and companies for testing and implementing!
GBIF, TDWG, and BMBF for traveling and workshop support!
TDWG-IP for financing an SDD primer!