An update on taxonomic concept reasoning
Please @taxonbytes
Nico Franz 1 & Bertram Ludäscher 2
1 School of Life Sciences, Arizona State University
2 iSchool, University of Illinois at Urbana-Champaign
TDWG 2016 – Biodiversity Information Standards
December 06, 2016 – Instituto Tecnológico de Costa Rica (#TDWG16)
@ http://www.slideshare.net/taxonbytes/franz-ludaescher-tdwg-2016-an-update-on-taxonomic-concept-reasoning
The pluralistic domain of human taxonomy making
• Taxonomies are endorsed by us (humans); more or less democratically.
• They consist of sets of labels, data, and theories about the natural world.
• Over time, these theories change – converge or conflict (often in parallel).
"100 years of primate taxonomies"
Source: Rylands & Mittermeier. 2014. Primate taxonomy: species and conservation. doi:10.1002/evan.21387
A model to separate the human-made versus natural domains
• While human taxonomy making unfolds (e.g. 1758 onwards), natural taxa – which 'took' millions of years to realize – tend to not change much.
• At any time, our labels and theories (concepts) aim to stand for taxa; yet the alignment may be approximate.
Figure labels: Domain of human taxonomy making ("mimic"); Natural domain ("model"); Reliable?
Concepts: tracking progress and conflict in the human domain
• Taxonomic names and nomenclatural relationships are only so-so in terms of tracking congruent and incongruent taxonomic perspectives.
Remsen: Using names, we're lucky when revisions are infrequent
"In biology, there are many taxa that are so under-studied that they are only known from their original description and none or very few subsequent references […]. The name alone, so long as it is a unique name, is sufficient to locate all related material."
– David Remsen 2016: 213
Source: Remsen. 2016. The use and limits of scientific names in biological informatics. doi:10.3897/zookeys.550.9546
• Logic-based multi-taxonomic alignments require better contextualization of labels and relationships, and better specification of "taxonomic sameness".
Figure labels: 1912 vs. 1967; Logically reconcilable?; Δ = ?
Still bigger (re: Synthesis):
Why taxonomic concept reasoning?
Why promote taxonomic pluralism? *
• Our work extends and complements prior TDWG efforts related to the Taxonomic Concept Transfer Schema (https://github.com/tdwg/tcs).
• This work is necessary because using only Darwin Core tends to suppress taxonomic pluralism:
  • DwC syntax is too under-powered for tracking multi-taxonomy alignments.
  • DwC semantics ("Taxon") are too ambiguous to enforce a consistent recognition of the two domains (human taxonomy making vs. natural world).
• Technical and political means of suppressing taxonomic pluralism "by design" have implications for data quality and trust in data aggregation.
• "Synthesis" does not necessarily require taxonomic monism ("backbone"). Logic-reconciled pluralism can provide a trust-generating path for systematists' contributions towards large-scale taxonomic data integration.
* See also Franz & Sterner @ TDWG16, Friday, 11:30 am (#1134) in Session "Data Gaps, Trust, Knowledge Acquisition"
Use cases – primate classifications & avian phylogenies
1. Primate classifications sec. MSW2 (1993) versus MSW3 (2005)
a. Microcebus + Mirza sec. MSW3 (2005) with coverage constraint
b. Quantifying name (identifier) reliability
c. Reasoning achieves scalability (matrix)
2. Avian phylogenies sec. Prum et al. (2015) versus Jarvis et al. (2014)
a. Psittaciformes with & without coverage
b. Alignment of the "Neoavian explosion"
Use case 1:
Two primate classifications –
MSW2 (1993) versus MSW3 (2005)
Use case 1.a. Aligning Microcebus + Mirza sec. MSW3 (2005)
• Input visualization: MSW3 (2005) versus MSW2 (1993)
"Taxonomic concept labels" identify input concept regions
RCC–5 articulations provided for each species-level concept
Source: Franz et al. 2016. Two influential primate classifications logically aligned. doi:10.1093/sysbio/syw023
• Alignment visualization: "grey means taxonomically congruent"
One name & congruent region
Many names & congruent region
One name & non-congruent regions
Many names & non-congruent regions
New names & exclusive regions
• Application of coverage constraint: parent-to-parent articulations (><) are fully defined by alignment signal propagated from their respective children. Sensible when complete sampling of children is intended.
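The five RCC–5 articulations and the coverage constraint described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each concept's region can be written as a finite set of members, and that under coverage a parent's region is exactly the union of its children's regions. The concept labels and members are hypothetical and are not taken from the MSW2/MSW3 alignment; the actual reasoning toolkit works from logic constraints, not from explicit sets.

```python
def rcc5(a: frozenset, b: frozenset) -> str:
    """Return the RCC-5 relation between two non-empty regions."""
    if a == b:
        return "=="   # congruent
    if a < b:
        return "<"    # a is properly included in b
    if a > b:
        return ">"    # a properly includes b
    if a & b:
        return "><"   # overlapping
    return "|"        # exclusive (disjoint)


def parent_region(children: dict) -> frozenset:
    """Coverage constraint (assumed here): the parent is the union of its children."""
    return frozenset().union(*children.values())


# Hypothetical toy taxonomies T1 and T2; m1..m4 stand in for members of the regions.
t1_children = {
    "A sec. T1": frozenset({"m1", "m2"}),
    "B sec. T1": frozenset({"m3"}),
}
t2_children = {
    "A sec. T2": frozenset({"m1"}),
    "C sec. T2": frozenset({"m2", "m3", "m4"}),
}

p1 = parent_region(t1_children)
p2 = parent_region(t2_children)

# Same name "A", but the two regions are not congruent.
print(rcc5(t1_children["A sec. T1"], t2_children["A sec. T2"]))
# Parent-to-parent articulation, derived only from the children.
print(rcc5(p1, p2))
```

In this toy case the parent-level relation follows entirely from the child-level regions, which is the sense in which coverage lets alignment signal propagate upwards.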
Use case 1.b.: Quantifying name (identifier) reliability
• Alignment visualization: RCC–5 as an identifier assessment tool [good / not]
One name & congruent region
Many names & congruent region
One name & non-congruent regions
Many names & non-congruent regions
New names & exclusive regions
• Query services rendered: (1) MSW3 destabilizes MSW2; (2) non-congruence is not only caused by differential low-level sampling; (3) alignment constitutes a taxonomic meaning integration map to navigate across MSW3 & MSW2.
1 in 3 names is unreliable across MSW2/MSW3 classifications
Source: Franz et al. 2016. Two influential primate classifications logically aligned. doi:10.1093/sysbio/syw023
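One way to read "RCC–5 as an identifier assessment tool" is to bucket each articulation by whether the two names match and whether the two regions are congruent. The sketch below does this for a handful of hypothetical articulations (placeholder names, not MSW2/MSW3 data). The reliability measure is an assumption made for illustration: a shared name counts as unreliable when its two concepts are not congruent. The criteria used in the cited paper may differ, and the "new names & exclusive regions" category is not modelled here.

```python
from collections import Counter

# Hypothetical articulations: (name in T1, RCC-5 relation, name in T2).
articulations = [
    ("Aus bus", "==", "Aus bus"),   # one name, congruent region
    ("Aus cus", "==", "Xus cus"),   # many names, congruent region
    ("Aus dus", ">", "Aus dus"),    # one name, non-congruent regions
    ("Aus eus", "><", "Xus fus"),   # many names, non-congruent regions
]


def category(name1: str, relation: str, name2: str) -> str:
    """Bucket an articulation by name identity and region congruence."""
    names = "one name" if name1 == name2 else "many names"
    regions = "congruent" if relation == "==" else "non-congruent"
    return f"{names} & {regions}"


counts = Counter(category(*a) for a in articulations)

# Assumed measure: share of names used in both taxonomies whose regions differ.
shared = [a for a in articulations if a[0] == a[2]]
unreliable = [a for a in shared if a[1] != "=="]
unreliable_fraction = len(unreliable) / len(shared)

print(counts)
print(unreliable_fraction)
```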
Use case 1.c.: Reasoning achieves scalability (MIR matrix)
Source: Dang et al. 2015. ProvenanceMatrix: a visualization tool for multi-taxonomy alignments. CEUR Workshop Proceedings 1456: 13–24. http://ceur-ws.org/Vol-1456/paper2.pdf
Use case 2.a.: Psittaciformes with & without coverage constraint
Use case 2.b.: Alignment of the "Neoavian explosion"
• Aves sec. 2015/2014, down to ordinal level – with coverage locally relaxed
Non-congruence within 2015.Paleognathae
Non-congruence within 2014.Pelecanimorphae
Non-congruence within 2015/2014.Neoaves (see next slide)
Use case 2.b.: Precise semiotics for the "avian explosion"
• Neoaves sec. 2015/2014, and 3–4 less inclusive levels
26 overlapping articulations in the sub-Neoavian alignment region cannot be assigned to differential sampling; they indicate 'genuine' phylogenetic conflict.
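The distinction between overlap caused by differential sampling and 'genuine' conflict can be sketched as follows. This is an illustrative assumption, not the authors' procedure: an overlap between two clade concepts is treated as genuine if it persists once both clades are restricted to the terminals sampled in both phylogenies. All terminal names (t1..t6) and the helper functions are hypothetical.

```python
def overlaps(a: set, b: set) -> bool:
    """True for RCC-5 overlap (><): shared members plus unique members on both sides."""
    return bool(a & b) and bool(a - b) and bool(b - a)


def genuine_conflict(clade1: set, clade2: set, sampled1: set, sampled2: set) -> bool:
    """Assumed test: does the overlap persist among jointly sampled terminals?"""
    shared = sampled1 & sampled2
    return overlaps(clade1 & shared, clade2 & shared)


# Hypothetical sampling: t5 occurs only in phylogeny A, t6 only in phylogeny B.
sampled_a = {"t1", "t2", "t3", "t4", "t5"}
sampled_b = {"t1", "t2", "t3", "t4", "t6"}

# These clades overlap only because of t5 versus t6 (differential sampling).
print(genuine_conflict({"t1", "t2", "t5"}, {"t1", "t2", "t6"}, sampled_a, sampled_b))

# These clades still overlap among the shared terminals.
print(genuine_conflict({"t1", "t2"}, {"t2", "t3"}, sampled_a, sampled_b))
```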
In conclusion:
Achievements, challenges, promise
Taxonomic concept reasoning – now & soon?
• Current reasoning toolkit can typically handle:
  • 2–6 input taxonomies at once,
  • at most ca. 3,200 input concepts.
• Wider adoption is increasingly a matter of making the case, generating will at various levels: publishing systematists, TDWG, aggregators, publishers, etc.
• Theory and reasoning performance are no longer the most pressing limitations.
• Two new applications in planning:
  • Integration of taxonomic concept syntax and semantics into Pensoft's "Open Biodiversity Knowledge Management System" (OBKMS).
  • Transition of a specimen-based Symbiota flora portal (SERNEC) to utilizing (only) taxonomic concepts and RCC–5 relationships.
Acknowledgements & links to products and references
• TDWG#16 organizers, especially Gail Kampmeier & William Ulate!
• Euler/X & ETC teams (extended): Shawn Bowers, Mingmin Chen, Hong Cui, Parisa Kianmajd, James Macklin, Timothy McPhillips, Robert Morris, Thomas Rodenhausen, and Shizhuo Yu.
• ProvenanceMatrix: Tuan Nhon Dang.
• NSF DEB–1155984, DBI–1342595 (PI Franz).
• NSF IIS–118088, DBI–1147273 (PI Ludäscher).
• Information @ http://taxonbytes.org/tag/concept-taxonomy/