Data models for Community information
Robert K. Peet, University of North Carolina; John Harris, Nat. Center for Ecol. Analysis & Synthesis; Michael D. Jennings, U.S. Geological Survey; Dennis Grossman, NatureServe; Marilyn D. Walker, USDA Forest Service

Dec 21, 2015


Data models for Community information

Robert K. Peet, University of North Carolina
John Harris, Nat. Center for Ecol. Analysis & Synthesis
Michael D. Jennings, U.S. Geological Survey
Dennis Grossman, NatureServe
Marilyn D. Walker, USDA Forest Service


New Directions for Community Ecology?

Massive datasets and databases are becoming available which will provide unprecedented access to:

• Spatially explicit environmental & spectral data.

• Species occurrences & co-occurrences.

• Species attributes.

• Species distributions.


EcoInformatics?

Massive co-occurrence data have the potential to create new disciplines and allow critical syntheses.

• Theoretical community ecology. Who occurs together, and where, and following what rules?

• Vegetation & species modeling. Where should we expect species & communities to occur after environmental changes?

• Remote sensing. What is really on the ground?

• Monitoring & restoration. What changes are really taking place in the communities?


How do we get there from here?

• Public data archives (deposit, withdraw, cite).

• Standard data structures.

• Standard exchange formats.

• Tools for semantic mediation.

• Standard protocols.


Biodiversity data structure

[Diagram: four database types (taxonomic databases, plot/inventory databases, specimen databases, community type databases) with the entities Observation/CollectionEvent, Object or specimen, BioTaxon, Locality, and SynTaxon.]


A co-occurrence archive?

There is currently no standard repository for community composition data.

A repository is needed for:

• Record storage and preservation

• Record access and identification

• Record documentation in literature/databases


VegBank

• The ESA Vegetation Panel is currently developing a public archive for vegetation plots known as VegBank (www.vegbank.org).

• VegBank is expected to function for vegetation plot data in a manner analogous to GenBank.

• Primary data will be deposited for reference, novel synthesis, and reanalysis.

• The database architecture is generalizable to most types of species co-occurrence data.


Core elements of VegBank

[Diagram: Project, Plot, Plot Observation, Taxon Observation, Taxon Interpretation, Plot Interpretation.]


ESA standards for plot data

• Four levels of standards:

• Pick lists (48 and counting)

• Conversion to common units

• Method protocols

• Concept-based interpretations

• “Painless” metadata


VegBank Interface Tools

• Desktop client for data preparation and local use.

• Flexible data import, including XML.

• Standard query, flexible query, SQL query.

• Flexible data export, including XML.

• Easy web access to the central archive.


The Taxonomic database challenge: Standardizing organisms and communities

The problem: Integration of data potentially representing different times, places, investigators and taxonomic standards.

The traditional solution: A standard list of organisms / communities.


Standard lists are available

Representative examples for higher plants include:

* North America / US

USDA Plants http://plants.usda.gov/

ITIS http://www.itis.usda.gov/

NatureServe http://www.natureserve.org

* World

IPNI International Plant Names Index http://www.ipni.org/

IOPI Global Plant Checklist http://www.bgbm.fu-berlin.de/IOPI/GPC/


Most standardized taxon lists fail to allow effective integration of datasets

The reasons include:

• The user cannot reconstruct the database as viewed at an arbitrary time in the past,

• Taxonomic concepts are not defined (just lists),

• Multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled.

This failure is the single largest impediment to large-scale synthesis in community ecology.


Three concepts of shagbark hickory

[Diagram: Carya ovata (Miller) K. Koch sec. Gleason 1952 is split, sec. Radford et al. 1968, into Carya carolinae-septentrionalis (Ashe) Engler & Graebner and Carya ovata (Miller) K. Koch.]

Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies.


[Diagram: Name and Reference each link to Assertion.]

An assertion represents a unique combination of a name and a reference

“Assertion” is equivalent to “Potential taxon” & “taxonomic concept”
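
The name/reference pairing can be sketched in Python as follows. This is only an illustration under stated assumptions, not the VegBank schema: the class and field names (`Name`, `Reference`, `Assertion`, `label`) are hypothetical, and the data come from the shagbark hickory example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    text: str  # scientific name string, e.g. "Carya ovata"


@dataclass(frozen=True)
class Reference:
    citation: str  # publication in which the name is applied


@dataclass(frozen=True)
class Assertion:
    """A unique (name, reference) pair; hypothetical stand-in for a taxon concept."""
    name: Name
    reference: Reference

    def label(self) -> str:
        # "sec." (secundum) reads as "in the sense of".
        return f"{self.name.text} sec. {self.reference.citation}"


carya_ovata = Name("Carya ovata")
gleason = Reference("Gleason 1952")
radford = Reference("Radford et al. 1968")

broad = Assertion(carya_ovata, gleason)   # the single, undivided shagbark
narrow = Assertion(carya_ovata, radford)  # the northern shagbark only

# Same name, different references: two distinct assertions.
assert broad.name == narrow.name
assert broad != narrow
print(broad.label())
print(narrow.label())
```

Because the dataclasses are frozen, assertions are hashable and compare by value, so the same name/reference pair always denotes the same assertion.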


Six shagbark hickory assertions

Possible taxonomic synonyms are listed together

Names:
Carya ovata
Carya carolinae-septentrionalis
Carya ovata v. ovata
Carya ovata v. australis

Assertions:
(One shagbark) C. ovata sec Gleason ’52; C. ovata sec FNA ’97
(Southern shagbark) C. carolinae-s. sec Radford ’68; C. ovata v. australis sec FNA ’97
(Northern shagbark) C. ovata sec Radford ’68; C. ovata (v. ovata) sec FNA ’97

References:
Gleason 1952, Britton & Brown
Radford et al. 1968, Flora Carolinas
Stone 1997, Flora North America


[Diagram: Name and Assertion each link to Usage.]

A usage represents a unique combination of an assertion and a name.

Usages can be used to track nomenclatural synonyms


[Diagram: Name, Usage, Assertion, Reference.]

A usage (name assignment) and an assertion (taxon concept) can be combined in a single model.


Names:
1. Carya ovata
2. C. carolinae
3. C. ovata var. ovata
4. C. ovata var. australis

Assertions:
A. ovata sec. Gleason
B. ovata sec. FNA
C. carolinae sec. Radford
D. ovata australis sec. FNA
E. ovata sec. Radford
F. ovata (ovata) sec. FNA

ITIS Usage (name-assertion: status):
1-F: OK
2-D: OK
3-F: Syn
4-D: Syn

ITIS likely views the linkage of the assertion “Carya ovata var. australis sec. FNA 1997” with the name “Carya ovata var. australis” as a nomenclatural synonym.
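
The usage table on this slide can be written out as a small lookup. The sketch below assumes the second item numbered 3 in the names list is name 4, since the usage rows refer to a name 4. The identifiers `NAMES`, `ASSERTIONS`, `USAGES` and `names_for` are hypothetical, and "OK" and "Syn" are copied from the slide without further interpretation.

```python
from typing import Dict, List

# Names (1-4) and assertions (A-F) as labeled on the slide.
NAMES = {
    1: "Carya ovata",
    2: "C. carolinae",
    3: "C. ovata var. ovata",
    4: "C. ovata var. australis",
}
ASSERTIONS = {
    "A": "ovata sec. Gleason",
    "B": "ovata sec. FNA",
    "C": "carolinae sec. Radford",
    "D": "ovata australis sec. FNA",
    "E": "ovata sec. Radford",
    "F": "ovata (ovata) sec. FNA",
}
# Usage rows: (name id, assertion id, status) as listed under "ITIS Usage".
USAGES = [(1, "F", "OK"), (2, "D", "OK"), (3, "F", "Syn"), (4, "D", "Syn")]


def names_for(assertion_id: str) -> Dict[str, List[str]]:
    """Group the names applied to one assertion by their usage status."""
    grouped: Dict[str, List[str]] = {}
    for name_id, a_id, status in USAGES:
        if a_id == assertion_id:
            grouped.setdefault(status, []).append(NAMES[name_id])
    return grouped


# Assertion D: one name with status "OK", one recorded as a synonym.
print(ASSERTIONS["D"], names_for("D"))
```

Keeping usages in their own table means a name can be attached to, or detached from, an assertion without touching the assertion itself.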


Party Perspective

The Party Perspective on an Assertion includes:

• Status – Standard, Nonstandard, Undetermined.

• Correlation with other assertions – Equal, Greater, Lesser, Overlap, Undetermined.

• Lineage – Predecessor and Successor assertions.

• Start & Stop dates.
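
A party perspective with start and stop dates is what makes a time-specific view possible. The sketch below is a minimal illustration, not the authors' implementation: the `Perspective` class, the `view_as_of` helper, the party name and every date are invented for the example, the stop date is assumed to be exclusive, and correlation and lineage are left out for brevity.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class Status(Enum):
    STANDARD = "Standard"
    NONSTANDARD = "Nonstandard"
    UNDETERMINED = "Undetermined"


@dataclass(frozen=True)
class Perspective:
    party: str
    assertion: str
    status: Status
    start: date
    stop: Optional[date] = None  # None means the perspective is still in force

    def active_on(self, day: date) -> bool:
        # Assumption: the interval is half-open, [start, stop).
        return self.start <= day and (self.stop is None or day < self.stop)


def view_as_of(records: List[Perspective], party: str, day: date) -> List[Perspective]:
    """Return the perspectives one party held on a given day."""
    return [p for p in records if p.party == party and p.active_on(day)]


# Hypothetical records: the party and all dates are made up for illustration.
records = [
    Perspective("Example Party", "Carya ovata sec. Gleason 1952",
                Status.STANDARD, date(1990, 1, 1), date(2000, 1, 1)),
    Perspective("Example Party", "Carya ovata sec. Gleason 1952",
                Status.NONSTANDARD, date(2000, 1, 1)),
    Perspective("Example Party", "Carya ovata sec. Radford et al. 1968",
                Status.STANDARD, date(2000, 1, 1)),
]

for p in view_as_of(records, "Example Party", date(1995, 6, 1)):
    print(p.assertion, p.status.value)
```

Old perspectives are closed with a stop date rather than overwritten, so the view for any past day can be reconstructed from the same records.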


(Inter)National Taxonomic Database?

An upgrade for ITIS & USDA PLANTS?

• Concept-based.

• Party-neutral.

• Perfectly archived.

• Synonymy and lineage tracking.

• Alternate name systems & hierarchies.


A few conclusions

1. EcoInformatics is developing as a large and important new subfield of community ecology.

2. Public archives are needed for co-occurrence data.

3. Standard data structures and exchange formats are needed.

4. Records of organisms should always contain a scientific name and a reference!

5. Design for future annotation of organism and community concepts.

6. Archival databases should provide time-specific views.


We are pleased to acknowledge the support and cooperation of:

Ecological Society of America

Gap Analysis Program

National Center for Ecological Analysis and Synthesis

National Biological Information Infrastructure

Federal Geographic Data Committee

National Science Foundation