Page 1: The emerging biodiversity data ecosystem

The emerging biodiversity data ecosystem

Cynthia Parr, Katja Schulz, Jennifer Hammock (Smithsonian Institution)

Nathan Wilson, Patrick Leary (Marine Biological Laboratory)

Richard Allen (Environmental Protection Agency)

Page 2: The emerging biodiversity data ecosystem

Today’s story

What is EOL

Core questions

Network analysis

Hotlist development

Page richness algorithm

Conclusion: improving the health and richness of our knowledge network advances understanding

Page 3: The emerging biodiversity data ecosystem

What is EOL

http://www.eol.org

• Global access to knowledge about life on earth
• All species
• Freely accessible & reusable: open access, open source
• Available from a single portal in a common format
• Quality
• Always growing

Page 4: The emerging biodiversity data ecosystem

EOL Topics

Associations Behaviour ConservationStatus Cyclicity Cytology DiagnosticDescription Diseases Dispersal Distribution Evolution GeneralDescription Genetics Growth Habitat Legislation LifeCycle LifeExpectancy LookAlikes Management Migration MolecularBiology Morphology Physiology PopulationBiology Procedures Reproduction RiskStatement Size Threats Trends TrophicStrategy Uses Description Conservation Key Biology Ecology Introduction Education Barcode CitizenScience EducationResources Genome NucleotideSequences FunctionalAdaptations FossilHistory SystematicsOrPhylogenetics Development IdentificationResources

Page 5: The emerging biodiversity data ecosystem

Content providers

Databases

Journals

LifeDesks

Public contributions

Curating

Commenting

Tagging

http://www.eol.org

EOL is a content curation community

Aggregation

Page 6: The emerging biodiversity data ecosystem

Core questions

Where is our knowledge about biodiversity?

Where are the gaps?

What are the most effective ways to fill gaps given our limited resources?

Page 7: The emerging biodiversity data ecosystem

Network analysis

EOL

GBIF

NCBI

with Anne Bowser, University of Maryland

EOL connects hubs

Page 8: The emerging biodiversity data ecosystem

The GBIF hub has subnetworks

Page 9: The emerging biodiversity data ecosystem

Key individuals seek out hubs

TOLWeb

Page 10: The emerging biodiversity data ecosystem

Implications and next steps

Need more data

Identify isolated projects & mechanisms for connecting them to the network

Improve resilience & redundancy

Distribute annotation & quality control

Model data flow quantity and impact
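
One way to approach "identify isolated projects" is a reachability check over the project network. The sketch below is illustrative only: the function name `unreachable_from_hubs`, the placeholder project names, and the links between them are all hypothetical and are not taken from the analysis presented here.

```python
from collections import defaultdict, deque


def unreachable_from_hubs(projects, links, hubs):
    """Return the projects that cannot be reached from any hub.

    projects: iterable of project names
    links:    iterable of (a, b) pairs, treated as undirected connections
    hubs:     iterable of project names regarded as hubs
    """
    # Build an undirected adjacency map from the link list.
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Breadth-first search starting from every hub at once.
    seen = set(hubs)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    return set(projects) - seen


# Hypothetical toy network: placeholder names, made-up links.
toy_projects = ["hub_1", "hub_2", "project_a", "project_b", "project_c"]
toy_links = [("hub_1", "hub_2"), ("hub_1", "project_a"), ("project_a", "project_b")]
isolated = unreachable_from_hubs(toy_projects, toy_links, hubs=["hub_1", "hub_2"])
print(sorted(isolated))
```

Projects returned by such a check would be candidates for new connections to the network.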

Page 11: The emerging biodiversity data ecosystem

Viewer of Life on EOL – Kris Urie

Page 12: The emerging biodiversity data ecosystem

Low % of descendants with text in Arthropods

Page 13: The emerging biodiversity data ecosystem

Within arthropods, coverage varies ... perhaps as expected

http://synthesis.eol.org/media/treemap/

Page 14: The emerging biodiversity data ecosystem

Developing the EOL hot list

Consultation with taxonomic experts

Development of criteria

Assembly of critical lists

Establishing targets for rich taxon pages, lesser known pages

Page 15: The emerging biodiversity data ecosystem

EOL’s hot lists

Hot List

70,000 taxa

Conservation concern

Invasives

Model organisms

Ecologically important

Pests

Charismatics

Data availability

Red Hot List

2,800 taxa

Most searched

Top 100 invasives

Crops (food)

Zoos & aquaria

High traffic

Higher taxa

Page 16: The emerging biodiversity data ecosystem

Taxon page richness algorithm

a (Breadth) + b (Depth) + c (Diversity)

Breadth: Images, topics of text objects, references, maps, videos, sounds, conservation status

Depth: # words per text object, # words total

Diversity: Sources (partners)

Weights: a = 60%, b = 30%, c = 10%

Score range 0 – 1, threshold 0.4
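
A minimal sketch of the weighted sum described on this slide. It assumes that each component (breadth, depth, diversity) has already been normalised to the range 0 to 1, that the 60/30/10 weights apply to breadth, depth and diversity in that order, and that a page counts as rich when its score is at or above the threshold. How EOL normalises each component is not stated here, so the names `richness` and `is_rich` are hypothetical.

```python
# Weights as listed on the slide (assumed order: breadth, depth, diversity).
WEIGHTS = {"breadth": 0.6, "depth": 0.3, "diversity": 0.1}

# Threshold from the slide; treating it as inclusive is an assumption.
RICH_THRESHOLD = 0.4


def richness(breadth: float, depth: float, diversity: float) -> float:
    """Weighted sum of three component scores, each assumed to lie in [0, 1]."""
    components = {"breadth": breadth, "depth": depth, "diversity": diversity}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


def is_rich(score: float) -> bool:
    """True when the score reaches the richness threshold."""
    return score >= RICH_THRESHOLD


# Illustrative component values only, not real EOL page data.
example_score = richness(breadth=0.5, depth=0.2, diversity=0.0)
print(round(example_score, 2), is_rich(example_score))
```

Because breadth carries most of the weight in this sketch, a page with many content types but short text can still score higher than a page with one long text object.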

Page 17: The emerging biodiversity data ecosystem

Summary of EOL page richness

Overall

640,000 have content

2 % are rich

25 % have only links to literature

Hot List

28 % of 75K are rich

Average richness = 0.30

Red Hot List

56 % of 3K are rich

Average richness = 0.43

Page 18: The emerging biodiversity data ecosystem

Strategies for improving richness

Crowd-sourcing

Collections

Communities

Mobile apps

Leveraging

Enabling platforms

Enabling journals

Data mining BHL etc.

Version 2 coming in Fall 2011!

Page 19: The emerging biodiversity data ecosystem

The page richness index

Helps fill gaps with existing knowledge

Helps prioritize funding and training so that it has maximum impact on closing true gaps

Will be available via API

Computing and storing richness index on EOL is a step towards storing and serving computable data

Page 20: The emerging biodiversity data ecosystem

Summarize data within a partner, then across partners.

For example: compute an average value for one taxon (x specimens), compare to range of values across all taxa (621,393 samples)
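
The "average for one taxon, compared to the range across all taxa" idea can be sketched as follows. The record layout (taxon name, measured value), the function names, and the toy numbers are all hypothetical; they are not drawn from the OBIS data mentioned on this slide.

```python
from statistics import mean


def taxon_mean(records, taxon):
    """Average value over all records for one taxon."""
    values = [value for name, value in records if name == taxon]
    if not values:
        raise ValueError(f"no records for taxon {taxon!r}")
    return mean(values)


def value_range(records):
    """Minimum and maximum value across all records, regardless of taxon."""
    values = [value for _, value in records]
    return min(values), max(values)


def relative_position(value, low, high):
    """Where a value sits within [low, high], as a fraction from 0 to 1."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)


# Made-up records: (taxon, measurement). Placeholder names and values only.
toy_records = [
    ("taxon_a", 40.0),
    ("taxon_a", 60.0),
    ("taxon_b", 10.0),
    ("taxon_c", 110.0),
]

avg = taxon_mean(toy_records, "taxon_a")
low, high = value_range(toy_records)
print(avg, (low, high), relative_position(avg, low, high))
```

Running the same summary per partner first, and then over the pooled records, would correspond to the "within a partner, then across partners" order.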

Dynamic data summaries = new knowledge

Jen Hammock (EOL), Edward van den Berge (OBIS)

Atlantic Cod (Gadus morhua)

Page 21: The emerging biodiversity data ecosystem

Conclusions

There is a lot of data out there in a lot of knowledge bases

Understanding how it is connected can help us improve the ecosystem

• Quality control

• Resilience

• Richness assessment

Large-scale data summaries can foster gap-filling and standing, dynamic knowledge analyses

Page 22: The emerging biodiversity data ecosystem

Thank you

http://www.eol.org

160+ content partners

2000 Flickr contributors

1000s Wikipedia contributors

43,000 EOL members

Funding: John D. and Catherine T. MacArthur Foundation, Alfred P. Sloan Foundation, Cornerstone Institutions, Private Donors

Leadership: Erick Mata, Bob Corrigan, Mark Westneat, Marie Studer, Tom Garnett, Jim Edwards, David Patterson

Developers: Peter Mangiafico, Jeremy Rice, Dimitri Mozzherin, David Shorthouse, Lisa Whalley and others

Biologists: Tanya Dewey, Audrey Aronowsky, Leo Shapiro

See Demo and Version 2 sneak peek in Software Bazaar