Integrating Bio-Data: Lee Belbin, Manager, Infrastructure Project, TDWG (Biodiversity Information Standards)
TDWG at the University of Tasmania

Apr 16, 2017

leebel
Transcript
Page 1: TDWG at the University of Tasmania

Integrating Bio-Data

Lee Belbin, Manager, Infrastructure Project

TDWG (Biodiversity Information Standards)

Page 2: TDWG at the University of Tasmania

Who has heard of GBIF?

Page 3: TDWG at the University of Tasmania

GBIF

The Global Biodiversity Information Facility

Page 4: TDWG at the University of Tasmania

GBIF

International organisation established to share bio-data

Page 5: TDWG at the University of Tasmania

GBIF

Supported by ~42 countries (including Australia) and ~35 international organisations

Page 6: TDWG at the University of Tasmania

GBIF

The Australian hub is through ABRS (ABIF)

Page 7: TDWG at the University of Tasmania
Page 8: TDWG at the University of Tasmania

Who has heard of TDWG?

Page 9: TDWG at the University of Tasmania

…that’s what I figured

Page 10: TDWG at the University of Tasmania

TDWG

Formerly: The Taxonomic Database Working Group

Page 11: TDWG at the University of Tasmania

…but more accurately referred to as

Biodiversity Information Standards

Page 12: TDWG at the University of Tasmania

Biodiversity Information Standards

International group responsible for standards and protocols for sharing bio-data

Page 13: TDWG at the University of Tasmania

EOL?

Encyclopedia of Life

Page 14: TDWG at the University of Tasmania

ALA?

Atlas of Living Australia

Page 15: TDWG at the University of Tasmania

GBIF, EOL, ALA …

…are now or will be based on TDWG standards

Page 16: TDWG at the University of Tasmania

So what?

(…hence this talk…)

Page 17: TDWG at the University of Tasmania

Science is getting more collegiate

… a good thing.

Page 18: TDWG at the University of Tasmania

The Project

US$2 million over 2.5 years (Gordon & Betty Moore Foundation)

Page 19: TDWG at the University of Tasmania

Aim

To improve the standards for sharing 'bio-data'

Page 20: TDWG at the University of Tasmania

Why?

The whole is (far) more than the sum of the parts…

Page 21: TDWG at the University of Tasmania

People

Lee Belbin (Hobart: Manager)
Roger Hyam (Edinburgh: Systems Architect)
Ricardo Pereira (Brasilia: Software Engineer)
Donald Hobern (Copenhagen: GBIF, and now Manager of the ALA)
Stan Blum (San Francisco: TDWG old timer!)

Page 22: TDWG at the University of Tasmania

Once … we had paper!

Page 23: TDWG at the University of Tasmania

and calculators!

Page 24: TDWG at the University of Tasmania
Page 25: TDWG at the University of Tasmania
Page 26: TDWG at the University of Tasmania

The attitude:

“It’s Mine!”

Page 27: TDWG at the University of Tasmania

Then…

Page 28: TDWG at the University of Tasmania
Page 29: TDWG at the University of Tasmania
Page 30: TDWG at the University of Tasmania
Page 31: TDWG at the University of Tasmania

… but we are moving to

…far more open sharing and integration of data

Page 32: TDWG at the University of Tasmania

This will enable

…more effective environmental and species conservation / management

(among many other things)

Page 33: TDWG at the University of Tasmania

To do this, we need effective standards

…using ‘web 2.0’ technologies

Page 34: TDWG at the University of Tasmania

Video

‘The web is us’

http://www.youtube.com/watch?v=6gmP4nk0EOE

Page 35: TDWG at the University of Tasmania

Standards?

…Good ones are transparent to most who use them

Page 36: TDWG at the University of Tasmania

But for your education…

I’ll give you a little insight … it will be good for you.

Promise

Page 37: TDWG at the University of Tasmania

Standards to exchange bio-data have three components:

1. An ontology
2. GUIDs
3. Transport protocols

Page 38: TDWG at the University of Tasmania

1. Ontology

A data model that represents a formal set of concepts within a domain and the relationships between those concepts

Page 39: TDWG at the University of Tasmania

Ontologies

…are the basis of the Semantic Web, where objects are given meaning that both computers and humans can understand

Page 40: TDWG at the University of Tasmania

Ontologies

…can be used by machines to reason about the objects within that domain
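
To make "reasoning" concrete, here is a minimal sketch in which a machine infers facts that were never stated directly. The concept names and the `is_a` relation are hypothetical and are not taken from any TDWG ontology; the only technique shown is following a transitive relation.

```python
# Hypothetical "is a" statements: each concept maps to its direct parent.
is_a = {
    "Herbarium sheet": "Specimen",
    "Specimen": "Physical object",
}

def ancestors(concept, relation):
    """Follow the relation upward and collect every inferred parent."""
    found = []
    seen = {concept}
    while concept in relation:
        concept = relation[concept]
        if concept in seen:  # guard against accidental cycles
            break
        seen.add(concept)
        found.append(concept)
    return found

# Only two facts were stated, yet the machine can also conclude that a
# herbarium sheet is a physical object.
inferred = ancestors("Herbarium sheet", is_a)
```

Real ontology languages support much richer inference than this, but the principle is the same: stated relationships let software derive unstated ones.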

Page 41: TDWG at the University of Tasmania
Page 42: TDWG at the University of Tasmania

Resource Description Framework …

RDF is the language of the Semantic Web

Page 43: TDWG at the University of Tasmania
Page 44: TDWG at the University of Tasmania

ALL data can be stored in the form of ‘RDF triples’ …

subject – predicate (verb) – object

Wine – has vintage – 2005
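
As a minimal sketch of the triple idea, the wine example from this slide can be held as plain Python tuples and queried by pattern. The `match` helper and the extra `has colour` triple are illustrative assumptions, not part of RDF or of any TDWG standard; real RDF identifies subjects and predicates with URIs rather than bare strings.

```python
from typing import List, Optional, Tuple

# A triple is (subject, predicate, object). Plain strings are used here
# only to keep the sketch readable.
Triple = Tuple[str, str, str]

triples: List[Triple] = [
    ("Wine", "has vintage", "2005"),  # the example from the slide
    ("Wine", "has colour", "red"),    # hypothetical extra triple
]

def match(store: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# "What is the vintage of Wine?"
vintage = match(triples, subject="Wine", predicate="has vintage")
```

Because every fact has the same three-part shape, one generic query function covers any question that can be phrased as a pattern over subject, predicate and object.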

Page 45: TDWG at the University of Tasmania
Page 46: TDWG at the University of Tasmania

2. GUIDs

Globally Unique Identifiers

Page 47: TDWG at the University of Tasmania

GUIDs

Assigned by authorities to their (bio) objects

Page 48: TDWG at the University of Tasmania

GUIDs

…Remain attached to data objects (with attribution!)

Page 49: TDWG at the University of Tasmania

GUIDs

… When ‘clicked’ return ‘semantic’ metadata / data

Page 50: TDWG at the University of Tasmania

GUID of Choice …

Life Science Identifiers (LSIDs)
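
An LSID is a URN of the form `urn:lsid:authority:namespace:object`, with an optional trailing revision. The sketch below splits one into its parts; the `parse_lsid` helper and the identifier `urn:lsid:example.org:specimens:12345` are made up for illustration, and resolving an LSID to its metadata is not shown.

```python
from typing import NamedTuple, Optional

class LSID(NamedTuple):
    authority: str            # who assigned the identifier
    namespace: str            # the authority's own grouping
    object_id: str            # the object within that namespace
    revision: Optional[str]   # optional version of the object

def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its components, or raise ValueError."""
    parts = text.split(":")
    prefix = [p.lower() for p in parts[:2]]
    if len(parts) not in (5, 6) or prefix != ["urn", "lsid"]:
        raise ValueError("not an LSID: " + repr(text))
    if any(part == "" for part in parts[2:5]):
        raise ValueError("LSID has an empty component: " + repr(text))
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)

# Hypothetical identifier, used only to show the structure.
example = parse_lsid("urn:lsid:example.org:specimens:12345")
```

The authority component is what ties the identifier back to the organisation that assigned it, which is how attribution can travel with the data object.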

Page 51: TDWG at the University of Tasmania

3. Transport Protocols

…Map local data to global standards

Page 52: TDWG at the University of Tasmania

Transport Protocols

… Enable searching across geographically separated data repositories (based on different systems)
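
The two roles described for a transport protocol, mapping local fields to shared concepts and searching across separate repositories, can be sketched as follows. This is not TAPIR itself: the repositories, records, field names and shared concept names are all hypothetical, and a real deployment would send queries over the network rather than loop over in-memory lists.

```python
# Two hypothetical repositories that hold the same kind of record
# under different local field names.
museum_records = [{"sci_name": "Melissotarsus insularis", "cat_no": "M-001"}]
herbarium_records = [{"taxon": "Eucalyptus globulus", "barcode": "H-042"}]

# Each repository publishes a mapping from its local fields to shared concepts.
museum_mapping = {"sci_name": "ScientificName", "cat_no": "CatalogueNumber"}
herbarium_mapping = {"taxon": "ScientificName", "barcode": "CatalogueNumber"}

def to_shared(record, mapping):
    """Rename a record's local fields to the shared concept names."""
    return {mapping[field]: value
            for field, value in record.items() if field in mapping}

def federated_search(sources, concept, value):
    """Ask every source the same question, phrased in shared concepts."""
    hits = []
    for name, records, mapping in sources:
        for record in records:
            shared = to_shared(record, mapping)
            if shared.get(concept) == value:
                hits.append((name, shared))
    return hits

sources = [
    ("museum", museum_records, museum_mapping),
    ("herbarium", herbarium_records, herbarium_mapping),
]
hits = federated_search(sources, "ScientificName", "Eucalyptus globulus")
```

The person searching never needs to know that one repository calls the field `taxon` and another `sci_name`; the mapping hides that difference.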

Page 53: TDWG at the University of Tasmania

The transport protocol of choice …

TAPIR

TDWG Access Protocol for Information Retrieval

Page 54: TDWG at the University of Tasmania

Transport Protocol

Video

http://www.youtube.com/watch?v=x9404is3RJ8

Page 55: TDWG at the University of Tasmania

An Example

Antbase, Google, GenBank and PubMed ‘skimmed’ for RDF and GUIDs using TAPIR

Page 56: TDWG at the University of Tasmania

… Emergent Properties

…there are specimens that have been barcoded and which are labelled in GenBank as unidentified (i.e., names like "Melissotarsus sp. BLF m1"), but the same specimen has a proper name in AntWeb (e.g., casent0107665-d01 is Melissotarsus insularis).

We can then use this information to add value to GenBank. For example, a search of GenBank for sequences for Melissotarsus insularis finds nothing, but it does have sequences for this taxon, albeit under the name "Melissotarsus sp. BLF m1".

Rod Page
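
The effect described in the quote can be sketched as a join on a shared specimen identifier. The names and the specimen code come from the quote; the record layout, the helper functions, and the assumption that the GenBank record carries the same specimen code as the AntWeb record are illustrative only.

```python
# Record layout is an assumption; names and specimen code are from the quote.
genbank = [{"specimen": "casent0107665-d01", "name": "Melissotarsus sp. BLF m1"}]
antweb = [{"specimen": "casent0107665-d01", "name": "Melissotarsus insularis"}]

def build_aliases(genbank_records, antweb_records):
    """Map each AntWeb name to the GenBank labels used for the same specimen."""
    antweb_by_specimen = {r["specimen"]: r["name"] for r in antweb_records}
    aliases = {}
    for record in genbank_records:
        proper = antweb_by_specimen.get(record["specimen"])
        if proper and proper != record["name"]:
            aliases.setdefault(proper, set()).add(record["name"])
    return aliases

def search(genbank_records, name, aliases):
    """Search GenBank records by name, also trying any known aliases."""
    wanted = {name} | aliases.get(name, set())
    return [r for r in genbank_records if r["name"] in wanted]

aliases = build_aliases(genbank, antweb)
plain = search(genbank, "Melissotarsus insularis", {})        # finds nothing
linked = search(genbank, "Melissotarsus insularis", aliases)  # finds the sequence record
```

Neither database holds the answer on its own; it appears only once the two are linked through a common identifier.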

Page 57: TDWG at the University of Tasmania