Biological nomenclature in the postgenomic era: Biological and computational issues. George Garrity and Catherine Lyons Bergey’s Manual Trust and Explicatrix, LLC

Jan 08, 2018

Transcript
Page 1

Biological nomenclature in the postgenomic era: Biological and computational issues.

George Garrity and Catherine Lyons

Bergey’s Manual Trust and Explicatrix, LLC

Page 2

Imagine...

• A clinical microbiologist’s predicament
• The microbial ecologist’s dilemma
• The case of Francisella novicida
• The history of the Alteromonadaceae
  – Genus described in 1972
    • 15 emendations, 20 species
      – 19 moved to four genera
      – 5 synonyms, two subspecies
      – 64 names, five genera, three families, two classes
• The common thread in all these stories…

Page 3

Stan Falkow’s Underwear

“Given a choice, most taxonomists would rather wear each other’s underwear than use each other’s names”

Why is this so?

Page 4

My objective

• Share some insights on problems in three areas
  – Nomenclature and taxonomy
  – Publishing taxonomic information
  – A generalized taxonomic model
    • Finite state machine
    • Simple grammar
  – Global issues
    • Data equivalence
    • Data provenance
    • Data curation

Page 5

Problems in nomenclature

• Systematic biologists
  – Marking territory
  – Personal achievement
• Other biologists
  – End-users
    • Unfamiliar with literature
      – Unique aspects
    • Unaware of Codes of Nomenclature
      – Legalistic framework
        » Formation and assignment of names
        » Circumscription and emendation of taxa
        » Priority and citation
        » Synonymy and homonymy
        » Correction of orthographic errors
        » Adjudication of nomenclatural disputes
      – But
        » Do not govern classification or identification

Page 6

Problems in nomenclature (cont.)

– Biological names
  • Primary entry point into STM literature
  • Prominent role in laws/regulations
    – Commerce, public safety, public health
  • Primary entry point into scientific databases
  • Poor identifiers
    – Fixed in time and scope
    – May not be revised
    – Synonymies generally not addressed
    – Persist, but
      » obsolesce in relation to taxon
      » An archival record of a taxonomic definition for a single point in time

Page 7

The name/taxon disjunction

• Impact
  – Accumulation of dubious names in literature/databases
  – Affects assertions of:
    • Identity, commonality of pathways, common ancestry, homology, paralogy, xenology
    • Legal consequences

Page 8

Problems in print publishing

• Key requirement
  – Proposals and emendations must appear in print
    • Code specific
      – Prokaryotic Code
        » Effective, legitimate, and valid
        » Registration
• Taxonomies are retrospective
  – Can only cite earlier publications
  – Cannot cite future emendations
  – Increasingly based on molecular sequence data
    • Deposit of sequence data in public databases
      – Not conveniently referenced in print

Page 9

Problems with electronic publishing

• No formal publishing mechanisms
  – Does not fulfill fundamental requirement of the Code(s)
  – Lack bibliographic information
    • Not citable
    • Not persistent
      – Subject to uncontrolled change
      – May disappear
    • Link rot
      – 404 Link not found

Page 10

A brief glimpse at where we’re headed

• The Bergamot/N4L model
  – Separates names from taxa
    • Taxa nameless
      – Uniquely, persistently identified
  – Supports multiple, overlapping taxonomies
    • Accumulation of new data vs. new methodologies
    • Rank agnostic
  – Unique from all other approaches
    • An identifier resolution service, not an information space in which to practice taxonomy.
  – Names provide an entry point into the literature
    • Reliably
    • Persistently
• A lightweight information layer
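
The separation of names from taxa described on this slide can be illustrated with a small sketch. This is not the Bergamot/N4L implementation; the class `NameResolver`, the identifier scheme `taxon:0001`, and the placeholder names "Aus bus" and "Xus bus" are all hypothetical, and the sketch assumes each name leads to a single taxon, which is simpler than the multiple overlapping taxonomies the model is said to support.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NameResolver:
    """Toy resolver: names are entry points, taxa carry only opaque identifiers."""

    # name string -> persistent taxon identifier (hypothetical scheme)
    _taxon_for_name: Dict[str, str] = field(default_factory=dict)
    # taxon identifier -> name currently pointed to
    _current_name: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, taxon_id: str, current: bool = False) -> None:
        """Attach a name to a taxon; optionally mark it as the current name."""
        self._taxon_for_name[name] = taxon_id
        if current:
            self._current_name[taxon_id] = name

    def resolve(self, name: str) -> Optional[str]:
        """Return the taxon identifier a name leads to, or None if unknown."""
        return self._taxon_for_name.get(name)

    def current_name(self, taxon_id: str) -> Optional[str]:
        """Return the name currently pointed to by a taxon, if any."""
        return self._current_name.get(taxon_id)

    def names_for(self, taxon_id: str) -> List[str]:
        """All names (current and superseded) that resolve to one taxon."""
        return sorted(n for n, t in self._taxon_for_name.items() if t == taxon_id)


# Placeholder names and identifiers, for illustration only.
resolver = NameResolver()
resolver.register("Aus bus", "taxon:0001")                # older name
resolver.register("Xus bus", "taxon:0001", current=True)  # later name
```

In this sketch a superseded name keeps resolving to the same identifier, so a reader entering the literature through either name reaches the same taxon.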

Page 11

A simple grammar

species -> current.name.pointer, exemplar.deposit.pointer+, sequence.deposit.pointer+
taxon -> current.name.pointer, nomos.defined.data, (taxon+|species+)
nomos.defined.data -> (sequence|phenotypic.feature|text)+
name -> (citation, bibliographic.record, name.status)
exemplar -> exemplar.id, source
sequence -> gene, sequence.deposit
source -> exemplar|exemplar.deposit|text
exemplar.deposit -> brc.id.pointer, deposit.id.pointer, source
sequence.deposit -> brc.id.pointer, deposit.id.pointer, source
phenotypic.feature -> feature.name, feature.value, deposit.id.pointer
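
As a rough illustration, the `species` and `taxon` productions of this grammar can be written as Python dataclasses. This is only a sketch under stated assumptions: pointers are modelled as opaque strings, `nomos.defined.data` is flattened to a list of strings, and the class names, the `is_well_formed` checks, and the identifiers in the example are hypothetical rather than part of the authors' model.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Species:
    current_name_pointer: str               # current.name.pointer
    exemplar_deposit_pointers: List[str]    # exemplar.deposit.pointer+
    sequence_deposit_pointers: List[str]    # sequence.deposit.pointer+

    def is_well_formed(self) -> bool:
        # "+" means one or more, so both lists must be non-empty.
        return bool(self.exemplar_deposit_pointers) and bool(
            self.sequence_deposit_pointers
        )


@dataclass
class Taxon:
    current_name_pointer: str               # current.name.pointer
    nomos_defined_data: List[str]           # (sequence|phenotypic.feature|text)+
    members: List[Union["Taxon", Species]]  # (taxon+|species+)

    def is_well_formed(self) -> bool:
        if not self.nomos_defined_data or not self.members:
            return False
        # The alternation (taxon+|species+) allows taxa or species, not a mix.
        return len({type(m) for m in self.members}) == 1


# Hypothetical identifiers, for illustration only.
species = Species("name:1", ["exemplar-deposit:1"], ["sequence-deposit:1"])
genus = Taxon("name:2", ["text:1"], [species])
mixed = Taxon("name:3", ["text:2"], [species, genus])
```

Here `species` and `genus` satisfy their productions, while `mixed` does not, because its members combine a species with a taxon.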

Page 12

[Diagram: Taxon, with Exemplar+, Sequence+, Name+, and Species+]

Page 13

[Diagram: Taxon, with Exemplar+, Sequence+, Name+, and Species+. Added labels: Literature; Governing bodies; GenBank, DDBJ, EMBL, others; Collections, BRC]

Page 14

[Diagram: Taxon, with Exemplar+, Sequence+, Name+, and Species+. Labels: Literature; Governing bodies; GenBank, DDBJ, EMBL, others; Collections, BRC; Practitioner+ (three occurrences); genotypic, “omics”; Proposal, STM, Legal; Databases; Priority, Validity, Synonymy, Exemplar req.; phenotypic; direct, indirect; BRC; Public, Private; General]

Page 15

• Exemplar+, Sequence+, Name+, Species+: A properly formed species
• Sequence+, Name+, Species+: Candidatus or exemplar lost
• Sequence+: Environmental sequence
• Exemplar+, Name+, Species+: Old type strain, not yet sequenced
• Name+, Species+: Old type, exemplar based on drawing or description
• Sequence+, “Name”+, Exemplar*: Misidentified taxon
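
The cases on this slide can be read as a lookup on which components are present. The function below is a minimal sketch of that reading, not the authors' method: the name `classify_record`, its boolean parameters, and the `"unclassified"` fallback are assumptions. The misidentified-taxon case is left out because it depends on whether the name is correctly applied, which presence flags alone cannot express.

```python
def classify_record(has_exemplar: bool, has_sequence: bool, has_name: bool) -> str:
    """Map the presence of exemplar, sequence, and name to a case from the slide."""
    if has_exemplar and has_sequence and has_name:
        return "properly formed species"
    if has_sequence and has_name:
        return "Candidatus or exemplar lost"
    if has_exemplar and has_name:
        return "old type strain, not yet sequenced"
    if has_name:
        return "old type, exemplar based on drawing or description"
    if has_sequence and not has_exemplar:
        return "environmental sequence"
    # Combinations the slide does not show (assumed fallback).
    return "unclassified"
```

For example, `classify_record(False, True, False)` returns `"environmental sequence"`.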

Page 16

[Diagram: Taxon, with Exemplar+, Sequence+, Name+, and Species+. Labels: N4L/Bergamot; Literature; Governing bodies; GenBank, DDBJ, EMBL, others; Collections, BRC]

Page 18

A bit of background information

• Bergey’s Manual Trust
  – Principal information source
    • Bergey’s Manual of Determinative Bacteriology
    • Bergey’s Manual of Systematic Bacteriology
    • Taxonomic Outline of the Procaryotes

Page 20

A bit of background information

• Bergey’s Manual Trust
  – Principal information source
    • Bergey’s Manual of Determinative Bacteriology
    • Bergey’s Manual of Systematic Bacteriology
    • Taxonomic Outline of the Procaryotes
  – Expertise in content packaging/delivery
    • SGML/XML publishing
      – The Systematics
        » XML-compliant SGML instance

Page 22

A bit of background information

• Bergey’s Manual Trust
  – Principal information source
    • Bergey’s Manual of Determinative Bacteriology
    • Bergey’s Manual of Systematic Bacteriology
    • Taxonomic Outline of the Procaryotes
  – Expertise in content packaging/delivery
    • SGML/XML publishing
      – The Systematics
        » XML-compliant SGML instance
      – The Outline
        » An experiment in SGML/XML publishing

Page 24

A bit of background information

• Bergey’s Manual Trust
  – Principal information source
    • Bergey’s Manual of Determinative Bacteriology
    • Bergey’s Manual of Systematic Bacteriology
    • Taxonomic Outline of the Procaryotes
  – Expertise in content packaging/delivery
    • SGML/XML publishing
      – The Systematics
        » XML-compliant SGML instance
      – The Outline
        » An experiment in SGML/XML publishing
      – Derivative projects
        » Bergamot/N4L
        » The Determinative