Dan Masiga
Molecular Biology and Biotechnology Department
International Centre of Insect Physiology and Ecology, Nairobi, Kenya

The BARCODE Data Standard: Enabling Molecular Diagnostics for Biodiversity

Western and Central Africa: DNA barcoding Meeting
One-day course on DNA barcoding: Practical advice
23rd October 2008

Dec 24, 2015




New partners


The Infrastructure of Taxonomy

• Collections and databases of specimens
• Codes of Taxonomic Nomenclature
• Compilations of taxonomic names
• Data repositories (characters, gene sequences, images, trees)
• Monographs
• Floristic and faunistic surveys/inventories
• Revisions
• The (undigitized) Taxonomic Literature


International Nucleotide Sequence Database Collaboration

http://www.insdc.org/


Roles of INSDC

An archival database/repository for nucleotide sequences:

• Collects the output of individual projects (Project A, Project B, Project C)
• Offers users a common access interface
• Standardization of data structure, including data items and values
• Assignment of a unique identifier (an accession number) to a sequence


New tools for taxonomy

DNA Barcoding

The ability to compare genotype information across a huge range of organisms is a powerful tool


“Only [27%] of papers had a legitimate specimens examined section, with museum numbers for each voucher, and names of the museums where the specimens used in the study could be examined”


Couplets Consisting of:

“Species Name - DNA Sequence”

• Basis of a “look-up table” enabling molecular diagnostic applications

• However, both elements need validation

• Underlying specimens and associated raw sequence data are not typically available for secondary inspection


Problem Areas

TRANSPARENCY AND TRACEABILITY

• Genetic Data Quality
• Specimen Data Quality
• Taxonomy
• Access to Information


Barcoders began calling for a Paradigm Shift

Depositing barcode sequences in a public database, along with primer sequences, trace files and associated quality scores, makes this species identification technique widely accessible. Reference DNA barcode sequences should be derived from, and linked to, specimens of known provenance in web-accessible collections in order to validate this system of molecular diagnostics.


Rationale for Defining “BARCODE” keyword in GenBank

• Provides the community with reference records with verifiable and retrievable data:
– Associated with retrievable voucher specimens (liberally defined: tissue, DNA, etc.)
– Linked to on-line metadata
– Meet an agreed-upon standard of taxonomic identification
– Provide an assured level of data completeness
– On an agreed-upon gene region
– Recommended for use in identifying unknowns


The Barcode Data Standard

Establishing a new data standard for “BARCODE” keyword records in DDBJ/EMBL/GenBank:

1. Minimum 500 bp, <1% ambiguous base calls
2. Double-stranded sequence
3. Trace files and associated quality scores
4. Primers used to generate sequence
5. Linkages to:
• A morphological voucher specimen
• Structured reference to collections
• Geospatial reference information
• Valid species name
• Who performed the identification
• Literature citations
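The first requirement (minimum 500 bp, fewer than 1% ambiguous base calls) is simple to check before submission. The Python sketch below is only an illustration, not part of the standard or of any GenBank tool. The function name and parameters are hypothetical. It assumes that any character other than A, C, G or T counts as an ambiguous call and that the 1% limit is applied to the full sequence length.

```python
UNAMBIGUOUS_BASES = set("ACGT")


def meets_length_and_ambiguity(sequence, min_length=500, max_ambiguous_fraction=0.01):
    """Return True if the sequence passes the length and ambiguity thresholds.

    Assumption (not stated on the slide): every character other than
    A, C, G or T is treated as an ambiguous base call.
    """
    bases = sequence.strip().upper()
    if len(bases) < min_length:
        return False
    ambiguous = sum(1 for base in bases if base not in UNAMBIGUOUS_BASES)
    # The slide says "<1%", so the comparison is strict.
    return ambiguous / len(bases) < max_ambiguous_fraction


if __name__ == "__main__":
    clean = "ACGT" * 125            # 500 bases, no ambiguous calls
    too_short = "ACGT" * 124        # 496 bases
    print(meets_length_and_ambiguity(clean))
    print(meets_length_and_ambiguity(too_short))
```

The other requirements (trace files, primers, voucher linkages) concern metadata and cannot be checked from the sequence string alone.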


Features, Qualifiers and Values

The Feature table is updated based on discussions at the International Collaborators meeting of INSDC


NCBI Trace Archive accepts BARCODE as a keyword that identifies “a DNA sequence analysis of a uniform target gene to enable species identification”


Triplet structure for specimen identifiers

/specimen_voucher=“<institution-code>|<collection-code>|<specimen-id>”

<institution-code> - abbreviation of the archiving institution
<collection-code> - collection within the institution (*)
<specimen-id> - specimen identifier within the collection

The above approach is used in Darwin Core/GBIF and parallels the Life Science Identifier (LSID), an Object Management Group (OMG) standard.

(*) museums & herbaria, culture collections, stock centers, germplasm repositories (seed banks), frozen tissue banks, zoos/aquaria/botanical gardens, DNA banks, personal collections, e-voucher archives
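The triplet can be split back into its three parts with a few lines of code. The Python sketch below is an illustration only: `SpecimenVoucher` and `parse_specimen_voucher` are hypothetical names, and the example value `INST|COLL|12345` is an invented placeholder, not a real specimen. The delimiter is a parameter because the slide shows `|`, whereas the published INSDC feature table, as far as I am aware, writes the qualifier with colons.

```python
from typing import NamedTuple


class SpecimenVoucher(NamedTuple):
    institution_code: str
    collection_code: str
    specimen_id: str


def parse_specimen_voucher(value, delimiter="|"):
    """Split a triplet-style /specimen_voucher value into its three parts.

    The default delimiter follows the slide; pass delimiter=":" for
    colon-separated values.
    """
    # Drop surrounding whitespace and straight or curly quotation marks.
    cleaned = value.strip().strip('"“”')
    parts = [part.strip() for part in cleaned.split(delimiter)]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three non-empty fields, got: {value!r}")
    return SpecimenVoucher(*parts)


if __name__ == "__main__":
    # Placeholder value for illustration only.
    voucher = parse_specimen_voucher("INST|COLL|12345")
    print(voucher.institution_code, voucher.collection_code, voucher.specimen_id)
```

This sketch rejects values with a missing field, so a voucher given as a bare specimen number would have to be handled separately.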


Link from GenBank to Museums

www.biorepositories.org


Process Record


Acknowledgments

• Lee Weigt, Smithsonian Institution
• Scott Miller, PI CBOL
• David Schindel, Executive Secretary, CBOL
• Sujeevan Ratnasingham, Biodiversity Institute of Ontario (BIO)/BOLD
• Robert Hanner (BIO)
• Organizers: Western and Central Africa DNA barcoding Meeting (NABDA & CBOL Secretariat)