Plant DNA Barcoding: data workflow Aron Fazekas University of Guelph

Dec 01, 2014

Dr Fazekas's process for checking and editing DNA sequences before publishing on BOLD.
Page 1: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Plant DNA Barcoding: data workflow

Aron Fazekas University of Guelph

Page 2: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Plant DNA Barcoding: data workflow

Workflow Outline:

raw sequence editing

data alignment

re-edit the sequence file

upload to BOLD

quality checks using BOLD / GenBank

Page 3: Dr Aron Fazekas - Plant DNA Barcoding; data workflow
Page 4: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Sequence editing: primer trimming

Page 5: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

5’ GTTATGCATGAACGTAATGCTC

GAGCATTACGT….

Sequence editing: primer trimming
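
The two sequences on this slide appear to be a primer and the start of its reverse complement, which is how the primer shows up at the far end of a read sequenced from the opposite direction. A minimal sketch of that check follows. The function names `reverse_complement` and `trim_primer` are hypothetical, and only exact matches are handled. Trimming in a trace editor would normally tolerate mismatches and ambiguous bases.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def trim_primer(read, primer):
    """Strip an exact primer match from the 5' end of the read, and an
    exact match of its reverse complement from the 3' end.

    Assumption: exact, case-insensitive matching only.
    """
    primer = primer.upper()
    rc = reverse_complement(primer)
    if read.upper().startswith(primer):
        read = read[len(primer):]
    if read.upper().endswith(rc):
        read = read[:-len(rc)]
    return read


primer = "GTTATGCATGAACGTAATGCTC"  # primer shown on the slide
print(reverse_complement(primer))
```

The printed reverse complement begins with `GAGCATTACGT`, the fragment shown on the slide.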

Page 6: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Sequence editing: primer trimming

Page 7: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Sequence editing: editing miscalls

Page 8: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Sequence editing: congruence between forward/reverse reads

Page 9: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Sequence Alignment

rbcL easy to align - most programs work well

matK tricky to align – TransAlign seems to do the best job

trnH difficult (impossible between genera?)

ITS difficult (impossible between genera?)

After editing: need to align the data

Kelchner (2000) Ann Missouri Bot Gard

Clustal www.clustal.org

TransAlign http://www.biomedcentral.com/1471-2105/6/156

K-Align http://www.ebi.ac.uk/Tools/msa/kalign/

Page 10: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Problems to look for after alignment:

- primers not trimmed

- gaps at the ends

- gaps in the middle (protein coding)

- translation shows stop codons

Sequence Alignment
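
The gap checks in this list can be sketched in a few lines. This is an illustration only: the helper names are hypothetical, `-` is assumed to be the gap character, and the rule that an internal gap in a coding region is suspect when its length is not a multiple of three is a reading-frame heuristic, not something stated on the slide.

```python
import re


def terminal_gaps(aligned):
    """Return (leading, trailing) gap counts for one aligned sequence."""
    leading = len(aligned) - len(aligned.lstrip("-"))
    trailing = len(aligned) - len(aligned.rstrip("-"))
    return leading, trailing


def internal_gap_runs(aligned):
    """Return (start, length) for each gap run between the first and
    last base, using 0-based alignment columns."""
    offset = len(aligned) - len(aligned.lstrip("-"))
    core = aligned.strip("-")
    return [(m.start() + offset, len(m.group()))
            for m in re.finditer(r"-+", core)]


def frame_breaking_gaps(aligned):
    """Internal gap runs whose length is not a multiple of three.
    In a protein-coding region these would shift the reading frame."""
    return [(start, n) for start, n in internal_gap_runs(aligned) if n % 3]


print(terminal_gaps("--ACG-T---"))
print(frame_breaking_gaps("ACG---TTT--A"))
```

A flagged gap is a prompt to go back to the trace file, since it may reflect a miscalled or dropped base.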

Page 11: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

trnH-psbA: real data submitted for publication

- primers not trimmed

- gaps at the ends

Page 12: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

rbcL data submitted for publication - gaps in the middle of a coding region

Page 13: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Translate coding regions (rbcL, matK) to ensure there are no stop codons present
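
A translation check like this one can be sketched without any external library. The function names and the `frame` parameter are hypothetical. The codon table is the standard genetic code, which is assumed to be adequate here because the bacterial and plant plastid code (NCBI table 11) assigns the same amino acids to every codon.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G); '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate a DNA string from the given frame offset (0, 1 or 2).
    Gaps are removed first; codons with ambiguous bases become 'X'."""
    seq = seq.upper().replace("-", "")[frame:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


def stop_codon_positions(seq, frame=0):
    """Return 0-based codon indices that translate to a stop."""
    return [i for i, aa in enumerate(translate(seq, frame)) if aa == "*"]


print(translate("ATGGCTTAA"))
print(stop_codon_positions("ATGTAAGCT"))
```

Because a barcode fragment does not necessarily start at the first base of a codon, a stop in one frame may only mean the wrong `frame` was chosen. Stops in all three frames point to an editing error.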

Page 14: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Edit both the alignment file and the original sequence file

Page 15: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Can trnH-psbA (or other non-coding sequence) be aligned across diverse species?

Page 16: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Upload to BOLD

Page 17: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

After the data are edited and aligned: use BOLD to create a tree

Page 18: Dr Aron Fazekas - Plant DNA Barcoding; data workflow
Page 19: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

• Check for misplaced taxa – remove them from the dataset

• Check for singleton species – make a list
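
The singleton list can be built from any specimen-to-species table. The sketch below assumes the records are available as `(sample_id, species)` pairs. That layout, the function name, and the sample data are made up for illustration and do not reflect a BOLD export format.

```python
from collections import Counter


def singleton_species(records):
    """Return the sorted names of species represented by one record.

    `records` is an iterable of (sample_id, species) pairs.
    """
    counts = Counter(species for _sample_id, species in records)
    return sorted(name for name, n in counts.items() if n == 1)


# Hypothetical records for illustration only.
records = [
    ("S001", "Acer rubrum"),
    ("S002", "Acer rubrum"),
    ("S003", "Acer saccharum"),
    ("S004", "Betula papyrifera"),
]
print(singleton_species(records))
```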

Page 20: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

BOLD BLAST check

Page 21: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

GenBank BLAST check

Page 22: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

GenBank BLAST check

Page 23: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

GenBank BLAST

Page 24: Dr Aron Fazekas - Plant DNA Barcoding; data workflow

Acknowledgements

Sujeevan Ratnasingham & BOLD Team