Taxon diversity analysis for bulk insect samples using Illumina Hi-seq platform
Xin ZHOU, Shanlin LIU, Yiyuan LI, Qing YANG, and Xu SU
Department of Science and Technology
Environmental Genomics Research Group
BGI, China
Adelaide, Australia, 3 December 2011
Problem: Solutions?
Opt. 1: ......zzzzZZZZZ
Opt. 2: morph sorting → indiv. ID → … → Opt. 1
Opt. 3: morph sorting → indiv. barcoding → … → Opt. 1
Opt. 4: grinding up → NGS → CLUSTERING/BLAST → DIVERSITY!
Zhou et al. 2011, 4th International Barcode of Life Conference
Environmental barcoding of bulk insects
Aquatic insects: mini-barcode (130 bp), 454
Bat diet (insects): COI fragment (157 bp), 454
Biodiversity soup: metabarcoding of arthropods for rapid biodiversity assessment and biomonitoring, Yu D.W. et al., in review
Total MT isolation & DNA extraction
Sample mixture → Total MT isolation → MT DNA extraction
Shotgun sequencing
Insert size: 200 bp; read length: 100 bp PE

Metric                                Percentage of base pairs
Q20 (sequencing error rate < 1%)      96.2%
Q30 (sequencing error rate < 0.1%)    92.9%
GC content                            38.0%
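The Q20 and Q30 thresholds follow from the standard Phred definition, Q = -10 * log10(p), where p is the per-base error probability. A minimal sketch of that conversion (the function names are illustrative, not from the slides):

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (20, 30):
        # Q20 corresponds to a 1% error rate, Q30 to a 0.1% error rate.
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```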
Pre-analysis
Raw data: 2.45 G
After filtering: 2.20 G
Ratio of high-quality reads: 89.91%
Data filtering:
1. Adaptor contamination removal;
2. Quality control: in each read, only allowing < 10 bp with sequencing error rate > 1%.
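The quality-control rule above can be sketched as a per-read check: count the bases whose error rate exceeds 1% (Phred score below 20) and keep the read only if there are fewer than 10. This is an illustration, not the pipeline used in the talk; the function names are hypothetical, and the Phred+33 quality encoding is an assumption (older Illumina pipelines used an offset of 64, so the offset is left as a parameter).

```python
def phred_scores(quality_string: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string into Phred scores (offset is an assumption)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_quality_filter(quality_string: str,
                          min_q: int = 20,
                          max_low_bases: int = 10,
                          offset: int = 33) -> bool:
    """Keep a read only if fewer than max_low_bases bases fall below min_q.

    Q < 20 corresponds to a per-base error rate above 1%.
    """
    low = sum(1 for q in phred_scores(quality_string, offset) if q < min_q)
    return low < max_low_bases


if __name__ == "__main__":
    good_read = "I" * 95 + "#" * 5    # 5 low-quality bases out of 100
    bad_read = "I" * 90 + "#" * 10    # 10 low-quality bases out of 100
    print(passes_quality_filter(good_read), passes_quality_filter(bad_read))
```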
Approach #2: PCR-free method
Method 1: Reference based
BLAST reads to reference barcodes; a confident identification is made only when:
1. Best BLAST hit > 98% identity;
2. Reference coverage > 90%.

[Figure: Reference 1, correct mapping, coverage 100%; Reference 2, incorrect mapping, coverage 30%]

Taxon group    # OTUs
Lepidoptera    20
Diptera        2
Hemiptera      3
Psocoptera     1
Total          26
Not found      13
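One possible reading of the two criteria in Method 1, sketched under stated assumptions: each read keeps only its best hit (highest bit score), hits at or below 98% identity are discarded, and a reference barcode counts as detected only if the remaining hits cover more than 90% of its length. The tuple layout, function names, and example data are hypothetical; the slides do not say how coverage was computed.

```python
from collections import defaultdict


def best_hits(rows):
    """Keep the highest-bitscore hit per read.

    Each row is (read_id, ref_id, pident, sstart, send, bitscore).
    """
    best = {}
    for row in rows:
        read_id, bitscore = row[0], row[5]
        if read_id not in best or bitscore > best[read_id][5]:
            best[read_id] = row
    return list(best.values())


def covered_fraction(intervals, ref_length):
    """Fraction of a reference covered by the union of 1-based inclusive intervals."""
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end + 1)  # skip bases already counted
        if end >= start:
            covered += end - start + 1
            current_end = end
    return covered / ref_length


def confident_references(rows, ref_lengths, min_identity=98.0, min_coverage=0.90):
    """Return {ref_id: coverage} for references passing both thresholds."""
    intervals = defaultdict(list)
    for _read, ref_id, pident, sstart, send, _score in best_hits(rows):
        if pident > min_identity:
            intervals[ref_id].append((min(sstart, send), max(sstart, send)))
    result = {}
    for ref_id, ivs in intervals.items():
        coverage = covered_fraction(ivs, ref_lengths[ref_id])
        if coverage > min_coverage:
            result[ref_id] = coverage
    return result


if __name__ == "__main__":
    # Hypothetical hits: ref1 is fully covered, ref2 only over 30% of its length.
    hits = [
        ("read1", "ref1", 99.0, 1, 100, 180.0),
        ("read2", "ref1", 99.0, 90, 200, 200.0),
        ("read3", "ref2", 99.5, 1, 60, 110.0),
    ]
    print(confident_references(hits, {"ref1": 200, "ref2": 200}))
```

The coverage requirement is what rejects the second reference in this toy data: a short, highly similar stretch can attract reads from a different species, but it will not cover most of the barcode.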
Potential sources of failure in detecting taxa
Taxon specific, or biomass (size & number)?
Taxon bias?
Failures in taxon detection

Taxon group    # Total OTUs    # OTUs missing (undetected)
Lepidoptera    25              5
Diptera        7               5
Hymenoptera    2               2
Hemiptera      4               1
Psocoptera     1               0
Total          39              13
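To compare orders of very different sizes, the counts above can be turned into per-order detection rates. A small sketch using only the numbers from the table (the variable and function names are illustrative):

```python
# (total OTUs, OTUs missing) per order, copied from the table above.
otu_counts = {
    "Lepidoptera": (25, 5),
    "Diptera": (7, 5),
    "Hymenoptera": (2, 2),
    "Hemiptera": (4, 1),
    "Psocoptera": (1, 0),
}


def detection_rate(total: int, missing: int) -> float:
    """Fraction of OTUs that were detected."""
    return (total - missing) / total


if __name__ == "__main__":
    for order, (total, missing) in otu_counts.items():
        print(f"{order}: {detection_rate(total, missing):.1%} detected")
    all_total = sum(t for t, _ in otu_counts.values())
    all_missing = sum(m for _, m in otu_counts.values())
    print(f"Overall: {detection_rate(all_total, all_missing):.1%} detected")
```

With such small counts for Hymenoptera and Psocoptera, the per-order rates are only a rough indication of bias.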
OR bio-mass (body size, # individuals)?
Failures in taxon detection
Readily detected: average length > 5 mm
Missing: average length < 5 mm
Approach #2: PCR-free method
Method 2: Reference independent
(Will we be able to identify diversity without reference MT genomes for the targeted species?)
Workflow:
1. Assembly of the COI gene using a genome assembly program (SOAPdenovo);
2. Annotation using ~240 MT genomes downloaded from GenBank.
PCR-free reference-independent: results
23/31 fall in the standard COI barcode region (mostly > 600 bp);
1 of the 23 is not in our reference barcodes (Insecta; Lepidoptera; Pyralidae).