State-of-the-art France GBF-Toulouse Sequencing Team BAC selection and Finishing Murielle Philippot Pierre Frasse Genome Assembly Vincent Cahais Sana Hakim Simo Zouine Infrastructure and involvement of Bioinformatic Plateform INRA- Toulouse

Mar 28, 2015


Jenna McGregor
Transcript
Page 1

State-of-the-art France

GBF-Toulouse Sequencing Team

BAC selection and Finishing
• Murielle Philippot
• Pierre Frasse

Genome Assembly
• Vincent Cahais
• Sana Hakim
• Simo Zouine

Infrastructure and involvement of the Bioinformatics Platform, INRA-Toulouse

Page 2

– A single library yields both shotgun and L-PET sequences at the same time

– Longer L-PET sequences, ~100 bases for each end

Page 3

State-of-the-art France

• GBF: 12 Runs
  – 12 runs 3 kb, home-made different library (+ 8 runs Shotgun planned for July – August; lack of DNA)
  – 2 716 911 sequences (3 runs)

• WUR: 27.5 Runs
  – 22 486 227 sequences
  – 15 runs Shotgun
  – 6 runs 3 kb
  – 6.5 runs 20 kb

• Italy: 2 Runs
  – 1 366 781 sequences
  – 1 run 3 kb
  – 1 run 20 kb
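The read totals above can be turned into an average yield per run, which makes the three centres easier to compare. This is a minimal sketch, assuming each read total applies to the number of runs listed next to it (3 runs for GBF, 27.5 for WUR, 2 for Italy). The dictionary and the function name `reads_per_run` are illustrative and not part of the original slides.

```python
# Read totals and run counts as listed on the slide.
# Assumption: the GBF total covers the 3 runs noted in parentheses,
# and the WUR and Italy totals cover all of their listed runs.
centres = {
    "GBF": (2_716_911, 3.0),
    "WUR": (22_486_227, 27.5),
    "Italy": (1_366_781, 2.0),
}


def reads_per_run(total_reads: int, runs: float) -> float:
    """Average number of reads produced by one run."""
    return total_reads / runs


for name, (reads, runs) in centres.items():
    print(f"{name}: {reads_per_run(reads, runs):,.0f} reads per run")
```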

Page 7

[Figure: 20 kb library, Holland. Number of sequences (axis 0 to 5000) against sequence length (axis 1 to 897), one curve per SFF file: FUCQJOD02_Ho, FW0TE9I01_Ho, FW0TE9I02_Ho, FW6J5DV01_Ho, FW6J5DV02_Ho, FW95G3M01_Ho, FW95G3M02_Ho, FWEXUQG01_Ho, FWGOYO101_Ho, FWGOYO102_Ho, FWLY2VR01_Ho, FWLY2VR02_Ho, FXDORVL01_Ho]

Page 10

[Figure, four panels:
– Mean quality per SFF, France (axis 0 to 35)
– Mean length per SFF, France (3 kb) (axis 0 to 700)
– Mean quality per SFF, Italy (20 kb / 3 kb) (axis 0 to 35)
– Mean length per SFF, Italy (20 kb / 3 kb) (axis 0 to 700)]

Page 11

[Figure, four panels:
– Mean quality per SFF, Holland WGS (axis 0 to 35)
– Mean quality per SFF, Holland 3 kb (axis 0 to 35)
– Mean length per SFF, Holland WGS (axis 0 to 700)
– Mean length per SFF, Holland 3 kb (axis 0 to 700)]

Page 12

[Figure, four panels, each showing the mean number of Ns per SFF together with the mean length:
– Holland shotgun (axes 0 to 800 and 0 to 20)
– Holland 3 kb (axes 0 to 600 and 0 to 20)
– France (axes 0 to 600 and 0 to 20; files FT7TVTG01.fasta, FV59KIZ01.fasta, FWZTXF202.fasta)
– Italy (axes 0 to 500 and 0 to 20)]
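The panels on this page plot, for each SFF file, the mean read length and the mean number of N bases. The sketch below shows one way such per-file figures could be computed. It assumes the reads have already been exported to FASTA (the file names on the slide end in `.fasta`). The functions `parse_fasta` and `summarize` are hypothetical helpers written for this illustration, not the team's actual script, and the example reads are invented.

```python
from statistics import mean


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(text):
    """Return (mean read length, mean number of N bases per read)."""
    seqs = [seq for _, seq in parse_fasta(text)]
    if not seqs:
        raise ValueError("no sequences found")
    mean_length = mean(len(s) for s in seqs)
    mean_n = mean(s.upper().count("N") for s in seqs)
    return mean_length, mean_n


# Toy records, made up for illustration only.
example = """>read1
ACGTNACGT
>read2
ACGTAC
GTNN
"""

mean_len, mean_n = summarize(example)
print(f"mean length: {mean_len}, mean Ns per read: {mean_n}")
```

For a real file, the same function could be called on the file's contents, for example `summarize(open(path).read())`, once per SFF export.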

Page 13

Conclusions

• Mean sequence length: 400 – 500 nt

• Mean sequence quality: 25 – 30

• Shotgun gives:
  - longer reads (550 nt)
  - higher frequency of long reads

• Chloroplast and mitochondrial genome contamination:
  - estimated very low (1600 – 1800 / 500k reads, corresponding to 1 run)

• The ratio of 2 runs for 1x coverage has been slightly over-estimated
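The contamination and coverage figures in these conclusions come down to simple arithmetic, sketched below. The inputs are the slide's own numbers: 500k reads per run, 1600 to 1800 organelle reads, and a mean read length of 400 to 500 nt. The function names are illustrative. The genome size needed by `runs_for_coverage` is not given in the slides, so it is left as a parameter for the reader to supply.

```python
def contamination_fraction(organelle_reads: int, reads_per_run: int) -> float:
    """Share of reads in one run that come from organelle genomes."""
    return organelle_reads / reads_per_run


def bases_per_run(reads_per_run: int, mean_read_length: float) -> float:
    """Total number of bases produced by one run."""
    return reads_per_run * mean_read_length


def runs_for_coverage(genome_size_bp, coverage, reads_per_run, mean_read_length):
    """Runs needed to reach a given fold coverage.

    genome_size_bp is an assumption supplied by the caller; the slides
    do not state it.
    """
    return coverage * genome_size_bp / bases_per_run(reads_per_run, mean_read_length)


READS_PER_RUN = 500_000  # "500k reads corresponding to 1 run"

for hits in (1600, 1800):
    share = contamination_fraction(hits, READS_PER_RUN)
    print(f"{hits} organelle reads: {share:.2%} of one run")

for length in (400, 500):
    megabases = bases_per_run(READS_PER_RUN, length) / 1e6
    print(f"{length} nt reads: {megabases:.0f} Mb per run")
```

With these inputs the organelle share stays well below 1% of a run, which is consistent with the "very low" estimate on the slide.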

Page 14

Suggestions - Questions

• 1 run of 454 sequencing of the new 8 or 20 kb PET libraries

• BAC-end sequencing of the sheared library (50 000 clones; 5-6 x)

• Whole-genome draft assembly with non-Newbler assemblers