CREATION OF A SNP CHIP IN SHEEP
Rudiger Brauning and John McEwan on behalf of the International Sheep Genomics Consortium
MapNet Workshop 27 August 2009
Dec 18, 2015
Skim sequencing
• Six breeds at 0.5x each
– Scottish Blackface
– Romney
– Texel
– Merino
– Poll Dorset
– Awassi
• Otago University, Baylor College HGSC
• ~3x coverage (9.74 Gbp), Roche 454 FLX
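The coverage figures on this slide can be cross-checked with simple arithmetic. The sketch below assumes the usual definition (coverage = sequenced bases / genome size) and that the six 0.5x labels correspond one-to-one to the six breeds. The genome size it derives is only what the rounded "~3x" figure implies, not a value stated in the slides.

```python
# Values read off the slide.
total_bases_gbp = 9.74
per_breed_coverage = 0.5
breeds = ["Scottish Blackface", "Romney", "Texel",
          "Merino", "Poll Dorset", "Awassi"]

# Summing the per-breed coverage reproduces the ~3x total.
total_coverage = per_breed_coverage * len(breeds)

# Assumption: coverage = bases / genome size, so the genome size implied
# by the rounded "~3x" figure is bases / coverage (a rough value only).
implied_genome_size_gbp = total_bases_gbp / total_coverage

# Assumption: the sequence yield is split evenly across breeds.
bases_per_breed_gbp = total_bases_gbp / len(breeds)

print(f"total coverage: {total_coverage:.1f}x")
print(f"implied genome size: {implied_genome_size_gbp:.2f} Gbp")
print(f"bases per breed: {bases_per_breed_gbp:.2f} Gbp")
```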
Sequence assembly pipeline
• 454 sequences: repeat mask
– lower case masked 454 reads
• Bovine genome: map uniquely (normal / sensitive megablast)
– uniquely mapped 454 sequences
• Assemble with Newbler / CAP3
– sheep 454 contigs
• vsheep: order, orientate and space
– ordered sheep contigs
• Position sheep contigs on vsg2, replace cow sequence with sheep sequence, remove remaining cow sequence
– sheep genome assembly v1.0 / v2.0
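The "map uniquely" step can be illustrated with a minimal sketch. The slides do not say how uniqueness was decided, so the rule used here (a read is kept only if it has exactly one reported alignment) is an assumption, and the `(read_id, chromosome, position)` tuple layout and the toy data are hypothetical stand-ins for real megablast output.

```python
from collections import Counter


def uniquely_mapped(hits):
    """Return the hits whose read ID occurs exactly once in `hits`.

    `hits` is a list of (read_id, chromosome, position) tuples, one per
    reported alignment. This layout is hypothetical, not megablast's format.
    """
    counts = Counter(read_id for read_id, _, _ in hits)
    return [hit for hit in hits if counts[hit[0]] == 1]


# Toy alignments, invented for illustration only.
example_hits = [
    ("read1", "BTA1", 1000),
    ("read2", "BTA3", 5000),
    ("read2", "BTA7", 42000),  # second hit: read2 is ambiguous and dropped
    ("read3", "BTA9", 77000),
]

unique_hits = uniquely_mapped(example_hits)
print([read_id for read_id, _, _ in unique_hits])
```

A real filter would more likely compare alignment scores between the best and second-best hit; the count-based rule is used here only because it is the simplest correct instance of a uniqueness filter.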
Turning a cow into a sheep - low resolution
Sheep chromosome (OAR): corresponding cattle chromosome(s) (BTA)
OAR1: BTA3, BTA1
OAR2: BTA8, BTA2
OAR3: BTA11, BTA5
OAR4: BTA4
OAR5: BTA7
OAR6: BTA6
OAR7: BTA10
OAR8: BTA9
OAR9: BTA9, BTA14
OAR10: BTA12
OAR11: BTA19
OAR12: BTA16
OAR13: BTA13
OAR14: BTA18
OAR15: BTA15
OAR16: BTA20
OAR17: BTA17
OAR18: BTA21
OAR19: BTA22
OAR20: BTA23
OAR21: BTA29
OAR22: BTA26
OAR23: BTA24
OAR24: BTA25
OAR25: BTA28
OAR26: BTA27
OARX: BTAX
• 5 chromosome inversions
• 4 chromosome fusions
• 1 split chromosome
The sheep genome assembly
Comprises 454 sequence contigs organised into sheep chromosomes
• 2.5 million contigs
– average length ~480 bases
• ordered and oriented using vsheep framework
– cover ~52% of the sheep genome
– cover ~100% of the unique fraction of the genome
15 sheep contigs – 1 bovine contig
How close is the assembly to the real sheep?
The local order of sheep contigs is heavily dependent on the bovine genome assembly
• Used a light pass of 454 paired-end reads as a quality control check of the assembly
– short paired ends ~30 bp
» ~89% tail-to-tail
– long paired ends ~70-100 bp
» ~98% tail-to-tail
• BAC-end sequences
– ~93% tail-to-tail
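The percentages above are the share of mapped read pairs whose two ends land on the assembly in the expected relative orientation. A minimal sketch of that calculation follows; it assumes each pair has already been given an orientation label by an upstream mapping step, and the function name, label strings and toy data are all hypothetical.

```python
def fraction_consistent(pair_labels, expected="tail-to-tail"):
    """Fraction of mapped read pairs whose orientation label equals `expected`."""
    if not pair_labels:
        raise ValueError("no read pairs supplied")
    consistent = sum(1 for label in pair_labels if label == expected)
    return consistent / len(pair_labels)


# Toy labels, invented for illustration: 9 consistent pairs, 1 inconsistent.
toy_pairs = ["tail-to-tail"] * 9 + ["head-to-tail"]

print(f"{fraction_consistent(toy_pairs):.0%} tail-to-tail")
```

A low fraction would point to local misordering or misorientation of contigs, which is why the metric works as a check on an assembly that borrows its local order from the bovine genome.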
Illumina Infinium OvineSNP50 BeadChip - performance
average MAF = 0.30
> 20,000 animals genotyped
Quality control
59,454 SNPs
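For readers unfamiliar with the MAF figure quoted on this slide, the sketch below shows how a minor allele frequency is computed for one biallelic SNP from genotype counts and then averaged over SNPs. The genotype counts are invented toy values, not data from the OvineSNP50 BeadChip, and the function name is hypothetical.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF of a biallelic SNP from counts of AA, AB and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    # Each AA animal carries two A alleles, each AB animal carries one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of A and B is rarer.
    return min(freq_a, 1 - freq_a)


# Toy genotype counts (AA, AB, BB) for three SNPs, invented for illustration.
toy_snps = [(50, 40, 10), (25, 50, 25), (81, 18, 1)]

mafs = [minor_allele_frequency(*counts) for counts in toy_snps]
average_maf = sum(mafs) / len(mafs)
print(f"average MAF = {average_maf:.2f}")
```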
Future work – simulation results
Simulation  Approach          Read length (bp)  Insert sizes (kb)  Coverage depth*
A           single BAC        100               1                  20X
B           pooled BACs^      100               1+5                23X+5X
C           pooled BACs^      75                0.175              30X
D           seeded assembly`  100               1+5                23X+5X
Figure panels: A B C D
Acknowledgments
• AgResearch NZ
– John McEwan
– Gemma Payne
– Nessa O’Sullivan
– Tracey Van Stijn
– Theresa Wilson
– Rudiger Brauning
– Alan McCulloch
– Russell Smithies
– Benoit Auvray
• University of Otago
– Jo Stanton
– Chrissie
– Mark
• Baylor College of Medicine
– Richard Gibbs
– Donna Muzny
– Michael E. Holder
– Lynne Nazareth
– Rebecca L. Thornton
– Christie Kovar
• CSIRO Livestock Industries
– Brian Dalrymple
– James Kijas
– David Townley
– Abhirami Ratnakumar
– Wes Barris
– Sean McWilliam
• Genesis Faraday
– Chris Warkup
• sheepGENOMICS
– Rob Forage
– Terry Longhurst
• TIGR
– Ewen Kirkness
• Uni Melbourne
– Jill Maddox
• USDA
– Tim Smith
– Curt van Tassell
• UNE
– Hutton Oddy
• Uni Sydney
– Frank Nicholas
– Herman Raadsma
• Utah State University
– Noelle Cockett
www.sheephapmap.org
www.livestockgenomics.csiro.au
www.sheepgenomics.com
isgcdata.agresearch.co.nz