Genomic Selection in Tomato Breeding David Francis The Ohio State University (francis.77”at”osu.edu)
Genome-Wide Approaches: Examples from the OSU Processing Program
Phenotype Data
• Distributions
• ANOVA
• Partitioning Variation (heritability)
• BLUPs
Structure
• Q Matrix (PCA)
Genotype Data
• Marker Matrix
Association Analysis to establish marker-trait linkage (Fixed effect)
Estimate Breeding Value (Random effect)
Kinship matrix
Marker analysis using “The Unified Mixed Model”
Y = μ + REPy + Qw + Mα + Zv + Error
• Qw: Structure
• Mα: Marker Matrix (SNPs or GBS)
• Zv: Kinship (marker or pedigree)
Marker-Trait Association Model
y = μ + Mα + ε
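As a minimal sketch of the marker-trait association model for a single marker, the fit below estimates μ and α by ordinary least squares. It assumes markers are coded -1/0/1, as in the marker matrix elsewhere in this deck; the function name and the toy data are hypothetical and not from the OSU program.

```python
def single_marker_fit(marker, phenotype):
    """Least-squares fit of y = mu + m * alpha + e for one marker column."""
    n = len(marker)
    m_bar = sum(marker) / n
    y_bar = sum(phenotype) / n
    # Slope = covariance(marker, phenotype) / variance(marker)
    sxy = sum((m - m_bar) * (y - y_bar) for m, y in zip(marker, phenotype))
    sxx = sum((m - m_bar) ** 2 for m in marker)
    alpha = sxy / sxx
    mu = y_bar - alpha * m_bar
    return mu, alpha


# Hypothetical data: eight lines scored at one marker (-1/0/1 coding).
marker = [1, 0, 1, 1, 0, -1, -1, 0]
phenotype = [12.0, 10.0, 12.0, 12.0, 10.0, 8.0, 8.0, 10.0]
mu, alpha = single_marker_fit(marker, phenotype)
print(mu, alpha)
```

In a real analysis each marker would be tested in turn, with structure and kinship terms added as in the unified mixed model.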
Source        DF       Expected MS
Genotypes     N-1      σ² + bσ²(G)
Marker        1        σ² + b[σ²(GQTL) + 4r(1-c)g²] + n(1 - 2c)²g²
Gen(marker)   N-2      σ² + b[σ²(GQTL) + 4r(1-c)g²]
Error         N(b-1)   σ²
Where:
b is the number of replicates
c is the recombination fraction separating the marker from the QTL
n is a coefficient related to the population size
g is the genetic effect (in BC populations, additive and dominance effects are confounded)
σ²(GQTL) is the part of the error variance that cannot be explained by the QTL
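The Genotypes and Error rows of the expected mean squares table imply that the genotypic variance can be estimated as (MS Genotypes - MS Error) / b. The sketch below works through that arithmetic. The entry-mean heritability formula is a common convention assumed here rather than something stated on the slides, and the function names and mean-square values are hypothetical.

```python
def variance_components(ms_genotypes, ms_error, b):
    """Method-of-moments estimates from the Genotypes and Error mean squares."""
    sigma2_error = ms_error
    # E[MS genotypes] = sigma2 + b * sigma2_G, so solve for sigma2_G.
    # Negative estimates are truncated at zero.
    sigma2_g = max((ms_genotypes - ms_error) / b, 0.0)
    return sigma2_g, sigma2_error


def entry_mean_heritability(sigma2_g, sigma2_error, b):
    """Assumed convention: heritability on an entry-mean basis over b replicates."""
    return sigma2_g / (sigma2_g + sigma2_error / b)


# Hypothetical mean squares from a trial with b = 2 replicates.
sigma2_g, sigma2_e = variance_components(ms_genotypes=10.0, ms_error=2.0, b=2)
h2 = entry_mean_heritability(sigma2_g, sigma2_e, b=2)
print(sigma2_g, sigma2_e, h2)
```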
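F-test: the expected Marker mean square exceeds the expected Gen(marker) mean square by n(1 - 2c)²g².
Significance of marker-trait associations therefore depends on the population size (n), the recombination distance between marker and QTL (c), and the genetic effect of the QTL (g²).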
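To make the dependence on n, c, and g concrete, the sketch below evaluates the term n(1 - 2c)²g² over a range of recombination fractions. The function name and the input values are hypothetical; n is treated simply as the coefficient described above.

```python
def marker_signal(n, c, g):
    """n * (1 - 2c)^2 * g^2: the part of the Marker mean square due to the QTL."""
    return n * (1 - 2 * c) ** 2 * g ** 2


# The term is largest when the marker sits on the QTL (c = 0) and
# vanishes when marker and QTL are unlinked (c = 0.5).
for c in (0.0, 0.1, 0.25, 0.5):
    print(c, marker_signal(n=100, c=c, g=1.0))
```

Doubling g quadruples the term, while loose linkage erodes it quickly, which is why well-spaced markers close to the QTL matter.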
X1[1:5, 1:8]
   M1  M2  M3  M4  M5  M6  M7  M8
1   1   0   0   1  -1   0   0   1
2   0   0   0  -1   1   0   1   1
3   1   0   0   1  -1   0   1   1
4   1   1   1   0   0   0   0   1
5   0  -1  -1   0   0   0   0   1
[96 x 384] x [384 x 1] = [96 x 1]
[G x M] x [M] = Prediction for 96 Genotypes
• Predict performance
• Selection [keep those at K = 1 (~15% of the population)]
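The prediction step is a matrix-vector product of the genotype matrix with the vector of marker effects. The sketch below uses the five-line, eight-marker excerpt of the marker matrix in place of the full 96 x 384 matrix. The marker effects are hypothetical, and reading "K = 1" as "at least one standard deviation above the mean prediction" is an assumption made for illustration.

```python
from statistics import mean, pstdev


def predict(genotypes, effects):
    """[G x M] x [M]: one predicted value per genotype."""
    return [sum(g * a for g, a in zip(row, effects)) for row in genotypes]


def select(predictions, k=1.0):
    """Indices of genotypes whose standardized prediction is at least k."""
    mu = mean(predictions)
    sd = pstdev(predictions)
    if sd == 0:
        return []  # no variation among predictions, nothing to select
    return [i for i, p in enumerate(predictions) if (p - mu) / sd >= k]


# Rows 1-5, markers M1-M8 of the marker matrix (coded -1/0/1).
genotypes = [
    [1, 0, 0, 1, -1, 0, 0, 1],
    [0, 0, 0, -1, 1, 0, 1, 1],
    [1, 0, 0, 1, -1, 0, 1, 1],
    [1, 1, 1, 0, 0, 0, 0, 1],
    [0, -1, -1, 0, 0, 0, 0, 1],
]
# Hypothetical marker effects; real values would come from a fitted model.
effects = [0.5, 0.2, -0.1, 0.3, -0.4, 0.0, 0.1, 0.05]

predictions = predict(genotypes, effects)
selected = select(predictions, k=1.0)
print(predictions, selected)
```

Under a normal distribution, a one-standard-deviation cutoff retains roughly 16% of candidates, which is in line with the ~15% quoted on the slide.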
First Example: Disease Resistance (Bacterial Spot)
Debora Menicos (Liabeuf)
Examine contrasting approaches:
• “Cornell School”: many markers; impute missing marker data; optimize the statistical model through lengthy analysis and simulation
• “Minnesota School”: a few hundred markers well spaced across the genome; RR or Bayesian approaches work equally well (differences are slight)
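Taking "RR" to mean ridge regression (an assumption, since the slides do not expand the abbreviation), marker effects can be estimated by solving (M'M + λI)α = M'y. The sketch below does this in plain Python for a small marker set. The function name, the toy data, and the value of λ are hypothetical, and y is assumed to be centered already.

```python
def ridge_marker_effects(M, y, lam):
    """Solve (M'M + lam * I) a = M'y for marker effects a."""
    p = len(M[0])
    # Build the ridge normal equations A a = rhs.
    A = [[sum(row[i] * row[j] for row in M) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    rhs = [sum(row[i] * yi for row, yi in zip(M, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for k in range(col, p):
                A[r][k] -= f * A[col][k]
            rhs[r] -= f * rhs[col]
    # Back substitution.
    a = [0.0] * p
    for i in reversed(range(p)):
        s = sum(A[i][k] * a[k] for k in range(i + 1, p))
        a[i] = (rhs[i] - s) / A[i][i]
    return a


# Hypothetical example: three lines, two markers.
M = [[1, 1], [1, 0], [0, 1]]
y = [3.0, 1.0, 2.0]
effects = ridge_marker_effects(M, y, lam=1.0)
print(effects)
```

The penalty λ shrinks all marker effects toward zero, which is what lets the model be fitted when markers outnumber lines.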
[Figure: linkage maps of Chromosome 1, Chromosome 2, and Chromosome 3 showing SolCAP SNP markers (solcap_snp_sl_* and related marker names) with their map positions]
Optimized set(s) of 384 SNPs for processing and fresh-market germplasm based on PIC and distribution in the genome
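Polymorphism information content (PIC) for a biallelic SNP can be computed from allele frequencies with the standard formula PIC = 1 - (p² + q²) - 2p²q². The sketch below applies it to one marker column. The -1/0/1 coding with 0 as the heterozygote is an assumption, and the function names and genotype calls are hypothetical.

```python
def allele_frequency(codes):
    """Frequency of the allele coded +1, assuming -1/0/1 genotype coding."""
    return (sum(codes) / len(codes) + 1.0) / 2.0


def pic_biallelic(p):
    """PIC for a two-allele marker with allele frequencies p and q = 1 - p."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


# Hypothetical genotype calls for one SNP across eight lines.
codes = [1, 1, -1, -1, 1, -1, 1, -1]
p = allele_frequency(codes)
print(p, pic_biallelic(p))
```

PIC peaks at 0.375 for a biallelic marker with p = 0.5, so ranking SNPs by PIC favors markers with balanced allele frequencies in the target germplasm.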
Resistance to X. euvesicatoria (T1) and X. perforans (T3)
• Race non-specific QTL
• Race specific
Currently: testing whether additive models can be improved by incorporating non-additive effects (does combining them give better prediction?)
A note on marker numbers: GS models for bacterial spot resistance (values are rg/rp)
Model                                            Location 1   Location 2   Across Locations
Phenotypic Selection                             -            -            -
384 markers, Random model                        0.81         0.36         0.6
15 linked markers, Random model                  1.02         0.89         0.96
Mixed Model, Full marker set (Linked = fixed)    1.11         0.91         1.01
Second Example: Yield and quality traits
• Predicting inbreds
• Predicting hybrids
Training Populations (Genotype and Phenotype)
• SolCAP (inbreds)
• Nested RIL (inbreds)
Hybrids
• Predict genotype from inbred data
• Predict performance using GW model(s) developed for inbreds
• Compare prediction with actual performance
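The hybrid steps above can be sketched as follows: the expected marker score of an F1 hybrid is the average of its two inbred parents' scores, and the inbred-trained marker effects are then applied to that predicted genotype. This assumes -1/0/1 coding with 0 as the heterozygote and a purely additive model; the function names, parent genotypes, and marker effects are hypothetical.

```python
def predict_hybrid_genotype(parent_a, parent_b):
    """Expected F1 marker scores: the mean of the two parental scores."""
    return [(a + b) / 2 for a, b in zip(parent_a, parent_b)]


def predict_value(genotype, effects):
    """Additive prediction: sum of marker score times marker effect."""
    return sum(g * a for g, a in zip(genotype, effects))


# Hypothetical inbred parents scored at three markers.
parent_a = [1, -1, 1]
parent_b = [1, 1, -1]
# Hypothetical marker effects estimated from inbred training data.
effects = [0.5, 0.2, -0.1]

hybrid = predict_hybrid_genotype(parent_a, parent_b)
hybrid_value = predict_value(hybrid, effects)
print(hybrid, hybrid_value)
```

Because heterozygous loci score 0 under this coding, an additive model cannot capture dominance in hybrids, which is one reason to compare predictions against actual hybrid performance.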
Prediction in tomato breeding populations
1) Unstructured collection: 140 advanced inbred lines (SolCAP collection); 7,700 SNPs
2) A nested RIL: A x B; C x B; A x D (O x H; O x H; O x CA): 280 progeny; 384 SNPs
Augmented experimental designs (2 years, 2 locations)
Traits: 52 total traits measured (yield, digital phenotyping, and chemical measurements)
Reduced to the 22 most informative (h², PCA, and other methods)
• Yield (total and marketable)
• Color and Color uniformity
• BRIX
• pH
• Vitamin C
• Fruit Size and Shape
[Figure: Proportion of No. 1 tomatoes (y-axis, 0.40 to 0.80) vs. Hue uniformity (x-axis, 0 to 10)]
Predicting inbred-line performance from inbred-line data
Data from SolCAP (140 varieties, 7,700 markers)
[Figure: panels for Yield and Fruit Size]
Predicting inbred-line performance from inbred-line data
Results from nested RIL (280 x 384 markers)
[Figure: panels for Yield and Fruit Size]
Hybrids: How does predicted performance relate to actual performance?
Ripe_kg vs. MktYield: P = 4.42E-05, R² = 0.19958 (r = 0.44)
[Figure: Yield at Fremont (y-axis) vs. Prediction (x-axis); P = 0.03731, R² = 0.06, r = 0.24]
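The r and R² values reported for predicted versus actual hybrid performance are linked: in a simple linear regression, R² is the square of the Pearson correlation. A minimal sketch of that calculation follows; the function name and the data are hypothetical and are not the trial results.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between predicted and observed values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical predictions and field observations for four hybrids.
predicted = [0.01, 0.02, 0.03, 0.04]
observed = [1.0, 3.0, 2.0, 4.0]

r = pearson_r(predicted, observed)
r_squared = r ** 2
print(r, r_squared)
```

By the same relationship, an R² of 0.06 corresponds to r of about 0.24, matching the values shown for the Fremont yield comparison.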
Selection Estimate
Phenotype 37.6
GS 1.2
GS + PH 4.3
Checks -8.2
[Figure: box plots of Yield, Fruit Size, and BRIX for four selection groups: Ph, check, GS, and Ph_GS]
Similar results for other traits: Fruit Size was modeled, BRIX was not
Current efforts
• Incorporate knowledge of linkage and gene action into models
• Begin to incorporate hybrid data into training population and use new models to predict hybrid performance
• Continue work on Multi-trait index
1) Whole-genome models have predictive capability for individual performance and hybrid performance
2) Use of existing knowledge of gene action and significant associations improves model performance
3) Models with 20-384 markers work well
4) Models are not a replacement; use them in off-season selection and as a supplement to breeder knowledge
Thank you for your time.
Acknowledgments
Collaborators, OSU
Debora Liabeuf
Eka Sari
Eduardo Bernal
Michael Dzakovich
Marcela Carvalho Andrade
Regis de Castro Carvalho
Troy Aldrich
Jihuen Sim
Caleb Orchard
Gabriel Abud
Elisabet Gas Pascal
Heather Merk
Sung-Chur Sim
Matt Robbins
Steve Schwartz
Rachel Kopec
Jessica Cooperstone
Luis Rodrigues-Saona
Sally Miller
Collaborators, CAU
Wencai Yang
Hui Wang
Collaborators, INRA
Mathilde Causse
Collaborators, UIB
Hipolito Medrano
Pep Cifre
Josefina Bota
Miquel Angel Conesa
Collaborators, Industry
Cindy Lawley, Illumina
Martin Ganal, Trait Genetics
Hirzel Canning
Red Gold Canning
Collaborators, Cornell
Walter de Jong
Lucas Mueller
Martha Mutschler
Collaborators, UCD
Allen Van Deynze
Kevin Stoffel
Collaborators, MSU
David Douches
C Robin Buell
John Hamilton
Dan Zarka
Kelly Zarka
Collaborators, UFL
Sam Hutton
Jay Scott