Source: hpc.ilri.cgiar.org/beca/training/AQGG_2016/materials/Rekaya/Chap3_…
5/23/2016
Genome-Wide Association Studies and Genomic Selection
Animal selection
• Selection based on EBV (estimated breeding values) has been very effective
– Substantial increases in the majority of traits and livestock species
– An efficient procedure that is routinely applied with little to no intervention by the user
– Accuracies depend on heritabilities and the amount of information
• For some traits, the results are modest or even in the opposite direction
– Fertility in dairy cattle
– Health traits in poultry
Animal selection
• Accuracies of BV estimation depend on:
– Heritabilities
– Amount of information
– Precision in phenotype measurements
• Genetic progress is limited by
– Accuracies of EBV
– Generation interval
– Cost
How can we improve selection?
• Use of molecular information
– Identify causative major genes
– Identify QTL linked to markers
– Develop gene- or marker-assisted selection
• Very little success!
– Very few major genes were identified
– Only QTL with large effects were identified
• They explain a small portion of the total variation
• Hard to use in commercial applications
Things could be better if
• More markers were available
– This was not possible in the early days of molecular information
• The identification of genes (QTL) was not needed
– Elimination of the mapping step
• Removes the uncertainties
• Removes the complexity of use in a commercial setup
• Accounts even for small QTL
Two Approaches
• Infinitesimal model:
– Assumes that traits are determined by
an infinite number of additive loci, each
with a small effect
– Basic tool of animal breeding
– Spectacularly successful in many cases
Two Approaches
• Finite loci model:
– There is a finite amount of DNA and genes
– Quantify any gene contribution to variation
– Allows for the dissection of the genetic complexity
• increase the accuracy of breeding values estimation
• Increase selection response
Two Approaches
• But how do we find these QTL?
– Candidate gene approach
• Some success
• Number of candidate genes is too large
• Very difficult to find candidates
– Linkage mapping
• Track genome segments from one generation to the next using markers
• QTL are not precisely mapped
• Wide confidence intervals for QTL location
Two approaches
– Fine mapping
• Uses linkage disequilibrium mapping approaches
• Finding the causative mutation is a slow process
• Marker density was a problem
– Result: failure of MAS (marker-assisted selection)
Then …
• The completion of working drafts of the human genome and of the genomes of several livestock species
– Sequencing of every nucleotide in the genome
• Developments in high-throughput technologies
– Hybridization techniques
– Polymorphism genotyping
Polymorphisms
• A specific sequence variation that occurs with a certain frequency in the population
– At least 1%, often 5%
• For example
– Blood type
– CNV (segments of DNA that are found in different numbers of copies among individuals)
– Single Nucleotide Polymorphism (SNP)
[Figure: SNP (Murray 2007)]
SNPs
AGATTTAGATCCCGATAGAG
AGATTTAGATCACGATAGAG
A SNP is a genomic position at which two or more different bases occur in the population, each with a frequency above a threshold (for example, 1%).
In the two sequences above, the alleles are C/A.
SNPs
• Human genome: ~3 billion base pairs
• Bovine genome: ~3 billion base pairs
• Two individuals are 99.5 to 99.9% identical
– They differ in 3–10 M base pairs
• SNPs occur once every ~600 bp (the average human gene is ~27 kb)
• Around 50 SNPs per gene
SNPs
• Currently available SNP marker chips
– 1- 2 M SNPs in human
– 50k -500K SNPs in cattle
– 60K SNPs in chickens and pigs
– ~200K for dogs
– Varying numbers for fish, sheep and plants
– Cost ~ $120- $200 for 60 K chip
Haplotypes
G C T C G A C A A C A G
G T T C G T C A A C A G
SNP haplotypes: C A G and T T G
[Figure: SNP haplotypes (Sorin Istrail, 2009)]
Polymorphisms
• Single nucleotide polymorphisms (SNPs)
• Difference between any two individuals: ~3 × 10^6 SNPs
… ataggtccCtatttcgcgcCgtatacacgggActata …
… ataggtccGtatttcgcgcCgtatacacgggTctata …
… ataggtccCtatttcgcgcCgtatacacgggTctata …
Haplotypes and Genotypes
• Haplotype: SNP alleles on a chromosome
– 0/1 vector: 0 for major allele, 1 for minor
• Genotype: alleles of a SNP on both chromosomes
– a 0/1/2 vector
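Haplotypes and Genotypes
An individual's two haplotypes:
011100110
001000010
Resulting genotype:
021200210
In this example the genotype code is 0 when both haplotypes carry allele 0, 1 when both carry allele 1, and 2 when the two haplotypes differ (heterozygote).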
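The haplotype-to-genotype step can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the first function reproduces the coding of the example above (2 = heterozygote), and the second gives the minor-allele count (0/1/2) that the regression models use later on.

```python
def genotype_from_haplotypes(h1, h2):
    """Combine two 0/1 haplotype strings into a genotype string.

    Coding of the slide example: 0 = both haplotypes carry allele 0,
    1 = both carry allele 1, 2 = the haplotypes differ (heterozygote).
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def minor_allele_count(h1, h2):
    """Alternative coding: number of copies of allele 1 (0, 1 or 2) per SNP."""
    return [int(a) + int(b) for a, b in zip(h1, h2)]


print(genotype_from_haplotypes("011100110", "001000010"))  # 021200210
print(minor_allele_count("011100110", "001000010"))
```

The two codings carry the same information for a single SNP, but only the allele-count version is directly usable as an additive covariate.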
Coding
Genotype   Dominant   Codominant   Recessive
AA         1          2            1
AG         1          1            0
GG         0          0            0
1. Coding depends on the genetic model (there is no unique way to code genotypes).
2. Interpretation of the results depends on the coding procedure.
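As a small illustration of the table above, the three codings can be written as lookup tables. The dictionary and function names are hypothetical; the values are exactly those of the table.

```python
# Lookup tables copied from the coding table (A is the counted allele).
CODING = {
    "dominant":   {"AA": 1, "AG": 1, "GG": 0},
    "codominant": {"AA": 2, "AG": 1, "GG": 0},
    "recessive":  {"AA": 1, "AG": 0, "GG": 0},
}


def code_genotypes(genotypes, model):
    """Translate genotype labels into covariate values for the given genetic model."""
    table = CODING[model]
    # Sort the two letters so that "GA" and "AG" are treated as the same genotype.
    return [table["".join(sorted(g))] for g in genotypes]


animals = ["AA", "GA", "GG", "AG"]
for model in CODING:
    print(model, code_genotypes(animals, model))
```

The same animals therefore yield different covariates, and different effect estimates, depending on the assumed genetic model.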
Genotypes vs Haplotypes
• It depends on the application
– Degrees of freedom vs model parameters
• There is no simple answer (application
dependent)
• If possible, you could try both
• Keep in mind, you are trying to fit a model
(explain the variation in the dependent
variable)
How to use this information?
• Genome-wide association study (GWAS)
– Identify genetic associations with observable traits
– Identify possible causative mutations
• Genomic selection
– Estimate genetic quantities that can be used for selection
– Estimate relationships between individuals
– Paternity checking
• Genomic-enhanced management
– Decision support tool
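Data analysis
• Single marker analysis
– Compare phenotype means across the marker's genotype classes
• Simple, but too many analyses have to be conducted
– False positives/negatives
– High linkage disequilibrium
$$ y_i = \mu + gen_{ij} + e_i $$
where $gen_{ij}$ is the effect of the genotype class of individual i at marker j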
Data analysis
• Example
SNP genotype   Trait (EBV) average
TT             30
GT             20
GG             10
• Using a linear model
– SNP effect estimate = 10
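The estimate of 10 can be reproduced by regressing the class averages on the number of T alleles. A minimal sketch, assuming an additive (allele-count) coding with TT = 2, GT = 1, GG = 0; the function name is hypothetical.

```python
import numpy as np


def snp_effect(allele_count, trait):
    """Least-squares slope and intercept of the trait on the allele count."""
    slope, intercept = np.polyfit(allele_count, trait, 1)
    return slope, intercept


t_count = np.array([2.0, 1.0, 0.0])      # TT, GT, GG
averages = np.array([30.0, 20.0, 10.0])  # trait (EBV) averages from the table
slope, intercept = snp_effect(t_count, averages)
print(round(slope, 6), round(intercept, 6))
```

Each extra copy of T raises the expected trait value by the slope, which is the allele substitution effect.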
Data analysis
• Based on single marker analysis results
– Select only significant SNPs for marker
assisted selection!
– Use all SNPs
• Some will be missed
• False positive/negatives
• A better solution: A model that uses all the
SNPs simultaneously
Multi-marker analysis: Genomic selection
• Use all SNPs in the panel (e.g. 50K) in a single analysis
– Unless at least as many observations as SNPs are available, a fixed-effects model cannot be fitted
– Treating SNP effects as random removes the problem
• Prior information is needed
Data analysis
• Model
$$ y_i = \mu + \sum_{j=1}^{\#\mathrm{SNP}} g_{ij}\,\beta_j + e_i $$
$y_i$ is the trait value (often a pseudo record) for individual i, $\beta_j$ is the effect of SNP j, and $g_{ij}$ is the genotype of SNP j for animal i
• The SNP genotypes are coded as covariables
– Minor allele content (0, 1, 2)
Data analysis
• The resulting system of equations is very dense
– Sparse matrix techniques are of little interest
– Constructing and inverting the coefficient matrix is seldom possible!
• Gauss-Seidel with residual update (GSRU)
– matrix-free BLUP-like estimation procedure
– Low Computational cost
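Data analysis
• Gauss-Seidel
$$ \beta_j^{t+1} = \frac{x_j'\left(y - X_{1:j-1}\,\beta_{1:j-1}^{t+1} - X_{j+1:p}\,\beta_{j+1:p}^{t}\right)}{x_j'x_j + \alpha} $$
• Further, with the current residual
$$ e^{t+1,j} = y - X_{1:j-1}\,\beta_{1:j-1}^{t+1} - X_{j:p}\,\beta_{j:p}^{t} $$
the numerator equals $x_j'e^{t+1,j} + x_j'x_j\,\beta_j^{t}$
Data analysis
• GSRU
$$ \beta_j^{t+1} = \frac{x_j'e^{t+1,j} + x_j'x_j\,\beta_j^{t}}{x_j'x_j + \alpha} $$
• Only vectors are involved in the computation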
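The GSRU update can be sketched as follows. This is an illustration rather than the course implementation: the function name and arguments are hypothetical, the mean is left out (the columns of X are assumed centered), and `alpha` is assumed to be the shrinkage term, typically the variance ratio $\sigma_e^2/\sigma_\beta^2$ in a BLUP-type analysis.

```python
import numpy as np


def gsru(X, y, alpha, n_iter=1000, tol=1e-10):
    """Gauss-Seidel with residual update for (X'X + alpha*I) beta = X'y.

    The coefficient matrix is never built; only columns of X and the
    residual vector e are used.
    """
    n, p = X.shape
    beta = np.zeros(p)
    e = y.astype(float).copy()           # residual y - X @ beta, with beta = 0
    xtx = np.einsum("ij,ij->j", X, X)    # x_j'x_j for every column
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            xj = X[:, j]
            new = (xj @ e + xtx[j] * beta[j]) / (xtx[j] + alpha)
            delta = new - beta[j]
            e -= xj * delta              # residual update keeps e = y - X @ beta
            beta[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:             # stop when a full sweep changes nothing
            break
    return beta
```

Each sweep costs about as much as one product of X with a vector, which is why the method stays feasible when the dense coefficient matrix cannot be stored or inverted.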
Data analysis
• Implementation depends on the assumed prior information
– BLUP type, BayesA, BayesB, BayesC, etc.
• Implementation depends on what you are interested in estimating
– SNP effects
– Genomic EBVs: GBLUP (similar to regular BLUP, except that A is replaced by the genomic relationship matrix)
• In both cases, an estimated genomic breeding value (GBV) is obtained
– Directly: GBLUP
– As the sum of SNP effects
$$ \widehat{GBV}_i = \sum_{j=1}^{\#\mathrm{SNP}} g_{ij}\,\hat\beta_j $$
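Data analysis
BLUP-like approach
• Model
$$ y_i = \mu + \sum_{j=1}^{p} X_{ij}\beta_j + e_i $$
• Likelihood function
$$ y_i \mid \mu, \beta, \sigma_e^2 \sim N\Big(\mu + \sum_{j=1}^{p} X_{ij}\beta_j,\; \sigma_e^2\Big) $$
BLUP
Prior distributions for $\mu$ and $\sigma_e^2$
$$ p(\mu) \sim \text{constant} $$
$$ p(\sigma_e^2 \mid \nu_e, s_e^2) \sim \nu_e s_e^2\,\chi^{-2}_{\nu_e} $$
Prior distributions for $\beta_j$
$$ p(\beta_j \mid \sigma_\beta^2) \sim N(0, \sigma_\beta^2) $$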
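BLUP
Assume $\sigma_\beta^2$ is known. Then the joint posterior distribution is easily obtained as the product of the likelihood function and the prior distributions:
$$ p(\mu,\beta,\sigma_e^2 \mid y) \propto p(y \mid \mu,\beta,\sigma_e^2)\,p(\mu,\beta,\sigma_e^2) $$
Given that
$$ p(\mu,\beta,\sigma_e^2) = p(\mu)\,p(\beta)\,p(\sigma_e^2) $$
Then
$$ p(\mu,\beta,\sigma_e^2 \mid y) \propto p(y \mid \mu,\beta,\sigma_e^2)\,p(\mu)\,p(\beta)\,p(\sigma_e^2) $$
BLUP
For an implementation via the Gibbs sampler, we have to derive the full conditional distributions. For $\mu$:
$$ p(\mu \mid \beta,\sigma_e^2,y) \propto \Big(\frac{1}{2\pi\sigma_e^2}\Big)^{N/2} \exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{N}\Big(y_i-\mu-\sum_{j=1}^{p}X_{ij}\beta_j\Big)^2\Big]\,(\sigma_e^2)^{-(\nu_e/2+1)}\exp\Big(-\frac{S_0^2}{2\sigma_e^2}\Big)\prod_{i=1}^{p}(2\pi\sigma_\beta^2)^{-1/2}\exp\Big(-\frac{(\beta_i-0)^2}{2\sigma_\beta^2}\Big) $$
$$ \propto \Big(\frac{1}{2\pi\sigma_e^2}\Big)^{N/2} \exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{N}\Big(y_i-\mu-\sum_{j=1}^{p}X_{ij}\beta_j\Big)^2\Big] $$
where $S_0^2 = \nu_e s_e^2$ is the scale of the prior for $\sigma_e^2$.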
BLUP
Conditional distribution of $\mu$. Dropping every factor that does not depend on $\mu$, and writing $\hat\mu = \frac{1}{N}\sum_{i=1}^{N}\big(y_i-\sum_{j=1}^{p}X_{ij}\beta_j\big)$:
$$ p(\mu \mid \beta,\sigma_e^2,y) \propto \exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{N}\Big(y_i-\mu-\sum_{j=1}^{p}X_{ij}\beta_j\Big)^2\Big] $$
$$ \propto \exp\Big[-\frac{1}{2\sigma_e^2}\Big(N\mu^2-2\mu\sum_{i=1}^{N}\Big(y_i-\sum_{j=1}^{p}X_{ij}\beta_j\Big)\Big)\Big] $$
Putting N into the proportionality term,
$$ p(\mu \mid \beta,\sigma_e^2,y) \propto \exp\Big[-\frac{N}{2\sigma_e^2}\big\{\mu^2-2\mu\hat\mu\big\}\Big] $$
Further
Adding and subtracting $\hat\mu^2$ leads to:
$$ p(\mu \mid \beta,\sigma_e^2,y) \propto \exp\Big[-\frac{N}{2\sigma_e^2}\big\{\mu^2-2\mu\hat\mu+\hat\mu^2-\hat\mu^2\big\}\Big] $$
BLUP
Organizing elements within the exponential
$$ p(\mu \mid \beta,\sigma_e^2,y) \propto \exp\Big[-\frac{N}{2\sigma_e^2}(\mu-\hat\mu)^2\Big]\exp\Big(\frac{N}{2\sigma_e^2}\hat\mu^2\Big) $$
Since the second exponential does not depend on $\mu$
$$ p(\mu \mid \beta,\sigma_e^2,y) \propto \exp\Big[-\frac{N}{2\sigma_e^2}(\mu-\hat\mu)^2\Big] $$
BLUP
Hence,
$$ \mu \mid \beta,\sigma_e^2,y \sim N(\hat\mu, V) $$
where
$$ \hat\mu = \frac{\sum_{i=1}^{N}\big(y_i-\sum_{j=1}^{p}X_{ij}\beta_j\big)}{N}, \qquad V = \frac{\sigma_e^2}{N} $$
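BLUP
Conditional distribution of $\beta$
$$ p(\beta \mid \mu,\sigma_e^2,y) = p(\beta_1,\beta_2,\ldots,\beta_p \mid \mu,\sigma_e^2,y) $$
In general it is much easier to work with univariate distributions. Hence, we will derive the conditional distribution for an element “k” of the vector $\beta$
$$ p(\beta_k \mid \mu,\beta_{-k},\sigma_e^2,y) \propto \Big(\frac{1}{2\pi\sigma_e^2}\Big)^{N/2}\exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{N}\Big(y_i-\mu-\sum_{j=1}^{p}X_{ij}\beta_j\Big)^2\Big](2\pi\sigma_\beta^2)^{-1/2}\exp\Big(-\frac{\beta_k^2}{2\sigma_\beta^2}\Big) $$
BLUP
Separating SNP k from the others, with $w_i = y_i-\mu-\sum_{j\neq k}X_{ij}\beta_j$:
$$ p(\beta_k \mid \mu,\beta_{-k},\sigma_e^2,y) \propto \exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{N}(w_i-X_{ik}\beta_k)^2\Big]\exp\Big(-\frac{\beta_k^2}{2\sigma_\beta^2}\Big) $$
Then, in vector notation ($W = \{w_i\}$ and $X_k$ the k-th column of X),
$$ p(\beta_k \mid \mu,\beta_{-k},\sigma_e^2,y) \propto \exp\Big[-\frac{1}{2\sigma_e^2}(W-X_k\beta_k)'(W-X_k\beta_k)\Big]\exp\Big(-\frac{\beta_k^2}{2\sigma_\beta^2}\Big) $$
$$ \propto \exp\Big[-\frac{1}{2\sigma_e^2}\Big\{W'W-2\beta_kX_k'W+\beta_k^2\Big(X_k'X_k+\frac{\sigma_e^2}{\sigma_\beta^2}\Big)\Big\}\Big] $$
Further
BLUP
$$ p(\beta_k \mid \mu,\beta_{-k},\sigma_e^2,y) \propto \exp\Big[-\frac{X_k'X_k+\sigma_e^2/\sigma_\beta^2}{2\sigma_e^2}\Big\{\beta_k^2-2\beta_k\Big(X_k'X_k+\frac{\sigma_e^2}{\sigma_\beta^2}\Big)^{-1}X_k'W\Big\}\Big] $$
Then, finally,
$$ \beta_k \mid \mu,\beta_{-k},\sigma_e^2,y \sim N\Big(\Big[X_k'X_k+\frac{\sigma_e^2}{\sigma_\beta^2}\Big]^{-1}X_k'W,\; \frac{\sigma_e^2}{X_k'X_k+\sigma_e^2/\sigma_\beta^2}\Big) $$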
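BLUP
Conditional distribution of $\sigma_e^2$
$$ p(\sigma_e^2 \mid \mu,\beta,y) \propto \Big(\frac{1}{2\pi\sigma_e^2}\Big)^{N/2}\exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{N}\Big(y_i-\mu-\sum_{j=1}^{p}X_{ij}\beta_j\Big)^2\Big](\sigma_e^2)^{-(\nu_e/2+1)}\exp\Big(-\frac{S_0^2}{2\sigma_e^2}\Big) $$
$$ \propto (\sigma_e^2)^{-\left(\frac{N+\nu_e}{2}+1\right)}\exp\Big[-\frac{1}{2\sigma_e^2}\big(e'e+S_0^2\big)\Big] $$
where
$$ e_i = y_i-\mu-\sum_{j=1}^{p}X_{ij}\beta_j $$
Then
$$ \sigma_e^2 \mid \mu,\beta,y \sim (e'e+S_0^2)\,\chi^{-2}_{N+\nu_e} $$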
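The three full conditionals (normal for $\mu$, normal for each $\beta_k$, scaled inverted chi-square for $\sigma_e^2$) can be put together in a single-site Gibbs sampler. The sketch below assumes $\sigma_\beta^2$ is known, as in the derivation; the function name, the default prior values `nu_e` and `s0`, and the chain lengths are hypothetical choices for illustration.

```python
import numpy as np


def gibbs_blup(y, X, sigma_b2, nu_e=4.0, s0=1.0, n_iter=2000, burn_in=500, seed=0):
    """Single-site Gibbs sampler for y = mu + X beta + e with known sigma_b2.

    Returns posterior means ordered as [mu, beta_1..beta_p, sigma_e2].
    s0 plays the role of the prior scale S_0^2 (an assumed value).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    mu, beta, sigma_e2 = y.mean(), np.zeros(p), y.var()
    e = y - mu - X @ beta                 # current residuals
    xtx = np.einsum("ij,ij->j", X, X)     # X_k'X_k for every SNP
    kept = []
    for it in range(n_iter):
        # mu | rest ~ N(mean of (y - X beta), sigma_e2 / N)
        e += mu
        mu = rng.normal(e.mean(), np.sqrt(sigma_e2 / n))
        e -= mu
        # beta_k | rest ~ N(X_k'W / c, sigma_e2 / c), c = X_k'X_k + sigma_e2/sigma_b2
        for k in range(p):
            xk = X[:, k]
            e += xk * beta[k]             # e is now W for SNP k
            c = xtx[k] + sigma_e2 / sigma_b2
            beta[k] = rng.normal(xk @ e / c, np.sqrt(sigma_e2 / c))
            e -= xk * beta[k]
        # sigma_e2 | rest ~ (e'e + S0) * inverted chi-square with N + nu_e df
        sigma_e2 = (e @ e + s0) / rng.chisquare(n + nu_e)
        if it >= burn_in:
            kept.append(np.concatenate(([mu], beta, [sigma_e2])))
    return np.mean(kept, axis=0)
```

Keeping the residual vector up to date, as in GSRU, means that each draw of $\beta_k$ needs only one column of X.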
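BLUP
$\sigma_\beta^2$ was assumed known, but what is $\sigma_\beta^2$?
If locus j is a random sample of all possible loci then
$$ \sigma_\beta^2 = var(\beta_j) $$
where $\beta_j$ is the random variable
However, our model is:
$$ g_i = X_i\beta $$
where $\beta$ is constant for all individuals. So the genetic variation is due to the genotypes!
BLUP
Genetic variance (assuming LE, linkage equilibrium)
$$ V_a = \sum_{i=1}^{p} 2p_iq_i\,\beta_i^2 $$
Where $p_i$ and $q_i$ are the allele frequencies at SNP i
Let
$$ V_i = 2p_iq_i, \qquad U_i = \beta_i^2 $$
Then
$$ V_a = \sum_{i=1}^{p} V_iU_i $$
BLUP
Covariance between V and U
$$ COV(V,U) = E\big[(V-E(V))(U-E(U))\big] = E(UV)-E(V)\,E(U) = \frac{\sum_{i=1}^{p}V_iU_i}{p}-\Big(\frac{\sum_{i=1}^{p}V_i}{p}\Big)\Big(\frac{\sum_{i=1}^{p}U_i}{p}\Big) $$
Then, the genetic variance could be re-written as:
$$ V_a = p\,COV(V,U)+\Big(\sum_{i=1}^{p}2p_iq_i\Big)\Big(\frac{\sum_{i=1}^{p}\beta_i^2}{p}\Big) $$
BLUP
Let
$$ \sigma_\beta^2 = \frac{\sum_{i=1}^{p}\beta_i^2}{p} $$
Then
$$ V_a = p\,COV(V,U)+\Big(\sum_{i=1}^{p}2p_iq_i\Big)\sigma_\beta^2 $$
and
$$ \sigma_\beta^2 = \frac{V_a-p\,COV(V,U)}{\sum_{i=1}^{p}2p_iq_i} $$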
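The decomposition can be checked numerically. In this sketch the allele frequencies and SNP effects are simulated (made-up values, not data from the course), and the function name is hypothetical.

```python
import numpy as np


def sigma_beta2_from_va(freq, beta):
    """Recover sigma_beta^2 from V_a, COV(V, U) and the sum of 2*p_i*q_i."""
    p = len(beta)
    V = 2.0 * freq * (1.0 - freq)      # V_i = 2 p_i q_i
    U = beta ** 2                      # U_i = beta_i^2
    va = np.sum(V * U)                 # genetic variance under LE
    cov_vu = np.mean(V * U) - V.mean() * U.mean()
    sigma_b2 = (va - p * cov_vu) / V.sum()
    return va, cov_vu, sigma_b2


rng = np.random.default_rng(42)
freq = rng.uniform(0.05, 0.5, size=1000)   # simulated allele frequencies
beta = rng.normal(0.0, 0.1, size=1000)     # simulated SNP effects
va, cov_vu, sigma_b2 = sigma_beta2_from_va(freq, beta)
print(np.isclose(sigma_b2, np.mean(beta ** 2)))  # True
```

When frequencies and effects are unrelated, COV(V, U) is close to zero and the expression reduces to $V_a / \sum 2p_iq_i$.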
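BLUP
Assume that $\sigma_\beta^2$ is unknown; then we have to specify a prior
$$ p(\sigma_\beta^2 \mid \nu_\beta, s_\beta^2) \sim \nu_\beta s_\beta^2\,\chi^{-2}_{\nu_\beta} $$
With $S_\beta^2 = \nu_\beta s_\beta^2$ and dropping the factors that do not involve $\sigma_\beta^2$:
$$ p(\sigma_\beta^2 \mid \mu,\beta,\sigma_e^2,y) \propto \prod_{i=1}^{p}(2\pi\sigma_\beta^2)^{-1/2}\exp\Big(-\frac{(\beta_i-0)^2}{2\sigma_\beta^2}\Big)\,(\sigma_\beta^2)^{-(\nu_\beta/2+1)}\exp\Big(-\frac{S_\beta^2}{2\sigma_\beta^2}\Big) $$
$$ \propto (\sigma_\beta^2)^{-\left(\frac{p+\nu_\beta}{2}+1\right)}\exp\Big[-\frac{1}{2\sigma_\beta^2}\big(\beta'\beta+S_\beta^2\big)\Big] $$
Conditional distribution
$$ \sigma_\beta^2 \mid \mu,\beta,\sigma_e^2,y \sim (\beta'\beta+S_\beta^2)\,\chi^{-2}_{p+\nu_\beta} $$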
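Bayes-A
• Model
$$ y_i = \mu + \sum_{j=1}^{p} X_{ij}\beta_j + e_i $$
• Likelihood function
$$ y_i \mid \mu, \beta, \sigma_e^2 \sim N\Big(\mu + \sum_{j=1}^{p} X_{ij}\beta_j,\; \sigma_e^2\Big) $$
Bayes-A
Prior distributions for $\mu$ and $\sigma_e^2$
$$ p(\mu) \sim \text{constant} $$
$$ p(\sigma_e^2 \mid \nu_e, s_e^2) \sim \nu_e s_e^2\,\chi^{-2}_{\nu_e} $$
Prior distributions for $\beta_j$
$$ p(\beta_j \mid \sigma_{\beta_j}^2) \sim N(0, \sigma_{\beta_j}^2) $$
Prior distributions for $\sigma_{\beta_j}^2$
$$ p(\sigma_{\beta_j}^2 \mid \nu_{\beta_j}, s_{\beta_j}^2) \sim \nu_{\beta_j} s_{\beta_j}^2\,\chi^{-2}_{\nu_{\beta_j}} $$
Bayes-A
Joint posterior distribution
$$ p(\mu,\beta,\sigma_e^2,\sigma_{\beta_1}^2,\ldots,\sigma_{\beta_p}^2 \mid y) \propto \Big(\frac{1}{2\pi\sigma_e^2}\Big)^{N/2}\exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{N}\Big(y_i-\mu-\sum_{j=1}^{p}X_{ij}\beta_j\Big)^2\Big](\sigma_e^2)^{-(\nu_e/2+1)}\exp\Big(-\frac{S_0^2}{2\sigma_e^2}\Big)\prod_{i=1}^{p}(2\pi\sigma_{\beta_i}^2)^{-1/2}\exp\Big(-\frac{\beta_i^2}{2\sigma_{\beta_i}^2}\Big)\prod_{i=1}^{p}(\sigma_{\beta_i}^2)^{-(\nu_{\beta_i}/2+1)}\exp\Big(-\frac{S_{\beta_i}^2}{2\sigma_{\beta_i}^2}\Big) $$
Conditional distributions
For $\mu$, $\beta$ and $\sigma_e^2$, the same conditional distributions as with the BLUP-type model (with $\sigma_{\beta_j}^2$ in place of $\sigma_\beta^2$)
- Normal distributions for position parameters
- Scaled inverted chi-square for the residual variance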
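Bayes-A
Conditional distribution of $\sigma_{\beta_j}^2$. Keeping only the factors of the joint posterior that involve $\sigma_{\beta_j}^2$:
$$ p(\sigma_{\beta_j}^2 \mid \mu,\beta,\sigma_e^2,\sigma_{\beta_i,\,i\neq j}^2,y) \propto (2\pi\sigma_{\beta_j}^2)^{-1/2}\exp\Big(-\frac{\beta_j^2}{2\sigma_{\beta_j}^2}\Big)\,(\sigma_{\beta_j}^2)^{-(\nu_{\beta_j}/2+1)}\exp\Big(-\frac{S_{\beta_j}^2}{2\sigma_{\beta_j}^2}\Big) $$
$$ \propto (\sigma_{\beta_j}^2)^{-\left(\frac{\nu_{\beta_j}+1}{2}+1\right)}\exp\Big(-\frac{\beta_j^2+S_{\beta_j}^2}{2\sigma_{\beta_j}^2}\Big) $$
So,
$$ \sigma_{\beta_j}^2 \mid \mu,\beta,\sigma_e^2,\sigma_{\beta_i,\,i\neq j}^2,y \sim (\beta_j^2+S_{\beta_j}^2)\,\chi^{-2}_{\nu_{\beta_j}+1} $$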
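Bayes-B
• Model
$$ y_i = \mu + \sum_{j=1}^{p} X_{ij}\beta_j + e_i $$
• Likelihood function
$$ y_i \mid \mu, \beta, \sigma_e^2 \sim N\Big(\mu + \sum_{j=1}^{p} X_{ij}\beta_j,\; \sigma_e^2\Big) $$
Bayes-B
Prior distributions for $\mu$ and $\sigma_e^2$
$$ p(\mu) \sim \text{constant} $$
$$ p(\sigma_e^2 \mid \nu_e, s_e^2) \sim \nu_e s_e^2\,\chi^{-2}_{\nu_e} $$
Prior distributions for $\beta_j$
$$ p(\beta_j \mid \sigma_{\beta_j}^2) \begin{cases} = 0 & \text{with probability } \pi \\ \sim N(0,\sigma_{\beta_j}^2) & \text{with probability } (1-\pi) \end{cases} $$
Prior distributions for $\sigma_{\beta_j}^2$
$$ p(\sigma_{\beta_j}^2 \mid \nu_{\beta_j}, s_{\beta_j}^2) \sim \nu_{\beta_j} s_{\beta_j}^2\,\chi^{-2}_{\nu_{\beta_j}} $$
Bayes-B
• Full conditional distributions
– Same as with the normal prior for $\mu$ and $\sigma_e^2$
• Normal for the mean
• Scaled inverted chi-square for the residual variance
• Conditional distributions for $\beta_j$ and $\sigma_{\beta_j}^2$
• Not in closed form: we cannot use the Gibbs sampler!
$$ p(\beta_j,\sigma_{\beta_j}^2 \mid \mu,\beta_{i\neq j},\sigma_e^2,y) = p(\sigma_{\beta_j}^2 \mid \mu,\beta_{i\neq j},\sigma_e^2,y)\;p(\beta_j \mid \sigma_{\beta_j}^2,\mu,\beta_{i\neq j},\sigma_e^2,y) $$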
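Bayes-B
• If the prior is used as proposal distribution
• Then the acceptance probability for a candidate value $\sigma_{\beta_j,cand}^2$ is
$$ \alpha(\sigma_{\beta_j}^2,\sigma_{\beta_j,cand}^2) = \min\Big\{1,\; \frac{p(y \mid \sigma_{\beta_j,cand}^2,\mu,\beta_{i\neq j},\sigma_e^2)\,p(\sigma_{\beta_j,cand}^2)\,p(\sigma_{\beta_j}^2)}{p(y \mid \sigma_{\beta_j}^2,\mu,\beta_{i\neq j},\sigma_e^2)\,p(\sigma_{\beta_j}^2)\,p(\sigma_{\beta_j,cand}^2)}\Big\} $$
$$ = \min\Big\{1,\; \frac{p(y \mid \sigma_{\beta_j,cand}^2,\mu,\beta_{i\neq j},\sigma_e^2)}{p(y \mid \sigma_{\beta_j}^2,\mu,\beta_{i\neq j},\sigma_e^2)}\Big\} $$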
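Bayes-B
• Conditional distribution of $\beta_j$
– Given $\sigma_{\beta_j}^2$
$$ \beta_j \mid \sigma_{\beta_j}^2,\mu,\beta_{-j},\sigma_e^2,y \sim N\Big(\Big[X_j'X_j+\frac{\sigma_e^2}{\sigma_{\beta_j}^2}\Big]^{-1}X_j'W,\; \frac{\sigma_e^2}{X_j'X_j+\sigma_e^2/\sigma_{\beta_j}^2}\Big) $$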
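Bayes-B
• A better re-parametrization
– Create a dummy variable $\lambda_i$ taking values of 1 and 0 with probability $(1-\pi)$ and $\pi$ for each SNP
• Assuming $\pi$ is known
$$ p(\lambda_i \mid \pi) \sim \pi^{(1-\lambda_i)}(1-\pi)^{\lambda_i} $$
Bayes-B
• Given $\lambda$
$$ p(\beta_j \mid \lambda_j,\sigma_{\beta_j}^2) \begin{cases} \sim N(0,\sigma_{\beta_j}^2) & \text{if } \lambda_j = 1 \\ = 0 & \text{if } \lambda_j = 0 \end{cases} $$
• Conditional distribution of $\lambda_j$
$$ p(\lambda_j \mid y,\mu,\beta,\sigma_{\beta_j}^2,\sigma_e^2,\pi) = \frac{p(y \mid \lambda_j,\mu,\beta,\sigma_{\beta_j}^2,\sigma_e^2)\,p(\lambda_j \mid \pi)}{p(y \mid \lambda_j=1,\mu,\beta,\sigma_{\beta_j}^2,\sigma_e^2)(1-\pi)+p(y \mid \lambda_j=0,\mu,\beta,\sigma_{\beta_j}^2,\sigma_e^2)(\pi)} $$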
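Bayes-B
• Assuming $\pi$ unknown
$$ p(\pi) \sim U[0,1] $$
• Posteriors of SNP effects and variances remain the same
• We need to derive the conditional distribution of $\pi$
Bayes-B
• Conditional of $\pi$
$$ p(\pi \mid \mu,\beta,\sigma_{\beta_j\,(j=1,\ldots,p)}^2,\sigma_e^2,y) \propto \pi^{(p-r)}(1-\pi)^{r} $$
Where r is the number of SNPs with indicator variable equal to 1.
• Thus,
$$ \pi \mid \mu,\beta,\sigma_{\beta_j\,(j=1,\ldots,p)}^2,\sigma_e^2,y \sim Beta(p-r+1,\; r+1) $$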
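Drawing $\pi$ from this Beta conditional is a one-line step inside the sampler. A minimal sketch, with a hypothetical function name and a made-up indicator vector:

```python
import numpy as np


def sample_pi(indicators, rng):
    """Draw pi from Beta(p - r + 1, r + 1), r = number of indicators equal to 1."""
    indicators = np.asarray(indicators)
    p = indicators.size
    r = int(indicators.sum())
    return rng.beta(p - r + 1, r + 1)


rng = np.random.default_rng(0)
indicators = [1] * 5 + [0] * 95      # 5 of 100 SNPs currently in the model
print(sample_pi(indicators, rng))    # a value close to 0.94 is expected
```

With few SNPs in the model, the draws concentrate near 1, so most effects stay at zero in the next round.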
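Bayes-C
• Model
$$ y_i = \mu + \sum_{j=1}^{p} X_{ij}\beta_j + e_i $$
• Likelihood function
$$ y_i \mid \mu, \beta, \sigma_e^2 \sim N\Big(\mu + \sum_{j=1}^{p} X_{ij}\beta_j,\; \sigma_e^2\Big) $$
Bayes-C
Prior distributions for $\mu$ and $\sigma_e^2$
$$ p(\mu) \sim \text{constant} $$
$$ p(\sigma_e^2 \mid \nu_e, s_e^2) \sim \nu_e s_e^2\,\chi^{-2}_{\nu_e} $$
Prior distributions for $\beta_j$
$$ p(\beta_j \mid \sigma_\beta^2) \begin{cases} = 0 & \text{with probability } \pi \\ \sim N(0,\sigma_\beta^2) & \text{with probability } (1-\pi) \end{cases} $$
Prior distribution for $\sigma_\beta^2$ (common to all SNPs)
$$ p(\sigma_\beta^2 \mid \nu_\beta, s_\beta^2) \sim \nu_\beta s_\beta^2\,\chi^{-2}_{\nu_\beta} $$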
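Bayes-C
• It is similar to attaching a dummy variable $\lambda_i$, taking values of 1 and 0 with probability $(1-\pi)$ and $\pi$, such that:
$$ p(\beta_j \mid \lambda_j,\sigma_\beta^2) \begin{cases} \sim N(0,\sigma_\beta^2) & \text{if } \lambda_j = 1 \\ = 0 & \text{if } \lambda_j = 0 \end{cases} $$
Thus, $\lambda_i \sim Bernoulli(1-\pi)$
and
$$ p(\pi) \sim U[0,1] $$
Bayes-C
• Conditional distributions:
– Given $\lambda = \{\lambda_1,\lambda_2,\ldots,\lambda_p\}$
• Normal for position parameters
• Scaled inverted chi-square for the residual variance
• Conditional distribution of $\lambda_j$
$$ p(\lambda_j \mid y,\mu,\beta,\sigma_\beta^2,\sigma_e^2,\pi) = \frac{p(y \mid \lambda_j,\mu,\beta,\sigma_\beta^2,\sigma_e^2)\,p(\lambda_j \mid \pi)}{p(y \mid \lambda_j=1,\mu,\beta,\sigma_\beta^2,\sigma_e^2)(1-\pi)+p(y \mid \lambda_j=0,\mu,\beta,\sigma_\beta^2,\sigma_e^2)(\pi)} $$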
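Bayes-C
Conditional distribution of $\sigma_\beta^2$. With $S_\beta^2 = \nu_\beta s_\beta^2$ and the product taken over the $p'$ SNPs with non-zero effect:
$$ p(\sigma_\beta^2 \mid \mu,\beta,\sigma_e^2,y) \propto \prod_{i=1}^{p'}(2\pi\sigma_\beta^2)^{-1/2}\exp\Big(-\frac{\beta_i^2}{2\sigma_\beta^2}\Big)\,(\sigma_\beta^2)^{-(\nu_\beta/2+1)}\exp\Big(-\frac{S_\beta^2}{2\sigma_\beta^2}\Big) $$
$$ \propto (\sigma_\beta^2)^{-\left(\frac{p'+\nu_\beta}{2}+1\right)}\exp\Big[-\frac{1}{2\sigma_\beta^2}\big(\beta'\beta+S_\beta^2\big)\Big] $$
Conditional distribution
$$ \sigma_\beta^2 \mid \mu,\beta,\sigma_e^2,y \sim (\beta'\beta+S_\beta^2)\,\chi^{-2}_{p'+\nu_\beta} $$
Where p’ is the number of SNPs with non-zero effect
Bayes-C
Conditional distribution of $\pi$
$$ p(\pi \mid \mu,\beta,\sigma_\beta^2,\sigma_e^2,y) \propto \pi^{(p-r)}(1-\pi)^{r} $$
Where r is the number of SNPs with indicator variable equal to 1.
$$ \pi \mid \mu,\beta,\sigma_\beta^2,\sigma_e^2,y \sim Beta(p-r+1,\; r+1) $$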
Non-Genotyped animals
• Multiple-step procedures
– Use only genotyped and phenotyped individuals
• Can we include non-genotyped animals?
– Impute missing genotypes
– Use expected relationships
• Produce genomic breeding values for all animals
Non-Genotyped animals
$$ y_i = \mu + \sum_{j=1}^{p} X_{ij}\beta_j + e_i $$
Non-Genotyped animals
• For non-genotyped animals
– The X matrix is unknown
– Can it be imputed?
• In matrix notation
$$ y = \mathbf{1}\mu + \begin{bmatrix} Z_1 & 0 \\ 0 & Z_2 \end{bmatrix}\begin{bmatrix} g_1 \\ g_2 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} $$
– Where g1 and g2 are the BV for genotyped and non-genotyped animals
Non-Genotyped animals
• Let,
$$ g_1 = M_1\beta, \qquad g_2 = M_2\beta $$
– Where M1 and M2 (unknown) are the matrices of marker genotypes for genotyped and non-genotyped animals and $\beta$ is the vector of SNP effects
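Non-genotyped animals
• Using some properties of the multivariate normal distribution
$$ p(u_1,u_2 \mid A,\sigma_u^2) \sim N\Big(\begin{bmatrix} 0 \\ 0 \end{bmatrix},\; A\sigma_u^2\Big) $$
$$ p(u_1,u_2 \mid A,\sigma_u^2) = p(u_1 \mid A,\sigma_u^2)\;p(u_2 \mid u_1,A,\sigma_u^2) $$
• where
$$ p(u_1 \mid A,\sigma_u^2) \sim N(0,\; A_{11}\sigma_u^2) $$
$$ p(u_2 \mid u_1,A,\sigma_u^2) \sim N\big(A_{21}A_{11}^{-1}u_1,\; [A_{22}-A_{21}A_{11}^{-1}A_{12}]\sigma_u^2\big) $$
Non-genotyped animals
• Replacing u1 by the marker effects ($u_1 = M_1\beta$)
$$ p(u_2 \mid \beta,A,\sigma_u^2) \sim N\big(A_{21}A_{11}^{-1}M_1\beta,\; [A_{22}-A_{21}A_{11}^{-1}A_{12}]\sigma_u^2\big) $$
• Thus,
$$ y = \mathbf{1}\mu + \begin{bmatrix} Z_1 & 0 \\ 0 & Z_2 \end{bmatrix}\begin{bmatrix} M_1\beta \\ A_{21}A_{11}^{-1}M_1\beta+\varepsilon \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} $$
• Where
$$ \varepsilon \mid A,\sigma_u^2 \sim N\big(0,\; [A_{22}-A_{21}A_{11}^{-1}A_{12}]\sigma_u^2\big) $$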
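Non-genotyped animals
• Further,
$$ y = \mathbf{1}\mu + \begin{bmatrix} Z_1 & 0 \\ 0 & Z_2 \end{bmatrix}\begin{bmatrix} M_1\beta \\ \hat M_2\beta+\varepsilon \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \qquad \hat M_2 = A_{21}A_{11}^{-1}M_1 $$
• Let
$$ W = \begin{bmatrix} W_1 \\ W_2 \end{bmatrix} = \begin{bmatrix} Z_1M_1 \\ Z_2\hat M_2 \end{bmatrix}, \qquad U = \begin{bmatrix} 0 \\ Z_2 \end{bmatrix} $$
Non-genotyped animals
• Model
$$ y = \mathbf{1}\mu + W\beta + U\varepsilon + e $$
• In matrix notation,
$$ \begin{bmatrix} \mathbf{1}'\mathbf{1} & \mathbf{1}'W & \mathbf{1}'U \\ W'\mathbf{1} & W'W+I\frac{\sigma_e^2}{\sigma_\beta^2} & W'U \\ U'\mathbf{1} & U'W & U'U+(A_{22}-A_{21}A_{11}^{-1}A_{12})^{-1}\frac{\sigma_e^2}{\sigma_\varepsilon^2} \end{bmatrix}\begin{bmatrix} \mu \\ \beta \\ \varepsilon \end{bmatrix} = \begin{bmatrix} \mathbf{1}'y \\ W'y \\ U'y \end{bmatrix} $$
where $\sigma_\varepsilon^2$ is the variance that scales $\varepsilon$ ($\sigma_u^2$ in the distribution of $\varepsilon$).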
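Non-genotyped animals
• Re-parametrization using $A^{-1}$
$$ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \quad\text{and}\quad A^{-1} = \begin{bmatrix} A^{11} & A^{12} \\ A^{21} & A^{22} \end{bmatrix} $$
• It is easy to prove that:
$$ [A_{22}-A_{21}A_{11}^{-1}A_{12}]^{-1} = A^{22} $$
$$ A_{21}A_{11}^{-1} = -(A^{22})^{-1}A^{21} $$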
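Non-genotyped animals
• In matrix notation using $A^{-1}$,
$$ \begin{bmatrix} \mathbf{1}'\mathbf{1} & \mathbf{1}'W & \mathbf{1}'U \\ W'\mathbf{1} & W'W+I\frac{\sigma_e^2}{\sigma_\beta^2} & W'U \\ U'\mathbf{1} & U'W & U'U+A^{22}\frac{\sigma_e^2}{\sigma_\varepsilon^2} \end{bmatrix}\begin{bmatrix} \mu \\ \beta \\ \varepsilon \end{bmatrix} = \begin{bmatrix} \mathbf{1}'y \\ W'y \\ U'y \end{bmatrix} $$
• If # genotyped animals > # of non-genotyped animals, then use $A^{-1}$ for calculating the conditional mean
• If # genotyped animals < # of non-genotyped animals, then use A for calculating the conditional mean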
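Non-genotyped animals
$$ g = \begin{bmatrix} g_1 \\ g_2 \end{bmatrix} = \begin{bmatrix} M_1 \\ \hat M_2 \end{bmatrix}\beta + \begin{bmatrix} 0 \\ \varepsilon \end{bmatrix} $$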
Genomic “animal” model
• For selection purposes, we are interested
in animal effects rather than SNP effects
• For a 50K panel and a few thousand animals, the system of equations is huge
• With the availability of even higher-density panels, the situation could get worse
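Genomic “animal” model
• Model
$$ y_i = \mu + \sum_{j=1}^{p} X_{ij}\beta_j + e_i $$
• In matrix notation
$$ y = \mathbf{1}\mu + \sum_{j=1}^{p}(X_j\beta_j) + e $$
Genomic “animal” model
• Model
$$ y = \mathbf{1}\mu + Zu + e $$
With:
$$ u = \sum_{j=1}^{p}(X_j\beta_j), \qquad Z = I $$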
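Genomic “animal” model
• MME
$$ \begin{bmatrix} \mathbf{1}'\mathbf{1} & \mathbf{1}'Z \\ Z'\mathbf{1} & Z'Z+G^{-1}\sigma_e^2 \end{bmatrix}\begin{bmatrix} \hat\mu \\ \hat u \end{bmatrix} = \begin{bmatrix} \mathbf{1}'y \\ Z'y \end{bmatrix} $$
• With Z = I
$$ \begin{bmatrix} N & \mathbf{1}' \\ \mathbf{1} & I+G^{-1}\sigma_e^2 \end{bmatrix}\begin{bmatrix} \hat\mu \\ \hat u \end{bmatrix} = \begin{bmatrix} \mathbf{1}'y \\ y \end{bmatrix} $$
where
$$ var(u) = G $$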
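Genomic “animal” model
• MME
$$ \begin{bmatrix} N & \mathbf{1}' \\ \mathbf{1} & I+\big[var\big(\sum_{j=1}^{p}X_j\beta_j\big)\big]^{-1}\sigma_e^2 \end{bmatrix}\begin{bmatrix} \hat\mu \\ \sum_{j=1}^{p}X_j\hat\beta_j \end{bmatrix} = \begin{bmatrix} \mathbf{1}'y \\ y \end{bmatrix} $$
$$ var(u) = var\Big(\sum_{j=1}^{p}X_j\beta_j\Big) = G $$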
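Genomic “animal” model
• The G matrix
$$ var(u) = var\Big(\sum_{j=1}^{p}X_j\beta_j\Big) = \sum_{j=1}^{p}X_j\,var(\beta_j)\,X_j' = \sum_{j=1}^{p}X_jX_j'\,\sigma_{\beta_j}^2 $$
where
$$ \sigma_{\beta_j}^2 = var(\beta_j) $$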
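The link between the SNP model and the genomic “animal” model can be illustrated numerically. The sketch below assumes a common variance for all SNPs, so that $G = XX'\sigma_\beta^2$, and it needs G to be invertible, which is why the genotypes are left uncentered here. The function names, variance values and simulated data are hypothetical.

```python
import numpy as np


def snp_blup(X, y, sigma_e2, sigma_b2):
    """Solve the MME for mu and the SNP effects; return mu and u = X beta."""
    n, p = X.shape
    one = np.ones((n, 1))
    lhs = np.block([[one.T @ one, one.T @ X],
                    [X.T @ one, X.T @ X + np.eye(p) * sigma_e2 / sigma_b2]])
    rhs = np.concatenate([one.T @ y, X.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    return sol[0], X @ sol[1:]


def gblup(X, y, sigma_e2, sigma_b2):
    """Solve the genomic animal-model MME with Z = I and G = X X' sigma_b2."""
    n = X.shape[0]
    G = X @ X.T * sigma_b2
    one = np.ones((n, 1))
    lhs = np.block([[np.array([[float(n)]]), one.T],
                    [one, np.eye(n) + np.linalg.inv(G) * sigma_e2]])
    rhs = np.concatenate([[y.sum()], y])
    sol = np.linalg.solve(lhs, rhs)
    return sol[0], sol[1:]


rng = np.random.default_rng(5)
X = rng.integers(0, 3, size=(8, 40)).astype(float)   # 8 animals, 40 SNPs (0/1/2)
y = rng.normal(size=8)
mu_s, u_s = snp_blup(X, y, sigma_e2=1.0, sigma_b2=0.01)
mu_g, u_g = gblup(X, y, sigma_e2=1.0, sigma_b2=0.01)
print(np.allclose(u_s, u_g, atol=1e-6))  # True
```

The animal-model system has one equation per animal instead of one per SNP, which is the practical argument for GBLUP when the panel is much larger than the number of genotyped animals.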
Genomic “animal” model
• Only a few animals are genotyped
– A few hundred
• Extensive phenotyping
– Several millions
• “Pseudo” phenotypes
– Estimates
– Not observed
– Accuracies
Genomic “animal” model
• Can we combine phenotypes and genomic information in a single analysis?
• Make use of both sources of information
• Remove the need for “pseudo” records
• Works well for regular genetic evaluation, but not necessarily for other applications
• Decision-making tools
• Causative mutation identification
Genomic “animal” model
• Model
$$ y = wb + Zu + e $$
• Further,
$$ u = [u_1' \;\; u_2']' $$
Where y is a vector of “raw” phenotypes, and u1 and u2 are the vectors of genetic merit of non-genotyped and genotyped animals, respectively.
Genomic “animal” model
• If no genomic information is available
$$ p(u \mid A,\sigma_u^2) = p(u_1,u_2 \mid A,\sigma_u^2) \sim N(0,\; A\sigma_u^2) $$
• Further,
$$ var(u_1,u_2 \mid A,\sigma_u^2) = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\sigma_u^2 $$
Genomic “animal” model
• If the animals in u2 have been genotyped
$$ p(u \mid A,\sigma_u^2) = p(u_1,u_2 \mid A,\sigma_u^2) \sim N(0,\; H\sigma_u^2) $$
$$ var(u_1,u_2 \mid A,G,\sigma_u^2) = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & G \end{bmatrix}\sigma_u^2 $$
- Naive approach
- The inverse of the matrix could be hard to obtain