Linkage Analysis
Eric Jorgenson
Epidemiology 217
2/21/12
Dec 18, 2015
Worldwide Distribution of Human Earwax SNP rs17822931
Yoshiura et al., Nature Genetics 2006
Geographic Distribution of PTC Phenotype
Wooding Genetics 2006, adapted from Cavalli-Sforza 1994
[Figure: world map of PTC phenotype frequency; shading legend runs from High to Low]
Bimodal Distribution of PTC
[Figure: histogram titled "PTC Distribution"; x-axis: bins 1 to 14; y-axis: Number of Subjects, 0 to 45]
Your Phenotypes and Genotypes

Sample Name   Taster   Ear wax   rs10246939    rs1726866     rs17822931
                                 (taste SNP)   (taste SNP)   (ear wax SNP)
BU10          Y        D         CT            AG            TT
BU12          N        D         None          None          None
BU14          N        D?        TT            AA            TT
BU15          Y        W         CT            AG            CC
BU17          Y        W         CT            AG            CC
BU19          Y        D         CC            GG            TT
BU20          N        W         TT            AA            CC
BU21          mild     W         TT            AA            CC
BU22          Y        D?        CC            GG            CC
BU23          Y        W         CC            GG            CT
BU24          Y        W?        CT            AG            CC
BU25          Y        W         CT            AG            CC
BU26          Y        W?        CT            AG            CT
BU27          N        W         TT            AA            CC
BU28          Y        W         CT            AG            CT
BU29          Y mild   W         CT            AG            CT
BU30          Y        W         CT            AG            CC

From Joe Wiemels
Types of Genetic Studies
• Family Studies
– Compare trait values across family members
• Linkage Analysis
– Compare trait values with inheritance patterns
• Association
– Compare trait values against genetic variants
Family Studies
• Familial Relationships
– Twins
– Siblings
– Parents/offspring
• Phenotype information
– Affected/Unaffected (Prostate Cancer)
– Quantitative measure (Blood Pressure)
• No Genotype information required
Why do Family Studies?
• Is the trait genetic?
• What is the mode of transmission?
– Dominant
– Recessive
– Additive
– Polygenic (Multiple genes involved)
Mutation and Meiosis
Recessive trait
First PTC Family Study

                                          Number of    Children    Children
Parents                                   Families     Can taste   Can not taste
Both parents can taste                    40           90          16
One parent can taste, the other can not   51           80          37
Neither parent can taste                  9            0           17

L. H. Snyder Science 1931
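One way to read the Snyder table is to compute the fraction of non-taster children for each mating type. If non-tasting is recessive, two non-taster parents should have only non-taster children, which matches the 0 tasters in the last row. The sketch below is a minimal illustration in Python; the dictionary and function names are assumptions made for this example, and the counts are copied from the table.

```python
# Counts copied from the Snyder (1931) table.
# Each entry: mating type -> (families, children who can taste, children who cannot)
snyder = {
    "both parents can taste": (40, 90, 16),
    "one parent can taste": (51, 80, 37),
    "neither parent can taste": (9, 0, 17),
}


def nontaster_fraction(tasters, nontasters):
    """Fraction of children who cannot taste PTC."""
    return nontasters / (tasters + nontasters)


# Fraction of non-taster children for each mating type.
fractions = {
    mating: nontaster_fraction(tasters, nontasters)
    for mating, (_families, tasters, nontasters) in snyder.items()
}

for mating, fraction in fractions.items():
    print(f"{mating}: {fraction:.1%} non-taster children")
```

The fraction of non-taster children rises as the number of non-taster parents goes from zero to two, which is the pattern a recessive trait predicts.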
Linkage Analysis
• Narrow down position of disease gene
• No biological knowledge needed
• Genetic markers (not disease gene)
• Recombination
Recombination
[Figure: pedigree tracking alleles A/a and B/b at two loci from parents to four possible offspring]
Recombination
[Figure: two-locus cross (alleles A/a and B/b); the four offspring classes are labeled R, NR, NR, R (R = recombinant, NR = non-recombinant)]
Independent Assortment
[Figure: two-locus cross (alleles A/a and B/b); the four offspring classes occur at 25%, 25%, 25%, 25%]
No recombination
[Figure: two-locus cross (alleles A/a and B/b); the four offspring classes occur at 0%, 50%, 50%, 0%]
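The two extremes (independent assortment and no recombination) are the endpoints of one relationship: with recombination fraction θ, each of the two recombinant classes is expected at θ/2 and each of the two non-recombinant classes at (1 - θ)/2. A minimal Python sketch of that relationship follows; the function name is hypothetical, and the class order R, NR, NR, R is assumed from the slide labels.

```python
def offspring_class_probabilities(theta):
    """Expected frequencies of the four offspring classes (R, NR, NR, R)
    for a given recombination fraction theta."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    recombinant = theta / 2            # each of the two recombinant classes
    nonrecombinant = (1 - theta) / 2   # each of the two non-recombinant classes
    return [recombinant, nonrecombinant, nonrecombinant, recombinant]


# theta = 0.5 corresponds to independent assortment.
independent = offspring_class_probabilities(0.5)

# theta = 0 corresponds to no recombination.
complete_linkage = offspring_class_probabilities(0.0)

print(independent)
print(complete_linkage)
```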
Recombination Fraction
[Figure: two-locus cross (alleles A/a and B/b); offspring counts in the four classes: 61, 420, 442, 77]
Recombination Fraction
[Figure: offspring classes with counts 61, 420, 442, 77]
Recombination Fraction = Recombinants / Total
= (61 + 77) / (61 + 77 + 442 + 420) = 138 / 1000
= 13.8%
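The same arithmetic as a short Python sketch. The variable and function names are illustrative; the counts are the ones on the slide, with 61 and 77 taken as the recombinant classes.

```python
# Offspring counts from the slide.
recombinant_counts = [61, 77]
nonrecombinant_counts = [442, 420]


def recombination_fraction(recombinants, nonrecombinants):
    """Recombinant offspring divided by all offspring."""
    n_recombinant = sum(recombinants)
    n_total = n_recombinant + sum(nonrecombinants)
    return n_recombinant / n_total


theta_hat = recombination_fraction(recombinant_counts, nonrecombinant_counts)
print(f"Recombination fraction = {theta_hat:.1%}")
```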
Linkage
• Recombination fraction < 50%
• Two traits: PTC and KELL blood group
• Two genetic markers
• One trait and one genetic marker
Linkage Analysis
Human Linkage Analysis
• RFLP Markers for Linkage (1980)
• Huntington’s Disease Linkage (1983)
• Cystic Fibrosis Linkage (1985)
• Cystic Fibrosis Gene (1989)
• Huntington’s Disease Gene (1993)
Genomewide Linkage Analysis
[Figure: genetic markers and genes along the genome; θ = 10% on average]
Linkage Analysis
• LOD score based on recombination
• $$\mathrm{LOD}(\theta) = \log_{10} \frac{\theta^{R} (1 - \theta)^{NR}}{(1/2)^{R + NR}}$$
– R = number of recombinants, NR = number of non-recombinants
Dominant Trait
[Figure: pedigree for a dominant trait showing disease genotypes (D/d) and marker alleles (1, 2, 3)]
Linkage Analysis
[Figure: pedigree with marker alleles (1, 2, 3); the three offspring are scored NR, R, NR]
LOD score
$$\mathrm{LOD}(\theta) = \log_{10} \frac{\theta^{1} (1 - \theta)^{2}}{(1/2)^{1 + 2}}$$

θ      LOD Score
0.01   -1.11
0.05   -0.44
0.1    -0.19
0.2    0.01
0.3    0.07
0.4    0.06
0.5    0.00
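The table can be reproduced from the LOD formula with 1 recombinant and 2 non-recombinants. The sketch below assumes a base-10 logarithm, which matches the tabulated values; the function name `lod_score` is chosen for this illustration.

```python
import math


def lod_score(theta, recombinants, nonrecombinants):
    """Base-10 log of the likelihood at theta relative to theta = 1/2."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    likelihood_linked = theta ** recombinants * (1 - theta) ** nonrecombinants
    likelihood_unlinked = 0.5 ** (recombinants + nonrecombinants)
    return math.log10(likelihood_linked / likelihood_unlinked)


# Evaluate the pedigree from the slide: 1 recombinant, 2 non-recombinants.
thetas = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
lod_table = {theta: lod_score(theta, 1, 2) for theta in thetas}

for theta, lod in lod_table.items():
    print(f"{theta:<5} {lod:.2f}")
```

With only three informative offspring the LOD score stays close to zero for every θ, so a single small family gives little evidence for or against linkage.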
IBD
• Identity by descent
• Allele Sharing methods
• Often used for affected sib pairs
Identity By Descent
[Figure: cross of two A/a parents; the four possible offspring allele combinations each occur at 25%]
Identity By Descent
[Figure: alleles shared IBD with Parent 1 for the four possible offspring: 1, 1, 1, 1]
Identity By Descent
[Figure: alleles shared IBD with Parent 1 for the four possible offspring: 1, 1, 1, 1]

Alleles IBD   Frequency
2             0%
1             100%
0             0%
Identity By Descent
[Figure: alleles shared IBD with Sibling 1 for the four possible siblings: 2, 1, 1, 0]
Identity By Descent
[Figure: alleles shared IBD with Sibling 1 for the four possible siblings: 2, 1, 1, 0]

Alleles IBD   Frequency
2             25%
1             50%
0             25%
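The 25% / 50% / 25% sibling distribution can be checked by listing every equally likely pair of siblings. In the sketch below the four parental alleles get distinct labels (F1, F2, M1, M2) so that descent can be tracked; those labels and the function name are assumptions made for this illustration.

```python
from collections import Counter
from itertools import product

# Distinct labels for the four parental alleles so descent can be tracked.
father = ("F1", "F2")
mother = ("M1", "M2")

# Each child receives one paternal and one maternal allele; all four
# combinations are equally likely.
children = list(product(father, mother))


def alleles_shared_ibd(child_a, child_b):
    """Count alleles (0, 1, or 2) two siblings inherited from the same parental copy."""
    return sum(a == b for a, b in zip(child_a, child_b))


# Enumerate all 16 equally likely sibling pairs.
counts = Counter(alleles_shared_ibd(a, b) for a, b in product(children, repeat=2))
total_pairs = sum(counts.values())
ibd_distribution = {k: counts[k] / total_pairs for k in (2, 1, 0)}

# Expected proportion of alleles shared IBD (out of 2 per sibling).
expected_sharing = sum(k * p for k, p in ibd_distribution.items()) / 2

print(ibd_distribution)
print(expected_sharing)
```

The expected sharing of 0.5 is the 50% baseline against which allele sharing in sib pairs is compared.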
Identity By Descent
• IBD can be used for linkage analysis
• Expect 50% alleles shared between siblings
• Look for IBD > 50% for concordant pairs
• Look for IBD < 50% for discordant pairs
PTC Linkage Analysis
[Figure: LOD score plot; x-axis: Location in the Genome; y-axis: LOD Score, 0 to 9]
Chromosome 7
[Figure: LOD score along chromosome 7; x-axis: Location (cM), 0 to 285; y-axis: LOD Score, 0 to 9]
PTC Linkage Analysis
[Figure: human chromosomes, with TAS2R38 marked]
Fine Mapping
[Figure: linkage markers and genes in the linked region]
Kim et al. Science 2003
Linkage Disequilibrium
[Figure sequence: haplotypes at two loci (alleles A/a and B/b) followed through successive generations over time]
Linkage Disequilibrium Mapping
[Figure: genetic markers and genes]
PTC Linkage Disequilibrium Mapping
Kim et al. Science 2003
TAS2R38 Receptor Structure
Kim et al. J Dent Res 2004
rs713598
rs1726866
rs10246939
3 SNPs in the TAS2R38 Gene
Possible haplotypes: PAV, AVI, PAI, AAV, PVI, PVV, AAI, AVV
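Three SNPs with two amino acid variants each give 2 × 2 × 2 = 8 possible haplotypes, which is the list on this slide. A minimal Python sketch of the enumeration follows; pairing each rs number with its amino acid position follows the order in which the slide lists the SNPs and should be read as an assumption.

```python
from itertools import product

# Amino acid alternatives at each SNP, in the order listed on the slide.
# The pairing of rs number to position is assumed from that order.
snp_alleles = [
    ("rs713598", "PA"),
    ("rs1726866", "AV"),
    ("rs10246939", "VI"),
]

# Build every combination of one amino acid per SNP.
haplotypes = ["".join(combo) for combo in product(*(alleles for _, alleles in snp_alleles))]

print(len(haplotypes))
print(haplotypes)
```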
TAS2R38 Diplotype and PTC Score
[Figure: histogram; x-axis: Raw PTC Score, 1 to 14; y-axis: Number of Subjects, 0 to 16; series: PAV/PAV, PAV/AVI, AAV/AVI, AVI/AVI]
Kim et al. Science 2003
Confirm Mode of Inheritance

                                          Number of    Children    Children
Parents                                   Families     Can taste   Can not taste
Both parents can taste                    40           90          16
One parent can taste, the other can not   51           80          37
Neither parent can taste                  9            0           17

L. H. Snyder Science 1931
Chromosome 7
[Figure: LOD score along chromosome 7; x-axis: Location (cM), 0 to 285; y-axis: LOD Score, 0 to 9; series: Unadjusted, Adjusted]
Explain Linkage Signal
Geographic Distribution of PTC Phenotype
Wooding Genetics 2006, adapted from Cavalli-Sforza 1994
Geographic Distribution of PTC Haplotypes
Table 4. Frequency of PTC haplotypes in populations worldwide

Haplotype   European    West Asian   East Asian   African    S.W. Native American
            (n = 200)   (n = 22)     (n = 54)     (n = 24)   (n = 18)
AVI         0.47        0.67         0.31         0.25
AAV         0.03                                  0.04
AAI                                               0.17
PAV         0.49        0.33         0.69         0.50       1.00
PVI                                               0.04

Kim et al. Science 2003
Diplotype and PTC Score
[Figure: histogram; x-axis: Raw PTC Score, 1 to 14; y-axis: Number of Subjects, 0 to 16; series: PAV/PAV, PAV/AVI, AAV/AVI, AVI/AVI]
Kim et al. Science 2003
3 SNPs form 3 Haplotypes

Haplotype   Phenotype
PAV         Taster
AVI         Non-taster
AAV         Rare
Comparing Diplotypes

Diplotype   Mean PTC Score
PAV/AVI     8.81
AAV/AVI     7.00
AVI/AVI     1.86
Predicted Effect of the 3 SNPs

Amino Acid Substitution   PAM250
P --> A                   1
A --> V                   0
V --> I                   4
TAS2R38 Haplotype Function
[Figure: dose-response curves; x-axis: PTC concentration (μM), 0.1 to 1000, log scale; y-axis: Ratio PTC/SST, 0 to 1.2; series: PAV, PAI, PVV, PVI, AAV, AAI, AVV, AVI]
PTC Diplotype and Taste
Sandell and Breslin Current Biology 2006
Next Week
• Next Generation Sequencing
Appendix: Phase Unknown Linkage
Phase Unknown
[Figure: dominant-trait pedigree with disease genotypes (D/d) and marker alleles (1, 2, 3); the phase is marked "?"]
Phase Unknown
[Figure: pedigree with marker alleles (1, 2, 3); recombinant status of the three offspring is marked "?"]
What if we don’t know phase?
• Calculate the LOD score for each possible phase
• Average the two scores (add them and divide by 2)
Phase Unknown
$$\mathrm{LOD}(\theta) = \tfrac{1}{2} \log_{10} \frac{\theta^{1} (1 - \theta)^{2}}{(1/2)^{1 + 2}} + \tfrac{1}{2} \log_{10} \frac{\theta^{2} (1 - \theta)^{1}}{(1/2)^{2 + 1}}$$
≈ -0.01 for θ = 0.44
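The phase-unknown calculation can be sketched in Python by following the rule on the slides: compute the LOD score under each phase (1 recombinant and 2 non-recombinants, or 2 recombinants and 1 non-recombinant) and average the two. The base-10 logarithm is an assumption carried over from the phase-known LOD table, and the function names are hypothetical.

```python
import math


def lod_score(theta, recombinants, nonrecombinants):
    """Base-10 log of the likelihood at theta relative to theta = 1/2."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    likelihood_linked = theta ** recombinants * (1 - theta) ** nonrecombinants
    likelihood_unlinked = 0.5 ** (recombinants + nonrecombinants)
    return math.log10(likelihood_linked / likelihood_unlinked)


def phase_unknown_lod(theta, count_a, count_b):
    """Average of the LOD scores under the two possible phases.

    Under one phase count_a offspring are recombinant and count_b are not;
    under the other phase the roles are swapped.
    """
    lod_phase_1 = lod_score(theta, count_a, count_b)
    lod_phase_2 = lod_score(theta, count_b, count_a)
    return 0.5 * lod_phase_1 + 0.5 * lod_phase_2


# The pedigree from the appendix, evaluated at the theta used on the slide.
value = phase_unknown_lod(0.44, 1, 2)
print(f"{value:.4f}")
```

The averaged score is symmetric in the two counts and equals zero at θ = 0.5, as expected when neither phase can be favored.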