IBD sharing: Theory and applications in the Ashkenazi Jewish population
Shai Carmi
Pe’er lab, Columbia University
Mt. Sinai, NY, March 2014
Feb 24, 2016
About Me
• 2006-2008: Empirical network analysis (computational)
• 2007-2010: Diffusion/navigation in random networks (theory)
• 2010-2011: Anomalous diffusion (theory)
• 2008-2011: RNA splicing and editing (computational/experimental)
• 2012-2014: Population genetics, with Itsik Pe’er
Outline
• IBD Sharing: Introduction
• Ashkenazi Jewish Genetics
• Demographic inference
• Imputation
• Future Directions & Summary
Identical-by-Descent (IBD) Sharing
[Figure: individuals A and B carry a shared segment inherited from a common ancestor; label g.]
Definition: A segment is shared IBD if it is inherited from a single recent common ancestor.
What’s “recent”?
• Textbook/Pedigrees: MRCA more recent than a given time (Thompson, Genetics, 2013)
• In practice:
  o A segment is IBD if it is longer than a cutoff
  o Allow small differences
  o Present methods can detect segments > ≈1cM
When is the Common Ancestor “recent”?
[Figure: genealogy in a population with N=10 and g=7; axis: Time (generations), ending at the Present.]
Why is IBD Useful?
• Segments are rare but long
  o Probability of a site to be shared
  o Segment length
Applications
• A segment indicates recent co-ancestry:
  o Disease mapping
  o Pedigree reconstruction
  o Detecting natural selection
  o Demographic (historical) inference
• Identical sequence across individuals:
  o Phasing
  o Imputation
  o Estimating heritability
  o Estimating genotyping error rate
Browning and Browning, Annu. Rev. Genet., 2012
IBD Sharing Theory
• Model:
  o A population with constant effective size N
  o A minimal segment length m
  o Two chromosomes of length L
• The fraction of the chromosome in shared segments?
• The number of shared segments?
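Both quantities can be read off directly once the chromosome is partitioned into segments. The sketch below is only an illustration: the function name `sharing_summary` and the toy segment lengths are hypothetical, and lengths are assumed to be in Morgans.

```python
from typing import Sequence, Tuple


def sharing_summary(segment_lengths: Sequence[float], m: float, L: float) -> Tuple[float, int]:
    """Return (f_T, n_T) for one pair of chromosomes.

    f_T is the fraction of the chromosome (length L) lying in segments
    longer than the cutoff m; n_T is the number of such segments.
    """
    if L <= 0:
        raise ValueError("chromosome length L must be positive")
    shared = [length for length in segment_lengths if length > m]
    return sum(shared) / L, len(shared)


# Toy example (made-up values): ten consecutive segments, three above the cutoff.
lengths = [0.05, 0.002, 0.004, 0.001, 0.03, 0.003, 0.002, 0.005, 0.02, 0.003]
f_T, n_T = sharing_summary(lengths, m=0.01, L=sum(lengths))
print(f_T, n_T)
```

Only segments above the cutoff contribute, which mirrors the practical definition of IBD by a minimal length.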
The IBD Process along the Chromosome
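[Figure: the chromosome (coordinate 0 to L) is divided into consecutive segments ℓ1, ..., ℓ10 with coalescence times t1, ..., t10; a horizontal line marks the cutoff m.]
In the example shown: f_T = (ℓ1 + ℓ5 + ℓ9)/L; n_T = 3
Coalescent theory: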
Given :
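For orientation, the standard coalescent relations for a constant-size population can be written as below. This is a hedged sketch, not necessarily the notation used on the slide. It assumes that N counts haploid genomes (pairwise coalescence probability 1/N per generation; substitute 2N for a diploid size), that t is measured in generations and ℓ in Morgans, and that ℓ is the length of the segment spanning a fixed site.

```latex
\begin{align}
P(t) &= \frac{1}{N}\, e^{-t/N}, \\
P(\ell \mid t) &= (2t)^2\, \ell\, e^{-2t\ell}, \\
P(\ell > m \mid t) &= (1 + 2tm)\, e^{-2tm}.
\end{align}
```

The second line follows because the two lineages are separated by 2t meioses, so the segment extends an exponentially distributed distance with rate 2t on each side of the site, and the sum of the two is Erlang-distributed.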
Sample Results
• The avg. fraction of the chr. in shared segments:
• The avg. number of shared segments:
• Implicit expressions for the distributions
Palamara et al., AJHG, 2012; Carmi et al., Genetics, 2013; Carmi and Pe’er, arXiv, 2014
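As an illustration of what such a result looks like, the sketch below evaluates one closed form for the average shared fraction and checks it by numerical integration. It is a derivation under stated assumptions, not a transcription of the cited papers: N counts haploid genomes (coalescence time exponential with mean N generations), m is in Morgans, and the segment spanning a site with coalescence time t exceeds m with probability (1 + 2tm)exp(-2tm). The function names are hypothetical.

```python
import math


def expected_fraction_shared(N: float, m: float) -> float:
    """Closed form for <f_T> under the assumptions stated above."""
    x = m * N
    return (1.0 + 4.0 * x) / (1.0 + 2.0 * x) ** 2


def expected_fraction_numeric(N: float, m: float, steps: int = 200_000) -> float:
    """Midpoint-rule integral of P(t) * P(segment > m | t) over t."""
    t_max = 40.0 * N  # the integrand is negligible beyond this point
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        density = math.exp(-t / N) / N
        longer_than_m = (1.0 + 2.0 * t * m) * math.exp(-2.0 * t * m)
        total += density * longer_than_m * dt
    return total


closed = expected_fraction_shared(N=1000, m=0.01)
numeric = expected_fraction_numeric(N=1000, m=0.01)
print(closed, numeric)
```

The two values should agree closely, which is a quick sanity check on the algebra before using the closed form for inference.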
Founder Populations
[Figure: panels "Founder population" and "Non-founder population"; axes: Time, Population size; labels: Disease alleles, B.]
Founder Populations
Recent successes:
• Greece (Tachmazidou et al., Nat. Comm., 2013)
• Finland (Kurki et al., PLoS Genet., 2014)
• Iceland (deCODE) (many papers; most recently Steinthorsdottir et al., Nat. Genet., 2014; Grarup, PLoS Genet., 2013)
A Brief History of Ashkenazi Jews
• Unclear origin
• Ca. 1000: Small communities in Northern France, Rhineland
• Migration east
• Expansion
• Migration to US and Israel
• ≈10M today
• Relative isolation
Ashkenazi Jewish (AJ) Genetics
Behar et al., Nature, 2010
Bray et al., PNAS, 2010
Guha et al., Genome Biol, 2012
Behar et al., Hum. Biol., 2014
Price et al., PLoS Genet., 2008
Olshen et al., BMC Genet, 2008
Need et al., Genome Biol, 2009
Kopelman et al., BMC Genet, 2009
Atzmon et al., AJHG, 2010
[Figure: populations shown: AJ; Jewish, non-AJ; Middle-East; Europe.]
AJ Genetics: Interim Summary
• Current large population (≈10M)
• IBD analysis: bottleneck of effective size ≈300 (later)
• Mendelian disorders, high frequency risk alleles
• Insight on both European & Middle-Eastern past
• No genealogies
The Ashkenazi Genome Consortium
NY area labs interested in specific diseases
Quantify utility in medical genetics
Learn about population history
Phase I: 128 whole genomes (CG; completed)
Phase II: ≈300 whole genomes (NYGC; under way)
Large genotyped cohorts
Impute
Sequencing Statistics
Statistic                    Per genome (exome)
SNVs                         3.4M (22k)
Novel SNVs                   3.8% (4.1%)
Het/hom ratio                1.65 (1.67)
Insertions                   220k (242)
Deletions                    235k (223)
Multi-nucleotide variants    83k (374)
Synonymous SNVs              10,536
Non-synonymous SNVs          9706
Nonsense SNVs                72
Other disrupting             255
CNVs                         302
SVs                          1480
MEIS                         4090
Results Highlights
• Low false positive rate at ≈5,000 per genome
• 50% more novel variants per genome in AJ (compared to non-Jewish Europeans)
• More genetic diversity in AJ (θ), but less projected for large samples
• More AJ-specific variants compared to EU-specific variants
• A model for EU-Middle-East-AJ ancient history
• A model for AJ recent history
• The panel is necessary for screening clinical AJ genomes
• Catalog of mutations in known AJ disease genes
• Slightly higher mutation burden in AJ
• The panel is useful for imputation
S. C. et al., submitted
A Model for Ancient History
A Simple Approach
• Model:
  o A constant effective population size N
  o A single chromosome of length L
  o Sample size n
  o For each pair, detect all segments of length >m
  o Compute <fT>, the average fraction of the chr. shared
• Inference:
  o Method of moments
  o Can prove:
Palamara et al., AJHG, 2012; Carmi et al., Genetics, 2013
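A method-of-moments estimate matches the observed average sharing to its expectation and solves for N. The sketch below assumes, as an illustration only, the expectation <f_T> = (1 + 4mN)/(1 + 2mN)^2, which holds when N counts haploid genomes and m is in Morgans; the cited papers should be consulted for the authors' exact expression and scaling. The function names are hypothetical.

```python
import math


def fraction_from_N(N: float, m: float) -> float:
    """Assumed expectation of the shared fraction for constant size N."""
    x = m * N
    return (1.0 + 4.0 * x) / (1.0 + 2.0 * x) ** 2


def estimate_N(mean_fraction: float, m: float) -> float:
    """Invert the assumed expectation: solve f = (1 + 4x)/(1 + 2x)^2 for x = m*N.

    The quadratic 4f x^2 + 4(f - 1) x + (f - 1) = 0 has a single
    non-negative root, x = ((1 - f) + sqrt(1 - f)) / (2f).
    """
    if not 0.0 < mean_fraction <= 1.0:
        raise ValueError("mean_fraction must lie in (0, 1]")
    if m <= 0:
        raise ValueError("cutoff m must be positive")
    one_minus_f = 1.0 - mean_fraction
    x = (one_minus_f + math.sqrt(one_minus_f)) / (2.0 * mean_fraction)
    return x / m


# Round trip with a hypothetical true size of 1500 and a 1 cM cutoff.
observed = fraction_from_N(1500.0, m=0.01)
print(estimate_N(observed, m=0.01))
```

Because the assumed expectation decreases monotonically in N, the inversion is unique: more sharing always implies a smaller estimated population.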
A Simple Approach
A Maximum Likelihood Approach
Carmi and Pe’er, arXiv, 2014
A Practical Approach
Palamara et al., AJHG, 2012
• Assume historical size N(t)=N0 λ(t).
  o Time scaled by 2N0
• Avg. fraction of the genome in segments of length ℓ1<ℓ<ℓ2:
(1)
Method:
• Detect IBD in sample
• Plot the empirical P(ℓ)
• Using Eq. (1), find the history N(t) that fits best
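The fitting loop can be sketched numerically. The code below does not reproduce Eq. (1); it is a discrete-generation stand-in under stated assumptions: generations are left unscaled, `size_at(g)` returns the number of haploid genomes g generations ago (pairwise coalescence probability 1/size per generation), lengths are in Morgans, and the segment spanning a site with ancestor g generations back exceeds ℓ with probability (1 + 2gℓ)exp(-2gℓ). All function names, the bins, and the grid search over constant sizes are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def prob_longer(g: int, length: float) -> float:
    """P(segment spanning a site exceeds `length` | ancestor g generations ago)."""
    if math.isinf(length):
        return 0.0
    return (1.0 + 2.0 * g * length) * math.exp(-2.0 * g * length)


def fraction_in_range(size_at: Callable[[int], float], l1: float, l2: float,
                      max_gen: int = 5000) -> float:
    """Expected genome fraction in segments with l1 < length < l2."""
    total = 0.0
    not_coalesced = 1.0  # probability of no common ancestor before generation g
    for g in range(1, max_gen + 1):
        p_coal = 1.0 / size_at(g)
        total += not_coalesced * p_coal * (prob_longer(g, l1) - prob_longer(g, l2))
        not_coalesced *= 1.0 - p_coal
    return total


def fit_constant_size(observed: Sequence[float], bins: Sequence[Tuple[float, float]],
                      candidates: Sequence[float]) -> float:
    """Grid search: the constant size whose predictions best match `observed`
    (sum of squared differences of log fractions)."""
    def loss(n: float) -> float:
        return sum(
            (math.log(fraction_in_range(lambda g: n, a, b)) - math.log(obs)) ** 2
            for (a, b), obs in zip(bins, observed)
        )
    return min(candidates, key=loss)


# Hypothetical length bins (Morgans) and synthetic "observations" from size 2000.
bins: List[Tuple[float, float]] = [(0.01, 0.02), (0.02, 0.05), (0.05, math.inf)]
observed = [fraction_in_range(lambda g: 2000.0, a, b) for a, b in bins]
print(fit_constant_size(observed, bins, candidates=[1000.0, 2000.0, 3000.0]))
```

Replacing the constant candidates with a parametrized `size_at` (for example a bottleneck followed by growth) and a proper optimizer gives the same procedure for richer histories.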
[Figure: P(ℓ) versus segment length ℓ, with curves for Ne = 1000, 2000, and 3000.]
IBD Sharing in AJ
• Atzmon et al., AJHG, 2010
• Bray et al., PNAS, 2010
• Gusev et al., MBE, 2012
≈50cM per pair in segments >3cM
An AJ Bottleneck
S. C. et al., submitted
[Figure: horizontal axis: Time (years).]
Caveats
• Phasing and genotyping errors; IBD detection errors
• Reasonable power only for 10-50 generations ago
• Model specification (e.g. prolonged bottleneck, admixture)
• Fitting
Parameter                 95% confidence interval
Ancestral size            3654-5856
Bottleneck size           249-419
Growth rate (per gen)     16-53%
Bottleneck time (gen)     25-32
Imputation
Impute2
• Cost-effective association study design:
  o Fully sequence a small reference panel
  o Impute many sparsely genotyped individuals
AJ Panel Performance
Fraction of non-ref variants with maf ≤1% wrongly imputed: 13% for AJ, 35% for CEU
Imputation by IBD
Sequence A
Gusev et al., Genetics, 2012
Imputation by IBD
Sequence A
• How to select individuals for sequencing?
• Is there enough IBD sharing?
• How to impute effectively?
Palin et al., Genet. Epidemiol., 2011; Kong et al., Nat. Genet., 2008
Selection for Sequencing
• Improve performance by selecting top-sharing samples (Gusev et al., Genetics, 2012: INFOSTIP)
• Theory for coverage in a population model (Carmi et al., Genetics, 2013)
• Not terribly important
Coverage by IBD
Fit to:
• TAGC (sequencing; n=128)
• SZ study (genotyping; n=2500)
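One way to make "coverage" concrete is as the fraction of a genotyped individual's chromosome that lies inside at least one IBD segment shared with a sequenced individual. That reading is an assumption, and the sketch below (hypothetical function names, toy intervals in cM) simply measures the union of the shared intervals.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[float, float]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching (start, end) intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage(intervals: Iterable[Interval], L: float) -> float:
    """Fraction of a chromosome of length L covered by the union of intervals."""
    if L <= 0:
        raise ValueError("chromosome length L must be positive")
    # Clip each interval to [0, L] and drop anything left empty.
    clipped = [(max(0.0, s), min(L, e)) for s, e in intervals if min(L, e) > max(0.0, s)]
    return sum(end - start for start, end in merge_intervals(clipped)) / L


# Toy example: segments one target shares with three sequenced individuals.
shared = [(0.0, 10.0), (5.0, 20.0), (30.0, 40.0)]
print(coverage(shared, L=100.0))
```

Overlapping segments are counted once, so adding a sequenced individual only helps where it covers a region not already covered.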
Coverage by IBD: Theory
[Figure: model schematic; axis: Time (gen), from the Present back to generations g and g+1; labels: N→∞, B, 1-α.]
Exact solution: Define and
Future Directions
• N-way IBD sharing
  o Derived P(ℓ1<ℓ<ℓ2) for three chromosomes
  o Important for demographic inference, disease mapping, detecting natural selection
• Dating mutations using IBD
• Phasing/imputation using IBD
  o A fast approach needed
Dating f2 mutations
Summary
• IBD is useful in genetics
• We characterized IBD in population models
• IBD is abundant in AJ and can be used for historical inference and imputation
• Many interesting future applications
Acknowledgements
Funding: Human Frontiers Science program
Itsik Pe’er’s lab: James Xue, Ethan Kochav, Yunzhi Ye
TAGC consortium members:
• Todd Lencz, Semanti Mukherjee (LIJMC)
• Lorraine Clark, Xinmin Liu (CUMC)
• Gil Atzmon, Harry Ostrer, Danny Ben-Avraham (AECOM)
• Inga Peter, Judy Cho (MSSM)
• Joseph Vijai (MSKCC)
• Ken Hui (Yale)
Thank you for your attention!