Lea Starita, PhD (@lea_starita)
Shendure and Fields Labs, Department of Genome Sciences
University of Washington
Massively parallel functional analysis of missense mutations in BRCA1 for interpreting variants of uncertain significance
~350 Variants of Uncertain Significance (VUS)
[Figure: frequency of variants of uncertain significance (VUS)]
Approach: throughput / validity
Computational prediction: +++ / +
Genetic analysis or one-off experiments: + / +++
Massively parallel functional analysis: +++ / +++
How do we interpret the impact of genetic variation at scale?
Massively parallel functional assays for assessing function of missense variants
Generate a library with mutations in a sequence of interest
Multiplexed functional assay
Quantify effect sizes of individual variants
[Schematic: CRISPR and an HDR library of array-synthesized mutations are co-transfected into haploid cells, giving a population of cells with many different edits (var-1, var-2, var-3, var-4)]
effect size
Sequence variants in input and selected populations
Biochemical functions of BRCA1
[Structure: BRCA1 RING (BRCA1 residues 1-102) with BARD1 RING]
BRCA1 is required for homology-directed dsDNA break repair (HDR)
BRCA1 HDR activity is required for tumor suppression
BRCA1 must dimerize with BARD1 to function in HDR
The BRCA1:BARD1 dimer has ubiquitin ligase activity
Multiplex assays for BRCA1 protein function and splicing
Experiments 1 and 2: BARD1-BRCA1-RING E3 ligase activity and BARD1-BRCA1-RING interaction
Experiment 3: Saturation genome editing to assess the effect of SNVs on splicing
[Schematic: exon 17, exon 18, exon 19; CRISPR; HDR library (N = 4,096)]
Massively parallel assays for the BRCA1-RING E3 ligase and BARD1-binding activities
A. E3 ligase activity
[Schematic: variant library; T7 coat, BARD1(26-126), BRCA1(2-304); ATP, E1, E2, Flag-Ub; capture; elute; 5X]
Deep sequencing: calculate variant frequency
Deep sequencing: calculate selected/input ratio for each variant for each round
Calculate slope of log2 ratios over 5 rounds of selection
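The scoring steps listed for the E3 ligase assay (variant frequency, selected/input ratio per round, slope of log2 ratios over the rounds) can be sketched as follows. This is a minimal illustration, not the lab's pipeline: the function name `selection_slopes`, the use of read-count frequencies, and the plain least-squares line fit are assumptions, and the sketch assumes every variant has a nonzero count in every sample.

```python
import numpy as np

def selection_slopes(input_counts, round_counts):
    """Slope of log2(selected/input) across selection rounds, one per variant.

    input_counts: read counts per variant in the input library, shape (n_variants,)
    round_counts: read counts per variant after each round, shape (n_rounds, n_variants)
    Assumes no zero counts (hypothetical simplification; real data would need
    a pseudocount or a filter).
    """
    input_counts = np.asarray(input_counts, dtype=float)
    round_counts = np.asarray(round_counts, dtype=float)

    # Convert counts to frequencies so samples with different depths are comparable.
    input_freq = input_counts / input_counts.sum()
    round_freq = round_counts / round_counts.sum(axis=1, keepdims=True)

    # log2 selected/input ratio for each variant in each round.
    log2_ratio = np.log2(round_freq / input_freq)

    # Fit a straight line per variant; np.polyfit fits each column of a 2-D y
    # independently and returns the slopes in row 0.
    rounds = np.arange(1, round_counts.shape[0] + 1)
    return np.polyfit(rounds, log2_ratio, deg=1)[0]
```

A variant that is enriched round after round gets a positive slope relative to a neutral variant, and a depleted variant gets a negative one.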
B. BARD1-binding activity
[Schematic: transformation; selection in -histidine; deep sequencing at Time 0, Time 1, Time 2 and Time 3]
How can we leverage these measurements to estimate the likelihood that a BRCA1 variant would be pathogenic?
How can we leverage these measurements to understand the homology-directed DNA repair (HDR) function of BRCA1?
[Schematic: measurements, prediction model, HDR activity? (Ransburgh et al. Cancer Research 2010)]
Experimental data build a better predictor of BRCA1 HDR function
[Bar chart: leave-one-out cross-validation R2 (axis 0.0 to 0.6) for PolyPhen-2, CADD, Grantham, Align-GVGD, E3 + BARD1-binding, and E3 + BARD1-binding + A-GVGD]
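For readers unfamiliar with the metric on this chart, here is a minimal sketch of a leave-one-out cross-validated R2. The slide does not say which model was fitted or how R2 was defined, so the ordinary least-squares linear model, the 1 - SS_res/SS_tot definition, and the function name `loocv_r2` are all assumptions made for illustration.

```python
import numpy as np

def loocv_r2(features, target):
    """Leave-one-out cross-validated R2 for a linear model (assumed model form).

    features: shape (n_variants,) or (n_variants, n_features),
              e.g. E3 ligase and BARD1-binding scores
    target:   shape (n_variants,), e.g. measured HDR activity
    """
    X = np.asarray(features, dtype=float)
    y = np.asarray(target, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    design = np.column_stack([np.ones(len(y)), X])  # add an intercept column

    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i                # hold out variant i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        preds[i] = design[i] @ coef                  # predict the held-out variant

    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Because every prediction is made for a variant the model has not seen, a predictor that only memorizes its training data scores poorly, and the value can be negative.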
HDR predictions for clinical BRCA1 variants
[Histograms: counts versus predicted HDR score (0.0 to 1.0, with 0.33 and 0.77 marked, ranging from low HDR function to high HDR function) for pathogenic, VUS and benign variants; one variant is labelled "splice"]
HDR predictions for 1,287 BRCA1 variants not yet seen in patients
[Histogram: counts (0 to 100) versus predicted HDR score (0.0 to 1.0, with 0.33 and 0.77 marked, ranging from low HDR function to high HDR function) for variants not yet seen, shown with pathogenic, VUS and benign variants]
Multiplex genome editing to measure the effects of SNVs on splicing
1. CRISPR-Cas9 construct targeting BRCA1 exon 18 (Cas9/gRNA)
2. Repair template library to substitute SNVs within the exon (edited repair templates, N = 4,096)
3. Co-transfection and multiplex editing give a heterogeneous population of edited cells
4. After 5 days, collect gDNA and RNA
5. "Selective PCR": only edited gDNA and cDNA
6. Sequence and count variants in gDNA and cDNA
7. Calculate effects on splicing for each variant: enrichment scores (cDNA/gDNA) for rank-ordered variants
Findlay, Boyle et al., Nature (2014).
Variants that create splice enhancers and silencers or trigger nonsense-mediated decay behave as expected
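The cDNA/gDNA enrichment score can be sketched as below. The ratio itself, the log2 scale, and the minimum of 10 gDNA observations come from the slides and figure captions in this deck; normalizing each count by its library total, skipping variants with no cDNA reads, and the function name `enrichment_scores` are assumptions for illustration.

```python
import math

def enrichment_scores(gdna_counts, cdna_counts, min_gdna=10):
    """log2 enrichment (cDNA frequency / gDNA frequency) per variant.

    gdna_counts, cdna_counts: dicts mapping variant name -> read count.
    Variants seen fewer than min_gdna times in gDNA are dropped; variants with
    zero cDNA reads are skipped here (an assumed simplification).
    """
    gdna_total = sum(gdna_counts.values())
    cdna_total = sum(cdna_counts.values())
    scores = {}
    for variant, g in gdna_counts.items():
        c = cdna_counts.get(variant, 0)
        if g < min_gdna or c == 0:
            continue
        scores[variant] = math.log2((c / cdna_total) / (g / gdna_total))
    return scores

def rank_ordered(scores):
    """Variants sorted from lowest to highest enrichment, as in a rank-ordered plot."""
    return sorted(scores, key=scores.get)
```

A variant that disrupts splicing or triggers nonsense-mediated decay is under-represented in cDNA relative to gDNA, so it lands at the low end of the ranking.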
[Figure pages from Findlay, Boyle et al., Nature (2014); plot axes and tick labels omitted]
Fig. 1a schematic labels: endogenous locus ATGCTGAGTTTGTGTGTGAACGGAC; donor plasmids with random hexamers ATGCNNNNNNTGTGTGGATATCCAC; co-transfection, multiplex editing; 5 days, gDNA and RNA prep (with RT); selective PCR; sequencing; enrichment scores (cDNA/gDNA).
Fig. 1b plot: log2 enrichment, replicate 1 versus replicate 2, R = 0.659. Fig. 2a plot: R = 0.846. Fig. 2b plot: R = 0.847.

Figure 1 | Saturation genome editing and multiplex functional analysis of a hexamer region influencing BRCA1 splicing. a, Experimental schematic. Cultured cells were co-transfected with a single Cas9-sgRNA construct (CRISPR) and a complex homology-directed repair (HDR) library containing an edited exon that harbours a random hexamer (blue, green, orange) and a fixed selective PCR site (red). CRISPR-induced cutting stimulated homologous recombination with the HDR library, inserting mutant exons into the genomes of many cells. At five days post-transfection, cells were harvested for gDNA and RNA. After reverse transcription, selective PCR was performed followed by sequencing of gDNA- and cDNA-derived amplicons. Hexamer enrichment scores were calculated by dividing cDNA counts by gDNA counts. b, Correlation of enrichment scores between biological replicates for hexamers observed in each experiment with positions of previously identified (ref. 14) exonic splicing enhancers (ESEs), exonic splicing silencers (ESSs) and stop codons indicated. c, Rank-ordered plot of enrichment scores with positions of ESEs, ESSs and stop codons indicated.

Figure 2 | Multiplex homology-directed repair reveals effects of single nucleotide variants on transcript abundance. Three separate HDR libraries (R, R2, and L) containing a 3% mutation rate (97% WT, 1% each non-WT base) in either half of BRCA1 exon 18 were introduced to the genome via co-transfection with pCas9-sgBRCA1x18. Enrichment scores were calculated for each haplotype observed at least 10 times in the gDNA, and effect sizes of SNVs were determined by weighted linear regression modelling. 'Sense' includes both missense and synonymous SNVs. a, Effect sizes calculated from replicate transfections of HDR library R, consisting of a 3% per-nucleotide mutation rate in the 3'-most 39 bases and the same selective PCR site used in Fig. 1, were highly correlated (R = 0.846). b, Library R2 harboured a selective PCR site composed of 5 synonymous changes, none of which are present in library R. When effect sizes derived from experiments with library R2 were plotted against those from library R, there was a strong correlation (R = 0.847), indicating reproducibility and demonstrating that differences between selective PCR sites did not strongly influence scores. c, Effect sizes for SNVs across the exon are displayed. Data sets from libraries R and L were combined to span the entire exon. Dashed lines represent SNVs that introduce nonsense codons.
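The Figure 2 caption says SNV effect sizes were determined by weighted linear regression over haplotype enrichment scores. The sketch below shows one way such a fit could look. The additive model with an intercept, the choice of weights (for example gDNA read counts), and the function name `snv_effect_sizes` are assumptions; the caption does not give these details.

```python
import numpy as np

def snv_effect_sizes(haplotypes, scores, weights, snvs):
    """Weighted least-squares estimate of per-SNV effects on enrichment.

    haplotypes: list of sets, each holding the SNV names carried by one haplotype
    scores:     log2 enrichment score per haplotype
    weights:    regression weight per haplotype (assumed: gDNA read counts)
    snvs:       list of SNV names to estimate
    Returns (intercept, {snv: effect}) under an assumed additive model.
    """
    # Indicator design matrix: column 0 is the intercept (wild-type background).
    X = np.array([[1.0] + [1.0 if s in h else 0.0 for s in snvs]
                  for h in haplotypes])
    y = np.asarray(scores, dtype=float)

    # Weighted least squares: scale rows by sqrt(weight), then solve ordinary LS.
    w = np.sqrt(np.asarray(weights, dtype=float))
    coef, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return coef[0], dict(zip(snvs, coef[1:]))
```

Fitting all haplotypes jointly lets haplotypes that carry more than one SNV contribute to each SNV's estimate, while better-sampled haplotypes carry more weight.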
Effects of SNVs across BRCA1 exon 18
* Splice enhancers and splice silencers defined from Ke et al. 2011
Findlay, Boyle et al., Nature (2014).
[Figure: Figures 1 and 2 from Findlay, Boyle et al., Nature, vol. 513, p. 121 (4 September 2014). ©2014 Macmillan Publishers Limited. All rights reserved.]
- - - - Dashed lines represent nonsense mutations
MutPred Splice annotations:
C49G "Splice Affecting Variant"
A53G "ESE Loss/ESS Gain"
A56G "ESE Loss"
G63T "Cryptic 5' SS"
T67G "Cryptic 5' SS", aka VUS V1714G
In summary
Parallelized assays for the protein function of the RING domain of BRCA1
Saturation genome editing to understand the effect of missense variants on splicing
Next steps…
Suggestions?
Library construction and variant delivery
Parallelizable assays for protein function
Calculate likelihood estimates for pathogenicity
Sequencing of variants
Computational variant scoring pipeline
Challenges for scaling up
How the results from massively parallel assays could get to the bedside…
More scans
Better databases
Better variant effect prediction
Thanks to:
Shendure lab
Fowler lab
Parvin lab (Muhtadi Islam), The Ohio State University
Kitzman lab, University of Michigan
Fields lab
Dave Young, Justin Gullingsrud
Funding from the Yeast Resource Center
NIH P41
The effects of missense SNVs on splicing and protein function are difficult to predict
Stop gain √ Frameshift √ Missense ?
Splicing effects
[Schematic: exon 17, exon 18, exon 19; CRISPR; HDR library (N = 4,096)]
Multiplex genome editing to determine effects of SNVs on splicing of exon 18 of BRCA1. Findlay et al. Nature 2014
Learning the Sequence Determinants of Alternative Splicing from Millions of Random Sequences. Rosenberg et al. Cell 2015
Scoring full-length BRCA1 variants for HDR function in human cells
HDR rescue assay
[Schematic: I-SceI-GFP reporter and donor GFP; + I-SceI to induce break; + siRNA to target BRCA1 3'UTR; + functional BRCA1 variant: HDR gives GFP; + nonfunctional BRCA1 variant: error-prone break repair leaves broken GFP]
[Bar chart: % HDR rescue (axis 0.00 to 1.25) for WT, vector only, R7C, M18T, L22S, C39Y, H41R, C44F, C44S, K45Q, C61G, C64G and D67Y; bars marked pathogenic, control or benign]
Muhtadi Islam and Jeff Parvin
Construction of the barcoded single amino acid substitution BRCA1-RING library
Multiplex genome editing to measure the effects of SNVs on splicing
BRCA1 locus (endogenous loci): ATGCTGAGTTTGTGTGTGAACGGAC…
Donor plasmids w/ random 6mers: ATGCNNNNNNTGTGTGGATATCCAC…
CRISPR: genome editing in many cells, then transcription and splicing
Each edited exon receives:
1. A random SNV
2. A fixed mutation
3% of 10^6 cells = 30,000 events
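The numbers on this slide can be checked with a few lines of arithmetic. The 3% rate and 10^6 cells are from the slide; reading N = 4,096 as the count of all possible DNA hexamers (4^6) and assuming editing events are spread evenly across them are interpretations made for this sketch, and the variable names are hypothetical.

```python
from itertools import product

# Every possible DNA hexamer: 4 bases at each of 6 positions.
hexamers = ["".join(bases) for bases in product("ACGT", repeat=6)]
n_hexamers = len(hexamers)                 # 4**6 = 4096

cells = 10**6                              # cells transfected (from the slide)
editing_rate = 0.03                        # 3% of cells edited (from the slide)
editing_events = round(cells * editing_rate)

# Average coverage per library member, assuming events are spread evenly
# (an idealization; real libraries are not uniform).
events_per_hexamer = editing_events / n_hexamers

print(n_hexamers, editing_events, round(events_per_hexamer, 1))
```

Under these assumptions each library member is represented by only a handful of independent editing events on average, which is one reason replicate transfections and count thresholds matter.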
Prospective functional map for 1,287 BRCA1 RING variants
[Histograms (panels A, B): counts versus predicted HDR rescue score (0.0 to 1.0) for 1,287 BRCA1 RING variants and for 59 BRCA1 RING variants; variants marked as pathogenic, VUS, benign, COSMIC or EVS. Marked scores: 0.33, max pathogenic HDR rescue score (experimental); 0.53, midpoint between mean pathogenic and benign scores; 0.77, min benign HDR rescue score (experimental). Low scores are likely HDR non-functional; high scores are likely HDR functional.]
[Heat map (panel C): predicted HDR rescue score (0 to 2.0) by substituting amino acid (A V I L M F Y W S T N Q C G P R H K D E *) and position (5 to 100), annotated with Zn++ sites, N- 4-helix bundle, loop 1, central helix, loop 2 and C- 4-helix bundle; no data and WT aa marked]
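The cut-offs marked on the histograms suggest a simple three-way call on the predicted HDR rescue score. The sketch below assumes that 0.33 is the maximum experimental score among pathogenic variants and 0.77 the minimum among benign variants, and that scores between them are left uncalled; the "intermediate" label and the function name `classify_hdr` are hypothetical and not from the slides.

```python
def classify_hdr(score, max_pathogenic=0.33, min_benign=0.77):
    """Three-way call on a predicted HDR rescue score.

    Thresholds are read off the slide; how they map to calls is an assumption.
    """
    if score < max_pathogenic:
        return "likely HDR non-functional"
    if score > min_benign:
        return "likely HDR functional"
    return "intermediate"  # hypothetical label for the uncalled middle range

def tally_calls(scores):
    """Count how many variants fall into each call."""
    counts = {}
    for s in scores:
        call = classify_hdr(s)
        counts[call] = counts.get(call, 0) + 1
    return counts
```

Keeping the middle range uncalled reflects that these are predictions from a model, not direct measurements of HDR.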
Massively parallel assays for BRCA1-RING E3 ligase activity
[Heat map: E3 functional score (0 to 2.0) by substituting amino acid (A V I L M F Y W S T N Q C G P R H K D E *), with no data and WT aa marked; M18 highlighted; substitutions classed as damaging, neutral or enhancing]
BARD1-binding activity: yeast two-hybrid (selection in -histidine; deep sequencing at Time 0, Time 1, Time 2 and Time 3)
Genetic testing is big business
More companies, lower costs, more genes*
*41.7% of tests revealed a VUS in at least one gene. Tung et al. Frequency of mutations in individuals with breast cancer referred for BRCA1 and BRCA2 testing using next-generation sequencing with a 25-gene panel. Cancer 2015
We need new technologies to deliver
on the promises of genetic medicine
https://www.whitehouse.gov/precision-medicine
Massively parallel functional analyses are a
possible solution
Starita et al. Genetics, 2015
Findlay et al. Nature, 2014
Rosenberg et al. Cell, 2015
Patwardhan et al. Nature Biotech, 2012
Fowler et al. Nature Methods, 2010
…