
LETTER
doi:10.1038/nature13579

Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease

Carolin Anders¹, Ole Niewoehner¹, Alessia Duerst¹ & Martin Jinek¹

The CRISPR-associated protein Cas9 is an RNA-guided endonuclease that cleaves double-stranded DNA bearing sequences complementary to a 20-nucleotide segment in the guide RNA¹,². Cas9 has emerged as a versatile molecular tool for genome editing and gene expression control³. RNA-guided DNA recognition and cleavage strictly require the presence of a protospacer adjacent motif (PAM) in the target DNA¹,⁴–⁶. Here we report a crystal structure of Streptococcus pyogenes Cas9 in complex with a single-molecule guide RNA and a target DNA containing a canonical 5′-NGG-3′ PAM. The structure reveals that the PAM motif resides in a base-paired DNA duplex. The non-complementary strand GG dinucleotide is read out via major-groove interactions with conserved arginine residues from the carboxy-terminal domain of Cas9. Interactions with the minor groove of the PAM duplex and the phosphodiester group at the +1 position in the target DNA strand contribute to local strand separation immediately upstream of the PAM. These observations suggest a mechanism for PAM-dependent target DNA melting and RNA–DNA hybrid formation. Furthermore, this study establishes a framework for the rational engineering of Cas9 enzymes with novel PAM specificities.

In type II CRISPR (clustered regularly interspaced short palindromic repeats)–Cas (CRISPR-associated) systems, the endonuclease Cas9 associates with a dual-RNA guide structure consisting of a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA) to cleave double-stranded DNA (dsDNA) using its HNH and RuvC nuclease domains¹,²,⁷. Cas9 has been exploited in numerous gene-targeting applications, in which its sequence specificity is programmed by either dual crRNA–tracrRNA guides or chimaeric single-molecule guide RNAs (sgRNAs)⁸–¹⁹. PAM recognition is a critical aspect of Cas9-mediated DNA targeting, being a prerequisite for ATP-independent strand separation and guide-RNA–target-DNA heteroduplex formation⁶. Recent crystal structures and electron microscopic reconstructions of Cas9 and its RNA- and DNA-bound complexes revealed that Cas9 undergoes a dramatic RNA-induced conformational rearrangement that facilitates target DNA binding²⁰,²¹. Although two tryptophan residues have been implicated in PAM binding²⁰, how PAM recognition occurs at the molecular level remains unclear.

To provide insight into the molecular mechanism of PAM recognition in Cas9, we determined the crystal structure of S. pyogenes Cas9

¹Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland.

Figure 1 | Crystal structure of Cas9 in complex with a sgRNA and a PAM-containing target DNA. a, Schematic diagram of guide and target nucleic acids. Empty ovals denote nucleotides not observed in the electron density. b, Orthogonal views of the sgRNA–target-DNA four-way junction. c, Front and rear views of the Cas9–sgRNA–DNA complex. In all panels guide RNA is coloured orange, target DNA strand in light blue and non-target DNA strand in black. The 5′-NGG-3′ PAM trinucleotide in the non-target strand is highlighted in yellow.

[Figure labels: RuvC domain; HNH domain; α-helical (REC) lobe; Arg-rich (bridge) helix; Topo-homology domain (Topo); C-terminal domain (CTD); PAM-interacting domain; repeat:anti-repeat duplex; stem loop 1; stem loop 2; guide:target heteroduplex.]

25 SEPTEMBER 2014 | VOL 513 | NATURE | 569

©2014 Macmillan Publishers Limited. All rights reserved


in complex with an 83-nucleotide sgRNA and a partially duplexed target DNA containing a 5′-TGG-3′ PAM sequence (Fig. 1 and Extended Data Table 1). Owing to an inactivating mutation (H840A) in the Cas9 HNH nuclease domain, the structure reveals an intact target (complementary) DNA strand, while the non-target (non-complementary) DNA strand is captured as cleaved product that has dissociated from the RuvC domain active site (Fig. 1a–c). In the complex, the bound nucleic acids are enclosed by the nuclease and helical recognition lobes of Cas9 and form a four-way junction that straddles the arginine-rich bridge helix (Fig. 1b, c). The entire PAM-containing region of the target DNA (target-strand nucleotides −1 to −8 and non-target-strand nucleotides +1* to +8*) is base-paired. Strand separation occurs only at the first base pair of the target sequence (the +1 position). Here, the target strand exhibits a pronounced kink as it hybridizes with the sgRNA. The PAM duplex is nestled in a positively charged groove between the Topoisomerase-homology and C-terminal domains (collectively referred to as the PAM-interacting domain²¹) (Fig. 2a and Extended Data Fig. 1). Comparison with the crystal structure of the Cas9–sgRNA complex bound to a single-stranded DNA target²¹ reveals a slight tightening of an otherwise pre-structured PAM binding cleft upon PAM-duplex binding (Extended Data Fig. 2).

The deoxyribose-phosphate backbone of the non-target DNA strand is engaged in numerous ionic and hydrogen-bonding interactions (Fig. 2b). Conserved tryptophan residues Trp 476 and Trp 1126, previously implicated in PAM recognition by crosslinking experiments²⁰, are not in direct contact with the PAM, suggesting that the crosslinks may have originated from a transient intermediate in the PAM recognition mechanism, or from non-specifically bound DNA. Instead, the guanine nucleobases of dG2* and dG3* in the non-target strand are read out in the major groove by base-specific hydrogen-bonding interactions with Arg 1333 and Arg 1335, respectively, provided by a β-hairpin from the C-terminal domain of Cas9 (Fig. 2c). The target-strand nucleotides complementary to the PAM are not recognized by major-groove interactions (Fig. 2b, c), rationalizing previous observations that Cas9-mediated DNA cleavage requires the 5′-NGG-3′ trinucleotide in the non-target strand, but not its target-strand complement¹,⁶. The lack of interactions with the target-strand backbone also explains why mismatches in the PAM are tolerated provided that a GG dinucleotide is present in the non-target strand¹. In agreement with the observed role of the arginine residues in PAM recognition, substitution of Arg 1333 or Arg 1335 with alanine residues resulted in substantially reduced target DNA binding in vitro (Fig. 2d). Furthermore, alanine substitutions of both Arg 1333 and Arg 1335 nearly abolished cleavage of linearized plasmid DNA, and substantially reduced cleavage of supercoiled circular plasmid DNA and short dsDNA oligonucleotides in vitro (Fig. 2e and Extended Data Fig. 3). Individual arginine substitutions yielded modest reductions of cleavage activity (Fig. 2e and Extended Data Fig. 3).
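The requirement described above, a 5′-NGG-3′ trinucleotide on the non-target strand immediately 3′ of the 20-nucleotide target sequence, can be illustrated with a short sequence scan. The following is a minimal sketch rather than anything used in the study: the function name `find_ngg_sites` and its parameters are hypothetical, and it scans only the strand it is given (a genome-wide search would also need to scan the reverse complement). The example sequence is the non-target strand of the cleavage-assay duplex listed in Methods.

```python
import re


def find_ngg_sites(non_target_strand, target_len=20):
    """Return (target_sequence, pam, pam_start) for every 5'-NGG-3' PAM that
    has a full-length target sequence immediately 5' of it on this strand."""
    strand = non_target_strand.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (for example in GGG) are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", strand):
        pam_start = match.start()
        if pam_start >= target_len:
            target = strand[pam_start - target_len:pam_start]
            sites.append((target, match.group(1), pam_start))
    return sites


# Non-target strand of the cleavage-assay duplex, as given in Methods.
NON_TARGET = "ATAACTCAATTTGTAAAAAATGGTATTG"

for target, pam, pam_start in find_ngg_sites(NON_TARGET):
    print(f"target 5'-{target}-3'  PAM 5'-{pam}-3'  (PAM starts at index {pam_start})")
```

Only the non-target strand is inspected here because, as the structure shows, the GG dinucleotide is read out on that strand and its target-strand complement is not recognized.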

The Cas9 sequence motif containing the PAM-interacting arginine residues (¹³³²DRKRY¹³³⁶) is conserved in other type II-A Cas9 proteins known to recognize 5′-NGG-3′ PAMs (Extended Data Fig. 4 and Supplementary Information). Similar arginine-containing motifs are found in Cas9 from Francisella novicida (¹⁶⁰⁸SRYPD¹⁶¹²) and from Streptococcus thermophilus CRISPR3 locus (¹³⁵⁰PRYRDY¹³⁵⁶), which recognize 5′-NG-3′ and 5′-NGGNG-3′ PAMs, respectively, but are notably absent from type II-C Cas9 proteins that are known to recognize distinct PAM sequences⁴,²²,²³ (Extended Data Fig. 4). Whereas arginine residues are commonly used by DNA-binding proteins to recognize guanines, major-groove read-out of adenines typically involves glutamine residues²⁴. Interestingly, a Cas9 orthologue from Lactobacillus buchneri, predicted to recognize a 5′-NAAAA-3′ PAM²⁵, contains glutamine residues (¹³³⁸QLQ¹³⁴⁰) at the positions equivalent to Arg 1333 and Arg 1335 in S. pyogenes Cas9. Together, these observations suggest

Figure 2 | The GG dinucleotide of the PAM is read out by major-groove interactions. a, Zoomed-in view of the PAM binding region in Cas9. Topo, topoisomerase-homology domain; CTD, C-terminal domain. b, Schematic of Cas9 interactions with the PAM duplex. Red circles denote bridging water molecules. c, Detailed view of the major groove. Sequence-specific hydrogen-bonding interactions with the GG PAM dinucleotide are indicated with dashed lines. d, Electrophoretic mobility shift assay using catalytically inactive dCas9–sgRNA complexes and fluorophore-labelled target DNA duplex. e, Endonuclease activity assay of wild type (WT) and mutant Cas9 proteins using a linearized plasmid DNA containing a target sequence fully complementary to the sgRNA in Fig. 1a. Bands at 2,104 and 598 base pairs (bp) correspond to Cas9 cleavage products. In all relevant panels guide RNA is coloured orange, target DNA strand in light blue and non-target DNA strand in black. The 5′-NGG-3′ PAM trinucleotide in the non-target strand is highlighted in yellow.

[Figure labels: panel d, dCas9–sgRNA, dCas9(R1333A)–sgRNA and dCas9(R1335A)–sgRNA at 0, 0.5, 2.5, 10, 50, 250, 1,000, 5,000 and 10,000 nM; panel e, WT, R1333A, R1335A and R1333A/R1335A Cas9 at 1, 5, 15, 60 and 120 min, with size markers from 500 to 3,500 bp.]


that at least in a subset of Cas9 proteins, PAM binding may be governed by a major-groove base-recognition code. Substitutions of Arg 1333 and Arg 1335 in S. pyogenes Cas9 with glutamine residues did not produce a specificity switch towards adenine-rich PAMs (Extended Data Fig. 5). Reprogramming PAM specificity might thus require more extensive remodelling of the PAM-interacting motif by directed evolution and/or computational design, as has been done previously for homing endonucleases²⁶,²⁷.
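The PAM preferences mentioned in the text for the different Cas9 orthologues can be written as IUPAC patterns and checked against the sequence immediately 3′ of a candidate target on the non-target strand. This is an illustrative sketch only: the dictionary keys, the function names `pam_matches` and `compatible_orthologues`, and the reduction of each orthologue to a single pattern are assumptions made for the example, and the L. buchneri entry is a prediction cited in the text, not a measured specificity.

```python
import re

# IUPAC codes needed for the PAMs mentioned in the text (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

# PAM preferences as stated in the text; the L. buchneri PAM is a prediction.
PAM_BY_ORTHOLOGUE = {
    "S. pyogenes": "NGG",
    "F. novicida": "NG",
    "S. thermophilus CRISPR3": "NGGNG",
    "L. buchneri (predicted)": "NAAAA",
}


def pam_matches(pam, downstream):
    """True if `downstream` (5'->3', non-target strand) begins with the PAM."""
    pattern = "".join(IUPAC[base] for base in pam)
    return re.match(pattern, downstream.upper()) is not None


def compatible_orthologues(downstream):
    """Names of the orthologues whose PAM pattern fits the downstream sequence."""
    return [name for name, pam in PAM_BY_ORTHOLOGUE.items()
            if pam_matches(pam, downstream)]


# Sequence 3' of the target in the crystallized duplex (5'-TGG-3' PAM, then flank).
print(compatible_orthologues("TGGTATTG"))
```

A pattern match of this kind captures only the sequence requirement. As the glutamine substitution experiment shows, swapping the recognition residues alone was not sufficient to change which pattern the protein accepts.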

The PAM-interacting domain of Cas9 makes further contacts with the minor groove of the PAM duplex (Fig. 3a). Ser 1136 interacts with the non-target strand dG3* through a water-mediated hydrogen bond, while Lys 1107 contacts dC−2 of the target strand (Fig. 3a). This interaction enforces a pyrimidine at this position, explaining why 5′-NAG-3′ PAMs are weakly permissive for S. pyogenes Cas9 (refs 19, 28, 29). The minor-groove interactions with the PAM duplex orient the target DNA strand for base pairing with the guide RNA. Downstream of Lys 1107, residues Glu 1108 and Ser 1109 interact with the phosphodiester group linking dA−1 and dT1 in the target DNA strand (the +1 phosphate). The non-bridging phosphate oxygen atoms form hydrogen bonds with the backbone amide groups of Glu 1108 and Ser 1109, and with the side chain of Ser 1109 (Fig. 3b). Owing to its interaction with the Lys 1107–Ser 1109 loop (the ‘phosphate lock’ loop), the +1 phosphate group is rotated (Fig. 3c and Extended Data Fig. 6), which coincides with a distortion in the target DNA strand that allows the nucleobase of dT1 to base pair with A20 of the guide RNA. Furthermore, comparison with the structure of Cas9–sgRNA bound to a single-stranded DNA²¹ suggests that interaction between the +1 phosphate and the loop is PAM-dependent (Extended Data Fig. 6).

Previous biochemical studies indicated that PAM recognition is concomitant with local destabilization of the adjacent sequence and directional target DNA unwinding from the PAM-proximal end⁶. Our structural observations suggest that the interaction between the target

Figure 3 | Interactions with the +1 phosphodiester group orient the target strand for guide-RNA binding. a, Detailed view of the minor groove of the PAM region. b, Hydrogen-bonding interactions (dashed lines) of the +1 phosphate (+1P) with the Lys 1107–Ser 1109 (phosphate lock) loop. c, Superposition of the unwound target DNA strand with an ideal B-form DNA duplex (green). d, Endonuclease activity assays using linearized plasmid DNA containing a fully complementary target sequence (top) or a target sequence mismatched to the sgRNA at positions 1–2 (bottom). KES>KG denotes substitution of the Lys 1107–Ser 1109 loop with a Lys-Gly dipeptide. KES>GG denotes substitution of the Lys 1107–Ser 1109 loop with a Gly-Gly dipeptide. e, Crystal structures of dCas9–sgRNA bound to DNA substrates containing mismatches to the sgRNA at positions 1–2 (top) and 1–3 (bottom), overlaid with refined 2mFo−DFc electron density maps (grey mesh, contoured at 1σ). The sgRNA is identical to that in Fig. 1a. In both structures, the target DNA strand is provided in two fragments, as indicated in the schematics. Residual electron density corresponding to the +1 base pair is indicated with a red arrowhead. In all relevant panels guide RNA is coloured orange, target DNA strand in light blue and non-target DNA strand in black. The 5′-NGG-3′ PAM trinucleotide in the non-target strand is highlighted in yellow.

[Figure labels: panel d, WT, K1107A, KES>KG and KES>GG Cas9 at 1, 5, 15, 60 and 120 min, with size markers from 500 to 3,500 bp, for the perfect sgRNA–target-DNA-strand match (top) and for target-strand nucleotides 1–2 mismatched (bottom).]


DNA strand and the phosphate lock loop might stabilize target DNA immediately upstream of the PAM in an unwound conformation, thereby linking PAM recognition with local strand separation. In agreement with this hypothesis, alanine substitution of Lys 1107 or replacement of the Lys 1107–Ser 1109 loop with a Lys–Gly or Gly–Gly dipeptide yielded Cas9 proteins with modestly reduced cleavage activities towards linearized plasmid DNA containing a perfectly complementary sequence, but almost no activity towards DNA containing mismatches to the guide RNA at positions 1 and 2 (Fig. 3d). Moreover, the phosphate lock loop mutations also disproportionately impaired cleavage of an oligonucleotide duplex containing the same mismatch, but the defect was partially relieved with a duplex in which the mismatched nucleotides were themselves unpaired (Extended Data Fig. 7).
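The position numbering used for these mismatches counts target-strand nucleotides from the PAM-proximal end, so that position 1 pairs with nucleotide 20 of the guide. A minimal sketch of that bookkeeping is given below. The function names are hypothetical, and the guide sequence is an assumption: it is inferred here as the RNA version of the 20 nucleotides that precede the PAM on the non-target strand, not quoted from the text. The two target-strand oligonucleotides are the cleavage-assay substrates listed in Methods, whose 3′-terminal 20 nucleotides form the target sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def mismatch_positions(guide_rna, target_segment):
    """Target-strand positions (1 = PAM-proximal) that cannot pair with the guide.

    `target_segment` is the target-strand sequence written 5'->3', starting
    with the nucleotide adjacent to the PAM complement (position 1).
    """
    guide_dna = guide_rna.upper().replace("U", "T")
    if len(guide_dna) != len(target_segment):
        raise ValueError("guide and target segment must have the same length")
    # Position i of the target strand pairs with guide nucleotide (n + 1 - i),
    # so a perfect target is the reverse complement of the guide.
    expected = reverse_complement(guide_dna)
    return [i + 1 for i, (want, have)
            in enumerate(zip(expected, target_segment.upper())) if want != have]


# Assumed guide segment (inferred, see lead-in); oligonucleotides from Methods.
GUIDE = "AUAACUCAAUUUGUAAAAAA"
PERFECT = "GCGCAATACCATTTTTTACAAATTGAGTTAT"
MISMATCHED = "GCGCAATACCAAATTTTACAAATTGAGTTAT"

print(mismatch_positions(GUIDE, PERFECT[-20:]))
print(mismatch_positions(GUIDE, MISMATCHED[-20:]))
```

Counting from the PAM-proximal end matches the convention in the text, where dT1, dT2 and dT3 of the target strand pair with A20, A19 and A18 of the guide RNA.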

To provide additional support for the hypothesis, we determined two crystal structures of Cas9(D10A/H840A)–sgRNA bound to DNAs containing mismatches to the guide RNA (Fig. 3e and Extended Data Table 1). To discount the possibility that duplex melting in these target DNAs is driven from the unpaired PAM-distal end, the target strand was supplied in two fragments and interrupted by a gap at the scissile phosphate (+4) position. The structure of the complex containing mismatches at positions 1 and 2 reveals a fully melted duplex, with nucleotides +1 and +2 unpaired, and nucleotide dT3 base paired to A18 of the sgRNA. In the complex containing mismatches at positions 1–3, the target strand backbone upstream of the PAM is disordered and we observe only residual electron density for the +1 base pair that cannot be modelled with full occupancy (Fig. 3e), suggesting that the DNA is unpaired in a substantial fraction of molecules in the crystal. Together, these structures reveal that even in the absence of compensatory base pairing to the guide RNA, target DNA binding by Cas9–RNA results in local strand separation immediately upstream of the PAM. Importantly, the interaction of the +1 phosphate with the phosphate lock loop is maintained in both structures, supporting the hypothesis that the loop contributes to stabilizing the target DNA strand in the unwound state.

In this study we highlight the central importance of PAM recognition in Cas9 function, both as a critical determinant of initial target DNA binding and as a licensing element in subsequent strand separation and guide-RNA–target-DNA hybridization. Based on our structural and biochemical observations, we propose a model for PAM-dependent target dsDNA recognition and unwinding (Fig. 4). Sequence-specific PAM read-out by Arg 1333 and Arg 1335 in Cas9 positions the DNA duplex such that the +1 phosphate group of the target strand interacts with the phosphate lock loop. This promotes local duplex melting, allowing the Cas9–RNA complex to probe the identity of the nucleotides immediately upstream of the PAM. Base pairing between the seed region of the guide RNA and the target DNA strand subsequently drives further stepwise destabilization of the target DNA duplex and directional formation of the guide-RNA–target-DNA heteroduplex.

Online Content Methods, along with any additional Extended Data display items and Source Data, are available in the online version of the paper; references unique to these sections appear only in the online paper.

Received 11 March; accepted 13 June 2014.

Published online 27 July 2014.

1. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).

2. Gasiunas, G., Barrangou, R., Horvath, P. & Siksnys, V. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl Acad. Sci. USA 109, 2579–2586 (2012).

3. Mali, P., Esvelt, K. M. & Church, G. M. Cas9 as a versatile tool for engineering biology. Nature Methods 10, 957–963 (2013).

4. Garneau, J. E. et al. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67–71 (2010).

5. Sapranauskas, R. et al. The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic Acids Res. 39, 9275–9282 (2011).

6. Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014).

7. Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602–607 (2011).

8. Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).

9. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).

10. Jinek, M. et al. RNA-programmed genome editing in human cells. eLife 2, e00471 (2013).

11. Hwang, W. Y. et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nature Biotechnol. 31, 227–229 (2013).

12. Wang, H. et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910–918 (2013).

13. Bassett, A. R., Tibbit, C., Ponting, C. P. & Liu, J.-L. Highly efficient targeted mutagenesis of Drosophila with the CRISPR/Cas9 system. Cell Rep. 4, 220–228 (2013).

14. Gratz, S. J. et al. Genome engineering of Drosophila with the CRISPR RNA-guided Cas9 nuclease. Genetics 194, 1029–1035 (2013).

15. Friedland, A. E. et al. Heritable genome editing in C. elegans via a CRISPR-Cas9 system. Nature Methods 10, 741–743 (2013).

16. Qi, L. S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183 (2013).

Figure 4 | Model for PAM-dependent target DNA unwinding and recognition by Cas9. Guide RNA binding to Cas9 results in the formation of the PAM binding site. Cas9–RNA engages the PAM GG dinucleotide using Arg 1333 and Arg 1335, and positions the target DNA duplex such that the +1 phosphate (orange circle) interacts with the phosphate lock loop, resulting in local strand separation immediately upstream of the PAM. Base pairing between displaced target DNA strand and the seed region of the guide RNA promotes further stepwise strand displacement and propagation of the guide–target heteroduplex. Guide RNA is coloured orange, target DNA strand in light blue and non-target DNA strand in black. CTD, C-terminal domain. Topo, topoisomerase-homology domain; REC, recognition lobe. bp, base pair.

[Figure labels: stages shown are guide-RNA binding, target DNA binding, PAM recognition, +1–2 bp melting, RNA–DNA hybrid formation, RNA–DNA hybrid propagation and cleavage; labelled elements include R1333, R1335, the K1107–S1109 phosphate lock, the seed region, the Arg-rich bridge helix, and the RuvC, HNH, helical (REC), Topo and CTD domains.]


17. Bikard, D. et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system. Nucleic Acids Res. 41, 7429–7437 (2013).

18. Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451 (2013).

19. Mali, P. et al. CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nature Biotechnol. 31, 833–838 (2013).

20. Jinek, M. et al. Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science 343, 1215 (2014).

21. Nishimasu, H. et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935–949 (2014).

22. Fonfara, I. et al. Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Res. 42, 2577–2590 (2013).

23. Esvelt, K. M. et al. Orthogonal Cas9 proteins for RNA-guided gene regulation and editing. Nature Methods 10, 1116–1121 (2013).

24. Luscombe, N. M., Laskowski, R. A. & Thornton, J. M. Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level. Nucleic Acids Res. 29, 2860–2874 (2001).

25. Briner, A. E. & Barrangou, R. Lactobacillus buchneri genotyping on the basis of clustered regularly interspaced short palindromic repeat (CRISPR) locus diversity. Appl. Environ. Microbiol. 80, 994–1001 (2014).

26. Redondo, P. et al. Molecular basis of xeroderma pigmentosum group C DNA recognition by engineered meganucleases. Nature 456, 107–111 (2008).

27. Ashworth, J. et al. Computational reprogramming of homing endonuclease specificity at multiple adjacent base pairs. Nucleic Acids Res. 38, 5601–5608 (2010).

28. Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nature Biotechnol. 31, 233–239 (2013).

29. Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnol. 31, 827–832 (2013).

Supplementary Information is available in the online version of the paper.

Acknowledgements We are grateful to J. Doudna for agreement on research directions, helpful discussions and encouragement throughout the project. We thank B. Blattmann and C. Stutz-Ducommun for crystallization screening, N. Ban and M. Leibundgut for the gift of iridium hexamine, and R. Dutzler for sharing synchrotron beam time and crystallographic advice. We thank E. Charpentier, I. Fonfara, S. Sternberg, P. Sledz, A. May and S. Kassube for critical reading of the manuscript. Part of this work was performed at the Swiss Light Source at the Paul Scherrer Institute, Villigen, Switzerland. We thank T. Tomizaki, V. Olieric and M. Wang for assistance with X-ray data collection. This work was supported by the European Research Council Starting Grant no. 337284 ANTIVIRNA and by start-up funds from the University of Zurich.

Author Contributions C.A. designed experiments, performed site-directed mutagenesis, prepared guide RNAs, purified and crystallized the Cas9–sgRNA–target-DNA complex, determined its structure together with M.J., and performed plasmid cleavage assays. O.N. purified Cas9 mutants, performed EMSA assays and assisted with cleavage assays. A.D. performed site-directed mutagenesis, prepared guide RNAs and assisted with cleavage assays. M.J. designed experiments and supervised the study. C.A. and M.J. wrote the manuscript.

Author Information Atomic coordinates and structure factors have been deposited in the Protein Data Bank under accession numbers 4un3, 4un4 and 4un5. Reprints and permissions information is available at www.nature.com/reprints. Readers are welcome to comment on the online version of the paper. The authors declare competing financial interests: details are available in the online version of the paper. Correspondence and requests for materials should be addressed to M.J. ([email protected]).


METHODS
In vitro transcription and purification of sgRNA. The sequences of RNA oligonucleotides used in the study are provided in Extended Data Table 2. sgRNAs were prepared by in vitro transcription using recombinant T7 RNA polymerase as described6, except that for sgRNA-1 a fully double-stranded DNA template was used instead. The transcription template for sgRNA-1 was generated by PCR using the following oligonucleotides: PCR template 5′-CACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACTTTTTTACAAATTGAGTTATCCTATAGTGAGTCGTATTA-3′; forward primer 5′-TAATACGACTCACTATA-3′; reverse primer 5′-CACTTTTTCAAGTTGA-3′. sgRNA-2 was transcribed directly from a single-stranded oligonucleotide template (5′-CACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACGCGTCTCATCTTTATGCGTCCCTATAGTGAGTCGTATTA-3′) that was hybridized to the forward primer listed above.
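As a consistency check on the template design described above, the following minimal sketch reverse-complements the sgRNA-1 PCR template, locates the T7 promoter (the forward-primer sequence) and reads off the expected run-off transcript. It assumes that transcription initiates at the first nucleotide downstream of the 17-nt promoter sequence and continues to the end of the template; the function and variable names are illustrative and are not part of the original protocol.

```python
# Sketch: derive the expected run-off transcript from the sgRNA-1 template.
# Assumption: T7 RNA polymerase initiates immediately downstream of the
# 17-nt promoter sequence and transcribes to the end of the template.

T7_PROMOTER = "TAATACGACTCACTATA"  # forward primer, 5'->3'

# PCR template for sgRNA-1, 5'->3' (the strand read by the polymerase)
SGRNA1_TEMPLATE = (
    "CACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC"
    "TTTTTTACAAATTGAGTTAT"
    "CC"
    "TATAGTGAGTCGTATTA"
)

# sgRNA 1 as listed in Extended Data Table 2
SGRNA1_LISTED = (
    "GGAUAACUCAAUUUGUAAAAAA"
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUA"
    "UCAACUUGAAAAAGUG"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def runoff_transcript(template: str, promoter: str = T7_PROMOTER) -> str:
    """Return the RNA expected from run-off transcription of `template`."""
    coding = reverse_complement(template)  # strand with the same sense as the RNA
    start = coding.find(promoter)
    if start < 0:
        raise ValueError("promoter sequence not found in coding strand")
    return coding[start + len(promoter):].replace("T", "U")


sgrna1 = runoff_transcript(SGRNA1_TEMPLATE)
guide = sgrna1[2:22]  # 20-nt guide segment following the initiating GG
print(sgrna1 == SGRNA1_LISTED, guide)
```

Under these assumptions the derived transcript can be compared directly with the sgRNA 1 entry in Extended Data Table 2.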

Transcribed RNAs were purified by gel electrophoresis on an 8% denaturing (7 M urea) polyacrylamide gel and subsequently ethanol precipitated.
DNA oligonucleotides. The sequences of DNA oligonucleotides used in the study are provided in Extended Data Table 2. All DNA oligonucleotides were synthesized by Microsynth AG (Balgach, Switzerland). For crystallization, target strand (5′-CAATACCATTTTTTACAAATTGAGTTAT-3′) and non-target strand (5′-AAAATGGTATTG-3′) were used without further purification. Prior to complex formation, the DNA oligonucleotides were mixed in a 1:1 molar ratio (final concentration 100 μM) and hybridized by heating to 75 °C for 5 min, followed by slow cooling to room temperature. The target-strand oligonucleotides used for cleavage assays contained a 5′-linked ATTO532 fluorophore (ATTO532-5′-GCGCAATACCATTTTTTACAAATTGAGTTAT-3′ and ATTO532-5′-GCGCAATACCAAATTTTACAAATTGAGTTAT-3′) and were PAGE-purified before use. The non-target-strand oligonucleotides (5′-ATAACTCAATTTGTAAAAAATGGTATTG-3′ and 5′-ATAACTCAATTTGTAAAATTTGGTATTG-3′) were used without further purification. For electrophoretic mobility shift assays, the non-target-strand oligonucleotide contained a 3′-linked ATTO532 fluorophore (5′-ATAACTCAATTTGTAAAAAATGGTATTGCGC-3′-ATTO532) and was PAGE-purified before use.
Cas9 expression, purification and Cas9–sgRNA–DNA complex reconstitution. S. pyogenes Cas9 was expressed and purified essentially as described1,20. Point mutations were introduced by inverse PCR and verified by DNA sequencing, and mutant proteins were purified as for wild-type Cas9. Briefly, Cas9 was expressed in Escherichia coli BL21 (DE3) Rosetta 2 (Novagen) fused to an N-terminal fusion protein containing a hexahistidine affinity tag, the maltose binding protein (MBP) polypeptide sequence, and the tobacco etch virus (TEV) protease cleavage site. Cells were lysed in 20 mM Tris pH 8.0, 250 mM NaCl, 5 mM imidazole pH 8.0. Clarified lysate was applied to a 10 ml Ni-NTA (Sigma Aldrich) affinity column.
The column was washed with 20 mM Tris pH 8.0, 250 mM NaCl, 10 mM imidazole pH 8.0, and bound protein was eluted by increasing imidazole concentration to 250 mM. Eluted protein was dialysed against 20 mM HEPES pH 7.5, 150 mM KCl, 10% glycerol, 1 mM dithiothreitol (DTT), 1 mM EDTA overnight at 4 °C in the presence of TEV protease to remove the His6-MBP affinity tag. Cleaved protein was further purified by cation exchange chromatography (HiTrap SP FF, GE Healthcare), eluting with a linear gradient of 0.1–1.0 M KCl. For purification of Cas9–sgRNA complexes, purified Cas9 and in vitro transcribed sgRNA were mixed in a 1:1 molar ratio, concentrated to 6–10 mg ml−1 in a 50,000 MWCO centrifugal filter (Merck Millipore), applied to a Superdex 200 16/600 column (GE Healthcare) and eluted with 20 mM HEPES pH 7.5, 500 mM KCl. Reconstitution of Cas9–sgRNA–DNA complex was carried out by mixing purified Cas9 and in vitro transcribed sgRNA in a 1:1 molar ratio. The sample was slowly exchanged to a buffer containing 20 mM HEPES pH 7.5, 500 mM KCl, 5 mM MgCl2 during concentration in a 50,000 MWCO centrifugal filter. At a final concentration of 4–6 mg ml−1, pre-hybridized target DNA was added in 1.5-fold molar excess. The complex was applied to a Superdex 200 16/600 column and eluted with 20 mM HEPES pH 7.5, 500 mM KCl, 5 mM MgCl2. Purified complex was concentrated to 4–8 mg ml−1, flash frozen in liquid nitrogen, and stored at −80 °C. For crystallization of complexes bound to mismatch-containing DNAs, purified Cas9–sgRNA complex was diluted with 20 mM HEPES pH 7.5, 500 mM KCl, 5 mM MgCl2 to a final concentration of 5 mg ml−1 and was mixed with DNA oligonucleotides in a 1:2 molar ratio. Selenomethionine (SeMet)-substituted Cas9 was expressed and purified as for native protein, with the following modifications. The expression was carried out in M9 minimal media supplemented with 1 μg ml−1 biotin and 1 μg ml−1 thiamine.
At OD600 of 0.8, the following amino acids were added to allow SeMet incorporation: 50 mg l−1 Leu, Ile, Val, 100 mg l−1 Phe, Lys, Thr, 75 mg l−1 SeMet. After 30 min, the temperature was reduced to 18 °C and 200 μM isopropyl-β-D-thiogalactopyranoside was added for induction.
Complex crystallization and structure determination. Cas9–sgRNA–DNA crystals were grown at 20 °C using the hanging drop vapour diffusion method. The drops were composed of equal volumes (1 μl + 1 μl) of purified complex (diluted to 1.25–2 mg ml−1 in 20 mM HEPES pH 7.5, 250 mM KCl, 5 mM MgCl2) and reservoir solution (0.1 M Tris-acetate pH 8.5, 200–400 mM KSCN, 14–18% PEG 3350). Iterative microseeding was used to improve crystal nucleation, growth and morphology. For cryoprotection, crystals were transferred into 0.1 M Tris-acetate pH 8.5, 200 mM KSCN, 30% PEG 3350, 10% ethylene glycol, and flash-cooled in liquid nitrogen. Diffraction data were measured at beamlines PXI and PXIII of the Swiss Light Source (Paul Scherrer Institute, Villigen, Switzerland) and processed using XDS30. The crystals belonged to space group C2 and contained one complex in the asymmetric unit. Native data extended to a resolution of 2.59 Å. Data collection statistics are summarized in Extended Data Table 1. Phases were obtained from a single-wavelength anomalous diffraction (SAD) experiment using complex crystals containing SeMet-substituted Cas9 and measured at Se K-edge wavelength (0.97965 Å). Three data sets were measured by exposing different parts of the same crystal, rotating the crystal through 360° in each data set. Selenium sites were located using the Hybrid Substructure Search (HySS) module of the Phenix package31. Additional phasing information came from a SAD experiment carried out with complex crystals soaked with 5 mM iridium hexamine chloride for 2 h and measured at the Ir L-III edge (1.10501 Å). The native and SAD data sets were combined for substructure refinement and phase calculations using the MIRAS procedure in AutoSHARP32, yielding readily interpretable electron density maps. Fragments of apo-Cas9 structure20 were docked using MOLREP33. Model building was completed in COOT34. The atomic model was refined using Phenix.refine35.
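For orientation, the two anomalous-edge wavelengths quoted above can be converted to photon energies with E = hc/λ. The sketch below is only an illustration of that conversion; the constant is the standard value of hc in keV·Å, and the function name is hypothetical.

```python
# Sketch: convert X-ray wavelength (in angstroms) to photon energy (in keV)
# using E = h*c / wavelength.

HC_KEV_ANGSTROM = 12.39842  # h*c expressed in keV * angstrom


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in angstroms."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


se_kev = wavelength_to_energy_kev(0.97965)  # Se K-edge data set
ir_kev = wavelength_to_energy_kev(1.10501)  # Ir L-III edge data set
print(f"Se: {se_kev:.3f} keV, Ir: {ir_kev:.3f} keV")
```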
The final model includes Cas9 residues 4–710, 719–765, 776–1012, 1030–1050, 1059–1241, 1253–1363; nucleotides 1–81 of the sgRNA; nucleotides (−8)–20 of the target DNA strand and nucleotides (−3)–8 of the non-target DNA strand (note the different DNA nucleotide numbering in the atomic coordinates deposited in the PDB). The structures of dCas9–sgRNA–mismatch DNA complexes were solved by molecular replacement using the Phaser module of the Phenix package31 using the atomic structures of Cas9 and sgRNA as separate search models and omitting DNA from the initial search.
Endonuclease activity assays. Plasmid cleavage assays were carried out using plasmids pMJ879, pMJ891, pMJ992, pMJ993 and pMJ994 (Extended Data Table 2), as described previously1,6. Equimolar quantities of Cas9 and sgRNA (final concentration of 1.5 μM) were pre-incubated in 20 mM HEPES pH 7.5, 100 mM KCl, 5% glycerol, 1 mM DTT, 0.5 mM EDTA, 2 mM MgCl2 at room temperature for 5 min. 400 ng of circular or SspI-linearized plasmid were added and the cleavage reactions (40 μl total volume) were incubated at 37 °C for 2 h. 7 μl aliquots were taken at the indicated time points, quenched by addition of EDTA (50 mM final concentration) and treated with 14 μg Proteinase K for 30 min at room temperature. Cleavage products were resolved by gel electrophoresis on 1% agarose gel stained with GelRed (Biotium) and visualized using a Typhoon FLA 9500 scanner (GE Healthcare). Cleavage assays on double-stranded oligonucleotides were performed as described above with minor changes. Cas9 and sgRNA were used at a final concentration of 5 μM. Target-strand oligonucleotides (400 nM) and non-target-strand oligonucleotides (435 nM) were mixed in a molar ratio of 2:3, annealed, and added to a final concentration of 20 nM and 30 nM to start the cleavage reaction (105 μl total volume). 12.5 μl aliquots were taken at the indicated time points and treated as described above.
Cleavage products were resolved by electrophoresis on 16% denaturing polyacrylamide gel in 0.5× TBE running buffer and were detected using a Typhoon FLA 9500 scanner and quantified using ImageQuant software (GE Healthcare).
Electrophoretic mobility shift assays. Electrophoretic mobility shift assays were carried out using catalytically inactive Cas9 (D10A/H840A) protein (dCas9) and its point mutants. dCas9, sgRNA and DNA concentrations were determined with a NanoDrop spectrophotometer (Thermo Scientific) using the calculated extinction coefficients at 280 nm for Cas9 (120,450 M−1 cm−1) and at 260 nm for sgRNA (829,800 M−1 cm−1), target DNA (276,100 M−1 cm−1) and non-target DNA (332,100 M−1 cm−1). Target and non-target DNA strands were hybridized in a 1.5:1 molar ratio. The resulting target DNA duplex (final concentration of 10 nM) was titrated with increasing concentrations of dCas9 reconstituted with a twofold molar excess of sgRNA in 20 mM HEPES pH 7.5, 100 mM KCl, 5% glycerol, 1 mM DTT and 5 mM MgCl2 in a total volume of 40 μl. All binding reactions were incubated at 37 °C for 10 min and 20 μl were subsequently resolved on a native 8% polyacrylamide gel at room temperature using 0.5× TBE running buffer supplemented with 1 mM MgCl2. Bound and unbound fractions were detected using a Typhoon FLA 9500 scanner.
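The concentration determination described above follows the Beer–Lambert law, c = A / (ε·l). The sketch below illustrates that calculation with the extinction coefficients quoted in the text; the 1 cm path length and the example absorbance reading are assumptions made for illustration (not values from the study), and the names are hypothetical.

```python
# Sketch: Beer-Lambert conversion of absorbance to molar concentration.
# Extinction coefficients (M^-1 cm^-1) are those quoted in the Methods.
# Assumption: absorbance is reported for a 1 cm path length.

EXTINCTION_COEFFICIENTS = {
    "Cas9": 120_450,            # at 280 nm
    "sgRNA": 829_800,           # at 260 nm
    "target_DNA": 276_100,      # at 260 nm
    "non_target_DNA": 332_100,  # at 260 nm
}


def molar_concentration(absorbance: float, epsilon: float,
                        path_cm: float = 1.0) -> float:
    """Return concentration in mol/l from c = A / (epsilon * path length)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


# Hypothetical reading: A280 = 0.60 for a dCas9 sample
c_cas9 = molar_concentration(0.60, EXTINCTION_COEFFICIENTS["Cas9"])
print(f"dCas9: {c_cas9 * 1e6:.2f} uM")
```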

30. Kabsch, W. XDS. Acta Crystallogr. D 66, 125–132 (2010).
31. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D 66, 213–221 (2010).
32. Vonrhein, C., Blanc, E., Roversi, P. & Bricogne, G. Automated structure solution with autoSHARP. Methods Mol. Biol. 364, 215–230 (2007).
33. Vagin, A. & Teplyakov, A. Molecular replacement with MOLREP. Acta Crystallogr. D 66, 22–25 (2010).
34. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D 60, 2126–2132 (2004).
35. Afonine, P. V. et al. Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr. D 68, 352–367 (2012).
36. Katoh, K. & Standley, D. M. MAFFT: iterative refinement and additional methods. Methods Mol. Biol. 1079, 131–146 (2014).


Extended Data Figure 1 | The PAM duplex binds in a positively charged cleft on the C-terminal PAM-interacting domain. a, Enlarged view of the PAM binding site in Cas9. Nucleic acids are shown in stick representation, coloured according to the scheme in Fig. 1 and overlaid with experimentally phased, solvent-flattened electron density map (grey mesh, contoured at 1σ). b, PAM binding site in Cas9, shown in the same orientation as in panel a. The molecular surface of Cas9 is coloured according to electrostatic potential.


Extended Data Figure 2 | The PAM binding site is pre-ordered in the Cas9–RNA complex. a, Comparison of the structures of Cas9–sgRNA bound to a PAM-containing target DNA duplex (left) and single-stranded DNA target21 (right). The target DNA strands of the complexes were superimposed using a least-squares algorithm in Coot34 and the complexes are shown in identical orientations. Bound nucleic acids are shown in stick format and coloured according to the scheme in Fig. 1. b, Superimposed Cas9 molecules from the PAM-containing and ssDNA-bound complexes. The colour scheme is the same as in panel a. In both complexes, the HNH domain is in an inactive conformation, with the active site located approximately 40 Å away from the scissile phosphate in the target DNA strand, suggesting that the domain undergoes a further conformational rearrangement upon target-strand cleavage. c, Superimposed nucleic acid ligands. sgRNA and target DNA from the single-stranded target complex are coloured grey. d, Detailed view of the PAM binding site in the superimposed complexes, indicating a slight tightening of the PAM binding cleft.


[Extended Data Figure 3 image. Panel a: supercoiled circular plasmid DNA; labelled species are supercoiled (SC), nicked product and linearized product, with size markers from 500 to 5,000 bp. Panel b: double-stranded oligonucleotide DNA; labelled species is the Cas9 cleavage product. Lanes in both panels: no Cas9, WT, R1333A, R1335A and R1333A R1335A, each sampled at 1, 5, 15, 60 and 120 min.]

Extended Data Figure 3 | Endonuclease activities of Cas9 proteins containing mutations in the PAM binding motif. a, Endonuclease activity assay of wild-type and mutant Cas9 proteins using supercoiled circular (SC) plasmid DNA containing a target sequence fully complementary to the sgRNA in Fig. 1a. Nucleotide sequences of target sites are provided in Extended Data Table 2. b, Endonuclease activity assay of wild-type and mutant Cas9 proteins using an oligonucleotide duplex containing a target sequence fully complementary to the sgRNA in Fig. 1a.


[Extended Data Figure 4 image. Panel a lists Cas9 orthologues with their PAM sequences and CRISPR–Cas types; the PAM column is not recoverable from the extracted text. Orthologues and types: Streptococcus pyogenes (II-A), Streptococcus mutans (II-A), Listeria innocua (II-A), Streptococcus thermophilus A (CRISPR3) (II-A), Francisella novicida (II-B), Lactobacillus buchneri (II-A), Treponema denticola (II-A), Streptococcus thermophilus B (CRISPR1) (II-A), Campylobacter jejuni (II-C), Pasteurella multocida (II-C), Neisseria meningitidis (II-C). Panel b (sequence alignment) is not recoverable.]

Extended Data Figure 4 | PAM binding motifs in Cas9 orthologues. a, Cas9 orthologues with known PAM sequences4,19,22,23. The PAM of Lactobacillus buchneri Cas9 has been inferred from known protospacer sequences, but has not been experimentally validated25. b, Alignment of the amino acid sequences of the major groove interacting regions of Cas9 orthologues. Primary sequences of type II-A Cas9 proteins from S. pyogenes (GI 15675041), Listeria innocua Clip 11262 (GI 16801805), S. mutans UA159 (GI 24379809), S. thermophilus LMD-9 (S. thermophilus A, GI 11662823; S. thermophilus B, GI 116627542), Lactobacillus buchneri NRRL B-30929 (GI 331702228), Treponema denticola ATCC 35405 (GI 42525843), type II-B Cas9 from Francisella novicida U112 (GI 118497352), and type II-C Cas9 proteins from Campylobacter jejuni subsp. jejuni NCTC 11168 (GI 218563121), Pasteurella multocida subsp. multocida str. Pm70 (GI 218767588) and Neisseria meningitidis Zs491 (GI 15602992) were aligned using MAFFT36. Amino acids are coloured in shades of blue according to their degree of conservation. The red boxes denote amino acid residues inferred to be involved in PAM recognition in type II-A and type II-B Cas9 proteins based on the sequence alignment and the crystal structure of the Cas9–sgRNA–DNA complex elucidated in this study.


[Extended Data Figure 5 image. Panel a: 5′-TGG-3′ PAM; panel b: 5′-TAA-3′ PAM; panel c: 5′-TAAAA-3′ PAM. Lanes in each panel: WT and R1333Q R1335Q Cas9, each sampled at 1, 5, 15, 60 and 120 min; size markers from 500 to 3,500 bp; Cas9 cleavage products are indicated.]

Extended Data Figure 5 | Glutamine substitution of Arg 1333 and Arg 1335 in S. pyogenes Cas9. a, Endonuclease activity assay of wild-type and mutant Cas9 proteins using a linearized plasmid containing a target sequence fully complementary to sgRNA-2 and a 5′-TGG-3′ PAM (Extended Data Table 2). Bands at 2,014 and 598 base pairs (bp) correspond to Cas9 cleavage products. b, Endonuclease activity assay as in panel a using linearized plasmid DNA containing a 5′-TAA-3′ PAM. c, Endonuclease activity assay as in panel a using linearized plasmid DNA containing an extended 5′-TAAAA-3′ PAM.


[Extended Data Figure 6 image. Panel a: unwound target DNA duplex and ideal B-form DNA, with the +1 phosphate (+1P) and nucleotides dT2, dT1, dA–1 and dC–2 labelled. Panel b: Cas9–sgRNA–PAM DNA duplex and Cas9–sgRNA–ssDNA target (PDB 4OO8, Mol. A), with Lys1107, Glu1108, Ser1109, the +1P and neighbouring nucleotides labelled; interatomic distances shown are 3.8, 3.6, 2.7, 2.5 and 3.0. Panel c: superimposed structures.]

Extended Data Figure 6 | PAM-dependent interaction of the +1 phosphate with the phosphate lock loop. a, Comparison of the bound target DNA (left) and the modelled B-form DNA (right). Docking of the ideal B-form duplex yields a steric clash with the phosphate lock loop. The arrow indicates the rotation of the +1 phosphate group (+1P) needed for interaction with the phosphate lock loop. b, Comparison of the phosphate lock loop and the +1 phosphate positions in the Cas9–sgRNA–DNA complex containing a PAM (left) and the Cas9–sgRNA–ssDNA target complex21 (right). Molecule A from the crystallographic asymmetric unit of the Cas9–sgRNA–ssDNA complex is shown. In molecule B, the nucleotides upstream of the +1 phosphate are structurally ordered due to crystal packing interactions, and the +1 phosphate is positioned within hydrogen-bonding distance as a result. Numbers indicate interatomic distances in Å. c, Superposition of the two structures shown in panel b.


Extended Data Figure 7 | Endonuclease activity of phosphate lock loop Cas9 mutants against mismatch- and bubble-containing DNA substrates. a, Endonuclease activity assay of wild-type and mutant Cas9 proteins using double-stranded oligonucleotide DNA containing a target sequence fully complementary to the sgRNA shown in Fig. 1a. Samples were taken after 15 s, 30 s, 1 min, 2 min, 5 min, 15 min, 1 h and 2 h. b, Endonuclease activity assay using an oligonucleotide duplex containing mismatches to the sgRNA at positions 1–2. c, Endonuclease activity assay using a bubble-containing oligonucleotide duplex in which the target strand is mismatched to the sgRNA at positions 1–2 and the target and non-target strands are themselves mismatched at positions 1–2. d, Quantification of cleavage defects observed with mismatch- and bubble-containing substrates from a–c. For each protein, the amount of cleaved product obtained after 2 h was normalized to the amount of product obtained from a perfectly complementary DNA substrate. Experiments were performed in triplicate. Error bars report standard error of the mean (s.e.m.).
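The quantification in panel d (normalization to the perfectly complementary substrate, followed by mean and s.e.m. over triplicates) can be sketched as follows. The replicate values are invented purely for illustration, and the replicate-wise pairing and the use of the sample standard deviation are assumptions rather than details stated in the legend.

```python
# Sketch: normalize cleaved-product amounts to the matched-substrate control
# and summarize triplicates as mean +/- s.e.m.
# Assumptions: replicates are paired one-to-one; s.e.m. = sample SD / sqrt(n).
from math import sqrt
from statistics import mean, stdev


def normalize_to_matched(product, product_matched):
    """Divide each replicate by its perfectly complementary control."""
    if len(product) != len(product_matched):
        raise ValueError("replicate lists must have equal length")
    return [p / m for p, m in zip(product, product_matched)]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for n >= 2 replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are required")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical band intensities after 2 h (arbitrary units), three replicates
mismatch_product = [410.0, 455.0, 432.0]
matched_product = [820.0, 845.0, 800.0]

relative = normalize_to_matched(mismatch_product, matched_product)
avg, sem = mean_and_sem(relative)
print(f"relative cleavage = {avg:.3f} +/- {sem:.3f} (s.e.m., n = {len(relative)})")
```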


Extended Data Table 1 | X-ray crystallographic data collection and refinement statistics

*Values in parentheses denote the highest resolution shell.


Extended Data Table 2 | DNA and RNA sequences used in the study

Description | Sequence* | Used in
Forward primer for PCR amplification of ssDNA templates, T7 promoter † | 5′-TAATACGACTCACTATA-3′ | in vitro transcription of sgRNA
ssDNA template sgRNA 1 † | 5′-CACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACTTTTTTACAAATTGAGTTATCCTATAGTGAGTCGTATTA-3′ | in vitro transcription of sgRNA 1
sgRNA 1 | 5′-GGAUAACUCAAUUUGUAAAAAAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUG-3′ | Fig. 1, 2a-c, 3c,e, ED Fig. 1, 2a,c,d, 6b (crystallography); Fig. 2d,e, ED Fig. 3, 7
Reverse primer for PCR amplification of sgRNA 1 ssDNA template | 5′-CACTTTTTCAAGTTGA-3′ | in vitro transcription of sgRNA 1
target strand 1 | 5′-CAATACCATTTTTTACAAATTGAGTTAT-3′ | Fig. 1, 2a-c, 3a-c, ED Fig. 1, 2a,c,d, 6 (crystallography); Fig. 2d
target strand 1 ATTO532-5′ | ATTO532-5′-GCGCAATACCATTTTTTACAAATTGAGTTAT-3′ | ED Fig. 3b, 7a,d
target strand 2 MM FL ATTO532-5′ ‡ | ATTO532-5′-GCGCAATACCAAATTTTACAAATTGAGTTAT-3′ | ED Fig. 7b,c,d
target strand 2 MM | 5′-CAATACCACAT-3′ | Fig. 3e (crystallography)
target strand 3 MM | 5′-CAATACCACAA-3′ | Fig. 3e (crystallography)
target strand 1 distal | 5′-TTTACAAATTGAGTTAT-3′ | Fig. 3e (crystallography)
non-target strand 1 | 5′-ATAACTCAATTTGTAAAAAATGGTATTG-3′ | ED Fig. 3b, 7a,c,d
non-target strand 1 3′-ATTO532 | 5′-ATAACTCAATTTGTAAAAAATGGTATTGCGC-3′-ATTO532 | Fig. 2d
non-target strand 1 2 MM ‡ | 5′-ATAACTCAATTTGTAAAATTTGGTATTG-3′ | ED Fig. 7b,d
non-target strand 1 (product) | 5′-AAAATGGTATTG-3′ | Fig. 1, 2a-c, 3a,c, ED Fig. 1, 2a,c,d, 6a (crystallography)
non-target strand 2 MM | 5′-ATGTGGTATTG-3′ | Fig. 3e (crystallography)
non-target strand 3 MM | 5′-TTGTGGTATTG-3′ | Fig. 3e (crystallography)
pMJ879 § (target plasmid 1) | 5′-....GAATTCCTAATGCGCCATTTTTTACAAATTGAGTTATGGATCC....-3′ / 3′-....CTTAAGGATTACGCGGTAAAAAATGTTTAACTCAATACCTAGG....-5′ | Fig. 2e, 3d, ED Fig. 3a
pMJ891 § (target plasmid 1, 2 MM) | 5′-....GAATTCCTAATGCGCCAAATTTTACAAATTGAGTTATGGATCC....-3′ / 3′-....CTTAAGGATTACGCGGTTTAAAATGTTTAACTCAATACCTAGG....-5′ | Fig. 3d
ssDNA template sgRNA 2 † | 5′-CACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACGCGTCTCATCTTTATGCGTCCCTATAGTGAGTCGTATTA-3′ | in vitro transcription of sgRNA 2
sgRNA 2 | 5′-GGGACGCAUAAAGAUGAGACGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUG-3′ | ED Fig. 5
pMJ992 § (target plasmid 2, NGG PAM) | 5′-....GAATTCGTGAGACCAGCGTCTCATCTTTATGCGTCGGATCC....-3′ / 3′-....CTTAAGCACTCTGGTCGCAGAGTAGAAATACGCAGCCTAGG....-5′ | ED Fig. 5a
pMJ993 § (target plasmid 2, NAA PAM) | 5′-....GAATTCGTGAGATTAGCGTCTCATCTTTATGCGTCGGATCC....-3′ / 3′-....CTTAAGCACTCTAATCGCAGAGTAGAAATACGCAGCCTAGG....-5′ | ED Fig. 5b
pMJ994 § (target plasmid 2, NAAAA PAM) | 5′-....GAATTCGTGATTTTAGCGTCTCATCTTTATGCGTCGGATCC....-3′ / 3′-....CTTAAGCACTAAAATCGCAGAGTAGAAATACGCAGCCTAGG....-5′ | ED Fig. 5c

* sgRNA guide sequences and complementary DNA target strand sequences are shown in red. Non-target strand PAM sites (5′-NGG-3′, 5′-NAA-3′, 5′-NAAA-3′) are highlighted in bold italics. Target-strand complement of the PAM (5′-CCN-3′, 5′-TTN-3′, 5′-TTTTN-3′) is denoted in italic.

† The sequence corresponding to the T7 promoter is coloured blue.

‡ Nucleotide positions with a mismatch (MM) to the sgRNA guide sequence are underlined.

§ All inserts shown are ligated into pUC19 vector between EcoRI and BamHI restriction sites (shown in grey).
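The relationship between the guide, the target strand and the PAM in the sequences listed above can be checked with a short script. The following is a minimal sketch, assuming the PAM is read as the three nucleotides immediately 3′ of the protospacer on the non-target strand; the helper names are hypothetical and only the sequences are taken from the table.

```python
# Sketch: locate the PAM on the non-target strand for target strand 1.
# Assumption: the PAM is the 3 nt directly 3' of the protospacer, where the
# protospacer is the non-target-strand sequence matching the sgRNA guide.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_pam(target_strand: str, guide_rna: str, pam_len: int = 3) -> str:
    """Return the nucleotides 3' of the protospacer on the non-target strand."""
    non_target = reverse_complement(target_strand)
    protospacer = guide_rna.replace("U", "T")
    start = non_target.find(protospacer)
    if start < 0:
        raise ValueError("guide does not match the non-target strand")
    end = start + len(protospacer)
    return non_target[end:end + pam_len]


TARGET_STRAND_1 = "CAATACCATTTTTTACAAATTGAGTTAT"  # 5'->3'
GUIDE_SGRNA_1 = "AUAACUCAAUUUGUAAAAAA"             # 20-nt guide of sgRNA 1
NON_TARGET_PRODUCT = "AAAATGGTATTG"                # crystallization strand

non_target_full = reverse_complement(TARGET_STRAND_1)
pam = find_pam(TARGET_STRAND_1, GUIDE_SGRNA_1)
print(non_target_full, pam)
```

In this sketch the full-length reverse complement corresponds to non-target strand 1, and the shorter crystallization strand is its 3′-terminal 12 nucleotides.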
