
Supplementary Information

SI Materials and Methods

Cell lines

Human lymphocyte GM12878 was purchased from the NIGMS Human Genetic Cell Repository

(Coriell Institute) and cultured in RPMI 1640 medium (no phenol red) supplemented with 15%

fetal bovine serum. Cells were maintained at 37°C in a 5% CO2 humidified chamber.

Oligonucleotides and adaptors

Oligonucleotides for Ad1: AD1T: 5’-phos-GATCGGAAGAGCACACGTCTGAACTCCAGTCA-SpC3; AD1B: 5’-NNNNNGACTGGTTCCAATTGAAAGTGCTCTTCCGATC*T. Oligonucleotides for Ad2: AD2T: 5’-phos-AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT-SpC3; AD2B: 5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNN-SpC3. Oligonucleotide for primer extension: Bio3: 5’-bio-TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT. The above oligonucleotides were synthesized by IDT. PCR primers for Ad1 were ordered from Sigma: Pu: 5’-GACTGGTTCCAATTGAAAGTGCTCTTCCGATC*T; Pi: 5’-TGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T. “*” indicates a phosphorothioate bond. PCR primers for Ad2 and the final library (Universal and Index primers for Illumina) were purchased from New England Biolabs.
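The relationship between the listed strands can be checked with a few lines of Python. The sketch below is only a sanity check on the sequences as printed above, with the 5’ and 3’ modifications left out; the function name `reverse_complement` is illustrative, and the reading of the extra 3’ T of Bio3 as the base that pairs with the dA tail is an assumption rather than a statement from the protocol.

```python
# Sequences copied from the oligonucleotide list above; modifications omitted.
AD1T = "GATCGGAAGAGCACACGTCTGAACTCCAGTCA"
BIO3 = "TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
AD2T = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"
AD2B = "ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNN"

# N maps to itself so degenerate positions survive the translation.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Bio3 equals the reverse complement of AD1T plus one extra 3' T
# (assumed here to pair with the dA tail added to the insert).
bio3_pairs_with_ad1t = BIO3 == reverse_complement(AD1T) + "T"

# AD2B is complementary to AD2T over its full length, followed by NNNNN.
ad2_strands_pair = AD2B == reverse_complement(AD2T) + "NNNNN"

print(bio3_pairs_with_ad1t, ad2_strands_pair)
```

Both comparisons hold for the sequences as listed, which is consistent with Bio3 priming on the ligated AD1T strand and with AD2B carrying a random 5 nt overhang.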

To prepare Ad1 or Ad2, AD1T and AD1B or AD2T and AD2B were annealed, respectively. 5 nmol AD1T or AD2T and 6 nmol AD1B or AD2B were mixed in 50 µL Hybridization Buffer (10 mM Tris-HCl pH 7.5, 100 mM NaCl, 0.1 mM EDTA), boiled for 2 min, then slowly cooled to 25°C.


Damage-seq library preparation

Treatment of cells and isolation of fragmented genomic DNA

GM12878 cells were grown to ~8 × 10^5 cells/ml before treatment. Fresh stocks of cisplatin (Sigma) or oxaliplatin (LC Labs) were prepared immediately before each treatment. The drug was dissolved in DMSO to 20 mM and immediately added to the medium to a final concentration of 200 µM. Cells were incubated at 37°C for a further 1.5 hr, then transferred to a pre-chilled 15 ml tube on ice, collected by centrifugation, and washed with ice-cold PBS. Cell pellets were frozen at -80°C.

Cell pellets were resuspended in 900 µL cold lysis buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA, 250 mM NaCl, 0.5% Triton X-100 and 0.1% SDS) with 10 µL RNase A (Sigma) and incubated on ice for at least 10 min. Lysates were sonicated in ice water with a Misonix Sonicator 3000 fitted with a microtip to generate fragments averaging 400 bp in length, and then centrifuged at 14,000 rpm for 10 min to pellet debris. The supernatant was incubated with 10 µL Proteinase K (New England Biolabs) at 55°C for 30 min, followed by phenol/chloroform extraction and ethanol precipitation. The DNA pellet was dissolved in 100 µL 1x TE buffer, and fragments 200-700 bp in length were selected with 0.5x/0.7x (50/70 µL) HighPrep PCR beads (MagBio) according to the manufacturer’s guidelines. DNA concentration was determined with a NanoDrop 1000 (Thermo).

For naked DNA samples, DNA fragments were prepared from untreated GM12878 cells as described above. For untreated samples, 1 µg size-selected DNA was used to prepare a library with the NEBNext DNA preparation kit following the manufacturer’s protocol (rather than the Damage-seq protocol). For cisplatin treatment, 5 µg DNA was incubated with 20 µM cisplatin in a final volume of 50 µL at room temperature for 15 min. Treated DNA was immediately purified through a G50 spin column (GE) and subjected to Damage-seq.


End-repair, dA-tailing and first adaptor ligation

Size-selected DNA (5 µg) was used for end-repair and dA-tailing with the NEBNext DNA preparation kit following the manufacturer’s instructions. Ad1 (500 pmol) was ligated to both ends with NEBNext Quick Ligase for more than 12 hr at 16°C, then purified with HighPrep PCR beads and eluted in 126 µL 0.1x TE.

Damage-specific immunoprecipitation by antibodies

Damaged DNA immunoprecipitation was performed as described previously, with modifications. To denature the DNA, 42 µL of 8 M urea was added (to a final concentration of 2 M), followed by boiling for 2 min and immediate transfer to ice water for 2 min. Then 50 µg denatured sonicated salmon sperm DNA (Stratagene) and 20 µL of 10x PEXB Buffer were added, followed by incubation with antibody-coated beads (prepared as described below) in a total volume of 200 µL of 1x PEXB Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100 and 0.025% BSA). To prepare the beads, a slurry of 40 µL of anti-rat Dynabeads (11035, Thermo) was washed three times with 1x PEXB Buffer and then incubated with 10 µg carrier DNA and 1.5 µL anti-cisplatin modified DNA antibody (ab103261, Abcam) in 100 µL of 1x PEXB Buffer for 3 hr at 4°C. After incubation, the beads were washed with 1x PEXB Buffer and incubated with the denatured DNA overnight at 4°C.
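The stated volumes can be cross-checked against the stated final concentrations with simple dilution arithmetic. The sketch below assumes that the urea is added directly to the 126 µL eluate from the first adaptor ligation and that the 10x PEXB Buffer is brought to a 200 µL total; the helper name `final_concentration` is illustrative.

```python
def final_concentration(stock_conc: float, stock_vol: float, other_vol: float) -> float:
    """Concentration after mixing stock_vol of stock into other_vol of diluent.

    Units are arbitrary but must be consistent (e.g. M and µL).
    """
    return stock_conc * stock_vol / (stock_vol + other_vol)


# 42 µL of 8 M urea added to the 126 µL eluate of the ligation step.
urea_final_molar = final_concentration(8.0, 42.0, 126.0)

# 20 µL of 10x PEXB Buffer in a 200 µL total volume (20 µL + 180 µL).
pexb_final_x = final_concentration(10.0, 20.0, 180.0)

print(urea_final_molar, pexb_final_x)
```

Under these assumptions the urea comes to 2 M and the buffer to 1x, matching the values given in the text.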

The beads were washed sequentially with PEXU Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100 and 1.6 M urea), PEX Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100), IP Buffer (20 mM Tris-Cl pH 8.0, 2 mM EDTA, 150 mM NaCl, 1% Triton X-100, and 0.5% sodium deoxycholate), and TE Buffer (10 mM Tris-Cl pH 8.0 and 1 mM EDTA). The fragments containing damage were eluted by incubation with 100 µL of Elution Buffer (50 mM NaHCO3, 1% SDS) at 65°C for 10 min. The eluted DNA was then isolated by phenol/chloroform extraction followed by ethanol precipitation.


Primer extension and purification

NEBNext Q5 Hot Start HiFi PCR Master Mix was used for primer extension in the presence of purified DNA and 30 pmol Bio3 in a thermocycler for 45 s at 98°C followed by 5 min at 65°C. We chose this enzyme because it has high fidelity and can carry out primer extension at the same temperature used for annealing. Then 2 µL of Exo1 (New England Biolabs) was added to degrade excess primers at 37°C for 10 min. After purification with HighPrep PCR beads, the DNA was denatured by boiling for 2 min and immediate transfer to ice water for 2 min, then incubated with 5 µL of Dynabeads MyOne Streptavidin C1 (Thermo) in 30 µL of 1x B&W Buffer (5 mM Tris-HCl pH 7.5, 0.5 mM EDTA, 1 M NaCl) at 4°C for 1 hr. Biotinylated DNA was eluted by a short incubation (~10 s) in 100 µL of nonionic water at 75°C and concentrated by ethanol precipitation.

Second adaptor ligation, PCR amplification and high-throughput sequencing

To add the second adaptor to the 3’ end, purified DNA was incubated with 40 pmol of Ad2 in 10 µL of 1x Hybridization Buffer for 10 min at 65°C and then for 5 min at 16°C in a thermal cycler. To perform ligation, 4 µL of 5x ligase buffer, 1 µL of T4 DNA ligase HC (Thermo), 1 µL of 50% PEG8000 (New England Biolabs), and 4 µL of H2O were added to each reaction. The reactions were incubated overnight at 16°C.

As a quality check, one percent of the ligation products was PCR-amplified with primers Pu/Pi (Primers 1 for Ad1) or Universal/Index1 (Primers 2 for Ad2, E7350S, New England Biolabs). After purification with HighPrep PCR beads, ligated DNA was PCR-amplified with NEBNext Q5 Hot Start HiFi PCR Master Mix for 11-13 cycles with NEBNext Multiplex Oligos for Illumina (New England Biolabs). The PCR products were purified with HighPrep PCR beads, and the concentration was determined by PicoGreen (Thermo).


XR-seq library preparation for cisplatin- and oxaliplatin-induced damage

XR-seq libraries were prepared as described (1) with modifications in the damage-specific immunoprecipitation and in vitro reversal steps. GM12878 cells were treated with 200 µM cisplatin or oxaliplatin for 3 hr and collected as described above. Cells were lysed, and primary excision products were pulled down by TFIIH co-immunoprecipitation, followed by adapter ligation on both ends. The damage-specific immunoprecipitation with anti-cisplatin modified DNA antibody was performed as described for Damage-seq, with minor differences. A slurry of 25 µL of anti-rat Dynabeads and 1 µL anti-cisplatin modified DNA antibody were used per reaction. Pre-incubated beads were then incubated with ligation products and 20 µg of denatured sonicated salmon sperm DNA in 100 µL of 1x PEXB Buffer at 4°C overnight. After sequential washes with PEX Buffer twice, IP Buffer once and TE Buffer once, the ligation products containing damage were eluted by incubation with 100 µL of Elution Buffer at 65°C for 10 min. The eluted DNA was purified by phenol/chloroform extraction and ethanol precipitation. To reverse cisplatin- or oxaliplatin-induced damage in vitro, damaged DNA was incubated with 200 mM NaCN at 65°C overnight, then purified through a G50 spin column followed by ethanol precipitation. Purified DNA was amplified by PCR and purified by native PAGE as described to make the library.

Sequencing and genome alignments

All libraries were sequenced on the HiSeq 2500 platform by the University of North Carolina High-Throughput Sequencing Facility. Based on our previous experience with XR-seq, in which 5 million uniquely mapped reads are sufficient to detect repair enrichment over genes, we sequenced Damage-seq to at least 10 million mapped reads per sample. Generally, this required multiplexing ≤ 4 samples per lane for Damage-seq and ≤ 8 samples per lane for XR-seq. Summaries of the alignments are available in SI appendix Tables S3 and S4.

Damage-seq sequence analysis. Libraries were sequenced to produce paired-end 50 nt reads, allowing us to establish uniquely aligned reads and to distinguish between damage hot spots and amplification artifacts. Products of primer extension of undamaged DNA were filtered out with cutadapt (2) by filtering the adapter sequence 5’-GACTGGTTCCAATTGAAAGTGCTCTTCCGATCT-3’. Paired-end reads were aligned to the hg19 reference genome with bowtie using the command line options -q --nomaqround --phred33-quals -S -X 1000 -m 4 --seed 123. The damage position and nucleotide composition were then determined as the 2 nt upstream of the first read start using samtools and bedtools.
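The last step, taking the 2 nt immediately upstream of the read start, can be illustrated in a few lines. This is a minimal sketch and not the authors' samtools/bedtools pipeline: it assumes BED-style 0-based half-open coordinates, reports the interval on the same strand as the read, and uses hypothetical names (`Interval`, `damage_site`). Chromosome ends are only checked at the left edge.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def damage_site(read: Interval, width: int = 2) -> Optional[Interval]:
    """Return the `width` nt immediately upstream of the read's 5' end.

    For a plus-strand read the 5' end is `start`, so upstream means smaller
    coordinates; for a minus-strand read the 5' end is `end`, so upstream
    means larger coordinates.
    """
    if read.strand == "+":
        start, end = read.start - width, read.start
    elif read.strand == "-":
        start, end = read.end, read.end + width
    else:
        raise ValueError(f"unknown strand: {read.strand!r}")
    if start < 0:
        return None  # site would fall off the left end of the chromosome
    return Interval(read.chrom, start, end, read.strand)


plus_site = damage_site(Interval("chr1", 100, 150, "+"))
minus_site = damage_site(Interval("chr1", 100, 150, "-"))
print(plus_site, minus_site)
```

The resulting 2 nt intervals can then be used to extract the dinucleotide sequence from the reference genome, which is how a table of dinucleotide frequencies around the read start could be assembled.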

XR-seq. XR-seq data was analyzed as previously reported. Flanking adapter sequences were removed using trimmomatic (3). Reads were aligned to the hg19 human reference genome using bowtie (4) with the command options -q --nomaqround --phred33-quals -m 4 -n 2 -e 70 -l 20 --best -S. Uniquely aligned reads were obtained using samtools.
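For reference, the two sets of bowtie options can be written out as argument lists with plain ASCII hyphens. The sketch below only assembles the command lines; the index name and file names are hypothetical placeholders, and the `-1`/`-2` mate arguments for the paired-end run are an assumption about how the reads were supplied.

```python
from typing import List

# Options as listed in the text for each assay.
DAMAGE_SEQ_OPTIONS = ["-q", "--nomaqround", "--phred33-quals", "-S",
                      "-X", "1000", "-m", "4", "--seed", "123"]
XR_SEQ_OPTIONS = ["-q", "--nomaqround", "--phred33-quals", "-m", "4",
                  "-n", "2", "-e", "70", "-l", "20", "--best", "-S"]


def bowtie_paired(index: str, mate1: str, mate2: str, out_sam: str) -> List[str]:
    """Argument list for a paired-end Damage-seq alignment."""
    return ["bowtie", *DAMAGE_SEQ_OPTIONS, index, "-1", mate1, "-2", mate2, out_sam]


def bowtie_single(index: str, reads: str, out_sam: str) -> List[str]:
    """Argument list for a single-end XR-seq alignment."""
    return ["bowtie", *XR_SEQ_OPTIONS, index, reads, out_sam]


# Hypothetical file names, for illustration only.
damage_cmd = bowtie_paired("hg19", "sample_R1.fastq", "sample_R2.fastq", "damage.sam")
xr_cmd = bowtie_single("hg19", "xr_trimmed.fastq", "xr.sam")
print(" ".join(damage_cmd))
print(" ".join(xr_cmd))
# To execute: subprocess.run(damage_cmd, check=True)
```

Keeping the options as lists avoids shell quoting problems and makes it easy to confirm that each flag matches the text.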

Data visualization

For comparison of the DNA damage and repair signals, we normalized all count data by sequencing depth. The data is available for viewing as a track hub on the UCSC genome browser (https://genome.ucsc.edu/cgi-bin/hgGateway) by pasting the link http://trackhubs.its.unc.edu/sancarlb/Platinum_damage/hub.txt. The raw data and bigwig tracks are available under GEO accession GSE82213 (www.ncbi.nlm.nih.gov/geo/).


ENCODE data

GM12878 stranded RNA-seq (ENCODE DCC accession ENCSR00CUH) and DNase-seq (accession ENCSR000EJD), as well as chromHMM chromatin state segmentation (UCSC accession wgEncodeEH000784) and nucleosome data (MNase-seq, accession ENCSR000CXP), were downloaded from the ENCODE portal (http://genome.ucsc.edu/ENCODE/) or viewed on the UCSC browser.

Di-nucleotide frequencies

The di-nucleotide positions for the hg19 reference genome were established using oligoMatch from the UCSC tools.
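What such a dinucleotide scan produces can be illustrated on a short sequence. The sketch below is not oligoMatch and makes no claim about its output format; it is a plain Python scan, with hypothetical function names, that reports every occurrence of a dinucleotide on the plus strand and of its reverse complement (that is, the same dinucleotide on the minus strand).

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def dinucleotide_positions(seq: str, dinuc: str = "GG") -> List[Tuple[int, str]]:
    """Return (0-based start, strand) for every occurrence of `dinuc`.

    Overlapping matches are reported, so "GGG" contains two G-G sites.
    Positions are always given in plus-strand coordinates.
    """
    seq = seq.upper()
    minus_motif = reverse_complement(dinuc)
    hits = []
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair == dinuc:
            hits.append((i, "+"))
        if pair == minus_motif:
            hits.append((i, "-"))
    return hits


print(dinucleotide_positions("AGGCCT"))  # [(1, '+'), (3, '-')]
```

Counting these positions per interval gives the expected G-G frequency against which damage levels can be compared.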

Chromatin state analysis

Bedtools (5) coverage was used to calculate the damage and repair levels over each of the 15

predicted chromatin states defined by the ChromHMM algorithm (6). Merged data from two

biological replicates was used. Values were normalized per million mapped reads and per Kb of

interval length and plotted with R.
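The normalization described above amounts to dividing a raw count by the interval length in kb and by the library size in millions of mapped reads. A minimal sketch, with an illustrative function name and made-up example numbers:

```python
def per_kb_per_million(count: float, interval_length_bp: float,
                       total_mapped_reads: float) -> float:
    """Normalize a read count per kb of interval and per million mapped reads."""
    if interval_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("interval length and library size must be positive")
    per_kb = count / (interval_length_bp / 1_000)
    return per_kb / (total_mapped_reads / 1_000_000)


# Illustrative numbers only: 500 reads over 2,000 bp in a library of
# 10 million mapped reads.
example = per_kb_per_million(500, 2_000, 10_000_000)
print(example)  # 25.0
```

For a chromatin state, the count and the length would be summed over all intervals assigned to that state before normalizing.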

Plotting average damage and repair profiles

Average damage and repair profiles from the merged biological replicates were calculated over GM12878 DNase peaks using bedtools (5) coverage. Counts were normalized per million mapped reads and plotted with R. For plots, data was binned into 50 nt windows.
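The binning and per-million scaling can be sketched as follows. This is an illustration in Python rather than the bedtools and R workflow used here; it assumes each read has already been reduced to an offset (in nt) relative to the center of its nearest peak, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def binned_profile(offsets: Iterable[int], total_mapped_reads: int,
                   bin_size: int = 50) -> Dict[int, float]:
    """Count offsets in fixed-width bins and scale to reads per million.

    Each key is the left edge of a bin; floor division keeps negative
    offsets in the correct bin (e.g. -1 falls in [-50, 0)).
    """
    counts = Counter((offset // bin_size) * bin_size for offset in offsets)
    scale = 1_000_000 / total_mapped_reads
    return {edge: n * scale for edge, n in sorted(counts.items())}


# Illustrative offsets and library size only.
profile = binned_profile([-60, -10, -1, 0, 49, 50], total_mapped_reads=2_000_000)
print(profile)  # {-100: 0.5, -50: 1.0, 0: 1.0, 50: 0.5}
```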

For average XR-seq profiles relative to the annotated TSS or TES, we limited the gene list to genes that were at least 10,000 bp in length and had no overlapping or neighboring genes within 6,000 bp upstream or downstream on either strand. The highest quartile of expressed genes from GM12878 cells (n=442) was identified as previously described. Briefly, we calculated FPKM for the two mapped RNA-seq replicates using cufflinks (7) and the UCSC hg19 genes.gtf. Merged biological replicates were used for plotting. Read counts were calculated from the aligned .bam files using bedtools coverage, normalized per million mapped reads and plotted with R. For plots, data was binned into 50 nt windows.
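The gene selection rule (minimum length, no other gene within a flanking distance on either strand) can be expressed compactly. The sketch below is an assumed reading of that rule with hypothetical names and toy coordinates; it uses a simple all-against-all comparison, which is adequate for illustration but slow for a full annotation.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    strand: str


def isolated_long_genes(genes: List[Gene], min_length: int = 10_000,
                        min_gap: int = 6_000) -> List[Gene]:
    """Keep genes of at least min_length with no other gene within min_gap.

    Strand is ignored when looking for neighbors, so a gene on the opposite
    strand also disqualifies its neighbor.
    """
    kept = []
    for gene in genes:
        if gene.end - gene.start < min_length:
            continue
        lo, hi = gene.start - min_gap, gene.end + min_gap
        crowded = any(
            other is not gene
            and other.chrom == gene.chrom
            and other.start < hi
            and other.end > lo
            for other in genes
        )
        if not crowded:
            kept.append(gene)
    return kept


# Toy annotation: A and B are 3 kb apart, C is too short, D is isolated.
toy_genes = [
    Gene("A", "chr1", 10_000, 30_000, "+"),
    Gene("B", "chr1", 33_000, 50_000, "-"),
    Gene("C", "chr1", 100_000, 105_000, "+"),
    Gene("D", "chr2", 10_000, 25_000, "+"),
]
print([g.name for g in isolated_long_genes(toy_genes)])  # ['D']
```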

For nucleosome analysis, nucleosome positions were determined with DANPOS2 (8). The average damage and repair signal from merged biological replicates surrounding 2,000,000 randomly picked nucleosome center positions was calculated using the R packages GenomicRanges (9) and genomation (10). For plotting, data was binned into 5 nt windows.

References

1. Hu J, Adar S, Selby CP, Lieb JD, & Sancar A (2015) Genome-wide analysis of human global and transcription-coupled excision repair of UV damage at single-nucleotide resolution. Genes Dev 29:948-960.

2. Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17(1):10-12.

3. Bolger AM, Lohse M, & Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15):2114-2120.

4. Langmead B, Trapnell C, Pop M, & Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25.

5. Quinlan AR & Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6):841-842.

6. Ernst J, et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473:43-49.

7. Trapnell C, et al. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5):511-515.

8. Chen K, et al. (2013) DANPOS: dynamic analysis of nucleosome position and occupancy by sequencing. Genome Res 23(2):341-351.

9. Lawrence M, et al. (2013) Software for computing and annotating genomic ranges. PLoS Comput Biol 9(8):e1003118.

10. Akalin A, Franke V, Vlahovicek K, Mason CE, & Schubeler D (2015) Genomation: a toolkit to summarize, annotate and visualize genomic intervals. Bioinformatics 31(7):1127-1129.

SI appendix Figure S1. Detailed schematic of Damage-seq.

SI appendix Figure S2. Damage-seq and XR-seq of oxaliplatin. A) Agarose gel analysis of Damage-seq libraries. DNA fragments from oxaliplatin-treated cells were amplified with sets of primers complementary to the 1st and 2nd adapters. B) Native polyacrylamide gel electrophoresis of XR-seq libraries showing that oxaliplatin adduct reversal by NaCN is necessary for PCR amplification of sequencing libraries.

SI appendix Figure S3. Single nucleotide frequencies in Damage-seq reads. Supplemental to main Figure 2a. Nucleotide frequencies are plotted for positions 3 nt upstream of the read start and 10 nt into the read for each damage type and an undamaged control. A) Data for the second replicate of cisplatin Damage-seq, not shown in the main Figure. Left panel is from main Figure 2a. B) Data for two biological replicates of oxaliplatin Damage-seq. C) Data for the second replicate of sequenced undamaged DNA, not shown in the main Figure. Left panel is from main Figure 2c.

Panels: A, Cisplatin (Rep1, Rep2); B, Oxaliplatin (Rep1, Rep2); C, Control (Rep1, Rep2). Legend: G, A, T, C.

SI appendix Figure S4. Sequence context for the 5 nt flanking G-G di-nucleotides at the -1 and -2 positions relative to the read start for the second biological replicate of cisplatin Damage-seq (A), randomly selected 26-mers in the hg19 reference genome (B) and oxaliplatin Damage-seq (C).

Panels: A, Cisplatin Damage-seq replicate 2; B, Random genomic loci; C, Oxaliplatin Damage-seq (Rep1, Rep2). x-axis: position relative to dimers (-5 to 5 around G G); y-axis: nucleotide frequency (%), 0-100. Legend: G, A, T, C.

SI appendix Figure S5. Distribution of XR-seq fragment sizes from the cisplatin and oxaliplatin XR-seq biological replicates in GM12878. Reads of 50 nt likely reflect a small fraction of contaminant DNA.

Panels: read-length histograms for Cisplatin (Rep1, Rep2) and Oxaliplatin (Rep1, Rep2). x-axis: read length (nt), up to 50; y-axis: frequency.

SI appendix Figure S6. Single nucleotide resolution mapping of repair. Supplemental to main Fig 2e,f. A) Frequency of the relevant di-nucleotide, G-G, at each position of 26 nt XR-seq excision fragments in the second replicate of cisplatin XR-seq and for oxaliplatin XR-seq. B) The corresponding nucleotide frequencies at the 5 nt flanking the G-G dimer at positions 19-20.

Panel titles: Cisplatin XR-seq rep2; Oxaliplatin XR-seq rep1. Axes appearing in the panels: position along excised fragment (1-2 to 25-26) against GG dimer frequency (%); position in excised oligo (14 to 25) against % of reads; position relative to the G-G dimer (-5 to 5) against nucleotide frequency (%). Legend: G, A, T, C.

SI appendix Figure S7. Whole genome map of damage and repair of cisplatin damage. Screenshots of damage and repair signals, separated by strand, for all the chromosomes of the human genome.

Panels: UCSC genome browser views (hg19) of chromosomes 1-22 and X, each showing the tracks HLCisP_P, GMCisPXR3hRep1PLU, HLCisP_M and GMCisPXR3hRep1MIN, labeled Damage+, Repair+, Damage- and Repair-.

SI appendix Figure S8. Whole genome map of damage and repair of oxaliplatin damage. Screenshots of damage and repair signals, separated by strand, for all the chromosomes of the human genome.

Panels: UCSC genome browser views (hg19) of chromosomes 1-22 and X, each showing the tracks HLOxP_P, GMOxPXR3hRep1PLU, HLOxP_M and GMOxPXR3hRep1MIN, labeled Damage+, Repair+, Damage- and Repair-.

SI appendix Figure S9. Genome-wide patterns of damage and repair of oxaliplatin damage. A) Representative screen shot of damage and repair signals, separated by strand, for the entire chromosome 17. B) Zoom-in on a ~80 kbp segment of chromosome 17 which includes TP53. C) Representative XR-seq and Damage-seq reads that capture a specific Pt-d(GpG) damage.

Tracks: Damage+, Damage-, Repair+, Repair-, RNA+, RNA-, DNase HS. Example reads in panel C: Damage-seq: GG…50 nt; XR-seq: TCTTTTTGAAAGCTGGTCTGGTCCTTT.

SI appendix Figure S10. Repair and damage at transcribed genes. Supplemental to main Fig 3d-f. A) Oxaliplatin damage and repair profiles at the transcribed and non-transcribed strands are plotted surrounding the TSS of highly expressed genes, B) similar to a, except with a zoomed-in scale for the damage levels. C) Cisplatin damage and repair are plotted surrounding the TES of highly expressed genes, D) similar to c, except with a zoomed-in scale for the damage levels. E) Same as c, except for oxaliplatin damage and repair, F) same as d, except for oxaliplatin damage. G) G-G frequency plotted surrounding the TES of highly expressed genes.

Legend: Damage TS, Damage NTS, Repair TS, Repair NTS, GG frequency.

SI appendix Figure S11. Open regions in the genome have higher repair but little difference in damage. A) Plotting cisplatin damage and repair around DNase-HS sites in GM12878. B) Same as A, except plotted is oxaliplatin damage and repair.

Panels: A, Cisplatin; B, Oxaliplatin. Legend: Repair, Damage.

SI appendix Figure S12. Damage and repair at different chromatin states. A) Analysis of oxaliplatin repair (top) and damage (bottom) levels across the 15 annotated chromatin states in GM12878. B) Small variations in cisplatin and oxaliplatin damage levels at the different chromatin states are observed when plotting damage on a smaller scale. C) Variations in the frequency of G-G in the different states mirror the variation in damage levels.

Chromatin states: 1. Active promoter; 2. Weak promoter; 3. Poised promoter; 4. Strong enhancer; 5. Strong enhancer; 6. Weak enhancer; 7. Weak enhancer; 8. Insulator; 9. Txn transition; 10. Txn elongation; 11. Weak txn; 12. Repressed; 13. Heterochromatic; 14. Repetitive; 15. Repetitive. x-axis: chromatin state (1-15); y-axis: counts per kb per million reads.

Table S1. Dinucleotide frequencies flanking the read start of Damage-seq reads

Cisplatin Rep1
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    5.9        1.8        1.2        4.0       5.9
TC    2.0        2.3        0.7        2.5       4.4
CT    7.2        1.6        5.5        0.9       6.5
CC    2.5        1.0        0.8        0.7       8.7
AA    10.0       2.2        1.2        2.4       8.2
AC    1.8        1.0        0.6        2.0       5.1
AG    9.0        33.6       5.3        3.0       8.1
AT    6.3        1.5        1.4        2.5       5.4
CA    11.3       2.1        1.1        1.2       7.5
CG    1.6        3.5        0.6        0.4       1.4
GA    9.7        2.6        6.9        20.3      5.9
GC    1.8        3.7        1.1        19.1      4.6
GG    9.5        21.1       64.1       18.7      6.5
GT    5.2        1.4        5.0        15.7      8.8
TA    7.4        1.6        0.7        2.9       5.6
TG    8.8        18.9       3.7        3.7       7.2

Cisplatin Rep2
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    6.0        1.9        1.3        3.8       5.8
TC    2.1        2.1        0.8        2.2       4.1
CT    7.1        1.7        4.9        1.0       6.5
CC    2.5        1.0        0.8        0.7       8.4
AA    10.4       2.6        1.3        2.7       8.7
AC    1.9        1.1        0.7        2.1       5.0
AG    8.9        33.6       6.0        3.2       8.1
AT    6.5        1.7        1.5        2.7       5.5
CA    11.2       2.3        1.2        1.3       7.7
CG    1.5        3.3        0.7        0.4       1.4
GA    9.8        2.9        7.3        20.5      6.0
GC    1.9        3.4        1.2        18.9      4.5
GG    8.9        20.2       62.7       19.0      6.5
GT    5.0        1.6        4.6        15.2      9.2
TA    7.6        1.8        0.8        2.9       5.7
TG    8.6        18.7       4.1        3.5       7.0

Oxaliplatin Rep1
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    9.0        3.9        3.8        5.7       6.1
TC    3.3        3.8        2.3        3.4       4.0
CT    9.0        3.0        7.6        2.5       7.8
CC    3.5        2.3        2.1        1.6       10.0
AA    8.8        3.9        3.2        3.8       9.1
AC    2.9        2.1        2.0        1.7       5.2
AG    6.9        22.3       4.1        3.6       8.3
AT    9.1        3.1        3.2        2.5       6.3
CA    8.7        3.1        2.5        3.4       9.3
CG    1.0        3.8        0.6        1.0       1.6
GA    6.3        2.8        3.7        16.8      4.7
GC    2.5        4.6        2.1        22.0      4.4
GG    7.4        13.6       51.9       10.2      5.4
GT    6.1        2.2        4.8        11.6      5.7
TA    7.6        2.7        2.1        4.7       6.0
TG    7.8        22.8       3.9        5.4       6.1

Oxaliplatin Rep2
      (-3)-(-4)  (-2)-(-3)  (-1)-(-2)  (-1)-(1)  (1)-(2)
TT    8.8        2.0        1.7        4.6       5.8
TC    2.4        2.6        1.0        2.9       4.2
CT    9.8        1.6        7.2        1.1       9.2
CC    2.8        1.1        0.9        0.8       12.5
AA    8.4        2.1        1.5        2.1       8.9
AC    2.3        1.0        0.8        1.2       4.9
AG    6.7        27.1       3.3        2.0       7.7
AT    9.6        1.6        1.5        1.5       6.0
CA    9.3        1.8        1.2        1.5       11.0
CG    1.1        4.9        0.4        0.4       1.9
GA    6.4        1.8        3.2        20.3      3.9
GC    1.9        4.9        1.1        29.7      3.8
GG    8.2        16.0       67.9       11.0      4.3
GT    6.6        1.2        4.5        13.4      5.3
TA    7.7        1.5        0.9        3.6       5.7
TG    8.0        28.8       2.8        3.8       5.1

TableS2.DinucleotidefrequenciesatspecificpositioninXR-seqreads.Onlyreadsof26ntlengthwereusedforthisanalysis.Cisplatin XR-seq Rep1 (17-18) (18-19) (19-20) (20-21) (21-22) TT 5.350120035 2.317744776 0.718172576 1.332937183 2.958202774 TC 3.021537512 1.102780437 0.456586363 0.946426107 4.74323262 CT 9.326330915 4.67418824 1.051472956 1.374649293 3.707487615 CC 5.759760449 2.035778312 0.639850027 0.913014126 4.651138201 AA 6.109291016 3.49313812 2.509942972 5.042909549 4.295448622 AC 3.619173191 1.50948673 2.22282197 3.125262683 7.512144885 AG 10.9638881 16.73015119 12.42875503 7.419283895 9.434655521 AT 6.904649499 4.174673823 3.04189133 4.624519637 6.423850022 CA 8.522170066 3.969999001 1.06146483 2.428395532 2.105985767 CG 2.709965494 4.207795037 3.245138742 1.211739448 1.255883443 GA 5.322919932 9.759365003 15.68422928 18.63592941 6.936977706 GC 2.487289437 1.349881076 2.608540039 6.73579211 12.21517211 GG 6.76671934 19.05072436 37.55327017 33.65717932 19.09367884 GT 3.991198692 3.472519967 3.31349585 6.70629229 8.329178864 TA 5.953068854 2.980909176 0.956338681 1.558864564 1.20838239 TG 13.19191747 19.17086475 12.50802918 4.286804858 5.12858062 Cisplatin XR-seq Rep2

(17-18) (18-19) (19-20) (20-21) (21-22) TT 5.286587879 2.296428978 0.607421775 1.185720056 2.938130453 TC 2.984899054 1.068941063 0.384098469 0.883939507 4.362072401 CT 9.366312668 4.826545495 1.006859308 1.299078853 3.727651799 CC 5.713205613 2.066032082 0.619936586 0.868082231 4.495175061 AA 6.113653659 3.309740021 2.241576275 4.337871917 4.290170536 AC 3.741462139 1.479183309 2.009987493 3.111549979 6.878378578 AG 10.80339128 16.64304053 12.15094883 7.16961352 8.342948318 AT 7.040863838 4.114652253 2.692316761 4.182408425 6.013483349 CA 8.327661075 3.843083442 0.9882296 2.104068816 2.519182252 CG 2.677988209 4.288330884 3.346869885 1.213884859 1.293456257 GA 5.250520349 9.069817877 14.69166278 17.77608438 7.722726716 GC 2.584425097 1.347738926 2.471092211 7.171893652 12.98866386 GG 6.936547835 19.61291145 39.95334022 36.30616669 19.33354355 GT 4.234203678 3.470955663 3.118597657 7.034101176 8.693311209 TA 5.854781034 2.872188022 0.879975188 1.30695567 1.35149596 TG 13.08349659 19.69041 12.83708696 4.048580269 5.049609696


Oxaliplatin XR-seq Rep1

      (17-18)      (18-19)      (19-20)      (20-21)      (21-22)
TT    6.94017857   2.807392784  0.608301687  0.774051157  2.356301081
TC    2.947505461  1.088990394  0.305937101  0.628042286  3.588398813
CT    10.9939323   4.530829171  1.088108864  1.614634478  4.794398678
CC    5.516208983  1.825137043  0.59115826   1.330805668  6.719562985
AA    6.147870321  2.928139115  1.688499985  3.525763931  5.357903838
AC    3.716893977  1.285269876  1.340087086  2.302179903  6.640712627
AG    7.919834755  13.95087561  8.510168906  4.815346567  10.48503293
AT    9.119200241  4.952432209  2.064913042  2.523675406  4.796022546
CA    7.708244794  3.097044789  0.803537253  2.103279354  2.689147486
CG    2.918159852  5.5469542    3.373613041  1.648318126  1.893730875
GA    4.104789575  5.312838548  10.10432205  20.81930138  5.844159309
GC    2.819312705  1.657020106  4.459842856  11.83581217  14.83678858
GG    6.195101673  19.23135275  46.99375274  34.59902192  17.84467802
GT    4.678734014  3.950017283  2.741324475  6.379177196  6.806323896
TA    5.155826345  2.265646567  0.570581508  0.831327277  1.063169135
TG    13.11820643  25.57005955  14.75585115  4.269263184  4.283669207

Oxaliplatin XR-seq Rep2

      (17-18)      (18-19)      (19-20)      (20-21)      (21-22)
TT    6.517572508  2.652764575  0.646059225  0.758311531  2.092331946
TC    3.013541828  1.16317109   0.387253983  0.70420062   3.551743993
CT    11.0024725   4.682189634  1.218973125  1.763590814  4.923139092
CC    5.8261404    2.082799034  0.751484748  1.521219022  7.316419627
AA    5.814715533  2.822620419  1.892261316  3.919724208  4.896712329
AC    3.772981164  1.351974479  1.632437155  2.794104512  6.944905726
AG    7.781180187  13.31539748  8.329743634  5.210733385  10.94644667
AT    8.92336342   4.655827833  2.14076898   2.669666758  4.6862521
CA    7.440775504  3.151969493  0.930836044  2.249916969  2.692191292
CG    3.115845337  5.94486785   3.58089073   1.929001975  2.142649024
GA    4.20493654   5.935632548  11.17454178  20.48639986  5.858069853
GC    3.249214947  1.884240044  4.692573928  12.05487488  15.13516455
GG    6.438812464  18.96349996  45.19496568  32.98842686  17.23601636
GT    4.90421647   3.861418443  2.608322394  5.798235672  6.232228598
TA    4.685344627  2.084988624  0.596630849  0.818275787  0.956284627
TG    13.30888657  25.44663849  14.22225643  4.333317146  4.38944421
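The tabulation behind Table S2 can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes that the column labels such as (19-20) are 1-based positions counted from the 5' end of the read, that the values are percentages of the 26-nt reads (each column of the table sums to approximately 100), and that reads containing a non-ACGT base at the position of interest are skipped. The function name dinucleotide_percentages and its parameters are hypothetical.

```python
from collections import Counter


def dinucleotide_percentages(reads, start, read_length=26):
    """Percent of each dinucleotide at 1-based positions (start, start + 1).

    Only reads of exactly `read_length` nt are counted, mirroring the
    26-nt restriction stated in the Table S2 caption.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        if len(read) != read_length:
            continue  # keep only reads of the requested length
        dinuc = read[start - 1:start + 1]  # convert 1-based start to a 0-based slice
        if len(dinuc) == 2 and set(dinuc) <= set("ACGT"):
            counts[dinuc] += 1  # assumption: ambiguous bases (e.g. N) are ignored
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: 100.0 * n / total for d, n in counts.items()}


# Toy input: three 26-nt reads and one 25-nt read that is filtered out.
example_reads = [
    "A" * 18 + "GG" + "A" * 6,
    "A" * 18 + "TG" + "A" * 6,
    "A" * 18 + "GG" + "A" * 6,
    "A" * 25,
]
example_result = dinucleotide_percentages(example_reads, start=19)
```

Calling the function once per start position (17 through 21) would give the five columns of one replicate.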


Table S3. Summary statistics for all Damage-seq sequencing samples in this study.

Sample Name       Total Reads  Reads after filtering  Total Aligned Pairs  Unique Aligned Pairs  % Aligned  Average Fragment Length
HLCisP_rep1       34514632     29424268               24378977             23654148              68.5       176.7
HLCisP_rep2       40418699     34552648               27785100             26970800              66.7       164.6
HLOxP_Rep1        29651260     14604091               10514412             10149131              34.2       209.3
HLOxP_Rep2        33016641     25643907               18324749             16370527              49.6       202.8
HLunD_Rep1        6889258      6788256                3171901              3151831               45.7       161.6
HLunD_Rep2        9175301      9060212                3733693              3711577               40.5       157.2
HLCisPvitro_Rep1  41928298     37969998               24872196             24335674              58.0       185.4
HLCisPvitro_Rep2  34580644     32120669               22012872             21586194              62.4       191.6
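The text does not define the % Aligned column, but the tabulated values appear consistent with unique aligned pairs divided by total reads, expressed as a percentage. The sketch below recomputes that ratio from the numbers in Table S3 as a consistency check. The definition is an inference from the data, not a statement from the authors, and the names TABLE_S3, percent_aligned and check_table are hypothetical.

```python
# (sample, total reads, unique aligned pairs, reported % aligned), copied from Table S3
TABLE_S3 = [
    ("HLCisP_rep1", 34514632, 23654148, 68.5),
    ("HLCisP_rep2", 40418699, 26970800, 66.7),
    ("HLOxP_Rep1", 29651260, 10149131, 34.2),
    ("HLOxP_Rep2", 33016641, 16370527, 49.6),
    ("HLunD_Rep1", 6889258, 3151831, 45.7),
    ("HLunD_Rep2", 9175301, 3711577, 40.5),
    ("HLCisPvitro_Rep1", 41928298, 24335674, 58.0),
    ("HLCisPvitro_Rep2", 34580644, 21586194, 62.4),
]


def percent_aligned(unique_aligned_pairs, total_reads):
    """Assumed definition: unique aligned pairs as a percentage of total reads."""
    return 100.0 * unique_aligned_pairs / total_reads


def check_table(rows, tolerance=0.06):
    """Return the rows whose recomputed percentage differs from the reported one.

    The tolerance allows for the reported values being rounded to one decimal.
    """
    mismatches = []
    for name, total, unique, reported in rows:
        computed = percent_aligned(unique, total)
        if abs(computed - reported) > tolerance:
            mismatches.append((name, reported, computed))
    return mismatches


mismatches = check_table(TABLE_S3)  # an empty list means every row is consistent
```

The same check applies to the Percent Unique Aligned column of Table S4, using unique aligned reads and total FASTQ reads.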

Table S4. Summary statistics for all XR-seq sequencing samples in this study.

Sample Name       Mean Length  Total FASTQ Reads  Unique Aligned Reads  Percent Unique Aligned
GMCisP_XR3h_Rep1  26.4         80338902           35203560              43.8
GMCisP_XR3h_Rep2  27.1         83700727           35203560              42.1
OXP_XR_Rep1       26.74        99390886           37221727              37.4
OXP_XR_Rep2       27.97        124668994          39786126              31.9
