Supplementary Information
SI Materials and Methods
Cell lines
The human lymphocyte cell line GM12878 was purchased from the NIGMS Human Genetic Cell
Repository (Coriell Institute) and cultured in RPMI 1640 medium (no phenol red) supplemented
with 15% fetal bovine serum. Cells were maintained at 37°C in a humidified chamber with 5% CO2.
Oligonucleotides and adaptors
Oligonucleotides for Ad1: AD1T: 5’-phos-GATCGGAAGAGCACACGTCTGAACTCCAGTCA-SpC3;
AD1B: 5’-NNNNNGACTGGTTCCAATTGAAAGTGCTCTTCCGATC*T. Oligonucleotides for Ad2: AD2T:
5’-phos-AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT-SpC3; AD2B:
5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNN-SpC3. Oligonucleotide for primer extension:
Bio3: 5’-bio-TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT. The above oligonucleotides were
synthesized by IDT. PCR primers for Ad1 were ordered from Sigma:
Pu: 5’-GACTGGTTCCAATTGAAAGTGCTCTTCCGATC*T; Pi: 5’-TGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T.
“*” indicates a phosphorothioate bond. PCR primers for Ad2 and the final library (Universal
and Index primers for Illumina) were purchased from New England Biolabs.
To prepare Ad1 or Ad2, AD1T and AD1B, or AD2T and AD2B, respectively, were annealed: 5 nmol
AD1T or AD2T and 6 nmol AD1B or AD2B were mixed in 50 µL Hybridization Buffer (10 mM
Tris-HCl pH 7.5, 100 mM NaCl, 0.1 mM EDTA), boiled for 2 min, and then slowly cooled to
25°C.
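As a quick check on the annealing mix, the sketch below converts the stated amounts (5 nmol top strand, 6 nmol bottom strand, 50 µL) into concentrations. The function names are hypothetical, and the duplex concentration assumes complete annealing limited by the top strand, which the text does not state.

```python
def concentration_uM(amount_nmol: float, volume_uL: float) -> float:
    """Concentration in µM from an amount in nmol and a volume in µL."""
    # nmol per µL equals mM, so multiply by 1000 to express it in µM.
    return amount_nmol / volume_uL * 1000.0


def volume_for_pmol_uL(pmol: float, stock_uM: float) -> float:
    """Volume (µL) of a stock that contains the requested pmol."""
    # 1 µM equals 1 pmol per µL.
    return pmol / stock_uM


top_uM = concentration_uM(5, 50)      # AD1T or AD2T
bottom_uM = concentration_uM(6, 50)   # AD1B or AD2B
excess = bottom_uM / top_uM           # molar excess of the bottom strand

# Assumption: every top strand ends up in a duplex, so the adaptor
# concentration equals the top-strand concentration.
adaptor_uM = min(top_uM, bottom_uM)
```

Under that assumption the annealed adaptor is about 100 µM, so 500 pmol corresponds to roughly 5 µL of the annealing mix.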
Damage-seq library preparation
Treatment of cells and isolation of fragmented genomic DNA
GM12878 cells were grown to ~8 × 10⁵ cells/ml before treatment. Cisplatin (Sigma) and
oxaliplatin (LC Labs) stocks were prepared fresh before each treatment: the drug was
dissolved in DMSO to 20 mM and immediately added to the medium to a final concentration of
200 µM. Cells were incubated at 37°C for a further 1.5 hr, then transferred to a pre-chilled
15 ml tube on ice, collected by centrifugation, and washed with ice-cold PBS. Cell pellets
were frozen at -80°C.
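The dilution from a 20 mM DMSO stock to 200 µM in medium can be sketched as follows. The 10 ml culture volume is a hypothetical example (the text gives no culture volume), and the function names are illustrative only.

```python
def stock_volume_uL(stock_mM: float, final_uM: float, medium_mL: float) -> float:
    """Stock volume (µL) to add to medium_mL of culture to reach final_uM.

    Solves C_stock * V = C_final * (V_medium + V), i.e. it accounts for the
    volume added by the stock itself.
    """
    final_mM = final_uM / 1000.0
    if final_mM >= stock_mM:
        raise ValueError("final concentration must be below the stock concentration")
    return final_mM * medium_mL / (stock_mM - final_mM) * 1000.0


def solvent_fraction(stock_mM: float, final_uM: float) -> float:
    """Fraction (v/v) of stock solvent in the treated culture."""
    return (final_uM / 1000.0) / stock_mM


# Hypothetical 10 ml culture; 20 mM stock and 200 µM final are from the text.
drug_uL = stock_volume_uL(20, 200, 10)
dmso = solvent_fraction(20, 200)
```

For any culture volume, a 100-fold dilution of the stock means the treated medium contains about 1% (v/v) DMSO.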
Cell pellets were resuspended in 900 µL cold lysis buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA,
250 mM NaCl, 0.5% Triton X-100 and 0.1% SDS) with 10 µL RNase A (Sigma) and incubated on ice
for at least 10 min. Lysates were sonicated in ice water with a Misonix Sonicator 3000 fitted
with a microtip to generate fragments averaging 400 bp in length, and then centrifuged at
14,000 rpm for 10 min to pellet debris. The supernatant was incubated with 10 µL Proteinase K
(New England Biolabs) at 55°C for 30 min, followed by phenol/chloroform extraction and ethanol
precipitation. The DNA pellet was dissolved in 100 µL 1x TE buffer, and fragments 200-700 bp
in length were selected with 0.5x/0.7x (50/70 µL) HighPrep PCR beads (MagBio) according to the
manufacturer’s guidelines. DNA concentration was determined with a NanoDrop 1000 (Thermo).
For naked DNA samples, DNA fragments were prepared from untreated GM12878 cells as
described above. For untreated samples, 1 µg size-selected DNA was used to prepare a library
with the NEBNext DNA preparation kit following the manufacturer’s protocol (rather than the
Damage-seq protocol). For cisplatin treatment, 5 µg DNA was incubated with 20 µM cisplatin in
a final volume of 50 µL at room temperature for 15 min. Treated DNA was immediately purified
through a G50 spin column (GE) and subjected to Damage-seq.
End-repair, dA-tailing and first adaptor ligation
Size-selected DNA (5 µg) was end-repaired and dA-tailed with the NEBNext DNA preparation
kit following the manufacturer’s instructions. Ad1 (500 pmol) was ligated to both ends with
NEBNext Quick Ligase for more than 12 hr at 16°C, then purified with HighPrep PCR beads and
eluted in 126 µL 0.1x TE.
Damage-specific immunoprecipitation by antibodies
Damaged DNA immunoprecipitation was performed as described previously, with modifications.
To denature the DNA, 42 µL of 8 M urea was added (to a final concentration of 2 M), followed
by boiling for 2 min and immediate transfer to ice water for 2 min. Then 50 µg denatured
sonicated salmon sperm DNA (Stratagene) and 20 µL of 10x PEXB Buffer (see below) were added,
followed by incubation with antibody-coated beads, prepared as described below, in a total
volume of 200 µL of 1x PEXB Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100 and 0.025% BSA). A
slurry of 40 µL of anti-rat Dynabeads (11035, Thermo) was washed three times with 1x PEXB
Buffer, and then incubated with 10 µg carrier DNA and 1.5 µL anti-cisplatin modified DNA
antibody (ab103261, Abcam) in 100 µL of 1x PEXB Buffer for 3 hr at 4°C. After incubation, the
beads were washed with 1x PEXB Buffer and incubated with the denatured DNA overnight at 4°C.
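The volumes in this step can be checked with a short calculation. The sketch below assumes the denaturation starts from the full 126 µL eluate of the previous step and that the salmon sperm DNA is added in whatever volume brings the total to 200 µL; neither assumption is spelled out in the text.

```python
def diluted_concentration(stock_conc: float, stock_uL: float, total_uL: float) -> float:
    """Concentration after bringing stock_uL of stock to total_uL."""
    return stock_conc * stock_uL / total_uL


eluate_uL = 126.0     # assumed: the whole eluate from the ligation cleanup
urea_uL = 42.0        # 8 M urea added for denaturation
ip_total_uL = 200.0   # total immunoprecipitation volume
pexb_10x_uL = 20.0    # 10x PEXB Buffer

denatured_uL = eluate_uL + urea_uL
urea_denaturing_M = diluted_concentration(8.0, urea_uL, denatured_uL)
urea_in_ip_M = diluted_concentration(8.0, urea_uL, ip_total_uL)
pexb_fold = diluted_concentration(10.0, pexb_10x_uL, ip_total_uL)

# Volume left for the salmon sperm DNA under the assumptions above.
carrier_uL = ip_total_uL - denatured_uL - pexb_10x_uL
```

The result reproduces the stated 2 M urea during denaturation and 1x PEXB in the 200 µL reaction, and it implies a urea concentration of about 1.7 M during the overnight binding.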
The beads were washed sequentially with PEXU Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100
and 1.6 M urea), PEX Buffer (1x PBS, 2 mM EDTA, 0.01% Triton X-100), IP Buffer (20 mM Tris-Cl
pH 8.0, 2 mM EDTA, 150 mM NaCl, 1% Triton X-100, and 0.5% sodium deoxycholate), and TE
Buffer (10 mM Tris-Cl pH 8.0 and 1 mM EDTA). The fragments containing damage were eluted
by incubation with 100 µL of Elution Buffer (50 mM NaHCO3, 1% SDS) at 65°C for 10 min. The
eluted DNA was then isolated by phenol/chloroform extraction followed by ethanol
precipitation.
Primer extension and purification
NEBNext Q5 Hot Start HiFi PCR Master Mix was used for primer extension in the presence of
purified DNA and 30 pmol Bio3 in a thermocycler for 45 s at 98°C followed by 5 min at 65°C. We
chose this enzyme because it has high fidelity and can carry out primer extension at the same
temperature used for annealing. Then 2 µL of Exo1 (New England Biolabs) was added to degrade
excess primer at 37°C for 10 min. After purification with HighPrep PCR beads, the DNA was
denatured by boiling for 2 min, immediately placed in ice water for 2 min, and then
incubated with 5 µL of Dynabeads MyOne Streptavidin C1 (Thermo) in 30 µL of 1x B&W Buffer (5
mM Tris-HCl pH 7.5, 0.5 mM EDTA, 1 M NaCl) at 4°C for 1 hr. Biotinylated DNA was eluted by a
short incubation (~10 s) in 100 µL of deionized water at 75°C and concentrated by ethanol
precipitation.
Second adaptor ligation, PCR amplification and high-throughput sequencing
To add the second adaptor to the 3’ end, purified DNA was incubated with 40 pmol of Ad2 in
10 µL of 1x Hybridization Buffer for 10 min at 65°C and then for 5 min at 16°C in a thermal
cycler. For ligation, 4 µL of 5x ligase buffer, 1 µL of T4 DNA ligase HC (Thermo), 1 µL of 50%
PEG8000 (New England Biolabs), and 4 µL of H2O were added to each reaction. The reactions
were incubated overnight at 16°C.
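The ligation mix adds up to a 20 µL reaction, which the sketch below verifies along with the resulting final concentrations. The dictionary keys are descriptive labels chosen here, not names from the protocol.

```python
# Component volumes in µL, taken from the ligation step above.
components_uL = {
    "DNA + Ad2 in 1x Hybridization Buffer": 10.0,
    "5x ligase buffer": 4.0,
    "T4 DNA ligase HC": 1.0,
    "50% PEG8000": 1.0,
    "H2O": 4.0,
}

total_uL = sum(components_uL.values())

# Final concentrations after mixing (stock concentration * volume / total).
ligase_buffer_fold = 5.0 * components_uL["5x ligase buffer"] / total_uL
peg_percent = 50.0 * components_uL["50% PEG8000"] / total_uL
ad2_uM = 40.0 / total_uL  # 40 pmol in total_uL; pmol per µL equals µM
```

The ligase buffer ends up at 1x, PEG8000 at 2.5%, and Ad2 at 2 µM in the final reaction.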
As a quality check, one percent of the ligation products was PCR-amplified with primers Pu/Pi
(Primers 1, for Ad1) or Universal/Index1 (Primers 2, for Ad2; E7350S, New England Biolabs).
After purification with HighPrep PCR beads, ligated DNA was PCR-amplified with NEBNext Q5 Hot
Start HiFi PCR Master Mix for 11-13 cycles with NEBNext Multiplex Oligos for Illumina (New
England Biolabs). The PCR products were purified with HighPrep PCR beads, and the
concentration was determined with PicoGreen (Thermo).
XR-seq library preparation for cisplatin and oxaliplatin induced damages
XR-seq libraries were prepared as described (1) with modifications in the damage-specific
immunoprecipitation and in vitro reversal steps. GM12878 cells were treated with 200 µM
cisplatin or oxaliplatin for 3 hr and collected as described above. Cells were lysed, and
primary excision products were pulled down by TFIIH co-immunoprecipitation, followed by
adapter ligation on both ends. The damage-specific immunoprecipitation with anti-cisplatin
modified DNA antibody was performed as described for Damage-seq, with minor differences. A
slurry of 25 µL of anti-rat Dynabeads and 1 µL anti-cisplatin modified DNA antibody were used
per reaction. Pre-incubated beads were then incubated with ligation products and 20 µg of
denatured sonicated salmon sperm DNA in 100 µL of 1x PEXB Buffer at 4°C overnight. After
sequential washes with PEX Buffer (twice), IP Buffer (once) and TE Buffer (once), the ligation
products containing damage were eluted by incubation with 100 µL of Elution Buffer at 65°C
for 10 min. The eluted DNA was purified by phenol/chloroform extraction and ethanol
precipitation. To reverse cisplatin- or oxaliplatin-induced damage in vitro, damaged DNA was
incubated with 200 mM NaCN at 65°C overnight, then purified through a G50 spin column
followed by ethanol precipitation. Purified DNA was amplified by PCR and purified by native
PAGE as described to make the library.
Sequencing and genome alignments
All libraries were sequenced on the HiSeq 2500 platform by the University of North
Carolina High-Throughput Sequencing Facility. Based on our previous experience with XR-seq, in
which 5 million uniquely mapped reads are sufficient to detect repair enrichment over genes,
we sequenced Damage-seq libraries to at least 10 million mapped reads per sample. Generally,
this required multiplexing ≤ 4 samples per lane for Damage-seq and ≤ 8 samples per lane for
XR-seq. Alignment summaries are available in SI appendix Tables S3 and S4.
Damage-seq sequence analysis. Libraries were sequenced to produce paired-end 50 nt reads,
allowing us to establish uniquely aligned reads and to distinguish between damage hot spots
and amplification artifacts. Products of primer extension on undamaged DNA were filtered out
with cutadapt (2) by matching the adapter sequence 5’-GACTGGTTCCAATTGAAAGTGCTCTTCCGATCT-3’.
Paired-end reads were aligned to the hg19 reference genome with bowtie using the command line
options -q --nomaqround --phred33-quals -S -X 1000 -m 4 --seed 123. The damage position
and nucleotide composition were then determined as the 2 nt upstream of the first read start
using samtools and bedtools.
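To make the "2 nt upstream of the read start" rule concrete, here is a minimal strand-aware sketch. It is not the authors' samtools/bedtools pipeline: it assumes BED-style 0-based, half-open read coordinates, uses a toy in-memory genome, and all function names are hypothetical. It reports the interval on the read's own strand and does not decide which strand carries the adduct.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_interval(chrom, start, end, strand, n=2):
    """Interval covering the n nt immediately 5' of the read start.

    Coordinates are 0-based, half-open. Returns None if the interval would
    run off the start of the chromosome (the chromosome end is not checked).
    """
    if strand == "+":
        lo, hi = start - n, start      # read starts at `start`
    elif strand == "-":
        lo, hi = end, end + n          # read starts at `end - 1`
    else:
        raise ValueError("strand must be '+' or '-'")
    if lo < 0:
        return None
    return (chrom, lo, hi, strand)


def upstream_sequence(genome, interval):
    """Sequence of the interval, written 5' to 3' on the read's strand."""
    chrom, lo, hi, strand = interval
    seq = genome[chrom][lo:hi].upper()
    return seq if strand == "+" else revcomp(seq)


# Toy genome (hypothetical) with GG upstream of a plus-strand read and
# CC (GG on the minus strand) upstream of a minus-strand read.
toy_genome = {"chr1": "AAGGTTTTCCAA"}
plus = upstream_interval("chr1", 4, 8, "+")
minus = upstream_interval("chr1", 4, 8, "-")
```

In a real analysis the same intervals could be written to a BED file and passed to a sequence-extraction tool; the toy dictionary only stands in for the reference genome.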
XR-seq. XR-seq data were analyzed as previously reported. Flanking adapter sequences were
removed using trimmomatic (3). Reads were aligned to the hg19 human reference genome
using bowtie (4) with the command options -q --nomaqround --phred33-quals -m 4 -n 2 -e 70 -l
20 --best -S. Uniquely aligned reads were obtained using samtools.
Data visualization
For comparison of the DNA damage and repair signals, we normalized all count data by
sequencing depth. The data are available for viewing as a track hub on the UCSC genome browser
(https://genome.ucsc.edu/cgi-bin/hgGateway) by pasting the link
http://trackhubs.its.unc.edu/sancarlb/Platinum_damage/hub.txt. The raw data and bigwig
tracks are available under GEO accession GSE82213 (www.ncbi.nlm.nih.gov/geo/).
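Normalizing by sequencing depth is commonly done by scaling counts to reads per million mapped reads. The sketch below assumes that convention (the text does not name the exact scaling), and the function name is hypothetical.

```python
def reads_per_million(counts, total_mapped_reads):
    """Scale raw counts to reads per million mapped reads (RPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return [c * 1e6 / total_mapped_reads for c in counts]


# Two libraries with the same raw signal but different depths become
# comparable after scaling.
shallow = reads_per_million([5, 10], 10_000_000)
deep = reads_per_million([10, 20], 20_000_000)
```

After scaling, the two hypothetical libraries give identical tracks even though the deeper one has twice the raw counts.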
ENCODE data
GM12878 stranded RNA-seq (ENCODE DCC accession ENCSR00CUH) and DNase-seq (accession
ENCSR000EJD), as well as chromHMM chromatin state segmentation (UCSC accession
wgEncodeEH000784) and nucleosome data (MNase-seq, accession ENCSR000CXP), were
downloaded from the ENCODE portal (http://genome.ucsc.edu/ENCODE/) or viewed on the
UCSC browser.
Di-nucleotide frequencies
The di-nucleotide positions for the hg19 reference genome were established using oligoMatch
from the UCSC tools.
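The idea of locating every occurrence of a di-nucleotide on both strands can be illustrated in a few lines. This is not oligoMatch itself: reporting overlapping matches (two GG hits in GGG) is an assumption made here, and the function names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dinucleotide_positions(seq: str, dinuc: str = "GG"):
    """Return sorted (start, end, strand) hits, 0-based and half-open.

    A hit on the minus strand is found by searching the plus-strand
    sequence for the reverse complement of the motif.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    hits = []
    for motif, strand in ((dinuc, "+"), (revcomp(dinuc), "-")):
        # A zero-width lookahead lets overlapping occurrences be reported.
        for match in re.finditer(f"(?={re.escape(motif)})", seq):
            hits.append((match.start(), match.start() + len(motif), strand))
    return sorted(hits)


gg_hits = dinucleotide_positions("AGGGCC", "GG")
```

Dividing the number of hits in a window by the window length gives a G-G frequency track that can be compared with the damage signal.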
Chromatin state analysis
Bedtools (5) coverage was used to calculate the damage and repair levels over each of the 15
predicted chromatin states defined by the ChromHMM algorithm (6). Merged data from two
biological replicates was used. Values were normalized per million mapped reads and per Kb of
interval length and plotted with R.
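The normalization described here (per million mapped reads and per Kb of interval length) can be sketched as below. The per-interval read counts are assumed to come from a coverage tool, state lengths are pooled before dividing, and the state labels and function names are hypothetical.

```python
def per_kb_per_million(read_count, interval_bp, total_mapped_reads):
    """Reads per Kb of interval per million mapped reads."""
    return read_count / (interval_bp / 1e3) / (total_mapped_reads / 1e6)


def state_levels(intervals, total_mapped_reads):
    """Aggregate (state, length_bp, read_count) records by chromatin state."""
    totals = {}
    for state, length_bp, read_count in intervals:
        bp, reads = totals.get(state, (0, 0))
        totals[state] = (bp + length_bp, reads + read_count)
    return {
        state: per_kb_per_million(reads, bp, total_mapped_reads)
        for state, (bp, reads) in totals.items()
    }


# Hypothetical coverage records: (state label, interval length, read count).
records = [
    ("1_Active_Promoter", 2000, 40),
    ("1_Active_Promoter", 3000, 60),
    ("13_Heterochrom", 10000, 50),
]
levels = state_levels(records, 10_000_000)
```

Pooling lengths and counts per state before normalizing weights each base equally; averaging per-interval values instead would weight short intervals more heavily.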
Plotting average damage and repair profiles
Average damage and repair profiles from the merged biological replicates were calculated over
GM12878 DNase peaks using bedtools (5) coverage. Counts were normalized per million
mapped reads and plotted with R. For plots, data were binned into 50 nt windows.
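Binning into fixed windows and averaging across sites can be sketched as follows. The input layout (one equal-length list of per-nucleotide counts per site, already oriented and centered) is an assumption made for illustration, as is the function name.

```python
def binned_average_profile(per_site_counts, bin_size=50, total_mapped_reads=None):
    """Average per-bin signal across sites.

    per_site_counts: list of equal-length lists of per-nucleotide counts.
    Counts are summed within each bin, averaged across sites, and, if
    total_mapped_reads is given, scaled per million mapped reads.
    """
    if not per_site_counts:
        raise ValueError("no sites given")
    width = len(per_site_counts[0])
    if any(len(site) != width for site in per_site_counts):
        raise ValueError("all sites must have the same width")
    if width % bin_size:
        raise ValueError("window width must be a multiple of bin_size")

    n_sites = len(per_site_counts)
    profile = []
    for lo in range(0, width, bin_size):
        bin_total = sum(sum(site[lo:lo + bin_size]) for site in per_site_counts)
        profile.append(bin_total / n_sites)

    if total_mapped_reads:
        profile = [value * 1e6 / total_mapped_reads for value in profile]
    return profile


# Two hypothetical 100 nt sites with flat coverage of 1 and 3 reads per nt.
sites = [[1] * 100, [3] * 100]
raw_profile = binned_average_profile(sites, bin_size=50)
scaled_profile = binned_average_profile(sites, bin_size=50, total_mapped_reads=2_000_000)
```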
For average XR-seq profiles relative to the annotated TSS or TES, we limited the gene list to
genes that have no overlapping or neighboring genes for at least 6000 bp upstream or
downstream on either strand and that were at least 10,000 bp in length. The highest quartile
of expressed genes from GM12878 cells (n=442) was identified as previously described. Briefly,
we calculated FPKM for the two mapped RNA-seq replicates using cufflinks (7) and the UCSC hg19
genes.gtf. Merged biological replicates were used for plotting. Read counts were calculated
from the aligned .bam files using bedtools coverage, normalized per million mapped reads and
plotted with R. For plots, data were binned into 50 nt windows.
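The gene selection criteria can be expressed as a simple filter. This sketch assumes genes are given as dictionaries with 0-based, half-open coordinates and an FPKM value; the field names, function names, and the order of the two steps (isolation filter first, then top quartile) are assumptions, since the text does not specify them.

```python
def isolated_long_genes(genes, min_len=10_000, flank=6_000):
    """Keep genes >= min_len with no other gene within `flank` bp (any strand)."""
    kept = []
    for gene in genes:
        if gene["end"] - gene["start"] < min_len:
            continue
        lo, hi = gene["start"] - flank, gene["end"] + flank
        # Quadratic scan for clarity; an interval index would scale better.
        crowded = any(
            other is not gene
            and other["chrom"] == gene["chrom"]
            and other["start"] < hi
            and other["end"] > lo
            for other in genes
        )
        if not crowded:
            kept.append(gene)
    return kept


def top_quartile(genes, key="fpkm"):
    """Return the highest-expressed quarter of the genes."""
    ranked = sorted(genes, key=lambda g: g[key], reverse=True)
    return ranked[: len(ranked) // 4]


# Hypothetical annotation: B and C are within 6000 bp of each other, D is short.
toy_genes = [
    {"name": "A", "chrom": "chr1", "start": 0, "end": 20_000, "fpkm": 8.0},
    {"name": "B", "chrom": "chr1", "start": 30_000, "end": 50_000, "fpkm": 50.0},
    {"name": "C", "chrom": "chr1", "start": 52_000, "end": 70_000, "fpkm": 1.0},
    {"name": "D", "chrom": "chr2", "start": 0, "end": 5_000, "fpkm": 20.0},
]
isolated = isolated_long_genes(toy_genes)
highest = top_quartile(toy_genes)
```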
For nucleosome analysis, nucleosome positions were determined with DANPOS2 (8). Average
damage and repair signals from merged biological replicates surrounding 2,000,000 randomly
picked nucleosome center positions were calculated using the R GenomicRanges (9) and
genomation (10) packages. For plotting, data were binned into 5 nt windows.
References
1. Hu J, Adar S, Selby CP, Lieb JD, & Sancar A (2015) Genome-wide analysis of human global and transcription-coupled excision repair of UV damage at single-nucleotide resolution. Genes Dev 29:948-960.
2. Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17(1):10-12.
3. Bolger AM, Lohse M, & Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15):2114-2120.
4. Langmead B, Trapnell C, Pop M, & Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25.
5. Quinlan AR & Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6):841-842.
6. Ernst J, et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473:43-49.
7. Trapnell C, et al. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5):511-515.
8. Chen K, et al. (2013) DANPOS: dynamic analysis of nucleosome position and occupancy by sequencing. Genome Res 23(2):341-351.
9. Lawrence M, et al. (2013) Software for computing and annotating genomic ranges. PLoS Comput Biol 9(8):e1003118.
10. Akalin A, Franke V, Vlahovicek K, Mason CE, & Schubeler D (2015) Genomation: a toolkit to summarize, annotate and visualize genomic intervals. Bioinformatics 31(7):1127-1129.
SI appendix Figure S1. Detailed schematic of Damage-seq.
SI appendix Figure S2. Damage-seq and XR-seq of oxaliplatin. A) Agarose gel analysis of Damage-seq libraries. DNA fragments from oxaliplatin-treated cells were amplified with sets of primers complementary to the 1st and 2nd adapters. B) Native polyacrylamide gel electrophoresis of XR-seq libraries showing that oxaliplatin adduct reversal by NaCN is necessary for PCR amplification of sequencing libraries.
SI appendix Figure S3. Single nucleotide frequencies in Damage-seq reads. Supplemental to main Figure 2a. Nucleotide frequencies are plotted for positions 3 nt upstream of the read start and 10 nt into the read for each damage type and an undamaged control. A) Data for the second replicate of cisplatin Damage-seq, not shown in the main figure. Left panel is from main Figure 2a. B) Data for two biological replicates of oxaliplatin Damage-seq. C) Data for the second replicate of sequenced undamaged DNA, not shown in the main figure. Left panel is from main Figure 2c.
(Panels: A. Cisplatin, B. Oxaliplatin, C. Control; each shows Rep1 and Rep2. Legend: G, A, T, C.)
SI appendix Figure S4. Sequence context for the 5 nt flanking G-G di-nucleotides at the -1 and -2 positions relative to the read start for the second biological replicate of cisplatin Damage-seq (A), randomly selected 26-mers in the hg19 reference genome (B) and oxaliplatin Damage-seq (C).
(Panels: A. Cisplatin Damage-seq replicate 2; B. Random genomic loci; C. Oxaliplatin Damage-seq, Rep1 and Rep2. X-axis: position relative to dimers, -5 to 5 around G G. Y-axis: nucleotide frequency (%), 0 to 100. Legend: G, A, T, C.)
SI appendix Figure S6. Single nucleotide resolution mapping of repair. Supplemental to main Fig 2e,f. A) Frequency of the relevant di-nucleotide, G-G, at each position of 26 nt XR-seq excision fragments in the second replicate of cisplatin XR-seq and for oxaliplatin XR-seq. B) The corresponding nucleotide frequencies at the 5 nt flanking the G-G dimer at positions 19-20.
SI appendix Figure S8. Whole genome map of damage and repair of oxaliplatin damage. Screenshots of damage and repair signals, separated by strand, for all the chromosomes of the human genome.
SI appendix Figure S9. Genome-wide patterns of damage and repair of oxaliplatin damage. A) Representative screenshot of damage and repair signals, separated by strand, for the entire chromosome 17. B) Zoom-in on a ~80 kbp segment of chromosome 17 which includes TP53. C) Representative XR-seq and Damage-seq reads that capture a specific Pt-d(GpG) damage.
SI appendix Figure S10. Repair and damage at transcribed genes. Supplemental to main Fig 3d-f. A) Oxaliplatin damage and repair profiles at the transcribed and non-transcribed strands are plotted surrounding the TSS of highly expressed genes. B) Similar to A, except with a zoomed-in scale for the damage levels. C) Cisplatin damage and repair are plotted surrounding the TES of highly expressed genes. D) Similar to C, except with a zoomed-in scale for the damage levels. E) Same as C, except for oxaliplatin damage and repair. F) Same as D, except for oxaliplatin damage. G) G-G frequency plotted surrounding the TES of highly expressed genes.
(Panels: A-B, E-F oxaliplatin; C-D cisplatin; G. GG frequency. Legend: Damage TS, Damage NTS, Repair TS, Repair NTS.)
SI appendix Figure S11. Open regions in the genome have higher repair but little difference in damage. A) Cisplatin damage and repair plotted around DNase-HS sites in GM12878. B) Same as A, except that oxaliplatin damage and repair are plotted.
(Panels: A. Cisplatin, B. Oxaliplatin. Legend: Repair, Damage.)
SI appendix Figure S12. Damage and repair at different chromatin states. A) Analysis of oxaliplatin repair (top) and damage (bottom) levels across the 15 annotated chromatin states in GM12878. B) Small variations in cisplatin and oxaliplatin damage levels at the different chromatin states are observed when plotting damage on a smaller scale. C) Variations in the frequency of G-G in the different states mirror the variation in damage levels.
(Panels: A. Oxaliplatin repair and damage, reads per Kb per million mapped; B. Cisplatin damage and oxaliplatin damage, counts per Kb per million reads; C. GG frequency. X-axis: chromatin states 1-15.)
(Chromatin states: 1. Active promoter; 2. Weak promoter; 3. Poised promoter; 4. Strong enhancer; 5. Strong enhancer; 6. Weak enhancer; 7. Weak enhancer; 8. Insulator; 9. Txn transition; 10. Txn elongation; 11. Weak txn; 12. Repressed; 13. Heterochromatic; 14. Repetitive; 15. Repetitive.)