SUPPLEMENTAL DATA FOR
Integrated genetic and epigenetic analysis of childhood acute lymphoblastic leukemia
Maria E. Figueroa,1 Shann-Ching Chen,2 Anna K. Andersson,2 Letha A. Phillips,2 Yushan Li,3
Jason Sotzen,1 Mondira Kundu,2 James R. Downing,2 Ari Melnick,3 and Charles G. Mullighan2
1Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA.
2Department of Pathology, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA.
3Department of Medicine, Hematology Oncology Division, Weill Cornell Medical College, New York, New York, USA.
Key words: Acute lymphoblastic leukemia, childhood ALL, epigenetics, DNA methylation, Integrative analysis
*Correspondence:
Charles G. Mullighan, Department of Pathology, MS 342, Room D4047E, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105-3678. Phone: (901) 595-3387. Fax: (901) 595-5947. Email: [email protected]
Ari Melnick, Weill Cornell Medical College, 1300 York Ave, Room C620A, New York, NY 10065. Phone: (212) 746-7643. Fax: (212) 746-8866. Email: [email protected]

SUPPLEMENTAL INFORMATION CONTENTS
Supplementary Figure 1: Technical validation of HELP microarrays
Supplementary Figure 2: Clustering of HELP methylation data with data derived from normal CD3+ T cells
Supplementary Figure 3: Hierarchical clustering of methylation data using different numbers of probe sets
Supplementary Figure 4: Patterns of methylation in the lymphoid signaling genes CD3 and CD79B in normal and leukemic T and B cells
Supplementary Figure 5: MassARRAY validation of common epigenetic signature genes
Supplementary Table 1: Listing of patient, sample and array characteristics (provided as separate Excel workbook)
Supplementary Table 2: Listing of differentially methylated regions for each subtype of ALL (provided as separate Excel workbook)
Supplementary Table 3: Top networks enriched in each subtype-specific DNA methylation profile
Supplementary Table 4: Complete listing of correlation data between differentially methylated probe sets and gene expression probe sets in Refseq centric analysis (provided as separate Excel workbook)
Supplementary Table 5: Fisher exact test results showing proportion of hypermethylated genes that are downregulated (5A) or upregulated (5B) and the proportion of downregulated (5A) or upregulated (5B) genes that are hypermethylated
Supplementary Table 6: Summary of gene set enrichment analyses examining enrichment of hypo- and hypermethylated gene sets in gene expression signatures
Supplementary Table 7: Relationship between gene methylation and expression by chromosome in Hyperdiploid samples (provided as separate Excel workbook)
Supplementary Table 8: Overlap of normal T- v B-cell and leukemic T- v B-cell methylation signatures
Supplementary Table 9: Genes in the common methylation signature of ALL
Supplementary Table 10: Genes with recurrent genetic lesions in childhood ALL
Supplementary Table 11: MassARRAY primers (provided as separate Excel workbook)
Supplementary Figure 1
Technical validation of the HELP microarray by quantitative bisulfite sequencing using MassARRAY. Correlation between HELP log2 ratio (x-axis) and percent methylation as measured by MassARRAY EpiTyping (y-axis), performed for 22 randomly selected HELP probe sets on 8 randomly selected cases. Pearson correlation coefficient: -0.91, p-value < 2.2e-16.
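As an illustration of the statistic reported above, the following is a minimal sketch of a Pearson correlation coefficient computed in plain Python. The function name `pearson_r` and the toy values are assumptions made for this example only; they are not the study data or the authors' code. A strongly negative coefficient is the expected direction here, since the caption reports r = -0.91 between HELP log2 ratio and percent methylation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy values: HELP log2 ratios and percent methylation.
help_log2_ratio = [-2.0, -1.0, 0.0, 1.0, 2.0]
percent_methylation = [90.0, 70.0, 50.0, 30.0, 10.0]
r = pearson_r(help_log2_ratio, percent_methylation)
```

With these perfectly anti-correlated toy values the coefficient is -1 up to floating-point error; real paired measurements would give an intermediate value.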
Supplementary Figure 2
Clustering of HELP methylation data with data derived from normal CD3+ T cells. The normal T cell samples cluster distinctly from T-ALL and B-lineage samples. The normal T cell arrays were performed in a different batch from the remaining samples; to exclude temporal batch effects as a cause of the clustering observed, 4 leukemic samples were rerun, and these clustered with the data from the same samples run previously. Clustering was performed with 5535 probesets (selected at SD>0.9, chosen using the optimal Rand index metric).
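The caption above refers to a Rand index used to choose the SD threshold. A minimal sketch of the plain (unadjusted) Rand index follows, assuming that cluster assignments are compared against known subtype labels; the source does not state whether the plain or adjusted form was used, and the function name `rand_index` and the toy labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two labelings agree.

    A pair "agrees" when both labelings put the two samples together,
    or both labelings keep them apart.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("labelings must have the same length (at least 2)")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical toy labels: known subtype versus cluster assignment.
subtype = ["T", "T", "B", "B"]
cluster = [1, 1, 2, 2]
score = rand_index(subtype, cluster)
```

Under this reading, the threshold whose clustering gives the highest index against the known subtypes would be the one retained.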
Supplementary Figure 3
Hierarchical clustering of 167 childhood ALL cases and 19 normal B cell controls based on their DNA methylation profiles, using varying numbers of probe sets. Dendrogram representation of unsupervised analysis. Hierarchical clustering was performed on probe sets with (A) SD>0.8, (B) SD>0.9, (C) SD>1.1 and (D) SD>1.2 across all patients to determine the natural segregation of childhood ALL cases and normal B cell controls based on their DNA methylation profiles.
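The probe set selection step described above can be sketched as a simple variability filter. The example below assumes the sample standard deviation (n - 1 denominator) across all samples; the source does not state which estimator was used. The function name `select_probesets`, the probe identifiers, and the values are hypothetical.

```python
from statistics import stdev


def select_probesets(values_by_probe, threshold):
    """Return probe set IDs whose SD across samples exceeds the threshold.

    values_by_probe maps a probe set ID to its list of values,
    one value per sample.
    """
    return [
        probe_id
        for probe_id, values in values_by_probe.items()
        if stdev(values) > threshold
    ]


# Hypothetical toy matrix: three probe sets measured on four samples.
toy = {
    "probe_low": [0.0, 0.1, -0.1, 0.0],
    "probe_high": [-2.0, 2.0, -1.5, 1.5],
    "probe_mid": [-1.0, 1.0, -1.0, 1.0],
}
kept_09 = select_probesets(toy, 0.9)
kept_12 = select_probesets(toy, 1.2)
```

Raising the threshold, as in panels (A) to (D), keeps fewer and more variable probe sets for clustering.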
Supplementary Figure 4 Patterns of methylation in the lymphoid signaling genes CD3 and CD79B in normal and leukemic T and B cells
Supplementary Figure 5
MassARRAY validation of common epigenetic signature genes. Five genes were randomly selected from the common epigenetic signature for validation by MassARRAY EpiTYPER. For each of the 5 validated genes we depict (from top to bottom): a schematic representation of the RefSeq gene and, where applicable, its associated CpG island, along with the aligned location of the HELP probesets (grey) and the target regions validated by MassARRAY (brown). Below the schematic, heatmap representations depict the methylation status for normal B cells (NBM, left) and B ALL samples (B ALL, right) for each MassARRAY-covered region. Each row represents a CpG site in the region, and each column represents a sample. CpG sites with missing values for more than 5 cases were excluded from the analysis. (A) MOS, (B) HOXA6, (C) SCG5, (D) ELAVL2 and (E) ELF5.
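The exclusion rule for CpG sites stated above can be written as a short filter. This is a sketch under the assumption that missing measurements are stored as `None`; the function name `drop_sparse_cpgs`, the site labels, and the values are hypothetical.

```python
def drop_sparse_cpgs(values_by_cpg, max_missing=5):
    """Keep CpG sites with at most max_missing missing values.

    values_by_cpg maps a CpG site label to a list of percent-methylation
    values, one per case, with None marking a missing measurement.
    """
    return {
        cpg: values
        for cpg, values in values_by_cpg.items()
        if sum(v is None for v in values) <= max_missing
    }


# Hypothetical toy data: 8 cases per CpG site.
toy_sites = {
    "CpG_1": [10.0, 12.0, None, 9.0, 11.0, 10.5, 13.0, 8.0],        # 1 missing
    "CpG_2": [None, None, None, None, None, 40.0, 42.0, 41.0],      # 5 missing
    "CpG_3": [None, None, None, None, None, None, 75.0, 80.0],      # 6 missing
}
kept_sites = drop_sparse_cpgs(toy_sites)
```

A site with exactly 5 missing cases is retained, because the caption excludes only sites missing in more than 5 cases.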
Supplementary Table 1. Listing of patient, sample and array characteristics. See separate Excel workbook.
Supplementary Table 2. Listing of differentially methylated regions for each subtype of ALL. See separate Excel workbook.
Supplementary Table 3. Top networks enriched in each subtype-specific DNA methylation profile
Results of Ingenuity pathway analysis of the methylation signature of each subtype of ALL.
Supplementary Table 3A. ETV6-RUNX1 B-ALL
ID Molecules in Network Score Focus Molecules Top Functions
Lipid Metabolism, Molecular Transport, Small Molecule Biochemistry
Supplementary Table 4
Complete listing of correlation data between differentially methylated probe sets and gene expression probe sets in Refseq centric analysis. Analyses of significance of correlation of methylation and expression are provided in Supplementary Table 5 (Fisher exact tests) and Supplementary Table 6 (Gene Set Enrichment Analysis). See separate Excel workbook.
Supplementary Table 5
Fisher exact test results for correlation between differential expression and methylation. Supplementary Table 5A shows correlation between downregulated and hypermethylated genes. The first 8 data rows of the table represent data for genes downregulated in each subgroup of ALL (e.g. CRLF2, ETV6-RUNX1 etc.) compared to the comparison group (e.g. non-CRLF2r B-ALL, or non-ETV6-RUNX1 ALL). The second 8 rows show data for genes downregulated in the comparison group compared to each specific subtype. Significant Fisher P values are highlighted in pink.
Downregulated, hypermethylated
Column key:
(A) Downregulated FDR<0.1 and hypermethylated FDR<0.1
(B) Downregulated FDR>0.1 and hypermethylated FDR<0.1
(C) Downregulated FDR<0.1 and hypermethylated FDR>0.1
(D) Downregulated FDR>0.1 and hypermethylated FDR>0.1
A/(A+B): Fraction of all hypermethylated genes that are downregulated
A/(A+C): Fraction of all downregulated genes that are hypermethylated

Comparison | (A) | (B) | (C) | (D) | Fisher P | A/(A+B) | A/(A+C)
B_CRLF2 v Non_B_CRLF2 | 0 | 52 | 17 | 12855 | 1 | 0.00 | 0.00
B_ETV6 v Non_B_ETV6 | 232 | 1336 | 1563 | 9793 | 0.28 | 0.15 | 0.13
B_H50 v Non_B_H50 | 1363 | 5146 | 968 | 5447 | 4.63E-18 | 0.21 | 0.58
B_MLLr v Non_B_MLLr | 276 | 1656 | 881 | 10111 | 5.86E-17 | 0.14 | 0.24
B_ERGdel v Non_B_ERGdel | 201 | 1204 | 1032 | 10487 | 1.10E-09 | 0.14 | 0.16
B_Ph v Non_B_Ph | 163 | 1720 | 1001 | 10040 | 0.60 | 0.09 | 0.14
B_TCF3 v Non_B_TCF3 | 87 | 175 | 1999 | 10663 | 6.25E-12 | 0.33 | 0.04
T_all v B_all | 629 | 2190 | 1248 | 8857 | 4.89E-37 | 0.22 | 0.34
Non_B_CRLF2 v B_CRLF2 | 1 | 57 | 14 | 12852 | 0.07 | 0.02 | 0.07
Non_B_ETV6 v B_ETV6 | 614 | 2432 | 1366 | 8512 | 1.53E-16 | 0.20 | 0.31
Non_B_H50 v B_H50 | 433 | 2383 | 1446 | 8662 | 0.16 | 0.15 | 0.23
Non_B_MLLr v B_MLLr | 474 | 2957 | 1117 | 8376 | 0.002 | 0.14 | 0.30
Non_B_ERGdel v B_ERGdel | 161 | 1140 | 1119 | 10504 | 0.002 | 0.12 | 0.13
Non_B_Ph v B_Ph | 315 | 2408 | 791 | 9410 | 9.83E-10 | 0.12 | 0.28
Non_B_TCF3 v B_TCF3 | 25 | 162 | 1364 | 11373 | 0.23 | 0.13 | 0.02
B_all v T_all | 1102 | 3779 | 1451 | 6592 | 4.78E-10 | 0.23 | 0.43
Supplementary Table 5B shows correlation between upregulated and hypermethylated genes. The first 8 data rows of the table represent data for genes upregulated in each subgroup of ALL (e.g. CRLF2, ETV6-RUNX1 etc.) compared to the comparison group (e.g. non-CRLF2r B-ALL, or non-ETV6-RUNX1 ALL). The second 8 rows show data for genes upregulated in the comparison group compared to each specific subtype. Significant Fisher P values are highlighted in pink.
Upregulated, hypermethylated
Column key:
(A) Upregulated FDR<0.1 and hypermethylated FDR<0.1
(B) Upregulated FDR>0.1 and hypermethylated FDR<0.1
(C) Upregulated FDR<0.1 and hypermethylated FDR>0.1
(D) Upregulated FDR>0.1 and hypermethylated FDR>0.1
A/(A+B): Fraction of all hypermethylated genes that are upregulated
A/(A+C): Fraction of all upregulated genes that are hypermethylated

Comparison | (A) | (B) | (C) | (D) | Fisher P | A/(A+B) | A/(A+C)
B_CRLF2 v Non_B_CRLF2 | 0 | 52 | 15 | 12857 | 1 | 0.00 | 0.00
B_ETV6 v Non_B_ETV6 | 209 | 1359 | 1771 | 9585 | 0.02 | 0.13 | 0.11
B_H50 v Non_B_H50 | 817 | 5692 | 1062 | 5353 | 1.14E-10 | 0.13 | 0.43
B_MLLr v Non_B_MLLr | 222 | 1710 | 1369 | 9623 | 0.24 | 0.11 | 0.14
B_ERGdel v Non_B_ERGdel | 130 | 1275 | 1150 | 10369 | 0.42 | 0.09 | 0.10
B_Ph v Non_B_Ph | 141 | 1742 | 965 | 10076 | 0.07 | 0.07 | 0.13
B_TCF3 v Non_B_TCF3 | 28 | 234 | 1361 | 11301 | 1 | 0.11 | 0.02
T_all v B_all | 490 | 2329 | 2063 | 8042 | 0.003 | 0.17 | 0.19
Non_B_CRLF2 v B_CRLF2 | 0 | 58 | 17 | 12849 | 1 | 0.00 | 0.00
Non_B_ETV6 v B_ETV6 | 397 | 2649 | 1398 | 8480 | 0.12 | 0.13 | 0.22
Non_B_H50 v B_H50 | 455 | 2361 | 1876 | 8232 | 0.003 | 0.16 | 0.20
Non_B_MLLr v B_MLLr | 291 | 3140 | 866 | 8627 | 0.26 | 0.08 | 0.25
Non_B_ERGdel v B_ERGdel | 116 | 1185 | 1117 | 10506 | 0.46 | 0.09 | 0.09
Non_B_Ph v B_Ph | 243 | 2480 | 921 | 9280 | 0.88 | 0.09 | 0.21
Non_B_TCF3 v B_TCF3 | 37 | 150 | 2049 | 10688 | 0.19 | 0.20 | 0.02
B_all v T_all | 530 | 4351 | 1347 | 6696 | 8.81E-21 | 0.11 | 0.28
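The two fraction columns of Supplementary Table 5 follow directly from the four counts in each row, and the test itself can be sketched on the same 2x2 layout. The code below is an illustrative sketch only: the function names are hypothetical, and the two-sided form of the Fisher exact test is an assumption, since the source does not state whether a one- or two-sided test was used. For the B_ETV6 row of Table 5A (A=232, B=1336, C=1563, D=9793), the fractions round to the 0.15 and 0.13 shown in the table.

```python
from math import comb


def fractions(a, b, c, d):
    """Return (A/(A+B), A/(A+C)) for one row of counts."""
    return a / (a + b), a / (a + c)


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Unnormalised hypergeometric weights, kept as exact integers.
    weight = {
        k: comb(col1, k) * comb(n - col1, row1 - k) for k in range(lo, hi + 1)
    }
    observed = weight[a]
    extreme = sum(w for w in weight.values() if w <= observed)
    return extreme / comb(n, row1)


# Fractions for the B_ETV6 row of Table 5A.
frac_hyper_down, frac_down_hyper = fractions(232, 1336, 1563, 9793)

# Small worked table (not from the study) to show the test itself.
p_small = fisher_two_sided(8, 2, 1, 5)
```

Exact integer arithmetic avoids floating-point underflow for the very small P values in the table, at the cost of speed on tables with margins in the thousands.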
Supplementary Table 6. Summary of gene set enrichment analyses examining enrichment of hypo- and hypermethylated gene sets in gene expression signatures. G1 refers to the specific ALL subtype, G2 to the comparison group.

Group comparison | Enrichment of G1 hypomethylated gene set in G1 vs. G2 gene signature | Enrichment of G1 hypermethylated gene set in G2 vs. G1 gene signature
CRLF2r vs non-CRLF2r B-ALL | No (FDR=0.89) | No (FDR=1.0)
ETV6-RUNX1 vs non-ETV6-RUNX1 B-ALL | Yes (P<0.0001, FDR=0.18) | No (FDR=0.85)
High hyperdiploid vs non high hyperdiploid B-ALL | No (FDR=0.78) | No (FDR=0.96)
MLLr vs non-MLLr B-ALL | No (FDR=0.53) | Yes (P<0.0001, FDR=0.0016)
ERG B-ALL vs non-ERG B-ALL | No (FDR=0.65) | Yes (P<0.0001, FDR=0.07)
BCR-ABL1 B-ALL vs non-BCR-ABL1 B-ALL | Yes (P<0.0001, FDR=0.0064) | No (FDR=0.87)
TCF3-PBX1 vs non-TCF3-PBX1 B-ALL | Yes (P=0.003, FDR=0.11) | No (FDR=0.71)
T-ALL vs B-ALL | Yes (P<0.0001, FDR=0.013) | Yes (P<0.0001, FDR=0.00097)
Supplementary Table 7. Relationship between gene methylation and expression by chromosome in Hyperdiploid samples (provided as separate Excel workbook).
Supplementary Table 8. Overlap of normal T- v B-cell and leukemic T- v B-ALL methylation signatures
Comparison | Signature 1 Probe Number | Signature 2 Probe Number | Overlap Probe Number | Signature 1 Residual Probe Number | Signature 2 Residual Probe Number | Signature 1 Residual/Probe Number | Overlap Ratio
Subtract Hyper signatures [Normal T v B] (FDR=0.1, Delta1) from [T- v B-ALL] (FDR=0.1, Delta1) | 897 | 2463 | 413 | 484 | 2050 | 54.0% | 39.7%
Subtract Hypo signatures [Normal T v B] (FDR=0.1, Delta1) from [T- v B-ALL] (FDR=0.1, Delta1) | 422 | 1403 | 110 | 312 | 1293 | 73.9% |
Subtract Hyper signatures [Normal T v B] (FDR=1, Delta1) from [T- v B-ALL] (FDR=1, Delta1) | 897 | 2545 | 413 | 484 | 2132 | 54.0% | 39.7%
Subtract Hypo signatures [Normal T v B] (FDR=1, Delta1) from [T- v B-ALL] (FDR=1, Delta1) | 422 | 1403 | 110 | 312 | 1293 | 73.9% |
Subtract Hyper signatures [Normal T v B] (FDR=1, Delta0.5) from [T- v B-ALL] (FDR=1, Delta0.5) | 2214 | 5728 | 1086 | 1128 | 4642 | 50.9% | 38.9%
Subtract Hypo signatures [Normal T v B] (FDR=1, Delta0.5) from [T- v B-ALL] (FDR=1, Delta0.5) | 1923 | 6354 | 522 | 1401 | 5832 | 72.9% |
Subtract Hyper signatures [Normal T v B] (FDR=1, Delta2) from [T- v B-ALL] (FDR=1, Delta2) | 181 | 345 | 59 | 122 | 286 | 67.4% | 31.3%
Subtract Hypo signatures [Normal T v B] (FDR=1, Delta2) from [T- v B-ALL] (FDR=1, Delta2) | 46 | 175 | 12 | 34 | 163 | 73.9% |
The Overlap Ratio column gives one value for each pair of Hyper and Hypo rows.
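The derived columns of Supplementary Table 8 can be reproduced from the three probe counts in each row. The sketch below uses the first pair of rows (FDR=0.1, Delta1). The function name `overlap_summary` is hypothetical, and the pooled definition of the Overlap Ratio (hyper plus hypo overlap divided by hyper plus hypo Signature 1 probes) is an inference from the reported numbers, not something stated in the source.

```python
def overlap_summary(sig1, sig2, overlap):
    """Return (Signature 1 residual, Signature 2 residual, residual fraction).

    The residual is the number of probes left in a signature after the
    overlapping probes are subtracted; the fraction is relative to
    Signature 1.
    """
    residual1 = sig1 - overlap
    residual2 = sig2 - overlap
    return residual1, residual2, residual1 / sig1


# (Signature 1, Signature 2, Overlap) for the FDR=0.1, Delta1 rows.
hyper = (897, 2463, 413)
hypo = (422, 1403, 110)

hyper_res1, hyper_res2, hyper_frac = overlap_summary(*hyper)
hypo_res1, hypo_res2, hypo_frac = overlap_summary(*hypo)

# Assumed definition: overlap pooled over hyper and hypo signatures,
# relative to all Signature 1 probes.
pooled_overlap_ratio = (hyper[2] + hypo[2]) / (hyper[0] + hypo[0])
```

Under this assumed definition the pooled ratio rounds to the 39.7% shown for this pair, and the same calculation gives 38.9% and 31.3% for the Delta0.5 and Delta2 pairs.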
Supplementary Table 9. Genes in the common methylation signature of ALL
Methylation probeset | Chr | Start | End | Gene | mRNA Refseq ID | Status
MSPI0406S00033551 | chr1 | 43765535 | 43765963 | TIE1 | NM_005424 | Hypermethylated
MSPI0406S00047441 | chr1 | 99729433 | 99730057 | LPPR4 | NM_001166252 / NM_014839 | Hypermethylated