
Tumor Biology and Immunology

Transcriptomic Differences between Primary Colorectal Adenocarcinomas and Distant Metastases Reveal Metastatic Colorectal Cancer Subtypes

Yasmin Kamal¹, Stephanie L. Schmit², Hannah J. Hoehn², Christopher I. Amos¹,³, and H. Robert Frost¹

Abstract

Approximately 20% of patients with colorectal adenocarcinoma present with metastases at the time of diagnosis, and therapies that specifically target these metastases are lacking. We present a novel approach for investigating transcriptomic differences between primary colorectal adenocarcinomas and distant metastases, which may help to identify primary tumors at high risk for future dissemination and to inform the development of metastasis-targeted therapies. To effectively compare the transcriptomes of primary colorectal adenocarcinomas and metastatic lesions at both the gene and pathway levels, we eliminated the tissue specificity of the "host" organs where tumors are located and adjusted for confounders such as exposure to chemotherapy and radiation. We found that metastases were characterized by reduced epithelial–mesenchymal transition (EMT) but increased MYC target and DNA-repair pathway activities. FBN2 and MMP3 were the most differentially expressed genes between primary tumors and metastases. The two subtypes of colorectal adenocarcinoma metastases that were identified, EMT inflammatory and proliferative, were distinct from consensus molecular subtype (CMS) 3, suggesting subtype exclusivity. In summary, this study highlights transcriptomic differences between primary tumors and colorectal adenocarcinoma metastases and delineates pathways activated in metastases that could be targeted in colorectal adenocarcinoma patients with metastatic disease.

Significance: These findings identify a colorectal adenocarcinoma metastasis-specific gene-expression signature that is free from potentially confounding background signals coming from treatment exposure and from the normal host tissue in which the metastasis is situated.

Introduction

Roughly 20% of individuals with colorectal adenocarcinoma present with metastatic disease at the time of diagnosis, and colorectal adenocarcinoma is a leading cause of cancer-related mortality (1, 2). In colorectal adenocarcinoma, the liver (70%) is the most common site of metastasis, followed by the lung (32%–47%; ref. 3). Although colorectal adenocarcinoma metastases are aggressively treated with some combination of chemotherapy, curative-intent surgical resection (4), biologics such as epidermal growth factor receptor (EGFR) inhibitors (5), and immunotherapy (for a subgroup of patients with mismatch-repair deficiency; ref. 6), metastasis-targeted therapies are severely lacking. Therefore, understanding the defining features of metastatic tumor cells in distal organs is valuable for the development of targeted drugs and individualized therapies for patients with metastatic disease.

One approach for characterizing the biology of metastatic lesions has been to compare primary tumors and metastatic lesions of the same cancer type (7). However, this approach is limited by the need for biopsies of the normal host organ tissue where metastases are located so that the transcriptomic profiles of metastases can be normalized (7, 8). One survey of primary and metastatic sites found that expression profiles of metastases partially reflect the host tissue but also carry additional signatures (9). This finding highlights the need to consider the metastatic site during analyses. Approaches comparing primary tumors and metastatic lesions also often fail to address the role of treatment exposure in altering tumor transcriptomic profiles (9, 10). This is particularly relevant for metastases, as biospecimens of metastases obtained in the clinical setting are usually drawn from patients heavily treated with chemotherapy and/or radiation prior to surgical resection (4, 11). To identify metastasis-specific features free from potentially confounding signals, we developed a novel approach for comparing primary tumors and metastases that takes normal host tissue expression, anatomic origin of the tumors, and treatment exposure status of tumors into consideration, as all three of these factors can substantially influence the detection of metastasis-specific gene-expression patterns. This analytical approach allows for the determination of defining features of lung and liver metastases of colorectal adenocarcinoma while avoiding the need to always obtain normal host tissues from patients with metastatic disease for the purposes of normalizing tumor gene expression. Last, it allows for the identification of unique subtypes of colorectal adenocarcinoma metastases that are independent of the site of metastasis.

¹Department of Biomedical Data Sciences, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire. ²Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida. ³Dan L. Duncan Comprehensive Cancer Center at Baylor College of Medicine, Houston, Texas.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

Y. Kamal and S.L. Schmit contributed equally to this article.

Corresponding Authors: H. Robert Frost, Dartmouth College, HB 7936, Hanover, NH 03755. Phone: 603-667-1884; E-mail: [email protected]; and Christopher I. Amos, Institute for Clinical and Translational Research, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030; E-mail: [email protected]

Cancer Res 2019;79:4227–41

doi: 10.1158/0008-5472.CAN-18-3945

©2019 American Association for Cancer Research.

Published OnlineFirst June 25, 2019; DOI: 10.1158/0008-5472.CAN-18-3945

Materials and Methods

Primary and metastatic colorectal cancer samples

Gene expression from human colorectal cancer tissues and corresponding clinical data were analyzed. All participants provided written informed consent for data and tissue collection as part of the following protocols at Moffitt Cancer Center (MCC) and Consortium sites: Total Cancer Care (12), Lifetime Cancer Screening, General Banking, Pre-HIPAA Biobanking, or Clinical Collection. All tissue and data analyzed for this project were utilized under the approval of the Advarra Institutional Review Board, which ensures that research is conducted in accordance with recognized ethical guidelines (MCC# 19066/Pro00023353), under the HHS regulations 45 CFR part 46 for human subjects protections, specifically under Subpart A (US Common Rule) as authorized by 45 CFR 46.110. All subjects were ≥18 years of age and free of psychiatric incapacity or dementia. Residual tissue collected as part of routine clinical care was assayed using the Rosetta/Merck Human RSTA Custom Affymetrix 2.0 microarray platform and a single standard operating procedure. For the purposes of this study, only tissues collected via surgical resection, as opposed to biopsy, were analyzed. Microsatellite instability (MSI) status was determined for a subset of patients (n = 71) using the Bethesda panel (13) markers (BAT25, BAT26, NR21, NR24, and NR27).

We established two distinct datasets from Total Cancer Care (TCC) samples. The discovery cohort (MCC dataset), consisting of 517 human colorectal cancer samples from 502 distinct patients, included 333 primary lesions and 184 lung and liver metastases of colorectal adenocarcinoma. All samples in the MCC dataset were collected at the MCC hospital location in Tampa, FL. Wherever possible, histology, clinical information, and MSI status for the MCC dataset were verified through examination of electronic medical records. In addition, we examined 618 human colorectal cancer samples from 618 distinct patients, including 545 primary lesions and 73 lung and liver metastases of colorectal adenocarcinoma, in the validation cohort (Consortium dataset). All samples in the Consortium dataset were obtained from non-MCC TCC regional consortium site partner institutions. All transcriptomic and clinical data have been deposited in the Gene Expression Omnibus, curated by the National Center for Biotechnology Information (accession number: GSE131418).

Microarray expression normalization and principal component analysis

Microarray gene-expression data passing multiple internal quality control filters and curated by the MCC Shared Resources Bioinformatics and Biostatistics Core were obtained for all TCC samples. Microarray chips were normalized using iterative rank-order normalization (IRON; ref. 14) and log2 transformation. In addition, we excluded all probes that mapped to multiple genes. Probe set expression was converted to gene-level expression by selecting the probe with the maximum expression for a given gene.
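The probe-to-gene step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study (which was run in R): the function name `collapse_probes`, the dictionary inputs, and the choice to rank probes by mean expression across samples are all hypothetical, since the text states only that the probe with the maximum expression was selected.

```python
from statistics import mean


def collapse_probes(probe_expr, probe_to_genes):
    """Collapse probe-level log2 expression to one row per gene.

    probe_expr: dict mapping probe ID -> list of log2 values (one per sample).
    probe_to_genes: dict mapping probe ID -> list of gene symbols it maps to.
    Both structures are hypothetical stand-ins for real annotation files.
    """
    best = {}  # gene -> (mean expression, probe ID)
    for probe, values in probe_expr.items():
        genes = probe_to_genes.get(probe, [])
        # Drop probes that map to no gene or to more than one gene.
        if len(genes) != 1:
            continue
        gene = genes[0]
        score = mean(values)
        # Assumption: "maximum expression" means highest mean across samples.
        if gene not in best or score > best[gene][0]:
            best[gene] = (score, probe)
    return {gene: probe_expr[probe] for gene, (_, probe) in best.items()}
```

Ranking by the mean is only one reasonable reading; ranking by the per-sample maximum or by variance would need a one-line change to `score`.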

In addition, we performed principal component analysis (PCA) within the MCC and the Consortium datasets using all genes captured on the microarray platform. One MCC sample was found to be an outlier (>3 standard deviations away from the mean of PC1) and was subsequently removed from the study (Supplementary Fig. S1). Furthermore, we did not note any significant batch effects between the MCC and Consortium datasets (Supplementary Fig. S2).
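The outlier rule on PC1 can be illustrated with a short sketch. It assumes the PC1 scores have already been computed elsewhere; the function name `pc1_outliers` and the use of the sample standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev


def pc1_outliers(pc1_scores, n_sd=3.0):
    """Return sample IDs whose PC1 score lies more than n_sd SDs from the mean.

    pc1_scores: dict mapping sample ID -> PC1 score (hypothetical input;
    the PCA itself is not performed here).
    """
    values = list(pc1_scores.values())
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    return [s for s, v in pc1_scores.items() if abs(v - center) > n_sd * spread]
```

Because the outlier itself inflates the standard deviation, a rule of this kind can only flag a sample when the dataset is reasonably large, which holds for cohorts of several hundred samples.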

Anatomic origin determination for the MCC dataset

Anatomic origin for primary tumors was determined as follows: tumors located in the cecum, ascending colon, hepatic flexure, and transverse colon were classified as proximal, while tumors located in the splenic flexure, descending colon, sigmoid colon, and rectum were classified as distal. Anatomic origin of metastases was classified in the same manner as for primary tumors, with the exception of 7 patients for whom anatomic origin was not clearly designated. For 6 of these 7 individuals, anatomic origin of the metastatic tumor was obtained from the patient-reported history of the site of primary tumor resection or the site of hemicolectomy (left or right side) noted in the electronic medical record. Anatomic origin could not be determined for 1 patient in the MCC dataset after medical record examination.
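The proximal/distal rule above amounts to a simple lookup. The sketch below encodes it directly; the function name and the handling of unrecognized sites (returning `None`) are assumptions.

```python
PROXIMAL_SITES = {"cecum", "ascending colon", "hepatic flexure", "transverse colon"}
DISTAL_SITES = {"splenic flexure", "descending colon", "sigmoid colon", "rectum"}


def anatomic_origin(site):
    """Map a tumor site to 'proximal' or 'distal'; return None if unrecognized."""
    key = site.strip().lower()
    if key in PROXIMAL_SITES:
        return "proximal"
    if key in DISTAL_SITES:
        return "distal"
    return None  # assumption: unknown sites are left unclassified
```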

Tumor treatment exposure determination for the MCC dataset

Tumor treatment exposure status was determined by assessing the history of chemotherapy and/or radiation treatment within two years prior to surgical resection of the tumor sample. If there was no history of chemotherapy or radiation exposure prior to surgical resection of the tumor sample, the sample was considered treatment naïve (i.e., resection occurred "pre" treatment). If any history of chemotherapy or radiation within two years prior to surgical resection of the tumor sample was noted in the medical records, the tumor sample was considered treatment exposed (i.e., resection occurred "post" treatment).
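The two-year exposure window can be sketched as a date comparison. The function name, the 730-day approximation of two years, and the decision to label samples whose only treatment predates the window as "pre" are assumptions; the text does not specify how such cases were handled.

```python
from datetime import date, timedelta


def treatment_status(resection_date, treatment_dates, window_days=730):
    """Return 'post' if any chemotherapy/radiation date falls within the
    window before resection, otherwise 'pre'.

    treatment_dates: iterable of datetime.date objects (hypothetical input).
    """
    window_start = resection_date - timedelta(days=window_days)
    for treated_on in treatment_dates:
        # Only treatment strictly before resection and inside the window counts.
        if window_start <= treated_on < resection_date:
            return "post"
    return "pre"
```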

Anatomic origin classifier for Consortium samples

As clinical data on the Consortium dataset were limited, we imputed missing tumor anatomic origin by developing gene-expression and logistic regression-based anatomic origin classifiers, which categorize tumors as originating from either the proximal colon or the distal colon/rectum. These classifiers were developed based on naïve differential gene-expression analysis comparing proximally and distally originating tumors in the MCC dataset. A gene was considered differentially expressed if the fold change was >1.5 and the P value was <0.05. The classifiers were developed separately for primary tumors and metastases. Anatomic origin classification of primary tumors was based on the expression of the PRAC1 and HOXC6 genes (AUCmax = 0.93), while the classification of metastases was based on the expression of the PRAC1, HOXC6, OGN, and MUC12 genes (AUCmax = 0.85). We imputed missing anatomic origin for 20 primary tumors and 13 metastases of colorectal adenocarcinoma in the Consortium dataset. Gene-expression differences between proximally and distally originating primary colorectal adenocarcinoma tumors were used for classifying the anatomic origin of metastases. PRAC1 and MUC12 were found to be differentially expressed between proximally and distally originating primary tumors and metastases (Supplementary Figs. S3 and S4). The four genes (OGN, MUC12, PRAC1, and HOXC6) used to classify anatomic origin in the Consortium dataset were eliminated from all subsequent Consortium analyses.
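The differential-expression threshold (fold change >1.5 and P <0.05) can be sketched as a filter over log2 expression values. This is an illustration only: the function name is hypothetical, the P values are assumed to come from a test run elsewhere, and treating the fold-change cutoff as applying in either direction is an assumption.

```python
from statistics import mean


def differentially_expressed(log2_a, log2_b, p_values, min_fold=1.5, max_p=0.05):
    """Return genes whose fold change exceeds min_fold with P below max_p.

    log2_a, log2_b: dict gene -> list of log2 expression values in each group.
    p_values: dict gene -> P value computed elsewhere (hypothetical input).
    """
    selected = []
    for gene in log2_a:
        diff = mean(log2_a[gene]) - mean(log2_b[gene])
        fold = 2 ** abs(diff)  # assumption: up- and down-regulation both count
        if fold > min_fold and p_values[gene] < max_p:
            selected.append(gene)
    return selected
```

On the log2 scale, a fold change of 1.5 corresponds to a difference in means of about 0.585, which is why the difference is exponentiated before comparison.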


Treatment classification for Consortium samples

Similar to the development of an anatomic origin classifier, we developed gene-expression logistic regression-based classifiers to determine the treatment exposure status of primary tumors and metastases. The classifiers categorize tumors as either treatment exposed or treatment naïve. The treatment classifier for primary Consortium samples was developed based on naïve differential gene-expression analysis comparing primary pre- and post-treatment tumors in the MCC dataset. A gene was considered differentially expressed if the fold change was >1.5 and the P value was <0.05. The MCC dataset was split into a training and a test set, and the combination of differentially expressed genes that maximized the AUC in the test set (AUCmax = 0.815) was incorporated into the logistic regression model and served as the basis for the primary tumor treatment classifier. The eight genes used to classify primary tumors are SCRG1, HBB, GREM2, SCN7A, CHRDL2, HSPB6, CXCL12, and PLP1.
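The selection of a gene combination by test-set AUC can be sketched as an exhaustive search. The sketch below is not the procedure used in the study: `fit_and_score` is a hypothetical callback standing in for fitting a logistic regression on the training set and returning predicted scores for the test set, and the `max_size` cap is an assumption added to keep the search small.

```python
from itertools import combinations


def auc(scores, labels):
    """Area under the ROC curve: the fraction of positive/negative pairs in
    which the positive sample scores higher (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_subset(genes, fit_and_score, test_labels, max_size=3):
    """Return (best AUC, gene subset) over all subsets up to max_size genes.

    fit_and_score(subset) is a hypothetical callback that trains on the
    training split and returns one score per test sample.
    """
    best = (0.0, ())
    for k in range(1, max_size + 1):
        for subset in combinations(genes, k):
            value = auc(fit_and_score(subset), test_labels)
            if value > best[0]:  # keep the first subset reaching the maximum
                best = (value, subset)
    return best
```

Choosing the subset on the same test set used to report the AUC tends to give an optimistic estimate, which is one reason a separate validation cohort is useful.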

We were unable to identify a set of genes meeting the differential expression threshold criteria when comparing pre- and post-treatment metastases in the MCC dataset, and thus used gene-expression differences between pre- and post-treatment primary tumors to develop the treatment classifier for metastases (Supplementary Figs. S5 and S6). The treatment classifier for metastases (AUCmax = 0.722) was based on the expression of 11 genes: SYNPO2, GREM2, ADH1B, HBB, C7, PLN, SFRP1, AGTR1, CHRDL2, MAMDC2, and MYH11. For all subsequent Consortium analyses, we excluded the 16 genes that were used to develop the treatment classifiers.

Tissue-specific gene-level and pathway-level weights

Based on the bioinformatics approach developed by Frost (15) to address the tissue specificity of genes and gene sets, we computed lung, liver, colon, and rectum tissue-specific weights for individual genes and gene sets (pathways) in the Molecular Signatures Database (MSigDB Version 6.0; ref. 16). Tissue-sensitive analyses were performed by eliminating genes and pathways exhibiting tissue specificity for lung, liver, colon, or rectum tissues. The filtering criteria for gene-level and pathway-level tissue specificity were as follows: all genes with a >2-fold increase in tissue-specific expression and all pathways with a tissue-specific weight >10 were labeled as tissue specific. Additional information on the development of gene and pathway-level tissue weights has been described previously (15).
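The filtering criteria above can be sketched as follows. The computation of the weights themselves follows ref. 15 and is not reproduced here; the function name and the nested-dictionary inputs are hypothetical, and the weights are assumed to be precomputed.

```python
def tissue_sensitive_filter(gene_fold, pathway_weight, max_fold=2.0, max_weight=10.0):
    """Keep genes and pathways that are not tissue specific in any host tissue.

    gene_fold: dict gene -> dict tissue -> fold increase in that tissue.
    pathway_weight: dict pathway -> dict tissue -> tissue-specific weight.
    Both inputs are hypothetical, precomputed tables.
    """
    kept_genes = [
        gene for gene, by_tissue in gene_fold.items()
        if all(fold <= max_fold for fold in by_tissue.values())
    ]
    kept_pathways = [
        name for name, by_tissue in pathway_weight.items()
        if all(weight <= max_weight for weight in by_tissue.values())
    ]
    return kept_genes, kept_pathways
```

A gene or pathway is dropped as soon as it exceeds the threshold in any one of the tissues considered, matching the "lung, liver, colon, or rectum" wording.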

Linear models for microarray data analysis and CAMERA application

CAMERA (17) application considered pathways from the MSigDB Hallmark (18) and C2 (CGP and CP) collections. Tissue-agnostic linear models for microarray data analysis (LIMMA; ref. 19) and CAMERA application to determine differentially expressed genes and gene sets between primary tumors and metastases did not consider tissue-specific gene or pathway expression. All tissue-agnostic analyses adjusted for age, sex, tumor treatment exposure status, and anatomic origin. Tissue-sensitive LIMMA and CAMERA application to determine differential expression of genes and pathways between primary tumors and metastases accounted for tissue-specific expression by eliminating all genes and pathways exhibiting tissue specificity based on the filtering criteria described above. Tissue-sensitive analyses also adjusted for age, sex, anatomic origin, and treatment exposure status (unless explicitly indicated otherwise) when comparing primary tumors and metastases. In addition, previous studies have highlighted genetic and transcriptomic differences between colorectal adenocarcinoma tumors arising from the proximal colon and the distal colon/rectum (20). Therefore, if tumor anatomic origin and treatment exposure status were missing for samples in the Consortium dataset, the values imputed by the gene-expression–based classifiers were used as inputs for LIMMA and CAMERA (Supplementary Figs. S4 and S6). MSI status was available for a subset of MCC tumor samples (n = 71). Microsatellite instable tumors were typically MSI-high and were classified as such, while tumors labeled MSI-low and microsatellite stable (MSS) were classified as MSS. For this small cohort, LIMMA and CAMERA analyses adjusted for MSI status. Lastly, we also used CAMERA to determine pathways differentially expressed between treatment-naïve and treatment-exposed tumors while adjusting for age, sex, anatomic origin, and tumor type (primary tumor or metastasis of colorectal adenocarcinoma).

Determination of the M1 and M2 clusters in the MCC and Consortium datasets

To discover subtypes of metastases, we performed unsupervised hierarchical clustering using the top 500 differentially expressed genes between primary tumors and metastases in each dataset. Based on our clustering analysis, two main clusters of metastases were identified in each dataset, and differential pathway expression between these two clusters of metastases was determined using CAMERA. To ensure that the M1 and M2 clusters in each dataset exhibit similar underlying biology, we developed an M1/M2 classifier trained on the MCC dataset and tested on the Consortium dataset. Cluster membership was defined by the hierarchical clustering of the top 500 differentially expressed genes between primary tumors and metastases in each dataset. Inputs for the logistic regression-based classifier included the covariates age, sex, tumor anatomic origin, and treatment exposure status as well as single-sample gene set enrichment scores (21) for pathways found to be differentially expressed between the M1 and M2 clusters in the MCC dataset.

Examining adaptations to distal sites of metastases observed in primary tumors

We examined adaptations to distal sites of metastases in primary tumors of patients who went on to develop either lung (n = 18) or liver (n = 48) metastases as their initial distal metastasis. Tumors from patients who developed both lung and liver metastases at once or whose initial distal site of metastasis was not the liver or lung were excluded. Using CAMERA, we determined differential pathway expression. Next, we assessed whether these differential pathways exhibited high tissue specificity for normal lung or liver tissues (tissue weight >10). Enrichment of pathways with high tissue-specific weights was considered to be an adaptation to the lung and liver observed in the primary colorectal adenocarcinoma tumors.

CMS classification and logistic regression

CMS is a transcriptome-based classification of colorectal adenocarcinomas with prognostic value (22). Tumors were classified into CMS groups 1–4 or CMS_NA using the single-sample predictor method as previously described (22). We performed logistic regression to determine the association of CMS with primary tumors or metastases of colorectal adenocarcinoma while adjusting for age, sex, anatomic origin, and tumor treatment exposure status. For both the MCC and Consortium datasets, we excluded CMS3 from the logistic regression models, as metastases were never classified as CMS3. Inclusion of CMS3 in the regression model would have resulted in complete separation, in which case a finite maximum likelihood estimate does not exist and the beta coefficients for the predictor are inflated.
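The separation problem described above can be checked before fitting a model. The sketch below flags any category level that is observed with only one outcome class, which is the situation reported for CMS3; the function name is hypothetical, and strictly speaking a single such level produces what is often called quasi-complete separation.

```python
def separating_levels(categories, outcomes):
    """Return category levels observed with only one outcome class.

    categories: list of category labels (e.g., CMS calls), one per sample.
    outcomes: list of binary outcomes (e.g., 1 = metastasis, 0 = primary).
    """
    seen = {}  # level -> set of outcome values observed for that level
    for level, outcome in zip(categories, outcomes):
        seen.setdefault(level, set()).add(outcome)
    return sorted(level for level, values in seen.items() if len(values) == 1)
```

For a level with no events in one class, the likelihood keeps increasing as the corresponding coefficient grows without bound, so the fitted coefficient and its standard error become very large.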

Statistical analysis

All statistical analyses were performed in R (Version 3.5). Differences between groups were assessed using the Wilcoxon rank-sum test. Spearman rank correlation was used to determine the degree of overlap in pathway enrichment comparisons between datasets. We set the false discovery rate (FDR) to 0.1 to identify associations in expression analyses.
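An FDR threshold of 0.1 is commonly applied with the Benjamini-Hochberg step-up procedure. The text does not name the FDR method, so the choice of Benjamini-Hochberg here is an assumption, as is the function name; the sketch is in Python for illustration only.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return a list of booleans, True where the hypothesis is rejected
    at the given FDR using the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

Because the rule is step-up, a P value that misses its own threshold can still be rejected when a larger P value further down the ranking meets its threshold.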

Results

Characteristics of TCC participants

Clinical and gene-expression data corresponding to human colorectal adenocarcinoma surgical resection biospecimens were obtained through the MCC TCC Protocol (12), including participating TCC Consortium sites. Data were separated into a discovery cohort of MCC samples (n = 517) and a validation cohort of non-MCC TCC Consortium samples (n = 618). Both datasets primarily consist of unmatched tumor samples, with n = 15 matched primary tumor and metastatic lesion samples from the same individual in the MCC dataset. Therefore, given the small number of paired samples, all statistical analyses were performed ignoring paired status. Clinical characteristics and tumor sample inclusion/exclusion criteria for the discovery cohort, hereafter referred to as the MCC dataset, and the validation cohort, hereafter referred to as the Consortium dataset, are described in Table 1 and Fig. 1. In the MCC dataset, lung and liver metastases of colorectal adenocarcinoma were more likely to originate from the distal colon or rectum (Wilcoxon rank-sum test; P = 0.00073) and were more likely to have been exposed to chemotherapy and/or radiation treatment prior to surgery (Wilcoxon rank-sum test; P < 0.0001). Interestingly, even within metastases, we observed gene-expression differences between metastases originating from the proximal colon and those originating from the distal colon/rectum. Therefore, we adjusted for anatomic origin and treatment exposure status in all analyses comparing primary tumors and metastases of colorectal adenocarcinoma (Supplementary Figs. S3–S7).

Elimination of host tissue–specific gene expression

Lung and liver resection of colorectal adenocarcinoma metastases improves long-term survival (23–25). Surgical resection margins should be tumor-free to ensure removal of the entire tumor mass.

Table 1. Baseline characteristics for TCC MCC and Consortium participants in the primary and metastasis cohorts

| Characteristic | MCC (n = 517): Primary (n = 333) | MCC (n = 517): Metastases (n = 184) | Consortium (n = 618): Primary (n = 545) | Consortium (n = 618): Metastases (n = 73) |
|---|---|---|---|---|
| Age at diagnosis (years) | 63.64 | 59.30 | 67.94 | 58.09 |
| Race/ethnicity (%) | | | | |
| White | 303 (90.9%) | 157 (85.3%) | 470 (86.2%) | 62 (85.0%) |
| Black/African American | 15 (4.5%) | 10 (5.4%) | 36 (6.6%) | 8 (10.9%) |
| Other/unknown | 15 (4.5%) | 17 (9.2%) | 39 (7.2%) | 3 (4.1%) |
| Gender (%) | | | | |
| Male | 183 (54.9%) | 106 (57.6%) | 265 (48.6%) | 47 (64.4%) |
| Female | 150 (45.0%) | 78 (42.4%) | 280 (51.4%) | 26 (35.6%) |
| Treatment status (%) | | | | |
| Pretreatment | 235 (70.6%) | 56 (30.4%) | 448 (82.2%)ᵃ | 10 (13.7%)ᵃ |
| Post-treatment | 98 (29.4%) | 128 (69.6%) | 97 (17.8%)ᵃ | 63 (86.3%)ᵃ |
| Chemotherapy only | 28 (8.4%) | 100 (54.3%) | | |
| Chemotherapy and radiation | 63 (18.9%) | 27 (14.7%) | | |
| Radiation only | 7 (2.1%) | 1 (0.5%) | | |
| Anatomic origin (%) | | | | |
| Proximal colon | 129 (38.7%) | 44 (23.9%) | 284 (52.1%)ᵃ | 19 (26.0%)ᵃ |
| Distal colon/rectum | 204 (61.3%) | 139 (75.5%) | 261 (47.9%)ᵃ | 54 (74.0%)ᵃ |
| MSI status (%) | | | | |
| MSI-high | 3 (0.9%) | 0 (0%) | | |
| MSS: MSI-low | 5 (1.5%) | 3 (1.6%) | | |
| MSS: MSS | 37 (11.1%) | 23 (12.5%) | | |
| Unknown | 288 (86.4%) | 158 (85.9%) | | |
| Primary tumor stage (%) | | | | |
| Stage 1 | 56 (16.8%) | | 0 (0%) | |
| Stage 2 | 105 (31.5%) | | 7 (1.3%) | |
| Stage 3 | 100 (30.0%) | | 21 (3.8%) | |
| Stage 4 | 72 (21.7%) | | 27 (5.0%) | |
| Unknown | | | 490 (89.9%) | |
| Site of metastasis (%) | | | | |
| Liver | | 141 (76.6%) | | 56 (76.7%) |
| Lung | | 43 (23.4%) | | 17 (23.3%) |

NOTE: Treatment status refers to the exposure of primary tumors and metastases of colorectal adenocarcinoma to chemotherapy and/or radiation treatment prior to surgical resection of the tumor. MSI status was determined using PCR for five MSI markers (BAT25, BAT26, NR21, NR24, and NR27). Only 15 samples in the MCC dataset were paired samples with primary tumors and metastases originating from the same patient.

ᵃIf anatomic origin and treatment exposure status for tumors in the Consortium dataset were not available, the missing data were imputed using gene-expression–based classifiers. Anatomic origin was imputed for 20 primary tumors and 13 metastases of colorectal adenocarcinoma in the Consortium dataset. Treatment was imputed for 336 primary tumors and 67 metastases of colorectal adenocarcinoma in the Consortium dataset. Values that include imputed data are marked with ᵃ.


Therefore, some remnants of normal tissue will inevitably be found in resected biospecimens. This makes effective comparative transcriptomic analysis of primary tumors and metastatic lesions challenging, as expression differences between the normal primary and metastasis host tissue sites can overshadow the true differences between primary tumors and metastases. To address this issue, we generated tissue-specific gene and pathway-level weights, as previously described (15), for all normal host tissues of interest (colon, rectum, lung, and liver). The use of discretized tissue-specific weights eliminates the need to profile each normal host tissue sample adjacent to the metastatic colorectal adenocarcinoma lesions for the purposes of normalizing the tumor transcriptomic data. Using these weights, we performed tissue-sensitive analyses that only included genes whose expression levels in the normal tissue were below a specific threshold (described further in Materials and Methods). This resulted in the elimination of genes exhibiting high tissue specificity. In addition, we performed tissue-agnostic analyses, which ignore the tissue specificity of genes and pathways. For the tissue-sensitive analysis, we eliminated all genes and pathways exhibiting tissue-specific activity in lung, liver, colon, or rectum (Supplementary Table S1); the tissue-agnostic analysis included all genes and pathways irrespective of tissue specificity.

As a visual confirmation of this approach, we applied t-distributed stochastic neighbor embedding (tSNE; ref. 26) on the MCC and Consortium gene-expression datasets in the tissue-agnostic (i.e., without elimination of tissue-specific genes) and tissue-sensitive (i.e., with elimination of tissue-specific genes) settings (Fig. 2). In the tissue-agnostic setting, samples clustered based on the site (colon/rectum, lung, or liver) of tumor resection (Hotelling T² test comparing tSNE clusters of liver and lung metastases; MCC: P < 1 × 10^-16). Conversely, in the tissue-sensitive setting, we observed separation of primary tumors and metastases. However, we no longer observed sample clustering based on the site of metastatic tumor resection (Hotelling T² test comparing tSNE clusters of liver and lung metastases; MCC: P = 0.0491), such that lung and liver metastases are integrated across clusters (Fig. 2). To evaluate the potential influence of tumor purity, we inferred tumor purity for all samples in both datasets using the ESTIMATE algorithm (27), which uses gene-expression signatures to infer fractions of stromal, immune, and cancer cells from a mixture. We did not find significant differences in tumor purity between samples drawn from primary tumors and metastases of colorectal adenocarcinoma (MCC Wilcoxon rank-sum test, P = 0.7087; Consortium Wilcoxon rank-sum test, P = 0.695; Supplementary Fig. S7). This indicated tumor purity is not the main driver of differences between primaries and metastases, as both are likely capturing similar quantities of normal host tissue during tumor resection.
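The tumor purity comparison relies on a Wilcoxon rank-sum test. The sketch below shows one way such a test can be computed, using the normal approximation with a tie correction and no continuity correction. The purity values are invented for illustration, the function names are hypothetical, and the exact variant of the test used in the study is not stated here, so this should be read as an assumption rather than the authors' procedure.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns the U statistic of the first sample and the P value.
    """
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    n = n1 + n2
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_value


if __name__ == "__main__":
    # Hypothetical purity estimates for primaries and metastases.
    primary_purity = [0.71, 0.64, 0.80, 0.58, 0.69, 0.75]
    metastasis_purity = [0.66, 0.73, 0.61, 0.78, 0.70, 0.68]
    print(rank_sum_test(primary_purity, metastasis_purity))
```

A large P value, as reported for both datasets, means the ranks of purity estimates are interleaved between the two groups rather than shifted in one direction.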

We examined differential expression of genes at the pathway level using pathways in the Hallmark (18) and C2 collections of the Molecular Signatures Database (MSigDB Version 6.0; ref. 16). Differential pathway analyses, adjusting for age, sex, treatment exposure status, and anatomic site of origin, were performed between colorectal adenocarcinoma primaries and lung metastases as well as between colorectal adenocarcinoma primaries and liver metastases in the tissue-sensitive and tissue-agnostic settings (Table 2; Supplementary Tables S2 and S3). Materials and Methods and Supplementary Materials describe additional details about the pathway-level analyses. Spearman rank correlation was used to assess if similar pathways were enriched when comparing colorectal adenocarcinoma primaries with lung metastases and when comparing colorectal adenocarcinoma primaries with liver metastases. In the tissue-agnostic setting, the rank correlations (r) observed for the Hallmark and C2 gene set collections in the MCC dataset were r_Hallmark = 0.19 and r_C2 = 0.1, respectively, while in

Figure 1.

CONSORT flow diagram detailing inclusion and exclusion criteria for primary and metastatic colorectal adenocarcinoma samples. All possible colorectal adenocarcinoma samples with available gene-expression data originating from the large bowel, rectum, or anus were considered for this study. Colorectal adenocarcinoma metastases were restricted to those found in the liver or lung. All tumor samples were restricted to one sample per patient with the exception of 15 patients with matching colorectal adenocarcinoma primary and lung or liver metastases.

Transcriptomes of Primary and Metastatic Colorectal Tumors


the tissue-sensitive setting they were r_Hallmark = 0.61 and r_C2 = 0.25. Convergence of pathway enrichment results in the tissue-sensitive but not tissue-agnostic settings was replicated in the Consortium dataset (Table 2). In the tissue-agnostic setting, liver-specific pathways, such as bile acid production (FDR_MCC = 1.77 × 10^-08) and xenobiotic metabolism (FDR_MCC = 2.39 × 10^-13), were enriched in liver metastases compared with primary tumors. However, in the tissue-sensitive setting, cancer-related pathways, such as MYC targets (FDR_MCC = 4.74 × 10^-04, FDR_Consortium = 8.59 × 10^-09), were enriched in both liver and lung metastases as compared with primaries in both datasets (Tables 2 and 3; Supplementary Tables S2 and S3). These results highlight the role of host organ tissue gene expression when comparing primaries and metastases. Elimination of tissue-specific gene expression of the host organs allowed us to perform a meta-analysis of liver and lung metastases to determine defining features of metastases of colorectal adenocarcinoma after also accounting for tumor anatomic origin and tumor treatment exposure status.
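To make the rank-correlation comparison concrete, the following sketch computes a Spearman correlation between pathway enrichment statistics from two comparisons (primary vs. liver metastases and primary vs. lung metastases). The enrichment statistics are invented, the function names are hypothetical, and the specific statistic that was ranked in the study is not restated here, so it is treated as an assumption.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(stats_a, stats_b):
    """Spearman correlation over the pathways present in both dictionaries."""
    shared = sorted(set(stats_a) & set(stats_b))
    ranks_a = average_ranks([stats_a[p] for p in shared])
    ranks_b = average_ranks([stats_b[p] for p in shared])
    return pearson(ranks_a, ranks_b)


if __name__ == "__main__":
    # Hypothetical enrichment statistics (positive = enriched in metastases).
    liver = {
        "HALLMARK_MYC_TARGETS_V2": 3.1,
        "HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION": -2.0,
        "HALLMARK_DNA_REPAIR": 0.5,
        "HALLMARK_GLYCOLYSIS": 1.2,
    }
    lung = {
        "HALLMARK_MYC_TARGETS_V2": 2.5,
        "HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION": -1.1,
        "HALLMARK_DNA_REPAIR": 1.9,
        "HALLMARK_GLYCOLYSIS": 0.3,
    }
    print(spearman(liver, lung))
```

A correlation near 1 indicates that the two metastatic sites rank pathways similarly, which is the pattern reported once tissue-specific pathways are removed.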

The role of tumor treatment exposure in comparing colorectal adenocarcinoma primaries and metastases

Given the unbalanced distribution of treatment exposure between colorectal adenocarcinoma primary tumors and metastatic lesion biospecimens that underwent gene-expression profiling, we aimed to examine the role of treatment as a potential confounder when comparing the transcriptomic patterns of primaries and metastases. In the tissue-sensitive setting, we found that pathways such as angiogenesis and hypoxia were enriched in metastases of colorectal adenocarcinoma compared with primary tumors when treatment status is ignored (Table 3; Supplementary Table S4). Importantly, angiogenesis and hypoxia were also enriched in treatment-exposed tumors relative to treatment-naïve tumors, indicating that their apparent enrichment in metastases is due to confounding by treatment status (Table 3; Supplementary Tables S4 and S5). Supporting the role of treatment status as a confounder, angiogenesis and hypoxia are no longer enriched in metastases when the pathway analysis adjusts for treatment exposure status. In order to determine features of chemotherapy and/or radiation treatment-exposed metastases of colorectal adenocarcinoma, we compared treatment-naïve (n = 56) and treatment-exposed (n = 128) metastases in the MCC dataset to one another. Treatment-exposed metastases shared characteristics with treatment-exposed primaries, such as increased epithelial–mesenchymal transition (EMT; HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION; FDR_MCC = 4.81 × 10^-04), angiogenesis (HALLMARK_ANGIOGENESIS; FDR_MCC = 6.21 × 10^-02), and hypoxia (HALLMARK_HYPOXIA; FDR_MCC = 6.36 × 10^-02), whereas treatment-naïve metastases exhibited increased MYC_TARGETS_V2

Figure 2.

tSNE visualizations of MCC and Consortium primary colorectal adenocarcinoma and lung and liver metastases of colorectal adenocarcinoma in the tissue-agnostic and tissue-sensitive analysis settings. tSNE visualizations were generated using the first 50 principal components based on tumor gene-expression data. In the tissue-agnostic setting, all possible genes were considered for PCA and subsequent tSNE visualization. In the tissue-sensitive setting, genes exhibiting tissue specificity, defined as a 2-fold expression increase of a given gene in normal lung, liver, colon, or rectum tissues, were excluded.


Table 2. Differences between primary tumors and metastases of colorectal adenocarcinomas with and without consideration of tissue-specific pathway expression

Tissue agnostic

C | MCC primary CRC vs. liver metastases: Pathway | FDR | Direction | MCC primary CRC vs. lung metastases: Pathway | FDR | Direction
H | HALLMARK_XENOBIOTIC_METABOLISM | 2.39 × 10^-13 | Up | HALLMARK_MYC_TARGETS_V2 | 1.40 × 10^-03 | Up
H | HALLMARK_COAGULATION | 1.77 × 10^-08 | Up | HALLMARK_MYC_TARGETS_V1 | 4.76 × 10^-03 | Up
H | HALLMARK_BILE_ACID_METABOLISM | 2.66 × 10^-05 | Up | HALLMARK_DNA_REPAIR | 2.82 × 10^-02 | Up
H | HALLMARK_MYC_TARGETS_V2 | 1.69 × 10^-04 | Up | HALLMARK_E2F_TARGETS | 9.31 × 10^-02 | Up
H | HALLMARK_MYC_TARGETS_V1 | 4.83 × 10^-03 | Up | HALLMARK_TNFA_SIGNALING_VIA_NFKB | 1.98 × 10^-10 | Down
H | HALLMARK_FATTY_ACID_METABOLISM | 1.45 × 10^-02 | Up | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 9.12 × 10^-18 | Down
C2 | LIVER_SPECIFIC_GENES | 4.83 × 10^-84 | Up | LUNG_CANCER_DIFFERENTIATION_MARKERS | 1.23 × 10^-22 | Up
C2 | LIVER | 1.48 × 10^-67 | Up | COLON_AND_RECTAL_CANCER_UP | 7.91 × 10^-07 | Up
C2 | LIVER_CANCER_SUBCLASS_PROLIFERATION_DN | 7.29 × 10^-42 | Up | BREAST_CANCER_20Q11_AMPLICON | 1.18 × 10^-06 | Up
C2 | LIVER_CANCER_SURVIVAL_UP | 2.19 × 10^-26 | Up | REACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION | 1.43 × 10^-06 | Up
C2 | BIOCARTA_INTRINSIC_PATHWAY | 2.77 × 10^-26 | Up | RICKMAN_HEAD_AND_NECK_CANCER_D | 1.67 × 10^-06 | Up
C2 | KEGG_COMPLEMENT_AND_COAGULATION_CASCADES | 2.43 × 10^-24 | Up | REACTOME_PEPTIDE_CHAIN_ELONGATION | 2.32 × 10^-06 | Up

Tissue sensitive

C | MCC primary CRC vs. liver metastases: Pathway | FDR | Direction | MCC primary CRC vs. lung metastases: Pathway | FDR | Direction
H | HALLMARK_MYC_TARGETS_V2 | 4.74 × 10^-04 | Up | HALLMARK_MYC_TARGETS_V2 | 9.25 × 10^-04 | Up
H | HALLMARK_MTORC1_SIGNALING | 7.7 × 10^-02 | Up | HALLMARK_DNA_REPAIR | 7.79 × 10^-02 | Up
H | HALLMARK_DNA_REPAIR | 8.44 × 10^-02 | Up | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 1.28 × 10^-07 | Down
H | HALLMARK_GLYCOLYSIS | 9.32 × 10^-02 | Up | HALLMARK_UV_RESPONSE_DN | 4.61 × 10^-04 | Down
H | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 5.38 × 10^-04 | Down | HALLMARK_MYOGENESIS | 2.61 × 10^-03 | Down
H | HALLMARK_UV_RESPONSE_DN | 4.14 × 10^-03 | Down | HALLMARK_PANCREAS_BETA_CELLS | 1.50 × 10^-02 | Down
C2 | SEMENZA_HIF1_TARGETS | 1.01 × 10^-04 | Up | BREAST_CANCER_20Q11_AMPLICON | 1.84 × 10^-06 | Up
C2 | REACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION | 8.90 × 10^-04 | Up | REACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION | 2.00 × 10^-06 | Up
C2 | REACTOME_NONSENSE_MEDIATED_DECAY_ENHANCED_BY_THE_EXON_JUNCTION_COMPLEX | 1.27 × 10^-03 | Up | REACTOME_PEPTIDE_CHAIN_ELONGATION | 3.09 × 10^-06 | Up
C2 | BREAST_CANCER_20Q11_AMPLICON | 1.28 × 10^-03 | Up | KEGG_RIBOSOME | 4.51 × 10^-06 | Up
C2 | KEGG_RIBOSOME | 1.72 × 10^-03 | Up | REACTOME_3_UTR_MEDIATED_TRANSLATIONAL_REGULATION | 5.21 × 10^-06 | Up
C2 | REACTOME_PEPTIDE_CHAIN_ELONGATION | 1.89 × 10^-03 | Up | REACTOME_NONSENSE_MEDIATED_DECAY_ENHANCED_BY_THE_EXON_JUNCTION_COMPLEX | 5.21 × 10^-06 | Up

NOTE: Pathway enrichment differences are displayed between primary colorectal adenocarcinoma and liver colorectal adenocarcinoma metastases and primary colorectal adenocarcinoma and lung colorectal adenocarcinoma metastases in the MCC cohort in the tissue-agnostic and tissue-sensitive settings. Displayed are the top five pathways found in each analysis. Overlapping enrichment results between the analyses of primary colorectal adenocarcinoma vs. liver metastases and primary colorectal adenocarcinoma vs. lung metastases are highlighted in blue. Only filtered pathways enriched with an FDR < 0.1 are shown. All pathways examined are from the MSigDB database. C, collection in the MSigDB database; H, Hallmark collection; C2, curated (CGP and CP) collection. Direction is chosen in reference to pathways enriched in colorectal adenocarcinoma metastases, such that Up = enriched in colorectal adenocarcinoma metastases, while Down = enriched in primary colorectal adenocarcinoma. Filtered analyses were performed after removal of pathways exhibiting tissue specificity for liver, lung, colon, and rectum tissues in each collection examined. All analyses adjust for age, sex, treatment status, and anatomic origin. For the Consortium dataset, missing treatment status and anatomic origin were determined using gene expression–based classifiers.

Abbreviation: CRC, colorectal adenocarcinoma.


Table 3. The role of chemotherapy and radiation exposure when comparing primary tumors and metastases of colorectal adenocarcinoma

MCC dataset
Column groups: (1) Primary CRC vs. lung and liver CRC metastases, adjusted for treatment status; (2) Primary CRC vs. lung and liver CRC metastases, not adjusted for treatment status; (3) Pre vs. post treatment, adjusted for primary vs. metastatic tumor status.

C | (1) Pathway | FDR | (2) Pathway | FDR | (3) Pathway | FDR | Direction
H | HALLMARK_MYC_TARGETS_V2 | 9.25 × 10^-04 | HALLMARK_ANGIOGENESIS | 6.69 × 10^-02 | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 3.30 × 10^-18 | Up
H | HALLMARK_DNA_REPAIR | 7.78 × 10^-02 | HALLMARK_HYPOXIA | 6.69 × 10^-02 | HALLMARK_MYOGENESIS | 4.28 × 10^-12 | Up
H | HALLMARK_GLYCOLYSIS | 7.78 × 10^-02 | HALLMARK_GLYCOLYSIS | 1.75 × 10^-01 | HALLMARK_HYPOXIA | 9.15 × 10^-08 | Up
H | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 9.25 × 10^-04 | HALLMARK_PANCREAS_BETA_CELLS | 6.69 × 10^-02 | HALLMARK_MYC_TARGETS_V2 | 4.60 × 10^-18 | Down
H | HALLMARK_UV_RESPONSE_DN | 4.66 × 10^-03 | HALLMARK_UV_RESPONSE_DN | 1.59 × 10^-01 | HALLMARK_DNA_REPAIR | 1.36 × 10^-03 | Down
H | HALLMARK_PANCREAS_BETA_CELLS | 2.13 × 10^-02 | HALLMARK_PROTEIN_SECRETION | 1.76 × 10^-01 | HALLMARK_MTORC1_SIGNALING | 2.00 × 10^-03 | Down

Consortium dataset
Column groups: (1) Primary CRC vs. lung and liver CRC metastases, adjusted for treatment status; (2) Primary CRC vs. lung and liver CRC metastases, not adjusted for treatment status; (3) Pre vs. post treatment, adjusted for primary vs. metastatic tumor status.

C | (1) Pathway | FDR | (2) Pathway | FDR | (3) Pathway | FDR | Direction
H | HALLMARK_MYC_TARGETS_V2 | 8.59 × 10^-09 | HALLMARK_ANGIOGENESIS | 2.17 × 10^-03 | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 1.22 × 10^-24 | Up
H | HALLMARK_MTORC1_SIGNALING | 6.38 × 10^-04 | HALLMARK_HYPOXIA | 2.86 × 10^-03 | HALLMARK_MYOGENESIS | 1.45 × 10^-12 | Up
H | HALLMARK_GLYCOLYSIS | 5.75 × 10^-03 | HALLMARK_PEROXISOME | 5.54 × 10^-02 | HALLMARK_UV_RESPONSE_DN | 7.82 × 10^-10 | Up
H | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 2.53 × 10^-14 | HALLMARK_PANCREAS_BETA_CELLS | 1.96 × 10^-01 | HALLMARK_MYC_TARGETS_V2 | 8.97 × 10^-19 | Down
H | HALLMARK_MYOGENESIS | 1.46 × 10^-06 | HALLMARK_ESTROGEN_RESPONSE_LATE | 6.72 × 10^-01 | HALLMARK_MTORC1_SIGNALING | 4.64 × 10^-06 | Down
H | HALLMARK_UV_RESPONSE_DN | 4.93 × 10^-06 | HALLMARK_SPERMATOGENESIS | 6.72 × 10^-01 | HALLMARK_DNA_REPAIR | 1.68 × 10^-04 | Down

NOTE: We evaluated pathway enrichment differences between primaries and metastases while adjusting for tumor treatment status and while ignoring tumor treatment status. In addition, we examined differences between treatment-naïve and treatment-exposed tumors in both datasets. Here we display the top three pathways enriched in metastases and primary colorectal adenocarcinoma tumors in the treatment-adjusted and treatment-naïve analyses, as well as the top three pathways enriched in naïve and treatment-exposed tumors. Pathways overlapping between the MCC and Consortium analyses are highlighted in yellow. All pathways examined are from the MSigDB gene set collections. Top three upregulated and top three downregulated pathways from each MSigDB collection are shown. C, collection in the MSigDB database; H, Hallmark collection. Direction is chosen in reference to pathways enriched in colorectal adenocarcinoma metastases (Up) or in treatment-exposed samples (Up). The down direction refers to pathways enriched in either primary colorectal adenocarcinoma tumors or treatment-naïve tumors depending on the analysis. All analyses contrasting primary colorectal adenocarcinoma and metastases of colorectal adenocarcinoma were performed after filtering for pathways exhibiting tissue specificity for colon, rectum, lung, or liver tissues, and after adjusting for age, sex, anatomic origin, and treatment status (unless indicated otherwise). When contrasting pathway enrichment between different treatment groups, we adjusted for tumor type (primary colorectal adenocarcinoma vs. lung/liver colorectal adenocarcinoma metastases). For Consortium cases, missing data on treatment status and anatomic origin were imputed using gene expression–based classifiers.

Abbreviation: CRC, colorectal adenocarcinoma.


(FDR_MCC = 3.59 × 10^-07) and proliferative activity (REACTOME_S_PHASE; FDR_MCC = 2.76 × 10^-03). These findings (Table 3; Supplementary Table S6) not only highlighted the role of treatment exposure in altering the transcriptomic landscapes of both colorectal adenocarcinoma primary tumors and metastases, but also demonstrated the importance of considering treatment exposure as a covariate when comparing gene-expression patterns of primary tumors and metastatic lesions.
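The confounding argument can be made concrete with a small numerical sketch. It assumes a toy pathway score that depends only on treatment exposure and an unbalanced design in which metastases are mostly treated and primaries are mostly untreated. Stratifying by treatment is used here as a simple stand-in for covariate adjustment; the study itself describes adjusted regression analyses, and all counts, scores, and function names below are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def build_samples():
    """Toy cohort as (is_metastasis, is_treated, pathway_score) tuples.

    The score depends on treatment only: 2.0 if treated, 1.0 otherwise.
    """
    counts = {(0, 0): 9, (0, 1): 1, (1, 0): 3, (1, 1): 7}
    samples = []
    for (is_met, treated), n in counts.items():
        score = 2.0 if treated else 1.0
        samples.extend([(is_met, treated, score)] * n)
    return samples


def unadjusted_difference(samples):
    """Mean score in metastases minus mean score in primaries."""
    mets = [s for m, _, s in samples if m == 1]
    prims = [s for m, _, s in samples if m == 0]
    return mean(mets) - mean(prims)


def treatment_adjusted_difference(samples):
    """Average of within-stratum differences, weighted by stratum size."""
    total = len(samples)
    adjusted = 0.0
    for treated in (0, 1):
        stratum = [row for row in samples if row[1] == treated]
        mets = [s for m, _, s in stratum if m == 1]
        prims = [s for m, _, s in stratum if m == 0]
        adjusted += (len(stratum) / total) * (mean(mets) - mean(prims))
    return adjusted


if __name__ == "__main__":
    cohort = build_samples()
    print(unadjusted_difference(cohort))          # apparent enrichment in metastases
    print(treatment_adjusted_difference(cohort))  # vanishes within treatment strata
```

In this toy cohort the apparent enrichment in metastases comes entirely from the imbalance in treatment exposure, which parallels the behavior reported for angiogenesis and hypoxia.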

Key gene and pathway enrichment differences between colorectal adenocarcinoma primaries and metastases

We aimed to discover transcriptomic signatures of metastatic lesions after consideration of host tissue expression, treatment exposure status, tumor anatomic origin, age, and sex. We examined pathway and gene enrichment differences between colorectal adenocarcinoma metastases in the lung and liver versus the colorectal adenocarcinoma primary tumors (Supplementary Tables S7 and S8) in the MCC discovery and Consortium validation datasets.

Examination of the Hallmark collection showed that colorectal adenocarcinoma metastases exhibited increased MYC signaling (FDR_MCC = 4.74 × 10^-04), DNA repair (FDR_MCC = 8.44 × 10^-02), and glycolysis (FDR_MCC = 7.78 × 10^-02) activity (Table 3; Supplementary Table S7). Examination of the C2 collection provides additional support for enhanced MYC signaling in metastases of colorectal adenocarcinoma based on the numerous transcription, translation, and ribosomal pathways found to be enriched in metastases (Supplementary Table S7). Metabolic machinery was also altered in metastases, showing increased gluconeogenesis (MOOTHA_GLUCONEOGENESIS; FDR_MCC = 6.53 × 10^-03 and REACTOME_GLUCONEOGENESIS; FDR_MCC = 6.52 × 10^-03) activity, likely as a result of MYC upregulation. In addition, hypoxia-inducible factor (HIF) targets (SEMENZA_HIF1_TARGETS; FDR_MCC = 9.87 × 10^-05; ref. 28) and hypoxia targets of VHL (WACKER_HYPOXIA_TARGETS_OF_VHL; FDR_MCC = 1.18 × 10^-02; ref. 29) were also enriched in metastatic lesions, albeit to a lesser degree than MYC and the metabolic changes associated with it.

MSI status could be a potential confounder when assessing differences between primary tumors and colorectal adenocarcinoma metastases, as MSI-high tumors are typically diagnosed at a less advanced stage (30). Therefore, we replicated our findings in a smaller subset of samples in the MCC dataset for which MSI status was available and adjusted for in the tissue-sensitive setting (Supplementary Table S9). We found the PECE_MAMMARY_STEM_CELL_UP (FDR_MCC = 6.84 × 10^-03; ref. 31) and BENPORATH_ES_CORE_NINE (FDR_MCC = 6.69 × 10^-02; ref. 32) gene sets, which are potential cancer stem cell pathways, to be enriched in lung and liver metastases of colorectal adenocarcinoma. We observed almost no overlap between the genes defining these stem cell signatures and genes defining EMT activity (Fig. 3). As such, our findings suggest EMT and cancer stemness are not necessarily coupled, as EMT signatures were more prevalent in colorectal adenocarcinoma primaries while cancer stemness activity was enriched in metastases. Furthermore, we showed that metastases are enriched in expression of cancer stem cell genes even after adjusting for treatment. This suggested that cancer stem cells likely exist in all metastases, but that chemotherapy and radiation treatment exposure may select for them. Similarly, hypoxia and angiogenesis activity are likely enriched in all metastases, but treatment exposure again appears to enhance the activation of these pathways. Viral replication and transcription pathways were also found in metastases. However, these pathways highly overlapped with multiple global cellular transcription and translation pathways, which are likely due to the enhanced MYC signaling observed in metastases (Supplementary Table S10) and therefore are not indicative of distinct viral activity. Similarly, the hallmark myogenesis pathway was enriched in primary tumors as it shares many mesenchymal phenotype genes with the hallmark EMT pathway (Supplementary Table S11).

The most significant differentially expressed genes (Fig. 3; Supplementary Table S8) between primary tumors and metastases of colorectal adenocarcinoma are related to EMT. FBN2 (fold change = 7.6; FDR_MCC = 1.12 × 10^-99), MMP3 (fold change = 38.2; FDR_MCC = 1.64 × 10^-91), and FGF10 (fold change = 5.7; FDR_MCC = 8.74 × 10^-67) are all either known stimulators or markers of EMT (33, 34) and were highly elevated in primary tumors compared with metastases (Fig. 3). These findings were supported by our pathway-level results that showed EMT (HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION; FDR_MCC = 1.16 × 10^-22) is highly upregulated in primary colorectal adenocarcinomas. In comparison, genes significantly elevated in metastases (Fig. 3; Supplementary Table S8) include GATA4 (fold change = 3.9; FDR_MCC = 2.09 × 10^-36), CLDN10 (fold change = 3.9; FDR_MCC = 3.92 × 10^-33), and SYT12 (fold change = 3.4; FDR_MCC = 1.12 × 10^-54). GATA4 is thought to mark fully differentiated epithelial cells and its expression is often silenced in colorectal adenocarcinoma, as forced expression of GATA4 results in impaired colorectal adenocarcinoma cell line proliferation and migration (35). Increased expression of GATA4 in metastases supported our pathway-level results, which showed decreased EMT activity in metastases. Similarly, CLDN10 codes for a claudin protein. Claudin proteins are integral components of tight junctions and their expression has been associated with recurrence of primary hepatocellular carcinoma (36). Lastly, SYT12 is involved in regulating calcium-independent secretion in nonneuronal cells, and it has been previously linked with unfavorable prognosis in pancreatic cancer (37).
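The gene-level results are reported as FDR values. As a point of reference, the sketch below implements the Benjamini-Hochberg step-up adjustment, a widely used way to convert raw P values into FDR-adjusted values. The FDR procedure actually applied to these genes is not restated in this section, so the choice of Benjamini-Hochberg is an assumption, and the input P values and function name are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw P values for four genes.
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
```

Each adjusted value is the smallest FDR threshold at which the corresponding gene would still be called significant, which is why adjusted values never decrease as the raw P values increase.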

Based on the top 500 differentially expressed genes between primary tumors and metastases of colorectal adenocarcinoma (Supplementary Table S8), we performed hierarchical clustering of all colorectal adenocarcinoma tumor samples in both datasets in order to determine the degree of transcriptomic similarity between clusters of primaries and metastases. We hypothesized a strong degree of separation between primary tumors and metastases. However, despite having generated five main clusters from the top 500 differentially expressed genes, we observed several primary colorectal adenocarcinoma tumors embedded within the clusters of metastases. Surprisingly, we also noticed that metastases only appeared in two main clusters, hereafter called M1 and M2 (Fig. 3; Table 4), in both datasets. We aimed to confirm that the M1 and M2 clusters in the MCC and Consortium datasets were defined by similar underlying biology. M1/M2 cluster membership in each dataset was defined by the top 500 differentially expressed genes between primary tumors and metastases in an adjusted regression analysis in each dataset. Therefore, we developed a classifier trained on the MCC dataset that predicted the cluster membership of the Consortium metastases (Fig. 3E) based on pathway-level expression differences between the MCC M1 and M2 clusters (Table 4). Similar to the differential gene-expression analysis comparing primaries and metastases, inputs for the M1/M2 classifier also adjusted for age, sex, anatomic origin, and treatment exposure status. Based on the strong performance of our classifier (AUC = 0.905), we believe the M1 and M2 clusters in both


Figure 3.

Heat map visualization and volcano plots showing the top differentially expressed genes between primary tumors and metastases of colorectal adenocarcinomas in the MCC and Consortium datasets. A and B, Of the top 500 differentially expressed genes between primary tumors and metastases while adjusting for clinical variables, the top 25 genes are shown. Hierarchical clustering was performed, which revealed two main clusters of metastases in both datasets. For Consortium, we excluded 21 genes from the differential expression analysis, which were used to develop the anatomic origin and treatment status classifiers. In both the MCC and Consortium datasets, both clusters of metastases, named MCC-M1 or Consortium-M1 and MCC-M2 or Consortium-M2, include liver and lung metastases. Treatment status of each tumor is also denoted. C and D, Volcano plots displaying the top 500 differentially expressed genes. The x-axis shows the log2-fold change (FC), and the y-axis displays the −log10 of the P values, where all P values are <0.001. The most differentially expressed genes in primaries (negative log2 FC) and metastases (positive log2 FC) are highlighted in blue and red, respectively. E, An ROC curve showing the performance of the M1/M2 classifier predicting M1 and M2 status in the Consortium dataset based on MCC M1/M2 cluster membership is shown. F, Overlap between stem cell gene sets and the EMT gene set is highlighted.
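The area under an ROC curve such as the one in panel E can be computed directly from classifier scores. The sketch below uses the pairwise (Mann-Whitney) formulation: the AUC equals the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half. The labels, scores, and function name are hypothetical and do not come from the M1/M2 classifier itself.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for the positive class (for example, M1) and 0 for the
    negative class (for example, M2); `scores` holds the classifier outputs.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical cluster labels and predicted probabilities of M1 membership.
    true_labels = [1, 1, 1, 0, 0, 0]
    predicted = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
    print(roc_auc(true_labels, predicted))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two clusters.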


Table

4.Differences

inclusters

ofmetastasesin

theMCCan

dConsortium

datasets

Hal

lmar

k &

C2.

CP.

Rea

ctom

e Pa

thw

ays

MC

CD

atas

etC

onso

rtiu

m D

atas

et

CPa

thw

ayR

ank

FDR

Ran

kFD

RD

irect

ion

H

HA

LLM

AR

K_E

PIT

HE

LIA

L_M

ES

EN

CH

YMA

L_TR

AN

SIT

ION

11.

16*1

0-22

19.

47*1

0-22

M1

HAL

LMAR

K_A

LLO

GR

AFT_

REJ

EC

TIO

N2

1.78

*10-1

92

1.05

*10-1

9M

1H

ALLM

ARK_

MY

OG

EN

ES

IS3

9.41

*10-1

67

4.93

*10-0

3M

1H

ALL

MA

RK

_E2F

_TA

RG

ETS

11.

16*1

0-22

49.

54*1

0-05

M2

HA

LLM

AR

K_M

YC_T

AR

GE

TS_V

12

1.78

*10-1

93

4.69

*10-0

5M

2H

ALL

MA

RK

_MYC

_TA

RG

ETS

_V2

39.

41*1

0-16

15.

61*1

0-15

M2

C2.

CP.

Rea

ctom

e

RE

AC

TOM

E_I

MM

UN

OR

EG

ULA

TOR

Y_IN

TER

AC

TIO

NS

_BE

TWEE

N_A

_LYM

PH

OID

_AN

D_A

_NO

N_L

YMP

HO

ID_C

ELL

11.

03*1

0-11

35.

40*1

0-12

M1

RE

AC

TOM

E_C

OLL

AG

EN

_FO

RM

ATI

ON

21.

83*1

0-10

18.

19*1

0-13

M1

RE

AC

TOM

E_E

XTR

AC

ELL

ULA

R_M

ATRIX_ORGANIZATION  3  1.83 × 10^-10  2  8.56 × 10^-13  M1
REACTOME_CHONDROITIN_SULFATE_DERMATAN_SULFATE_METABOLISM  4  1.96 × 10^-09  25  9.86 × 10^-04  M1
REACTOME_GLYCOSAMINOGLYCAN_METABOLISM  5  4.85 × 10^-09  18  3.18 × 10^-04  M1
REACTOME_INTEGRIN_CELL_SURFACE_INTERACTIONS  6  7.08 × 10^-09  7  5.60 × 10^-07  M1
REACTOME_GENERATION_OF_SECOND_MESSENGER_MOLECULES  7  1.76 × 10^-08  4  4.68 × 10^-09  M1
REACTOME_PHOSPHORYLATION_OF_CD3_AND_TCR_ZETA_CHAINS  8  5.95 × 10^-08  9  3.03 × 10^-06  M1
REACTOME_PD1_SIGNALING  9  6.04 × 10^-08  5  5.99 × 10^-08  M1
REACTOME_TRANSLOCATION_OF_ZAP_70_TO_IMMUNOLOGICAL_SYNAPSE  10  6.10 × 10^-08  8  6.00 × 10^-07  M1
REACTOME_PLATELET_ACTIVATION_SIGNALING_AND_AGGREGATION  11  1.68 × 10^-07  31  1.26 × 10^-03  M1
REACTOME_CHONDROITIN_SULFATE_BIOSYNTHESIS  12  2.50 × 10^-07  38  2.45 × 10^-03  M1
REACTOME_DNA_REPLICATION  1  1.03 × 10^-11  21  1.72 × 10^-02  M2
REACTOME_MITOTIC_M_M_G1_PHASES  2  3.59 × 10^-11  25  2.03 × 10^-02  M2
REACTOME_G2_M_CHECKPOINTS  3  2.48 × 10^-10  11  1.01 × 10^-02  M2
REACTOME_ACTIVATION_OF_THE_PRE_REPLICATIVE_COMPLEX  4  3.94 × 10^-10  17  1.46 × 10^-02  M2
REACTOME_S_PHASE  5  5.51 × 10^-10  71  6.76 × 10^-02  M2
REACTOME_ACTIVATION_OF_ATR_IN_RESPONSE_TO_REPLICATION_STRESS  6  8.18 × 10^-10  20  1.66 × 10^-02  M2
REACTOME_DNA_STRAND_ELONGATION  7  1.39 × 10^-09  63  5.91 × 10^-02  M2
REACTOME_TELOMERE_MAINTENANCE  8  1.39 × 10^-09  12  1.01 × 10^-02  M2
REACTOME_DEPOSITION_OF_NEW_CENPA_CONTAINING_NUCLEOSOMES_AT_THE_CENTROMERE  9  3.80 × 10^-09  31  2.81 × 10^-02  M2
REACTOME_SYNTHESIS_OF_DNA  10  1.58 × 10^-09  72  6.76 × 10^-02  M2
REACTOME_CHROMOSOME_MAINTENANCE  11  1.80 × 10^-09  13  1.04 × 10^-02  M2
REACTOME_G1_S_TRANSITION  12  1.96 × 10^-09  52  4.79 × 10^-02  M2

NOTE: The top 500 differentially expressed genes between primary colorectal adenocarcinoma and colorectal adenocarcinoma metastases in both datasets were used to perform hierarchical clustering of primary tumors and metastases. Two main clusters of colorectal adenocarcinoma metastases were found in each dataset, hereon referred to as M1 and M2. The M1 and M2 clusters in each dataset were compared with each other to determine differences in pathway activity. Shown are the top pathways enriched in the M1 and M2 clusters from both datasets (FDR < 0.1). Spearman rank correlation shows similar overlap in pathway enrichment between the M1 and M2 clusters of the two datasets (H r = 0.66; C2 r = 0.56). All pathways examined are from the MSigDB gene set collections. C, collection in the MSigDB database; H, Hallmark collection; C2, Reactome collection. Pathway enrichment differences were determined after removal of pathways exhibiting tissue specificity for liver, lung, colon, and rectum tissues in each collection examined. In addition, all pathway enrichment analyses adjust for age, sex, treatment status, and anatomic origin. For the Consortium dataset, missing data on treatment exposure status and anatomic origin were imputed using gene expression–based classifiers.

Transcriptomes of Primary and Metastatic Colorectal Tumors

www.aacrjournals.org Cancer Res; 79(16) August 15, 2019 4237

on March 8, 2021. © 2019 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from

Published OnlineFirst June 25, 2019; DOI: 10.1158/0008-5472.CAN-18-3945

datasets have similar biology. Therefore, we further investigated the pathway-level differences between the M1 and M2 clusters of metastases observed in both datasets.

Subtypes of colorectal adenocarcinoma metastases in the lung and liver

We compared enrichment of pathways in the MSigDB hallmark and C2.CP.REACTOME collections between the M1 and M2 clusters of metastases found in both the MCC and Consortium datasets. We restricted the analysis to these two collections to both assist with interpretation of results and avoid redundant enrichment of functional pathways (results for the complete C2 collection can be found in Supplementary Table S12). Tumors in the M2 cluster primarily exhibited a proliferative phenotype with increased MYC target activity (HALLMARK_MYC_TARGETS_V1; FDR_MCC = 1.78 × 10^-19; HALLMARK_MYC_TARGETS_V2; FDR_MCC = 9.41 × 10^-16) and E2F target activity (HALLMARK_E2F_TARGETS; FDR_MCC = 1.16 × 10^-22). Tumors in the MCC M1 cluster primarily exhibited an inflammatory and immune-escape phenotype (Table 4; Supplementary Table S12). Notably, pathway enrichment differences showed not only an innate immune response (REACTOME_INNATE_IMMUNE_SYSTEM; FDR_MCC = 4.91 × 10^-04) in the MCC M1 cluster, but also a very strong adaptive immune response (REACTOME_ADAPTIVE_IMMUNITY; FDR_MCC = 2.53 × 10^-03), defined by T-cell infiltration (Table 4; Supplementary Table S12) in M1 metastases, which is likely blunted by the tumor through expression of immune-checkpoint molecules such as PD1 (REACTOME_PD1_SIGNALING; FDR_MCC = 6.04 × 10^-08). We examined the degree of overlap from pathway enrichment results when comparing M1 and M2 clusters in both the MCC and Consortium datasets using Spearman rank correlation, where r_Hallmark = 0.66 and r_C2.CP.Reactome = 0.56. These results further highlighted the robustness of the M1 and M2 metastatic clusters found in each dataset. Of note, the immune-related differences found between the M1 and M2 clusters in the MCC dataset best predicted cluster membership of Consortium metastases (Fig. 3E). Overall, our results suggested there are two main types of colorectal adenocarcinoma metastases to the lung and liver: those that can be considered immune "hot" tumors and exhibit an inflammatory phenotype, and those that can be considered immune "cold" tumors that are not characterized by inflammation, but rather by canonical MYC and E2F signaling with a proliferative signature.
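The cross-dataset agreement reported above rests on a Spearman rank correlation of pathway enrichment results. As a minimal sketch of that statistic (not the authors' code), the Python below ranks two score vectors with average ranks for ties and takes the Pearson correlation of the ranks. The function names and the five pathway scores are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical signed enrichment scores for the same five pathways in two cohorts
cohort_a = [4.2, 3.1, -2.5, 0.3, -1.1]
cohort_b = [3.8, 1.9, -3.0, -0.2, 0.4]
rho = spearman_rho(cohort_a, cohort_b)
print(round(rho, 2))
```

Because only ranks enter the calculation, the statistic is insensitive to differences in scale between the two cohorts, which is one reason it suits comparisons across platforms.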

Consensus molecular subtype classification of colorectal lung and liver metastases

Having observed canonical oncogenic signaling in colorectal adenocarcinoma metastases, we assessed whether colorectal adenocarcinoma metastases are enriched for a particular consensus molecular subtype (CMS). CMS is one of the most robust gene-expression–based colorectal adenocarcinoma classification systems with known prognostic implications (22). There are four main CMS groups, CMS1-CMS4, as well as the CMS_NA group, which consists of tumor samples that cannot be classified as CMS1-CMS4. Samples in the CMS_NA group are thought to contain properties of multiple CMS groups and, as such, CMS_NA is not considered to be a distinct CMS (22). We implemented the CMS classifier (22) in order to determine if metastases of colorectal adenocarcinoma exhibit a propensity to be in a specific CMS group. In both the MCC and Consortium datasets, metastases were never classified as CMS3, suggesting subtype exclusivity. In addition, implementation of logistic regression models (Supplementary Fig. S8), which adjust for age, sex, anatomic origin, and treatment exposure status of the tumor, showed the odds of a tumor being a metastasis compared with a primary tumor are 2.3 times higher among CMS2 tumors than among CMS4 tumors (MCC dataset: odds ratio 2.30, 95% confidence interval 1.40–3.82, P < 0.001). Furthermore, CMS2 (MCC: 36.4%, Consortium: 38.3%) and CMS4 (MCC: 44.0%, Consortium: 45.2%) appeared to be the dominant subtypes found in colorectal adenocarcinoma metastases compared with primary tumors. In addition, 86.6% and 85.7% of metastases in the M1 clusters were CMS4 and 63.4% and 51.9% of metastases in the M2 clusters were CMS2 in the MCC (Fisher exact test; P < 2.2 × 10^-16) and Consortium (Fisher exact test; P = 2.5 × 10^-05) datasets, respectively. Logistic regression modeling applied to the MCC dataset further supported the association between M1 metastases and the CMS4 group and M2 metastases and the CMS2 group (Supplementary Table S13).
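To make the odds ratio reported above concrete, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2×2 table. This is a simplification: the study's estimate comes from logistic regression adjusted for age, sex, anatomic origin, and treatment exposure, which a 2×2 table cannot reproduce. The function name and all counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: metastases in subtype 1    b: primary tumors in subtype 1
    c: metastases in subtype 2    d: primary tumors in subtype 2
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only
or_estimate, ci_low, ci_high = odds_ratio_wald_ci(a=40, b=100, c=20, d=100)
print(f"OR = {or_estimate:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 corresponds to a difference in odds between the two subtypes at the chosen confidence level. An adjusted model can move both the estimate and the interval.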

Adaptations to distal tissue sites observed within primary colorectal adenocarcinoma tumors

As metastases are likely to have adapted to the microenvironments of their sites of metastasis prior to dissemination (38), we aimed to determine if these adaptations could already be detected in primary tumors of patients who later go on to develop lung or liver metastases. We compared primary colorectal adenocarcinoma tumors from patients who developed lung metastases (n = 18) to primary colorectal adenocarcinoma tumors from patients who developed liver metastases (n = 48) while adjusting for age, sex, stage 4 disease status, tumor anatomic origin, and treatment exposure status. Specifically, we looked for pathways with high lung or liver-specific tissue weights when comparing primary colorectal adenocarcinoma tumors to one another (Supplementary Table S14). In primary tumors from patients who went on to develop liver metastases, we found lipid digestion (REACTOME_LIPID_DIGESTION_MOBILIZATION_AND_TRANSPORT; FDR_MCC = 6.96 × 10^-02), lipid transport (REACTOME_CHYLOMICRON_MEDIATED_LIPID_TRANSPORT; FDR_MCC = 7.28 × 10^-02), and adipogenesis (STEGER_ADIPOGENESIS_UP; FDR_MCC = 6.28 × 10^-03) pathways to be enriched. In primary tumors from patients who went on to develop lung metastases, we found enrichment of interferon alpha response pathways (HALLMARK_INTERFERON_ALPHA_RESPONSE; FDR_MCC = 7.17 × 10^-03 and MOSERLE_IFNA_RESPONSE; FDR_MCC = 1.21 × 10^-03), which are known to modulate lung inflammation.
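The FDR values quoted above are multiplicity-adjusted p-values. This section does not state which adjustment procedure was used. The sketch below assumes the Benjamini-Hochberg procedure, a common choice for gene set testing, and shows how raw p-values map to adjusted values. The function name and input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-values decrease (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four gene sets
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(fdr) if q < 0.1]
```

With a threshold such as FDR < 0.1, the expected share of false positives among the reported gene sets is kept at or below 10% under the procedure's assumptions.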

Discussion

A tissue-sensitive approach for determining features of metastases

Our tissue-sensitive approach allows the comparison of metastases located in different host tissue sites without the need for sampling and transcriptomic profiling of normal host tissues. For the comparative analyses of primary tumors and metastases, a tissue-sensitive approach supports pooling of metastatic cancers from multiple tissue sites, which both improves statistical power and helps elucidate the common phenotype of metastases from a single primary cancer type (Table 2). In addition, the use of gene-based classifiers to impute tumor anatomic origin and treatment exposure status allows future researchers to adjust for these variables in their analyses when clinical data are limited for retrospective studies.
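One way to picture the tissue-sensitive step is as a filter applied before enrichment testing: gene sets that carry a high specificity weight for any host or origin tissue are set aside, so that signal from the surrounding organ is not mistaken for tumor biology. The sketch below is only an illustration under stated assumptions. The weight scale, the cutoff of 0.5, the treatment of missing weights as zero, and the gene set names are all hypothetical. The method used in the study (15) may define and apply weights differently.

```python
def drop_tissue_specific(pathway_weights, tissues, cutoff):
    """Keep pathways whose weight stays below `cutoff` in every listed tissue.

    pathway_weights: {pathway_name: {tissue_name: weight}}
    A tissue missing from a pathway's entry is treated as weight 0.0
    (an assumption made for this sketch).
    """
    kept = []
    for name, weights in pathway_weights.items():
        if all(weights.get(tissue, 0.0) < cutoff for tissue in tissues):
            kept.append(name)
    return kept


HOST_TISSUES = ("liver", "lung", "colon", "rectum")

# Hypothetical gene sets and weights, for illustration only
example_weights = {
    "HYPOTHETICAL_BILE_ACID_SET": {"liver": 0.92, "lung": 0.05, "colon": 0.10, "rectum": 0.08},
    "HYPOTHETICAL_SURFACTANT_SET": {"liver": 0.03, "lung": 0.88},
    "HYPOTHETICAL_CELL_CYCLE_SET": {"liver": 0.12, "lung": 0.15, "colon": 0.11, "rectum": 0.09},
}
retained = drop_tissue_specific(example_weights, HOST_TISSUES, cutoff=0.5)
```

A hard cutoff is the simplest design. It can discard pathways that are both tissue-specific and relevant to metastasis, the kind of false negative that the limitations discussion warns about.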

Role of treatment exposure in comparing colorectal adenocarcinoma primary and metastatic tumors

We highlighted the role of chemotherapy and/or radiation exposure prior to tumor surgical resection as a confounder when comparing transcriptomic profiles of primary colorectal adenocarcinomas and metastatic lesions (Table 3). As a key example of the confounding effect of treatment status, hypoxia and angiogenesis activity were found to be enriched in treatment-exposed tumors and in metastases when treatment exposure status was not appropriately considered (Supplementary Tables S4 and S5). With appropriate adjustments for treatment exposure, hypoxia and angiogenesis genes were no longer differentially expressed between metastases and primaries. Similarly, we showed that treatment-exposed tumors were enriched in EMT, and once treatment exposure status was taken into consideration, we found that primary tumors are more likely to exhibit EMT enrichment than metastases of colorectal adenocarcinoma. Our findings, especially on GATA4 enrichment in metastases, align with previous studies that suggest that metastases undergo mesenchymal–epithelial transition (MET) to establish themselves at distal sites and that metastatic tumor cells with a MET phenotype are more likely to exhibit rapid proliferation compared with cells with an EMT phenotype (39, 40), which typically divide slowly. In addition, neoadjuvant chemotherapy has been strongly associated with a mesenchymal phenotype in both primary tumors and metastases and is therefore known to affect colorectal adenocarcinoma CMS classification (10, 41), which corroborates our findings on the role of treatment when comparing primary tumors and metastases of colorectal adenocarcinoma.
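The confounding described above can be illustrated with a toy calculation. In the sketch below, expression of a hypothetical gene depends only on treatment exposure, yet metastases appear to over-express it because they are more often treatment-exposed. Comparing within treatment strata removes the apparent difference. This stratified comparison is a simplified stand-in for covariate adjustment in a regression model, and every count and value is invented for illustration.

```python
from statistics import mean


def crude_difference(samples):
    """Mean expression in metastases minus mean expression in primary tumors."""
    met = [s["expr"] for s in samples if s["metastasis"]]
    pri = [s["expr"] for s in samples if not s["metastasis"]]
    return mean(met) - mean(pri)


def stratified_difference(samples):
    """Average the within-stratum differences, weighting each treatment stratum by its size."""
    total, weight_sum = 0.0, 0
    for treated in (False, True):
        stratum = [s for s in samples if s["treated"] == treated]
        has_met = any(s["metastasis"] for s in stratum)
        has_pri = any(not s["metastasis"] for s in stratum)
        if not (has_met and has_pri):
            continue  # a stratum with one tumor type carries no contrast
        total += crude_difference(stratum) * len(stratum)
        weight_sum += len(stratum)
    if weight_sum == 0:
        raise ValueError("no stratum contains both primary tumors and metastases")
    return total / weight_sum


def make_samples(metastasis, treated, n):
    # Toy gene: expression is 7.0 if treated and 5.0 otherwise, regardless of tumor type.
    expr = 7.0 if treated else 5.0
    return [{"metastasis": metastasis, "treated": treated, "expr": expr} for _ in range(n)]


samples = (
    make_samples(False, False, 8) + make_samples(False, True, 2)
    + make_samples(True, False, 3) + make_samples(True, True, 7)
)
crude = crude_difference(samples)          # inflated by unequal treatment exposure
adjusted = stratified_difference(samples)  # difference after accounting for treatment
```

Here the crude contrast is about one expression unit, while the stratified contrast is zero. Hypoxia, angiogenesis, and EMT signals behave in this way in the text once treatment exposure is taken into account.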

Characteristic features of metastases of colorectal adenocarcinoma

Compared with colorectal adenocarcinoma primary tumors, lung and liver metastases tended to be more differentiated with reduced EMT activity. Metastases exhibited a shift toward glycolysis and were also enriched in MYC target pathways and the downstream effects of MYC such as increased proliferation and global upregulation of transcription and translational cellular machinery (42). HIF targets were also found to be enriched; indeed, oncogenic MYC is known to collaborate with HIF to induce metabolic alterations such as increased glycolysis (Warburg effect; refs. 43, 44). In particular, HIF1a expression is required for MYC-induced proliferation and anchorage-independent growth.

Previous studies in melanoma suggest genetic stability appears to be necessary for the development of metastases (45). The observation of activated DNA-repair pathways in metastases suggests that a similar metastatic program may be at play in colorectal adenocarcinoma. These results are corroborated by our CMS analysis, where genetically unstable subtypes such as CMS1 and CMS3 are almost nonexistent among metastases of colorectal adenocarcinoma (22). CMS classification results show that metastases are more likely than primary tumors to be CMS2 rather than CMS4. Although CMS4 has previously been associated with advanced stages (III and IV) of disease, CMS2 was not previously associated with advanced disease. CMS2 is characterized by epithelial differentiation and strong upregulation of MYC and WNT signaling (22). This is supported by our gene and pathway enrichment results, which showed that metastases exhibit lower EMT activity, were more likely to be differentiated compared with primary tumors, and were enriched in MYC signaling and its downstream proliferative pathways. Furthermore, the lack of CMS3 metastases in both the MCC and Consortium datasets was particularly intriguing and raises the question of whether the metabolic and genomic features of CMS3 are incompatible with metastases. Future studies using paired primaries and metastases are warranted to address this question.

Many metastases are also thought to contain cancer stem cells, which can drive drug resistance and are often associated with poor prognosis and survival. We found potential cancer stem cell activity was upregulated in metastases of colorectal adenocarcinoma, suggesting that cancer stem cell features exist in metastases independent of treatment exposure and EMT activity. Our results also suggest that cancer stem cell–like features are not exclusively found in EMT-high tumors as primary colorectal adenocarcinoma tumors exhibited higher EMT activity compared with metastases, but cancer stem cell gene sets were found to be enriched in metastases, which exhibited lower EMT activity. This agrees with our gene-level results showing increased GATA4 expression in metastases, which are thought to undergo MET at distal organs and therefore should exhibit epithelial features compared with primary tumors.

Identification of two main phenotypes of colorectal adenocarcinoma metastases

We identified two main groups of metastases based on transcriptomic features. When comparing the two groups of metastases to one another, we found the first group (M1) was characterized primarily by inflammation featuring adaptive immune system responses, immune evasion pathways (e.g., PD1 signaling; refs. 46, 47), and lymphocytic cell-mediated immunity (Table 4; Supplementary Table S12). The second group (M2) was characterized by cell proliferation and MYC signaling (Table 4; Supplementary Table S12). Moreover, the enrichment of EMT activity found in both the M1 cluster and post-treatment metastases and the enrichment of MYC activity in pretreatment metastases and the M2 cluster (Table 4) suggest these metastatic phenotypes may be influenced by treatment exposure. Nevertheless, the M1 cluster exhibits very strong activation of inflammatory and immune response pathways and this immune phenotype appears to be the defining feature of the M1 clusters in both datasets. This immune phenotype was not observed in post-treatment metastases. Therefore, it is not clear if treatment exposure can help drive metastases to specific phenotypes. However, our results are consistent with previous research in humans and mouse models, which has suggested metastases fall into two main subtypes: those characterized by EMT and inflammation signatures and those characterized by proliferation (48, 49). Recent work in melanoma (50) has shown "cold" metastases, which do not respond to immunotherapy and which are enriched in a T-cell exclusion program, are characterized by MYC signaling and E2F targets. As our work has potentially characterized two phenotypes of metastases, one of which is also characterized by MYC and E2F proliferation signaling, we believe these metastatic phenotypes can inform immunotherapy treatment decisions for colorectal adenocarcinoma as well.
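The two groups described above emerged from hierarchical clustering of expression profiles. The sketch below shows the general idea on a toy matrix: agglomerative clustering merges the two closest clusters until two remain. The distance (1 minus Pearson correlation) and the average linkage used here are assumptions, since this section does not specify them. The sample names and expression values are hypothetical.

```python
from statistics import mean


def correlation_distance(u, v):
    """1 - Pearson correlation between two expression profiles (non-constant profiles only)."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    norm = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return 1.0 - cov / norm


def average_linkage_clusters(profiles, n_clusters=2):
    """Merge the closest pair of clusters until `n_clusters` remain."""
    names = list(profiles)
    dist = {
        (a, b): correlation_distance(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters
        return mean([dist[(a, b)] for a in c1 for b in c2])

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]


# Hypothetical profiles over four genes: two immune-related, then two proliferation-related
toy_profiles = {
    "met_a": [9.0, 8.0, 2.0, 3.0],
    "met_b": [8.0, 9.0, 3.0, 2.0],
    "met_c": [2.0, 3.0, 9.0, 8.0],
    "met_d": [3.0, 2.0, 8.0, 9.0],
}
groups = average_linkage_clusters(toy_profiles, n_clusters=2)
```

On this toy input the immune-high samples and the proliferation-high samples separate into two groups. In practice the choice of distance and linkage can change cluster membership, which is consistent with the caution about gene-level replicability of the M1/M2 clusters.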

Limitations and future directions

Though we describe a novel method for transcriptomic comparative analyses between primary tumors and metastases of colorectal adenocarcinoma, this study should be considered within the context of its limitations. While this study compares primary colorectal adenocarcinoma tumors to liver and lung colorectal adenocarcinoma metastases, it does so with a limited set of matched (n = 15) primary and metastatic tumor samples from the same individuals. Moreover, our analysis comparing primary tumors from patients who go on to develop lung or liver metastases suggests some adaptations to the distal site of metastasis can already be observed in the primary tumor. These adaptations may be lost when implementing our tissue-sensitive adjustments for comparing primary tumors and metastases, and this could potentially produce false-negative results. Future studies should examine larger cohorts of matched samples, where available, to explore features of metastatic progression and drug resistance. In addition, although this study captured and appropriately adjusted for chemotherapy and radiation exposure, due to data limitations, it did not capture specific features of treatment, such as radiation dose, duration of treatment, and drug class of the chemotherapeutic agents. The use of classifiers to impute anatomic origin and treatment status, though imperfect and requiring significant methodological improvement, allows future research on metastases with limited clinical information to be performed while appropriately adjusting for potential confounders. With regard to the M1/M2 clusters, we show replicability across datasets at the pathway level but acknowledge the limitations of M1/M2 cluster replicability across datasets at the gene level due to underlying differences between the clinical characteristics of each dataset, as these differences inform gene-level M1/M2 cluster membership. Lastly, the prognostic differences between subtypes of metastases should be carefully explored. In particular, the EMT inflammatory group of metastases can be better characterized to understand immune-escape mechanisms used by metastases.

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Authors' Contributions

Conception and design: Y. Kamal, S.L. Schmit, C.I. Amos, H.R. Frost
Development of methodology: Y. Kamal, C.I. Amos, H.R. Frost
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Y. Kamal, S.L. Schmit
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y. Kamal, S.L. Schmit, C.I. Amos, H.R. Frost
Writing, review, and/or revision of the manuscript: Y. Kamal, S.L. Schmit, H.J. Hoehn, C.I. Amos, H.R. Frost
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): H.J. Hoehn
Study supervision: S.L. Schmit, H.J. Hoehn, C.I. Amos, H.R. Frost

Acknowledgments

The authors are grateful for the financial support from research grants 5T32LM012204-03 NIH-NLM, 1K01LM012426 NIH-NLM, and NCI Cancer Center Support Grant 5P30 CA023108-37 to the Norris Cotton Cancer Center. This work was supported in part by Moffitt's Total Cancer Care Initiative, the Collaborative Data Services Core and the Biostatistics and Bioinformatics Shared Resource at the H. Lee Moffitt Cancer Center and Research Institute, an NCI-designated Comprehensive Cancer Center, under grant number P30-CA076292. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the H. Lee Moffitt Cancer Center and Research Institute. We would also like to thank Drs. Eric A. Welsh and Michael J. Schell for their assistance in the acquisition, curation, and cleaning of the datasets used in this article. Partial support for this research was provided to support Dr. C.I. Amos's efforts through Cancer Prevention Research Institute of Texas (CPRIT) grant RR170048 and NIH/NCI grant U01CA196386. Dr. C.I. Amos is a CPRIT Research Scholar.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received December 15, 2018; revised April 11, 2019; accepted June 20, 2019; published first June 25, 2019.

References

1. Chambers AF, Groom AC, MacDonald IC. Dissemination and growth of cancer cells in metastatic sites. Nat Rev Cancer 2002;2:563–72.

2. van der Geest LGM, Lam-Boer J, Koopman M, Verhoef C, Elferink MAG, de Wilt JHW. Nationwide trends in incidence, treatment and survival of colorectal cancer patients with synchronous metastases. Clin Exp Metastasis 2015;32:457–65.

3. Riihimäki M, Hemminki A, Sundquist J, Hemminki K. Patterns of metastasis in colon and rectal cancer. Sci Rep 2016;6:29765.

4. Glynne-Jones R, Wyrwicz L, Tiret E, Brown G, Rödel C, Cervantes A, et al. Rectal cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2017;28:iv22–iv40.

5. Chan DLH, Segelov E, Wong RS, Smith A, Herbertson RA, Li BT, et al. Epidermal growth factor receptor (EGFR) inhibitors for metastatic colorectal cancer. Cochrane Database Syst Rev 2017;6:CD007047.

6. Le DT, Uram JN, Wang H, Bartlett BR, Kemberling H, Eyring AD, et al. PD-1 blockade in tumors with mismatch-repair deficiency. N Engl J Med 2015;372:2509–20.

7. Vignot S, Lefebvre C, Frampton GM, Meurice G, Yelensky R, Palmer G, et al. Comparative analysis of primary tumour and matched metastases in colorectal cancer patients: evaluation of concordance between genomic and transcriptional profiles. Eur J Cancer 2015;51:791–9.

8. Wang S, Zhang C, Zhang Z, Qian W, Sun Y, Ji B, et al. Transcriptome analysis in primary colorectal cancer tissues from patients with and without liver metastases using next-generation sequencing. Cancer Med 2017;6:1976–87.

9. Hartung F, Wang Y, Aronow B, Weber GF. A core program of gene expression characterizes cancer metastases. Oncotarget 2017;8:102161–75.

10. Trumpi K, Ubink I, Trinh A, Djafarihamedani M, Jongen JM, Govaert KM, et al. Neoadjuvant chemotherapy affects molecular classification of colorectal tumors. Oncogenesis 2017;6:e357.

11. Van Cutsem E, Cervantes A, Nordlinger B, Arnold D, ESMO Guidelines Working Group. Metastatic colorectal cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2014;25:iii1–iii9.

12. Fenstermacher DA, Wenham RM, Rollison DE, Dalton WS. Implementing personalized medicine in a cancer center. Cancer J 2011;17:528–36.

13. Rodriguez-Bigas MA, Boland CR, Hamilton SR, Henson DE, Jass JR, Khan PM, et al. A National Cancer Institute Workshop on Hereditary Nonpolyposis Colorectal Cancer Syndrome: meeting highlights and Bethesda guidelines. J Natl Cancer Inst 1997;89:1758–62.

14. Welsh EA, Eschrich SA, Berglund AE, Fenstermacher DA. Iterative rank-order normalization of gene expression microarray data. BMC Bioinformatics 2013;14:153.

15. Frost HR. Computation and application of tissue-specific gene set weights. Bioinformatics 2018;34:2957–64.

16. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011;27:1739–40.

17. Wu D, Smyth GK. Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Res 2012;40:e133.

18. Liberzon A, Birger C, Thorvaldsdóttir H, Ghandi M, Mesirov JP, Tamayo P, et al. The molecular signatures database hallmark gene set collection. Cell Syst 2015;1:417–25.

19. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47.

20. Bufill JA. Colorectal cancer: evidence for distinct genetic categories based on proximal or distal tumor location. Ann Intern Med 1990;113:779–88.

21. Barbie DA, Tamayo P, Boehm JS, Kim SY, Moody SE, Dunn IF, et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 2009;462:108–12.

22. Guinney J, Dienstmann R, Wang X, de Reyniès A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med 2015;21:1350–6.

23. Valderrama-Treviño AI, Barrera-Mera B, Ceballos-Villalva JC, Montalvo-Javé EE. Hepatic metastasis from colorectal cancer. Euroasian J Hepatogastroenterol 2017;7:166–75.

24. McCormack PM, Burt ME, Bains MS, Martini N, Rusch VW, Ginsberg RJ. Lung resection for colorectal metastases. 10-year results. Arch Surg 1992;127:1403–6.

25. Shah SA, Haddad R, Al-Sukhni W, Kim RD, Greig PD, Grant DR, et al. Surgical resection of hepatic and pulmonary metastases from colorectal carcinoma. J Am Coll Surg 2006;202:468–75.

26. van der Maaten L, Hinton G. Visualizing Data using t-SNE. J Mach Learn Res 2008;9:2579–605.

27. Yoshihara K, Shahmoradgoli M, Martínez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun 2013;4:2612.

28. Semenza GL. Hypoxia-inducible factor 1: oxygen homeostasis and disease pathophysiology. Trends Mol Med 2001;7:345–50.

29. Wacker I, Sachs M, Knaup K, Wiesener M, Weiske J, Huber O, et al. Key role for activin B in cellular transformation after loss of the von Hippel-Lindau tumor suppressor. Mol Cell Biol 2009;29:1707–18.

30. Benatti P, Gafà R, Barana D, Marino M, Scarselli A, Pedroni M, et al. Microsatellite instability and colorectal cancer prognosis. Clin Cancer Res 2005;11:8332–40.

31. Pece S, Tosoni D, Confalonieri S, Mazzarol G, Vecchi M, Ronzoni S, et al. Biological and molecular heterogeneity of breast cancers correlates with their cancer stem cell content. Cell 2010;140:62–73.

32. Ben-Porath I, Thomson MW, Carey VJ, Ge R, Bell GW, Regev A, et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet 2008;40:499–507.

33. Abolhassani A, Riazi GH, Azizi E, Amanpour S, Muhammadnejad S, Haddadi M, et al. FGF10: type III epithelial mesenchymal transition and invasion in breast cancer cell lines. J Cancer 2014;5:537–47.

34. Chen QK, Lee K, Radisky DC, Nelson CM. Extracellular matrix proteins regulate epithelial–mesenchymal transition in mammary epithelial cells. Differentiation 2013;86:126–32.

35. Zheng R, Blobel GA. GATA transcription factors and cancer. Genes Cancer 2010;1:1178–88.

36. Cheung ST, Leung KL, Ip YC, Chen X, Fong DY, Ng IO, et al. Claudin-10 expression level is associated with recurrence of primary hepatocellular carcinoma. Clin Cancer Res 2005;11:551–6.

37. Uhlen M, Zhang C, Lee S, Sjöstedt E, Fagerberg L, Bidkhori G, et al. A pathology atlas of the human cancer transcriptome. Science 2017;357. pii: eaan2507.

38. Cunningham JJ, Brown JS, Vincent TL, Gatenby RA. Divergent and convergent evolution in metastases suggest treatment strategies based on specific metastatic sites. Evol Med Public Heal 2015;2015:76–87.

39. Tsai JH, Donaher JL, Murphy DA, Chau S, Yang J. Spatiotemporal regulation of epithelial-mesenchymal transition is essential for squamous cell carcinoma metastasis. Cancer Cell 2012;22:725–36.

40. del Pozo Martin Y, Park D, Ramachandran A, Ombrato L, Calvo F, Chakravarty P, et al. Mesenchymal cancer cell-stroma crosstalk promotes niche activation, epithelial reversion, and metastatic colonization. Cell Rep 2015;13:2456–69.

41. Lee HH, Bellat V, Law B. Chemotherapy induces adaptive drug resistance and metastatic potentials via phenotypic CXCR4-expressing cell state transition in ovarian cancer. PLoS One 2017;12:e0171044.

42. Stine ZE, Walton ZE, Altman BJ, Hsieh AL, Dang CV. MYC, metabolism, and cancer. Cancer Discov 2015;5:1024–39.

43. Doe MR, Ascano JM, Kaur M, Cole MD. Myc posttranscriptionally induces HIF1 protein and target gene expression in normal and cancer cells. Cancer Res 2012;72:949–57.

44. Podar K, Anderson KC. A therapeutic role for targeting c-Myc/Hif-1-dependent signaling pathways. Cell Cycle 2010;9:1722–8.

45. Kauffmann A, Rosselli F, Lazar V, Winnepenninckx V, Mansuet-Lupo A, Dessen P, et al. High expression of DNA repair pathways is associated with metastasis in melanoma patients. Oncogene 2008;27:565–73.

46. Keir ME, Butte MJ, Freeman GJ, Sharpe AH. PD-1 and its ligands in tolerance and immunity. Annu Rev Immunol 2008;26:677–704.

47. Fife BT, Bluestone JA. Control of peripheral T-cell tolerance and autoimmunity via the CTLA-4 and PD-1 pathways. Immunol Rev 2008;224:166–82.

48. Robinson R, Wu Y-M, Lonigro J, Vats P, Cobain R, Everett J, et al. Integrative clinical genomics of metastatic cancer. Nature 2017;548:297–303.

49. Bakhoum SF, Ngo B, Laughney AM, Cavallo J-A, Murphy CJ, Ly P, et al. Chromosomal instability drives metastasis through a cytosolic DNA response. Nature 2018;553:467–72.

50. Jerby-Arnon L, Shah P, Cuoco MS, Rodman C, Su M-J, Melms JC, et al. A cancer cell program promotes T-cell exclusion and resistance to checkpoint blockade. Cell 2018;175:984–97.

Cancer Res 2019;79:4227-4241. Published OnlineFirst June 25, 2019.
Yasmin Kamal, Stephanie L. Schmit, Hannah J. Hoehn, et al.
Transcriptomic Differences between Primary Colorectal Adenocarcinomas and Distant Metastases Reveal Metastatic Colorectal Cancer Subtypes

Updated version: Access the most recent version of this article at doi:10.1158/0008-5472.CAN-18-3945

Supplementary Material: Access the most recent supplemental material at http://cancerres.aacrjournals.org/content/suppl/2019/06/25/0008-5472.CAN-18-3945.DC1

Cited articles: This article cites 50 articles, 6 of which you can access for free at http://cancerres.aacrjournals.org/content/79/16/4227.full#ref-list-1
