Complimentary access to articles online: aacrjournals.org/hot-topics MATHEMATICAL MODELING & AI Recent Articles Published in the AACR Journals

Cross-Journal Collection: Mathematical Modeling & AI

Table of Contents

Use of Natural Language Processing to Extract Clinical Cancer Phenotypes from Electronic Medical Records
Guergana K. Savova, Ioana Danciu, Folami Alamudun, Timothy Miller, Chen Lin, Danielle S. Bitterman, Georgia Tourassi, and Jeremy L. Warner
Cancer Res Nov 1, 2019 79:21 5463–70; doi: 10.1158/0008-5472.CAN-19-0579

AI-Assisted In Situ Detection of Human Glioma Infiltration Using a Novel Computational Method for Optical Coherence Tomography
Ronald M. Juarez-Chambi, Carmen Kut, Jose J. Rico-Jimenez, Kaisorn L. Chaichana, Jiefeng Xi, Daniel U. Campos-Delgado, Fausto J. Rodriguez, Alfredo Quinones-Hinojosa, Xingde Li, and Javier A. Jo
Clin Cancer Res Nov 1, 2019 25:21 6329–38; doi: 10.1158/1078-0432.CCR-19-0854

The Clonal Evolution of Metastatic Osteosarcoma as Shaped by Cisplatin Treatment
Samuel W. Brady, Xiaotu Ma, Armita Bahrami, Gryte Satas, Gang Wu, Scott Newman, Michael Rusch, Daniel K. Putnam, Heather L. Mulder, Donald A. Yergeau, Michael N. Edmonson, John Easton, Ludmil B. Alexandrov, Xiang Chen, Elaine R. Mardis, Richard K. Wilson, James R. Downing, Alberto S. Pappo, Benjamin J. Raphael, Michael A. Dyer, and Jinghui Zhang
Mol Cancer Res Apr 1, 2019 17:4 895–906; doi: 10.1158/1541-7786.MCR-18-0620

Genetic and Circulating Biomarker Data Improve Risk Prediction for Pancreatic Cancer in the General Population
Jihye Kim, Chen Yuan, Ana Babic, Ying Bao, Clary B. Clish, Michael N. Pollak, Laufey T. Amundadottir, Alison P. Klein, Rachael Z. Stolzenberg-Solomon, Pari V. Pandharipande, Lauren K. Brais, Marisa W. Welch, Kimmie Ng, Edward L. Giovannucci, Howard D. Sesso, JoAnn E. Manson, Meir J. Stampfer, Charles S. Fuchs, Brian M. Wolpin, and Peter Kraft
Cancer Epidemiol Biomarkers Prev May 1, 2020 29:5 999–1008; doi: 10.1158/1055-9965.EPI-19-1389

Genetic Interactions and Tissue Specificity Modulate the Association of Mutations with Drug Response
Dina Cramer, Johanna Mazur, Octavio Espinosa, Matthias Schlesner, Daniel Hübschmann, Roland Eils, and Eike Staub
Mol Cancer Ther Mar 1, 2020 19:3 927–36; doi: 10.1158/1535-7163.MCT-19-0045

Editors of the AACR journals reviewed recently published content to identify hot topics across the entire portfolio. This publication focuses on mathematical modeling and AI and highlights articles based on a number of key metrics, such as usage and citations. We hope that you enjoy this complimentary cross-journal collection.


High-Throughput Prediction of MHC Class I and II Neoantigens with MHCnuggets
Xiaoshan M. Shao, Rohit Bhattacharya, Justin Huang, I.K. Ashok Sivakumar, Collin Tokheim, Lily Zheng, Dylan Hirsch, Benjamin Kaminow, Ashton Omdahl, Maria Bonsack, Angelika B. Riemer, Victor E. Velculescu, Valsamo Anagnostou, Kymberleigh A. Pagel, and Rachel Karchin
Cancer Immunol Res Mar 1, 2020 8:3 396–408; doi: 10.1158/2326-6066.CIR-19-0464

Modeling Cellular Response in Large-Scale Radiogenomic Databases to Advance Precision Radiotherapy
Venkata SK. Manem, Meghan Lambie, Ian Smith, Petr Smirnov, Victor Kofia, Mark Freeman, Marianne Koritzinsky, Mohamed E. Abazeed, Benjamin Haibe-Kains, and Scott V. Bratman
Cancer Res Dec 15, 2019 79:24 6227–37; doi: 10.1158/0008-5472.CAN-19-0179

pVACtools: A Computational Toolkit to Identify and Visualize Cancer Neoantigens
Jasreet Hundal, Susanna Kiwala, Joshua McMichael, Christopher A. Miller, Huiming Xia, Alexander T. Wollam, Connor J. Liu, Sidi Zhao, Yang-Yang Feng, Aaron P. Graubert, Amber Z. Wollam, Jonas Neichin, Megan Neveau, Jason Walker, William E. Gillanders, Elaine R. Mardis, Obi L. Griffith, and Malachi Griffith
Cancer Immunol Res Mar 1, 2020 8:3 409–20; doi: 10.1158/2326-6066.CIR-19-0401


Review

Use of Natural Language Processing to Extract Clinical Cancer Phenotypes from Electronic Medical Records

Guergana K. Savova1,2, Ioana Danciu3, Folami Alamudun3, Timothy Miller1,2, Chen Lin1, Danielle S. Bitterman2,4, Georgia Tourassi3, and Jeremy L. Warner5

Abstract

Current models for correlating electronic medical records with -omics data largely ignore clinical text, which is an important source of phenotype information for patients with cancer. This data convergence has the potential to reveal new insights about cancer initiation, progression, metastasis, and response to treatment. Insights from this real-world data will catalyze clinical care, research, and regulatory activities. Natural language processing (NLP) methods are needed to extract these rich cancer phenotypes from clinical text. Here, we review the advances of NLP and information extraction methods relevant to oncology based on publications from PubMed as well as NLP and machine learning conference proceedings in the last 3 years. Given the interdisciplinary nature of the fields of oncology and information extraction, this analysis serves as a critical trail marker on the path to higher fidelity oncology phenotypes from real-world data.

Introduction

Data produced during the processes of clinical care and research in oncology are proliferating at an exponential rate. In the past decade, use of electronic medical records (EMR) has increased significantly in the United States (1), driven at least in part by incentivization from the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009 (2). In parallel, large databases such as the NCI's Surveillance, Epidemiology, and End Results program (SEER; ref. 3), the National Cancer Database (NCDB; ref. 4), The Cancer Genome Atlas (TCGA; ref. 5), and the Human Tumor Atlas Network (HTAN; ref. 6) are increasingly important avenues for clinical and translational oncology research. However, significant nuanced phenotype data are locked in clinical free-text, which remains the primary form of documenting and communicating clinical presentations, provider impressions, procedural details, and management decision-making (7). Despite the proliferation of EMR and -omics data, critical and precise phenotype information is often detailed only in these clinical texts. Natural language processing (NLP), broadly defined as the transformation of language into computable representations, is key to large-scale extraction of nuanced data within clinical texts. As a subfield of artificial intelligence, clinical NLP (cNLP), which refers to the analysis of clinical or health care texts (as opposed to clinical application, per se), has been around for decades. However, only in recent years have compute power and algorithms advanced sufficiently to demonstrate its power toward broadening oncologic investigation.

There are excellent prior review articles of cNLP. Spyns (8) covers the period before 1995. Meystre and colleagues (9) survey the 1998 to 2008 developments. Yim and colleagues (10) provide an overview with a special emphasis on oncology for the period of 2008 to 2016. Neveol and colleagues (11) offer the first broad overview of cNLP for languages other than English. These surveys capture three distinct methodology phases in NLP, from exclusively rule-based systems through the shift toward probabilistic methods to the dominance of machine learning. Kreimeyer and colleagues (12) review existing cNLP systems. Some popular cNLP systems are MetaMap (concept mapping; refs. 13, 14), Apache cTAKES (classic NLP components, concept mapping, entities and attributes, relations, temporality; refs. 15, 16), YTex (entity and attributes; ref. 17), OBO annotator (concept mapping; ref. 18), TIES (linking of pathology reports to tissue bank data; ref. 19), MedLEE (entities and attributes, relations; ref. 20), CLAMP (entities and attributes; ref. 21), and NOBLE (entities and attributes; ref. 22).

The mid-2010s mark a transformational milestone for the field: plentiful digitized textual data and hardware advances met powerful mathematical abstractions in a super-connected world, leading to explosive interest in general artificial intelligence (e.g., autonomous cars) and NLP in particular (e.g., Google translator, Apple Inc.'s Siri, movie recommenders). Herein, we review major recent developments in cNLP methods for cancer since that watershed point. We discuss their applications for translational investigation and future directions. We cover publications since the 2016 review by Yim and colleagues (10), which are: (i) focused on cNLP of EMR text related to cancer; (ii) peer-reviewed; (iii) published in English and use English EMR text; (iv) sourced from MEDLINE and major computational linguistics and machine learning venues: the annual conferences of the Association of Computational Linguistics, North American Association of Computational Linguistics, European Association of Computational Linguistics, Empirical Methods for Natural Language Processing, International Conference on Machine Learning, Neural Information Processing Systems Conference, Machine Learning for Healthcare, SemEval, International Conference for High Performance Computing, and IEEE International Conference on Biomedical Health Informatics. Our goal is to highlight recent exceptional articles with implications for the broader cancer research community; thus, this survey is not a systematic meta-review. We acknowledge that much work is taking place outside traditional academic environments (i.e., industry), and we attempt to include it to the extent it meets this survey's inclusion criteria. For ease of reading, terms and definitions are presented in Table 1.

1Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts. 2Harvard Medical School, Boston, Massachusetts. 3Oak Ridge National Lab, Knoxville, Tennessee. 4Dana Farber Cancer Institute, Boston, Massachusetts. 5Vanderbilt University Medical Center, Nashville, Tennessee.

Corresponding Author: Guergana K. Savova, Boston Children's Hospital and Harvard Medical School, 401 Park Avenue, East-5523.3, Boston, MA 02215. Phone: 617-919-2972; Fax: 617-730-0817; E-mail: [email protected]

Cancer Res 2019;79:5463–70

doi: 10.1158/0008-5472.CAN-19-0579

©2019 American Association for Cancer Research.

We highlight results measured as accuracy, the harmonic mean of recall/sensitivity and precision/positive predictive value (F1 score), or AUC (trade-off between true positive and false positive rates). These performance metrics reflect a comparison against human-generated data (referred to as gold-standard annotations); thus, they capture agreement between NLP systems and humans. Gold-standard annotations are also used for training algorithms (supervised learning). The interannotator agreement (IAA) measures human performance and serves as a system performance target.
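As a minimal sketch of how these metrics follow from a comparison against gold-standard annotations, the Python below counts true and false positives and negatives for a binary task and derives accuracy, precision, recall, and F1 from them. The label lists and function names are invented for illustration and do not come from any system reviewed here.

```python
def confusion_counts(gold, predicted):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    # Also called positive predictive value; defined as 0 when nothing is predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Also called sensitivity; defined as 0 when there are no gold positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical gold-standard annotations and system output for eight mentions.
gold_labels = [1, 1, 1, 1, 0, 0, 0, 0]
system_labels = [1, 1, 1, 0, 1, 1, 0, 0]

tp, fp, fn, tn = confusion_counts(gold_labels, system_labels)
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
acc = accuracy(tp, fp, fn, tn)
```

The same functions applied to a second human annotator's labels, instead of system output, would give one way to express IAA on the same scale.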

Table 1. Terms and definitions

Term: Definition
Accuracy: $\frac{TP + TN}{TP + FP + FN + TN}$, where TP is true positive; TN is true negative; FP is false positive; and FN is false negative.
Artificial intelligence: A process through which machines mimic "cognitive" functions that humans associate with other human minds, such as language comprehension.
Area under the curve (AUC): A metric of binary classification; range from 0 to 1, 0 being always wrong, 0.5 representing random chance, and 1, the perfect score.
Artificial neural network: Computing systems that are inspired by, but not necessarily identical to, the biological neural networks that constitute the human brain.
Attribute: Facts, details, or characteristics of an entity.
Autoencoder: A class of artificial neural networks.
Concept mapping: A diagram that depicts suggested relationships between concepts.
Convolutional neural network: A class of artificial neural networks.
Decision tree: A tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
Deep learning: A subclass of a broader family of machine learning methods based on artificial neural networks. The designation "deep" signifies multiple layers of the neural network.
Entities: A person, place, thing, or concept about which data can be collected. Examples in the clinical domain include diseases/disorders, signs/symptoms, procedures, medications, anatomical sites.
F1 score: $\frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}}$; values range from 0 to 1 (perfect score).
Graphics processing unit: A specialized electronic circuit designed to perform very fast calculations needed for training artificial neural networks.
K-nearest neighbors: A nonparametric method used for classification and regression in pattern recognition.
Latent representation: Word representations that are not directly observed but are rather inferred through a mathematical model.
Machine learning: The scientific study of algorithms and probabilistic models that computer systems use in order to perform a specific task effectively without using explicit instructions, relying on patterns and inference instead.
Precision: $\frac{TP}{TP + FP}$, where TP is true positive, and FP is false positive.
Probabilistic methods: A nonconstructive method, primarily used in combinatorics, for proving the existence of a prescribed kind of mathematical object.
Recall: $\frac{TP}{TP + FN}$, where TP is true positive, and FN is false negative.
Recurrent neural network: A class of artificial neural networks.
Rule-based system: Systems involving human-crafted or curated rule sets.
Semantic representation: Ways in which the meaning of a word or sentence is interpreted.
Supervised learning: Machine learning method that infers a function from labeled training data consisting of a set of training examples.
Support vector machine: Supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis.
Tensor: A mathematical object analogous to but more general than a vector, represented by an array of components that are functions of the coordinates of a space.
Transfer learning: A machine learning technique where a model trained on one task is repurposed on a second related task.
Unsupervised learning: Self-organized Hebbian learning that helps find previously unknown patterns in a data set without pre-existing labels.
Word embedding: The collective name for a set of language modeling and feature learning techniques in natural language processing (NLP), where words or phrases from the vocabulary are mapped to vectors of real numbers.

Major NLP Algorithmic Advances

The past 3 years have shown the development of a variety of methodologies for NLP with a general shift toward a particular machine learning category: deep learning (DL; ref. 23). DL techniques were initially conceived in the 1980s but not operationalized until the convergence of three critical elements: massive digital text corpora, novel but compute- and data-intensive algorithms, and powerful, massively parallel computing architectures currently using graphics processing units (GPU; ref. 24). For many tasks, DL is considered state-of-the-art in artificial intelligence (25–27). The key differentiator between DL and feature-rich machine learners is the concept of representation learning (28). Feature-rich algorithms require expert knowledge (linguistic, semantic, biomedical, or world) to determine the information of interest. Some examples of feature-rich learners are support vector machines (SVM) and random forests (RF; ref. 29). In the clinical domain, the engineered features are often guided by biomedical dictionaries, clinical ontologies, or biomedical knowledge from domain experts. Instead, DL models automatically discover mathematically and computationally convenient abstractions from raw data needed for classification without the need for explicitly defined features (23, 25). These representations can range from simple word representations and word embeddings (30) to complex hierarchies that capture contextual meaning and relationships between words, phrases, and other compositional derivatives. This capability of DL algorithms can potentially unmask unknown relationships buried within large quantities of data, which can be particularly advantageous in cancer research and practice (25). Furthermore, DL algorithms can uniquely take advantage of transfer learning (26), the ability to learn from data not in the target domain, and then apply this knowledge to other domains. For example, one DL model may be trained on large, openly available nonmedical text data (e.g., Wikipedia), and then this model's knowledge is applied effectively in cNLP tasks through fine-tuning the model's parameters on smaller but directly relevant clinical text corpora.
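To make the idea of a word embedding concrete, the sketch below represents three words as vectors of real numbers and compares them with cosine similarity. The three-dimensional vectors are invented for illustration; real embeddings are learned from large corpora and typically have hundreds of dimensions.

```python
import math

# Toy vectors, made up for this example (not learned from any corpus).
TOY_EMBEDDINGS = {
    "tumor": [0.9, 0.1, 0.0],
    "neoplasm": [0.8, 0.2, 0.1],
    "aspirin": [0.0, 0.9, 0.3],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbor(word, embeddings):
    """Return the other vocabulary word whose vector is most similar to `word`."""
    candidates = [w for w in embeddings if w != word]
    return max(candidates, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))


similar = cosine_similarity(TOY_EMBEDDINGS["tumor"], TOY_EMBEDDINGS["neoplasm"])
dissimilar = cosine_similarity(TOY_EMBEDDINGS["tumor"], TOY_EMBEDDINGS["aspirin"])
closest = nearest_neighbor("tumor", TOY_EMBEDDINGS)
```

In this toy vocabulary the near-synonyms sit close together, which is the property that lets a model generalize across the synonyms and spelling variants common in clinical text without a hand-built dictionary.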

Most DL architectures are built on the artificial neural network with interconnected nodes (neurons) arranged in layers (23). The variations in the arrangement and interconnections of these layers result in various elaborate networks, or architectures, suitable for addressing a variety of tasks. The most popular among these include: convolutional neural networks (CNN), optimal for data where spatial relationships encode critical information; recurrent neural networks (RNN), advantageous for sequentially ordered data (e.g., time-series data); and autoencoders, suitable for learning problems from noisy data, or data where prior information about data are partially or entirely unknown (23). There is a substantial amount of research in the general (as opposed to clinical) application of DL, demonstrating its potential in NLP (31).

Linguistic variability, combined with the abundance of medical terminology, abbreviations, synonyms, jargon, and spelling inconsistencies prevalent in clinical text, makes cNLP a particularly challenging problem. DL has shown remarkable results in extracting low- and high-level abstractions from raw text data with semantic and syntactic capabilities. This ability is often accompanied by excellent performance across translational science applications (25, 32), as highlighted below.

Latest cNLP Application Developments

Task: extracting temporality and timelines

Longitudinal representations of patients' cancer journeys are a cornerstone of translational research enabling rich studies across variables (e.g., tumor molecular profile) and outcomes (e.g., treatment efficacy). Extracting timelines from the EMR free-text has become a line of cNLP research on its own. Since 2016, under the auspices of SemEval, Clinical TempEval shared tasks have challenged the NLP research community to establish state-of-the-art methods and results for temporal relation extraction with a focus on oncology. The dataset for these shared tasks consists of 400 patients with cancer distributed evenly between colon and brain cancers, each represented by pathology, radiology, and clinical notes (the THYME corpus described in ref. 33 and available from ref. 34). The tasks consisted of identifying event expressions, time expressions, and temporal relations (see Fig. 1 for an example). The relation between the event and the document creation time is called DocTimeRel with values of BEFORE, OVERLAP, BEFORE-OVERLAP, and AFTER, which provide a coarse-level temporal positioning on a timeline.
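The annotation types described above can be sketched as a small data structure. The Python below encodes the Fig. 1 sentence, "A surgery was scheduled on March 11, 2014.", with two events, one time expression, and two Contains relations. The class and field names are hypothetical and are not the THYME annotation schema, and the assignment of arguments to each Contains relation is an assumption, because the figure layout does not survive in this transcript.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class DocTimeRel(Enum):
    """The four DocTimeRel values listed in the text."""
    BEFORE = "BEFORE"
    OVERLAP = "OVERLAP"
    BEFORE_OVERLAP = "BEFORE-OVERLAP"
    AFTER = "AFTER"


@dataclass(frozen=True)
class Event:
    text: str
    doc_time_rel: DocTimeRel  # relation to the document creation time


@dataclass(frozen=True)
class TimeExpression:
    text: str


@dataclass(frozen=True)
class TemporalRelation:
    relation_type: str
    source: Union[Event, TimeExpression]
    target: Union[Event, TimeExpression]


sentence = "A surgery was scheduled on March 11, 2014."
surgery = Event("surgery", DocTimeRel.BEFORE)
scheduled = Event("scheduled", DocTimeRel.BEFORE)
date = TimeExpression("March 11, 2014")

# Assumed argument structure: the time expression contains each event.
relations = [
    TemporalRelation("CONTAINS", date, surgery),
    TemporalRelation("CONTAINS", date, scheduled),
]


def events_with_doc_time_rel(events, value):
    """Return the text of every event with the given DocTimeRel value."""
    return [event.text for event in events if event.doc_time_rel is value]


before_events = events_with_doc_time_rel([surgery, scheduled], DocTimeRel.BEFORE)
```

A system's output in this form can be compared with gold-standard annotations item by item, which is how the F1 scores reported for these tasks are obtained.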

Clinical TempEval 2016 (35) focused on developing methods from colon cancer EMR data and testing on colon cancer data (within-domain evaluation). The results suggest that current state-of-the-art systems perform extremely well on most event- and time expression-related tasks, with a gap between system performance and IAA (or human performance) of less than 0.05 F1. However, the temporal relation tasks remained a challenge. Systems that predict the DocTimeRel relation lagged about 0.09 F1 behind IAA. For other types of temporal relations, systems lagged about 0.25 F1 behind IAA.

Clinical TempEval 2017 (36) addressed the question of how well systems trained on one cancer medical domain (colon cancer) perform in predicting timelines in another cancer medical domain (brain cancer). The results showed that this remains an open research question, with a drop of 0.20+ F1 across domains. Providing a small amount of target domain training data improved performance.

Methods employed by the Clinical TempEval participants range from classic methods (logistic regression, conditional random fields, SVMs, pattern matching) to various architectures of the latest DL techniques (RNNs, CNNs with inputs of word and character embeddings). Clinical TempEval 2017 showed there was no one specific method that provides the best results, although the combination of various approaches appeared a promising path.

Figure 1. Clinical TempEval example: two events, one time expression, two temporal relations, two relations to the document creation time (DocTimeRel). [Annotated sentence: "A surgery was scheduled on March 11, 2014." Labels: Event1 and Event2, each with DocTimeRel: Before; Time expression1 ("March 11, 2014"); Temporal relation1 and Temporal relation2, both of type Contains.]

Outside of Clinical TempEval, experimentation with advanced DL architectures and various data streams for timeline extraction of cancer patient EMRs has intensified. Tourille and colleagues explored neural networks and domain adaptation strategies (37). Chen and colleagues (38) and Dligach and colleagues (39) dealt with simplifications of time expression representations in a neural approach. Recent trends include DL models that combine a small portion of labeled data with unlabeled publicly available data [Google News (30) and social media] to achieve results about 0.02 F1 below IAA (40). The current best reported result is 0.684 F1 (41).

Open source systems for timeline extraction include the Apache cTAKES temporal module (42), HeidelTime (for temporal expressions and their normalization; ref. 43), and rule-based extensions of Stanford CoreNLP (44).

The task of extracting temporality from EMR clinical narrative has advanced dramatically since 2016. In the last 3 years, the performance on the Clinical TempEval test set moved from 0.573 to 0.684 F1 for finer grained temporal relations and reached 0.835 F1 for DocTimeRel. This last result enables exploring select temporally sensitive applications such as outcomes extraction, which was pointed out as one of the most challenging yet to be addressed use cases in the 2016 survey article.

Application: extracting tumor and cancer characteristics

Information extraction from pathology reports, which have a more consistent structure than other free text EMR documents, presents a tractable challenge to the field of cNLP (45). Since the 2016 survey, the oncology NLP field has moved beyond cancer stage and TNM extraction into the extraction of more comprehensive cancer and tumor attributes. Qiu and colleagues (46) presented a CNN for information abstraction of primary cancer site topography from breast and lung cancer pathology reports from the Louisiana Cancer Registry, reporting 0.72 F1. Using the same corpus, Gao and colleagues (47) boosted performance using a more elaborate DL architecture (hierarchical attention neural network). The authors reported 0.80 F1 for cancer site topography and 0.90 F1 for histologic grade. However, the authors noted significant computational demands of their DL solution.

Alawad and colleagues (48) showed that for extraction of cancer primary site, histologic grade, and laterality, training a CNN to make multiple predictions simultaneously (multi-task learning) outperformed single-task models. In a later study, the authors explored the computational demands of CNN cNLP models and the role of high-performance computing for achieving population-level automated coding of pathology documents to achieve near real-time cancer surveillance for cancer registry development (49). Using a corpus of 23,000 pathology reports, they reported 0.84 F1 for primary cancer site extraction across 64 cancer sites using their CNN model, significantly outperforming a random forest classifier with 0.76 F1.
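The core idea of multi-task learning, one shared document representation feeding several task-specific output layers whose losses are summed, can be sketched without a DL framework. The pure-Python example below is a forward pass only and is not the architecture of Alawad and colleagues: the shared feature vector stands in for the output of a CNN encoder, and all weights, class counts, and gold labels are invented for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, features):
    """One score per class: dot product of each weight row with the features."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


# Stand-in for the shared representation a CNN would compute from one report.
shared_features = [0.5, -1.0, 2.0]

# One hypothetical output layer (head) per task; each row scores one class.
HEADS = {
    "primary_site": [[0.2, 0.1, 0.9], [0.7, -0.3, 0.1], [0.0, 0.5, -0.2]],
    "histologic_grade": [[0.1, 0.4, 0.3], [-0.2, 0.1, 0.8]],
    "laterality": [[0.3, 0.3, 0.3], [0.6, -0.1, 0.2]],
}

# Hypothetical gold class index for each task on this report.
GOLD = {"primary_site": 0, "histologic_grade": 1, "laterality": 1}


def multitask_forward(features, heads):
    """Apply every task head to the same shared features."""
    return {task: softmax(linear(weights, features)) for task, weights in heads.items()}


def joint_loss(probabilities, gold):
    """Sum of per-task cross-entropy losses; training would minimize this sum."""
    return sum(-math.log(probabilities[task][label]) for task, label in gold.items())


probabilities = multitask_forward(shared_features, HEADS)
loss = joint_loss(probabilities, GOLD)
```

Because every head reads the same features, gradient updates driven by one task's loss also reshape the representation used by the others, which is one plausible reason joint training can outperform separate single-task models.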

Yala and colleagues (50) used boosting (51) to extract tumor information from breast pathology reports and achieved 90% accuracy for extracting carcinoma and atypia categories. Because gold-standard datasets are a necessary but resource-intensive requirement of ML algorithms, this study also investigated the minimum number of annotations needed to maintain at least 0.9 F1 without the system being pretrained. They reported this to be approximately 400. Using similar methods, Acevedo and colleagues (52) found the rate of abnormal findings in asymptomatic patients to be 7%, and to increase with age. These results are higher than previously reported, suggesting the clinical value of these algorithms over current epidemiologic methods to measure cancer incidence and prevalence. In a study of multiple diseases, Gehrmann and colleagues (25) reported an improvement in F1 score and AUC for advanced cancer using CNNs over rule-based systems.

The open source DeepPhe platform (53, 54) is a hybrid system for extracting a number of tumor and cancer attributes. It implements a variety of artificial intelligence approaches (rules, domain knowledge bases, and machine learning, both feature-rich and DL) to crawl the entire cancer patient chart (not restricted to pathology notes) and to extract and summarize the information related to tumors and cancers and their characteristics. The IAA ranged from 0.46 to 1.00 F1, and system agreement with humans ranged from 0.32 to 0.96 F1. The system's highest result is for primary site extraction (0.96 F1) and its lowest is for PR method extraction (0.32 F1).

Castro and colleagues (55) developed an NLP system to annotate and classify all BI-RADS mentions present in a single radiology report, which can serve as the foundation for future studies that will leverage automated BI-RADS annotation, providing feedback to radiologists as part of a learning health system loop (56).

Application: clinical trials matching

Clinical trials determine safety and effectiveness of new medical treatments; with the successes of recent years including new classes of therapies (e.g., immunotherapy; CAR-T cells), the clinical trial landscape has exploded. Nevertheless, adult patient participation in clinical trials remains low, especially among underrepresented minorities. This limits trial completion, generalizability, and interpretation of trial findings. Thus, there is a great deal of interest in clinical trial matching. This is not a simple problem, given the need to extract information from trial protocols written in natural language and match the findings with characteristics from individual EMRs.

Since the 2016 survey article (10), researchers have explored DL technology to identify relevant information found in patients' EMRs to establish eligibility for clinical trials. Bustos and colleagues developed a CNN, leveraging its representation learning capability, to extract medical knowledge reflecting eligibility criteria from clinical trials (57). They reported promising results using CNNs compared with state-of-the-art classification algorithms including FastText (58), SVM, and k-Nearest Neighbors (kNN). Shivade and colleagues (59) and Zhang and colleagues (60) developed SVMs to automate the classification of eligibility criteria to facilitate trial matching for specific patient populations.

Yala and colleagues (50) and Osborne and colleagues (61) used Boostexter (62) and MetaMap (13, 14), respectively, on rule-based regular expressions to automatically extract relevant patient information from EMRs, predominantly free-text reports, to identify patient cohorts with characteristics of interest for clinical trials or other relevant reporting. There is also a panoply of commercial solutions emerging in this space, but our search did not reveal any publications by these commercial entities.
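As a rough illustration of rule-based extraction with regular expressions, the sketch below pulls receptor status mentions out of a free-text note and checks them against a made-up eligibility rule. The pattern, the note, and the criterion are all invented for this example; they are not the rules used in the cited systems, and production patterns would also need to handle negation, abbreviations, and report-specific phrasing.

```python
import re

# Hypothetical pattern: a receptor name, optional separators or a copula, then a status.
RECEPTOR_PATTERN = re.compile(
    r"\b(?P<receptor>ER|PR|HER2)\b[\s:-]*(?:is\s+|was\s+)?(?P<status>positive|negative)\b",
    re.IGNORECASE,
)


def extract_receptor_status(text):
    """Map each receptor mentioned in the text to its reported status."""
    return {
        match.group("receptor").upper(): match.group("status").lower()
        for match in RECEPTOR_PATTERN.finditer(text)
    }


def meets_example_criterion(status):
    """Invented criterion for illustration: ER positive and HER2 negative."""
    return status.get("ER") == "positive" and status.get("HER2") == "negative"


# Invented note text, not taken from any real record.
note = "Invasive ductal carcinoma. ER positive, PR: negative. HER2 was negative."
receptor_status = extract_receptor_status(note)
eligible = meets_example_criterion(receptor_status)
```

Rules like this are transparent and cheap to write for well-structured reports, but each new phrasing requires another rule, which is part of the motivation for the learned approaches discussed in this section.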

Application: pharmacovigilance and pharmacoepidemiology

Pharmacovigilance, drug-safety surveillance, and factors associated with nonadherence play an important role in improving patient outcomes by personalizing cancer treatments, monitoring, and understanding adverse drug events (ADE) as well as minimizing risks associated with different therapies. The 2016 survey article identifies outcomes extraction as one of the challenges for cNLP because temporality extraction plays a key role. With the advances in temporality extraction in the last three years (see section Extracting Temporality and Timelines), methods for outcomes extraction have also improved.


A variety of methods have been explored including logistic regression, SVM, random forest, decision tree, and DL to analyze EMR data to predict treatment prescription, quality of care, and health outcomes of patients with cancer. Using data from the SEER (3) cancer registry as gold-standard for cancer stages, and variables extracted from linked Medicare claims data, Bergquist and colleagues (63) classified patients with lung cancer receiving chemotherapy into different stages of severity, with a hybrid method of rules and ensemble ML algorithms. This system achieved 93% accuracy, demonstrating its potential applications to study the quality of care for patients with lung cancer and health outcomes.

Survival analysis plays an important role in clinical decision support. In oncology care, the choice of treatment depends greatly on prognosis, which is sometimes difficult for physicians to determine. Gensheimer and colleagues (64) proposed a hybrid pipeline that combines semantic data mining with neural embeddings of sequential clinical notes and outputs a probability of >3 months life expectancy.

Yang and colleagues (65) applied a tensorized RNN on sequential clinical records to extract a latent representation from the entire patient history, and used it as the input to an Accelerated Failure Time model to predict the survival time of metastatic breast cancer patients. Yin and colleagues (66) applied word embeddings to discover topics in patient-provider communications associated with an increased likelihood of early treatment discontinuation in the adjuvant breast cancer setting. Overall, treatment toxicity extraction remains an open research area.
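For readers unfamiliar with the Accelerated Failure Time model mentioned above, its generic textbook form relates the logarithm of survival time linearly to a covariate vector. The sketch below uses the latent patient representation as that vector; the symbols and the linear form are illustrative assumptions and may differ from the exact parameterization used in (65).

```latex
\log T_i = \beta_0 + \boldsymbol{\beta}^{\top} \mathbf{h}_i + \sigma\,\varepsilon_i
```

Here $T_i$ is the survival time of patient $i$, $\mathbf{h}_i$ is the latent representation extracted from that patient's history, $\beta_0$ and $\boldsymbol{\beta}$ are regression parameters, $\sigma$ is a scale parameter, and $\varepsilon_i$ is an error term whose assumed distribution (e.g., normal or extreme value) determines the specific model.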

Shareable Resources for NLP in Oncology

Recent years have seen cancer cNLP tasks tackled occasionally at mainstream NLP conferences and affiliated workshops (in open-domain NLP, top research is preferentially presented at conferences). Although still relatively rare, this has the potential to greatly benefit cancer cNLP research, with a larger community of NLP researchers working directly on these problems in addition to the more specialized cNLP community. The prerequisite for this trend to continue is access to shareable data resources, as also pointed out in the 2016 survey article. The colon and brain cancer THYME corpus was used in several general domain conference and workshop articles (37, 38, 40, 67–69), whereas a radiology report dataset from a 2007 challenge (available from ref. 70) was used in another (71), and a SEER-provided corpus (unshared and thus not available for distribution) was used in yet another (72). Other work has used ad hoc resources for methods development, but this is a less sustainable model due to the rarity of expertise in both cancer and NLP (73–75). A recently developed resource created gold-standard annotations of the semantics of sentences in notes describing patients with cancer (76). More shared resources, community challenges, and publicity for both will likely lead to more focused development of new methods for cancer information extraction, a challenge that the community needs to address.

Application at the Point of Care

The focus of our survey article is on NLP technologies for cancer translational studies. However, we briefly review the applications of these technologies for direct patient care, which has rightfully proceeded with caution given that even small system error rates could lead to harm. Lee and colleagues (77) studied the concordance of IBM Watson for Oncology, a commercial NLP-based treatment recommendation system, with the recommendations of local experts and found it to be 48.9%. Similar results are reported in (78, 79). Furthermore, such applications are treated as Software as a Medical Device (SaMD) by the FDA, which, justifiably, is a high bar to clear (80, 81). Some cautious use cases provide assistance to physicians (82, 83) in the form of question-answering and summarization. Voice tools in health care, which represent a distinct subdomain of NLP, are primarily used for (i) documentation; (ii) commands; and (iii) interactive response and navigation to patients (84).
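To make the concordance figure above concrete, the following Python sketch computes a simple concordance rate as the fraction of cases in which a system's recommendation exactly matches the expert's. Exact-match agreement, the function name, and the example labels are assumptions made for illustration; the cited studies may define concordance over broader recommendation categories.

```python
def concordance_rate(system_recs, expert_recs):
    """Fraction of cases where the system and expert recommendations agree exactly."""
    if len(system_recs) != len(expert_recs):
        raise ValueError("Both lists must cover the same cases.")
    if not system_recs:
        raise ValueError("At least one case is required.")
    # Count the cases on which the two recommendations are identical.
    agreements = sum(s == e for s, e in zip(system_recs, expert_recs))
    return agreements / len(system_recs)


if __name__ == "__main__":
    # Hypothetical regimen labels for four cases.
    system = ["regimen_A", "regimen_B", "regimen_C", "regimen_A"]
    expert = ["regimen_A", "regimen_B", "regimen_A", "regimen_C"]
    print(f"Concordance: {concordance_rate(system, expert):.1%}")
```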

Implications and Future Directions

As discussed above, NLP technology for cancer has made strides since the 2016 survey article, which stated that at that time "oncology-specific NLP is still in its infancy." Given the breadth and depth of the research we surveyed in the current article, we believe the field has expanded, enabled by state-of-the-art methods and abundant digital EMR data. We observe more collaborations between NLP researchers and oncologists, which was one of the take-away lessons from Yim and colleagues.

State-of-the-art machine learning methods require significant amounts of human-labeled data to learn from, which are expensive and time-consuming to produce. This presents a methodologic challenge: developing learning paradigms that exploit vast unlabeled datasets (lightly supervised or unsupervised methods). Another challenge lies in the portability of the machine learners, as they represent the distributions of the data they learned from. If translated to a domain with a different distribution (e.g., colorectal to brain cancer), there is a substantial drop in performance (see section Extracting Temporality and Timelines). Thus, domain adaptation remains an unsolved and hot scientific problem. Large-scale translational science is likely to cross country boundaries and harvest data from EMRs written in a variety of languages. Therefore, the cNLP research community needs to think about multilingual machine learning to enable such bold studies. On the hardware side, DL methods require vast computational resources available only to very few, a limitation not necessarily solved by a cloud computing environment. Last but not least, ethical considerations of the application of these powerful technologies should be discussed, at the bare minimum whether the underlying data on which machine learners are trained represent the whole of human diversity.

In research, real-world big data have great potential to improve cancer care. Gregg and colleagues present risk stratification research for prostate cancer (85). The utilization of real-world big data is a key focus area of the NCI (86). SEER and NCDB, the two major cancer registry databases in the United States, have limitations in terms of coverage, accuracy, and granularity that introduce bias (3, 4, 87, 88, 89, 90). Currently, database building requires manual annotation of clinical free-text, which is resource intensive and prone to human error. cNLP can support more rapid, large-scale, and standardized database development. Automated, semiautomated, and accurate identification of cancer cases will be particularly helpful in studying underrepresented patient populations and rare cancers. In addition, cNLP can facilitate analysis of unstructured data that are poorly documented in databases but widely accepted to be critical for prognostication and management decision-making, most notably patient-reported outcomes (91). Our hope is that larger, more accurate, and granular clinical databases can be integrated with -omics databases to enable translational research to better understand oncologic phenotype relationships. This data convergence has the potential to enable new insights about cancer initiation, progression, metastasis, and response to treatment.

Although NLP has yet to make major inroads in the clinical setting, some of the potential applications are clear. Direct extraction of cancer phenotypes from source data (pathology and radiology reports) could reduce redundancy and prevent ambiguity within a patient's chart, minimizing confusion and medical errors. Summarization and information retrieval applications can reduce search burden and enable clinicians to spend more time with their patients. Clinical decision support tools could help reduce the increasingly burdensome cognitive load placed on clinicians, although the results reported thus far by efforts such as IBM Watson for Oncology raise serious concerns about what the bar for accuracy of clinical recommendations should be for routine use. In fact, these results are a cautionary tale of the challenges of domain adaptation; the software was widely reported to have been trained on hypothetical cases at a highly specialized cancer center, leading to incorrect and possibly unsafe recommendations (92). At this time, NLP technology is not yet ripe for direct patient care except in carefully observed scenarios.

Conclusion

cNLP has the potential to affect almost all aspects of the cancer care continuum, and multidisciplinary collaboration is necessary to ensure optimal advancement of the field. As there are few individuals with expertise in both oncology and NLP, clinical oncologists, basic and translational scientists, bioinformaticians, and epidemiologists should work with computer scientists to identify and prioritize the most important clinical questions and tasks that can be addressed with this technology. Furthermore, oncology subject matter experts will be needed to create gold datasets. Once an NLP technology is developed, oncologists and cancer researchers should take a primary role in evaluating it to determine its utility for research and its clinical value. Although standards for clinical evaluation of software, including artificial intelligence systems, are evolving (93), NLP tools that directly affect management decisions should be considered for evaluation in a trial setting by clinical investigators familiar with the technology and FDA guidelines (80). In partnership, computer scientists, oncology researchers, and clinicians can take full advantage of the recent advances in NLP technology to fully leverage the wealth of data stored and rapidly accumulating in our EMRs.

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Acknowledgments

The work was supported by funding from U24CA184407 (NCI), U01CA231840 (NCI), R01 LM 10090 (LM), and R01GM114355 (NIGMS). This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program established by the U.S. Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health. This work was performed under the auspices of the U.S. Department of Energy by Argonne National Laboratory under Contract DE-AC02-06-CH11357, Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory under Contract DE-AC5206NA25396, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Received February 15, 2019; revised June 17, 2019; accepted July 29, 2019;published first August 8, 2019.

References

1. Cohen MF. Impact of the HITECH financial incentives on EHR adoption in small, physician-owned practices. Int J Med Inf 2016;94:143–54.

2. GovTrack.us. H.R. 1 (111th): American Recovery and Reinvestment Act of 2009 – House Vote #46 – Jan 28, 2009. [cited 2019 Feb 11]. Available from: https://www.govtrack.us/congress/votes/111-2009/h46.

3. National Cancer Institute. Surveillance, Epidemiology, and End Results Program. SEER. [cited 2019 Feb 11]. Available from: https://seer.cancer.gov/index.html.

4. National Cancer Database. American College of Surgeons. [cited 2019 Feb 11]. Available from: https://www.facs.org/quality-programs/cancer/ncdb.

5. The Cancer Genome Atlas Home Page. The Cancer Genome Atlas - National Cancer Institute. 2011 [cited 2019 Feb 11]. Available from: https://cancergenome.nih.gov/.

6. National Cancer Institute. Human Tumor Atlas Network (HTAN). [cited 2019 Feb 11]. Available from: https://www.cancer.gov/research/key-initiatives/moonshot-cancer-initiative/implementation/human-tumor-atlas.

7. Rosenbloom ST, Denny JC, Xu H, Lorenzi N, Stead WW, Johnson KB. Data from clinical notes: a perspective on the tension between structure and flexible documentation. J Am Med Inform Assoc 2011;18:181–6.

8. Spyns P. Natural language processing in medicine: an overview. Methods Inf Med 1996;35:285–301.

9. Meystre SM, Savova GK, Kipper-Schuler KC, Hurdle JF. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb Med Inform 2008;128–44.

10. Yim WW, Yetisgen M, Harris WP, Kwan SW. Natural language processing in oncology: a review. JAMA Oncol 2016;2:797–804.

11. Névéol A, Dalianis H, Velupillai S, Savova G, Zweigenbaum P. Clinical natural language processing in languages other than English: opportunities and challenges. J Biomed Semant 2018;9:12. doi: 10.1186/s13326-018-0179-8.

12. Kreimeyer K, Foster M, Pandey A, Arya N, Halford G, Jones SF, et al. Natural language processing systems for capturing and standardizing unstructured clinical information: a systematic review. J Biomed Inform 2017;73:14–29.

13. Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proc AMIA Symp 2001;17–21.

14. Aronson AR, Lang FM. An overview of MetaMap: historical perspective and recent advances. J Am Med Inform Assoc 2010;17:229–36.

15. Savova GK, Masanz JJ, Ogren PV, Zheng J, Sohn S, Kipper-Schuler KC, et al. Mayo clinical text analysis and knowledge extraction system (cTAKES): architecture, component evaluation and applications. J Am Med Inform Assoc 2010;17:507–13.

16. ctakes.apache.org. [homepage on the Internet]. The Apache Software Foundation. [cited 2019 Feb 11]. Available from: ctakes.apache.org.

17. Garla V, Lo Re V, Dorey-Stein Z, Kidwai F, Scotch M, Womack J, et al. The Yale cTAKES extensions for document classification: architecture and application. J Am Med Inform Assoc 2011;18:614–20.

18. www.obofoundry.org. [homepage on the Internet]. [cited 2019 Feb 11]. Available from: www.obofoundry.org.

19. TIES v5; clinical text search engine. [cited 2019 Feb 11]. Available from: http://ties.dbmi.pitt.edu/.

20. Friedman C. A broad-coverage natural language processing system. Proc AMIA Symp 2000;270–4.

21. Soysal E, Wang J, Jiang M, Wu Y, Pakhomov S, Liu H, et al. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines. J Am Med Inform Assoc 2017 Nov 24 [Epub ahead of print]. doi: 10.1093/jamia/ocx132.

22. Tseytlin E, Mitchell K, Legowski E, Corrigan J, Chavan G, Jacobson RS. NOBLE – Flexible concept recognition for large-scale biomedical natural language processing. BMC Bioinformatics 2016;17:32.

23. Goodfellow I, Bengio Y, Courville A. Deep learning. MIT Press; 2016 [cited 2019 Feb 12]. Available from: http://www.deeplearningbook.org.

24. Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986;323:533.

25. Gehrmann S, Dernoncourt F, Li Y, Carlson ET, Wu JT, Welt J, et al. Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives. PLoS ONE 2018;13:e0192360.

26. Young T, Hazarika D, Poria S, Cambria E. Recent trends in deep learning based natural language processing. IEEE Comput Intell Mag 2018;13:55–75.

27. Goldberg Y. A primer on neural network models for natural language processing. J Artif Intell Res 2016;57:345–420.

28. Bengio Y, Courville A, Vincent P. Representation learning: a review and new perspectives. ArXiv12065538 Cs. 2012 Jun 24 [cited 2019 Feb 13]. Available from: http://arxiv.org/abs/1206.5538.

29. Manning CD, Raghavan P, Schütze H. Introduction to information retrieval. Cambridge University Press; 2008.

30. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, editors. Advances in Neural Information Processing Systems 26. Curran Associates, Inc.; 2013 [cited 2019 Jan 3]. p. 3111–9. Available from: http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf.

31. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44.

32. Banerjee I, Ling Y, Chen MC, Hasan SA, Langlotz CP, Moradzadeh N, et al. Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification. Artif Intell Med 2019;97:79–88.

33. Styler WF, Bethard S, Finan S, Palmer M, Pradhan S, de Groen PC, et al. Temporal annotation in the clinical domain. Trans Assoc Comput Linguist 2014;2:143–54.

34. THYME corpus (available through hNLP Center membership). Available from: https://healthnlp.hms.harvard.edu/center/pages/data-sets.html.

35. Bethard S, Savova G, Chen W-T, Derczynski L, Pustejovsky J, Verhagen M. SemEval-2016 Task 12: clinical TempEval. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). San Diego, CA: Association for Computational Linguistics; 2016 [cited 2019 Jan 3]. p. 1052–62. Available from: http://www.aclweb.org/anthology/S16-1165.

36. Bethard S, Savova G, Palmer M, Pustejovsky J. SemEval-2017 Task 12: Clinical TempEval. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Vancouver, Canada: Association for Computational Linguistics; 2017 [cited 2019 Jan 2]. p. 565–72. Available from: http://www.aclweb.org/anthology/S17-2093.

37. Tourille J, Ferret O, Neveol A, Tannier X. Neural architecture for temporal relation extraction: a Bi-LSTM approach for detecting narrative containers. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (volume 2: short papers). Vancouver, Canada: Association for Computational Linguistics; 2017 [cited 2019 Jan 3]. p. 224–30. Available from: http://aclweb.org/anthology/P17-2035.

38. Lin C, Miller T, Dligach D, Bethard S, Savova G. Representations of time expressions for temporal relation extraction with convolutional neural networks. In: BioNLP 2017. Vancouver, Canada: Association for Computational Linguistics; 2017 [cited 2019 Jan 3]. p. 322–7. Available from: http://www.aclweb.org/anthology/W17-2341.

39. Dligach D, Miller T, Lin C, Bethard S, Savova G. Neural temporal relation extraction. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (volume 2: short papers). Valencia, Spain: Association for Computational Linguistics; 2017 [cited 2019 Jan 3]. p. 746–51. Available from: http://www.aclweb.org/anthology/E17-2118.

40. Lin C, Miller T, Dligach D, Amiri H, Bethard S, Savova G. Self-training improves recurrent neural networks performance for temporal relation extraction. In: Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis. Brussels, Belgium: Association for Computational Linguistics; 2018 [cited 2019 Jan 3]. p. 165–76. Available from: http://www.aclweb.org/anthology/W18-5619.

41. Lin C, Miller T, Dligach D, Bethard S, Savova G. A BERT-based universal model for both within- and cross-sentence clinical temporal relation extraction. In: Clinical NLP Workshop. Minneapolis, MN; 2019.

42. Lin C, Dligach D, Miller TA, Bethard S, Savova GK. Multilayered temporal modeling for the clinical domain. J Am Med Inform Assoc 2016;23:387–95.

43. Strötgen J, Gertz M. Multilingual and cross-domain temporal tagging. Lang Resour Eval 2013;47:269–98.

44. Manning C, Surdeanu M, Bauer J, Finkel J, Bethard S, McClosky D. The Stanford CoreNLP natural language processing toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Baltimore, Maryland: Association for Computational Linguistics; 2014 [cited 2019 Jan 3]. p. 55–60. Available from: http://aclweb.org/anthology/P14-5010.

45. Liu K, Hogan WR, Crowley RS. Natural language processing methods and systems for biomedical ontology learning. J Biomed Inform 2011;44:163–79.

46. Qiu JX, Yoon HJ, Fearn PA, Tourassi GD. Deep learning for automated extraction of primary sites from cancer pathology reports. IEEE J Biomed Health Inform 2018;22:244–51.

47. Gao S, Young MT, Qiu JX, Yoon H-J, Christian JB, Fearn PA, et al. Hierarchical attention networks for information extraction from cancer pathology reports. J Am Med Inform Assoc 2017 Nov 16 [Epub ahead of print]. doi: 10.1093/jamia/ocx131.

48. Alawad M, Yoon H, Tourassi GD. Coarse-to-fine multi-task training of convolutional neural networks for automated information extraction from cancer pathology reports. In: 2018 IEEE EMBS International Conference on Biomedical Health Informatics (BHI); 2018. p. 218–21.

49. HPC-Based hyperparameter search of MT-CNN for information extraction from cancer pathology reports. [cited 2019 Feb 12]. Available from: https://sc18.supercomputing.org/proceedings/workshops/workshop_pages/ws_cafcw107.html.

50. Yala A, Barzilay R, Salama L, Griffin M, Sollender G, Bardia A, et al. Using machine learning to parse breast pathology reports. Breast Cancer Res Treat 2017;161:203–11.

51. Schapire RE. The boosting approach to machine learning: an overview. Nonlinear Estimation and Classification. Springer; 2003 [cited 2019 Feb 11]. Available from: https://www.cs.princeton.edu/courses/archive/spring07/cos424/papers/boosting-survey.pdf.

52. Acevedo F, Armengol VD, Deng Z, Tang R, Coopey SB, Braun D, et al. Pathologic findings in reduction mammoplasty specimens: a surrogate for the population prevalence of breast cancer and high-risk lesions. Breast Cancer Res Treat 2019;173:201–7.

53. Savova GK, Tseytlin E, Finan S, Castine M, Miller T, Medvedeva O, et al. DeepPhe: a natural language processing system for extracting cancer phenotypes from clinical records. Cancer Res 2017;77:e115–8.

54. Public release of the DeepPhe analytic software. DeepPhe; 2019 [cited 2019 Feb 14]. Available from: https://github.com/DeepPhe/DeepPhe-Release.

55. Castro SM, Tseytlin E, Medvedeva O, Mitchell K, Visweswaran S, Bekhuis T, et al. Automated annotation and classification of BI-RADS assessment from radiology reports. J Biomed Inform 2017;69:177–87.

56. Chandran UR, Medvedeva OP, Barmada MM, Blood PD, Chakka A, Luthra S, et al. TCGA expedition: a data acquisition and management system for TCGA Data. PLoS ONE 2016;11. [cited 2019 May 29]. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5082933/.

57. Bustos A, Pertusa A. Learning eligibility in cancer clinical trials using deep neural networks. Appl Sci 2018;8:1206.

58. Joulin A, Grave E, Bojanowski P, Mikolov T. Bag of tricks for efficient text classification. ArXiv160701759 Cs. 2016 Jul 6 [cited 2019 Feb 15]. Available from: http://arxiv.org/abs/1607.01759.

59. Shivade C, Hebert C, Regan K, Fosler-Lussier E, Lai AM. Automatic data source identification for clinical trial eligibility criteria resolution. AMIA Annu Symp Proc 2017;2016:1149–58.

60. Zhang K, Demner-Fushman D. Automated classification of eligibility criteria in clinical trials to facilitate patient-trial matching for specific patient populations. J Am Med Inform Assoc 2017;24:781–7.

61. Osborne JD, Wyatt M, Westfall AO, Willig J, Bethard S, Gordon G. Efficient identification of nationally mandated reportable cancer cases using natural language processing and machine learning. J Am Med Inform Assoc 2016;23:1077–84.

62. Schapire RE, Singer Y. BoosTexter: a boosting-based system for text categorization. Mach Learn 2000;39:135–68.

63. Bergquist SL, Brooks GA, Keating NL, Landrum MB, Rose S. Classifying lung cancer severity with ensemble machine learning in health care claims data. Proc Mach Learn Res 2017;68:25–38.

64. Gensheimer MF, Henry AS, Wood DJ, Hastie TJ, Aggarwal S, Dudley SA, et al. Automated survival prediction in metastatic cancer patients using high-dimensional electronic medical record data. J Natl Cancer Inst 2018 Oct 21 [Epub ahead of print].

65. Yang Y, Fasching PA, Tresp V. Modeling progression free survival in breast cancer with tensorized recurrent neural networks and accelerated failure time models. Proceedings of Machine Learning for Healthcare 2017. [cited 2019 Feb 11]. Available from: http://mucmd.org/CameraReadySubmissions/37%5CCameraReadySubmission%5CPFS_TTRNN_AFT_CameraReady.pdf.

66. Yin Z, Harrell M, Warner JL, Chen Q, Fabbri D, Malin BA. The therapy is making me sick: how online portal communications between breast cancer patients and physicians indicate medication discontinuation. J Am Med Inform Assoc 2018;25:1444–51.

67. Lin C, Miller T, Dligach D, Bethard S, Savova G. Improving temporal relation extraction with training instance augmentation. In: Proceedings of the 15th Workshop on Biomedical Natural Language Processing. Berlin, Germany: Association for Computational Linguistics; 2016. p. 108–13.

68. Galvan D, Okazaki N, Matsuda K, Inui K. Investigating the challenges of temporal relation extraction from clinical text. In: Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis. Brussels, Belgium: Association for Computational Linguistics; 2018. p. 55–64.

69. Leeuwenberg A, Moens MF. Word-Level loss extensions for neural temporal relation classification. In: Proceedings of the 27th International Conference on Computational Linguistics. Santa Fe, NM: Association for Computational Linguistics; 2018. p. 3436–47.

70. ICD-9 radiology corpus (available through hNLP Center membership). [cited 2019 Feb 11]. Available from: https://healthnlp.hms.harvard.edu/center/pages/data-sets.html.

71. Karimi S, Dai X, Hassanzadeh H, Nguyen A. Automatic diagnosis coding of radiology reports: a comparison of deep learning and conventional classification methods. BioNLP 2017 2017;328–32.

72. Zamaraeva O, Howell K, Rhine A. Improving feature extraction for pathology reports with precise negation scope detection. In: Proceedings of the 27th International Conference on Computational Linguistics. 2018. p. 3564–75. Available from: https://www.aclweb.org/anthology/C18-1302/.

73. Jagannatha A. Structured prediction models for RNN based sequence labeling in clinical text. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016. p. 856–65. Available from: https://www.aclweb.org/anthology/D16-1082/.

74. Jagannatha AN, Yu H. Bidirectional RNN for medical event detection in electronic health records. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. San Diego, California: Association for Computational Linguistics; 2016. p. 473–82. [cited 2019 Jan 18]. Available from: http://aclweb.org/anthology/N16-1056.

75. Shivade C, de Marneffe M-C, Fosler-Lussier E, Lai AM. Identification, characterization, and grounding of gradable terms in clinical text. In: Proceedings of the 15th Workshop on Biomedical Natural Language Processing. Berlin, Germany: Association for Computational Linguistics; 2016. p. 17–26.

76. Roberts K, Si Y, Gandhi A, Bernstam E. A framenet for cancer information in clinical narratives: schema and annotation. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018). Miyazaki, Japan: European Language Resource Association; 2018 [cited 2019 Jan 3]. Available from: http://aclweb.org/anthology/L18-1041.

77. Lee WS, Ahn SM, Chung JW, Kim KO, Kwon KA, Kim Y, et al. Assessing concordance with Watson for Oncology, a cognitive computing decision support system for colon cancer treatment in Korea. JCO Clin Cancer Inform 2018;2:1–8.

78. Kim EJ, Woo HS, Cho JH, Sym SJ, Baek JH, Lee WS, et al. Early experience with Watson for Oncology in Korean patients with colorectal cancer. PLoS One 2019;14:e0213640.

79. Choi YI, Chung JW, Kim KO, Kwon KA, Kim YJ, Park DK, et al. Concordance rate between clinicians and Watson for Oncology among patients with advanced gastric cancer: early, real-world experience in Korea. Can J Gastroenterol Hepatol 2019;2019:8072928.

80. U.S. Food and Drug Administration. Artificial intelligence and machine learning in software as a medical device. 2019 Apr 2 [cited 2019 Jun 6]. Available from: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device.

81. U.S. Food and Drug Administration. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD). [cited 2019 Jun 6]. Available from: https://www.fda.gov/media/122535/download.

82. Schuler A, Callahan A, Jung K, Shah NH. Performing an informatics consult: methods and challenges. J Am Coll Radiol JACR 2018;15:563–8.

83. Hirsch JS, Tanenbaum JS, Lipsky Gorman S, Liu C, Schmitz E, Hashorva D, et al. HARVEST, a longitudinal patient record summarizer. J Am Med Inform Assoc 2015;22:263–74.

84. Kumah-Crystal YA, Pirtle CJ, Whyte HM, Goode ES, Anders SH, Lehmann CU. Electronic health record interactions through voice: a review. Appl Clin Inform 2018;9:541–52.

85. Gregg JR, Lang M, Wang LL, Resnick MJ, Jain SK, Warner JL, et al. Automating the determination of prostate cancer risk strata from electronic medical records. JCO Clin Cancer Inform 2017;1. doi: 10.1200/CCI.16.00045.

86. National Cancer Institute. Hope and challenge: the NCI annual plan and budget proposal for fiscal year 2020. 2018 [cited 2019 Feb 11]. Available from: https://www.cancer.gov/news-events/cancer-currents-blog/2018/sharpless-nci-annual-plan-2020.

87. Giordano SH, Kuo YF, Duan Z, Hortobagyi GN, Freeman J, Goodwin JS. Limits of observational data in determining outcomes from cancer therapy. Cancer 2008;112:2456–66.

88. Noone AM, Lund JL, Mariotto A, Cronin K, McNeel T, Deapen D, et al. Comparison of SEER treatment data with Medicare claims. Med Care 2016;54:e55–64.

89. Baldwin LM, Adamache W, Klabunde CN, Kenward K, Dahlman C, L Warren J. Linking physician characteristics and Medicare claims data: issues in data availability, quality, and measurement. Med Care 2002;40(8 Suppl):IV-82–95.

90. Lerro CC, Robbins AS, Phillips JL, Stewart AK. Comparison of cases captured in the National Cancer Data Base with those in population-based central cancer registries. Ann Surg Oncol 2013;20:1759–65.

91. Hernandez-Boussard T, Tamang S, Blayney D, Brooks J, Shah N. New paradigms for patient-centered outcomes research in electronic medical records: an example of detecting urinary incontinence following prostatectomy. EGEMS (Wash DC) 2016;4:1231.

92. STAT. IBM's Watson recommended "unsafe and incorrect" cancer treatments. 2018 [cited 2019 Jun 13]. Available from: https://www.statnews.com/2018/07/25/ibm-watson-recommended-unsafe-incorrect-treatments/.

93. U.S. Food and Drug Administration. Developing a software precertification program: a working model. [cited 2019 Feb 11]. Available from: https://www.fda.gov/downloads/MedicalDevices/DigitalHealth/DigitalHealthPreCertProgram/UCM605685.pdf.

Precision Medicine and Imaging

AI-Assisted In Situ Detection of Human Glioma Infiltration Using a Novel Computational Method for Optical Coherence Tomography

Ronald M. Juarez-Chambi1, Carmen Kut2, Jose J. Rico-Jimenez1, Kaisorn L. Chaichana3, Jiefeng Xi2, Daniel U. Campos-Delgado4, Fausto J. Rodriguez5, Alfredo Quinones-Hinojosa3, Xingde Li2, and Javier A. Jo6

Abstract

Purpose: In glioma surgery, it is critical to maximize tumor resection without compromising adjacent noncancerous brain tissue. Optical coherence tomography (OCT) is a noninvasive, label-free, real-time, high-resolution imaging modality that has been explored for glioma infiltration detection. Here, we report a novel artificial intelligence (AI)-assisted method for automated, real-time, in situ detection of glioma infiltration at high spatial resolution.

Experimental Design: Volumetric OCT datasets were intraoperatively obtained from resected brain tissue specimens of 21 patients with glioma tumors of different stages and labeled as either noncancerous or glioma-infiltrated on the basis of histopathology evaluation of the tissue specimens (gold standard). Labeled OCT images from 12 patients were used as the training dataset to develop the AI-assisted OCT-based method for automated detection of glioma-infiltrated brain tissue. Unlabeled OCT images from the other 9 patients were used as the validation dataset to quantify the method detection performance.

Results: Our method achieved excellent levels of sensitivity (~100%) and specificity (~85%) for detecting glioma-infiltrated tissue with high spatial resolution (16 μm laterally) and processing speed (~100,020 OCT A-lines/second).

Conclusions: Previous methods for OCT-based detection of glioma-infiltrated brain tissue rely on estimating the tissue optical attenuation coefficient from the OCT signal, which requires sacrificing spatial resolution to increase signal quality, and performing systematic calibration procedures using tissue phantoms. By overcoming these major challenges, our AI-assisted method will enable implementing practical OCT-guided surgical tools for continuous, real-time, and accurate intraoperative detection of glioma-infiltrated brain tissue, facilitating maximal glioma resection and superior surgical outcomes for patients with glioma.

1Department of Biomedical Engineering, Texas A&M University, College Station, Texas. 2Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland. 3Department of Neurologic Surgery, Mayo Clinic, Jacksonville, Florida. 4Facultad de Ciencias, Universidad Autónoma de San Luis de Potosí, San Luis de Potosí, Mexico. 5Division of Neuropathology, Department of Neurosurgery, Johns Hopkins University, Baltimore, Maryland. 6School of Electrical and Computer Engineering, University of Oklahoma, Norman, Oklahoma.

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

Corresponding Author: Javier A. Jo, University of Oklahoma, Stephenson Research and Technology Center, Suite 1108, 101 David L. Boren Blvd., Norman, OK 73019. Phone: (405) 325-9600; Fax: (405) 325-6029; E-mail: [email protected]

Clin Cancer Res 2019;25:6329–38

doi: 10.1158/1078-0432.CCR-19-0854

©2019 American Association for Cancer Research.

Introduction

Gliomas are the most common and aggressive primary brain cancers in adults (1, 2). It is well established that maximal glioma surgical resection can lead to both prolonged survival and delayed cancer recurrence (1, 3–6). The challenge, however, lies in the limited ability of neurosurgeons to differentiate cancerous versus noncancerous brain tissue during resection surgery. The standard of care, which is interpreted as the surgeon's perception of cancer based on gross appearance and all available intraoperative surgical navigational systems, has been shown to have 100% sensitivity and 40% to 50% specificity (7). Overcoming this surgical challenge will enable both maximizing cancer resection and minimizing damage of healthy brain tissue, thus significantly improving both the overall survival rate (OS) and progression-free survival (PFS; refs. 8–10). Several imaging techniques are currently being evaluated or already adopted as image-guided surgical tools to assist with brain cancer resection. MRI provides excellent visualization of soft tissue, but it is not sensitive at detecting microscopic disease at the tumor margin, even when used intraoperatively (11). Intraoperative CT (iCT) allows assessing for residual cancer, but has low resolution at the tumor periphery (12). In addition, these imaging modalities are time-consuming, costly (upwards of $1 million to adopt), and do not provide continuous real-time intraoperative guidance. Intraoperative ultrasound imaging (iUS) enables real-time imaging, but it has limited contrast and spatial resolution for brain cancer detection (13). Intraoperative fluorescence imaging of 5-aminolevulinic acid (5-ALA) induced protoporphyrin-IX (PpIX) has shown a good correlation between fluorescence distribution and the presence of high-grade glioma (14), but it has shown limited sensitivity and specificity for detecting cancer-infiltrated brain tissue and low-grade gliomas (15, 16). Raman spectroscopy and imaging have been broadly applied for brain tissue biochemical differentiation (17) and glioma infiltration detection by providing subcellular resolution and label-free imaging capabilities (18–22). Unfortunately, several limitations are associated with these techniques, including the intrinsic weakness of the Raman signal, limited imaging depth and field of view (FOV), and slow imaging speed (23–25). In addition, the capability of Raman spectroscopy and/or imaging for detecting cancer-infiltrated brain tissue intraoperatively has not yet been fully demonstrated (26, 27). More recently, coherent Raman scattering (CRS) and stimulated Raman scattering (SRS) microscopy have been explored for brain tumor margin differentiation; however, a definite intraoperative computer-aided diagnosis (CAD) system for human brain tissue differentiation has not been reported (23, 26, 28, 29). In summary, there is still an urgent need for image-guided tools capable of providing continuous, in situ, and accurate assessment of brain cancer infiltration during brain tumor resection surgery.

Optical coherence tomography (OCT) is a noninvasive medical imaging technique capable of continuous, label-free, high-resolution, 2D and 3D imaging of biological tissues (30). Because the imaging depth of OCT (1.5–3 mm) is similar to the resection depth of cancer-infiltrated brain regions, OCT has been evaluated as an image-guided tool for brain tumor resection surgery (7, 31–35). One common limitation of previous studies, however, is the lack of adequate computational methods for rapid, automated, and accurate intraoperative detection of cancer-infiltrated brain tissue at high spatial resolution, particularly for glioma resection.

Previous computational methods for OCT-based detection of glioma-infiltrated brain tissue rely on estimating the tissue optical attenuation coefficient from the OCT signal, which requires averaging multiple A-line signals to reduce the noise and thus sacrificing spatial resolution. In addition, previous methods also require performing calibration procedures using tissue phantoms (7, 31, 33, 36). To overcome these major challenges, we have developed a novel artificial intelligence (AI)-based computational method in which each depth-dependent OCT intensity measurement (or A-line) is modeled as a linear combination of underlying characteristic intensity-depth profiles. As a result, this method enables identifying and quantifying intensity-depth signatures specific to A-lines from glioma-infiltrated brain tissue, which can be utilized as discriminative features within machine learning algorithms to detect glioma-infiltrated brain tissue. The method was successfully developed using a database of OCT 3D images taken from freshly resected human noncancerous and glioma-infiltrated brain tissue samples, and its performance was robustly quantified using an independent validation database. Owing to its demonstrated accuracy, low computational cost, and high spatial resolution, this method has the potential to enable the development of OCT-guided surgical tools for continuous, real-time, and accurate in situ intraoperative detection of glioma infiltration.

Materials and Methods

Database of OCT scans from fresh brain tissue surgical samples

Intraoperative, fresh brain tissue samples were obtained from the edge of the surgical cavity based on neurosurgeon visual interpretation and image-guided navigation in 21 surgical glioma patients. The imaging protocol was approved by the Institutional Review Board at Johns Hopkins University (Baltimore, MD), which follows the Belmont Report ethical guidelines. Informed written consent was obtained from each subject or each subject's legal guardian. The tissue samples corresponded to either noncancerous or glioma-infiltrated brain regions. A number of OCT volumes were acquired from different locations within each brain tissue sample. The measured OCT lateral (i.e., horizontal) and axial (i.e., vertical or depth) resolutions were approximately 16.0 μm and 6.4 μm (in tissue), respectively (7). Each volume consists of a series of 10 OCT cross-sectional images or B-scans of 1,024 pixels (2 mm) laterally by 2,048 pixels (2.5 mm) in depth, where each B-scan was acquired at 0.5-mm intervals, resulting in a volume of 5 × 2 × 2.5 mm³ (W × L × D). To divide the database of OCT volumes into training and validation sets, the 21 patients were randomly divided into two groups, one with 12 and another with 9 patients. All the volumes from the group of 12 patients were assigned to the training set, while all the volumes from the group of 9 patients were assigned to the validation set. All the samples underwent histopathologic processing and evaluation by a neuropathologist (7). The histopathologic distributions of the OCT volumes in the training and validation sets are summarized in Table 1.
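The patient-level split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_by_patient`, the seed, and the three placeholder volumes per patient are hypothetical and are not taken from the study. The point it shows is that whole patients, not individual volumes, are assigned to a group, so no patient contributes data to both sets.

```python
import random

def split_by_patient(patient_ids, n_train, seed=0):
    """Randomly assign whole patients to the training or validation group."""
    rng = random.Random(seed)  # hypothetical seed, used only for repeatability
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return set(shuffled[:n_train]), set(shuffled[n_train:])

patients = [f"Patient-{i}" for i in range(1, 22)]  # 21 patients
train_patients, val_patients = split_by_patient(patients, n_train=12)

# Hypothetical volume records (patient id, volume index); real counts vary per patient.
volumes = [(p, v) for p in patients for v in range(3)]
train_volumes = [rec for rec in volumes if rec[0] in train_patients]
val_volumes = [rec for rec in volumes if rec[0] in val_patients]
```

Splitting by patient rather than by volume keeps correlated volumes from the same tissue sample out of the validation set.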

Translational Relevance

Maximal tumor resection improves overall survival and delays cancer recurrence in patients with glioma; however, the margins of highly infiltrating gliomas are often very difficult to delineate during glioma resection surgery. Various medical imaging modalities are used pre- and/or intraoperatively to assist in the delineation of glioma margins. Unfortunately, none of these technologies can provide quantitative, real-time, accurate, and continuous guidance during glioma resection surgery. Optical coherence tomography (OCT) is a noninvasive, label-free, real-time, high-resolution volumetric imaging modality. Previous computational methods for OCT-based detection of glioma infiltration require significantly sacrificing spatial resolution and performing cumbersome calibration procedures. We have developed and validated an alternative accurate and fast artificial intelligence (AI)-assisted computational method that overcomes these major limitations. Our method can be implemented within generic OCT instruments to enable real-time, high-resolution, automated, accurate, in situ, intraoperative detection of glioma infiltration, facilitating maximal tumor resection and improved surgical outcomes for patients with glioma.

OCT B-scans preprocessing

Each original OCT B-scan was preprocessed following the procedure described in Fig. 1. First, the original B-scan (Fig. 1A) was cropped to remove artifacts from above the tissue surface using a predefined fixed crop (Fig. 1B). Then, the tissue surface was detected from the cropped B-scan using the Canny Edge Detection algorithm (Fig. 1C; ref. 37), and the B-scan was warped using a circle-shifting upward method to flatten the surface (Fig. 1D). Finally, to eliminate reflection artifacts within the tissue region caused by the cover glass or the saline surface (Fig. 1D, arrows), a peak detection algorithm was applied and the regions of the A-line around the detected peaks were smoothed using a 2D entropy filter of order 5 × 5. Although these preprocessing steps do not guarantee the absolute elimination of all artifacts, the resulting preprocessed B-scans (Fig. 1E) were adequate for the application of the A-line modeling method described in the following section.
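The surface-flattening step can be illustrated with a minimal Python sketch, assuming the surface position of each A-line is already known. Here the surface is located with a trivial stand-in (the index of a marker value) rather than Canny edge detection, and the function name `flatten_surface` and the toy B-scan are hypothetical. Each A-line is circularly shifted upward so that its surface sample lands at depth index 0.

```python
def flatten_surface(b_scan, surface_rows):
    """Circularly shift each A-line upward so its surface sample lands at depth index 0."""
    flattened = []
    for a_line, s in zip(b_scan, surface_rows):
        s %= len(a_line)
        # Samples above the surface wrap around to the bottom of the A-line.
        flattened.append(a_line[s:] + a_line[:s])
    return flattened

# Toy B-scan stored as a list of A-lines (columns), 5 depth samples each.
# The value 9 marks the tissue surface in this made-up example.
b_scan = [
    [0, 9, 5, 3, 1],
    [0, 0, 9, 5, 3],
    [9, 5, 3, 1, 0],
]
surface_rows = [a_line.index(9) for a_line in b_scan]  # stand-in for edge detection
flat = flatten_surface(b_scan, surface_rows)
```

After the shift every A-line starts at the tissue surface, so depth index d refers to the same depth below the surface in all A-lines.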

Model-based OCT A-line feature extraction

The main idea behind our method for automated classification of noncancerous versus glioma-infiltrated brain regions is to model every A-line $y_k$ of any OCT B-scan as a linear combination of $N$ profiles or end-members $p_n$ $(n = 1, \ldots, N)$:

$$y_k = \sum_{n=1}^{N} a_{k,n}\, p_n \quad \forall k = 1, \ldots, K. \tag{A}$$

The profiles $p_n$ are assumed to be the same for all the A-lines of the available data, while the linear coefficients or abundances $a_{k,n}$ are assumed to be unique to each A-line $y_k$. The profiles $p_n$ were first estimated from the training set, consisting of 1,940 B-scans (Table 1). To accelerate the estimation of the profiles, only every other A-line in each B-scan of the training set was used (512 out of 1,024 A-lines per B-scan). All the selected training A-lines (1,940 × 512 = 993,280) were arranged into a matrix $Y = [y_1 \ldots y_K]$ of size $L \times K$, where $L$ is the length of each A-line (1,024) and $K$ is the total number of A-lines in the training set (993,280). The $N$ unknown profiles $p_n$ were arranged into a matrix $P = [p_1 \ldots p_N]$ of size $L \times N$, where $L$ is the length of each profile (equal to the A-line length) and $N$ is the number of profiles. The unknown abundances were arranged into a matrix $A = [a_1 \ldots a_K]$ of size $N \times K$, where the abundance column vector at the $k$th A-line is denoted as $a_k = [a_{k,1} \ldots a_{k,N}]^\top$. Using this matrix notation, the modeling of all the A-lines, based on Eq. (A), can be expressed as $Y = PA$.
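As a small illustration of Eq. (A), the following Python sketch synthesizes one A-line from a set of profiles and abundances. The profile values, their length (4 samples instead of 1,024), and the function name `mix` are hypothetical and chosen only to keep the arithmetic easy to follow; the profiles are stored as rows here, i.e., as the transpose of the matrix P in the text.

```python
def mix(profiles, abundances):
    """Return one A-line y_k = sum_n a_{k,n} * p_n, as in Eq. (A)."""
    length = len(profiles[0])
    return [sum(a * p[d] for a, p in zip(abundances, profiles)) for d in range(length)]

# Hypothetical profiles (N = 3) over 4 depth samples.
P = [
    [1.0, 0.5, 0.25, 0.125],  # fast intensity decay with depth
    [0.5, 0.5, 0.5, 0.5],     # flat profile
    [0.0, 0.25, 0.5, 1.0],    # intensity rising with depth
]
a_k = [0.5, 0.25, 0.25]       # nonnegative abundances that sum to one
y_k = mix(P, a_k)
```

Because the profiles are shared by all A-lines, the three numbers in `a_k` are the only quantities that describe this particular A-line.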

The simultaneous estimation of the unknown profile $P$ and abundance $A$ matrices from the training OCT A-line data $Y$ can be formulated as a nonlinear quadratic optimization problem, with the following specific constraints: (i) the profiles can have nonnegative values only ($P \geq 0$), (ii) the abundances can have nonnegative values only ($A \geq 0$), and (iii) the values of the abundances for a given A-line should add to one, since they represent the relative contribution of each profile to that A-line ($A^\top \mathbf{1} = \mathbf{1}$). This constrained nonlinear quadratic optimization problem can be solved by applying our recently developed and validated blind end-member and abundance estimation (BEAE) method (38, 39), which minimizes the following cost function:

$$\min_{P,A} \; \frac{1}{2}\,\|Y - PA\|_F^2 + \rho \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \|p_i - p_j\|^2 - \mu\, \|A\|_F^2. \tag{B}$$

The first term is directly related to the quadratic optimization approach in Eq. (A). The second term is a regularization term that penalizes the distance between profiles by using a regularization parameter $\rho > 0$. The third term is a regularization term for the abundances that ensures low entropy conditions among A-lines by using $\mu > 0$. Once the profiles $P$ have been estimated from the training data, the abundances for any new set of A-lines can be directly estimated by solving Eq. (A) using a constrained linear least-squares approach. This estimation is computationally fast, as it only involves solving a system of linear equations with a positivity constraint on the abundances.
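To make the abundance-estimation step concrete, the sketch below recovers the abundances of one A-line when the profiles are held fixed. It is an assumption-laden illustration, not the authors' solver: it uses a generic projected gradient descent onto the probability simplex (nonnegative entries that sum to one) in place of the constrained linear least-squares routine mentioned in the text, and the profile values, step count, and function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # the last index satisfying the condition sets the shift
    return [max(x - theta, 0.0) for x in v]

def estimate_abundances(y, profiles, steps=5000):
    """Minimize 0.5 * ||y - sum_n a_n p_n||^2 over the simplex (profiles fixed)."""
    n, depth = len(profiles), len(y)
    # Step size from a conservative bound on the curvature (squared Frobenius norm).
    step = 1.0 / sum(x * x for p in profiles for x in p)
    a = [1.0 / n] * n
    for _ in range(steps):
        fit = [sum(a[j] * profiles[j][d] for j in range(n)) for d in range(depth)]
        residual = [fit[d] - y[d] for d in range(depth)]
        grad = [sum(residual[d] * profiles[j][d] for d in range(depth)) for j in range(n)]
        a = project_to_simplex([a[j] - step * grad[j] for j in range(n)])
    return a

# Hypothetical profiles (rows) and a noise-free A-line built from known abundances.
P = [[1.0, 0.5, 0.25, 0.125], [0.5, 0.5, 0.5, 0.5], [0.0, 0.25, 0.5, 1.0]]
a_true = [0.5, 0.25, 0.25]
y = [sum(a * p[d] for a, p in zip(a_true, P)) for d in range(4)]
a_hat = estimate_abundances(y, P)
```

In practice a matrix-based least-squares solver applied to all A-lines at once would be far faster than this per-A-line loop; the loop is used here only because it makes the two constraints explicit.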

Table 1. Training set (12 patients, 194 OCT volumes, 1,940 B-scans) and validation set (9 patients, 295 OCT volumes, 2,950 B-scans)

Patient #    Location #   # Noncancerous OCT volumes   # Glioma-infiltrated OCT volumes   Grade of cancer
Training set
Patient-1    1A           3                            —                                  —
Patient-2    2A           4                            —                                  —
Patient-3    3A           3                            —                                  —
Patient-4    4A           13                           —                                  —
             4B           8                            —                                  —
Patient-5    5A           10                           —                                  —
Patient-6    6A           —                            8                                  Grade II
             6B           —                            9                                  Grade II
             6C           —                            11                                 Grade II
             6D           —                            13                                 Grade II
             6E           —                            16                                 Grade II
Patient-7    7A           —                            5                                  Grade II
             7B           —                            11                                 Grade II
Patient-8    8A           —                            8                                  Grade II
             8B           —                            10                                 Grade II
             8C           —                            11                                 Grade II
Patient-9    9A           18                           —                                  —
Patient-10   10A          —                            5                                  Grade IV
             10B          —                            5                                  Grade IV
             10C          —                            5                                  Grade IV
Patient-11   11A          —                            8                                  Grade IV
Patient-12   12A          10                           —                                  —
Total in training set     69                           125
Validation set
Patient-13   13A          —                            13                                 Grade II
             13B          —                            17                                 Grade II
             13C          —                            14                                 Grade II
Patient-14   14A          18                           —                                  —
Patient-15   15A          15                           —                                  —
             15B          9                            —                                  —
             15C          9                            —                                  —
Patient-16   16A          16                           —                                  —
             16B          8                            —                                  —
             16C          16                           —                                  —
Patient-17   17A          23                           —                                  —
             17B          16                           —                                  —
Patient-18   18A          21                           —                                  —
Patient-19   19A          —                            22                                 Grade IV
             19B          —                            14                                 Grade IV
Patient-20   20A          12                           —                                  —
             20B          —                            17                                 Grade II
Patient-21   21A          5                            —                                  —
             21B          —                            15                                 Grade IV
             21C          —                            15                                 Grade IV
Total in validation set   168                          127

Classifier training

The abundances $a_{k,n}$ estimated for each A-line can be used as discriminative features within a machine learning algorithm designed to classify each A-line as from either a noncancerous or a glioma-infiltrated brain region. Because the number of features $(N - 1)$ is small relative to the number of training samples ($K$), a simple logistic regression classifier was chosen over other more complex methods, such as support-vector machines and neural networks (40). Because each whole OCT volume in the training set was annotated as either noncancerous or glioma-infiltrated brain tissue (Table 1), all the A-lines in a given volume were labeled based on their volume annotation. The resulting abundances $A$ from the 993,280 labeled A-lines in the training set (194 volumes × 10 B-scans/volume × 512 A-lines/B-scan, see Table 1) were then used to optimize the logistic regression classifier. Because the logistic regression classifier was trained to classify each A-line of a new OCT volume as either from a noncancerous or a glioma-infiltrated brain region, not all the A-lines from the new OCT volume would necessarily be assigned to the same class. Therefore, to classify a whole new OCT volume as either from a noncancerous or a glioma-infiltrated brain region, a threshold on the percentage of A-lines classified as from a glioma-infiltrated region in that volume was used. This threshold was determined by performing a ROC analysis following a Leave-One-Patient-Out-Cross-Validation (LOPOCV) classification performance estimation strategy with the OCT volumes of the training set.
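The two-level decision described above (A-line level first, then volume level) can be sketched as follows. Everything numeric here is hypothetical: the logistic regression weights, the bias, the 0.5 cut-off on the A-line probability, the volume threshold, and the toy feature values are placeholders, and the direction of the volume rule (a volume is called glioma-infiltrated when the flagged fraction reaches the threshold) is an assumption of this sketch.

```python
import math

def glioma_probability(features, weights, bias):
    """Logistic regression probability that one A-line is glioma-infiltrated."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def classify_volume(a_line_features, weights, bias, volume_threshold):
    """Flag A-lines individually, then threshold the flagged fraction of the volume."""
    flagged = sum(glioma_probability(f, weights, bias) >= 0.5 for f in a_line_features)
    fraction = flagged / len(a_line_features)
    return fraction >= volume_threshold, fraction

# Hypothetical classifier parameters and four A-line feature vectors (two abundances each).
weights, bias = [-4.0, 4.0], 0.0
volume = [(0.2, 0.6), (0.3, 0.5), (0.6, 0.2), (0.1, 0.7)]
is_glioma, fraction = classify_volume(volume, weights, bias, volume_threshold=0.5)
```

In this toy volume three of the four A-lines are flagged, so the flagged fraction is 0.75 and the volume is labeled glioma-infiltrated at the chosen threshold.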

Figure 1.

Preprocessing steps applied to every B-scan in both the training and validation sets. A, Unprocessed B-scan. B, Cropped B-scan. C, Surface detection using the Canny Edge Detection algorithm. D, Warped B-scan to generate a flat surface. Arrows indicate reflection artifacts. E, Peak detection to identify locations of reflection artifacts and entropy filtering around the detected peaks to obtain a completely preprocessed B-scan. F, Results of the BEAE analysis applied to the training set. Top, all the A-lines (Y) from the training set included in the BEAE analysis. Middle, estimated abundances (A) for each A-line analyzed. Bottom, estimated profiles (P) common to all the A-lines included in the BEAE analysis. G, Sample A-line from a noncancerous volume and its fit modeled as the linear combination of the estimated common profiles (P). H, Sample A-line from a glioma-infiltrated volume and its fit modeled as the linear combination of the estimated common profiles (P).


Classification performance estimation

The overall machine learning classification computational scheme optimized with the training set was applied to the validation set. First, the same profiles $P$ estimated from the training set were used to directly estimate the abundances of each of the A-lines in the validation set. Then, the resulting abundances $A$ from the 3,020,800 A-lines in the validation set (295 volumes × 10 B-scans/volume × 1,024 A-lines/B-scan, see Table 1) were used to classify each A-line as either from a noncancerous or a glioma-infiltrated brain region using the same logistic regression classifier optimized with the training set. Finally, each of the 295 OCT volumes in the validation set was classified as either from a noncancerous or a glioma-infiltrated brain region using the same threshold on the volume percentage of A-lines classified as from a glioma-infiltrated region, previously optimized with the training set. The classification performance obtained from the validation set was quantified in terms of overall classification accuracy, sensitivity, and specificity.

Results

OCT A-line model performance

The BEAE method was applied using a model order of $N = 3$ (number of profiles) to the A-lines of the OCT volumes in the training set, as illustrated in Fig. 1F. A detailed description of the method used to determine the optimal BEAE order value of $N = 3$ is provided as Supplementary Material. All the 1,940 B-scans analyzed are shown stacked up next to each other horizontally in the top panel. No clear distinction between A-lines from noncancerous and glioma-infiltrated brain regions (separated by the red dashed line) can be observed from the stacked B-scans. The estimated abundances $A$ of the 993,280 A-lines from the training set included in the BEAE analysis are shown in the middle panel. The three profiles $P$ estimated from the training set, shown in the bottom panel, have complementary shapes and positive amplitude values, as expected from the positivity optimization constraint. To illustrate the capability of the BEAE method to model A-lines as a linear combination of the three estimated common profiles ($P$), sample A-lines and their model fits are shown in Fig. 1G (noncancerous volume) and Fig. 1H (glioma-infiltrated volume). It can be observed that the model fits capture the shape of the A-line without overfitting the noise in the OCT signal.

Model-based OCT A-line feature extraction

An important consequence of modeling any A-line $y_k$ as a linear combination of a set of common profiles $p_n$ is the resulting unique representation of each A-line in terms of its abundances $a_{k,n}$. Because these unique sets of abundances parameterize the unique shape of each A-line, they can be utilized as feature vectors within a machine learning classification algorithm. Because the abundances of each A-line add to 1, only $(N - 1)$ abundances are independent. Because the OCT training A-lines were modeled using three abundances ($N = 3$), only two of them could be used as classification features. The distributions of each of the three abundances for the noncancerous and glioma-infiltrated A-lines from the training set are shown in Fig. 2. It can be observed that the first abundance is distributed at lower values for the glioma-infiltrated A-lines (Fig. 2A), while the opposite can be observed for the third abundance (Fig. 2C). In addition, the distributions of the first and second abundances are roughly the reflection of each other, providing redundant information (Fig. 2A and B). Following an exhaustive feature selection approach, the first and third abundances $(a_{k,1}, a_{k,3})$ were chosen for the optimal feature vector $x_k = [a_{k,1}, a_{k,3}]$ used for training the logistic regression classifier aiming to identify an A-line as either from a noncancerous or a glioma-infiltrated brain region. In Fig. 2D, the distributions of these feature vectors for the noncancerous and glioma-infiltrated A-lines of the training set are shown in the two-dimensional space $(a_{k,1}, a_{k,3})$. A detailed description of the method used for feature selection is provided as Supplementary Material.

Figure 2.

Distributions of the three abundances of the noncancerous and glioma-infiltrated A-lines in the training set. A, The first abundance ($a_{k,1}$) is distributed at lower values for the glioma-infiltrated A-lines. B, The distribution of the second abundance ($a_{k,2}$) roughly mirrors that of the first abundance. C, The third abundance ($a_{k,3}$) is distributed at higher values for the glioma-infiltrated A-lines. D, Distributions of the feature vectors $x_k = [a_{k,1}, a_{k,3}]$ for the noncancerous and glioma-infiltrated A-lines (only 0.02% of the total training set is shown for clarity).

Classifier training

The logistic regression classifier was trained to classify any A-line as either from a noncancerous or a glioma-infiltrated brain region using the 194 OCT volumes in the training set. To classify a whole new OCT volume as either noncancerous or glioma-infiltrated, the threshold on the percentage of A-lines classified as from a glioma-infiltrated region in that volume was determined by applying ROC analysis following a LOPOCV performance estimation strategy in all the 194 OCT volumes of the training set. The corresponding area under the ROC curve (AUC) was 0.96, which indicates a very promising classification performance. Because the clinical emphasis is to obtain maximal glioma-infiltrated tissue resection while preserving as much healthy tissue as possible, sensitivity for detecting glioma-infiltrated tissue was prioritized over specificity. A threshold of 80% was selected to maximize the sensitivity for detecting glioma-infiltrated regions (99.15%) while keeping the specificity, and thus the preservation of noncancerous tissue, as high as possible (86.21%). The results of the ROC analysis are shown in Fig. 3A.
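The threshold selection just described can be illustrated with a simple threshold sweep. This is a hedged sketch, not the authors' ROC procedure: it assumes that each training volume already has a held-out percentage of A-lines flagged as glioma-infiltrated (as a leave-one-patient-out scheme would provide), and the percentages, labels, sensitivity floor, and function names are all hypothetical.

```python
def sensitivity_specificity(percent_flagged, is_glioma, threshold):
    """Sensitivity and specificity when volumes at or above the threshold are called positive."""
    tp = fn = tn = fp = 0
    for percent, glioma in zip(percent_flagged, is_glioma):
        positive = percent >= threshold
        if glioma:
            tp += positive
            fn += not positive
        else:
            fp += positive
            tn += not positive
    return tp / (tp + fn), tn / (tn + fp)

def pick_threshold(percent_flagged, is_glioma, min_sensitivity):
    """Among thresholds meeting the sensitivity floor, keep the most specific one."""
    best = None
    for threshold in sorted(set(percent_flagged)):
        sens, spec = sensitivity_specificity(percent_flagged, is_glioma, threshold)
        if sens >= min_sensitivity and (best is None or spec > best[2]):
            best = (threshold, sens, spec)
    return best

# Hypothetical held-out results for eight volumes (percent of A-lines flagged, true label).
percent_flagged = [95, 88, 70, 35, 40, 10, 5, 20]
is_glioma = [True, True, True, True, False, False, False, False]
best = pick_threshold(percent_flagged, is_glioma, min_sensitivity=1.0)
```

Fixing a sensitivity floor first and then maximizing specificity mirrors the stated clinical priority of not missing glioma-infiltrated tissue.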

Classification performance estimation

The performance of the trained logistic regression classifier was estimated blindly on the totally independent validation set as follows. First, the abundances of each A-line in the validation set were estimated using the same profiles already estimated from the training set (Fig. 1F, bottom). The abundance estimation is computationally fast, due to the matrix operation approach, allowing computing a B-scan (1,024 A-lines) in 30 milliseconds using MATLAB on a Core i7-4790K 4-GHz processor. Once the abundances had been estimated, each A-line in the validation set was classified using the independently trained logistic regression classifier. After the A-line level classification, each OCT volume was finally classified as either from a noncancerous or a glioma-infiltrated region using the previously selected threshold on the percentage of A-lines in that volume classified as from a glioma-infiltrated region. The results on the classification of the OCT volumes in the validation set are summarized in Table 2. The applied double-blinded validation indicated promising levels of sensitivity (>90%) and specificity (>80%) for discriminating low-grade and/or high-grade glioma-infiltrated tissue from noncancerous tissue. Nevertheless, approximately 15% of all volumes were misclassified, probably due to significant intra-class variability and extra-class similarity observed among the OCT images, as illustrated in Fig. 3B–E. It is worth noting that this validation set was completely independent (collected from 9 different patients) and the validation was blindly performed, meaning that the validation set was provided unlabeled for the described performance estimation procedure.

To demonstrate the potential of our computational framework for real-time accurate detection and volumetric visualization of cancerous and noncancerous brain tissue, two unlabeled high-resolution OCT volumes (5 × 2 × 2.5 mm³; 256 × 2,048 × 2,048 pixels) from a noncancerous and a glioma-infiltrated brain region were blindly processed with our trained computational framework. The computational speed for processing and classifying each A-line was >100,000 A-lines per second. To visualize the classification results, the OCT volumetric data was 3D rendered, and the surface of the imaged brain region was color-coded using a colormap proportional to the estimated posterior probability of each A-line being from a glioma-infiltrated brain region (Fig. 3F). For the OCT volume of a glioma-infiltrated brain region (Fig. 3F, left), 100% of the A-lines were correctly classified as from glioma-infiltrated brain region. For the OCT volume of a noncancerous brain region (Fig. 3F, right), 92% of the A-lines were correctly classified as from noncancerous brain region.

Discussion

Maximal tumor resection both improves the overall survival and delays cancer recurrence in patients with low-grade and high-grade glioma (6, 41–43). The limited ability of neurosurgeons to differentiate noncancerous versus cancer-infiltrated brain tissues during resection surgery is the main challenge preventing higher rates of maximal tumor resection. Although several imaging techniques have been utilized routinely to assist brain cancer surgeries (7, 13, 29, 44, 45), there are significant limitations to these modalities. An effective image-guided tool for brain cancer resection surgery should be capable of providing high-resolution, accurate, continuous, and real-time in situ discrimination between noncancerous and cancer-infiltrated brain tissue from intraoperative volumetric brain images.

Optical imaging modalities are well suited to enable such capabilities. Among them, intraoperative imaging of 5-ALA-induced PpIX brain tissue fluorescence is perhaps the most extensively evaluated approach. Unfortunately, its performance for identifying glioma-infiltrated brain regions has not been fully demonstrated, with different studies reporting a wide range of sensitivity (50%–100%) and specificity (70%–100%; ref. 46). One major limitation of this approach is its dependency on the sufficient and specific 5-ALA uptake by the glioma tissue, which can be affected by many factors, including blood–brain barrier permeability, cellular/vascular proliferation, and glioma grade (16, 47). Another limitation is the lack of quantitative methods to image the 5-ALA–induced PpIX fluorescence, which has prevented successfully moving from a subjective to a more objective and accurate interpretation of 5-ALA–induced PpIX brain tissue fluorescence images (48).

OCT can be seamlessly implemented as a hand-held surgical tool and/or integrated into standard surgical microscopes to provide label-free, high-resolution, and fast volumetric tissue imaging. These capabilities and its relatively inexpensive implementation cost make OCT an ideal imaging modality to enable continuous real-time guidance during brain cancer resection surgery. However, for OCT to become an impactful image-guided tool for brain cancer resection surgery, CAD systems are needed to enable in situ intraoperative automated, objective, and accurate detection of cancer-infiltrated brain tissue, as well as real-time volumetric visualization of cancerous and noncancerous brain tissue during tumor resection surgery.

Previously reported approaches for OCT-based detection of cancer-infiltrated brain tissue rely on estimating the tissue optical attenuation coefficient from the OCT signal, which is used as a discriminative feature for identifying cancerous or cancer-infiltrated brain tissue (7, 34, 49). For instance, Kut and colleagues recently introduced a computationally efficient method to estimate the optical attenuation coefficient of brain tissue from OCT scans, and demonstrated the potential of this

Table 2. Confusion matrix of the blind validation classification results

             Low-grade vs. noncancerous   High-grade vs. noncancerous   Low/high-grade vs. noncancerous
             Sensitivity 90.16%           Sensitivity 95.45%            Sensitivity 90.55%
             Specificity 80.95%           Specificity 82.14%            Specificity 82.73%
Predicted    Actual +     Actual −        Actual +     Actual −         Actual +     Actual −
+            55           32              63           30               115          29
−            6            136             3            138              12           139

NOTE: Left column, results for the classification of low-grade glioma-infiltrated versus noncancerous brain tissue; middle, results for the classification of high-grade glioma-infiltrated versus noncancerous brain tissue; right column, results for the classification of low/high-grade glioma-infiltrated versus noncancerous brain tissue. OCT volumes were predicted as being either positive (+) or negative (−) for the presence of glioma infiltration.
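The sensitivity and specificity values in Table 2 follow directly from the confusion-matrix counts. The short Python sketch below recomputes them; the function name and the dictionary layout are illustrative choices, and the assignment of the first count in each pair to truly glioma-infiltrated volumes is an assumption that is consistent with the reported percentages.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of infiltrated volumes detected
    specificity = tn / (tn + fp)  # fraction of noncancerous volumes cleared
    return sensitivity, specificity


# Counts read from Table 2 as (TP, FP, FN, TN); the column assignment is assumed.
TABLE_2 = {
    "low-grade vs. noncancerous": (55, 32, 6, 136),
    "high-grade vs. noncancerous": (63, 30, 3, 138),
    "low/high-grade vs. noncancerous": (115, 29, 12, 139),
}

for name, counts in TABLE_2.items():
    sens, spec = sensitivity_specificity(*counts)
    print(f"{name}: sensitivity={sens:.2%}, specificity={spec:.2%}")
```

With these counts the recomputed values agree with the table to within 0.01 percentage points; the last specificity, 139/168, rounds to 82.74%, whereas the table prints 82.73%.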

Figure 3.

A, Results of the ROC analysis performed on the training set to select the threshold in the volume percentage of A-lines classified as from a glioma-infiltrated region used to classify the whole volume as either from a noncancerous or a glioma-infiltrated region. The AUC = 0.96 indicates a very promising classification performance. A threshold of 80% was selected to maximize the sensitivity (99.15%) at the best specificity level possible (86.21%) for classifying volumes from a glioma-infiltrated region. Sample classified B-scans from the validation set, in which the surface was color-coded based on each A-line classification (red: glioma-infiltrated; green: noncancerous): true positive (B), false positive (C), false negative (D), and true negative (E). F, Sample OCT 3D rendered images of brain regions in which the surface of the imaged brain tissue is color-coded using a colormap proportional to the estimated posterior probability of each A-line being from a glioma-infiltrated brain region. For the sample OCT volume of a glioma-infiltrated brain region (left), 100% of the A-lines were correctly classified as from a glioma-infiltrated brain region. For the sample OCT volume of a noncancerous brain region (right), 92% of the A-lines were correctly classified as from a noncancerous brain region.
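The volume-level decision rule described in the caption can be sketched as follows: a volume is called glioma-infiltrated when the fraction of its A-lines classified as glioma-infiltrated reaches the 80% threshold. This is a minimal illustration, not the authors' implementation; the function name is hypothetical, and treating a fraction exactly equal to the threshold as positive is an assumption.

```python
def classify_volume(a_line_is_glioma, threshold=0.80):
    """Classify an OCT volume from its per-A-line labels.

    a_line_is_glioma: sequence of booleans, one per A-line (True means the
    A-line was classified as glioma-infiltrated).
    Returns (label, fraction of A-lines classified as glioma-infiltrated).
    """
    if not a_line_is_glioma:
        raise ValueError("at least one A-line is required")
    fraction = sum(a_line_is_glioma) / len(a_line_is_glioma)
    # Assumption: the threshold is inclusive.
    label = "glioma-infiltrated" if fraction >= threshold else "noncancerous"
    return label, fraction


# Toy volumes mirroring panel F: 100% and 8% of A-lines flagged as glioma.
print(classify_volume([True] * 100))
print(classify_volume([True] * 8 + [False] * 92))
```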


estimated optical parameter to detect glioma-infiltrated brain tissue (sensitivity ≥90%, specificity ≥80%) using an independent validation set of 59 brain tissue samples (7).

One major limitation of all these OCT quantitative methods, however, is the need to spatially average neighboring OCT A-lines to attain signal-to-noise ratio levels that are adequate for estimating the tissue optical attenuation coefficient. For example, in the study by Kut and colleagues, a B-scan of 1024 A-lines was divided into three adjacent regions, and all the ~341 A-lines in each region were averaged to estimate three optical attenuation coefficient values per B-scan; thus, only three regions per B-scan could be classified using their approach (7). In contrast, A-line averaging is not needed in our method; thus, each A-line can be classified, resulting in an improvement in spatial resolution of ~341:1 compared with the method by Kut and colleagues (7). Moreover, because the spatial resolution of the classification map provided by our method is equal to the lateral optical resolution of the OCT instrument used (16.0 μm for this study), the resolution of the classification map can be as good as the best optical resolution possible with the available state-of-the-art OCT instrumentation technology. The superior spatial resolution enabled by our method is particularly relevant for the accurate detection of glioma-infiltrated brain tissue, which is characterized by a wide range in the degree of cancer infiltration at the tumor margins.

Furthermore, another limitation of the previously reported approaches for OCT-based detection of cancer-infiltrated brain tissue, including the method by Kut and colleagues (7), is the need to perform calibration procedures that could be cumbersome. In comparison, the method reported here only requires a training set of labeled OCT brain tissue scans obtained with the same or a similar OCT instrument to estimate the depth profiles pn used to model each A-line and train the logistic regression classifier.

The classification performance of our computational framework for detecting glioma-infiltrated brain tissue (sensitivity: >90%; specificity: >82%) was quantified following a robust and unbiased double-blinded validation strategy using an independent validation set of 295 brain tissue samples. Moreover, the methods adopted at each stage of our computational framework (preprocessing, feature extraction, classification) were also strategically chosen and designed to enable real-time processing of OCT volumetric images. Thus, another relevant feature of our computational framework is its high processing speed (100,020 A-lines per second, using MATLAB in a Core i7 4790K 4 GHz processor), which would enable processing an arbitrary tissue volume of 5 × 5 × 2.5 mm³ (256 × 256 × 2,048 pixels) in approximately 0.7 seconds. It should be noted that the processing speed of this novel computational framework can be significantly increased by implementing it using object-oriented programming languages and parallel programming and computing. Altogether, the demonstrated classification accuracy and processing speed of our computational framework, once embedded within intraoperative OCT imaging instruments, would enable developing clinically relevant CAD systems for automated, accurate, real-time in situ detection of glioma infiltration during tumor resection surgery.
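The quoted processing time can be checked with simple arithmetic. The sketch below assumes that the 256 × 256 lateral grid contributes one A-line per position and that the 2,048 pixels lie along the depth of each A-line; the function and constant names are illustrative.

```python
A_LINES_PER_SECOND = 100_020  # throughput reported in the text


def processing_time_seconds(n_x, n_y, rate=A_LINES_PER_SECOND):
    """Time to classify every A-line of an n_x by n_y lateral grid."""
    return (n_x * n_y) / rate


# 256 x 256 = 65,536 A-lines, i.e. roughly 0.66 s at the reported rate.
print(round(processing_time_seconds(256, 256), 2))
```

Under these assumptions the result is consistent with the "approximately 0.7 seconds" stated above.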

Study limitations

Although the results of this preliminary study are quite encouraging, it still has some limitations. First, our computational framework showed a misclassification rate of approximately 15%, due in part to the intra-class variability and inter-class similarity observed among the OCT images (Fig. 3B–E), which might indicate the need for using additional OCT features and/or more sophisticated classification methods. Furthermore, from the computational point of view, noisy labeling is expected in medical imaging data, and in our OCT brain tissue datasets in particular, due to the heterogeneity of the glioma tissue samples. To overcome this problem, the use of classification methods with higher tolerance to noisy labels (e.g., weakly supervised learning models) might improve the performance of the model. Moreover, the current OCT databases, when analyzed at the A-line level, could be sufficiently large to allow exploring state-of-the-art classifiers such as those based on Convolutional Neural Networks (CNN), which might also improve the detection of glioma-infiltrated brain tissue.

In addition, it is quite possible that a classification model designed and trained to detect infiltration of a specific tumor type and grade could outperform a more general model designed and trained to detect infiltration of a plurality of brain tumor types and/or grades. To investigate this alternative approach, two additional classification models were trained using either the available low-grade or high-grade glioma-infiltrated samples, and their results are reported as Supplementary Material.

Compared with the original more general classification model, these more targeted classifiers performed with lower sensitivity but better specificity. These observations suggest that once more comprehensive databases become available, more specific classification models can be developed and compared against more general ones to determine an optimal approach.

Another limitation is that the OCT datasets used for training and validating our computational framework were acquired from freshly resected noncancerous, low-grade glioma-infiltrated, and high-grade glioma-infiltrated brain tissue samples from glioma patients undergoing tumor resection surgery. Because the training and validation datasets used in this study cannot be considered representative of the "universe" of tumors, the training of our computational framework needs to be repeated using more comprehensive databases of in vivo OCT volumes that include other tumor types. Once optimized, the computational framework would be embedded into an intraoperative OCT imaging instrument, and the performance of the resulting OCT-guided surgical tool for automated real-time in situ intraoperative detection of brain tumor infiltration will have to be quantified in a prospective clinical study, as depicted in Fig. 4.

Conclusions

In conclusion, we have introduced a novel computational method for OCT-based automated discrimination of glioma-infiltrated from noncancerous brain tissue. Our method applies a modeling approach to parametrize the information encoded in the shape of each depth-dependent OCT intensity signal (A-line) and uses the A-line model parameters as features within a machine learning classification scheme. Because our method can process OCT images at their original high spatial resolution and does not require performing calibration procedures using tissue phantoms, it overcomes major challenges of previously reported methods. Because of its demonstrated detection accuracy, robustness, and low computational cost, this method could enable the development of faster and more accurate OCT-guided surgical tools for continuous, real-time in situ intraoperative detection of glioma infiltration at any stage, facilitating extensive glioma resection and improved surgical outcomes for patients with glioma.

Disclosure of Potential Conflicts of Interest

X. Li reports receiving commercial research grants from MicroTech LLC, holds ownership interest (or patents) in Insight Photonics and MicroTech LLC, and is a consultant/advisory board member for SIBET, Chinese Academy of Science. No potential conflicts of interest were disclosed by the other authors.

Authors' Contributions

Conception and design: R.M. Juarez-Chambi, J.J. Rico-Jimenez, J. Xi, A. Quinones-Hinojosa, X. Li, J.A. Jo
Development of methodology: R.M. Juarez-Chambi, J.J. Rico-Jimenez, K.L. Chaichana, J. Xi
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C. Kut, F.J. Rodriguez, A. Quinones-Hinojosa, X. Li, J.A. Jo
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): R.M. Juarez-Chambi, C. Kut, K.L. Chaichana, A. Quinones-Hinojosa, X. Li, J.A. Jo
Writing, review, and/or revision of the manuscript: R.M. Juarez-Chambi, C. Kut, J.J. Rico-Jimenez, K.L. Chaichana, F.J. Rodriguez, A. Quinones-Hinojosa, X. Li, J.A. Jo
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): R.M. Juarez-Chambi, X. Li
Study supervision: X. Li, J.A. Jo
Other (design of BEAE modeling): D.U. Campos-Delgado

Acknowledgments

This research was partially supported by grants from the NIH (grants R01CA218739, R01CA200399), the Cancer Prevention and Research Institute of Texas (grant RP180588), the Coulter H. Wallace Foundation, NSF-NCSA XSEDE ASC170017, FONDECYT-CONCYTEC Fellowship. A. Quinones-Hinojosa acknowledged the support by the William J. and Charles H. Mayo Professorship and a Mayo Clinician Investigator award. K.L. Chaichana acknowledged the support by the Mayo RACER award.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received March 14, 2019; revised May 24, 2019; accepted July 12, 2019; published first July 17, 2019.

Figure 4.

Methodology for an OCT-based CAD system. A, After a preoperative MRI for evaluation, we will use an OCT probe (B) for intraoperative image guidance during surgery (C). Finally, after preprocessing, feature extraction, and classification (D), the automated real-time classification using a volumetric color-coded map of the tissue will be displayed for real-time guidance.

References

1. Almeida JP, Chaichana KL, Rincon-Torroella J, Quinones-Hinojosa A. The value of extent of resection of glioblastomas: clinical evidence and current approach. Curr Neurol Neurosci Rep 2015;15:517.

2. Marko NF, Weil RJ, Schroeder JL, Lang FF, Suki D, Sawaya RE. Extent of resection of glioblastoma revisited: personalized survival modeling facilitates more accurate survival prediction and supports a maximum-safe-resection approach to surgery. J Clin Oncol 2014;32:774–82.

3. Chaichana KL, Zadnik P, Weingart JD, Olivi A, Gallia GL, Blakeley J, et al. Multiple resections for patients with glioblastoma: prolonging survival. J Neurosurg 2013;118:812–20.

4. Lara-Velazquez M, Al-Kharboosh R, Jeanneret S, Vazquez-Ramos C, Mahato D, Tavanaiepour D, et al. Advances in brain tumor surgery for glioblastoma in adults. Brain Sci 2017;7. doi: 10.3390/brainsci7120166.

5. Eseonu CI, Eguia F, ReFaey K, Garcia O, Rodriguez FJ, Chaichana K, et al. Comparative volumetric analysis of the extent of resection of molecularly and histologically distinct low grade gliomas and its role on survival. J Neurooncol 2017;134:65–74.

6. Chaichana KL, Cabrera-Aldana EE, Jusue-Torres I, Wijesekera O, Olivi A, Rahman M, et al. When gross total resection of a glioblastoma is possible, how much resection should be achieved? World Neurosurg 2014;82:E257–65.

7. Kut C, Chaichana KL, Xi J, Raza SM, Ye X, McVeigh ER, et al. Detection of human brain cancer infiltration ex vivo and in vivo using quantitative optical coherence tomography. Sci Transl Med 2015;7:292ra100.

8. McGirt MJ, Mukherjee D, Chaichana KL, Than KD, Weingart JD, Quinones-Hinojosa A. Association of surgically acquired motor and language deficits on overall survival after resection of glioblastoma multiforme. Neurosurgery 2009;65:463–9.

9. Chaichana KL, Jusue-Torres I, Lemos AM, Gokaslan A, Cabrera-Aldana EE, Ashary A, et al. The butterfly effect on glioblastoma: is volumetric extent of resection more effective than biopsy for these tumors? J Neurooncol 2014;120:625–34.

10. Rahman M, Abbatematteo J, De Leo EK, Kubilis PS, Vaziri S, Bova F, et al. The effects of new or worsened postoperative neurological deficits on survival of patients with glioblastoma. J Neurosurg 2017;127:123–31.

11. Mehranian A, Arabi H, Zaidi H. Quantitative analysis of MRI-guided attenuation correction techniques in time-of-flight brain PET/MRI. Neuroimage 2016;130:123–33.

12. Spivak CJ, Pirouzmand F. Comparison of the reliability of brain lesion localization when using traditional and stereotactic image-guided techniques: a prospective study. J Neurosurg 2005;103:424–7.

13. Rygh OM, Selbekk T, Torp SH, Lydersen S, Hernes TA, Unsgaard G. Comparison of navigated 3D ultrasound findings with histopathology in subsequent phases of glioblastoma resection. Acta Neurochir 2008;150:1033–41.

14. Valdes PA, Kim A, Brantsch M, Niu C, Moses ZB, Tosteson TD, et al. delta-aminolevulinic acid-induced protoporphyrin IX concentration correlates with histopathologic markers of malignancy in human gliomas: the need for quantitative fluorescence-guided resection to identify regions of increasing malignancy. Neuro Oncol 2011;13:846–56.

15. Ando T, Kobayashi E, Liao H, Maruyama T, Muragaki Y, Iseki H, et al. Precise comparison of protoporphyrin IX fluorescence spectra with pathological results for brain tumor tissue identification. Brain Tumor Pathol 2011;28:43–51.

16. Montcel B, Mahieu-Williame L, Armoiry X, Meyronet D, Guyotat J. Two-peaked 5-ALA-induced PpIX fluorescence emission spectrum distinguishes glioblastomas from low grade gliomas and infiltrative component of glioblastomas. Biomed Opt Express 2013;4:548–58.

17. Tashibu K. Analysis of water content in rat brain using Raman spectroscopy. No To Shinkei 1990;42:999–1004. [article in Japanese].

18. Koljenovic S, Choo-Smith LP, Bakker Schut TC, Kros JM, van den Berge HJ, Puppels GJ. Discriminating vital tumor from necrotic tissue in human glioblastoma tissue samples by Raman spectroscopy. Lab Invest 2002;82:1265–77.

19. Kalkanis SN, Kast RE, Rosenblum ML, Mikkelsen T, Yurgelevic SM, Nelson KM, et al. Raman spectroscopy to distinguish grey matter, necrosis, and glioblastoma multiforme in frozen tissue sections. J Neurooncol 2014;116:477–85.

20. Kast R, Auner G, Yurgelevic S, Broadbent B, Raghunathan A, Poisson LM, et al. Identification of regions of normal grey matter and white matter from pathologic glioblastoma and necrosis in frozen sections using Raman imaging. J Neurooncol 2015;125:287–95.

21. Kast RE, Auner GW, Rosenblum ML, Mikkelsen T, Yurgelevic SM, Raghunathan A, et al. Raman molecular imaging of brain frozen tissue sections. J Neurooncol 2014;120:55–62.

22. Ji M, Orringer DA, Freudiger CW, Ramkissoon S, Liu X, Lau D, et al. Rapid, label-free detection of brain tumors with stimulated Raman scattering microscopy. Sci Transl Med 2013;5:201ra119.

23. Desroches J, Jermyn M, Pinto M, Picot F, Tremblay MA, Obaid S, et al. A new method using Raman spectroscopy for in vivo targeted brain cancer tissue biopsy. Sci Rep 2018;8:1792.

24. Zhang J, Fan Y, He M, Ma X, Song Y, Liu M, et al. Accuracy of Raman spectroscopy in differentiating brain tumor from normal brain tissue. Oncotarget 2017;8:36824–31.

25. Evans CL, Xu X, Kesari S, Xie XS, Wong ST, Young GS. Chemically-selective imaging of brain structures with CARS microscopy. Opt Express 2007;15:12076–87.

26. Jermyn M, Mok K, Mercier J, Desroches J, Pichette J, Saint-Arnaud K, et al. Intraoperative brain cancer detection with Raman spectroscopy in humans. Sci Transl Med 2015;7:274ra19.

27. Hollon TC, Lewis S, Pandian B, Niknafs YS, Garrard MR, Garton H, et al. Rapid intraoperative diagnosis of pediatric brain tumors using stimulated Raman histology. Cancer Res 2018;78:278–89.

28. Fu Y, Huff TB, Wang HW, Wang H, Cheng JX. Ex vivo and in vivo imaging of myelin fibers in mouse brain by coherent anti-Stokes Raman scattering microscopy. Opt Express 2008;16:19396–409.

29. Hollon T, Lewis S, Freudiger CW, Sunney Xie X, Orringer DA. Improving the accuracy of brain tumor surgery via Raman-based technology. Neurosurg Focus 2016;40:E9.

30. Fujimoto JG, Pitris C, Boppart SA, Brezinski ME. Optical coherence tomography: an emerging technology for biomedical imaging and optical biopsy. Neoplasia 2000;2:9–25.

31. Boppart SA, Brezinski ME, Pitris C, Fujimoto JG. Optical coherence tomography for neurosurgical imaging of human intracortical melanoma. Neurosurgery 1998;43:834–41.

32. Chong SP, Merkle CW, Cooke DF, Zhang T, Radhakrishnan H, Krubitzer L, et al. Noninvasive, in vivo imaging of subcortical mouse brain regions with 1.7 μm optical coherence tomography. Opt Lett 2015;40:4911–4.

33. Bohringer HJ, Lankenau E, Stellmacher F, Reusche E, Hüttmann G, Giese A. Imaging of human brain tumor tissue by near-infrared laser coherence tomography. Acta Neurochir 2009;151:507–17.

34. Kut C, Xi J, Chaichana KL, Rincon-Torroella J, Rodriguez F, McVeigh E, et al. Real-time, label-free optical property mapping for detecting glioma invasion with SSOCT for potential guidance of surgical intervention. Accepted for Oral Presentation at: 2015 SPIE Photonics West BIOS. February 2015. The Moscone Center, San Francisco, CA.

35. Bohringer HJ, Boller D, Leppert J, Knopp U, Lankenau E, Reusche E, et al. Time-domain and spectral-domain optical coherence tomography in the analysis of brain tumor tissue. Lasers Surg Med 2006;38:588–97.

36. Bizheva K, Unterhuber A, Hermann B, Povazay B, Sattmann H, Fercher AF, et al. Imaging ex vivo healthy and pathological human brain tissue with ultra-high-resolution optical coherence tomography. J Biomed Opt 2005;10:11006.

37. Canny J. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 1986;8:679–98.

38. Gutierrez-Navarro O, Campos-Delgado DU, Arce-Santana ER, Mendez MO, Jo JA. Blind end-member and abundance extraction for multispectral fluorescence lifetime imaging microscopy data. IEEE J Biomed Health Inform 2014;18:606–17.

39. Rico-Jimenez JJ, Campos-Delgado DU, Villiger M, Otsuka K, Bouma BE, Jo JA. Automatic classification of atherosclerotic plaques imaged with intravascular OCT. Biomed Opt Express 2016;7:4069–85.

40. Huang HH, Xu T, Yang J. Comparing logistic regression, support vector machines, and permanental classification methods in predicting hypertension. BMC Proc 2014;8:S96.

41. Nickel K, Renovanz M, König J, Stöckelmaier L, Hickmann AK, Nadji-Ohl M, et al. The patients' view: impact of the extent of resection, intraoperative imaging, and awake surgery on health-related quality of life in high-grade glioma patients-results of a multicenter cross-sectional study. Neurosurg Rev 2018;41:207–19.

42. Chaichana KL, Jusue-Torres I, Navarro-Ramirez R, Raza SM, Pascual-Gallego M, Ibrahim A, et al. Establishing percent resection and residual volume thresholds affecting survival and recurrence for patients with newly diagnosed intracranial glioblastoma. Neuro Oncol 2014;16:113–22.

43. Chaichana KL, Garzon-Muvdi T, Parker S, Weingart JD, Olivi A, Bennett R, et al. Supratentorial glioblastoma multiforme: the role of surgical resection versus biopsy among older patients. Ann Surg Oncol 2011;18:239–45.

44. Petrecca K, Guiot MC, Panet-Raymond V, Souhami L. Failure pattern following complete resection plus radiotherapy and temozolomide is at the resection margin in patients with glioblastoma. J Neurooncol 2013;111:19–23.

45. Coburger J, Scheuerle A, Pala A, Thal D, Wirtz CR, König R. Histopathological insights on imaging results of intraoperative magnetic resonance imaging, 5-aminolevulinic acid, and intraoperative ultrasound in glioblastoma surgery. Neurosurgery 2017;81:165–74.

46. Mansouri A, Mansouri S, Hachem LD, Klironomos G, Vogelbaum MA, Bernstein M, et al. The role of 5-aminolevulinic acid in enhancing surgery for high-grade glioma, its current boundaries, and future perspectives: a systematic review. Cancer 2016;122:2469–78.

47. Utsuki S, Oka H, Sato S, Suzuki S, Shimizu S, Tanaka S, et al. Possibility of using laser spectroscopy for the intraoperative detection of nonfluorescing brain tumors and the boundaries of brain tumor infiltrates - technical note. J Neurosurg 2006;104:618–20.

48. Cordova JS, Gurbani SS, Holder CA, Olson JJ, Schreibmann E, Shi R, et al. Semi-automated volumetric and morphological assessment of glioblastoma resection with fluorescence-guided surgery. Mol Imaging Biol 2016;18:454–62.

49. Yuan W, Kut C, Liang W, Li X. Robust and fast characterization of OCT-based optical attenuation using a novel frequency-domain algorithm for brain cancer detection. Sci Rep 2017;7:44909.


Cancer "-omics"

The Clonal Evolution of Metastatic Osteosarcoma as Shaped by Cisplatin Treatment

Samuel W. Brady1, Xiaotu Ma1, Armita Bahrami2, Gryte Satas3, Gang Wu1, Scott Newman1, Michael Rusch1, Daniel K. Putnam1, Heather L. Mulder1, Donald A. Yergeau4, Michael N. Edmonson1, John Easton1, Ludmil B. Alexandrov5, Xiang Chen1, Elaine R. Mardis6, Richard K. Wilson6, James R. Downing2, Alberto S. Pappo7, Benjamin J. Raphael3, Michael A. Dyer8, and Jinghui Zhang1

Abstract

To investigate the genomic evolution of metastatic pediatric osteosarcoma, we performed whole-genome and targeted deep sequencing on 14 osteosarcoma metastases and two primary tumors from four patients (two to eight samples per patient). All four patients harbored ancestral (truncal) somatic variants resulting in TP53 inactivation and cell-cycle aberrations, followed by divergence into relapse-specific lineages exhibiting a cisplatin-induced mutation signature. In three of the four patients, the cisplatin signature accounted for >40% of mutations detected in the metastatic samples. Mutations potentially acquired during cisplatin treatment included NF1 missense mutations of uncertain significance in two patients and a KIT G565R activating mutation in one patient. Three of four patients demonstrated widespread ploidy differences between samples from the same patient. Single-cell seeding of metastasis was detected in most metastatic samples. Cross-seeding between metastatic sites was observed in one patient, whereas in another patient a minor clone from the primary tumor seeded both metastases analyzed. These results reveal extensive clonal heterogeneity in metastatic osteosarcoma, much of which is likely cisplatin-induced.

Implications: The extent and consequences of chemotherapy-induced damage in pediatric cancers are unknown. We found that cisplatin treatment can potentially double the mutational burden in osteosarcoma, which has implications for optimizing therapy for recurrent, chemotherapy-resistant disease.

Introduction

Increased understanding of intrapatient tumor heterogeneity has fueled progress in many cancers (1). For example, analysis of heterogeneity can reveal subclonal drug resistance mechanisms (2) and early mutation events that can be preferentially targeted (3). However, the clonal heterogeneity of metastatic pediatric solid tumors such as osteosarcoma is not well understood.

Osteosarcoma is a cancer arising from bone (4) and occurs in children and adolescents during active bone growth (5). Current therapy includes chemotherapy with the MAP (methotrexate, doxorubicin, and cisplatin) regimen and surgery (4). Five-year survival from time of diagnosis is 60% to 70%, with little improvement in the last three decades (5–7). Osteosarcoma most frequently metastasizes to the lungs, which accounts for most deaths (4, 5).

To better understand tumor heterogeneity and clonal evolution in osteosarcoma, we analyzed 14 metastatic samples and two primary tumors from four patients as part of the St. Jude/Washington University Pediatric Cancer Genome Project (PCGP), which revealed substantial intrapatient heterogeneity associated with cisplatin treatment. These results demonstrate the impact of chemotherapy on shaping the clonal architecture of metastatic osteosarcoma.

Materials and Methods

Sample information

Samples were used under institutional review board approval, in accordance with the Declaration of Helsinki, at St. Jude Children's Research Hospital and Washington University in St. Louis, and written informed consent and/or assent from patients and/or guardians was obtained. SJOS001101_M1 was obtained during thoracotomy of the left lung ~48 weeks postdiagnosis and was a lung slice with two discrete metastatic nodules. SJOS001101 samples M2-M8 were obtained 15 hours postmortem (Supplementary Table S1). Five of the 16 samples were included in a previous study focused on identifying significantly mutated genes in osteosarcoma: SJOS001107_M1, SJOS001107_M2, SJOS001105_D1, SJOS010_D, and SJOS010_M (8).

1Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, Tennessee. 2Department of Pathology, St. Jude Children's Research Hospital, Memphis, Tennessee. 3Department of Computer Science, Princeton University, Princeton, New Jersey. 4UB Genomics and Bioinformatics Core, University at Buffalo, Buffalo, New York. 5Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California. 6Institute for Genomic Medicine, Nationwide Children's Hospital and The Ohio State University College of Medicine, Columbus, Ohio. 7Department of Oncology, St. Jude Children's Research Hospital, Memphis, Tennessee. 8Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee.

Note: Supplementary data for this article are available at Molecular Cancer Research Online (http://mcr.aacrjournals.org/).

S.W. Brady and X. Ma contributed equally to this article.

Corresponding Authors: Jinghui Zhang, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105. Phone: 901-595-7069; E-mail: [email protected]; and Michael A. Dyer, [email protected]

doi: 10.1158/1541-7786.MCR-18-0620

©2019 American Association for Cancer Research.

Molecular Cancer Research

Whole-genome sequencing and capture validation

Whole-genome sequencing (WGS) was performed as described (8). Capture validation was performed using custom NimbleGen SeqCap EZ solution bait sets (Roche) and sequencing was performed as described (8). In SJOS001101, 91% of WGS somatic SNVs were analyzed by capture validation; in SJOS010, 84% were analyzed. Thus, capture validation data were used for the Fig. 1 heatmaps of these patients, as they approached whole-genome level. In SJOS001105 and SJOS001107, a lesser percentage of SNVs were analyzed by capture validation (31% and 78%, respectively). Thus, Fig. 1 heatmaps for these patients rely on WGS but later clonal evolution analysis relies on capture validation. Whole-genome coverage was measured for WGS; for capture validation, the number of counts (mutant or wild-type) at the target site was used to quantify coverage. Indels, CNVs, and structural variants (SV) were computed from WGS.

SNV and indel variant identification and clustering

WGS reads were aligned to GRCh37-lite with BWA (9) and somatic SNVs/indels were called with Bambino (10) followed by a postprocess which removes paralogous variants and sequencing artifacts caused by poor quality or alignment artifacts (11). Only exonic indels were analyzed. Capture validation data were aligned to GRCh37-lite and mutant and wild-type counts for each SNV were determined using an in-house pipeline which determines mutant and total counts of a pre-identified mutation list, while taking into account read quality (unpublished). Validation rate of WGS SNVs was determined by comparing WGS VAFs with capture validation VAFs for each variant with >15 coverage in both platforms and a positive WGS call in the sample. Variants were validated if Fisher exact test comparing capture validation mutant reads versus nonmutant reads in germline versus tumor yielded P < 0.05 in at least one sample from the patient. We also excluded variants with germline VAF ≥ 0.01, low germline coverage (≤20), or low tumor coverage (≤15). Of the 21,963 high-quality SNVs selected for validation, 21,779 (99.2%) were validated. For SJOS001105 and SJOS001107, heatmaps in Fig. 1 were generated from WGS variants after post-processing while the remaining analysis used capture validation sequencing. Kataegis was analyzed as described (8).
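The validation criterion above (Fisher exact test on mutant versus nonmutant reads in germline versus tumor, P < 0.05 in at least one sample) can be illustrated with a small standard-library sketch. This is not the in-house pipeline, which is unpublished; the function names are hypothetical, the use of a two-sided test is an assumption, and the additional VAF and coverage exclusion filters are not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_value = sum(p for p in map(prob, range(lo, hi + 1))
                  if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def is_validated(tumor_samples, germline_mut, germline_ref, alpha=0.05):
    """tumor_samples: list of (mutant_reads, nonmutant_reads), one per sample.

    A variant counts as validated if any sample from the patient differs
    significantly from germline (assumption: two-sided test).
    """
    return any(
        fisher_exact_two_sided(mut, ref, germline_mut, germline_ref) < alpha
        for mut, ref in tumor_samples
    )


# Toy read counts: 20/100 mutant reads in tumor versus 0/100 in germline.
print(is_validated([(20, 80)], germline_mut=0, germline_ref=100))
```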

VAFs were adjusted for tumor purity as follows. Let p represent tumor purity as a proportion between 0 and 1, and c represent the integer copy number of the region containing the variant (determined by rounding copy number to nearest integer). The proportion of total reads contributed by tumor cells at the mutation site (t) is:

$$t = \frac{c\,p}{2(1 - p) + c\,p}$$

Here, 2(1 − p) represents the normal contribution and c(p) represents the tumor contribution. The adjusted VAF is:

$$\mathrm{VAF}_{adj} = \frac{\mathrm{VAF}}{t}$$
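The purity adjustment can be written out as a short Python sketch. The function names are illustrative; the formula is the one given in the text, with t equal to c·p divided by 2(1 − p) + c·p and the adjusted VAF equal to VAF divided by t. Note that Python's built-in round sends exact halves to the nearest even integer, which may differ from the rounding rule used by the authors.

```python
def tumor_read_fraction(purity, copy_number):
    """Proportion of reads at the site contributed by tumor cells (t)."""
    c = round(copy_number)  # integer copy number of the region
    tumor = c * purity            # tumor contribution, c(p)
    normal = 2 * (1 - purity)     # normal (diploid) contribution, 2(1 - p)
    return tumor / (normal + tumor)


def adjusted_vaf(vaf, purity, copy_number):
    """Purity-adjusted variant allele fraction, VAF / t."""
    t = tumor_read_fraction(purity, copy_number)
    if t == 0:
        raise ValueError("no tumor reads expected at this site")
    return vaf / t


# Example: a heterozygous variant in a diploid region of a 50%-pure tumor
# is observed at VAF 0.25 and adjusts to 0.5.
print(adjusted_vaf(0.25, purity=0.5, copy_number=2.0))
```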

SNV clusters were identified by determining which samples the SNV was present in at adjusted VAF ≥0.05 (12). SNV clusters with <250 SNVs, except private clusters, were excluded from branching evolution analysis, to ignore SNV clusters caused by SNV dropout from private copy losses, but were included in Supplementary Figs. S9 and S11 ("Undefined" SNV cluster).

Copy number variant identification

CNVs were identified using CONSERTING (13) from WGS. Segmented CNV data were adjusted for normal contamination as described (12), which also determined tumor purity. Tumor purity was corroborated by B-allele frequencies. Regions with 1.8 to 2.2 copies were considered diploid and copy-altered otherwise. Cancer Gene Census (14) genes were downloaded from https://cancer.sanger.ac.uk/census. To perform CNV Euclidean distance clustering, we sampled segmented copy data at every 10,000th genomic position and used dist and hclust functions in R. Absolute copies reported for individual genes in manuscript text are rounded to nearest integer.
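The diploid rule and the distance step can be illustrated as follows. The analysis itself used R's dist and hclust; this Python sketch is a stand-in that only reproduces the subsampling and the pairwise Euclidean distances (the hierarchical clustering step is omitted), and it assumes each sample's copy number has already been expanded to one value per genomic position. All names are hypothetical.

```python
import math

DIPLOID_RANGE = (1.8, 2.2)  # copies; anything outside is copy-altered


def copy_state(copies: float) -> str:
    """Classify a region as neutral, loss, or gain using the 1.8-2.2 rule."""
    low, high = DIPLOID_RANGE
    if copies < low:
        return "loss"
    if copies > high:
        return "gain"
    return "neutral"


def subsample(values, step=10_000):
    """Keep every step-th position of a per-position copy number track."""
    return values[::step]


def pairwise_distances(profiles, step=10_000):
    """Euclidean distance between every pair of subsampled copy profiles."""
    sampled = {name: subsample(track, step) for name, track in profiles.items()}
    names = sorted(sampled)
    return {
        (a, b): math.dist(sampled[a], sampled[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

The resulting distances are the kind of input a hierarchical clustering routine would consume; samples with similar copy profiles end up with small pairwise distances.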

Mutational signature analysis

Trinucleotide context for SNVs was determined with an in-house script (12). NMF was used to extract mutational signatures from PCGP osteosarcomas (WGS; ref. 8), including the four patients on which this study focuses, using SigProfiler (15). Three signatures were optimal. Signature definitions for 30 COSMIC signatures and our extracted cisplatin signature were used as input to SigProfilerSingleSample to query signature strengths in samples (or SNV clusters), with the following parameters: signatures 1, 3, and 5 were included as signatures "included in all samples regardless of rules or sparsity," with signature 3 added to the default signatures 1 and 5 due to strong presence in osteosarcoma (16); signatures 2 and 13 (APOBEC-associated) considered "connected"; and an improvement of accuracy of 0.02 (rather than default 0.05) in add_all_single_signatures. For evolutionary tree construction, signature weights in SNV clusters were superimposed on evolutionary trees (17).
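For readers unfamiliar with trinucleotide context, the following sketch shows one conventional way to assign an SNV to one of the 96 channels, reporting each substitution from the pyrimidine (C or T) strand. It illustrates the standard convention only and is not the in-house script; the function names and the input format (5′ base, reference base, alternate base, 3′ base on the reference strand) are assumptions.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
# 6 substitutions x 4 possible 5' bases x 4 possible 3' bases = 96 channels.
CHANNELS = [
    f"{five}[{sub}]{three}"
    for sub in SUBSTITUTIONS
    for five, three in product("ACGT", repeat=2)
]


def trinucleotide_channel(five: str, ref: str, alt: str, three: str) -> str:
    """Return the pyrimidine-centered channel label, e.g. 'C[C>T]C'."""
    if ref == alt:
        raise ValueError("reference and alternate bases must differ")
    if ref in ("A", "G"):
        # Purine reference: describe the change on the opposite strand,
        # which swaps and complements the flanking bases.
        five, three = _COMPLEMENT[three], _COMPLEMENT[five]
        ref, alt = _COMPLEMENT[ref], _COMPLEMENT[alt]
    return f"{five}[{ref}>{alt}]{three}"


def spectrum(snvs):
    """Proportion of SNVs in each of the 96 channels, in CHANNELS order."""
    counts = Counter(trinucleotide_channel(*snv) for snv in snvs)
    total = sum(counts.values())
    return [counts[ch] / total if total else 0.0 for ch in CHANNELS]
```

Under this convention a G>A change flanked by G on both sides is counted as C[C>T]C, one of the contexts highlighted for the novel signature in the Results.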

Probabilities of being cisplatin-induced for NF1 and KIT variants

To calculate probabilities that NF1 and KIT SNVs were cisplatin-induced, we used an approach described previously (18). SNV clusters containing NF1 and KIT variants were >0.95 explained (cosine similarity) by the signatures queried.

Evolutionary tree construction

Evolutionary trees were constructed using principles described previously (3, 19). Each branch represents an SNV cluster. Branch lengths are proportional to number of SNVs. Trees were organized such that each sample (tree bottom) possesses the SNVs on the clusters attached to it above the sample. For SJOS001101 M4, no private SNVs were identified due to low tumor purity (<20%). For SJOS001101 M1, multiple lineages were present, and the most clearly defined lineage (the M1-M5-M6 cluster) was represented.

Subclone analysis

Subclone analysis was performed using principles described previously (12, 20). SNVs that were diploid (copy number 1.8–2.2) in a single sample (Supplementary Fig. S8) or across all samples when possible (Supplementary Figs. S9, S10, and S11C) were used for clonal analysis. SJOS001107 had <30 such variants and was not included in density analysis, and SNVs in regions with one to four copies were analyzed by two-dimensional and three-sample VAF plots (Supplementary Fig. S11B) whereas in

Brady et al.

Mol Cancer Res; 17(4) April 2019, Molecular Cancer Research


[Figure 1 graphic: panels A–D, each showing a treatment timeline (weeks postdiagnosis), a heatmap of somatic SNVs colored by adjusted VAF (scale 0 to 1), and an evolutionary tree with truncal, shared, and private branches.]

Figure 1.

Osteosarcoma treatment history and evolution. Treatment history and mutational heterogeneity in patients with osteosarcoma SJOS001101 (A), SJOS001105 (B), SJOS001107 (C), and SJOS010 (D). Left side shows platinum treatment time periods and locations and times (in weeks postdiagnosis) of sample acquisition. Red samples indicate samples on which WGS was performed, whereas gray indicates nonsequenced samples. Right shows heatmap of somatic SNVs, with color indicating VAF adjusted for tumor purity. Possible driver SNVs along with indels, SVs, and CNVs are indicated next to SNV cluster; clusters are colored to the right, and match tree branch colors at far right. Evolutionary tree branches are proportional to the number of mutations, with truncal variants at top and private variants at bottom, and driver variants indicated. In SJOS001101 (A), at week 70 autopsy the lung was filled with numerous contiguous bulky metastases which is indicated by diffuse gray color, and sample M1 at week 48 was two adjacent metastatic lesions (dotted lines). Most lesions sampled at autopsy were in the lung except M2 (right lateral chest wall), M5 (superior mediastinal mass), and M8 (diaphragmatic and left lower lobe of lung). SJOS001101 (A) and SJOS010 (D) heatmaps are based on capture sequencing, whereas SJOS001105 (B) and SJOS001107 (C) are based on WGS.

Cisplatin Shapes Osteosarcoma Clonal Evolution


SJOS001105 variants diploid in D1 and D2 and in three-copy regions in R1 were analyzed (Supplementary Fig. S11A), due to widespread copy gains in R1 leading to fewer than 10 pan-diploid SNVs in this patient. Overlapping VAF density peaks indicated co-occurring SNV clusters (Supplementary Fig. S8). Private SNV clusters with two peaks, including a minor peak with VAF <0.4, revealed descendant clone(s). Pairwise sample VAF comparisons clarified minor admixture of clones between sites. For cross-seeding analysis between SJOS001101 samples M5 and M6, the median M5 VAF (of M6-private SNVs detected in M5 at VAF >0, all of which were <0.05 due to our clustering method) was calculated and multiplied by two to determine CCF of the M6 site-specific founder clone in M5. Descendant clone(s) revealed by secondary private VAF density peaks (Supplementary Fig. S8) were shown as a single descendant clone in Fig. 4A by a subcircle, although it is possible that descendants-of-descendants or multiple independent descendants were present. SJOS001107 metastasizing clone CCF was estimated <5% but could not be determined with certainty due to lack of diploid variants.
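The cross-seeding calculation (median VAF multiplied by two) can be written out as a short Python sketch. The factor of two reflects the assumption that the SNVs are heterozygous in diploid regions, so a clone's VAF is half its cancer cell fraction; the function name and the example VAFs are hypothetical.

```python
from statistics import median


def ccf_from_diploid_vafs(vafs):
    """Estimate a clone's CCF as 2 x the median VAF of its detected SNVs.

    Only SNVs detected at VAF > 0 are used, mirroring the description of
    the M5/M6 cross-seeding analysis; VAFs are assumed purity-adjusted.
    """
    detected = [v for v in vafs if v > 0]
    if not detected:
        raise ValueError("no SNVs detected at VAF > 0")
    return 2 * median(detected)


# Hypothetical VAFs of M6-private SNVs as observed in M5.
m6_private_in_m5 = [0.0, 0.01, 0.02, 0.03, 0.0]
ccf_estimate = ccf_from_diploid_vafs(m6_private_in_m5)
```

With these illustrative values the median detected VAF is 0.02, giving an estimated CCF of 0.04 for the M6 founder clone in M5.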

PyClone (21) and MACHINA (22) were used in SJOS001101. We clustered SNVs diploid in all samples thus: (1) we classified variants based on presence or absence in each sample, and (2) within these groups, clustered on VAF using PyClone. Mutations were "present" if the posterior probability of the variant's presence was ≥0.95. This procedure yielded 35 clusters; we analyzed the 17 clusters with ≥4 mutations.

We then performed clonal analysis using MACHINA in tree inference mode. We calculated 95% confidence bounds on cluster frequency (22), assuming heterozygosity except (1) clusters 1 and 2, which were likely homozygous in M1 and M4-M8 following LOH; (2) cluster 13, which was interpreted as mutations lost in the LOH event.

We used a Bayesian model to determine the posterior probability of a variant's presence. Let X = 1 indicate presence of a variant in a sample and X = 0 indicate absence. The posterior probability of a variant's presence is:

$$\Pr(X = x \mid V, T) \propto \Pr(V \mid T, X = x)\,\Pr(X = x), \tag{1}$$

where V is the number of variant reads, and T is the total reads covering the locus. We modeled sequencing as a binomial process with probability of success f = 0 if the mutation is absent and f ~ Beta(a = 1, b = 1) if present. We used a uniform prior probability Pr(X = x) = 0.5. Thus, we have

$$\Pr(X = 0 \mid V, T) \propto 0.5 \times \mathrm{Binomial}(V \mid f = 0, T), \tag{2}$$

$$\Pr(X = 1 \mid V, T) \propto 0.5 \times \mathrm{Beta\text{-}Binomial}(V \mid a, b). \tag{3}$$

For a mutation j, we assigned sample profile $\hat{x}_j = [\hat{x}_{1,j}, \ldots, \hat{x}_{m,j}]$ such that for sample i, $\hat{x}_{i,j} = 1$ provided that $\Pr(X_{i,j} = 1 \mid V_{i,j}, T_{i,j}) > 0.95$. We ran PyClone using beta-binomial density and default parameters on each set of mutations $V_x$, such that all mutations $i \in V_x$ have the same sample profile $\hat{x}_i = x$. This yielded a set of cluster assignments $C_x$ for each sample profile x. To obtain input for MACHINA, we took the union of sample profiles $C = \bigcup_x C_x$.
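Equations 1 to 3 can be evaluated directly, as in the sketch below. This is an illustrative implementation of the model as written, not the authors' code; function names are assumptions, and the beta-binomial is computed over T trials with the stated parameters a = b = 1 as defaults.

```python
from math import exp, lgamma


def _log_beta(a: float, b: float) -> float:
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def _log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def beta_binomial_pmf(v: int, t: int, a: float = 1.0, b: float = 1.0) -> float:
    """Likelihood of v variant reads out of t when f ~ Beta(a, b)."""
    return exp(_log_choose(t, v) + _log_beta(v + a, t - v + b) - _log_beta(a, b))


def binomial_pmf_absent(v: int, t: int) -> float:
    """Binomial likelihood with f = 0: only v = 0 has nonzero probability."""
    return 1.0 if v == 0 else 0.0


def posterior_present(v: int, t: int, prior: float = 0.5,
                      a: float = 1.0, b: float = 1.0) -> float:
    """Normalized posterior Pr(X = 1 | V = v, T = t)."""
    present = prior * beta_binomial_pmf(v, t, a, b)
    absent = (1.0 - prior) * binomial_pmf_absent(v, t)
    return present / (present + absent)


def is_present(v: int, t: int, threshold: float = 0.95) -> bool:
    """Sample-profile entry: 1 when the posterior exceeds the threshold."""
    return posterior_present(v, t) > threshold
```

Two properties follow from the equations exactly as stated. With a = b = 1 the beta-binomial likelihood is 1/(T + 1) for every V, and because f = 0 allows no variant reads at all, any V ≥ 1 gives a posterior of 1, whereas V = 0 gives 1/(T + 2). A model with a nonzero error rate for absent variants would behave differently; that would be a change to the stated model, not part of it.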

To test whether SJOS001107 cluster M1-M2 variants, which had low VAFs in D1, represented true signal, we used CleanDeepSeq (Ma and colleagues, under review). CleanDeepSeq removes reads which have problematic alignment, discordant bases in regions overlapped by forward and reverse read pairs, or a high percentage (≥5%) of bases with low quality. Overlapping portions of the surviving read pairs are collapsed to prevent read count duplication. M1-M2 SNVs with ≥3 mutant reads in D1 were considered to "bleed" into D1. This was compared with M2 private SNVs (the other nontruncal SNV cluster >100 mutations) with ≥3 mutant reads in D1 or M1, by Fisher exact test.
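The comparison of "bleeding" SNV counts between two clusters is a 2 × 2 Fisher exact test. The sketch below is a self-contained two-sided implementation using only the Python standard library; it is not the authors' code, and the counts in the example are invented purely to show the call.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    probs = [table_prob(x) for x in range(low, high + 1)]
    cutoff = table_prob(a) * (1 + 1e-7)  # tolerance for floating-point ties
    return min(1.0, sum(p for p in probs if p <= cutoff))


# Hypothetical counts: rows are SNV clusters, columns are
# (SNVs with >= 3 mutant reads in the other sample, SNVs without).
p_value = fisher_exact_two_sided(40, 260, 2, 248)
```

A small P value here would indicate that the shared cluster shows low-level signal in the other sample more often than the private comparison cluster does.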

Data and code availability

WGS is available through St. Jude Cloud (https://stjude.cloud) and EGA (EGAS00001000263). Supplementary Tables S2 to S5 report capture validation. Custom code is available upon request.

Results

Osteosarcoma patient history and samples

We performed WGS on osteosarcoma samples from four patients for whom multiple tumor samples were available (Fig. 1; Supplementary Fig. S1). The first three patients, SJOS001101, SJOS001105, and SJOS001107 (Fig. 1A–C), received standard MAP chemotherapy (methotrexate, doxorubicin, and cisplatin) combined with bevacizumab (23) followed by resection of the primary tumor and postoperative MAP. The fourth, SJOS010 (Fig. 1D), was treated with a carboplatin-containing regimen (24) and cisplatin years later.

In SJOS001101, one lung metastasis from 48 weeks postdiagnosis and seven autopsy samples from the lung and adjacent tissues were analyzed, but the primary tumor was unavailable (Fig. 1A). In SJOS001105, the primary tumor (left proximal humerus) along with a femur and a lung metastasis >1 year later were sequenced (Fig. 1B). For SJOS001107, we analyzed the primary tumor (left proximal humerus) along with a lung metastasis also detected at diagnosis, and a recurrent lung metastasis from >1 year later (Fig. 1C). For SJOS010, we sequenced bone and lung metastases which appeared 8 to 9 years after diagnosis (Fig. 1D). Together, this dataset enabled analysis of temporal evolution through drug treatment and spatial evolution associated with metastatic spread (Supplementary Table S1).

Branching evolution of osteosarcoma metastases

To evaluate tumor heterogeneity between sites, we performed WGS on the 16 osteosarcoma samples at 42-62X median genome coverage and 35-44X for germline (Supplementary Fig. S2) to identify single-nucleotide variants (SNV) and indels (Materials and Methods), SVs (25), and copy number variants (CNV; ref. 13). We initially focused on SNVs for analysis of clonal evolution. More than 30,000 somatic SNVs were identified in nonrepetitive genomic regions. To validate these variants and enable robust clonal analysis, we performed deep capture validation sequencing (8) on all samples (Materials and Methods; Supplementary Tables S2–S5) with median coverage of 149-1145X per sample (Supplementary Fig. S3) with an overall validation rate of 99.2% (Materials and Methods).

To determine relationships between metastatic sites, we clustered SNVs based on their presence [defined by variant allele frequency (VAF) ≥0.05 after adjusting for tumor purity] or absence in each sample, excluding clusters with relatively few SNVs (heatmaps in Fig. 1; Materials and Methods). In each patient, variants present in all samples, referred to as truncal variants, accounted for <20% of all somatic SNVs (Fig. 1). Sample-specific private variants were present in all except SJOS001101_M4, which was estimated to have <20% tumor purity (Fig. 1A; Supplementary Table S2). Several samples had relatively high numbers of private SNVs, including M2, M3, and M8 in SJOS001101 (Fig. 1A), D2 and R1 in SJOS001105 (Fig. 1B), M2 in SJOS001107 (Fig. 1C), and both samples in SJOS010 (Fig. 1D). Of the multiregion autopsy samples analyzed for SJOS001101, M2 and M3 clustered separately from other samples, consistent with the anatomical proximity of these two metastases on the right side, whereas most other lesions were on the left (Fig. 1A).

Truncal driver variants included SVs or SNVs causing TP53 inactivation in each patient (Fig. 1), consistent with previous osteosarcoma studies (8, 26). Each patient also harbored at least one cell-cycle–related truncal variant (Supplementary Figs. S4 and S5), including RB1 W78* in SJOS001101 (Fig. 1A), CCND3 and CDK4 amplification in SJOS001105 (Fig. 1B), CCND3 amplification and CDKN2A homozygous deletion in SJOS001107 (Fig. 1C), and CCNE1 amplification in SJOS010 (Fig. 1D). Finally, a 0.5 to 1.0 Mb region on 3q13.31 containing the noncoding genes TUSC7, MIR4447, and LINC00901 underwent deletions in all four patients (27); the minimal overlapping region of homozygous deletion in one sample (SJOS010) included only LINC00901 (Supplementary Fig. S4A), although in some osteosarcomas this gene is not in the minimally deleted region (8). 3q13.31 deletions were truncal in three patients, SJOS001105, SJOS001107, and SJOS010 (Fig. 1B–D) and "shared" (present in some but not all samples) in a fourth, SJOS001101 (Fig. 1A).

Nontruncal variants were acquired later, and they included a shared heterozygous NF1 G722E mutation in SJOS001101 samples M4, M7, and M8 (Fig. 1A); a private NF1 S2684I variant at a splice acceptor site, which may affect splicing, in SJOS001105 sample R1 (Fig. 1B); and a private KIT G565R mutation in SJOS001101 sample M3 (Fig. 1A). Although NF1 variants have been reported in osteosarcoma (26), these two NF1 mutations are variants of uncertain significance due to lack of functional validation data. The KIT G565R mutation was likely activating as it has been reported to confer sensitivity to imatinib in a gastrointestinal stromal tumor (28) and has been found in mucosal melanomas (29) and a gastric adenocarcinoma (30). SJOS010 harbored a private DLG2 deletion, likely an important disease driver in osteosarcoma (8, 31), in sample D (Fig. 1D). Osteosarcoma evolutionary trees displayed short trunks and long branches in each case (Fig. 1, far right), in contrast to some other cancer types such as breast and lung cancer, where in most patients 50% or more of mutations were truncal (3, 12, 17). However, the known driver events in osteosarcoma were usually truncal, as seen in some other cancer types (32), notwithstanding the high level of intrapatient heterogeneity.

Mutational signature analysis reveals cisplatin-induced mutagenesis

We analyzed trinucleotide mutation context signatures to determine potential causes of mutagenesis at different stages of tumor evolution (33). Rather than comparing mutational signatures between samples directly, we analyzed each SNV cluster, representing a distinct branch of evolution (Fig. 1), to determine the mutational processes giving rise to truncal, shared, and private variants (Fig. 2A). This revealed a unique mutational signature in many shared and private SNV clusters but absent in truncal SNVs, implying a late mutational process. This signature was characterized by C[C>T]C, C[C>T]T, C[T>A]N and secondarily C[C>A]T variants (Fig. 2A, arrows), and did not match any COSMIC mutational signature (33, 34). To determine the etiology of this signature, we checked for the signature in WGS data of 29 additional PCGP osteosarcomas (Supplementary Fig. S6A, which includes the four multisample patients for comparison; ref. 8). The signature was detected exclusively in relapsed but not pretreatment samples, suggesting it may be therapy-induced (Supplementary Fig. S6A, arrows). Indeed, of the two patients (SJOS001105 and SJOS001107) with matched pretreatment samples and relapsed samples, the signature could be detected in relapsed samples but not matched pretreatment samples (Fig. 2A; Supplementary Fig. S6A).

To quantitatively determine the spectrum of trinucleotide context SNVs in this signature, we performed nonnegative matrix factorization (NMF; refs. 15, 33) on SNVs from the PCGP osteosarcoma cohort, including the four patients on which this study focuses (Supplementary Fig. S6B and S6C). This revealed three mutational signatures, two of which resembled known APOBEC, homologous recombination deficiency, and clock-like signatures (33), whereas the third represented the novel signature described above (Supplementary Fig. S6C). The novel signature showed high cosine similarity (0.940; ref. 15) to a cisplatin signature from Boot and colleagues generated from extended exposure of the MCF 10A breast cell line to cisplatin (35) and also resembled a recently identified cisplatin signature in bladder cancer (cosine similarity 0.801; ref. 36), thus indicating that it is a cisplatin mutational signature (Supplementary Fig. S6C). Further, our novel cisplatin signature was similar to a novel mutational signature of unknown origin in a recent pan-pediatric cancer study that included osteosarcoma (cosine similarity 0.958; Supplementary Fig. S6C; ref. 37). The cisplatin signature showed no evidence of localized hypermutation (kataegis; Supplementary Fig. S7), indicating that cisplatin does not affect specific regions but may mutate globally.
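Cosine similarity between two signatures is the dot product of their channel vectors divided by the product of their lengths. The sketch below shows the calculation on toy vectors; a real comparison would use the 96 trinucleotide channel weights of each signature, and the function name is an assumption.

```python
from math import sqrt


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equal-length nonnegative vectors."""
    if len(u) != len(v):
        raise ValueError("signatures must have the same number of channels")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 4-channel signatures, for illustration only.
novel = [0.50, 0.30, 0.15, 0.05]
reference = [0.45, 0.35, 0.10, 0.10]
similarity = cosine_similarity(novel, reference)
```

Because mutational signatures have nonnegative weights, the value lies between 0 (no shared channels) and 1 (identical shape), and it does not depend on the total number of mutations.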

Further, we measured the strength of our cisplatin signature in each PCGP osteosarcoma, including the four multisample patients, alongside reported COSMIC mutational signatures (33). The cisplatin signature was detected in the four patients under study, although not in two additional relapsed cases; all six of these patients had received DNA-damaging (38) platinum treatment, usually in the form of cisplatin (Supplementary Fig. S6D, Fig. 2B; Supplementary Table S1). None of the 30 pretreatment samples possessed the cisplatin signature, confirming its specificity to treated samples (Fig. 2B). In three of six posttherapy osteosarcoma patients (SJOS001101, SJOS001105, and SJOS001107), the cisplatin signature accounted for >40% of SNVs (Fig. 2B), and in SJOS001101 >60% in most samples. Among the other three patients, the signature was absent in two patients (SJOS001 and SJOS001112) and detected in one of two lesions from SJOS010 (19% of SNVs in D, and not detected in M). This indicates that platinum therapy's mutational effects are variable. Although these three signature-low patients had all received platinum treatment, SJOS001112 had received carboplatin rather than cisplatin, and SJOS010 had received carboplatin primarily and cisplatin as a secondary treatment (Supplementary Table S1). Whether the use of carboplatin rather than cisplatin accounts for the lower signature strength cannot be determined with this sample size.

Finally, where matched pre- and posttherapy samples were available (SJOS001105 and SJOS001107), the cisplatin signature was exclusive to posttherapy samples, reaffirming its specificity


[Figure 2 graphic: A, heatmap of the proportion of SNVs (scale 0 to 0.2) in each of the 96 trinucleotide contexts (mutation site C>A, C>G, C>T, T>A, T>C, T>G with 5′ and 3′ context) for each SNV cluster; B, cisplatin signature weight (proportion of SNVs, 0.0 to 1.0) per osteosarcoma sample; C, mutational signature evolutionary trees.]

Figure 2.

Osteosarcoma branching evolution shaped by cisplatin treatment. A, Mutation trinucleotide context of SNV clusters shown in Fig. 1. Red color indicates the proportion of SNVs falling into the indicated trinucleotide context. Asterisks indicate mutation contexts that the novel signature has a predilection to mutate. "Pre-tx" indicates pretreatment samples; "rel" indicates relapsed samples. B, Cisplatin signature weight of novel cisplatin signature in PCGP osteosarcomas based on SNVs identified from capture validation for SJOS001101 and SJOS010 and WGS for other patients. PCGP sample nomenclature is of the format "SJOS002_D," where SJOS002 indicates patient ID and D indicates sample ID. C, Branching evolution of osteosarcomas with mutational signature data from Supplementary Fig. S6E, right, superimposed. SJOS001101 M4 private variants were not detected due to low tumor purity and M1 private SNVs were not included due to unresolvable clonal composition. Potentially pathogenic variants are indicated on the appropriate evolutionary branches. Branch length is proportional to number of SNVs in branch as indicated by scale (bottom-right).


(Fig. 2B). These data indicate that cisplatin causes significant damage to osteosarcoma genomes, in some cases potentially doubling the number of somatic SNVs.

Mutational signature tree reveals cisplatin-associated evolution

To determine the evolutionary changes potentially induced by cisplatin, we next determined the robustness of the cisplatin signature and COSMIC mutational signatures (39) in each patient's evolutionary SNV clusters, such as truncal, shared or private clusters (Supplementary Fig. S6E). This analysis confirmed the cisplatin signature's strong presence in variants of later (private and shared) but not truncal clusters (Supplementary Fig. S6E). We superimposed these mutational signature data onto the previously constructed evolutionary trees to create mutational signature evolutionary trees (Fig. 2C; ref. 17). Truncal variants were enriched for COSMIC signature 1 (a ubiquitous clock-like signature caused by 5-methylcytosine deamination; ref. 39), signature 3 (indicating homologous recombination deficiency; refs. 16, 40), and signature 5 (another clock-like signature; ref. 39). Shared variants, in contrast, were enriched for the cisplatin signature, suggesting that cisplatin may have induced branching events (Fig. 2C, left, SJOS001101). Private SNVs in some relapse samples (SJOS001101 samples M2, M3, M8; SJOS001105 samples D2 and R1; SJOS001107 sample M2) were also enriched for the cisplatin signature (Fig. 2C). However, mutations on private branches of several relapsed samples were relatively free of the cisplatin signature (SJOS001101 samples M5, M6, M7), suggesting that these mutations were acquired after, or near cessation of, cisplatin treatment (Fig. 2C, left). Regional heterogeneity of the cisplatin signature was found in SJOS010, as the cisplatin signature was present in only one of two posttherapy (post-carboplatin and post-cisplatin) samples: tibia metastasis D had 19% of SNVs platinum-induced and the signature was absent in lung metastasis M (Fig. 2C, right).

Variants potentially induced by cisplatin

The SJOS001101 KIT and NF1 variants, and SJOS001105 NF1 variant, appeared in branches enriched for the cisplatin signature (Fig. 2C), suggesting they may have been cisplatin-induced. To calculate the probability that these mutations were cisplatin-induced, we used an approach described previously (18). This revealed a 100% likelihood that the NF1 G722E variant was cisplatin-induced, consistent with its presence at cisplatin hotspot site C[C>T]C. For KIT G565R, the probability of being cisplatin-induced was 60%. This variant was found at C[C>T]A, which is not a cisplatin hotspot but still a potential target (Supplementary Fig. S6C). SJOS001105 had an NF1 S2684I variant (potentially affecting splicing) with a 70% likelihood of being cisplatin-induced (at A[C>A]C, which is not a strong cisplatin hotspot). No pathogenic SNVs were detected in cisplatin-enriched branches in the two additional patients with the cisplatin signature (SJOS001107 and SJOS010; Fig. 2C). Given the uncertain significance of the two NF1 mutations, and the 60% probability that the KIT variant was cisplatin-induced, further investigation is needed to determine whether cisplatin induces functional, cancer-promoting variants.

Clonal heterogeneity of copy number variation

To understand clonal heterogeneity from the perspective of copy number variation (CNV), we compared intrapatient CNV profiles (Fig. 3). We first clustered CNV profiles in SJOS001101 (Fig. 3A), the patient with the most samples, which mirrored SNV-based clustering as M2 and M3, both collected from the right side (Fig. 1A), clustered separately from other samples. Although M2 and M3 had relatively diploid copy profiles, other samples harbored widespread copy gains (Fig. 3A, bottom). Specifically, although M2 and M3 had copy gains in 38% to 44% of the genome, the remaining samples had gains in 78% to 85% (Fig. 3B, left), indicating that the tumor cells experienced widespread gains of multiple chromosomes.

To identify CNV driver genes, we identified Cancer Gene Census genes (14) with <0.5 copies (homozygous deletions) or >7 copies (gains; Fig. 3B, right). This revealed nonfocal truncal copy gains in MYC (5–9 copies per sample; 13–16 Mb region) and AKT1 (5–9 copies; 5 Mb region); branch-specific nonfocal gains in CCND3 (6–10 copies in non-M2-M3 in a 24–27 Mb region, but only 3 copies in M2 and M3), and 3q13.31 deletion (0–1 copies in M2-M3; 0.5–1 Mb region); and a private TERT gain (7 copies in M7; 2 Mb region; Fig. 3B, right). Thus, CNVs show substantial intrapatient heterogeneity in SJOS001101.
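The copy number thresholds used to flag candidate driver genes can be expressed as a simple filter. This is an illustrative sketch only; the data layout (gene mapped to per-sample absolute copies), the function names, and the example values are hypothetical.

```python
HOM_DEL_BELOW = 0.5   # fewer than 0.5 copies: homozygous deletion
GAIN_ABOVE = 7.0      # more than 7 copies: gain


def flag_copy_number(copies: float):
    """Return the driver-level CNV call for one gene in one sample, or None."""
    if copies < HOM_DEL_BELOW:
        return "homozygous deletion"
    if copies > GAIN_ABOVE:
        return "gain"
    return None


def flag_genes(gene_copies):
    """List (gene, sample, call) for every gene/sample meeting a threshold."""
    calls = []
    for gene, per_sample in gene_copies.items():
        for sample, copies in per_sample.items():
            call = flag_copy_number(copies)
            if call is not None:
                calls.append((gene, sample, call))
    return calls


# Invented copy numbers, for illustration only.
example_calls = flag_genes({
    "GENE_A": {"S1": 8.6, "S2": 3.1},
    "GENE_B": {"S1": 0.2, "S2": 2.0},
})
```

Values exactly at a threshold (0.5 or 7 copies) are not flagged, following the strict inequalities in the text.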

We also compared intrapatient CNV profiles in SJOS001105 (Fig. 3C), SJOS001107 (Fig. 3D), and SJOS010 (Fig. 3E). These patients harbored truncal gains in CCND3, CDK4, CCNE1, and MYC, although the number of copies varied. Each patient also harbored truncal homozygous deletions on 3q13.31 (including LINC00901) and one patient had homozygous CDKN2A deletion (Fig. 3C–E; Supplementary Fig. S4; homozygous deletions were confirmed by B-allele frequencies in Supplementary Fig. S5). Like SJOS001101, SJOS001105 (Fig. 3C) and SJOS010 (Fig. 3E) had two lineages with large differences in ploidy. In SJOS001105, D1 and D2 had copy gains in 67% to 69% of the genome, whereas R1 had copy gains in 94% due to widespread copy gains (Fig. 3C, bottom-left). In SJOS010, sample D had copy gains in 69% of the genome whereas M had copy gains in 91% (Fig. 3E, bottom-left). Thus, three of four patients showed ploidy heterogeneity with widespread copy gains in one lineage; such gains are associated with increased metastatic potential and drug resistance (41).

These results indicate that osteosarcomas, known to have highly complex genomes (8), continue to evolve by acquiring additional gross chromosomal alterations after disease initiation. However, most clear driver CNVs, such as those affecting the CDK complex (cyclins, CDKs and CDK inhibitors) and MYC, are likely early events.

Intra- and intersite clonal evolution

To determine the clonal composition of each metastasis, we first determined the cancer cell fraction (CCF; the proportion of cancer cells belonging to a clone) of the site-specific "founder" clone (the clone defined by each sample's private SNVs) in each sample. To do this, we identified SNVs in two-copy regions in each sample and determined representative VAFs (from deep capture sequencing) of each SNV cluster by identifying VAF density peaks (Supplementary Fig. S8). Where VAF density peaks of private SNV clusters were very close to truncal SNV peaks, this indicated a high CCF of the site-specific founder clone.

For example, in five samples from SJOS001101 (M2, M3, M6, M7, M8), the VAF density peak for private variants was indeed close to the VAF density peak of the truncal variants (private/truncal > 0.95), indicating that the unique site-specific founder clone in each of these sites had a CCF >95% (Supplementary Fig. S8A). Thus, the dominant population at each of these sites was the site-specific founder clone of a single unique lineage of tumor cells harboring the truncal, shared, and private variants.


[Figure 3 image not reproduced. Recoverable labels: samples M1 to M3 and M5 to M8 (SJOS001101); D1, D2, R1 (SJOS001105); D1, M1, M2 (SJOS001107); D and M (SJOS010). Axes: "% of Genome" (CNV neutral, loss, gain) and "Copies" for MYC, AKT1, CCND3, TERT, CDK4, CCNE1, CDKN2A, and LINC00901 (3q13.31, hom del).]

Figure 3.

Osteosarcoma copy number heterogeneity. A, Top, Euclidean distance clustering of copy number data, inferred from WGS, from each osteosarcoma metastatic sample in SJOS001101. M4 was not included due to very low tumor purity. Possible driver CNVs are indicated in red. Bottom, Circos plots showing copy number of each sample. Chromosome numbers are indicated around outside of plot. Sample names inside circle indicate the CNV tracks from top to bottom at the 12 o'clock position (see key on left plot). Red indicates copy number gain, blue copy number loss. Data are on a log2 scale and each gray line represents a log2 difference of 0.5. Gray dotted line represents the split between two major CNV genetic lineages. B, Left, percent of genome with copy gained, lost, and neutral (2 ± 0.2 copies) regions in each sample. Right, number of copies (linear; non-log2 scale) of indicated genes in each sample. Gray line indicates diploid status (2 copies). C–E, Circos plots as in A and additional graphs as in B for patients SJOS001105, SJOS001107, and SJOS010. SJOS001107 (D) and SJOS010 (E) were male and X chromosome copy scale is half of that shown in legend in A for these patients (black line with no color indicates one copy).
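The gained, lost, and neutral genome percentages named in the Figure 3 legend can be sketched as follows, using the neutral band of 2 ± 0.2 copies stated there. This is only an illustration: the segment representation (length in base pairs, absolute copy number), the function name, and the toy segments are assumptions, not the authors' pipeline.

```python
def cnv_fractions(segments, neutral_copies=2.0, tolerance=0.2):
    """Percent of genome classified as gain, loss, or neutral.

    segments: iterable of (length_bp, copy_number) pairs (hypothetical format).
    A segment is neutral when its copy number lies within
    neutral_copies +/- tolerance, as in the figure legend.
    """
    totals = {"gain": 0, "loss": 0, "neutral": 0}
    for length_bp, copies in segments:
        if copies > neutral_copies + tolerance:
            totals["gain"] += length_bp
        elif copies < neutral_copies - tolerance:
            totals["loss"] += length_bp
        else:
            totals["neutral"] += length_bp
    genome = sum(totals.values())
    if genome == 0:
        raise ValueError("no segments supplied")
    return {label: 100.0 * size / genome for label, size in totals.items()}


# Toy segments (hypothetical): 50 bp neutral, 30 bp gained, 20 bp lost.
toy_segments = [(50, 2.1), (30, 3.4), (20, 1.0)]
toy_fractions = cnv_fractions(toy_segments)
```

For male samples the X chromosome would need a neutral value of one copy, in line with the legend's note on SJOS001107 and SJOS010; the sketch leaves that to the caller through `neutral_copies`.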

Brady et al.

Mol Cancer Res; 17(4) April 2019 Molecular Cancer Research902


The site-specific founder clones also gave rise to descendant clone(s) in M3, M6, and M7, as evidenced by minor private VAF density peaks (a second peak with lower VAF; Supplementary Fig. S8A). The M5 site-specific founder clone had a slightly lower CCF of 92%, indicating admixture of ~8% of cells from other clone(s) harboring the M5 site-specific founder clone's truncal and shared SNV clusters (Supplementary Fig. S8A, red, orange peaks overlap truncal). Finally, M1 showed high admixture of clones, with the M1 site-specific founder clone(s) having a CCF of only 42% (Supplementary Fig. S8A). This may have been due to the presence of two discrete lung nodules in the M1 lung slice, and consequent sampling of two lineages (Fig. 1A). The low CCF made it impossible to resolve whether the M1 private variants represented one clone or multiple independent clones.

To determine whether cross-seeding (seeding of a single site by two independent lineages) occurred in SJOS001101, we performed pairwise sample comparisons of SNVs that were diploid in all samples (Supplementary Fig. S9). As expected, M2, M3, M6, M7, and M8 showed little evidence of crossover SNVs (private SNVs representing the site-specific founder clone in one sample that were also found at low VAF in another sample, evidencing cross-seeding), consistent with their high CCF for their site-specific founder clones (CCFs of >95%; Fig. 4A; Supplementary Figs. S8A, S9). This confirmed that these samples consisted of tumor cells from a single lineage (with descendant clones in M3, M6, and M7; Fig. 4A). Sample M5, in contrast, showed evidence of ~4% of cancer cells being from sample M6, a nearby but spatially distinct site, suggesting cross-seeding between these sites (Fig. 4A; Supplementary Fig. S9 enlarged graphs). This is consistent with the estimated CCF of 92% for the M5 site-specific founder clone, which was lower than site-specific founder clones at other sites (Supplementary Fig. S8A). Two samples were not analyzed using this approach: M1 contained multiple distinct metastatic nodules, and M4 had <20% tumor purity.
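The pairwise crossover check described above can be sketched roughly as follows. The sketch assumes diploid, heterozygous SNVs and comparable tumor purity in both samples, so that the ratio of VAFs approximates the fraction of cancer cells contributed by the donor lineage. The detection threshold, the SNV identifiers, and the VAF values are hypothetical; the study's actual criteria are described in its supplementary material and are not reproduced here.

```python
from statistics import median


def crossover_snvs(donor_private_vafs, recipient_vafs, min_vaf=0.005):
    """Founder-private SNVs of the donor sample that are also seen in the recipient."""
    return sorted(
        snv for snv in donor_private_vafs
        if recipient_vafs.get(snv, 0.0) >= min_vaf
    )


def donor_cell_fraction(donor_private_vafs, recipient_vafs, shared):
    """Median recipient/donor VAF ratio over the crossover SNVs (0.0 if none)."""
    if not shared:
        return 0.0
    return median(recipient_vafs[s] / donor_private_vafs[s] for s in shared)


# Toy data (hypothetical identifiers and values).
m6_private = {"snv1": 0.40, "snv2": 0.40, "snv3": 0.40}
m5_observed = {"snv1": 0.016, "snv2": 0.016, "snv4": 0.30}

shared = crossover_snvs(m6_private, m5_observed)
fraction_from_m6 = donor_cell_fraction(m6_private, m5_observed, shared)
```

With these toy numbers two of the three donor-private SNVs cross over, and the VAF ratio corresponds to a donor contribution of about 4% of cancer cells. A real analysis would also need a background-noise model before calling a low-VAF variant present.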

[Figure 4 image not reproduced. Recoverable labels, panel A (SJOS001101): genetic cluster 1, genetic cluster 2 (clusters 2a and 2b); truncal TP53 SVs, ATRX N294fs, RB1 W78*, MYC-amp, AKT1-amp; CCND3-amp with ~80% ploidy gains; TERT-amp; KIT G565R; NF1 G722E; relatively diploid; cross-seeding; descendant; homogeneous clones. Panel B (SJOS001107): D1 primary tumor; M1; M2; truncal TP53 del, CDKN2A del; metastatic clone; before diagnosis; 73 weeks later.]

Figure 4.

Osteosarcoma clonal heterogeneity and evolution. A, The clonal composition of each metastatic site is represented by colored circles. Sites with a >95% site-specific founder clone CCF (private VAF density/truncal VAF density >0.95) are represented by one solid circle, and those with both a site-specific founder clone and its descendant clone(s) are shown by two concentric circles. Cross-seeding is indicated by arrows. Circle area is proportional to CCF. Genetic cluster drivers for right vs. left anatomical side and selected shared and private drivers are indicated. M1 was not included due to containing multiple discrete lung nodules and M4 was not included due to low tumor purity. B, Clones are represented as in A. Metastatic seeding from a minor clone in the SJOS001107 primary tumor (D1) to lung metastases (M1 and M2) is indicated by arrows. Red represents M1 private variants not detectable in D1; yellow represents M2 private variants not detected in D1.


To validate clonal evolution results in SJOS001101, which has a total of eight samples, we used an automated approach based on PyClone (21) and MACHINA (22). This resulted in two possible clone trees (Supplementary Fig. S10). Both trees recapitulated cross-seeding between M5 and M6, and identified the M3 descendant clone of the M3 site-specific founder clone. The M6 and M7 descendant clones could not be detected with this method because pan-diploid SNVs were used, thus filtering out many single-sample-diploid variants used to detect these clones with density analysis (Supplementary Fig. S8A). Both trees also reaffirmed that most samples (M2, M3, M6, M7, M8) consisted of a site-specific founder clone and possibly its descendants. Clone tree 1 appeared more plausible than tree 2, as tree 2 possessed a branching event in an M5 precursor supported by a small subset of cluster 10 (Supplementary Fig. S10C). MACHINA analysis was consistent with our previous clonal analysis and reinforced our findings with a rigorous mathematical approach.

The clonal evolution of SJOS001101, combining mutation and CNVs, is summarized in Fig. 4A. All metastases shared inactivation of TP53, ATRX, and RB1 and copy gains in MYC and AKT1. We observed two genetic lineages which correlated spatially, with M2 and M3 being relatively close on the anatomical right and most other lesions on the left (Fig. 4A). The non-M2-M3 cluster (genetic cluster 2 in Fig. 4A) developed widespread copy gains with ~80% of the genome having above two copies, including CCND3 gains; an NF1 variant in a subset of samples (M4, M7, M8; cluster 2b); and TERT copy gain in a single sample (M7). Within the M2-M3 cluster, we observed an activating KIT G565R private SNV in M3. Two samples (M2 and M8) consisted of a single dominant homogeneous clone (Fig. 4A). Three samples (M3, M6, and M7) were composed of tumor cells of a single lineage, which also branched off to descendant clone(s) as evidenced by minor VAF peaks among private variants (Fig. 4A; Supplementary Fig. S8A) and clustering analysis (Supplementary Fig. S10). These descendant clones might have developed after metastatic colonization. Finally, one sample, M5, showed evidence of cross-seeding, where multiple lineages colonize the same metastatic site. It is unclear whether an M5 population moved to M6, vice versa, or some other site seeded both. Some of the branching evolution may have been induced by cisplatin treatment; indeed, mutational signature analysis suggests that 46%–77% of SNVs in this patient were induced by cisplatin.

We also performed clonal evolution analysis on SJOS001105, SJOS001107, and SJOS010 (Supplementary Figs. S8B, S8C, and S11). SJOS001105 did not show evidence of cross-seeding between sites, and the primary tumor (D1) did not show evidence of a minor clone that went on to seed metastases D2 and R1 (Supplementary Fig. S11A). SJOS010 likewise did not show evidence of cross-seeding (Supplementary Fig. S11C). In SJOS001107, by contrast, deep capture sequencing revealed that the M1-M2 cluster of mutations (Fig. 1C), present in lung metastases M1 and M2 but not primary tumor D1 by WGS, was in fact detectable at a low level in D1 upon deeper sequencing (Supplementary Fig. S11B). Indeed, 250 of 458 SNVs (54.6%) in the M1-M2 cluster could be detected in D1 at a median VAF of 0.8%; other clusters did not "bleed" into other samples in this way (e.g., M2 private variants could be detected at low frequency in other samples for only 12 of 1,420 (0.6%) of SNVs, Fisher exact test P = 2 × 10⁻¹⁶, which was likely background noise; see Materials and Methods), indicating this was not artifactual (Supplementary Fig. S11B). This suggests that a minor clone (<5% CCF) detected in the primary tumor may have possessed enhanced metastatic potential, as it gave rise to both lung metastases (Fig. 4B; red indicates M1 private variants not detectable in the primary tumor; yellow indicates M2 private variants not detectable in the primary tumor). The identity of the variants potentially causing this metastatic potential is unclear, as we did not observe clear driver mutations in this mutation cluster. Overall, 6 of 10 metastases across three patients with sufficient two-copy SNVs showed evidence of single-clone seeding (Supplementary Fig. S8; private/truncal > 95%), suggesting this is a common form of metastatic seeding.
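The comparison of detection rates in the preceding paragraph (250 of 458 versus 12 of 1,420 SNVs) is the kind of 2 × 2 contrast a Fisher exact test addresses. The sketch below is a plain two-sided implementation using exact integer arithmetic. How the authors constructed their table and which software they used are not stated in this section, so the table layout here is an assumption, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def weight(k):
        # Unnormalized hypergeometric weight of k counts in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    tail = sum(
        w for w in (weight(k) for k in range(lowest, highest + 1))
        if w <= observed
    )
    return float(Fraction(tail, denom))


# Assumed layout: rows are SNV clusters, columns are detected / not detected.
p_value = fisher_exact_two_sided(250, 458 - 250, 12, 1420 - 12)
```

Under this assumed layout the p-value falls far below conventional significance thresholds, which agrees with the text's conclusion that the signal in D1 is unlikely to be background noise.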

Discussion

Although it has been known for decades that chemotherapy can induce mutations and secondary cancers (42, 43), the extent of cisplatin-induced mutational burden shown here (up to 77% of SNVs) is notable. It is unclear whether some of the nontruncal copy alterations we observed were also induced by cisplatin, but given cisplatin's ability to induce double-stranded DNA breaks (38), this possibility could not be ruled out. The evolutionary pattern of the cisplatin signature was consistent with a transient mutational process induced by treatment, as the signature appears after disease development (post-truncal variants) and disappears after treatment cessation as indicated by the shared and private branches in case SJOS001101 (Fig. 2C).

The cisplatin signature likely applies broadly to multiple cancer types. A recent pan-pediatric cancer study reported a signature similar to our cisplatin signature in one ependymoma, multiple atypical teratoid rhabdoid tumors (ATRT), and one osteosarcoma, though the cause was unknown (37). The ependymoma identified was from the PCGP, and we have confirmed that the specimen was acquired post-cisplatin. The osteosarcoma sample was also a PCGP sample, SJOS010, which we analyzed in this study and which had received cisplatin. The ATRT patients' treatment history was unknown, but patients with ATRT commonly receive cisplatin (44). The cisplatin signature has also been found, with (35, 36, 45) or without (12) recognition of the cause, in liver (35), esophageal (35), breast (12), ovarian (45), and bladder cancer (36) after cisplatin treatment.

One weakness of our study is the lack of primary tumors in two patients (SJOS001101 and SJOS010). Therefore, we could not rule out the possibility that some of the "truncal" variants described here may have been acquired later in evolutionary history. Nevertheless, the ubiquitous absence of cisplatin signature in truncal variants and in 30 untreated osteosarcoma samples from PCGP provided strong support for the association between the cisplatin signature and the treatment history of osteosarcoma. Further, we did not observe the signature in 19 untreated osteosarcomas in a recent study (16). Given that our primary analysis only focuses on four patients, future studies are needed to investigate the contribution of cisplatin-induced mutations in osteosarcoma tumor progression, including functional analysis of cisplatin-induced variants.

Our analysis of SJOS001101, the only case with multiregional lung samples, showed that cross-seeding occurred in only one of six metastases (Fig. 4A), which suggests that multiple-seeding may not be common in osteosarcoma (46, 47). Our findings fit the original view that metastasis arises from a single clone in most cases (48). Additional studies are required to determine whether this is a general observation.


Our findings highlight the importance of investigation of mutational signatures in possible chemotherapy-induced secondary malignancies (4, 49). Indeed, patients with osteosarcoma can develop secondary malignancies up to 25 years after treatment (4). Although this may be due to inherited germline mutations in cancer predisposition genes (50), further studies on noninherited cases are needed to investigate whether the cisplatin signature might be associated with the development of the secondary malignancy. More broadly, it will be valuable to determine the point and structural mutation signatures of every DNA-damaging chemotherapy to better understand the implications for both secondary malignancies and the development of drug resistance.

Disclosure of Potential Conflicts of Interest

B.J. Raphael is a consultant at and has ownership interest (including stock, patents, etc.) in Medley Genomics. No potential conflicts of interest were disclosed for other authors.

Authors' Contributions

Conception and design: R.K. Wilson, J.R. Downing, J. Zhang
Development of methodology: S.W. Brady, X. Ma, J. Easton, E.R. Mardis, R.K. Wilson, B.J. Raphael, J. Zhang
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): A. Bahrami, H.L. Mulder, J. Easton, E.R. Mardis, R.K. Wilson, A.S. Pappo
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): S.W. Brady, X. Ma, A. Bahrami, G. Satas, G. Wu, S. Newman, M. Rusch, D.K. Putnam, M.N. Edmonson, L.B. Alexandrov, X. Chen, A.S. Pappo, B.J. Raphael, J. Zhang
Writing, review, and/or revision of the manuscript: S.W. Brady, X. Ma, H.L. Mulder, E.R. Mardis, R.K. Wilson, A.S. Pappo, M.A. Dyer, J. Zhang
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): D.A. Yergeau, R.K. Wilson, M.A. Dyer, J. Zhang
Study supervision: J. Zhang

Acknowledgments

The authors thank Timothy Hammond and the Biomedical Communications Department at St. Jude Children's Research Hospital for illustrations. We acknowledge Yu Liu for help with Circos plots, Yongjin Li for computer code, and David Ellison for information on ependymoma treatment history. We thank the Tissue Bank at St. Jude Children's Research Hospital for managing samples. This research was supported by the NCI through Cancer Center Support Grant P30 CA021765 (to J. Zhang) and by the American Lebanese Syrian Associated Charities of St. Jude Children's Research Hospital.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received June 11, 2018; revised October 17, 2018; accepted January 7, 2019; published first January 16, 2019.

References

1. McGranahan N, Swanton C. Clonal heterogeneity and tumor evolution: past, present, and the future. Cell 2017;168:613–28.
2. Turke AB, Zejnullahu K, Wu YL, Song Y, Dias-Santagata D, Lifshits E, et al. Preexistence and clonal selection of MET amplification in EGFR mutant NSCLC. Cancer Cell 2010;17:77–88.
3. de Bruin EC, McGranahan N, Mitter R, Salm M, Wedge DC, Yates L, et al. Spatial and temporal diversity in genomic instability processes defines lung cancer evolution. Science 2014;346:251–6.
4. Luetke A, Meyers PA, Lewis I, Juergens H. Osteosarcoma treatment - where do we stand? A state of the art review. Cancer Treat Rev 2014;40:523–32.
5. Ottaviani G, Jaffe N. The epidemiology of osteosarcoma. In: Jaffe N, Bruland OS, Bielack S, editors. Pediatric and adolescent osteosarcoma. Boston: Springer; 2009. p. 3–13.
6. Smith MA, Seibel NL, Altekruse SF, Ries LA, Melbert DL, O'Leary M, et al. Outcomes for children and adolescents with cancer: challenges for the twenty-first century. J Clin Oncol 2010;28:2625–34.
7. Harrison DJ, Geller DS, Gill JD, Lewis VO, Gorlick R. Current and future therapeutic approaches for osteosarcoma. Expert Rev Anticancer Ther 2018;18:39–50.
8. Chen X, Bahrami A, Pappo A, Easton J, Dalton J, Hedlund E, et al. Recurrent somatic structural variations contribute to tumorigenesis in pediatric osteosarcoma. Cell Rep 2014;7:104–12.
9. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009;25:1754–60.
10. Edmonson MN, Zhang J, Yan C, Finney RP, Meerzaman DM, Buetow KH. Bambino: a variant detector and alignment viewer for next-generation sequencing data in the SAM/BAM format. Bioinformatics 2011;27:865–6.
11. Zhang J, Ding L, Holmfeldt L, Wu G, Heatley SL, Payne-Turner D, et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 2012;481:157–63.
12. Brady SW, McQuerry JA, Qiao Y, Piccolo SR, Shrestha G, Jenkins DF, et al. Combating subclonal evolution of resistant cancer phenotypes. Nat Commun 2017;8:1231.
13. Chen X, Gupta P, Wang J, Nakitandwe J, Roberts K, Dalton JD, et al. CONSERTING: integrating copy-number analysis with structural-variation detection. Nat Methods 2015;12:527–30.
14. Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, et al. A census of human cancer genes. Nat Rev Cancer 2004;4:177–83.
15. Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR. Deciphering signatures of mutational processes operative in human cancer. Cell Rep 2013;3:246–59.
16. Ma X, Liu Y, Liu Y, Alexandrov LB, Edmonson MN, Gawad C, et al. Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours. Nature 2018;555:371–6.
17. Yates LR, Knappskog S, Wedge D, Farmery JHR, Gonzalez S, Martincorena I, et al. Genomic evolution of breast cancer metastasis and relapse. Cancer Cell 2017;32:169–84.
18. Morganella S, Alexandrov LB, Glodzik D, Zou X, Davies H, Staaf J, et al. The topography of mutational processes in breast cancer genomes. Nat Commun 2016;7:11383.
19. Gerlinger M, Rowan AJ, Horswell S, Larkin J, Endesfelder D, Gronroos E, et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med 2012;366:883–92.
20. Ma X, Edmonson M, Yergeau D, Muzny DM, Hampton OA, Rusch M, et al. Rise and fall of subclones from diagnosis to relapse in pediatric B-acute lymphoblastic leukaemia. Nat Commun 2015;6:6604.
21. Roth A, Khattra J, Yap D, Wan A, Laks E, Biele J, et al. PyClone: statistical inference of clonal population structure in cancer. Nat Methods 2014;11:396–8.
22. El-Kebir M, Satas G, Raphael BJ. Inferring parsimonious migration histories for metastatic cancers. Nat Genet 2018;50:718–26.
23. Navid F, Santana VM, Neel M, McCarville MB, Shulkin BL, Wu J, et al. A phase II trial evaluating the feasibility of adding bevacizumab to standard osteosarcoma therapy. Int J Cancer 2017;141:1469–77.
24. Meyer WH, Pratt CB, Poquette CA, Rao BN, Parham DM, Marina NM, et al. Carboplatin/ifosfamide window therapy for osteosarcoma: results of the St Jude Children's Research Hospital OS-91 trial. J Clin Oncol 2001;19:171–82.
25. Wang J, Mullighan CG, Easton J, Roberts S, Heatley SL, Ma J, et al. CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat Methods 2011;8:652–4.
26. Perry JA, Kiezun A, Tonzi P, Van Allen EM, Carter SL, Baca SC, et al. Complementary genomic approaches highlight the PI3K/mTOR pathway as a common vulnerability in osteosarcoma. Proc Natl Acad Sci U S A 2014;111:E5564–73.
27. Pasic I, Shlien A, Durbin AD, Stavropoulos DJ, Baskin B, Ray PN, et al. Recurrent focal copy-number changes and loss of heterozygosity implicate two noncoding RNAs and one tumor suppressor gene at chromosome 3q13.31 in osteosarcoma. Cancer Res 2010;70:160–71.
28. Debiec-Rychter M, Cools J, Dumez H, Sciot R, Stul M, Mentens N, et al. Mechanisms of resistance to imatinib mesylate in gastrointestinal stromal tumors and activity of the PKC412 inhibitor against imatinib-resistant mutants. Gastroenterology 2005;128:270–9.
29. Palacios GA, Molina-Vila MÁ, Gras-Cabrerizo JR, León X, Mayo C, Servat JC, et al. Molecular analysis in serial biopsies in sinonasal mucosal melanoma. Ann Oncol 2016;27(suppl_6):1005P.
30. Gluszek S, Koziel D, Kowalik A, Zięba S, Urbaniak-Wasik S, Wincewicz A, et al. KRAS, KIT and TP53 mutations in mother's and daughter's gastric cardia adenocarcinomas. Gastroenterol Rev 2018;13:76–9.
31. Shao YW, Wood GA, Lu J, Tang QL, Liu J, Molyneux S, et al. Cross-species genomics identifies DLG2 as a tumor suppressor in osteosarcoma. Oncogene 2019;38:291–8.
32. Reiter JG, Makohon-Moore AP, Gerold JM, Heyde A, Attiyeh MA, Kohutek ZA, et al. Minimal functional driver gene heterogeneity among untreated metastases. Science 2018;361:1033–7.
33. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, et al. Signatures of mutational processes in human cancer. Nature 2013;500:415–21.
34. Helleday T, Eshtad S, Nik-Zainal S. Mechanisms underlying mutational signatures in human cancers. Nat Rev Genet 2014;15:585–98.
35. Boot A, Huang MN, Ng AWT, Ho SC, Lim JQ, Kawakami Y, et al. In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors. Genome Res 2018;28:654–65.
36. Liu D, Abbosh P, Keliher D, Reardon B, Miao D, Mouw K, et al. Mutational patterns in chemotherapy resistant muscle-invasive bladder cancer. Nat Commun 2017;8:2193.
37. Gröbner SN, Worst BC, Weischenfeldt J, Buchhalter I, Kleinheinz K, Rudneva VA, et al. The landscape of genomic alterations across childhood cancers. Nature 2018;555:321–7.
38. Huang X, Okafuji M, Traganos F, Luther E, Holden E, Darzynkiewicz Z. Assessment of histone H2AX phosphorylation induced by DNA topoisomerase I and II inhibitors topotecan and mitoxantrone and by the DNA cross-linking agent cisplatin. Cytometry 2004;58A:99–110.
39. Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-Zainal S, et al. Clock-like mutational processes in human somatic cells. Nat Genet 2015;47:1402–7.
40. Kovac M, Blattmann C, Ribi S, Smida J, Mueller NS, Engert F, et al. Exome sequencing of osteosarcoma reveals mutation signatures reminiscent of BRCA deficiency. Nat Commun 2015;6:8940.
41. Bakhoum SF, Cantley LC. The multifaceted role of chromosomal instability in cancer and its microenvironment. Cell 2018;174:1347–60.
42. Ewig RA, Kohn KW. DNA damage and repair in mouse leukemia L1210 cells treated with nitrogen mustard, 1,3-bis(2-chloroethyl)-1-nitrosourea, and other nitrosoureas. Cancer Res 1977;37:2114–22.
43. Harris CC. A delayed complication of cancer therapy - cancer. J Natl Cancer Inst 1979;63:275–7.
44. Zaky W, Dhall G, Ji L, Haley K, Allen J, Atlas M, et al. Intensive induction chemotherapy followed by myeloablative chemotherapy with autologous hematopoietic progenitor cell rescue for young children newly-diagnosed with central nervous system atypical teratoid/rhabdoid tumors: The head start III experience. Pediatr Blood Cancer 2014;61:95–101.
45. Patch AM, Christie EL, Etemadmoghadam D, Garsed DW, George J, Fereday S, et al. Whole-genome characterization of chemoresistant ovarian cancer. Nature 2015;521:489–94.
46. Hoadley KA, Siegel MB, Kanchi KL, Miller CA, Ding L, Zhao W, et al. Tumor evolution in two patients with basal-like breast cancer: a retrospective genomics study of multiple metastases. PLOS Med 2016;13:e1002174.
47. Kim MY, Oskarsson T, Acharyya S, Nguyen DX, Zhang XH, Norton L, et al. Tumor self-seeding by circulating cancer cells. Cell 2009;139:1315–26.
48. Chaffer CL, Weinberg RA. A perspective on cancer cell metastasis. Science 2011;331:1559–64.
49. Bhatia S, Robison LL, Oberlin O, Greenberg M, Bunin G, Fossati-Bellani F, et al. Breast cancer and other second neoplasms after childhood Hodgkin's disease. N Engl J Med 1996;334:745–51.
50. Toguchida J, Yamaguchi T, Dayton SH, Beauchamp RL, Herrera GE, Ishizaki K, et al. Prevalence and spectrum of germline mutations of the p53 gene among patients with sarcoma. N Engl J Med 1992;326:1301–8.


CANCER EPIDEMIOLOGY, BIOMARKERS & PREVENTION | RESEARCH ARTICLE

Genetic and Circulating Biomarker Data Improve Risk Prediction for Pancreatic Cancer in the General Population

Jihye Kim1,2, Chen Yuan3, Ana Babic3, Ying Bao4, Clary B. Clish5, Michael N. Pollak6, Laufey T. Amundadottir7, Alison P. Klein8,9, Rachael Z. Stolzenberg-Solomon10, Pari V. Pandharipande11, Lauren K. Brais3, Marisa W. Welch3, Kimmie Ng3, Edward L. Giovannucci2,4,12, Howard D. Sesso2,13, JoAnn E. Manson2,13, Meir J. Stampfer2,4,12, Charles S. Fuchs14,15,16, Brian M. Wolpin3, and Peter Kraft1,2

ABSTRACT

Background: Pancreatic cancer is the third leading cause of cancer death in the United States, and 80% of patients present with advanced, incurable disease. Risk markers for pancreatic cancer have been characterized, but combined models are not used clinically to identify individuals at high risk for the disease.

Methods: Within a nested case–control study of 500 pancreatic cancer cases diagnosed after blood collection and 1,091 matched controls enrolled in four U.S. prospective cohorts, we characterized absolute risk models that included clinical factors (e.g., body mass index, history of diabetes), germline genetic polymorphisms, and circulating biomarkers.

Results: Model discrimination showed an area under ROC curve of 0.62 via cross-validation. Our final integrated model identified 3.7% of men and 2.6% of women who had at least 3 times greater than average risk in the ensuing 10 years. Individuals within the top risk percentile had a 4% risk of developing pancreatic cancer by age 80 years and 2% 10-year risk at age 70 years.

Conclusions: Risk models that include established clinical, genetic, and circulating factors improved disease discrimination over models using clinical factors alone.

Impact: Absolute risk models for pancreatic cancer may help identify individuals in the general population appropriate for disease interception.

Introduction

Pancreatic cancer is the third leading cause of cancer-related mortality in the United States (1). Incidence rates of pancreatic cancer continue to rise, and 56,770 new cases are expected in 2019, such that pancreatic cancer is projected to become the second leading cause of cancer death in the United States within the next 10 years (2). The high mortality from pancreatic cancer is due in large part to late diagnosis, as nearly 80% of patients present with locally advanced or metastatic disease that is incurable (3). In contrast, patients diagnosed with localized, early stage pancreatic cancer can be cured using a combination of surgery, chemotherapy, and radiotherapy (4). Thus, identifying individuals at high risk of pancreatic cancer is of great importance, so that appropriate patients can be targeted for cancer prevention and earlier diagnosis.

Epidemiologic studies from numerous distinct populations have identified demographic, lifestyle, and clinical factors associated with increased risk of pancreatic cancer. Firmly established risk factors include older age, male gender, African-American race/ethnicity, cigarette smoking, obesity, family history of pancreatic cancer, history of diabetes mellitus, and history of chronic pancreatitis (5–7). In nested prospective studies, future pancreatic cancer risk has been associated with circulating levels of several biomarkers related to insulin resistance [insulin, proinsulin, hemoglobin A1c (8–10), insulin-like growth factor binding protein 1 (11), and 25-hydroxyvitamin D (12)], adipokines [adiponectin (13, 14) and leptin (15, 16)], inflammation (IL6; ref. 17), and peripheral tissue catabolism [branched chain amino acids (BCAA; refs. 18–20)].

Inherited genetic variants have been identified that predispose todevelopment of pancreatic cancer. Medium- to high-penetrance

1Program in Genetic Epidemiology and Statistical Genetics, Department of Epide-miology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts.2Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston,Massachusetts. 3Department of Medical Oncology, Dana-Farber Cancer Institute,Boston, Massachusetts. 4Channing Division of Network Medicine, Department ofMedicine, Brigham and Women's Hospital and Harvard Medical School, Boston,Massachusetts. 5Broad Institute of Massachusetts Institute of Technology andHarvard University, Cambridge, Massachusetts. 6Cancer Prevention Research Unit,Department of Oncology, Faculty ofMedicine, McGill University, Montreal, Quebec,Canada. 7Laboratory of Translational Genomics, Division of Cancer Epidemiologyand Genetics, National Cancer Institute, National Institutes of Health, Bethesda,Maryland. 8Department of Oncology, Sidney Kimmel ComprehensiveCancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland.9Department of Pathology,SolGoldmanPancreatic CancerResearchCenter, JohnsHopkins School of Medicine, Baltimore, Maryland. 10Division of Cancer Epidemiol-ogy andGenetics, National Cancer Institute,National InstitutesofHealth, Bethesda,Maryland. 11Department of Radiology and Institute for Technology Assessment,Massachusetts General Hospital, Boston,Massachusetts. 12Department ofNutrition,Harvard T.H. Chan School of Public Health, Boston, Massachusetts. 13Division of

PreventionMedicine, Department of Medicine, BrighamandWomen's Hospital andHarvard Medical School, Boston, Massachusetts. 14Department of Medical Oncol-ogy, Yale Cancer Center, New Haven, Connecticut. 15Department of Medicine, YaleSchool of Medicine, New Haven, Connecticut. 16Department of Medical Oncology,Smilow Cancer Hospital, New Haven, Connecticut.

Note: Supplementary data for this article are available at Cancer Epidemiology,Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

B.M. Wolpin and P. Kraft are co-senior authors of this article.

Corresponding Authors: Peter Kraft, Harvard T.H. Chan School of Public Health,655 Huntington Avenue, Boston, MA 02115. Phone: 617-432-4271; Fax: 617-432-1722; E-mail: [email protected]; and Brian M. Wolpin, Dana-FarberCancer Institute, 450 Brookline Avenue, Boston, MA 02215. Phone: 617-632-6942; Fax: 617-632-5370; E-mail: [email protected]

Cancer Epidemiol Biomarkers Prev 2020;29:999–1008

doi: 10.1158/1055-9965.EPI-19-1389

©2020 American Association for Cancer Research.

AACRJournals.org | 999


alterations have been found in several genes (e.g., ATM, BRCA1, BRCA2, CDKN2A, and PALB2), but these alterations are present in only 5% to 10% of patients with pancreatic cancer (21–23). Therefore, these gene mutations explain only a small fraction of the genetic risk for pancreatic cancer in the general population (24). To identify common susceptibility loci, six large genome-wide association studies (GWAS) have been conducted in populations of European ancestry (25–30). To date, 18 susceptibility loci carrying 22 independent SNPs have been identified surpassing the genome-wide significance threshold (P < 5 × 10⁻⁸).

Although risk factors have been investigated individually (31–33), their joint contribution to risk discrimination remains largely unknown. Therefore, we examined absolute risk models for pancreatic cancer that incorporate established clinical factors, common genetic predisposition variants, and circulating biomarkers in four large prospective cohorts. To estimate lifetime risk and 10-year risk, models were evaluated for the full nested case–control population and cases diagnosed within 10 years of blood collection and their matched controls, respectively.

Materials and Methods

Study population

This study included participants from four large prospective cohort studies, the Health Professionals Follow-up Study (HPFS), Nurses' Health Study (NHS), Physicians' Health Study I (PHS I), and Women's Health Initiative (WHI) Observational Study. HPFS began enrollment of 51,529 male health professionals ages 40 to 75 years in 1986 (34). In the NHS cohort, 121,701 female nurses ages 30 to 55 years began enrollment in 1976 (35). PHS I was a randomized clinical trial initiated in 1982 to examine effects of aspirin and β-carotene among 22,071 healthy male physicians ages 40 to 84 years. After trial completion in 1995, PHS I participants were followed up in an observational cohort (36). In the WHI, 93,726 women ages 50 to 79 years enrolled between 1994 and 1998 to examine potential risk factors and causes of morbidity and mortality among postmenopausal women (37).

In this study, cases were incident patients with primary pancreatic adenocarcinoma ascertained between 1984 and 2010 through self-report, report of next-of-kin, or death certificates and confirmed by medical record review and tumor registry data. All cases provided blood samples prior to their pancreatic cancer diagnosis, and we randomly selected controls with matching on cohort (which also matches on sex), year of birth, smoking status, fasting status, and time of blood collection (month and year) with a matching ratio of 1:2 or 1:3. We excluded nonwhite participants, as GWAS risk variants were identified in subjects of European ancestry, and the strength of their association with pancreatic cancer in other populations requires further study. We also excluded participants who had completely missing data for questionnaires or blood samples or did not have matched counterparts. This study was approved by the Human Research Committee at Brigham and Women's Hospital (Boston, MA), and participants of each cohort provided informed consent.

Lifestyle and clinical characteristics

Data on individual characteristics, such as lifestyles and medical history, were obtained by self-report on questionnaires, as previously reported (34–37). We used study questionnaires completed at or just prior to the blood draw in HPFS and NHS and the baseline questionnaire in PHS and WHI cohorts to collect data for age, sex, body mass index (BMI; kg/m²), waist-to-hip ratio (WHR; inch/inch), physical activity (measured by MET-hour per week), and history of diabetes. Because WHR data were completely missing at baseline in PHS, we incorporated postbaseline questionnaire data obtained at 108 months of follow-up in the cohort. We included WHR data of PHS only for cases (and matched controls) who were diagnosed after 108 months.

Blood collection and plasma assays

Blood samples were collected from 18,225 men in HPFS (1993–1995), 14,916 men in PHS I (1982–1984), 32,826 women in NHS (1989–1990), and 93,676 women in WHI (1994–1998). Details for blood processing and storage have been described previously (18). All cases and matched controls in our study provided blood samples prior to the case's diagnosis. Circulating levels of proinsulin (pmol/L), adiponectin (μg/mL), IL6 (pg/mL; ref. 38), and BCAAs (μmol/L) were measured and represent four major categories of circulating markers related to pancreatic cancer risk (insulin resistance, adipokines, inflammation, and peripheral tissue catabolism, respectively). We dichotomized circulating adiponectin with a cutoff of 4.4 μg/mL as done previously (13). Details for laboratory assays and coefficients of variation (CV) have been previously reported (8, 11–13, 15, 18). CVs for blinded pooled plasma samples for all circulating markers were <11%.

DNA sequencing and SNP selection

Genomic DNA was extracted from peripheral blood leucocytes of cohort participants. Details on genotyping, variant imputation, and quality control procedures have been previously reported (30). From the PanScan and PanC4 consortia GWAS (27–30), we included 22 SNPs that were associated with the risk of pancreatic cancer at genome-wide significance level (P < 5 × 10⁻⁸): rs13303010, rs10919791, rs2816938, rs1486134, rs9854771, rs2736098, rs31490, rs35226131, rs78417682, rs17688601, rs6971499, rs2941471, rs10094872, rs1561927, rs687289, rs9581943, rs9543325, rs7190458, rs4795218, rs11655237, rs1517037, and rs16986825 (Supplementary Table S1). SNP data were unavailable for participants not included in the consortia GWAS and were predominantly matched controls (Supplementary Table S2). Because of the missing SNP data, we imputed genotypes by randomly sampling from observed genotypes with replacement, conditional on study and case–control status. We calculated a weighted genetic risk score (wGRS) as the weighted sum of risk alleles using weights determined by the log ORs reported in the PanScan and PanC4 consortia GWAS (27–30).
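As a rough illustration of the weighted score described above, the following Python sketch sums risk-allele counts weighted by log ORs. The two per-allele ORs are placeholder values invented for this example, not the consortium estimates, and the function name `weighted_grs` is hypothetical.

```python
import math

# Hypothetical per-allele ORs, for illustration only. The study took its
# weights from the PanScan and PanC4 GWAS (Supplementary Table S1).
EXAMPLE_ODDS_RATIOS = {"rs687289": 1.25, "rs9543325": 1.20}


def weighted_grs(risk_allele_counts, odds_ratios):
    """Weighted sum of risk-allele counts, each weighted by log(OR)."""
    score = 0.0
    for snp, count in risk_allele_counts.items():
        if not 0 <= count <= 2:
            raise ValueError(f"{snp}: risk-allele count must lie in [0, 2]")
        score += math.log(odds_ratios[snp]) * count
    return score


# One person carrying two risk alleles at the first SNP and one at the second.
person = {"rs687289": 2, "rs9543325": 1}
print(round(weighted_grs(person, EXAMPLE_ODDS_RATIOS), 4))
```

An unweighted GRS, as reported alongside the wGRS in Table 1, corresponds to setting every weight to 1.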

Statistical analysis

To compare risk factor characteristics, we tabulated frequencies and distributions between cases and matched controls in cohort-specific and pooled analyses. We pooled data across the four cohorts; there was no evidence of substantial effect heterogeneity across the cohorts for most risk factors (P > 0.05). Missing proportions of the non-genetic risk factors ranged from 0.01 to 0.20. To minimize the effects of missing data, we used conditional mean imputation: we replaced missing values with the average value of each variable for each individual from the 25 imputed datasets generated using Multivariate Imputation by Chained Equations. All continuous variables were standardized with a mean of 0 and SD of 1 in each cohort.
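The two preprocessing steps in this paragraph can be sketched as follows, assuming the imputed values have already been produced by some chained-equations procedure. The function names and toy numbers are hypothetical, and the use of the sample SD (rather than the population SD) is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev


def conditional_mean_impute(imputed_values):
    """Replace one missing entry by the mean of its imputed values
    (25 values per entry in the study; fewer here for brevity)."""
    return mean(imputed_values)


def standardize_within_cohort(values_by_cohort):
    """Z-score each cohort separately so every cohort has mean 0 and SD 1.
    Assumption: sample SD (n - 1 denominator)."""
    standardized = {}
    for cohort, values in values_by_cohort.items():
        m, sd = mean(values), stdev(values)
        standardized[cohort] = [(v - m) / sd for v in values]
    return standardized


# Toy BMI values; the last HPFS entry was missing and is filled in first.
hpfs_bmi = [24.1, 27.3, 25.0, conditional_mean_impute([26.0, 25.4, 26.6])]
nhs_bmi = [22.5, 30.1, 26.4, 24.8]
z = standardize_within_cohort({"HPFS": hpfs_bmi, "NHS": nhs_bmi})
print([round(v, 2) for v in z["HPFS"]])
```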

We first examined the associations between risk factors and pancreatic cancer in pooled univariable analyses using conditional logistic regression. Using multivariable conditional logistic regression, we then built three relative risk models for men and women separately including the following covariates: the first (“clinical model”) with BMI, WHR, MET-hour/week of physical activity, and history of diabetes (yes or no); the second (“clinical/genetic model”) added the wGRS to

Kim et al.

Cancer Epidemiol Biomarkers Prev; 29(5) May 2020 CANCER EPIDEMIOLOGY, BIOMARKERS & PREVENTION1000


the clinical model; and the third (“clinical/genetic/biomarker model”) added proinsulin, adiponectin, IL6, and total BCAAs to the clinical/genetic model. We compared goodness of fit of the three models using the likelihood ratio test. Risk models were built for all participants in the full follow-up population (maximum 26 years between data/blood collection and case diagnosis) and limited to cases diagnosed within 10 years of data/blood collection and their matched controls to allow evaluation of “lifetime” and 10-year absolute risks, respectively.
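For readers unfamiliar with the likelihood ratio test for nested models, a minimal Python sketch follows. The statistic is twice the gain in maximized log-likelihood, referred to a chi-square distribution whose degrees of freedom equal the number of added parameters (1 when only the wGRS is added). The log-likelihood values below are made up, and the closed-form tail probability covers only 1 or an even number of degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, for df = 1 or even df."""
    if x < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j  # (x/2)^j / j!
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("only df = 1 or even df are handled in this sketch")


def likelihood_ratio_test(loglik_reduced, loglik_full, added_params):
    """Return (statistic, P value) for a reduced model nested in a full model."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, added_params)


# Hypothetical log-likelihoods for a model without and with one extra covariate.
stat, p_value = likelihood_ratio_test(-520.0, -505.0, added_params=1)
print(stat, p_value)
```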

Model discrimination was assessed using the area under ROC curve analyses. To validate the discriminative performance of each model, we performed a 5-fold cross-validation, leaving out a randomly selected 20% of the data as a validation dataset and using all the remaining data as a training dataset in our cohort data. Specifically, we randomly partitioned matched case–control sets into five equally sized disjoint subsets, withheld each of the partitions in turn as a testing set, trained the models in the remaining data, and evaluated the area under the ROC curve (AUC) of the fit model in the testing set. We repeated this process over 20 different random partitions. We then calculated the average of AUC for each relative risk model over the resulting 100 test sets as a representative AUC of each model. We restricted validation samples to cases diagnosed within 10 years of blood collection and their matched controls because of the differences in the follow-up time across the four cohorts.
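The partitioning scheme in this paragraph can be sketched in Python as below: whole matched sets are assigned to folds, each fold is held out once, and the procedure is repeated over 20 random partitions to give 100 test-set AUCs. The toy data and the `fit_identity` stand-in are assumptions made for the example; the study fit conditional logistic regression models, which this sketch does not attempt.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def repeated_grouped_cv(records, fit, n_folds=5, n_repeats=20, seed=0):
    """records: list of (matched_set_id, features, label).
    fit(train_records) returns a function mapping features to a risk score.
    Matched sets are kept intact, so a case and its controls share a fold."""
    rng = random.Random(seed)
    set_ids = sorted({r[0] for r in records})
    aucs = []
    for _ in range(n_repeats):
        shuffled = set_ids[:]
        rng.shuffle(shuffled)
        folds = [set(shuffled[i::n_folds]) for i in range(n_folds)]
        for test_ids in folds:
            train = [r for r in records if r[0] not in test_ids]
            test = [r for r in records if r[0] in test_ids]
            score = fit(train)
            aucs.append(auc([score(x) for _, x, _ in test],
                            [y for _, _, y in test]))
    return mean(aucs), len(aucs)


# Toy data: 50 matched sets, each with one case and two controls.
toy_rng = random.Random(1)
records = []
for set_id in range(50):
    records.append((set_id, toy_rng.gauss(0.5, 1.0), 1))
    records.append((set_id, toy_rng.gauss(0.0, 1.0), 0))
    records.append((set_id, toy_rng.gauss(0.0, 1.0), 0))


def fit_identity(train_records):
    # Stand-in for model fitting: ignores the training data and scores
    # each subject by the raw feature value.
    return lambda x: x


mean_auc, n_test_sets = repeated_grouped_cv(records, fit_identity)
print(n_test_sets, round(mean_auc, 3))
```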

To calculate absolute risk for pancreatic cancer, we combined the multivariable relative risk models fit in our data with age- and sex-specific U.S. pancreatic cancer incidence rates, mortality rates, and the joint distribution of risk factors among U.S. non-Hispanic whites (39, 40). We included the effects of smoking and family history on pancreatic cancer risk in our absolute risk models using covariate-adjusted relative risks for these factors taken from the literature (41, 42).

To estimate the joint distribution of pancreatic cancer risk factors among U.S. non-Hispanic whites, we simulated 20,000 men and 20,000 women by first sampling smoking status based on the prevalence of smoking among white men and women (20.4% and 15.8%, respectively) in the U.S. general population (age-adjusted distributions for adults ages 18 and over from the National Health Interview Survey data, 2011–2014; ref. 41). We then sampled remaining clinical, genetic, and biomarker risk factors (except family history) by drawing a risk factor profile at random (with replacement) from male controls and female controls separately, conditional on smoking status. Finally, we sampled family history conditional on polygenic risk score, the sum of risk alleles of SNPs associated with pancreatic cancer, assuming the population prevalence of positive family history of pancreatic cancer is 3.6% (42).
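A minimal Python sketch of the first two sampling steps, for one sex, is shown below. The control profiles are invented toy records and `simulate_population` is a hypothetical name. The final step (family history conditional on polygenic risk score) is omitted because the text does not specify the conditional model.

```python
import random


def simulate_population(controls, smoking_prevalence, n, seed=0):
    """Draw n simulated subjects: sample smoking status from the population
    prevalence, then draw a control's risk profile at random (with
    replacement) from the controls sharing that smoking status."""
    rng = random.Random(seed)
    by_smoking = {
        True: [c for c in controls if c["smoker"]],
        False: [c for c in controls if not c["smoker"]],
    }
    simulated = []
    for _ in range(n):
        smoker = rng.random() < smoking_prevalence
        simulated.append(dict(rng.choice(by_smoking[smoker])))
    return simulated


# Toy male control profiles (standardized values are made up).
male_controls = [
    {"smoker": True, "bmi_z": 0.3, "wgrs_z": -0.4},
    {"smoker": True, "bmi_z": -0.1, "wgrs_z": 0.9},
    {"smoker": False, "bmi_z": 0.0, "wgrs_z": 0.2},
    {"smoker": False, "bmi_z": 1.2, "wgrs_z": -1.1},
]

# 20.4% is the smoking prevalence among white men quoted in the text.
men = simulate_population(male_controls, smoking_prevalence=0.204, n=20000)
print(sum(p["smoker"] for p in men) / len(men))
```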

Then we calculated individualized relative risk for each simulated subject on the basis of personal risk profile as follows:

$$\mathrm{RR}(X) = \exp\left\{ \sum_{i=1}^{k} \beta_i^{T} X_i \right\}$$

where $X_1, X_2, \ldots, X_k$ are an individual's risk factor values and $\beta_1, \beta_2, \ldots, \beta_k$ are the log ORs for the risk factors in our risk models and literature estimates for current smoking and family history of pancreatic cancer (5, 42).
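The relative-risk formula above amounts to exponentiating a weighted sum of risk factor values. The short Python sketch below uses three ORs copied from Table 3 (full follow-up, clinical/genetic/biomarker model) purely as illustrative coefficients; applying them to a single made-up profile, and leaving out the remaining covariates, are simplifications for the example.

```python
import math

# ORs per 1-SD increase, taken from Table 3 for illustration only.
ODDS_RATIOS = {"wGRS": 1.36, "proinsulin": 1.16, "total_BCAAs": 1.25}


def relative_risk(profile, odds_ratios):
    """RR(X) = exp(sum_i beta_i * X_i), with beta_i = log(OR_i)."""
    return math.exp(sum(math.log(odds_ratios[name]) * value
                        for name, value in profile.items()))


# A subject 1 SD above the cohort mean on each factor.
subject = {"wGRS": 1.0, "proinsulin": 1.0, "total_BCAAs": 1.0}
print(round(relative_risk(subject, ODDS_RATIOS), 3))
```

A subject at the cohort mean on every standardized factor has RR = 1 under this parameterization.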

We calculated absolute risks of pancreatic cancer by combining the estimated relative risk with age- and sex-specific average incidence rates for non-Hispanic whites in U.S. Surveillance, Epidemiology, and End Results (SEER) 17 from 2001 to 2005 (http://seer.cancer.gov/) and competing mortality risks obtained from U.S. mortality data of white men and women in 2007 (43). Using these data, we converted relative risks to absolute risks (p) as follows (39):

$$p(a, s) = \sum_{t=a+1}^{s-1} F(t)\,\mathrm{RR}(x)\,\lambda_0(t).$$

Here $p(a, s)$ denotes the probability that a subject who is pancreatic cancer-free at age $a$ will be diagnosed with pancreatic cancer before age $s$, where $F(t) = \exp\left\{ -\sum_{x=a}^{t} \left[ \mathrm{RR}(x)\,\lambda_0(t) + \mu_0(t) \right] \right\}$ is the probability of survival until age $t$, RR is relative risk with the given risk factors, $\lambda_0$ is baseline incidence of pancreatic cancer at age $t$ from the SEER data, and $\mu_0$ is the competing mortality risk at age $t$. We calculated the baseline incidence $\lambda_0(t)$ separately in men and women by dividing the age-specific SEER incidence rates $\lambda(t)$ by the average RR in the simulated cohort. We calculated 10-year absolute risks [i.e., $p(a, a+10)$ for different reference age $a$] and cumulative absolute risks [defined as $p(50, 80)$] by categories of risk percentile (10th to 99th percentiles). All P values were two-sided, and statistical analyses were performed using SAS (version 9.4; SAS Institute Inc.) and R.
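To make the absolute-risk conversion concrete, here is a small Python sketch. It reads the formula as follows, which is an interpretation rather than the authors' code: the individual's RR is constant over age, and the survival term accumulates the disease and competing-mortality hazards over every age from a through t. The annual rates are flat placeholder values, not SEER or U.S. mortality data.

```python
import math


def baseline_incidence(population_incidence, mean_rr):
    """lambda_0(t) = lambda(t) / (average RR in the simulated cohort)."""
    return {age: rate / mean_rr for age, rate in population_incidence.items()}


def absolute_risk(a, s, rr, incidence, mortality):
    """p(a, s) = sum_{t=a+1}^{s-1} F(t) * rr * incidence[t], where
    F(t) = exp(-sum_{u=a}^{t} (rr * incidence[u] + mortality[u]))."""
    cumulative_hazard = rr * incidence[a] + mortality[a]
    p = 0.0
    for t in range(a + 1, s):
        cumulative_hazard += rr * incidence[t] + mortality[t]
        p += math.exp(-cumulative_hazard) * rr * incidence[t]
    return p


# Placeholder annual rates for ages 50-80 (hypothetical, constant over age).
ages = range(50, 81)
population_rate = {age: 0.0005 for age in ages}
competing_mortality = {age: 0.02 for age in ages}
lambda0 = baseline_incidence(population_rate, mean_rr=1.1)

for rr in (1.0, 3.0):
    print(rr, absolute_risk(50, 80, rr, lambda0, competing_mortality))
```

Under this reading, tripling the RR slightly less than triples the absolute risk, because higher-risk subjects leave the at-risk pool sooner.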

Results

Our analysis dataset included 500 pancreatic cancer cases and 1,091 matched controls from four prospective cohorts (Table 1; Supplementary Table S3; and Materials and Methods). In univariable analysis among the full population, we found that increased risk of pancreatic cancer was significantly associated (P < 0.05) with higher BMI and WHR, history of diabetes, higher levels of circulating proinsulin, IL6, and total BCAAs, lower levels of circulating adiponectin, and higher wGRS of 22 known common susceptibility variants for pancreatic cancer (Table 2; Supplementary Table S4). When we restricted our population to cases diagnosed with pancreatic cancer in the 0 to 10 years after blood collection and their matched controls, physical activity became a significant risk factor, and BMI and WHR were no longer significant (Table 2).

We evaluated three prespecified multivariable-adjusted risk models that included clinical variables only, clinical variables plus the wGRS, and clinical variables plus the wGRS and circulating biomarkers (Table 3). We could not include smoking status or family history of pancreatic cancer, two important pancreatic cancer risk factors, as our cases and controls were matched on smoking status and family history information was missing in 58% of subjects. We included external risk estimates for smoking and family history in our final absolute risk model.

In the full population, model fit was improved with addition of the wGRS (P = 3.24 × 10⁻⁸) to the clinical model and with the addition of circulating biomarkers (P = 6.03 × 10⁻⁵) to the model with clinical variables and the wGRS (Table 3). Also, we found a significant improvement of model fit by adding circulating biomarkers only to the clinical model (P = 2.10 × 10⁻⁵; Supplementary Table S5). For the cases diagnosed within 10 years of covariate data and blood collection and their matched controls, model fit was improved with addition of the wGRS (P = 2.91 × 10⁻⁷) and the circulating biomarkers (P = 2.92 × 10⁻³) to the clinical model (Table 3). We also observed that model fit was improved by addition of the circulating biomarkers only to the clinical model (P = 1.05 × 10⁻³; Supplementary Table S5).

Model discrimination was evaluated before and after cross-validation among the 10-year follow-up population (Fig. 1). The average AUC estimated by cross-validation was 0.55 for the clinical model, 0.61 for the clinical/genetic model, and 0.62 for the clinical/genetic/biomarker model. Figure 2 shows the population distribution of pancreatic cancer relative risk among U.S. non-Hispanic white men

Absolute Risk for Pancreatic Cancer


and women by plotting the relative risk (y axis) as a function of risk percentile based on three risk models (x axis). These models also incorporate the effects and prevalence of smoking and family history of pancreatic cancer using external risk estimates (5, 42). The risk models identified a subset of men and women at ≥3-fold higher risk for pancreatic cancer than the average risk of men and women in the general population. For instance, the clinical model identified 0.2% of men and 1.5% of women at ≥3-fold risk of pancreatic cancer during the full follow-up period, and the clinical/genetic/biomarker model additionally identified 1.8% of men and 0.7% of women (i.e., 2.0% of men and 2.3% of women at ≥3-fold risk of pancreatic cancer during the full follow-up period). When restricting the follow-up time to 0 to 10 years, the clinical/genetic/biomarker model identified 3.7% of men and 2.6% of women at ≥3-fold risk for pancreatic cancer over the ensuing 10 years.

We estimated cumulative absolute risk and 10-year risk of pancreatic cancer using the clinical/genetic/biomarker model. We plotted absolute risks (y axis) by the range of age (x axis) between 50 and 80 years for cumulative absolute risk and between 50 and 70 years for 10-year absolute risk, stratified by risk percentiles (Fig. 3). For cumulative absolute risk, the 10th and 99th risk percentiles showed 0.4% and 3.8% probabilities of developing pancreatic cancer by age 80 years among men. Among women, the corresponding probabilities were 0.4% and 3.6% by age 80 years. The probability of developing pancreatic cancer in the next 10 years among cancer-free 70-year-old individuals was 0.2% at the 10th percentile in both men and women

Table 2. Univariable ORs and 95% CIs for susceptibility factors and future pancreatic cancer risk.

| Variables | Full population (n = 1,591), OR (95% CI) | 0–10-year population (n = 956), OR (95% CI) |
|---|---|---|
| Lifestyle and clinical factors | | |
| BMI^a | 1.14 (1.03–1.27) | 1.12 (0.98–1.27) |
| WHR^a | 1.18 (1.06–1.31) | 1.13 (0.99–1.29) |
| Physical activity^a | 0.94 (0.85–1.05) | 0.85 (0.73–0.99) |
| Diagnosed diabetes (yes) | 2.36 (1.32–4.21) | 2.42 (1.20–4.89) |
| Circulating biomarkers | | |
| Proinsulin^a | 1.27 (1.14–1.42) | 1.21 (1.07–1.38) |
| Adiponectin (≥4.4 μg/mL) | 0.62 (0.48–0.80) | 0.57 (0.40–0.80) |
| IL6^a | 1.13 (1.02–1.25) | 1.16 (1.02–1.33) |
| Total BCAAs^a | 1.46 (1.23–1.74) | 1.43 (1.18–1.74) |
| Genetic risk factor | | |
| wGRS^a | 1.37 (1.23–1.53) | 1.46 (1.27–1.68) |

^a Standardized variables with mean = 0 and SD = 1 within each cohort.

Table 1. Characteristics of pancreatic cancer cases and matched controls.

| Variables | Full population (n = 1,591): Cases (n = 500) | Full population: Controls (n = 1,091) | 0–10-year population^a (n = 956): Cases (n = 304) | 0–10-year population^a: Controls (n = 652) |
|---|---|---|---|---|
| Matching factors | | | | |
| Age, mean (SD), year | 63.19 (8.30) | 62.67 (8.31) | 65.93 (7.59) | 65.54 (7.55) |
| Gender, n (%) | | | | |
| Male | 173 (34.60) | 358 (32.81) | 82 (26.97) | 187 (28.68) |
| Female | 327 (65.40) | 733 (67.19) | 222 (73.03) | 465 (71.32) |
| Cohort, n (%) | | | | |
| HPFS | 83 (16.60) | 195 (17.87) | 58 (19.08) | 145 (22.24) |
| NHS | 147 (29.40) | 396 (36.30) | 48 (15.79) | 140 (21.47) |
| PHS | 90 (18.00) | 163 (14.94) | 24 (7.89) | 42 (6.44) |
| WHI | 180 (36.00) | 337 (30.89) | 174 (57.24) | 325 (49.85) |
| Smoking, n (%) | | | | |
| Current smoker | 64 (12.90) | 135 (12.45) | 37 (12.29) | 76 (11.76) |
| Noncurrent smoker | 432 (87.10) | 949 (87.55) | 264 (87.71) | 570 (88.24) |
| Fasting status at blood collection, n (%) | | | | |
| Fasted <8 hours | 142 (28.40) | 290 (26.58) | 48 (15.79) | 118 (18.10) |
| Fasted ≥8 hours | 358 (71.60) | 801 (73.42) | 256 (84.21) | 534 (81.90) |
| Lifestyle and clinical factors | | | | |
| BMI, mean (SD), kg/m² | 26.30 (5.03) | 25.70 (4.33) | 26.60 (5.50) | 26.00 (4.63) |
| WHR, mean (SD), inch/inch | 0.85 (0.11) | 0.84 (0.10) | 0.84 (0.10) | 0.83 (0.09) |
| Physical activity, mean (SD), MET-hour/week | 20.10 (32.80) | 20.40 (25.80) | 17.70 (24.10) | 21.50 (29.00) |
| Diagnosed diabetes (yes), n (%) | 29 (5.80) | 33 (3.02) | 21 (6.91) | 24 (3.68) |
| Circulating biomarkers | | | | |
| Proinsulin, mean (SD), pmol/L | 16.10 (18.70) | 12.90 (19.30) | 15.70 (19.00) | 12.90 (19.30) |
| Adiponectin (≥4.4 μg/mL), n (%) | 301 (71.84) | 743 (81.20) | 219 (74.74) | 524 (83.71) |
| IL6, mean (SD), pg/mL | 2.38 (4.20) | 1.96 (3.36) | 2.60 (4.72) | 2.00 (3.03) |
| Total BCAAs, mean (SD), μmol/L | 430.10 (169.89) | 359.05 (200.66) | 437.16 (141.29) | 368.17 (179.37) |
| Genetic risk factors | | | | |
| GRS, mean (SD) | 23.60 (2.75) | 22.90 (2.64) | 23.80 (2.75) | 22.80 (2.68) |
| wGRS,^b mean (SD) | 0.21 (1.01) | −0.10 (0.98) | 0.26 (1.00) | −0.13 (0.99) |

Abbreviations: BCAAs, branched-chain amino acids; GRS, genetic risk score summing the number of risk alleles; HPFS, Health Professionals Follow-up Study; NHS, Nurses' Health Study; PHS, Physicians' Health Study; wGRS, weighted genetic risk score; WHI, Women's Health Initiative.
^a 0–10-year population refers to cases (and their matched controls) diagnosed within 10 years of blood draw.
^b Standardized wGRS with mean = 0 and SD = 1 within each cohort.


and 2.0% and 1.7% at the 99th percentile in men and women,respectively.

Discussion

We developed absolute risk models for pancreatic cancer in the general population, integrating established risk markers for pancreatic cancer, including lifestyle factors, medical comorbidities, common germline variants, and circulating biomarkers. We found that the addition of genetic variants and circulating markers added discriminatory ability beyond clinical factors that could be solicited in a physician's office. The final integrated model identified a subset of approximately 2% of individuals who had 3-fold higher risk than the average in the general U.S. population. Furthermore, the individuals in the top 1% of pancreatic cancer risk as determined by the final integrated model carried a 4% lifetime risk of pancreatic cancer and a 2% 10-year risk at age 70 years.

Screening programs for pancreatic cancer remain early in their development, and the recently updated US Preventive Services Task Force (USPSTF) recommendation for screening of pancreatic cancer reaffirms that potential benefits of screening do not outweigh the potential harm in asymptomatic, average-risk individuals (44). However, the USPSTF also confirms that persons with inherited genetic syndromes or family history are at high risk of the disease, and their recommendation against screening does not apply to the high-risk populations. In the current study, the risks defined here (i.e., ≥3-fold increased RR) are within a range similar to those for patients with germline mutations in genes such as BRCA1, BRCA2, or CDKN2A (e.g., OR = 2.6, 6.2, and 12.3; refs. 23, 45) or patients with affected family members where the disease screening for these specific populations is being studied (46, 47).

We previously used participant data from case–control studies and prospective cohorts in the PanScan consortium to generate a pancreatic cancer risk model based on a small subset of the risk factors included in the current study (31). The available risk factors for the prior model included smoking status, alcohol use, BMI, diabetes history, family history of pancreatic cancer, three common genetic susceptibility variants (at 1q32, 5p15, and 13q22), and ABO genotype. The full model from this prior work had an in-sample AUC of 0.61 [95% confidence interval (CI), 0.58–0.63] and identified 2.9% of men and 2.6% of women who had more than twice the average lifetime risk for pancreatic cancer. In the current study, we improved upon this model by including 18 additional genetic risk variants discovered in subsequent GWAS and several circulating biomarkers, validating our models using cross-validation. Importantly, because all our subjects were enrolled in prospective cohorts, all risk factor data and circulating markers were measured before the cases' diagnosis of pancreatic cancer. This design faithfully recapitulates the situation faced by primary care physicians, where decisions related to disease screening are made in the prediagnostic setting using data collected in the several years prior to cancer diagnosis.

A prior case–control study developed a risk prediction model for pancreatic cancer that included current smoking, recent diagnosis of diabetes or pancreatitis, ABO blood type, Jewish ancestry, and use of a

Table 3. Estimated ORs and 95% CIs from multivariable^a risk models for pancreatic cancer.

| | Clinical model | Clinical/genetic model | Clinical/genetic/biomarker model |
|---|---|---|---|
| Full follow-up period | | | |
| Model comparison (P value^b) | | 3.24 × 10⁻⁸ | 6.03 × 10⁻⁵ |
| Model AUC | 0.61 | 0.65 | 0.67 |
| OR (95% CI) | | | |
| BMI^c | 1.08 (0.97–1.21) | 1.07 (0.95–1.20) | 0.98 (0.86–1.10) |
| WHR^c | 1.13 (1.01–1.26) | 1.12 (1.00–1.26) | 1.08 (0.96–1.21) |
| Physical activity^c | 0.96 (0.86–1.06) | 0.95 (0.85–1.06) | 0.97 (0.86–1.08) |
| Diagnosed diabetes (yes) | 2.10 (1.16–3.79) | 2.19 (1.19–4.02) | 1.70 (0.91–3.19) |
| wGRS^c | | 1.37 (1.22–1.53) | 1.36 (1.21–1.52) |
| Proinsulin^c | | | 1.16 (1.03–1.31) |
| Adiponectin (≥4.4 μg/mL) | | | 0.76 (0.58–0.99) |
| IL6^c | | | 1.10 (0.99–1.23) |
| Total BCAAs^c | | | 1.25 (1.04–1.51) |
| 0–10 years of follow-up period | | | |
| Model comparison (P value^b) | | 2.91 × 10⁻⁷ | 2.92 × 10⁻³ |
| Model AUC | 0.61 | 0.67 | 0.69 |
| OR (95% CI) | | | |
| BMI^c | 1.05 (0.91–1.22) | 1.04 (0.90–1.21) | 0.96 (0.82–1.12) |
| WHR^c | 1.08 (0.93–1.25) | 1.06 (0.91–1.23) | 1.00 (0.86–1.17) |
| Physical activity^c | 0.86 (0.74–1.00) | 0.86 (0.74–1.01) | 0.88 (0.75–1.03) |
| Diagnosed diabetes (yes) | 2.22 (1.09–4.54) | 2.14 (1.02–4.50) | 1.65 (0.77–3.56) |
| wGRS^c | | 1.44 (1.25–1.67) | 1.43 (1.23–1.65) |
| Proinsulin^c | | | 1.10 (0.94–1.27) |
| Adiponectin (≥4.4 μg/mL) | | | 0.70 (0.48–1.02) |
| IL6^c | | | 1.13 (0.98–1.30) |
| Total BCAAs^c | | | 1.24 (1.00–1.54) |

^a Adjusted for matching factors, age, cohort (also gender), race/ethnicity, smoking status, fasting status, and month/year of blood collection.
^b P value was estimated from the likelihood ratio test comparing the clinical/genetic model to the clinical model and the clinical/genetic/biomarker model to the clinical/genetic model.
^c Standardized variables with mean = 0 and SD = 1 within each cohort.


proton pump inhibitor (32). Considering these factors, the investigators identified 0.87% of controls that had 5-year absolute risks of 5% or higher. Although risk estimates were based on a single retrospective case–control study from a limited geographic region and with a small number of pancreatic cancer cases, this work highlights the potential utility of including recent development of conditions such as diabetes and pancreatitis in risk models. Another risk modeling effort has focused specifically on developing prediction models for pancreatic cancer in patients with recently diagnosed diabetes (33, 48, 49). In the general population, 0.5% to 0.85% of patients aged ≥50 years with new-onset diabetes are diagnosed with pancreatic cancer within the ensuing 3 years (50). With further enrichment, this population may constitute a high-risk group worthy of disease screening. Nevertheless, the majority of patients with pancreatic cancer do not develop diabetes in the 3 years before diagnosis, so risk models for the general population will remain necessary to diagnose this disease earlier in most individuals.

The present study has limitations that should be considered. Family history of pancreatic cancer was not collected from most study participants, so the relative risk for family history could not be estimated from our nested case–control data. In addition, because smoking status was a matching factor at the study design stage, we could not estimate the risk of current smoking in our population. However, we used risk estimates for these factors based on the large PanScan consortium dataset to allow for their inclusion in absolute risk models. For some genetic variants, the proportion of controls missing genotype data was larger than for cases. We imputed genotypes of risk SNPs conditional on case–control status to account for the different missing patterns and allele frequencies between cases and controls. Because cohort data were collected prospectively from study participants using mailed questionnaires every 1 to 2 years, we may have missed some recent-onset diabetes diagnoses. As shown in other risk modeling efforts, recent-onset diabetes has predictive ability for pancreatic cancer, and therefore our models might underestimate the risk discrimination capabilities of models that incorporate this risk factor. Although we included participants from four separate large U.S. cohorts and performed cross-validation, we could not examine our risk models (absolute or relative risk models) in an independent prospective dataset, which would further validate our models and provide evidence regarding the generalizability of these models in other populations and settings. Future work for our risk models will therefore be external validation and calibration in independent samples. In particular, because this study did not include nonwhite participants in the current analyses, further studies that include more racially diverse participant populations will be needed to explore the performance of these models in subjects of other racial and ethnic groups.

Our study has multiple important strengths. The evaluation of participants from large prospective cohorts allowed data and blood samples to be collected prediagnostically, minimizing recall bias and the impact of current disease on circulating biomarkers. Our spectrum of pancreatic cancer cases was also less likely to be influenced by survival bias, as participants were identified years before their cancer diagnosis. Our participants were enrolled from across the United States, enhancing the generalizability of our results to the general population, beyond those who sought care at a specific center or within a particular health care system. We used three types of data to build our risk models, including clinical data that could be queried or measured in the doctor's office, genetic data that could be assessed with sequencing of a germline DNA sample (e.g., peripheral blood white cells or buccal swab), and circulating biomarkers that could be measured from peripheral blood in commonly collected plasma tubes. Overall, these design features are extremely well suited to simulate the data available to providers seeing patients in general medicine clinics. If such a risk

Figure 1.

ROC curves from before (left) and after 5-fold cross-validation (right) in the 0–10-year follow-up population. Each line represents the clinical model (light gray), theclinical/genetic model (gray), and the clinical/genetic/biomarker model (dark gray). The 5-fold cross-validation leaving out 20% randomly selected dataset as a testset at a time was performed 20 times. The average AUC was calculated as a mean of 100 AUC values estimated in the test datasets.


stratification tool were available to primary care providers, excess pancreatic cancer risk could trigger further biomarker testing (e.g., specialized blood tests) or imaging-based screening tests (e.g., MRI or endoscopic ultrasound) to detect an early pancreatic cancer that could be treated for cure. Such risk stratification tools will become increasingly important as novel early detection biomarkers become available and imaging tests are improved for detection of small tumors (51–54).

In summary, we have examined absolute risk models of pancreatic cancer that combine established clinical factors, germline genetic variants, and circulating biomarkers. The final integrated model has improved risk discrimination over those that include clinical factors alone and successfully identifies a small segment of the general population at elevated risk of pancreatic cancer. Further refinement and validation in independent samples will be necessary to make these models clinically actionable and impact survival of patients with pancreatic cancer. Given the late stage at presentation for most patients with pancreatic cancer, earlier detection approaches are worthy of significant investment as a critical means to reduce mortality from

[Figure 2 plot. Four panels of relative risk (y axis, 0 to 8) versus risk percentile (x axis, 0.0 to 1.0). Share of the simulated population at ≥3× average risk: men, full follow-up: clinical 0.23%, clinical/genetic 0.34%, clinical/genetic/biomarker 2.01%; women, full follow-up: clinical 1.54%, clinical/genetic 2.27%, clinical/genetic/biomarker 2.25%; men, 0–10 years of follow-up: clinical 0.09%, clinical/genetic 0.24%, clinical/genetic/biomarker 3.72%; women, 0–10 years of follow-up: clinical 1.90%, clinical/genetic 2.22%, clinical/genetic/biomarker 2.55%.]

Figure 2.

Pancreatic cancer risk in the general population. The data were simulated with a total of 20,000 men and 20,000 women based on the average of our imputeddatasets using external risk estimates for smoking status and family history of pancreatic cancer. The relative risk of pancreatic cancer was plotted with a function ofthe risk percentile for (A) men in the full years of follow-up, (B) women in the full years of follow-up, (C) men in 0 to 10 years of follow-up, and (D) women in 0 to10 years of follow-up. The lines represent three risk models, including clinical factors only (light gray), clinical and genetic factors (gray), and clinical and geneticfactors as well as circulating biomarkers (dark gray).

Absolute Risk for Pancreatic Cancer

AACRJournals.org Cancer Epidemiol Biomarkers Prev; 29(5) May 2020 1005

Page 42: MATHEMATICAL o articles online: opics MODELING & AI · 1/5/2020  · Chen Lin1, Danielle S. Bitterman2,4, Georgia Tourassi3, and Jeremy L.Warner5 Abstract Current models for correlating

pancreatic cancer, soon to be the second leading cause of cancer deathin the United States.

Disclosure of Potential Conflicts of Interest

C.S. Fuchs is a consultant for Agios, Bain Capital, Taiho, Unum Therapeutics, Daiichi Sankyo, Bayer, Celgene, Eli Lilly, Entrinsic Health, Genentech, Merck, Merrimack Pharma, and Sanofi, and has ownership interest (including patents) in CytomX Therapeutics and Entrinsic Health. B.M. Wolpin is a consultant for Celgene, GRAIL, and BioLineRx, and reports receiving commercial research grants from Celgene and Eli Lilly. No potential conflicts of interest were disclosed by the other authors.

Authors’ Contributions

Conception and design: A.P. Klein, E.L. Giovannucci, J.E. Manson, C.S. Fuchs, B.M. Wolpin, P. Kraft

Development of methodology: M.N. Pollak, B.M. Wolpin, P. Kraft

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C.B. Clish, M.N. Pollak, L.T. Amundadottir, R.Z. Stolzenberg-Solomon, L.K. Brais, K. Ng, E.L. Giovannucci, H.D. Sesso, J.E. Manson, C.S. Fuchs

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J. Kim, C. Yuan, A. Babic, Y. Bao, M.N. Pollak, L.T. Amundadottir, R.Z. Stolzenberg-Solomon, P.V. Pandharipande, K. Ng, B.M. Wolpin, P. Kraft

Writing, review, and/or revision of the manuscript: J. Kim, C. Yuan, A. Babic, Y. Bao, C.B. Clish, M.N. Pollak, L.T. Amundadottir, A.P. Klein, R.Z. Stolzenberg-Solomon, P.V. Pandharipande, L.K. Brais, K. Ng, E.L. Giovannucci, H.D. Sesso, J.E. Manson, M.J. Stampfer, C.S. Fuchs, B.M. Wolpin, P. Kraft

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J. Kim, Y. Bao, L.K. Brais, M.W. Welch, J.E. Manson, M.J. Stampfer, P. Kraft

Study supervision: B.M. Wolpin, P. Kraft

Acknowledgments

The authors would like to thank the participants and staff of the HPFS, NHS, PHS, and WHI for their valuable contributions as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY.

[Figure 3: four panels (A–D) plotting absolute risk against age (years). Two panels show cumulative absolute risk over ages 50 to 80, and two panels show 10-year absolute risk over ages 55 to 70. Lines correspond to the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, 90th, 95th, and 99th percentiles.]

Figure 3.

Cumulative absolute risk and 10-year absolute risks of pancreatic cancer estimated using simulated data of 20,000 men and 20,000 women with smoking status and family history status based on the average of imputed datasets. Each colored line represents a different relative risk percentile in each gender group, and the percentiles were estimated based on the clinical/genetic/biomarker model (including BMI, WHR, physical activity, diagnosed diabetes, wGRS, proinsulin, adiponectin, IL6, and total BCAAs). A, Men in the full follow-up population. B, Women in the full follow-up population. C, Men in the 0 to 10 years of follow-up population. D, Women in the 0 to 10 years of follow-up population.


This project was supported by cohort grants [UM1CA167552 (W. Willett) and U01CA167552 (W. Willett) for the HPFS; UM1CA186107 (M.J. Stampfer), P01CA87969 (R. Tamimi), and R01CA49449 (S. Hankinson) for the Nurses' Health Study; R01CA97193 (J.M. Gaziano), R01CA34944 (C. Hennekens), R01CA40360 (J. Buring), R01HL26490 (C. Hennekens), and R01HL34595 (C. Hennekens) for the PHS; N01WH22110 (R. Prentice), N01WH24152 (N. Lasser), N01WH32100 (S. Beresford), N01WH32101 (R. Grimm), N01WH32102 (R. Wallace), N01WH32105 (A. Oberman), N01WH32106 (E. Paskett), N01WH32108 (P. Greenland), N01WH32109 (J. Manson), N01WH32111 (N. Watts), N01WH32112 (L. Kuller), N01WH32113 (J. Robbins), N01WH32115 (T. Bassford), N01WH32118 (K. Johnson), N01WH32119 (A. Assaf), N01WH32122 (M. Travisan), N01WH42107 (A. Hubbell), N01WH42108 (J. Hsia), N01WH42109 (M. Stefanick), N01WH42110 (J. Hays), N01WH42111 (R. Schenken), N01WH42112 (R. Jackson), N01WH42113 (S. Daugherty), N01WH42114 (C. Ritenbaugh), N01WH42115 (D. Lane), N01WH42116 (J. Ockene), N01WH42117 (G. Heiss), N01WH42118 (S. Hendrix), N01WH42119 (S. Wassertheil-Smoller), N01WH42120 (R. Chiebowski), N01WH42121 (B. Canne), N01WH42122 (J. Kotchen), N01WH42123 (B. Howard), N01WH42124 (H. Black), N01WH42125 (H. Judd), N01WH42126 (J. Liu), N01WH42129 (M. Limacher), N01WH42130 (J. Curb), N01WH42131 (M. O'Sullivan), N01WH42132 (C. Allen), and N01WH44221 (S. Shumaker) for the WHI program] from the NIH.

B.M. Wolpin acknowledges primary research support from Dana-Farber Cancer Institute Hale Family Center for Pancreatic Cancer Research, NIH/NCI U01CA210171, Lustgarten Foundation, and Stand Up To Cancer, with additional support from Pancreatic Cancer Action Network, Noble Effort Fund, and Promises for Purple. K. Ng acknowledges research funding from the Broman Fund for Pancreatic Cancer Research.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received November 7, 2019; revised January 31, 2020; accepted February 7, 2020; published first April 22, 2020.

References

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA Cancer J Clin 2018;68:7–30.
2. Rahib L, Smith BD, Aizenberg R, Rosenzweig AB, Fleshman JM, Matrisian LM. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res 2014;74:2913–21.
3. Vincent A, Herman J, Schulick R, Hruban RH, Goggins M. Pancreatic cancer. Lancet 2011;378:607–20.
4. Paniccia A, Hosokawa P, Henderson W, Schulick RD, Edil BH, McCarter MD, et al. Characteristics of 10-year survivors of pancreatic ductal adenocarcinoma. JAMA Surg 2015;150:701–10.
5. Lynch SM, Vrieling A, Lubin JH, Kraft P, Mendelsohn JB, Hartge P, et al. Cigarette smoking and pancreatic cancer: a pooled analysis from the pancreatic cancer cohort consortium. Am J Epidemiol 2009;170:403–13.
6. Michaud DS, Giovannucci E, Willett WC, Colditz GA, Stampfer MJ, Fuchs CS. Physical activity, obesity, height, and the risk of pancreatic cancer. JAMA 2001;286:921–9.
7. Silverman DT, Schiffman M, Everhart J, Goldstein A, Lillemoe KD, Swanson GM, et al. Diabetes mellitus, other medical conditions and familial history of cancer as risk factors for pancreatic cancer. Br J Cancer 1999;80:1830–7.
8. Wolpin BM, Bao Y, Qian ZR, Wu C, Kraft P, Ogino S, et al. Hyperglycemia, insulin resistance, impaired pancreatic beta-cell function, and risk of pancreatic cancer. J Natl Cancer Inst 2013;105:1027–35.
9. Stolzenberg-Solomon RZ, Graubard BI, Chari S, Limburg P, Taylor PR, Virtamo J, et al. Insulin, glucose, insulin resistance, and pancreatic cancer in male smokers. JAMA 2005;294:2872–8.
10. Sadr-Azodi O, Gudbjornsdottir S, Ljung R. Pattern of increasing HbA1c levels in patients with diabetes mellitus before clinical detection of pancreatic cancer - a population-based nationwide case-control study. Acta Oncol 2015;54:986–92.
11. Wolpin BM, Michaud DS, Giovannucci EL, Schernhammer ES, Stampfer MJ, Manson JE, et al. Circulating insulin-like growth factor binding protein-1 and the risk of pancreatic cancer. Cancer Res 2007;67:7923–8.
12. Wolpin BM, Ng K, Bao Y, Kraft P, Stampfer MJ, Michaud DS, et al. Plasma 25-hydroxyvitamin D and risk of pancreatic cancer. Cancer Epidemiol Biomarkers Prev 2012;21:82–91.
13. Bao Y, Giovannucci EL, Kraft P, Stampfer MJ, Ogino S, Ma J, et al. A prospective study of plasma adiponectin and pancreatic cancer risk in five US cohorts. J Natl Cancer Inst 2013;105:95–103.
14. White DL, Hoogeveen RC, Chen L, Richardson P, Ravishankar M, Shah P, et al. A prospective study of soluble receptor for advanced glycation end products and adipokines in association with pancreatic cancer in postmenopausal women. Cancer Med 2018;7:2180–91.
15. Babic A, Bao Y, Qian ZR, Yuan C, Giovannucci EL, Aschard H, et al. Pancreatic cancer risk associated with prediagnostic plasma levels of leptin and leptin receptor genetic polymorphisms. Cancer Res 2016;76:7160–7.
16. Stolzenberg-Solomon RZ, Newton CC, Silverman DT, Pollak M, Nogueira LM, Weinstein SJ, et al. Circulating leptin and risk of pancreatic cancer: a pooled analysis from 3 cohorts. Am J Epidemiol 2015;182:187–97.
17. Vainer N, Dehlendorff C, Johansen JS. Systematic literature review of IL-6 as a biomarker or treatment target in patients with gastric, bile duct, pancreatic and colorectal cancer. Oncotarget 2018;9:29820–41.
18. Mayers JR, Wu C, Clish CB, Kraft P, Torrence ME, Fiske BP, et al. Elevation of circulating branched-chain amino acids is an early event in human pancreatic adenocarcinoma development. Nat Med 2014;20:1193–8.
19. Katagiri R, Goto A, Nakagawa T, Nishiumi S, Kobayashi T, Hidaka A, et al. Increased levels of branched-chain amino acid associated with increased risk of pancreatic cancer in a prospective case-control study of a large cohort. Gastroenterology 2018;155:1474–82.e1.
20. Yip-Schneider MT, Simpson R, Carr RA, Wu H, Fan H, Liu Z, et al. Circulating leptin and branched chain amino acids-correlation with intraductal papillary mucinous neoplasm dysplastic grade. J Gastrointest Surg 2019;23:966–74.
21. Shindo K, Yu J, Suenaga M, Fesharakizadeh S, Cho C, Macgregor-Das A, et al. Deleterious germline mutations in patients with apparently sporadic pancreatic adenocarcinoma. J Clin Oncol 2017;35:3382–90.
22. Yurgelun MB, Chittenden AB, Morales-Oyarvide V, Rubinson DA, Dunne RF, Kozak MM, et al. Germline cancer susceptibility gene variants, somatic second hits, and survival outcomes in patients with resected pancreatic cancer. Genet Med 2019;21:213–23.
23. Hu C, Hart SN, Polley EC, Gnanaolivu R, Shimelis H, Lee KY, et al. Association between inherited germline mutations in cancer predisposition genes and risk of pancreatic cancer. JAMA 2018;319:2401–9.
24. Lu Y, Ek WE, Whiteman D, Vaughan TL, Spurdle AB, Easton DF, et al. Most common 'sporadic' cancers have a significant germline genetic component. Hum Mol Genet 2014;23:6112–8.
25. Amundadottir L, Kraft P, Stolzenberg-Solomon RZ, Fuchs CS, Petersen GM, Arslan AA, et al. Genome-wide association study identifies variants in the ABO locus associated with susceptibility to pancreatic cancer. Nat Genet 2009;41:986–90.
26. Petersen GM, Amundadottir L, Fuchs CS, Kraft P, Stolzenberg-Solomon RZ, Jacobs KB, et al. A genome-wide association study identifies pancreatic cancer susceptibility loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Nat Genet 2010;42:224–8.
27. Wolpin BM, Rizzato C, Kraft P, Kooperberg C, Petersen GM, Wang Z, et al. Genome-wide association study identifies multiple susceptibility loci for pancreatic cancer. Nat Genet 2014;46:994–1000.
28. Childs EJ, Mocci E, Campa D, Bracci PM, Gallinger S, Goggins M, et al. Common variation at 2p13.3, 3q29, 7p13 and 17q25.1 associated with susceptibility to pancreatic cancer. Nat Genet 2015;47:911–6.
29. Zhang M, Wang Z, Obazee O, Jia J, Childs EJ, Hoskins J, et al. Three new pancreatic cancer susceptibility signals identified on chromosomes 1q32.1, 5p15.33 and 8q24.21. Oncotarget 2016;7:66328–43.
30. Klein AP, Wolpin BM, Risch HA, Stolzenberg-Solomon RZ, Mocci E, Zhang M, et al. Genome-wide meta-analysis identifies five new susceptibility loci for pancreatic cancer. Nat Commun 2018;9:556.
31. Klein AP, Lindstrom S, Mendelsohn JB, Steplowski E, Arslan AA, Bueno-de-Mesquita HB, et al. An absolute risk model to identify individuals at elevated risk for pancreatic cancer in the general population. PLoS One 2013;8:e72311.


32. Risch HA, Yu H, Lu L, Kidd MS. Detectable symptomatology preceding the diagnosis of pancreatic cancer and absolute risk of pancreatic cancer diagnosis. Am J Epidemiol 2015;182:26–34.
33. Boursi B, Finkelman B, Giantonio BJ, Haynes K, Rustgi AK, Rhim AD, et al. A clinical prediction model to assess risk for pancreatic cancer among patients with new-onset diabetes. Gastroenterology 2017;152:840–50.e3.
34. Giovannucci E, Ascherio A, Rimm EB, Colditz GA, Stampfer MJ, Willett WC. Physical activity, obesity, and risk for colon cancer and adenoma in men. Ann Intern Med 1995;122:327–34.
35. Colditz GA, Hankinson SE. The Nurses' Health Study: lifestyle and health among women. Nat Rev Cancer 2005;5:388–96.
36. Steering Committee of the Physicians' Health Study Research G. Final report on the aspirin component of the ongoing Physicians' Health Study. N Engl J Med 1989;321:129–35.
37. Langer RD, White E, Lewis CE, Kotchen JM, Hendrix SL, Trevisan M. The Women's Health Initiative Observational Study: baseline characteristics of participants and reliability of baseline measures. Ann Epidemiol 2003;13:S107–21.
38. Bao Y, Giovannucci EL, Kraft P, Qian ZR, Wu C, Ogino S, et al. Inflammatory plasma markers and pancreatic cancer risk: a prospective study of five U.S. cohorts. Cancer Epidemiol Biomarkers Prev 2013;22:855–61.
39. Dupont WD. Converting relative risks to absolute risks: a graphical approach. Stat Med 1989;8:641–51.
40. Gail MH, Pfeiffer RM. On criteria for evaluating models of absolute risk. Biostatistics 2005;6:227–39.
41. Centers for Disease Control and Prevention, National Center for Health Statistics. Tables of summary health statistics. Volume 2018, 2009. Available from: https://www.cdc.gov/nchs/nhis/SHS/tables.htm.
42. Jacobs EJ, Chanock SJ, Fuchs CS, LaCroix A, McWilliams RR, Steplowski E, et al. Family history of cancer and risk of pancreatic cancer: a pooled analysis from the Pancreatic Cancer Cohort Consortium (PanScan). Int J Cancer 2010;127:1421–8.
43. Centers for Disease Control and Prevention, National Center for Health Statistics. Detailed technical notes to the United States 2007 data - mortality. Volume 2018, 2010. Available from: https://www.cdc.gov/nchs/data/dvs/MortFinal2007_Worktable12.pdf.
44. Owens DK, Davidson KW, Krist AH, Barry MJ, Cabana M, Caughey AB, et al. Screening for pancreatic cancer: US Preventive Services Task Force Reaffirmation Recommendation Statement. JAMA 2019;322:438–44.
45. Petersen GM. Familial pancreatic adenocarcinoma. Hematol Oncol Clin North Am 2015;29:641–53.
46. Lucas AL, Kastrinos F. Screening for pancreatic cancer. JAMA 2019;322:407–8.
47. Canto MI, Harinck F, Hruban RH, Offerhaus GJ, Poley J-W, Kamel I, et al. International Cancer of the Pancreas Screening (CAPS) Consortium summit on the management of patients with increased risk for familial pancreatic cancer. Gut 2013;62:339–47.
48. Munigala S, Singh A, Gelrud A, Agarwal B. Predictors for pancreatic cancer diagnosis following new-onset diabetes mellitus. Clin Transl Gastroenterol 2015;6:e118.
49. Sharma A, Kandlakunta H, Nagpal SJS, Feng Z, Hoos W, Petersen GM, et al. Model to determine risk of pancreatic cancer in patients with new-onset diabetes. Gastroenterology 2018;155:730–9.e3.
50. Chari S, Leibson C, Rabe K, Ransom J, Deandrade M, Petersen G. Probability of pancreatic cancer following diabetes: a population-based study. Gastroenterology 2005;129:504–11.
51. Cohen JD, Javed AA, Thoburn C, Wong F, Tie J, Gibbs P, et al. Combined circulating tumor DNA and protein biomarker-based liquid biopsy for the earlier detection of pancreatic cancers. Proc Natl Acad Sci U S A 2017;114:10202–7.
52. Fahrmann JF, Bantis LE, Capello M, Scelo G, Dennison JB, Patel N, et al. A plasma-derived protein-metabolite multiplexed panel for early-stage pancreatic cancer. J Natl Cancer Inst 2019;111:372–9.
53. Koay EJ, Lee Y, Cristini V, Lowengrub JS, Kang Y, Lucas FAS, et al. A visually apparent and quantifiable CT imaging feature identifies biophysical subtypes of pancreatic ductal adenocarcinoma. Clin Cancer Res 2018;24:5883–94.
54. Abou-Elkacem L, Wang H, Chowdhury SM, Kimura RH, Bachawal SV, Gambhir SS, et al. Thy1-targeted microbubbles for ultrasound molecular imaging of pancreatic ductal adenocarcinoma. Clin Cancer Res 2018;24:1574–85.


MOLECULAR CANCER THERAPEUTICS | COMPANION DIAGNOSTIC, PHARMACOGENOMIC, AND CANCER BIOMARKERS

Genetic Interactions and Tissue Specificity Modulate the Association of Mutations with Drug Response

Dina Cramer1,2,3, Johanna Mazur3, Octavio Espinosa3, Matthias Schlesner1,4, Daniel Hübschmann1,5,6,7, Roland Eils1,8,9, and Eike Staub3

ABSTRACT

In oncology, biomarkers are widely used to predict subgroups of patients that respond to a given drug. Although clinical decisions often rely on single gene biomarkers, machine learning approaches tend to generate complex multi-gene biomarkers that are hard to interpret. Models predicting drug response based on multiple altered genes often assume that the effects of single alterations are independent. We asked whether the association of cancer driver mutations with drug response is modulated by other driver mutations or the tissue of origin. We developed an analytic framework based on linear regression to study interactions in pharmacogenomic data from two large cancer cell line panels. Starting from a model with only covariates, we included additional variables only if they significantly improved simpler models. This allows interactions to be assessed systematically in small, easily interpretable models. Our results show that including mutation–mutation interactions in drug response prediction models tends to improve model performance and robustness. For example, we found that TP53 mutations decrease sensitivity to BRAF inhibitors in BRAF-mutated cell lines and patient tumors, suggesting a therapeutic benefit of combining inhibition of oncogenic BRAF with reactivation of the tumor suppressor TP53. Moreover, we identified tissue-specific mutation–drug associations and synthetic lethal triplets where the simultaneous mutation of two genes sensitizes cells to a drug. In summary, our interaction-based approach contributes to a holistic view on the determining factors of drug response.

Introduction

The observation that many drugs are effective in small subgroups of patients has given rise to precision medicine approaches in which patients receive tailored therapies based on their tumors' genetic alterations (1). Genomic and pharmacologic profiling of cancer cell lines allows for identifying molecular alterations that are associated with drug response. Cancer cell line panels aim at capturing the large heterogeneity found in primary tumors (2), which calls for datasets of extensive size. The publicly available drug screen projects Genomics of Drug Sensitivity in Cancer (GDSC; refs. 3, 4), Cancer Cell Line Encyclopedia (CCLE; ref. 5) and Cancer Therapeutic Response Portal (CTRP; refs. 6, 7) have profiled up to 1,001 cancer cell lines for up to 481 drugs. Although the discordance of measured drug responses within (8) and between (9) different large-scale drug screens has been criticized, candidate biomarkers can be consistently identified (10) and known biomarkers can be confirmed. Pharmacogenomic profiling of cancer cell lines is thus regarded as a useful preclinical model system.

In clinical trials, biomarkers for the selection of suitable patients for a given drug are often based on single genes (11). In contrast, machine learning approaches tend to generate models of high complexity (12). Although multi-gene models tend to outperform single gene models (13), they often lack biological interpretability and have rarely been validated in independent studies. The gap between very simple models in clinical settings and very complex machine learning derived models in research remains to be bridged.

If more than one alteration forms part of a drug response biomarker, the association of an individual alteration with drug response can depend on other alterations in the cell. Deviations from the additive phenotype are referred to as genetic interactions (14). In the field of drug response research, the importance of interactions has recently been acknowledged. However, proposed approaches are limited to logic models (12, 15) or interactions between one genomic and one transcriptomic alteration (15, 16). Interactions between two mutations in drug response have hardly been quantified.

Given the poor performance of mutation data in drug response prediction models (4), we asked whether model performance can be improved by incorporating interactions. Using pharmacogenomic data from the GDSC (4) and the CTRP (7) project, we modeled the relationship between cancer driver mutations and drug response by linear regression with interaction terms (Fig. 1). We present an analytic framework for the systematic and quantitative assessment of interactions between two mutations or one mutation and the tissue of origin.

Materials and Methods

Statistical analysis

All statistical analyses were conducted using R version 3.4.2. We exclusively used two-tailed tests.

1Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany. 2Faculty of Biosciences, Heidelberg University, Heidelberg, Germany. 3Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany. 4Bioinformatics and Omics Data Analytics, German Cancer Research Center (DKFZ), Heidelberg, Germany. 5Pediatric Immunology, Hematology and Oncology, University Hospital Heidelberg, Heidelberg, Germany. 6Division of Stem Cells and Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany. 7Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGmbH), Heidelberg, Germany. 8Health Data Science Unit, Bioquant, Medical Faculty, Heidelberg University, Heidelberg, Germany. 9Center for Digital Health, Berlin Institute of Health and Charité Universitätsmedizin Berlin, Berlin, Germany.

Note: Supplementary data for this article are available at Molecular Cancer Therapeutics Online (http://mct.aacrjournals.org/).

Corresponding Author: Dina Cramer, German Cancer Research Center, Berliner Str. 41, Heidelberg 69120, Germany. Phone: 49-6221-422724; Fax: 49-6221-423563; E-mail: [email protected]

Mol Cancer Ther 2020;19:927–36

doi: 10.1158/1535-7163.MCT-19-0045

©2019 American Association for Cancer Research.

GDSC data

We retrieved drug response data from the GDSC project (release 6; ref. 17) that comprises 1,001 cancer cell lines and 265 drugs. Because 14 drugs in the dataset were rescreened, we specify the drug ID together with the drug name where applicable. In contrast to previous studies (3, 4, 12), we summarized dose–response curves by the AUC instead of the IC50, which is in accordance with published recommendations (10, 18, 19). The AUC takes values between 0 (sensitive) and 1 (resistant). To model drug response, we transformed the AUC values into Z-scores. Our analysis focused on mutations in 267 potential cancer driver genes that were published together with the drug response data (4). Complete mutation data were retrieved from the Catalogue Of Somatic Mutations In Cancer (COSMIC; ref. 20; version 80; GRCh37/hg19). We used binary variables to represent the presence or absence of a mutation in a given gene. Because the data version we used is newer than in the most recent GDSC publication (4) and comprises a genome noise reduction step, 19 of 267 genes did not have a mutation in any of the cell lines, resulting in a list of 248 potential cancer driver genes. We retrieved cell line annotation data (17; release 6) to assemble potential covariates: Tissue (“GDSC Tissue descriptor 1”), growth medium (“ScreenMedium”), growth properties (“Growth properties”), and microsatellite instability status [“Microsatellite instability Status (MSI)”] were extracted. We counted the total number of mutations per cell line, including silent mutations, using the COSMIC mutation data (version 80; GRCh37/hg19; ref. 20). To compute copy number alteration (CNA) counts per cell line, CNA data were downloaded from COSMIC (version 80; GRCh37/hg19; ref. 20). Only CNAs with known minor allele and total copy number were considered.
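The two encodings described above, Z-scored AUC values and binary mutation indicators, can be illustrated with a minimal Python sketch. It assumes that standardization is done per drug across cell lines and uses the sample standard deviation; the text does not specify either detail. The function names, cell line labels, and AUC values are hypothetical.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mu = mean(values)
    sd = stdev(values)  # assumption: sample SD (n - 1 in the denominator)
    return [(v - mu) / sd for v in values]

def mutation_indicator(cell_lines, mutated):
    """Return 1 if a cell line carries a mutation in the gene, else 0."""
    mutated = set(mutated)
    return [1 if c in mutated else 0 for c in cell_lines]

# Hypothetical AUC values for one drug (0 = sensitive, 1 = resistant).
auc_by_cell_line = {"A": 0.95, "B": 0.80, "C": 0.65, "D": 0.90}
cell_lines = sorted(auc_by_cell_line)

z = zscores([auc_by_cell_line[c] for c in cell_lines])
gene_status = mutation_indicator(cell_lines, mutated={"C"})
```

After the transform, a negative Z-score marks a cell line that is more sensitive than the average cell line for that drug.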

CTRP data

As a validation dataset, we retrieved drug response and mutation data from the CTRP publication (7). A total of 76 drugs have been screened in the GDSC and the CTRP project (4). Drug response in the CTRP dataset is reported by AUC values between 0 (sensitive) and 20 (resistant). For drug response prediction models, we transformed AUC values into Z-scores. We used the same set of 248 potential cancer driver genes as for the GDSC dataset. We extracted the covariates tissue (“ccle_primary_site”) and growth medium (“culture_media”) from the CTRP publication (7). Growth property data were not available because all cell lines in the CTRP dataset are adherent. We reduced the number of tissue categories by summarizing “upper_aerodigestive_tract” and “oesophagus” as “aero_dig_tract”, “endometrium”, “ovary”, and “urinary_tract” as “urogenital_system”, and “biliary_tract”, “liver”, and “stomach” as “digestive_system”. To increase consistency with the GDSC dataset, we summarized media with the prefix “DMEM0” as “DMEM”, media with the prefix “RPMI0” as “RPMI” and all remaining media as “other”. To compute CNA counts per cell line, binary copy number calls were downloaded from the CCLE website (portals.broadinstitute.org/ccle/data; February 29, 2016; ref. 21).
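The category collapsing described above amounts to a lookup table for tissues and a prefix rule for media. The following Python sketch shows one way to express it; the function names and the example media codes ("DMEM001", "RPMI001", "MEM006") are hypothetical, and tissues not named in the text are assumed to pass through unchanged.

```python
# Tissue groupings as listed in the text.
TISSUE_GROUPS = {
    "upper_aerodigestive_tract": "aero_dig_tract",
    "oesophagus": "aero_dig_tract",
    "endometrium": "urogenital_system",
    "ovary": "urogenital_system",
    "urinary_tract": "urogenital_system",
    "biliary_tract": "digestive_system",
    "liver": "digestive_system",
    "stomach": "digestive_system",
}

def collapse_tissue(primary_site):
    """Map a primary site to its summary category (unlisted sites are kept)."""
    return TISSUE_GROUPS.get(primary_site, primary_site)

def collapse_medium(culture_medium):
    """Summarize growth media by prefix; everything else becomes 'other'."""
    if culture_medium.startswith("DMEM0"):
        return "DMEM"
    if culture_medium.startswith("RPMI0"):
        return "RPMI"
    return "other"

# Hypothetical annotation rows: (primary site, culture medium code).
rows = [("oesophagus", "RPMI001"), ("skin", "DMEM001"), ("liver", "MEM006")]
collapsed = [(collapse_tissue(t), collapse_medium(m)) for t, m in rows]
```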

Background models

The default background model consists of the four covariates tissue of origin, growth properties, growth medium, and CNA count. We also tested the association of the covariates with drug response in the CTRP dataset. In univariate models, tissue was significantly associated with drug response for 73%, growth medium for 37%, and CNA count for 20% of the drugs (q < 0.1, Benjamini–Hochberg correction, F test).
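The percentages above are shares of drugs whose Benjamini–Hochberg q-value falls below 0.1. As an illustration of that correction only, the sketch below computes q-values from a list of per-drug P values with the standard step-up procedure; the function name and the P values are hypothetical, and the F tests that would produce the P values are not shown.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the order of the input P values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Hypothetical per-drug P values for one covariate.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
share_significant = sum(value < 0.1 for value in q) / len(q)
```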

To assemble drug-specific background models, we modified the default background model with four covariates in three steps. First, we tested for each covariate if the default background model was significantly better than the same model without the respective covariate (P < 0.05, F test). If this was the case, the covariate was retained in the model, if not, it was excluded. Second, we separately assessed the effect of mutations in 248 potential cancer driver genes. A single mutation was included if the model with the mutation was significantly better than the model with the preselected covariates alone (Holm-corrected q < 0.1, t test). Third, mutation–tissue interactions were tested if both the mutation and the tissue covariate were already in the model. We restricted the data for this test to tissues that had at least five cell lines with and without a mutation in the respective gene. Mutation–tissue interaction terms were added if they significantly improved the model with the preselected covariates and single genes (P < 0.05, F test). If no variables at all were selected during this process, the drug-specific background model was defined as a null model that predicts the mean drug response across cell lines.
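The model comparisons in these steps are nested-model tests. As a reminder of the standard form (not a statement of the authors' exact implementation), consider a reduced model with $p_0$ parameters and residual sum of squares $\mathrm{RSS}_0$, and a full model with $p_1 > p_0$ parameters and residual sum of squares $\mathrm{RSS}_1$, both fitted to the same $n$ cell lines. The symbols $p_0$, $p_1$, $n$, and $\mathrm{RSS}$ are introduced here for illustration:

$$F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/(p_1 - p_0)}{\mathrm{RSS}_1/(n - p_1)}$$

Under the null hypothesis that the additional terms have no effect, $F$ follows an F distribution with $(p_1 - p_0,\; n - p_1)$ degrees of freedom. When the two models differ by a single coefficient, as for one binary mutation variable, the F test is equivalent to the t test on that coefficient, because $F = t^2$.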

Mutation pair models

For all 265 drugs, a regression model,

$$\mathrm{AUC} \sim \beta_0 + \beta_1\,\mathrm{mut}_1 + \beta_2\,\mathrm{mut}_2 + \beta_3\,\mathrm{mut}_1\,\mathrm{mut}_2 + \mathrm{covariates} + \varepsilon \qquad (1)$$

was fitted for pairs of mutations in 248 potential cancer driver genes. Mutation pairs that co-occurred in fewer than five cell lines with available drug response data were excluded from all analyses. In the mutation pair models, the AUC is the response variable, $\beta_0$ is a constant, $\beta_{1,2,3}$ are the regression coefficients, and $\mathrm{mut}_{1,2}$ are binary variables that encode the mutation status of two genes. Mutation pair models without interaction are denoted as $\mathrm{mut}_1 + \mathrm{mut}_2$, mutation pair models with interaction as $\mathrm{mut}_1 \times \mathrm{mut}_2$. The included covariates are defined by the default or the drug-specific background models.
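To make the role of the interaction coefficient concrete, the sketch below fits the mutation pair model in the special case without covariates. In that case the model is saturated over the four mutation-status groups, so the least-squares estimates can be read off the group means, and the interaction coefficient is the deviation of the double mutants from the additive expectation. This is a simplification for illustration: with covariates a general regression routine is needed. The function name and the AUC values are hypothetical; the co-occurrence threshold of five cell lines follows the text.

```python
from statistics import mean

def pair_model_coefficients(auc, mut1, mut2, min_cooccurrence=5):
    """Least-squares estimates (b0, b1, b2, b3) for
    AUC ~ b0 + b1*mut1 + b2*mut2 + b3*mut1*mut2, without covariates.
    Returns None if the pair co-occurs in too few cell lines."""
    groups = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for y, a, b in zip(auc, mut1, mut2):
        groups[(a, b)].append(y)
    # Exclude rare pairs; also require every group to be non-empty.
    if len(groups[(1, 1)]) < min_cooccurrence or not all(groups.values()):
        return None
    m = {key: mean(values) for key, values in groups.items()}
    b0 = m[(0, 0)]                                   # wild type for both genes
    b1 = m[(1, 0)] - b0                              # effect of mutation 1 alone
    b2 = m[(0, 1)] - b0                              # effect of mutation 2 alone
    b3 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + b0      # deviation from additivity
    return b0, b1, b2, b3

# Hypothetical data: five cell lines per mutation-status group.
mut1 = [0] * 5 + [1] * 5 + [0] * 5 + [1] * 5
mut2 = [0] * 5 + [0] * 5 + [1] * 5 + [1] * 5
auc = [0.9] * 5 + [0.6] * 5 + [0.8] * 5 + [0.7] * 5
coefficients = pair_model_coefficients(auc, mut1, mut2)
```

In this toy example the additive model would predict an AUC of 0.5 for double mutants, whereas their observed mean is 0.7, so the interaction coefficient is positive (about 0.2), meaning that the second mutation reduces the sensitivity associated with the first.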

Assessing model performance

To assess the performance of models that were fitted using the entire GDSC dataset, we used the Bayesian information criterion (BIC) and the adjusted coefficient of determination (adj. R2). Both measures penalize the number of parameters in the models.
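As a rough illustration of how both measures penalize model size, the sketch below computes the adjusted R2 and a Gaussian-likelihood BIC from residual sums of squares. The BIC here omits additive terms that do not depend on the model, so its values are comparable across models fitted to the same data but need not match the numbers reported by an R model object. The function names and the numeric inputs are hypothetical.

```python
import math

def adjusted_r2(rss, tss, n, p):
    """Adjusted R^2 for a linear model with p predictors (excluding the
    intercept) fitted to n observations."""
    r2 = 1 - rss / tss
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def bic_gaussian(rss, n, k):
    """BIC of a Gaussian linear model with k estimated coefficients,
    up to an additive constant shared by all models on the same data."""
    return n * math.log(rss / n) + k * math.log(n)

# Hypothetical fit: 50 cell lines, total sum of squares 100.
adj = adjusted_r2(rss=40.0, tss=100.0, n=50, p=3)

# An extra term that does not reduce the residual sum of squares is penalized.
bic_small = bic_gaussian(rss=40.0, n=50, k=4)
bic_large = bic_gaussian(rss=40.0, n=50, k=5)
```

Lower BIC and higher adjusted R2 indicate the preferred model.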

[Figure 1: schematic with the elements drug screen (viability versus concentration), drug response, mutation, covariates (tissue, growth properties, growth medium, CNA count), default background model, and drug-specific background model.]

Figure 1.

Schematic overview of the analytic framework. Building blocks for drug response prediction models comprise covariates, mutations, mutation–tissue interactions, and mutation–mutation interactions. Filled boxes represent background models, horizontal arrows represent interactions, and the vertical arrow represents the association with drug response. CNA, copy number alteration.

In addition, cross-validation was used to assess model performance. We created 100 independent cross-validation instances using the createMultiFolds function in the R package “caret” (version 6.0-47; https://cran.r-project.org/web/packages/caret). In each cross-validation instance, models were fitted to the training dataset that contained 80% of the full data. The root mean squared test error was computed using the remaining 20% of the data, termed test data. In each cross-validation instance, we selected the lowest test error. We used the distribution of 100 test error values per model to compare the performance of different models. All comparisons were carried out separately for each drug.
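The comparison of test-error distributions can be sketched in Python as follows. This is a simplified stand-in, not a port of caret's createMultiFolds: each of the 100 instances is drawn here as an independent random 80/20 split. The two toy models (a null model predicting the training mean, and a single-mutation model predicting the training mean per mutation status), the function names, and the data are all hypothetical.

```python
import math
import random
from statistics import mean

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))

def cv_test_errors(y, fit, n_instances=100, test_fraction=0.2, seed=1):
    """Test RMSE per instance; fit(train_idx) returns a predictor idx -> value."""
    rng = random.Random(seed)
    n_test = round(len(y) * test_fraction)
    errors = []
    for _ in range(n_instances):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit(train)
        errors.append(rmse([y[i] for i in test], [predict(i) for i in test]))
    return errors

def null_model(y):
    def fit(train):
        mu = mean(y[i] for i in train)
        return lambda i: mu
    return fit

def single_mutation_model(y, mut):
    def fit(train):
        overall = mean(y[i] for i in train)
        group_mean = {}
        for status in (0, 1):
            values = [y[i] for i in train if mut[i] == status]
            group_mean[status] = mean(values) if values else overall
        return lambda i: group_mean[mut[i]]
    return fit

# Hypothetical drug response that depends on one mutation.
mut = [i % 2 for i in range(40)]
y = [0.5 * m + 0.01 * (i % 5) for i, m in enumerate(mut)]

errors_null = cv_test_errors(y, null_model(y))
errors_mut = cv_test_errors(y, single_mutation_model(y, mut))
```

Because both calls use the same seed, the two models are evaluated on identical splits, which makes the two error distributions directly comparable.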

Clinical drug response

We retrieved drug response in 31 vemurafenib-treated patients with melanoma (22) with only BRAF V600 (n = 26) or BRAF V600 and TP53 (n = 4) mutations (mutation status according to https://cancer.sanger.ac.uk/cosmic/study/overview?paper_id=34281). We excluded patient 53 due to absence of BRAF mutations. We computed the disease control rate, defined as the number of patients with the RECIST criteria "complete response", "partial response", and "stable disease" divided by the total number of patients. The relative risk was calculated using the riskratio function in the R package "fmsb" (https://cran.r-project.org/web/packages/fmsb; version 0.6.3). Vemurafenib and PLX4720 are related compounds (23). The study (22) also contains data for dabrafenib-treated patients, but none of them had both a BRAF and a TP53 mutation.
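The two clinical measures can be illustrated with a short Python sketch. The patient counts used here (2 of 4 and 22 of 26 patients with disease control) are an assumption inferred from the group sizes above and the rates reported in the Results; they are not stated explicitly in the text. The interval is a standard Wald-type interval on the log scale, and the function names are chosen for this example only.

```python
import math
from statistics import NormalDist

def disease_control_rate(n_controlled, n_total):
    """Fraction of patients with complete response, partial response, or stable disease."""
    return n_controlled / n_total

def risk_ratio(events_a, n_a, events_b, n_b, conf_level=0.90):
    """Relative risk of an event in group A vs. group B with a log-scale Wald interval."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)

# ASSUMED counts (inferred, not given in the text):
# BRAF + TP53 mutated: 2 of 4 with disease control; BRAF only: 22 of 26.
dcr_double_mutant = disease_control_rate(2, 4)
dcr_braf_only = disease_control_rate(22, 26)

# The "event" is the absence of disease control (progressive disease).
rr, ci_lower, ci_upper = risk_ratio(4 - 2, 4, 26 - 22, 26)
```

Under these assumed counts, the sketch gives disease control rates of 50% and about 85% and a relative risk of 3.25, in line with the values reported in the Results.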

Mutation–tissue interactions

We defined tissue-specific and general mutation associations (Supplementary Table S1) based on pan-cancer models. First, we filtered for mutations that are significantly associated with drug response (P < 0.05, t test) in a model with a single mutation and the covariates of the default background model as predictors (model 1). Second, model 1 was compared with a model that additionally included the mutation–tissue interaction term. We considered associations as tissue-specific if the model comparison test yielded a P value smaller than 0.05 (F test). Associations not passing this threshold were defined as general associations. Third, we validated tissue-specific and general associations by applying the same tests to the CTRP dataset. We restricted our analysis to genes where a minimum of two tissues had at least five cell lines with drug response data in both the wild-type and the mutated group.
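The model comparison in the second step is a test between nested linear models. As a minimal sketch, the F statistic can be computed from the residual sums of squares of the two fits; the function name and the numbers below are hypothetical and only illustrate the arithmetic.

```python
def nested_f_statistic(rss_reduced, rss_full, n_obs, n_params_reduced, n_params_full):
    """F statistic for comparing a reduced model with a full model that nests it.

    n_params_* count all estimated coefficients, including the intercept.
    Returns the statistic with its numerator and denominator degrees of freedom.
    """
    df_num = n_params_full - n_params_reduced
    df_den = n_obs - n_params_full
    f_stat = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den

# Hypothetical fits: model 1 (6 coefficients) vs. model 1 plus four
# mutation-tissue interaction coefficients, on 110 cell lines.
f_stat, df_num, df_den = nested_f_statistic(12.0, 10.0, 110, 6, 10)
```

The P value is the upper tail of the F distribution with these degrees of freedom; in Python it could be obtained, for example, with `scipy.stats.f.sf(f_stat, df_num, df_den)`.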

To investigate whether the proportion of tissue-specific associations is larger than expected by chance, we permuted the tissue annotation 1,000 times and controlled for tissue-specific mutation recurrence by keeping the number of mutations per tissue constant. The percentage of tissue-specific associations in the original data was compared with the percentages in the randomized data.
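One way to permute tissue labels while keeping the number of mutations per tissue constant is to shuffle the labels separately within the mutated and the wild-type cell lines. The sketch below assumes this scheme; the text does not specify how the constraint was implemented. The helper names and the toy data are hypothetical, and the empirical P value uses the common (count + 1) / (permutations + 1) convention.

```python
import random
from collections import Counter

def permute_tissue_within_mutation_status(tissues, mutated, rng):
    """Shuffle tissue labels separately among mutated and wild-type cell lines."""
    permuted = list(tissues)
    for status in set(mutated):
        idx = [i for i, m in enumerate(mutated) if m == status]
        labels = [tissues[i] for i in idx]
        rng.shuffle(labels)
        for i, label in zip(idx, labels):
            permuted[i] = label
    return permuted

def empirical_p_value(observed, null_values):
    """One-sided empirical P value of an observed statistic against permutation results."""
    n_at_least = sum(v >= observed for v in null_values)
    return (n_at_least + 1) / (len(null_values) + 1)

# Toy annotation (not data from the study).
tissues = ["skin", "skin", "lung", "lung", "lung", "breast", "breast", "skin"]
mutated = [1, 0, 1, 1, 0, 0, 1, 0]
permuted = permute_tissue_within_mutation_status(tissues, mutated, random.Random(0))

# The number of mutated cell lines per tissue is unchanged by construction.
mutations_per_tissue_before = Counter(t for t, m in zip(tissues, mutated) if m)
mutations_per_tissue_after = Counter(t for t, m in zip(permuted, mutated) if m)
```

With 1,000 permutations, an observed percentage that exceeds every randomized percentage gives an empirical P value of 1/1,001, that is, below 0.001.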

We used Cook's distance to identify influential observations in the pan-cancer models. For tissue-specific associations, Cook's distance was calculated on the basis of a model with mutation–tissue interaction term. For general associations, a model without mutation–tissue interaction term was used. Models with influential observations (Cook's distance > 0.5) were marked in Supplementary Table S1 and filtered out for Fig. 5B.

Next, we fitted model 1 (see above) to data from individual tissues. For tissue-specific associations, we tested whether the mutation coefficient estimates for single-tissue models correlated with the mutation frequency in the respective tissues. We computed the Pearson correlation using the R function cor.test. In Supplementary Table S1, we included the Pearson correlation coefficient for drug–mutation pairs with significant correlations (P < 0.05, correlation test) and labeled all other tested correlations as not significant (n.s.). Tissue-specific associations with significant correlations in the GDSC or the CTRP dataset were excluded from Fig. 5B.

Identification of mutation–mutation interactions and synthetic lethal triplets

To identify mutation–mutation interactions, we used the mutation pair model (equation 1) with the covariates of the drug-specific background model. In the GDSC and the CTRP dataset, we applied three conditions to select mutation–mutation interaction candidates (Supplementary Table S2). First, the mutation pair model with interaction must be significantly better than the drug-specific background model (P < 0.05, F test) and the drug-specific background model with either of the mutations (P < 0.05, F test). Second, the interaction between both mutations must be significantly associated with drug response (P < 0.05, t test). Third, the sum of the coefficients for both main effects and the interaction effect must have the same sign in the GDSC and the CTRP dataset, meaning that the overall effect in a cell with both mutations is consistent. For synthetic lethal relationships, we additionally required a negative overall effect (β₁ + β₂ + β₃; equation 1) and a negative interaction coefficient. Examples with influential observations (Cook's distance > 0.5) in the respective models were marked in Supplementary Table S2 and excluded from Supplementary Figs. S3 and S6.
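The selection conditions can be written down as a small filter. The Python sketch below is an interpretation, not the authors' code: the `PairModelFit` container and all field names are hypothetical, the first two conditions are assumed to be required in each dataset separately, and the synthetic lethality requirements are assumed to hold in both datasets.

```python
from dataclasses import dataclass

@dataclass
class PairModelFit:
    """Summary of one fitted mutation pair model with interaction (hypothetical container)."""
    p_vs_background: float       # F test vs. drug-specific background model
    p_vs_background_mut1: float  # F test vs. background model plus mutation 1
    p_vs_background_mut2: float  # F test vs. background model plus mutation 2
    p_interaction: float         # t test of the interaction coefficient
    b1: float                    # main effect of mutation 1
    b2: float                    # main effect of mutation 2
    b3: float                    # interaction coefficient

def overall_effect(fit):
    """Effect in a cell line carrying both mutations: b1 + b2 + b3."""
    return fit.b1 + fit.b2 + fit.b3

def passes_within_dataset(fit, alpha=0.05):
    """Conditions 1 and 2: model comparisons and interaction term are significant."""
    return (fit.p_vs_background < alpha
            and fit.p_vs_background_mut1 < alpha
            and fit.p_vs_background_mut2 < alpha
            and fit.p_interaction < alpha)

def is_interaction_candidate(gdsc, ctrp, alpha=0.05):
    """Conditions 1-3: significant in both datasets, same sign of the overall effect."""
    same_sign = overall_effect(gdsc) * overall_effect(ctrp) > 0
    return passes_within_dataset(gdsc, alpha) and passes_within_dataset(ctrp, alpha) and same_sign

def is_synthetic_lethal_candidate(gdsc, ctrp, alpha=0.05):
    """Additionally require negative overall effect and negative interaction coefficient."""
    return is_interaction_candidate(gdsc, ctrp, alpha) and all(
        overall_effect(fit) < 0 and fit.b3 < 0 for fit in (gdsc, ctrp))

# Hypothetical fits for one drug and one mutation pair.
gdsc_fit = PairModelFit(0.01, 0.02, 0.03, 0.04, -0.10, 0.05, -0.20)
ctrp_fit = PairModelFit(0.01, 0.01, 0.01, 0.01, -0.05, -0.02, -0.10)
candidate = is_interaction_candidate(gdsc_fit, ctrp_fit)
synthetic_lethal = is_synthetic_lethal_candidate(gdsc_fit, ctrp_fit)
```

Because the AUC is the response variable, a negative overall effect corresponds to increased sensitivity in cell lines that carry both mutations.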

Results

Covariates explain large proportions of the variation in drug response

Using the GDSC dataset, we tested whether cell line properties other than mutations or experimental conditions could confound the relationship between mutations and drug response. We analyzed tissue of origin, growth medium, growth properties (adherent, semiadherent, suspension), CNA count, mutation count, and MSI status in univariate regression models (Fig. 2).

Because tissue, growth medium, growth properties, and CNA count were strongly associated with drug response, we included them as covariates in subsequent models that serve to analyze mutation associations. We defined a model with these four covariates as the default background model. Comparing models to the default background model allows us to separate mutation-specific effects from effects that can be attributed to these covariates.

Figure 2. Covariates explain large proportions of the variation in drug response. Univariate models relating mutation count, MSI status, CNA count, growth medium, growth properties, and tissue to drug response were fitted for all 265 drugs. Each value represents the adj. R² for a given drug and a given covariate. [Plot: adj. R² (y axis, 0.0 to 0.4) per covariate (x axis: mutation count, MSI, CNA count, growth medium, growth properties, tissue).]

Including interactions between mutations tends to improve the performance and robustness of drug response prediction models

To analyze the association of mutations with drug response, we systematically assessed models of three different complexities. For each drug, we fitted 248 single mutation models and on average 13,000 mutation pair models with and without mutation–mutation interaction (Supplementary Data S1). Each of these models contained the covariates of the default background model.

To compare performance within a given model complexity, we computed two model performance measures, the adj. R² and the BIC. We selected the best-performing model per complexity, resulting in three models per drug. For all drugs, the models that were selected based on adj. R² and BIC were identical. We found that models with one or two mutations explain up to 50% of the variation in drug response (Fig. 3A).

To compare performance across model complexities, we selected the best-performing model per drug based on adj. R² and BIC. For about half of the drugs (47.9% based on adj. R², 47% based on BIC), this model was not significantly better than the default background model (q ≥ 0.1, Benjamini–Hochberg correction, F test; Fig. 3B). For most of the remaining drugs (49.8% based on adj. R², 41% based on BIC), mutation pair models with interaction were the best-performing models. This suggests that including mutation–mutation interactions tends to improve model performance.
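The q values used for this threshold come from the Benjamini–Hochberg correction. A minimal Python sketch of the standard step-up procedure is given below; the function name and the example P values are chosen for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value and enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical F test P values for four drugs.
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.1 for q in q_values]
```

A drug would be categorized as not significant (n.s.) when the q value of its best model is at least 0.1.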

As a third model performance measure, we used the test errors of a cross-validation analysis. Figure 3C shows the percentage of drugs for which more complex models predict drug response better (P < 0.05, Mann–Whitney–Wilcoxon test). Because the covariates of the default background model form part of all more complex models and account for large proportions of the variation in drug response (Fig. 2), we assessed performance differences with respect to the default background model. Including interactions in mutation pair models increased the percentage of drugs for which the default background model was outperformed by 20 percentage points (from 48% to 68%; Fig. 3C). Mutation pair models with interaction were never outperformed by mutation pair models without interaction (P ≥ 0.05, Mann–Whitney–Wilcoxon test), indicating that including interactions tends to increase model performance.

To evaluate model robustness to changes in the training data, we selected the model with the lowest test error in each cross-validation instance. We used the abundance of the most frequently selected model as a measure of model robustness. We found that mutation pair models were less robust than single mutation models (P < 10⁻¹⁵, Mann–Whitney–Wilcoxon test; Fig. 3D). However, mutation pair models with interaction were more robust than mutation pair models without interaction (P < 10⁻⁷, Mann–Whitney–Wilcoxon test). This implies that including mutation–mutation interactions increases model robustness.
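The robustness measure is simply the relative frequency of the most often selected model across cross-validation instances. A minimal Python sketch, with a hypothetical function name and an invented selection list:

```python
from collections import Counter

def model_robustness(selected_models):
    """Return the most frequently selected model and its abundance across instances."""
    counts = Counter(selected_models)
    model, count = counts.most_common(1)[0]
    return model, count / len(selected_models)

# Invented example: winning model in each of 100 cross-validation instances.
selected = ["BRAF x TP53"] * 37 + ["BRAF x NRAS"] * 33 + ["BRAF x PTEN"] * 30
top_model, abundance = model_robustness(selected)
```

An abundance close to 1 means that the same model wins regardless of which cell lines end up in the training data; a low abundance indicates that model selection is sensitive to the split.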

The interaction of BRAF and TP53 mutations is associated with resistance to the BRAF inhibitors dabrafenib and PLX4720

According to the cross-validation results (Fig. 3C), the BRAF inhibitor dabrafenib was the drug with the most significant performance difference between mutation pair models with and without interaction (P = 0.002, Mann–Whitney–Wilcoxon test). The most frequent mutation pair model with interaction across all cross-validation instances was BRAF×TP53 (frequency: 37%). This model also performed best based on the full data (adj. R² and BIC; Fig. 3A). The BRAF×TP53 mutation pair model with interaction predicts that cell lines with BRAF and TP53 mutations respond worse to dabrafenib than cell lines with only BRAF mutations [P < 10⁻¹⁰ (GDSC), t test; Fig. 4A and Supplementary Fig. S1A].

Similarly, a BRAF×TP53 model explained the highest fraction of variation in response to the BRAF inhibitor PLX4720 (drug ID 1371; Fig. 3A). Again, we observed a negative impact of TP53 mutations on drug sensitivity in BRAF-mutated cell lines [P < 10⁻⁷ (GDSC), t test; Fig. 4B and Supplementary Fig. S1C]. For both dabrafenib (P = 0.005, t test; Fig. 4A and Supplementary Fig. S1B) and PLX4720 (P < 10⁻⁴, t test; Fig. 4B and Supplementary Fig. S1D), we could validate the BRAF–TP53 interaction effect in the CTRP dataset. This implies robustness of our observations across different BRAF inhibitors and datasets.

Because dabrafenib and PLX4720 selectively inhibit BRAF kinases with the activating V600E mutation (24, 25), we retested BRAF×TP53 models by considering only cell lines with BRAF V600E mutations as BRAF mutated. We could confirm the interaction effect of BRAF and TP53 mutations for dabrafenib in the GDSC dataset (P < 10⁻⁹, t test) and for PLX4720 in both datasets [P < 10⁻⁴ (GDSC) and P = 0.048 (CTRP), t test]. For dabrafenib-treated cell lines in the CTRP dataset, the cooccurrence threshold for BRAF and TP53 mutations was not passed.

To further corroborate our findings, we analyzed published data from a clinical trial with 30 patients with BRAF V600-mutant melanoma who were treated with the BRAF inhibitor vemurafenib (22). Patients with BRAF and TP53 mutations tended to respond worse to vemurafenib than patients with only BRAF mutations [disease control rate: 50% and 85%, relative risk: 3.25 (90% confidence interval 1.06–9.94), P = 0.1]. The exceptional response for TP53 wild-type patients indicates clinical relevance.

The association of a mutation with drug response can be tissue-specific

We next investigated whether interactions other than mutation–mutation interactions are important for drug response predictions. As stated above, the tissue of origin explains a large proportion of the variation in drug response. In addition to this direct effect, we asked whether the tissue covariate indirectly influences drug response by modulating the effect of a mutation. We thus searched for mutation associations that depend on the tissue in which they occur.

To identify tissue-specific mutation associations, we compared single mutation models with models that additionally included an interaction term between the mutation status and the tissue covariate. For 374 of 2,232 associations (17%), the mutation–tissue interaction term improved the model (P < 0.05, F test), indicating that these associations between mutations and drug response are tissue-specific.

To assess whether random differences across data subsets could drive this effect, we randomized the tissue labels 1,000 times while controlling for tissue-specific mutation recurrence. We then retested the mutation associations for tissue specificity. The proportion of tissue-specific associations in the original data was larger than expected by chance (P < 0.001, randomization test; Fig. 5A).

To assess differences between tissue-specific associations (P < 0.05, F test) and associations that can be generalized across tissues (P ≥ 0.05, F test), we rigorously filtered for consistent effect patterns in the GDSC and the CTRP dataset. In the filtered list (Supplementary Table S1), 32% of the associations were tissue-specific. To illustrate the effect of tissue specificity, we fitted separate models using data from only one tissue. We then compared the association between a given mutation and a given drug across tissues. Figure 5B shows examples of tissue-specific and general associations that were validated in the CTRP dataset. In these models, the sign of the coefficient for mutation status determines the direction of the effect, that is, whether the mutation confers resistance or sensitivity to a drug.

We observed that the association of NRAS mutations with resistance to the BRAF inhibitor PLX4720 was tissue-specific [drug screened twice; P < 10⁻⁴ (drug ID 1036 and 1371), F test; Fig. 5B]. This resistance association was especially pronounced in skin cancer cell lines. In accordance with our results, a previous study (26) reported that NRAS mutations in melanoma confer resistance to BRAF inhibitors through reactivation of MAPK signaling.

Figure 3. Including interactions between mutations tends to improve the performance and robustness of drug response prediction models. A, Best model per complexity and drug. The adj. R² and the significance of a model comparison test against the default background model (F test) are shown for 291 models with an adjusted P value lower than 0.1 (Benjamini–Hochberg correction). Genes (plain) in selected models are labeled together with the corresponding drug (bold) and drug target (italic). Two models for the same complexity are assigned to PLX4720 because the drug was screened twice. B, Best model complexity per drug. Pie chart showing the proportion of drugs for which a given model complexity represents the best model based on adj. R² or BIC. Drugs for which the best model is not significantly better than the default background model (q ≥ 0.1, Benjamini–Hochberg correction, F test) are categorized as not significant (n.s.). C, Comparison of model complexities based on test errors. The percentage of drugs for which the more complex model (model 2) performs better (P < 0.05, Mann–Whitney–Wilcoxon test) is indicated. D, Model robustness. For a given drug and a given model complexity, the abundance of the most frequently selected model across cross-validation instances is displayed. Model complexities are denoted as null (null model), background (default background model), single (single mutation model), pair −int (mutation pair model without interaction), and pair +int (mutation pair model with interaction).

Likewise, the association of EGFR mutations with sensitivity to the EGFR/ERBB2 inhibitor afatinib showed tissue specificity (drug ID 1377; P = 0.007, F test; Fig. 5B). We observed the strongest effect for non–small cell lung cancer (NSCLC) cell lines. In line with this observation, afatinib is exclusively approved for the treatment of patients with EGFR-mutated NSCLC (27).

We also compared mutation coefficients across single tissues for three drug–mutation associations that are less tissue-specific and thus more general (Fig. 5B), namely the association of PTEN mutations with resistance to the ALK inhibitor TAE684 (P = 0.94, F test), PIK3CA (catalytic subunit of PI3K) mutations with sensitivity to the PI3K inhibitor GDC0941 (drug ID 1058; P = 0.95, F test), and MLL3 (KMT2C) mutations with resistance to the IGF1R (insulin-like growth factor 1 receptor) inhibitor BMS-536924 (drug ID 1091; P = 0.97, F test). The mutation coefficients for these general associations were almost identical across tissues.

Figure 4. The interaction of BRAF and TP53 mutations is associated with resistance to the BRAF inhibitors dabrafenib and PLX4720. BRAF-mutated cell lines with additional TP53 mutations show decreased sensitivity to dabrafenib (A) and PLX4720 (B). Drugs (bold) and drug targets (italic) are depicted. Cancer cell lines are grouped by dataset (GDSC or CTRP) and by mutation status of BRAF and TP53. Each point represents one cancer cell line. Horizontal lines show the median drug response per group. See also Supplementary Fig. S1.

Figure 5. The association of a mutation with drug response can be tissue-specific. A, The observed percentage of tissue-specific models (P < 0.05, F test) exceeds random expectation (P < 0.001, randomization test). B, Mutation coefficients for tissue-specific associations (blue) show high variability, whereas coefficients for general associations (gray) are almost identical across tissues. Each point represents the mutation coefficient for a single tissue. Negative coefficients represent sensitivity associations, whereas positive coefficients represent resistance associations. Drug names (bold, with drug IDs in parentheses), drug targets (italic), and mutated genes (plain) are indicated. See also Supplementary Table S1.

Drug-specific background models tend to outperform the default background model

On the basis of our observations that (i) the association of covariates with drug response can be drug-specific (Fig. 2); (ii) single mutations can show strong associations with drug response (Fig. 3A); and (iii) mutation associations can be tissue-specific (Fig. 5B), we adapted our background model. For each drug, we started from the default background model with four covariates and set up models that allow for the exclusion of covariates, the inclusion of single mutations, and the inclusion of mutation–tissue interactions. These models, termed drug-specific background models, explain drug response better for most drugs (Supplementary Fig. S2).

Mutation–mutation interactions and synthetic lethal triplets

The drug-specific background models account for many important predictors of drug response described here. Therefore, we used them as a starting point to identify mutation–mutation interactions. We assembled a list of testable drugs and mutation pairs that occur in the GDSC and the CTRP dataset. By applying a set of conditions to both datasets, we retrieved mutation–mutation interaction candidates (Supplementary Table S2).

We reidentified the BRAF–TP53 interaction that consistently mediates resistance to the BRAF inhibitor PLX4720 (drug ID 1371, Fig. 4B and Supplementary Fig. S1C and S1D; Supplementary Table S2) in the GDSC (P = 0.0005, t test) and the CTRP (P = 0.047, t test) dataset. The drug-specific background model for PLX4720 contains the tissue of origin and the mutations NRAS and BRAF. It also contains the mutation–tissue interaction terms for both genes, confirming the tissue specificity of the association between NRAS and BRAF mutations and response to PLX4720 (Supplementary Table S1; Fig. 5B).

In addition, our results suggest that cell lines with CREBBP and FGFR2 mutations tend to be resistant to the cytotoxic drugs gemcitabine [P = 0.01 (GDSC) and 0.02 (CTRP), t test], bleomycin [drug ID 1378, P = 0.0001 (GDSC) and 0.008 (CTRP), t test], and SN-38 [P < 10⁻⁵ (GDSC) and P = 0.002 (CTRP), t test; Supplementary Fig. S3; Supplementary Table S2]. Because of the involvement of both CREBBP (28) and FGFR2 (29) in DNA repair, we hypothesized that drug resistance may arise as a result of an increased DNA repair capacity or an increased DNA damage tolerance (30).

Next, we wanted to identify mutation pairs that show synthetic lethality upon drug treatment. Two genes are synthetically lethal if the perturbation of either gene is viable, but the simultaneous perturbation of both genes leads to cell death (31). One gene can be perturbed by a mutation while the other gene can be perturbed by a drug. In the context of the mutation pairs studied here, we searched for synthetic lethality between a drug and two mutations, thus, synthetic lethal triplets. We applied a set of conditions to identify synthetic lethal candidates (Supplementary Table S2).

We identified the relationship between the MDM2 inhibitor Nutlin-3a and mutations in CTNNB1 (β-catenin) and PIK3CA as a synthetic lethal triplet. We observed increased sensitivity to Nutlin-3a in cell lines with both CTNNB1 and PIK3CA mutations [P = 0.02 (GDSC and CTRP), t test; Fig. 6A]. The proapoptotic transcription factor FOXO3 may provide a mechanistic link between Nutlin-3a, CTNNB1, and PIK3CA (Fig. 6B). On one hand, FOXO3 is inhibited by CTNNB1 and PIK3CA via its downstream effector AKT kinase (32). On the other hand, FOXO3 is degraded by MDM2 (33).

In addition, we found that simultaneous mutation of KRAS and MAP3K4 sensitizes cancer cell lines to the DNA synthesis inhibitor cytarabine [P = 0.003 (GDSC) and 0.001 (CTRP), t test; Fig. 6C]. Both KRAS and MAP3K4 take part in MAPK signaling, which contributes to the regulation of cell proliferation, including the induction of DNA synthesis (34). Their involvement in parallel pathways (MetaCore, Ras family GTPases in kinase cascades; http://pathwaymaps.com/maps/379; Fig. 6D) may explain why the perturbation of both genes synergistically increases sensitivity to cytarabine.

Discussion

We found that including interactions in mutation pair models tends to improve model performance and robustness. Models that neglect interactions follow the strong assumption that individual mutation effects are additive, but biological processes usually depend on the cooperation of multiple players.

Our mutation pair models predicted several interactions between cancer driver mutations. For BRAF–TP53 (35) and PIK3CA–CTNNB1 (36), a cooperative role in tumorigenesis, but not in drug response, was described in previous studies. We observed that mutation pairs that jointly mediate drug resistance or drug sensitivity are often involved in parallel pathways upstream of the drug target.

We highlight a BRAF×TP53 mutation pair model with interaction that explains response to the BRAF inhibitors dabrafenib and PLX4720 in the GDSC and the CTRP dataset. Not only in cancer cell lines, but also in clinical data, we observed that TP53 mutations tend to confer resistance to BRAF inhibitors in BRAF-mutated contexts. In line with our findings, an experimental study identified a link between BRAF mutations and the TP53 pathway (37). The miR-3151 functions downstream of BRAF and downregulates the expression and nuclear localization of TP53. Because cell lines that are resistant to the BRAF inhibitor vemurafenib overexpress miR-3151, targeting miR-3151 could represent a promising therapeutic approach to overcome vemurafenib resistance. Another study showed that the TP53 reactivator PRIMA-1Met can potentiate the effect of vemurafenib in melanoma cell lines (38). A combination trial with dabrafenib and PRIMA-1Met is ongoing (https://www.clinicaltrials.gov/ct2/show/NCT03391050), which suggests that the BRAF–TP53 interaction we identified can be therapeutically exploited.

We illustrate synthetic lethal triplets between a drug and two mutations, for instance, increased sensitivity to cytarabine in the presence of both KRAS and MAP3K4 mutations. Because synthetic lethality databases (39, 40) are usually confined to synthetic lethal pairs, they do not allow us to validate our synthetic lethal triplets. Instead, we show that we can robustly identify synthetic lethal triplets in the GDSC and the CTRP dataset. Despite many overlapping cell lines, we consider these two datasets as independent because the experimental setup including the viability assay is different (4, 7). Because we do not use a binarization threshold to define sensitive and resistant cell lines, it can be argued that the effects we observe should be described as synthetic sickness rather than synthetic lethality (31).

Because the tissue specificity of in vitro drug responses remains controversial (3, 4, 41, 42), modeling drug response based on tissue-specific or pan-cancer datasets is regarded as a critical decision (43). In contrast to previous studies that opted for tissue subsets (4), we tested for tissue specificity within the pan-cancer setting, which allows us to estimate the model coefficients based on the entire dataset. We show that the tissue of origin can not only influence drug response by itself but also modulate the effect of a mutation.

Previous studies (4) concluded that gene expression predicts drug response better than mutation data. However, the predictive power of gene expression strongly correlates with the tissue of origin. Instead of using expression data in our models, we included the tissue of origin as a covariate. We control for confounding factors by assessing model performance in comparison with background models. At the same time, we keep our models small enough to be readily comprehensible.

Depending on the sample size, the sparsity of the binary mutation matrix poses a challenge for the analysis of interactions. A sufficient number of samples with both mutations is required to reliably estimate interaction effects. To address this in future studies, mutation events may be grouped into pathways. Our analytic framework can be used for the assessment of other drug response screens or dependency data like shRNA or CRISPR screens. Follow-up studies, especially of experimental nature, could give insights into the molecular mechanisms that underlie strong deviations from additivity. Understanding resistance and sensitivity mechanisms that involve interactions can improve our understanding of a drug's mode of action.

Figure 6. Synthetic lethal triplets between a mutation pair and a drug. A, Mutations in CTNNB1 and PIK3CA sensitize cancer cell lines to the MDM2 inhibitor Nutlin-3a. B, The transcription factor FOXO3 may provide a mechanistic link for the synthetic lethal triplet between CTNNB1 and PIK3CA mutations, and the drug Nutlin-3a. C, Mutations in KRAS and MAP3K4 sensitize cancer cell lines to the DNA synthesis inhibitor cytarabine. D, KRAS and MAP3K4 are involved in parallel pathways that induce DNA synthesis, which may explain why their mutation synergistically increases sensitivity to cytarabine. A and C depict drugs (bold) and drug targets (italic). Cell lines are grouped by dataset (GDSC or CTRP) and by mutation status of the two mutated genes. Each point represents the measured drug response for one cell line. Horizontal lines represent the median drug response per group. See also Supplementary Table S2.

In summary, we describe interactions of mutations in drug response. We show that although some associations between a mutation and drug response are generalizable across contexts, others strongly depend on the mutation status of additional genes or the tissue of origin. Including interactions tends to improve the performance and robustness of drug response prediction models. Our work contributes to a systems-level understanding of the factors that mediate drug response.

Disclosure of Potential Conflicts of Interest

D. Cramer is a scientist (Biostatistics) at Immatics Biotechnologies GmbH and reports receiving a commercial research grant from Merck KGaA. J. Mazur is the principal scientist (Oncology Bioinformatics) at Merck KGaA and has ownership interest (including patents) in Merck KGaA stocks. O. Espinosa is a biostatistician at Fast Track Diagnostics Ltd. and has ownership interest (including patents) in Merck KGaA. E. Staub is the director (Oncology Bioinformatics) at Merck Healthcare KGaA. No potential conflicts of interest were disclosed by the other authors.

Authors' Contributions

Conception and design: D. Cramer, J. Mazur, O. Espinosa, R. Eils, E. Staub
Development of methodology: D. Cramer, J. Mazur, O. Espinosa, D. Hübschmann, R. Eils, E. Staub
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): D. Cramer, J. Mazur, D. Hübschmann
Writing, review, and/or revision of the manuscript: D. Cramer, J. Mazur, O. Espinosa, M. Schlesner, D. Hübschmann, R. Eils, E. Staub
Study supervision: J. Mazur, M. Schlesner, D. Hübschmann, R. Eils, E. Staub

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received January 22, 2019; revised June 21, 2019; accepted December 4, 2019; published first December 11, 2019.

References

1. Garraway LA. Genomics-driven oncology: framework for an emerging paradigm. J Clin Oncol 2013;31:1806–14.

2. Gillet JP, Varma S, Gottesman MM. The clinical relevance of cancer cell lines. J Natl Cancer Inst 2013;105:452–8.

3. Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 2012;483:570–5.

4. Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, et al. A landscape of pharmacogenomic interactions in cancer. Cell 2016;166:740–54.

5. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al. The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012;483:603–7.

6. Basu A, Bodycombe NE, Cheah JH, Price EV, Liu K, Schaefer GI, et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell 2013;154:1151–61.

7. Seashore-Ludlow B, Rees MG, Cheah JH, Coko M, Price EV, Coletti ME, et al. Harnessing connectivity in a large-scale small-molecule sensitivity dataset. Cancer Discov 2015;5:1210–23.

8. Safikhani Z, El-Hachem N, Smirnov P, Freeman M, Goldenberg A, Birkbak NJ, et al. Safikhani et al. reply. Nature 2016;540:E2.

9. Haibe-Kains B, El-Hachem N, Birkbak NJ, Jin AC, Beck AH, Aerts HJ, et al. Inconsistency in large pharmacogenomic studies. Nature 2013;504:389–93.

10. Haverty PM, Lin E, Tan J, Yu Y, Lam B, Lianoglou S, et al. Reproducible pharmacogenomic profiling of cancer cell line panels. Nature 2016;533:333–7.

11. Goossens N, Nakagawa S, Sun X, Hoshida Y. Cancer biomarker discovery and validation. Transl Cancer Res 2015;4:256–69.

12. Knijnenburg TA, Klau GW, Iorio F, Garnett MJ, McDermott U, Shmulevich I, et al. Logic models to predict continuous outputs based on binary inputs with an application to personalized cancer therapy. Sci Rep 2016;6:36812.

13. Nguyen L, Dang CC, Ballester PJ. Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data. F1000Res 2016;5. doi: 10.12688/f1000research.10529.2.

14. Bateson W, Punnett R, Hurst C. Reports to the Evolution Committee of the Royal Society, report II. London, United Kingdom: Harrison and Sons; 1905.

15. Liu Y, Fei T, Zheng X, Brown M, Zhang P, Liu XS, et al. An integrative pharmacogenomic approach identifies two-drug combination therapies for personalized cancer medicine. Sci Rep 2016;6:22120.

16. Jiang P, Lee W, Li X, Johnson C, Liu JS, Brown M, et al. Genome-scale signatures of gene interaction from compound screens predict clinical efficacy of targeted cancer therapies. Cell Syst 2018;6:343–54.

17. Yang W, Soares J, Greninger P, Edelman EJ, Lightfoot H, Forbes S, et al. Genomics of drug sensitivity in cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res 2013;41:D955.

18. Bouhaddou M, DiStefano MS, Riesel EA, Carrasco E, Holzapfel HY, Jones DC, et al. Drug response consistency in CCLE and CGP. Nature 2016;540:E9–10.

19. Huang S, Pang L. Comparing statistical methods for quantifying drug sensitivity based on in vitro dose-response assays. Assay Drug Dev Technol 2012;10:88–96.

20. Forbes SA, Beare D, Boutselakis H, Bamford S, Bindal N, Tate J, et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res 2017;45:D777–83.

21. Kim JW, Botvinnik OB, Abudayyeh O, Birger C, Rosenbluh J, Shrestha Y, et al. Characterizing genomic alterations in cancer by complementary functional associations. Nat Biotechnol 2016;34:539–46.

22. Van Allen EM, Wagle N, Sucker A, Treacy DJ, Johannessen CM, Goetz EM, et al. The genetic landscape of clinical resistance to RAF inhibition in metastatic melanoma. Cancer Discov 2014;4:94–109.

23. Bollag G, Hirth P, Tsai J, Zhang J, Ibrahim PN, Cho H, et al. Clinical efficacy of a RAF inhibitor needs broad target blockade in BRAF-mutant melanoma. Nature 2010;467:596–9.

24. Rheault TR, Stellwagen JC, Adjabeng GM, Hornberger KR, Petrov KG, Waterson AG, et al. Discovery of dabrafenib: a selective inhibitor of Raf kinases with antitumor activity against B-Raf-driven tumors. ACS Med Chem Lett 2013;4:358–62.

25. Tsai J, Lee JT, Wang W, Zhang J, Cho H, Mamo S, et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc Natl Acad Sci 2008;105:3041–6.

26. Nazarian R, Shi H, Wang Q, Kong X, Koya RC, Lee H, et al. Melanomas acquire resistance to B-RAF(V600E) inhibition by RTK or N-RAS upregulation. Nature 2010;468:973–7.

27. Wecker H, Waller CF. Afatinib. Recent Results Cancer Res 2018;211:199–215.

28. Dutto I, Scalera C, Prosperi E. CREBBP and p300 lysine acetyl transferases in the DNA damage response. Cell Mol Life Sci 2018;75:1325–38.

29. Huang YL, Chou WC, Hsiung CN, Hu LY, Chu HW, Shen CY. FGFR2 regulates Mre11 expression and double-strand break repair via the MEK-ERK-POU1F1 pathway in breast tumorigenesis. Hum Mol Genet 2015;24:3506–17.

30. Cheung-Ong K, Giaever G, Nislow C. DNA-damaging agents in cancer chemotherapy: serendipity and chemical biology. Chem Biol 2013;20:648–59.

31. O’Neil NJ, Bailey ML, Hieter P. Synthetic lethality and cancer. Nat Rev Genet 2017;18:613–23.

32. Tenbaum SP, Ordóñez-Morán P, Puig I, Chicote I, Arqués O, Landolfi S, et al. β-Catenin confers resistance to PI3K and AKT inhibitors and subverts FOXO3a to promote metastasis in colon cancer. Nat Med 2012;18:892–901.

33. Fu W, Ma Q, Chen L, Li P, Zhang M, Ramamoorthy S, et al. MDM2 acts downstream of p53 as an E3 ligase to promote FOXO ubiquitination and degradation. J Biol Chem 2009;284:13987–4000.

34. Zhang W, Liu HT. MAPK signal pathways in the regulation of cell proliferation in mammalian cells. Cell Res 2002;12:9–18.


35. Yu H, McDaid R, Lee J, Possik P, Li L, Kumar SM, et al. The role of BRAF mutation and p53 inactivation during transformation of a subpopulation of primary human melanocytes. Am J Pathol 2009;174:2367–77.

36. Riemer P, Rydenfelt M, Marks M, van Eunen K, Thedieck K, Herrmann BG, et al. Oncogenic β-catenin and PIK3CA instruct network states and cancer phenotypes in intestinal organoids. J Cell Biol 2017;216:1567–77.

37. Lankenau MA, Patel R, Liyanarachchi S, Maharry SE, Hoag KW, Duggan M, et al. MicroRNA-3151 inactivates TP53 in BRAF-mutated human malignancies. Proc Natl Acad Sci 2015;112:E6744–51.

38. Krayem M, Journe F, Wiedig M, Morandini R, Najem A, Salès F, et al. p53 reactivation by PRIMA-1Met (APR-246) sensitises V600E/KBRAF melanoma to vemurafenib. Eur J Cancer 2016;55:98–110.

39. Li XJ, Mishra SK, Wu M, Zhang F, Zheng J. Syn-lethality: an integrative knowledge base of synthetic lethality towards discovery of selective anticancer therapies. Biomed Res Int 2014;2014:196034.

40. Jerby-Arnon L, Pfetzer N, Waldman YY, McGarry L, James D, Shanks E, et al. Predicting cancer-specific vulnerability via data-driven detection of synthetic lethality. Cell 2014;158:1199–209.

41. Jaeger S, Duran-Frigola M, Aloy P. Drug sensitivity in cancer cell lines is not tissue-specific. Mol Cancer 2015;14:40.

42. Yao F, Madani Tonekaboni SA, Safikhani Z, Smirnov P, El-Hachem N, Freeman M, et al. Tissue specificity of in vitro drug sensitivity. J Am Med Informatics Assoc 2017;25:158–66.

43. Azuaje F. Computational models for predicting drug responses in cancer research. Brief Bioinform 2017;18:820–9.


CANCER IMMUNOLOGY RESEARCH | RESEARCH ARTICLE

High-Throughput Prediction of MHC Class I and II Neoantigens with MHCnuggets

Xiaoshan M. Shao1,2, Rohit Bhattacharya1,3, Justin Huang1,3, I.K. Ashok Sivakumar1,3,4, Collin Tokheim1,2, Lily Zheng1,5, Dylan Hirsch1,2, Benjamin Kaminow1,6, Ashton Omdahl1,2, Maria Bonsack7,8,9, Angelika B. Riemer7,8, Victor E. Velculescu1,5,10, Valsamo Anagnostou10, Kymberleigh A. Pagel1,2, and Rachel Karchin1,2,10

ABSTRACT

Computational prediction of binding between neoantigen peptides and major histocompatibility complex (MHC) proteins can be used to predict patient response to cancer immunotherapy. Current neoantigen predictors focus on in silico estimation of MHC binding affinity and are limited by low predictive value for actual peptide presentation, inadequate support for rare MHC alleles, and poor scalability to high-throughput data sets. To address these limitations, we developed MHCnuggets, a deep neural network method that predicts peptide–MHC binding. MHCnuggets can predict binding for common or rare alleles of MHC class I or II with a single neural network architecture. Using a long short-term memory network (LSTM), MHCnuggets accepts peptides of variable length and is faster than other methods. When compared with methods that integrate binding affinity and MHC-bound peptide (HLAp) data from mass spectrometry, MHCnuggets yields a 4-fold increase in positive predictive value on independent HLAp data. We applied MHCnuggets to 26 cancer types in The Cancer Genome Atlas, processing 26.3 million allele–peptide comparisons in under 2.3 hours, yielding 101,326 unique predicted immunogenic missense mutations (IMM). Predicted IMM hotspots occurred in 38 genes, including 24 driver genes. Predicted IMM load was significantly associated with increased immune cell infiltration (P < 2 × 10⁻¹⁶), including CD8+ T cells. Only 0.16% of predicted IMMs were observed in more than 2 patients, with 61.7% of these derived from driver mutations. Thus, we describe a method for neoantigen prediction and its performance characteristics and demonstrate its utility in data sets representing multiple human cancers.

Introduction

The presentation of peptides bound to major histocompatibility complex (MHC) proteins on the surface of antigen-presenting cells and subsequent recognition by T-cell receptors is fundamental to the mammalian adaptive immune system. Neoantigens derived from somatic mutations are targets of immunoediting and drive therapeutic responses in cancer patients treated with immunotherapy (1, 2). Because experimental characterization of neoantigens is both costly and time-consuming, computational methods have been developed to predict peptide–MHC binding and subsequent immune response (3, 4). Supervised neural network machine learning approaches have performed the best (5–7) and are the most widely used in silico methods. Despite advances in computational approaches, improvements in predictive performance have been minimal, due in part to a lack of sufficiently large sets of experimentally characterized peptide binding affinities for most MHC alleles.

Although neoantigen prediction for common MHC class I alleles is well studied (8), predictive accuracy on rare and less-characterized MHC alleles remains poor (9, 10). Class II predictors are scarce (11). Current estimates suggest that class II antigen lengths primarily range from 13 to 25 amino acids (12), and this diversity has been an obstacle to developing in silico neoantigen predictors (11, 13). As most neural network architectures are designed for fixed-length inputs, methods such as NetMHC (14–17) and MHCflurry (18) require preprocessing of peptide sequences or training of separate classifiers for each peptide length.

Clinical application of MHC–peptide binding predictors to identify biomarkers for cancer immunotherapy requires scalability to large patient cohorts and low false-positive rates (19). A cancer may contain hundreds of candidate somatically altered peptides, but few will actually bind to MHC proteins and elicit an immune response (20). For many years, most neoantigen predictors were trained primarily on quantitative peptide–MHC binding affinity data from in vitro experiments (21). Advances in immunopeptidomics technologies have enabled the identification of thousands of naturally presented MHC-bound peptides (HLAp) from cancer patient samples and cell lines (19, 22). Several neoantigen predictors are trained only on HLAp data for class I, and only for a limited number of peptide lengths (21, 23). The EDGE neural network is trained primarily on multiallelic HLAp and RNA sequencing (RNA-seq) data from 74 cancer patients; ForestMHC is a random forest trained on HLAp from publicly available monoallelic and deconvoluted multiallelic cell lines. The potential to improve neoantigen predictors by integrating binding affinity and HLAp data (19) has motivated hybrid approaches (14, 18). However, most methods predict more candidate neoantigens than are actually immunogenic in patients (11, 19).

1 Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland.
2 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland.
3 Department of Computer Science, Johns Hopkins University, Baltimore, Maryland.
4 Applied Physics Laboratory, Johns Hopkins University, Laurel, Maryland.
5 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland.
6 Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland.
7 Immunotherapy and Immunoprevention, German Cancer Research Center (DKFZ), Heidelberg, Germany.
8 Molecular Vaccine Design, German Center for Infection Research (DZIF), partner site Heidelberg, Heidelberg, Germany.
9 Faculty of Biosciences, Heidelberg University, Heidelberg, Germany.
10 The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.

Note: Supplementary data for this article are available at Cancer Immunology Research Online (http://cancerimmunolres.aacrjournals.org/).

X.M. Shao, R. Bhattacharya, J. Huang, and I.K.A. Sivakumar contributed equally to this article.

Corresponding Author: Rachel Karchin, Johns Hopkins University, 316 Hackerman Hall, 3400 North Charles Street, Baltimore, MD 21204. Phone: 410-516-5578; Fax: 410-516-5294; E-mail: [email protected]

Cancer Immunol Res 2020;8:396–408

doi: 10.1158/2326-6066.CIR-19-0464

©2019 American Association for Cancer Research.

Here, we present a long short-term memory (LSTM) neural network method, MHCnuggets, a neoantigen predictor designed for MHC class I and II alleles in a single framework. The method leverages transfer learning and allele clustering to accommodate both common, well-characterized MHC alleles and rare, less-studied alleles. Although existing computational neoantigen predictors generate a ranked list of candidate peptides, maximizing the number of predictions that identify immunogenic peptides would be preferred in many applications (18). We demonstrate that MHCnuggets’ predictive performance is competitive with widely used methods on binding affinity benchmark data sets. In comparison with hybrid methods that integrate binding affinity and HLAp data, MHCnuggets shows fewer false positives and increased positive predictive value (PPV) in a held-out cell line data set of ligands identified by mass spectrometry (7, 22). To demonstrate the clinical utility and applicability of MHCnuggets to large patient cohorts, we investigated candidate immunogenic mutations from 26 tumor types in The Cancer Genome Atlas (TCGA). MHCnuggets yielded 101,326 predicted immunogenic missense mutations (IMM), observed in at least 1 individual (out of 1,124,266) in less than 2.3 hours. These mutations were correlated with increased lymphocyte infiltration; however, only 0.16% were observed in more than 2 patients.

Materials and Methods

Implementation

MHCnuggets uses an LSTM neural network architecture (ref. 24; Fig. 1A). LSTM architectures excel at handling variable-length sequence inputs and can learn long-term dependencies between noncontiguous elements, enabling an input encoding that does not require peptide shortening or splitting (Fig. 1B). LSTMs are capable of handling peptides of any length. In practice, a maximum peptide length should be selected for network training. For computational efficiency, we set the maximum peptide input length to 15 for class I and 30 for class II. These values cover the majority of lengths observed in naturally presented MHC-bound peptides (12). The networks were trained with transfer learning (25), which allows networks for less well-characterized alleles to leverage information from extensively studied alleles (Fig. 1C). Transfer learning was also used to train networks combining binding affinity and HLAp data sets. In addition, MHCnuggets architectures can be trained using continuous binding affinity measurements from in vitro experiments (half-maximal inhibitory concentration, IC50), immunopeptidomic (HLAp) binary labels, or both. The former utilizes a mean-squared error (MSE) loss, whereas the latter utilizes a binary cross-entropy (BCE) loss for training. For each MHC allele, we trained a neural network model consisting of an LSTM layer of 64 hidden units, a fully connected layer of 64 hidden units, and a final output layer of a single sigmoid unit (Fig. 1A).

For the 16 alleles where allele-specific HLAp training data were available (26), we trained networks on both binding affinity and HLAp data (MHCnuggets). Next, we trained networks only with binding affinity measurements (MHCnuggets without mass spectrometry data, or noMS) for all MHC class I alleles. Due to the lack of allele-specific HLAp training data for class II, all MHC class II networks were trained only on binding affinity measurements. In total, we trained 148 class I and 136 class II allele-specific networks. Common alleles comprise a small fraction (<1%) of all known MHC alleles (27). To handle binding predictions for rare alleles, MHCnuggets selects a network by searching for the closest allele, based on previously published supertype clustering approaches. We prioritized approaches based on binding pocket biochemical similarity when available. Briefly, HLA-A and HLA-B alleles were clustered by MHC binding pocket amino acid residue composition (28), and HLA-C and all MHC II alleles were hierarchically clustered based upon experimental mass spectrometry and binding assay results (29, 30). For alleles with no supertype classification, the closest allele was from the same HLA gene, and allele group if available, with preference for alleles with the largest number of characterized binding peptides. All networks were implemented with the Keras Python package (TensorFlow back-end; refs. 31, 32). Open-source software is available at https://github.com/KarchinLab/mhcnuggets, installable via pip or Docker, and has been integrated into the PepVacSeq (33), pvactools (34), and Neoepiscope (35) pipelines.
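The fallback order described above for rare alleles can be sketched as a small lookup. This is only an illustration under stated assumptions, not the MHCnuggets implementation: the function names, the `supertype_of` and `trained_counts` dictionaries, and the use of peptide count as the tie-breaker inside a supertype are all hypothetical.

```python
def parse_allele(name):
    """Split 'HLA-A*02:01' into gene 'HLA-A' and allele group '02'."""
    gene, _, fields = name.partition("*")
    group = fields.split(":")[0] if fields else None
    return gene, group


def closest_allele(query, trained_counts, supertype_of):
    """Pick the trained allele whose network stands in for `query`.

    trained_counts: {allele: number of characterized peptides} for alleles
                    that have a trained network (hypothetical structure).
    supertype_of:   {allele: supertype label} (hypothetical structure).
    """
    if query in trained_counts:
        return query  # a dedicated network exists

    def most_data(candidates):
        # Assumption: prefer the candidate with the most characterized peptides.
        return max(candidates, key=lambda a: trained_counts[a]) if candidates else None

    # 1. Same supertype, when the query has a supertype classification.
    supertype = supertype_of.get(query)
    if supertype is not None:
        hit = most_data([a for a in trained_counts if supertype_of.get(a) == supertype])
        if hit is not None:
            return hit

    # 2. Same gene and allele group, then same gene only.
    q_gene, q_group = parse_allele(query)
    same_gene = [a for a in trained_counts if parse_allele(a)[0] == q_gene]
    same_group = [a for a in same_gene if parse_allele(a)[1] == q_group]
    return most_data(same_group) or most_data(same_gene)
```

With toy peptide counts, a query that shares a supertype with two trained alleles resolves to the one with more data, and a query with no supertype falls back to its gene and allele group.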

Transformation of peptide binding affinities

Predicted binding affinity can be transformed into a range of values well suited for neural network learning by selecting a logarithmic base to match the weakest binding affinity of interest (36). For most benchmarks in this work, we used the standard upper limit of 50,000 nmol/L, so that predicted binding affinity was

$$y = \max\left(0,\ 1 - \log_{50\mathrm{k}}\left(\mathrm{IC}_{50}\right)\right)$$

For the Bonsack and colleagues data set (8), the upper limit was changed to 100,000 nmol/L because in their experiments, as described in O'Donnell and colleagues (18), binders were defined as peptides with IC50 < 100,000 nmol/L. As binding affinity was determined by in vitro HLA binding competition against a known strong binder (reported IC50 < 50 nmol/L), experimental IC50 values were in the micromolar range.
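As a minimal sketch of this transformation, the function below maps an IC50 in nmol/L to the training target and back. The function and parameter names are illustrative, not taken from the MHCnuggets code base; the upper limit is exposed as a parameter so that either 50,000 or 100,000 nmol/L can be used.

```python
import math


def ic50_to_target(ic50_nm, max_ic50_nm=50_000.0):
    """y = max(0, 1 - log_base(IC50)), with base = weakest affinity of interest."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return max(0.0, 1.0 - math.log(ic50_nm, max_ic50_nm))


def target_to_ic50(y, max_ic50_nm=50_000.0):
    """Inverse of the transform for y > 0: IC50 = base ** (1 - y)."""
    return max_ic50_nm ** (1.0 - y)
```

Under the 50,000 nmol/L limit, an IC50 of 1 nmol/L maps to 1, any IC50 at or above the limit maps to 0, and the common 500 nmol/L binder threshold maps to roughly 0.426.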

Performance metrics

$\mathrm{PPV} = N_{TP}/(N_{TP} + N_{FP})$, where $N_{TP}$ is the number of true positives and $N_{FP}$ is the number of false positives. We calculated PPV with respect to the top-ranked n peptides, where n is the number of true binders in the ranked list, denoted as PPVn. For the Bassani-Sternberg/Trolle (BST) benchmark, we also calculated PPV over the top 50- and 500-ranked peptides.
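The ranking-based PPV above can be written in a few lines. This is a sketch with hypothetical function names; it assumes higher scores mean stronger predicted binding and that labels are 1 for true binders and 0 otherwise.

```python
def ppv_at_k(scores, labels, k):
    """PPV over the k highest-scoring peptides."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of peptides")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = sum(label for _, label in ranked[:k])
    # All k selected peptides are called positive, so N_TP + N_FP = k.
    return true_positives / k


def ppv_n(scores, labels):
    """PPVn: k equals the number of true binders in the ranked list."""
    return ppv_at_k(scores, labels, sum(labels))
```

For the BST benchmark, the same helper would be called with `k=50` and `k=500`.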

High-Throughput Prediction of Neoantigens with MHCnuggets

AACRJournals.org Cancer Immunol Res; 8(3) March 2020

Selection of final network weights

To minimize overfitting, network training was stopped after 100 epochs; if the best PPVn was reached earlier, network weights from that earlier epoch were used in the final network. Notably, although we chose to optimize the networks on PPVn, an alternative approach could optimize on area under the ROC curve (auROC), Kendall's tau, or Pearson r correlation. For the two alleles in the Immune Epitope Database (IEDB) with the most training examples in their respective class, HLA-A*02:01 for class I and HLA-DRB1*01:01 for class II, training was stopped after 200 epochs.
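The weight-selection rule can be sketched as a loop that keeps a snapshot from the best epoch. The callables `train_one_epoch` and `evaluate_ppvn` are hypothetical placeholders for the real training step and metric; the text does not say which data PPVn was computed on, so that choice is left to the caller.

```python
def select_final_weights(train_one_epoch, evaluate_ppvn, max_epochs=100):
    """Train for max_epochs and return the weights from the best-PPVn epoch.

    train_one_epoch(): runs one epoch and returns a snapshot of the weights.
    evaluate_ppvn(weights): returns PPVn for that snapshot.
    """
    best_ppvn, best_epoch, best_weights = float("-inf"), None, None
    for epoch in range(1, max_epochs + 1):
        weights = train_one_epoch()
        ppvn = evaluate_ppvn(weights)
        if ppvn > best_ppvn:  # strict '>' keeps the earliest epoch on ties
            best_ppvn, best_epoch, best_weights = ppvn, epoch, weights
    return best_weights, best_epoch, best_ppvn
```

Swapping `evaluate_ppvn` for an auROC or correlation function gives the alternative selection criteria mentioned above.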

Network training

MSE loss $L_{\mathrm{MSE}}$ was used to train networks with continuous-valued binding affinity data and BCE loss $L_{\mathrm{BCE}}$ for binary HLAp data. For a data set with n samples,

$$L_{\mathrm{MSE}}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} \left(y^{(i)} - \hat{y}^{(i)}\right)^{2}$$

$$L_{\mathrm{BCE}}(y, \hat{y}) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y^{(i)} \log\left(\hat{y}^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - \hat{y}^{(i)}\right) \right]$$

All training used backpropagation with the Adam optimizer (37) and learning rate of 0.001. Regularization was performed with dropout and recurrent dropout (38) probabilities of 0.2. The number of hidden units, dropout rate, and number of training epochs were estimated by 3-fold cross-validation on MHC class I A*02:01, a common allele with a large number of experimentally characterized binding peptides.
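For readers who want to check the two losses numerically, here is a plain-Python sketch. It is not the Keras implementation used in the paper; the clipping constant `eps` is an assumption added only to avoid taking the logarithm of zero.

```python
import math


def mse_loss(y_true, y_pred):
    """Mean-squared error over n samples."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def bce_loss(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy over n samples (predictions clipped to [eps, 1 - eps])."""
    n = len(y_true)
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return -total / n
```

A prediction of 0.5 for every sample gives an MSE of 0.25 against 0/1 labels and a BCE of ln 2, which is a convenient sanity check.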

One-hot encoding

Peptides were represented to the network as a series of amino acids; each amino acid was represented as a 21-dimensional smoothed, one-hot encoded vector (0.9 and 0.005 replace 1 and 0, respectively).
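A minimal sketch of the smoothed encoding follows. It assumes that the 21 dimensions are the 20 standard amino acids plus one padding symbol, and the ordering of `ALPHABET` is arbitrary; neither detail is specified in the text.

```python
# Assumption: 20 standard amino acids plus the padding symbol "Z"; order is arbitrary.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYZ"
ON, OFF = 0.9, 0.005  # smoothed replacements for 1 and 0


def encode_peptide(peptide):
    """Return one 21-dimensional smoothed one-hot row per residue."""
    rows = []
    for residue in peptide:
        index = ALPHABET.index(residue)  # raises ValueError for unknown symbols
        row = [OFF] * len(ALPHABET)
        row[index] = ON
        rows.append(row)
    return rows
```

With these values each row still sums to 1, because 0.9 + 20 × 0.005 = 1.0.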

Peptide padding

MHCnuggets’ architecture is capable of handling peptides of any length, but in practice a maximum length should be selected (in this work, 15 for class I and 30 for class II). Peptides that are shorter than the maximum length are padded at the end with a character (“Z”, which is not in the amino acid alphabet) until they reach the maximum length.
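Padding to the class-specific maximum length can be sketched as below. The names are illustrative, and raising an error for peptides longer than the maximum is an assumption, since the text does not say how such peptides are handled.

```python
MAX_LEN = {"I": 15, "II": 30}  # maximum input lengths used in this work
PAD = "Z"                      # padding character, not an amino acid code


def pad_peptide(peptide, mhc_class="I"):
    """Right-pad a peptide with 'Z' up to the maximum length for its MHC class."""
    max_len = MAX_LEN[mhc_class]
    if len(peptide) > max_len:
        # Assumption: over-length peptides are rejected rather than truncated.
        raise ValueError(f"peptide longer than {max_len} residues")
    return peptide + PAD * (max_len - len(peptide))
```

For example, an 8-residue class I peptide receives seven trailing padding characters.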

Transfer learning protocol for binding affinity data only

We used transfer learning to improve network learning for MHC alleles with limited characterized peptides available for training. We first trained base allele-specific networks for class I and class II, using alleles with the most training examples in IEDB (HLA-A*02:01 for class I and HLA-DRB1*01:01 for class II). For all other alleles, the final weights of the base network for its respective class were used to initialize network training, and then an allele-specific network was trained for each allele. Next, we assessed prediction performance of each allele-specific network on the training examples for each of the alleles. For each allele, if the network that performed best was not the HLA-A*02:01 network (for class I alleles) or HLA-DRB1*01:01 network (for class II alleles), we did a second round of training, with the best-performing network's weights used in the initialization step.
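The two-round protocol can be outlined with placeholder callables. This is a schematic under stated assumptions, not the authors' code: `train` and `score` are hypothetical stand-ins for network training and for evaluation on an allele's training examples, and the sketch assumes an allele's own network is not considered as a donor for its second round.

```python
def transfer_learn(alleles, base_allele, train, score):
    """Two rounds of transfer learning for one MHC class.

    train(allele, init_weights) -> weights   (init_weights=None: train from scratch)
    score(weights, allele) -> float          (higher is better)
    """
    # Base network: the allele with the most training examples.
    base = train(base_allele, None)
    round1 = {base_allele: base}

    # Round 1: every other allele starts from the final base weights.
    for allele in alleles:
        if allele != base_allele:
            round1[allele] = train(allele, base)

    # Round 2: retrain only when a non-base network predicts this allele best.
    final = dict(round1)
    for allele in alleles:
        if allele == base_allele:
            continue
        donors = [a for a in round1 if a != allele]  # assumption: exclude own network
        best = max(donors, key=lambda a: score(round1[a], allele))
        if best != base_allele:
            final[allele] = train(allele, round1[best])
    return final
```

The function would be called once per class, with HLA-A*02:01 as the base for class I and HLA-DRB1*01:01 as the base for class II.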

Transfer learning protocol for binding affinity and HLAp data

To integrate HLAp data into the class I networks, we initially trained each network with binding affinity data as described above, transferred the final weights to a new network, and then continued training with the HLAp data as positive examples augmented with random peptide decoys as negative examples.

Figure 1.

A, MHCnuggets’ architecture. A network is trained for each MHC allele. Each network has an LSTM layer with 64 hidden units, a fully connected layer with 64 hidden units, and a final output layer of a single sigmoid unit. B, Input scheme for peptides with variable lengths. MHCnuggets’ architecture is capable of handling peptides of any length, but in practice, a maximum length should be selected. Peptides are extended with padding until they reach the maximum length, prior to input into the neural network. The example shows padding for class II peptides with maximum length set to 30 amino acids. C, Transfer learning protocol for parameter sharing among alleles. A base allele-specific network is trained for each MHC class, with an allele selected by largest number of training examples. Transfer learning is applied to train networks for the remaining alleles, with initial network weights set to final base network weights. A fine-tuning step identifies alleles that can be leveraged for a second round of transfer learning to produce a final network (Materials and Methods).

Performance assessment

To accurately assess the performance of MHCnuggets on a variety of MHC–peptide binding prediction tasks, we utilized six benchmark sets covering MHC class I alleles, MHC class II alleles, common alleles with a trained model (allele-specific prediction), and rare alleles (pan-allele prediction; Fig. 2; Supplementary Table S1). To compare to the HLA ligand prediction tools from the NetMHC group (NetMHC 3.0, NetMHC 4.0, NetMHCpan 2.0, NetMHCpan 4.0; refs. 16, 17), which can be trained only by their developers, as well as the open-source MHCflurry tools (18), we used multiple benchmarking strategies: (i) an independent benchmark test set of peptides not included as training data for any of the methods; (ii) a previously published paired training/testing benchmark; (iii) a 5-fold cross-validation benchmark; and (iv) a leave-one-molecule-out (LOMO) benchmark.

We evaluated six MHC class I predictors on independent binding affinity and HLAp data sets (7, 8, 22). First, we compared MHCnuggets to several class I predictors that incorporate both binding affinity and HLAp data: MHCflurry 1.2.0, MHCflurry (train-MS), NetMHC 4.0, and NetMHCpan 4.0. Each method was benchmarked using an independent set of MHC-bound peptides identified by mass spectrometry across seven cell lines for six MHC I alleles. For testing, HLAp hits were combined with random decoy peptides sampled from the human proteome in a 1:999 hit–decoy ratio, as described by Abelin and colleagues (26), totaling 23,971,000 peptides. Next, four MHC class I predictors trained only on binding affinity data [MHCnuggets (noMS), MHCflurry (noMS), NetMHC 3.0, and NetMHCpan 2.0] were evaluated with the Kim and colleagues data set (5), in which each predictor was trained with the BD2009 data and tested on BLIND data. It was possible to compare NetMHC 3.0 and NetMHCpan 2.0 performance on Kim and colleagues, because they have previously published predicted IC50 values for all peptide–MHC pairs in BLIND. This allowed us to calculate their PPVn, auROC, Kendall's tau, and Pearson r correlations.

Next, we compared MHCnuggets’ class II ligand prediction performance with self-reported performance statistics of the NetMHC group's MHC class II methods (39). We used the Jensen and colleagues 5-fold cross-validation benchmark to assess allele-specific MHC class II prediction of MHCnuggets and NetMHCII 2.3, for 27 alleles. NetMHCII 2.3 reported the average auROC for 5-fold cross-validation, and we report MHCnuggets’ PPV for each of the 27 alleles as well as the average auROC, Pearson r, and Kendall's tau correlations.

The LOMO benchmarks are a type of cross-validation designed to estimate the performance of peptide binding prediction with respect to rare MHC alleles. Given training data for n MHC alleles, the data for a single allele are held out and networks are trained for the remaining n − 1 alleles. Then for each peptide, predictions are generated by the remaining networks. We designed a LOMO benchmark to evaluate MHC class I rare allele prediction, by selecting 20 alleles with 30 to 100 characterized peptides in the IEDB (40). For class II rare allele prediction, we used the Jensen and colleagues (39) LOMO benchmark. We were unable to assess rare allele prediction for NetMHC class I methods, as no published results were available. For the NetMHC class II methods, we compared MHCnuggets to their self-reported auROCs.
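The LOMO procedure can be sketched as a loop over held-out alleles. The callables are hypothetical placeholders, and the sketch assumes that allele-specific networks can be trained once and reused, since holding one allele out only removes its own network from the pool.

```python
def lomo_evaluate(data_by_allele, train, pick_network, metric):
    """Leave-one-molecule-out evaluation.

    data_by_allele: {allele: peptide data for that allele}
    train(data) -> network
    pick_network(held_out, remaining) -> allele name chosen from `remaining`
    metric(network, data) -> score on the held-out allele's data
    """
    # Assumption: one network per allele, trained once and reused across folds.
    networks = {allele: train(data) for allele, data in data_by_allele.items()}

    results = {}
    for held_out, data in data_by_allele.items():
        # The held-out allele's own network is never used to predict for it.
        remaining = {a: n for a, n in networks.items() if a != held_out}
        surrogate = pick_network(held_out, remaining)
        results[held_out] = metric(remaining[surrogate], data)
    return results
```

In practice `pick_network` would implement the closest-allele search, and `metric` could be PPVn or auROC.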

Data set collection and curation

Data sources for network training and testing, TCGA somatic mutations, TCGA tumor gene expression, and haplotype calling are shown in Supplementary Tables S1 and S2. A curated version of the IEDB 2018 (40) and the 16 class I monoallelic B-cell line immunopeptidomes (26) was provided by Tim O'Donnell (https://data.mendeley.com/datasets/8pz43nvvxh/2). Binding affinity assays of human papillomavirus (HPV)–derived peptides were provided by Maria Bonsack and Angelika Riemer (8). The BST data set comprises immunopeptidomes from six cell lines with multiallelic MHCs (26 MHC class I alleles; ref. 22) and from soluble HLA (sHLA)–transfected HeLa cells separated by allele (4 MHC class I alleles; ref. 7). Decoy random peptides were sampled from the human proteome.

Figure 2.

MHCnuggets’ features. A, Venn diagram representation of the MHC–peptide binding prediction functions of MHCnuggets and similar tools. B, Training and MHC allele model selection scheme for MHCnuggets.

Training data sets

The networks were trained with data from a curated version of the IEDB (2018; ref. 40), containing chemical binding affinity measurements for 241,553 peptide–allele pairs covering 217 class I alleles and 96,211 peptide–allele pairs covering 135 class II alleles. Additional training data consisted of 16 class I monoallelic B-cell line immunopeptidomes (26). The immunopeptidome data are limited to HLAp binders and were supplemented by decoy random peptides sampled from the human proteome (https://data.mendeley.com/datasets/8pz43nvvxh/2).

Benchmark data sets

Kim and colleagues: This benchmark contained 53 MHC class I alleles and 137,654 IC50 measurements published prior to 2009 (training set) and 53 unique MHC class I alleles with 26,888 IC50 measurements, published from 2009 to 2013 (test set). Three alleles (HLA-B*27:03, HLA-B*38:01, and HLA-B*08:03) did not contain sufficient training data, and two alleles (HLA-A*46:01 and HLA-B*27:03) did not contain any peptides defined as binders in this work (IC50 < 500 nmol/L). Therefore, a total of four alleles (HLA-A*46:01, HLA-B*27:03, HLA-B*38:01, and HLA-B*08:03) were dropped from the analysis. All peptides in this benchmark set consisted of 8 to 11 amino acid residues.

Bonsack and colleagues: This data set contains 475 synthetic peptides derived from model protein sequences HPV16 E6 and E7 tested for binding to 7 alleles (HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*11:01, HLA-A*24:02, HLA-B*07:02, and HLA-B*15:01). Each peptide was tested in competition-based cellular binding assays with a known high-affinity fluorescein-labeled reference peptide. EBV-transformed B-lymphoblastic cells were stripped of their naturally bound peptides and mixed with serially diluted test peptides and 150 nmol/L of reference peptide. Each synthetic peptide was tested at 8 different concentrations ranging from 780 nmol/L to 100,000 nmol/L. Mixture fluorescence at each synthetic peptide concentration was measured with flow cytometry, and a nonlinear regression analysis was used to find the test peptide concentration that inhibited 50% of the reference peptide binding (IC50). Peptides were classified as binders (IC50 ≤ 100,000 nmol/L) or nonbinders (IC50 > 100,000 nmol/L). Peptides in this independent benchmark set do not have IEDB entries.

Bassani-Sternberg and colleagues, 2015: This data set contains 22,598 unique peptides eluted from 6 cell lines with multiallelic MHCs. Across the 6 cell lines, a total of 26 alleles were reported. For each multiallelic cell line, peptide/MHC pairs were found through deconvolution, following the protocol described by Abelin and colleagues (26), with the difference that we used MHCnuggets rather than NetMHCpan 2.8 (41) to predict IC50 values for each peptide–MHC pair. For each cell line, each peptide was initially assigned as a binder to all expressed alleles. Then, for each allele, we filtered out any peptide predicted to bind with IC50 > 1,000 nmol/L to that allele, and with IC50 < 150 nmol/L to any other allele. Peptides found for 6 alleles (HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*24:02, HLA-A*31:01, and HLA-B*51:01) were selected for allele-specific prediction testing. Trained networks were available for these alleles from all the methods that we compared.

Trolle and colleagues: This data set contains 15,524 unique peptides eluted from soluble HLA (sHLA)–transfected HeLa cells, a process that allowed for separating binding peptides to a single MHC allele. This data set reports peptides for 5 MHC alleles. Peptides found for 4 alleles (HLA-A*01:01, HLA-A*02:01, HLA-A*24:02, and HLA-B*51:01) were selected for testing. Peptide lengths in this data set range from 8 to 15 amino acid residues.

BST: This benchmark consists of 23,971 HLAp hits for 6 alleles, from Bassani-Sternberg and colleagues (22) and Trolle and colleagues (42), plus 23,947,029 random decoy peptides sampled from the human proteome. Any peptides found to overlap with the training HLAp data (26) were removed.
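The decoy count follows directly from the 1:999 hit–decoy ratio, and the sampling step can be sketched as random windows drawn from protein sequences. The sampler below is a hypothetical illustration: the default lengths, the exclusion of known hits from the decoys, and the uniform choice of protein and start position are all assumptions rather than details from the source.

```python
import random


def sample_decoys(proteome, n_decoys, lengths=(8, 9, 10, 11), exclude=(), seed=0):
    """Draw random peptide windows from a list of protein sequences."""
    rng = random.Random(seed)
    excluded = set(exclude)
    usable = [p for p in proteome if len(p) >= min(lengths)]
    if not usable:
        raise ValueError("no protein is long enough to yield a decoy")
    decoys, attempts = [], 0
    while len(decoys) < n_decoys:
        attempts += 1
        if attempts > 1000 * max(n_decoys, 1):
            raise RuntimeError("could not sample enough decoys")
        protein = rng.choice(usable)
        length = rng.choice(lengths)
        if len(protein) < length:
            continue
        start = rng.randrange(len(protein) - length + 1)
        peptide = protein[start:start + length]
        if peptide not in excluded:  # assumption: hits are never used as decoys
            decoys.append(peptide)
    return decoys


N_HITS = 23_971
N_DECOYS = 999 * N_HITS      # 1:999 hit-decoy ratio gives 23,947,029 decoys
N_TOTAL = N_HITS + N_DECOYS  # 23,971,000 peptides in the benchmark
```

The arithmetic reproduces the decoy count stated for the BST benchmark and the 23,971,000-peptide total reported in the performance assessment.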

Jensen and colleagues: This benchmark was designed to assess both allele-specific and rare MHC class II binding affinity predictors. Allele-specific prediction was tested with a 5-fold cross-validation experiment on peptides found in IEDB in 2016 but not 2013. Rare allele predictions were tested with the LOMO protocol.

IEDB class I rare alleles: This data set was designed to apply the LOMO protocol to class I alleles. It included 20 "pseudo-rare" alleles with 30 to 100 binding affinity peptide measurements in IEDB.

All data sets used in this work are available at http://dx.doi.org/10.17632/8c26kkrfpr.2.

TCGA analysis pipeline

To assess candidate immunogenic somatic mutations in patients from the TCGA cohort, we developed and implemented a basic pipeline based on whole-exome and RNA-seq data (Supplementary Fig. S1). Our analysis builds upon work from the TCGA PanCancer Analysis teams for drivers (43), mutation calling (44), and cancer immune landscapes (45). We obtained somatic mutation calls for all cancer types from Multicenter Mutation Calling in Multiple Cancers (MC3; v0.2.8; 7,775 patients). Tumor-specific RNA expression values from Broad TCGA Firehose were standardized across tumor types using the RSEM Z-score (46). MHC allele calls were obtained from the TCGA cancer immune landscape publication, in which up to 6 MHC class I alleles (HLA-A, HLA-B, and HLA-C) were identified for each patient using OptiType (ref. 47; Supplementary Table S2). We included patients for whom mutation calls, MHC allele calls, and RNA expression values were available from TCGA. After these considerations, the analysis included 6,613 patients from 26 TCGA tumor types. Six cancer types were not included in our analysis, because 15 or fewer patients met this requirement: lymphoid neoplasm diffuse large B-cell lymphoma, esophageal carcinoma, mesothelioma, skin cutaneous melanoma, stomach adenocarcinoma, and ovarian serous cystadenocarcinoma.

The somatic missense mutations identified in each patient were filtered to include only those with strong evidence of mutant gene RNA expression in that patient (Z > 1.0). For each mutation that passed this filter, we used the transcript assigned by MC3 to pull flanking amino acid residues from the SwissProt database (48), yielding a 21 amino acid residue sequence fragment centered at the mutated residue. All candidate peptides of length 8, 9, 10, and 11 that included the mutated residue were extracted from each sequence fragment. Next, binding

Shao et al.

Cancer Immunol Res; 8(3) March 2020 CANCER IMMUNOLOGY RESEARCH400


affinity predictions were generated for each mutated peptide for up to six MHC class I alleles, depending on the patient's HLA genotypes. In total, each somatic mutation was represented by 38 mutated peptides for up to six possible MHC pairings.
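The count of 38 follows from the window arithmetic: a peptide of length k can cover the central residue of a 21-residue fragment in k different positions, and 8 + 9 + 10 + 11 = 38. The sketch below illustrates this enumeration. The function name and the example fragment are hypothetical and are not taken from the MHCnuggets code base.

```python
def mutated_windows(fragment, mut_index, lengths=(8, 9, 10, 11)):
    """Return every window of the given lengths that covers position mut_index."""
    peptides = []
    for k in lengths:
        # A window starting at s covers mut_index when s <= mut_index <= s + k - 1.
        first = max(0, mut_index - k + 1)
        last = min(mut_index, len(fragment) - k)
        for s in range(first, last + 1):
            peptides.append(fragment[s:s + k])
    return peptides


# Hypothetical 21-residue fragment; the mutated residue ("M") is at index 10, the centre.
fragment = "ACDEFGHIKLMNPQRSTVWYA"
windows = mutated_windows(fragment, mut_index=10)
print(len(windows))
```

Each of these windows would then be paired with up to six patient-specific alleles before binding prediction.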

We applied a permissive filter to select candidate immunogenic peptides, requiring mutated peptides to have binding affinity of IC50 < 500 nmol/L for at least one MHC allele. Somatic missense mutations that generated neoantigens meeting these criteria were considered predicted IMMs. For a given patient, if a mutation was a predicted IMM for multiple alleles, it was counted only once using the MHC allele with the lowest predicted IC50. Finally, for each patient, we counted the number of predicted IMMs found in their exome and stratified by tumor type. We then identified predicted IMMs that were harbored by more than 1 patient.
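The filtering and counting rule above can be sketched as follows for a single patient. This is a minimal illustration, not the published pipeline: the record layout, the function name, the mutation identifiers, the peptides, and the IC50 values are all invented for the example.

```python
IC50_THRESHOLD_NM = 500.0  # permissive cutoff stated in the text


def best_binding_per_mutation(predictions, threshold=IC50_THRESHOLD_NM):
    """Keep one record per mutation: the peptide/allele pairing with the lowest IC50.

    predictions: iterable of (mutation_id, peptide, allele, ic50_nM) for one patient.
    Mutations with no pairing below the threshold are dropped.
    """
    best = {}
    for mutation, peptide, allele, ic50 in predictions:
        if ic50 >= threshold:
            continue  # does not meet IC50 < 500 nmol/L
        if mutation not in best or ic50 < best[mutation][0]:
            best[mutation] = (ic50, allele, peptide)
    return best


# Invented predictions for one hypothetical patient.
predictions = [
    ("mut1", "AAAAAAAAL", "HLA-A*11:01", 120.0),
    ("mut1", "AAAAAAAALV", "HLA-A*03:01", 45.0),
    ("mut2", "GGGGGGGGK", "HLA-A*02:01", 2300.0),
]
imms = best_binding_per_mutation(predictions)
print(len(imms), imms)
```

Counting the keys of the returned dictionary gives the per-patient predicted IMM count, with each mutation counted once.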

We sought to ascertain whether predicted IMMs occurred preferentially in particular gene or protein regions. Using the HotMAPS 1D algorithm v1.2.2 (49), we clustered the primary amino acid residue sequence to identify regions where mutations were frequently predicted as IMMs, with statistical significance (q < 0.01, Benjamini–Hochberg method; ref. 50). In this analysis, mutations were stratified by cancer type, and we considered enrichment within linear regions of 50 amino acid residues.
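For readers unfamiliar with the Benjamini–Hochberg adjustment used for the q < 0.01 cutoff, a minimal sketch of the standard step-up procedure is given here. It is a generic implementation with invented P values, not code from HotMAPS.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Invented P values for four hypothetical candidate regions.
p = [0.0001, 0.004, 0.03, 0.2]
q = benjamini_hochberg(p)
significant = [qi < 0.01 for qi in q]
print(q, significant)
```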

We considered that mutation immunogenicity might be associated with potential driver status of a mutation. Driver status was inferred by CHASMplus (51), a random forest classifier that utilizes a multifaceted feature set to predict driver missense mutations and is effective at identifying both common and rare driver mutations. For each mutation, immunogenicity was represented as a binary response variable and driver status was used as a covariate. Mutations with CHASMplus q-value < 0.01 were considered drivers (51). We modeled the relationship with univariate logistic regression (R glm function with binomial family and logit link).
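When the only covariate is binary, as driver status is here, the maximum-likelihood slope of a logistic regression equals the log odds ratio of the corresponding 2 × 2 table, so the fit can be illustrated without an iterative solver. The sketch below uses invented counts and a hypothetical function name; it does not reproduce the coefficient reported in this study.

```python
import math


def logistic_fit_binary(a, b, c, d):
    """Fit logit P(IMM) = intercept + slope * driver from a 2x2 table.

    a: driver and IMM         b: driver and not IMM
    c: non-driver and IMM     d: non-driver and not IMM
    """
    intercept = math.log(c / d)          # log odds of IMM among non-drivers
    slope = math.log(a / b) - intercept  # log odds ratio, drivers vs. non-drivers
    return intercept, slope


# Invented counts for illustration only.
intercept, slope = logistic_fit_binary(a=20, b=80, c=300, d=600)
odds_ratio = math.exp(slope)
print(round(slope, 3), round(odds_ratio, 3))
```

A negative slope corresponds to an odds ratio below 1, that is, lower odds of being a predicted IMM for drivers than for non-drivers.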

To assess whether the total number of predicted IMMs per patient was associated with changes in tumor immune infiltrates, we performed Poisson regression (R glm function with Poisson family and log link). All estimates of immune infiltrates were obtained from Thorsson and colleagues (45, 52). We fit two univariate models in which the response variable was the predicted IMM count and the covariate was either total leukocyte fraction or fraction of CD8+ T cells.

MC3 mutation filtering

MC3 TCGA somatic mutation calls were filtered for missense mutations.

Regression models

We applied two univariate Poisson regression models. In the first model, each patient's predicted IMM load was the response variable and the independent variable X was the total leukocyte fraction. The fitted coefficient β = 0.75 (P < 2 × 10⁻¹⁶, Wald test) indicated that increased predicted IMM load was significantly associated with increased leukocyte fraction in a patient's cancer. In a second model, X was the proportion of CD8+ T cells inferred by CIBERSORT (53). The fitted coefficient β = 5.9 (P < 2 × 10⁻¹⁶, Wald test) indicated that increased predicted IMM load was associated with increased tumor-infiltrating CD8+ T cells. Total lymphocyte and (Aggregate3) CD8+ T-cell fractions were estimated in Thorsson and colleagues (45).
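A univariate Poisson regression with a log link models log E[IMM count] = a + bX, so the fitted b is the change in log expected count per unit change in X. The sketch below fits such a model by Newton-Raphson in plain Python. It is an illustration under stated assumptions: the data are invented, the function name is hypothetical, and the analysis in this study was run with R's glm rather than with this code.

```python
import math


def fit_poisson(x, y, iterations=50, tol=1e-12):
    """Newton-Raphson fit of log E[y] = a + b * x; returns (a, b)."""
    a, b = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: inverse information times score.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Invented data: x is an immune-infiltrate fraction, y a predicted IMM count.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
y = [3, 5, 4, 8, 9, 14]
a_hat, b_hat = fit_poisson(x, y)
print(a_hat, b_hat, math.exp(b_hat))
```

Because the link is logarithmic, exp(b) is the multiplicative change in expected count for a one-unit increase in X.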

Figure 3.

MHC class I benchmark comparisons. A, PPVn for MHC class I allele–specific prediction on binding affinity test sets from Bonsack and colleagues (seven alleles) and Kim and colleagues (53 alleles; refs. 5, 8). B, PPVn for MHC class I allele–specific prediction on HLAp BST data set (Bassani-Sternberg and colleagues and Trolle and colleagues; refs. 7, 22), stratified by allele (six alleles). C, PPVn for MHC class I allele–specific prediction on HLAp BST data set (from B) stratified by peptide sequence length. D, True and false positives for each method on the top 50 ranked peptides from the HLAp BST data set. FP, false positives; PPVn, positive predictive value on the top n ranked peptides, where n is the number of true binders; TP, true positives.

High-Throughput Prediction of Neoantigens with MHCnuggets

AACRJournals.org Cancer Immunol Res; 8(3) March 2020 401


Runtime analysis

To assess the speed and scalability of the tested methods, we selected 1 million peptides sampled from the Abelin and colleagues data set (26) for class I alleles, and 1 million peptides sampled from the IEDB (curated data set 2018; ref. 54) for class II alleles. Sampling was done with replacement. For each method listed in Fig. 1A, networks for three class I MHC alleles (HLA-A*02:01, HLA-A*02:07, and HLA-A*01:01) and three class II MHC alleles (HLA-DRB1*01:01, HLA-DRB1*11:01, and HLA-DRB1*04:01) were used to predict binding over a range of input sample sizes (10², 10³, 10⁴, 10⁵, and 10⁶). All methods were run on a single graphics processing unit (GPU) compute node (one NVIDIA Tesla K80 GPU plus six 2.50 GHz Intel Xeon E5-2680v3 CPUs, 20 GB memory).
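The timing protocol (sample with replacement, then time prediction at increasing input sizes) can be sketched with a small harness such as the one below. The predictor here is a hypothetical stand-in, the peptide pool is invented, and the input sizes are truncated at 10⁴ to keep the example fast; none of this is the benchmarking code used in the study.

```python
import random
import time


def benchmark(predict, peptide_pool, sizes, seed=0):
    """Time predict() on samples drawn with replacement; return seconds per peptide."""
    rng = random.Random(seed)
    timings = {}
    for n in sizes:
        sample = rng.choices(peptide_pool, k=n)  # sampling with replacement
        start = time.perf_counter()
        predict(sample)
        elapsed = time.perf_counter() - start
        timings[n] = elapsed / n
    return timings


def toy_predict(peptides):
    # Hypothetical stand-in for a real predictor: one number per peptide.
    return [float(len(p)) for p in peptides]


pool = ["AAAAAAAAL", "GGGGGGGGK", "LLLLLLLLLV", "KKKKKKKKKKR"]  # invented peptides
per_peptide = benchmark(toy_predict, pool, sizes=[10**2, 10**3, 10**4])
for n, seconds in per_peptide.items():
    print(n, seconds)
```

Reporting seconds per peptide rather than total seconds makes it easy to see whether a method's cost per input falls as the batch grows.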

Results

High-throughput MHCnuggets breaks the MHC ligand prediction plateau

The MHCnuggets LSTM neural network architecture accepts peptides of variable lengths as inputs so that ligand binding prediction can be performed for both MHC class I and II alleles. To enable binding prediction for rare MHC alleles that have limited associated experimental data, we designed a method that leverages networks built for closely related common alleles with extensive data. When available, we utilize a transfer learning protocol to integrate binding affinity and HLAp results in a single network model, to better represent the natural diversity of MHC binding peptides.

To assess the baseline performance of MHCnuggets allele-specific networks on binding affinity data, we compared our approach with widely used MHC class I ligand prediction methods using two validation sets of binding affinity measurements (5, 8). We trained and tested MHCnuggets (noMS) and MHCflurry (noMS) using the Kim and colleagues data set (5) and evaluated the predictions provided by NetMHC 3.0 and NetMHCpan 2.0. We observed that MHCnuggets' performance (PPVn = 0.829, auROC = 0.924) was comparable with these methods (Fig. 3A; PPVn of all methods = 0.825 ± 0.005, auROC of all methods = 0.928 ± 0.0031). MHCnuggets was also comparable (PPVn = 0.633, auROC = 0.794) to these methods when tested on the Bonsack and colleagues data set (ref. 8; PPVn of all methods = 0.625 ± 0.008, auROC of all methods = 0.77 ± 0.02; Fig. 3A; ± refers to standard deviation; Supplementary Tables S3A and S3B, and S4A and S4B).
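PPVn, as defined in the figure legends, is the positive predictive value on the top n ranked peptides, where n is the number of true binders in the test set. A minimal sketch follows. It assumes that a higher score means stronger predicted binding and uses invented scores and labels; the function names are hypothetical.

```python
def ppv_at_k(scores, labels, k):
    """Fraction of true binders among the k highest-scoring peptides."""
    if k <= 0:
        raise ValueError("k must be positive")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def ppv_n(scores, labels):
    """PPVn: PPV on the top n peptides, where n is the number of true binders."""
    return ppv_at_k(scores, labels, sum(labels))


# Invented example: 1 marks a true binder, 0 a nonbinder.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 0]
print(ppv_n(scores, labels), ppv_at_k(scores, labels, 1))
```

The same `ppv_at_k` helper covers fixed cutoffs such as the top 50 or top 500 ranked peptides.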

Earlier neoantigen prediction methods focused on class I and trained on binding affinity data from IEDB (54). More recent work incorporated both binding affinity and HLAp data into network training (14, 18). We compared MHCnuggets with several class I predictors that used both binding affinity and HLAp data: MHCflurry 1.2.0, MHCflurry (train-MS), NetMHC 4.0, and NetMHCpan 4.0. We selected the BST HLAp data set (7, 22, 26) as an independent benchmark, as it was not used as training data by any of these methods. For all alleles tested, MHCnuggets achieved an overall PPVn of 0.42 and auROC of 0.82 (Fig. 3B). On average, MHCnuggets' PPVn was more than three times higher than that of MHCflurry 1.2.0, MHCflurry (train-MS), NetMHC 4.0, and NetMHCpan 4.0. For all alleles, MHCnuggets predicted fewer binders than other methods, resulting in fewer false-positive predictions. Stratifying by peptide length, MHCnuggets' increased PPVn was most evident for peptides of length 9, 10, and 11 (Fig. 3C). The length distribution of predicted binders

Figure 4.

MHC class II benchmark comparisons. A, PPVn for MHC class II allele–specific prediction on binding affinity test set from Jensen and colleagues (27 alleles, stratified by allele). B, auROC, K-Tau, and Pearson r scores for MHC class II alleles from 5-fold cross-validation. NetMHCII 2.3 performance is from their self-reported auROC. K-Tau, Kendall's tau correlation; PPVn, positive predictive value on the top n ranked peptides, where n is the number of true binders.


was commensurate with the observed distribution of naturally occurring binders in the HLAp benchmark tests (ref. 7; Supplementary Table S5A–S5D).

For some clinical applications, it may be desirable to minimize the number of false positives among a small number of top-scored peptides. We also compared PPV of the methods listed above on their top 50 and 500 ranked peptides from the BST data set (six MHC class I alleles; Supplementary Table S5E and S5F). MHCnuggets exhibited the highest PPV in the top 50 for all alleles except HLA-B*51:01 and the highest PPV in the top 500 for all alleles (Fig. 3D).

Prediction of peptide–MHC binding for class II and rare alleles

We assessed the baseline performance of MHCnuggets class II allele–specific networks on binding affinity data. To enable comparison with the class II methods from the NetMHC group, we used a 5-fold cross-validation benchmark derived from IEDB that was included in the publication describing NetMHCII 2.3 and NetMHCIIpan 3.2 (39). First, we computed PPVn for each of the 27 allele-specific networks separately (Fig. 4A; mean PPVn = 0.739). Next, we computed the overall auROC, Pearson r, and Kendall's tau correlations for all 27 class II alleles. MHCnuggets overall auROC (0.849) was comparable with that of NetMHCII 2.3 (0.861) and NetMHCIIpan 3.2 (0.861). Comparison with NetMHC class II methods was limited to overall auROC as published in Jensen and colleagues (39), because their PPVn results are not publicly available (Fig. 4B; Supplementary Table S6A and S6B).
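The auROC reported here can be read as the probability that a randomly chosen binder receives a higher score than a randomly chosen nonbinder, with ties counted as one half. The sketch below computes it directly from that pairwise definition. It is quadratic in the number of peptides and intended only as an illustration; the scores and labels are invented, and higher scores are assumed to mean stronger predicted binding.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of binders and nonbinders."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one nonbinder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Invented example: 1 marks a true binder, 0 a nonbinder.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 0]
print(auroc(scores, labels))
```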

We estimated performance for those class I and II MHC alleles for which we were unable to train allele-specific networks, using LOMO cross-validation (39). In this LOMO protocol, MHC-peptide binding is assessed for a well-characterized allele that has been held out from training, to approximate prediction performance for a rare allele (Fig. 5A). For the 20 class I alleles, the mean PPVn was 0.65, and the mean auROC was 0.671. For the 27 class II alleles, the mean PPVn was 0.65 and the mean auROC was 0.792. In comparison, the class II mean auROC of NetMHCIIpan 3.2 was 0.781 (Fig. 5B–D; Supplementary

Figure 5.

MHC class I and II benchmark comparisons to estimate rare allele performance. A, Schematic representation of LOMO testing. B, PPVn for MHC class I rare allele prediction on IEDB pseudo-rare alleles binding affinity test set (20 alleles, stratified by allele). C, PPVn for MHC class II rare allele prediction on binding affinity test set from Jensen and colleagues (27 alleles, stratified by allele; ref. 39). D, auROC for MHC class II rare allele prediction on LOMO binding affinity test set from Jensen and colleagues (27 alleles, stratified by allele; ref. 39). NetMHCIIpan 3.2 results are from their self-reported auROC. PPVn, positive predictive value on the top n ranked peptides, where n is the number of true binders.


Tables S7, and S8A and S8B). NetMHCpan rare allele predictor LOMO test results for class I are not publicly available; therefore, we were unable to compare with them.
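The data-splitting step of the LOMO protocol described above can be sketched as follows: each allele in turn is held out, and its measurements form the test set while all other alleles remain available for training. This shows only the split. How a surrogate network is chosen for the held-out allele is outside the sketch, and the record layout, function name, and values are assumptions made for illustration.

```python
def lomo_splits(measurements):
    """Yield (held_out_allele, train, test) with one allele held out at a time.

    measurements: list of (allele, peptide, ic50_nM) records.
    """
    alleles = sorted({allele for allele, _, _ in measurements})
    for held_out in alleles:
        train = [m for m in measurements if m[0] != held_out]
        test = [m for m in measurements if m[0] == held_out]
        yield held_out, train, test


# Invented measurements for three alleles.
data = [
    ("HLA-A*01:01", "AAAAAAAAL", 50.0),
    ("HLA-A*02:01", "GGGGGGGGK", 800.0),
    ("HLA-A*02:01", "LLLLLLLLV", 30.0),
    ("HLA-B*07:02", "KKKKKKKKR", 4000.0),
]
splits = list(lomo_splits(data))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```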

Fast and scalable computation

When run on a GPU architecture, MHCnuggets was faster and scaled more efficiently than MHC ligand predictors from the NetMHC family and MHCflurry. Given an input of one million peptides randomly selected from Abelin and colleagues (26), MHCnuggets runtime was 3.62, 69.7, and 624.5 times faster than MHCflurry 1.2.0, NetMHC 4.0, and NetMHCpan 4.0, respectively (Fig. 6A). The improvement was similar for class II peptides, for which an input of one million peptides to MHCnuggets ran 65.6 times and 126 times faster than NetMHCII 2.3 and NetMHCIIpan 3.2, respectively (Fig. 6B). As the total number of input peptides was increased from 0 to 1 million, the runtime per peptide plateaued for other methods but decreased exponentially for MHCnuggets.

Predicted MHC class I IMMs in TCGA patients

To illustrate the utility of MHCnuggets' improvements in scalability and PPV for the analysis of very large patient cohorts, we predicted class I IMMs in patients whose exomes were sequenced by the TCGA consortium (Materials and Methods). In our analysis pipeline, patient exomes were split into 21 amino acid residue sequence fragments, centered on each somatic missense mutation. For each sequence fragment, MHCnuggets predicted the MHC binding for all possible 8-, 9-, 10-, and 11-length peptide windows. Peptides that passed filters of predicted IC50 threshold (<500 nmol/L) and gene expression (Z > 1.0; Materials and Methods) for at least one patient-specific MHC allele were classified as predicted IMMs (Supplementary Table S9A). Finally, we characterized driver status and positional hotspot propensity of the predicted IMMs.

Total processing time for 26,284,638 allele-peptide comparisons supported by RNA-seq expression was under 2.3 hours. First, we sought to ascertain the extent of variability in predicted IMM count among individuals with different cancer types. Next, we identified predicted IMMs and protein regions enriched for predicted IMMs that were shared across patients, because these might be informative for neoantigen-based therapeutic applications. Then we considered whether predicted IMMs were more or less likely to be driver mutations. Finally, we assessed the associations between predicted patient IMM load and computationally estimated immune cell infiltrates.

After applying a gene-expression filter, we identified 101,326 unique predicted IMMs in 26 TCGA cancer types, with a mean of 15.6 per patient. We found that the majority of patients harbored fewer than six predicted IMMs, and 197 patients had none. Seventy-two percent of patients had from one to 10 predicted IMMs, compared with 1.9% of patients with more than 100, and 9 patients with more than 1,000 (Fig. 7A). Cancer types with the highest number of predicted IMMs were uterine corpus endometrial carcinoma (UCEC), colon adenocarcinoma (COAD), and lung adenocarcinoma (LUAD), all three of which are known for high mutation burden and immunogenicity (45). UCEC and COAD are also known to have a high frequency of microsatellite-instable (MSI) tumors. The lowest numbers were found in uveal melanoma (UVM), paraganglioma and pheochromocytoma (PCPG), and testicular germ cell tumors (TGCT; Fig. 7B; Supplementary Table S9B).

Across all cancer types, we identified 1,393 predicted IMMs harbored by 2 or more patients, of which 167 were identified in 3 or more patients. Of these 167, only 11.5% occurred exclusively in a single cancer type (Fig. 7C). The predicted IMMs identified in the largest number of patients were IDH1 R132H (62), FGFR3 S249C (24), PIK3CA E545K (23), KRAS G12D (18), PIK3CA E542K (18), TP53 R175H (18), TP53 R248Q (18), TP53 R273C (17), and KRAS G12V (16), which are known recurrent oncogenic driver mutations (55, 56). Of the 1,071 genes harboring predicted IMMs in 2 or more patients, the ones containing the most included TP53 (68), CTNNB1 (18), PIK3CA (16), HRAS (8), KRAS (7), PTEN (7), FBXW7 (6), EGFR (5), MDN1 (5), POLE (5), TRRAP (5), and VPS13C (5; Supplementary Table S9C). Six missense mutations harbored by patients in the TCGA cohorts were previously validated by CD8+ T-cell response assays (57, 58, 59). Of the six missense mutations, TP53 R248Q, TP53 Y220C, TP53 R175H, TP53 R248W, and KRAS G12D were predicted to be IMMs by our MHCnuggets pipeline and were shared by 3 or more of the TCGA patients.

Figure 6.

Timing and scalability. Runtime benchmark of tested methods using versions available on October 1, 2019, over a range of inputs (up to 1 million peptides). A, MHC class I prediction. B, MHC class II prediction.


Furthermore, 61.7% of the 167 predicted IMMs shared by 3 or more patients were classified as driver missense mutations by CHASMplus (q < 0.01). This percentage is significantly higher than the percentage of predicted drivers among all TCGA missense mutations (9,821 out of 791,637 or 1.2%). Although many shared IMMs were predicted to be driver missense mutations, the percentage of predicted IMMs predicted to be drivers was approximately 0.1% of total predicted IMMs in our study. When compared with the OncoKB database of experimentally

Figure 7.

MHC class I IMMs in TCGA patients. A, Number of predicted IMMs identified in 6,613 TCGA patients. Dotted line, mean predicted IMMs per patient (15.6). Note that 123 patients had >100 predicted IMMs but are not included for visual clarity. B, Number of predicted IMMs by cancer type. C, Predicted IMMs shared by 3 or more patients and the cancer types in which they occurred. Each row represents a cancer type, and each column illustrates the overlap of predicted IMMs seen in a single cancer type or multiple cancer types. For example, the first column shows the number of predicted IMMs shared among patients with colorectal adenocarcinoma and uterine corpus endometrial carcinoma. Bars to the left show the total number of unique predicted IMMs in each cancer type. Bar heights reflect the count of unique shared predicted IMMs, not the total number of patients in which the predicted IMM was observed. Image generated with UpSetR. D, Fibroblast growth factor receptor (FGFR3) predicted IMM hot region identified by HotMAPS in bladder cancer (BLCA). Predicted IMMs shown and number of BLCA patients with the predicted IMM: p.E216K (1), p.D222N (1), p.G235D (1), p.R248C (3), and p.S249C (24). Except for p.G235D, these predicted IMMs are proximal to the interface of FGFR3 protein and the light and heavy chains of an antibody fragment designed for therapeutic application in bladder cancer (PDB ID: 3GRW; ref. 61).
ACC, adrenocortical carcinoma; BLCA, bladder urothelial carcinoma; BRCA, breast invasive carcinoma; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; CHOL, cholangiocarcinoma; COAD, colon adenocarcinoma; GBM, glioblastoma multiforme; HNSC, head and neck squamous cell carcinoma; KICH, kidney chromophobe; KIRC, kidney renal clear cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LGG, brain lower grade glioma; LIHC, liver hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; PAAD, pancreatic adenocarcinoma; PCPG, pheochromocytoma and paraganglioma; PRAD, prostate adenocarcinoma; READ, rectum adenocarcinoma; SARC, sarcoma; TGCT, testicular germ cell tumor; THCA, thyroid carcinoma; THYM, thymoma; UCEC, uterine corpus endometrial carcinoma; UCS, uterine carcinosarcoma; UVM, uveal melanoma.


confirmed driver mutations (60), 53.9% of the shared predicted IMMs were identified as “oncogenic” or “likely oncogenic” driver mutations. The percentage is lower (25.7%) if “likely oncogenic” mutations are excluded.

Although we observed a limited number of shared predicted IMMs, we reasoned that protein regions enriched for predicted IMMs could present a therapeutic opportunity in certain cancer types. Using HotMAPS 1D, we identified clusters of residues within protein regions having statistically significant enrichment of predicted IMMs (q < 0.01). These included CIC in low-grade glioma (LGG); NFE2L2 and FGFR3 (Fig. 7D) in bladder cancer (BLCA; ref. 61); KRAS in pancreatic adenocarcinoma (PAAD); KIT in TGCTs; HRAS in head and neck squamous carcinoma (HNSC); PTEN, POLE, and PPP2R1A in UCEC; and GNAQ and SF3B1 in UVM. Three genes harbored predicted immunogenic regions in more than one cancer type: TP53 in BLCA, BRCA, HNSC, LGG, and UCEC; PIK3CA in HNSC and cervical squamous cell carcinoma (CESC); and CTNNB1 in liver hepatocellular carcinoma (LIHC) and UCEC (Supplementary Table S9D).

We explored the relationship between mutation driver status predicted by CHASMplus and predicted IMM status using logistic regression. The log odds of being a predicted IMM was significantly decreased for drivers (β = −0.66, Wald test P < 2 × 10⁻¹⁶), which is consistent with previous work suggesting that negative evolutionary selection eliminates MHC class I immunogenic oncogenic mutations early in tumor development (62).

Finally, we considered whether a patient's predicted IMM load was associated with changes in immune cell infiltrates as estimated from RNA-seq of bulk cancer tissue. Predicted IMM load was significantly associated with increased total leukocyte fraction (β = 0.75, Wald test P < 2 × 10⁻¹⁶) and with increased CD8+ T-cell fraction (β = 5.9, Wald test P < 2 × 10⁻¹⁶; Supplementary Table S9E).

These findings suggest that IMMs drive tumor immunoediting and may be informative for the interpretation of clinical responses to immunotherapy.

Discussion

MHCnuggets provides a flexible open-source platform for MHC–peptide binding prediction that can handle common MHC class I and II alleles as well as rare alleles of both classes. The LSTM network architecture can handle peptide sequences of arbitrary length, without shortening or splitting. The single neural network architecture requires fewer hyperparameters than more complex architectures and simplifies network training. In addition, our neural network transfer learning protocols allow for parameter sharing among allele-specific, binding affinity– and HLAp-trained networks. When trained on binding affinity data, MHCnuggets performs as well as other current methods. When trained on both binding affinity and HLAp data, we demonstrate improved PPVn on an independent HLAp test set, with respect to other methods that use both binding affinity and HLAp data. Although PPVn was lowest for the independent HLAp test set for all methods, this result is likely due to systematic differences between training HLAp data (monoallelic B-cell lines; ref. 26) and the test data comprising seven multiallelic cell lines (HeLa, HCT116, JY, fibroblasts, SupB15, HCC1937, and HCC1143; refs. 7, 22), yielding a more challenging prediction problem. We attribute MHCnuggets' improvement on the independent test set with respect to other methods to (1) optimization of PPVn in our network training protocol and (2) our implementation of transfer learning to integrate information from binding affinity and HLAp measurements. The performance of all methods is generally highest when both training and test data come from similar binding affinity experiments, but performance improvement on HLAp data is more biologically relevant (63).

We demonstrate improved scalability by comparing the runtime of MHCnuggets on 1 million peptides to comparable methods, and further by processing over 26 million expressed peptide–allele pairs across TCGA samples in under 2.3 hours. We identified 101,326 unique IMMs harbored by patients across 26 cancer types sequenced by the TCGA, based on transcriptional abundance and differential binding affinity compared with reference peptides. These results contrast with a previous report of neoantigens in TCGA patients in several respects. Rech and colleagues (64) applied a minimum expression threshold of 1 RNA-seq read count, an IEDB-recommended combination of neoantigen predictors derived primarily from different versions of NetMHC, and an IC50 threshold of 50 nmol/L to identify strong MHC binders. Their approach yielded 495,793 predicted class I classically defined neoantigen peptides (each harboring a single immunogenic mutation) from 6,324 patients in 26 cancer types. As in our study, high variability in neoantigen burden across cancer types was observed. The difference between predicted IMM and neoantigen burden in the two studies is likely due to differences in RNA expression threshold and the low false-positive rate of MHCnuggets compared with IEDB-recommended tools.

Based on our conservative thresholds, predicted IMMs were almost exclusively private to individual TCGA patients, with only 1,393 predicted IMMs observed in more than 1 patient. Although more than 61% of predicted IMMs shared by more than 2 patients were predicted to be driver mutations, the overall log odds of immunogenicity decreased for predicted driver mutations, indicating immunogenicity might shape the driver mutation landscape. Patient predicted IMM counts were also associated with increase in total leukocyte fraction and fraction of CD8+ T cells, suggesting that they may be relevant to immune system response to cancer.

This work has several limitations. First, our analyses are limited to missense mutations, which, although numerous, cannot account for the various somatic gene fusions, frameshift indels, splice variants, etc., in tumors that also generate neoantigens. Although MHCnuggets can handle peptide sequences regardless of their mutational origins, we prioritized missense mutations in this study. Indeed, the context of a peptide sequence, such as its flanking sequences, its source protein, and the expression of the source protein, is informative for MHC ligand prediction (21, 26). This type of information is available only for a limited number of HLAp data sets, which were unavailable to us for training purposes. As more well-characterized HLAp data sets become available, we will further develop MHCnuggets to include these features. We did not address T-cell receptor binding to bound peptide–MHC complexes or T-cell activation upon complex binding. Although we are pursuing this more complex modeling problem, we believe that improved prediction of peptide binding to MHC is also therapeutically relevant (21). Finally, we are unable to directly compare performance to the MHC class II prediction methods from the NetMHC group, except for self-reported auROC. Although we are not able to do a rigorous comparison of MHCnuggets class II prediction, our benchmark comparisons suggested that MHCnuggets was competitive with NetMHCII 2.3 and that MHCnuggets class II rare allele performance was competitive with NetMHCIIpan 3.2, suggesting that further work in this area is warranted.

In summary, we present MHCnuggets, an open-source software package for MHC ligand prediction that performs better than existing


methods with respect to PPV by leveraging transfer learning to integrate binding affinity and HLAp data. In contrast to previous methods, MHCnuggets handles both MHC class I and II ligand prediction and both common and rare HLA alleles, all within a single framework. We demonstrated the utility of MHCnuggets as a basic pipeline to analyze mutation immunogenicity, shared predicted IMMs, and the relationship between mutation immunogenicity, driver potential, and immune infiltrates from large-scale cancer patient sequencing data from TCGA.

Disclosure of Potential Conflicts of Interest

V.E. Velculescu is founder and on the board of directors of Personal Genome Diagnostics, is a scientific advisory board member for Takeda, and has ownership interest (including patents) in Personal Genome Diagnostics. V. Anagnostou reports receiving a commercial research grant from Bristol-Myers Squibb. No potential conflicts of interest were disclosed by the other authors.

Authors' Contributions

Conception and design: X.M. Shao, R. Bhattacharya, V.E. Velculescu, R. Karchin

Development of methodology: X.M. Shao, R. Bhattacharya, J. Huang, I.K.A. Sivakumar, R. Karchin

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): X.M. Shao, R. Bhattacharya, J. Huang, I.K.A. Sivakumar, C. Tokheim, L. Zheng, B. Kaminow, A. Omdahl, V.E. Velculescu, V. Anagnostou, R. Karchin

Writing, review, and/or revision of the manuscript: X.M. Shao, R. Bhattacharya, J. Huang, I.K.A. Sivakumar, C. Tokheim, L. Zheng, D. Hirsch, A. Omdahl, M. Bonsack, A.B. Riemer, V.E. Velculescu, V. Anagnostou, K.A. Pagel, R. Karchin

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J. Huang, D. Hirsch, M. Bonsack, A.B. Riemer

Study supervision: V.E. Velculescu, R. Karchin

Acknowledgments

Part of this research project was conducted using computational resources at the Maryland Advanced Research Computing Center (MARCC). This work was supported in part by a William R. Brody Faculty Scholarship to R. Karchin, Dr. Miriam and Sheldon G. Adelson Medical Research Foundation, the Stand Up To Cancer–Dutch Cancer Society International Translational Cancer Research Dream Team Grant (SU2C-AACR-DT1415), the Commonwealth Foundation, the U.S. NIH (grants CA121113, CA006973, and CA180950), the V Foundation, and LUNGevity to V.E. Velculescu. Stand Up To Cancer is a division of the Entertainment Industry Foundation. Research grants are administered by the American Association for Cancer Research, the Scientific Partner of SU2C.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received June 20, 2019; revised October 8, 2019; accepted December 20, 2019; published first December 23, 2019.

Cancer Immunol Res; 8(3) March 2020


Resource Report

Modeling Cellular Response in Large-Scale Radiogenomic Databases to Advance Precision Radiotherapy

Venkata SK. Manem1,2, Meghan Lambie1,2, Ian Smith1,2,3, Petr Smirnov1,2,3, Victor Kofia1, Mark Freeman1, Marianne Koritzinsky1,4,5, Mohamed E. Abazeed6,7, Benjamin Haibe-Kains1,2,3,8,9, and Scott V. Bratman1,2,4

Abstract

© 2019 American Association for Cancer Research

RadioGx is a computational toolbox that integrates cell line molecular data with radiation response, drug response, andmathematical modeling to advance preclinical research.

[Graphical abstract figure. Recoverable labels: cell lines; tissues (breast, lung, liver, pancreas, esophagus); surviving fraction; dose (Gy); oxygen concentration (100%, 1%); putative genes; mTOR, EGFR, DNA replication, chromatin; enrichment score; ranked AUC; ranked SF2.]

Radiotherapy is integral to the care of a majority of patients with cancer. Despite differences in tumor responses to radiation (radioresponse), dose prescriptions are not currently tailored to individual patients. Recent large-scale cancer cell line databases hold the promise of unravelling the complex molecular arrangements underlying cellular response to radiation, which is critical for novel predictive biomarker discovery. Here, we present RadioGx, a computational platform for integrative analyses of radioresponse using radiogenomic databases. We fit the dose–response data within RadioGx to the linear-quadratic model. The imputed survival across a range of dose levels (AUC) was a robust radioresponse indicator that correlated with biological processes known to underpin the cellular response to radiation. Using AUC as a metric for further investigations, we found that radiation sensitivity was significantly associated with disruptive mutations in genes related to nonhomologous end joining. Next, by simulating the effects of different oxygen levels, we identified putative genes that may influence radioresponse specifically under hypoxic conditions. Furthermore, using transcriptomic data, we found evidence for tissue-specific determinants of radioresponse, suggesting that tumor type could influence the validity of putative predictive biomarkers of radioresponse. Finally, integrating radioresponse with drug response data, we found that drug classes impacting the cytoskeleton, DNA replication, and mitosis display similar therapeutic effects to ionizing radiation on cancer cell lines. In summary, RadioGx provides a unique computational toolbox for hypothesis generation to advance preclinical research for radiation oncology and precision medicine.

Significance: The RadioGx computational platform enables integrative analyses of cellular response to radiation with drug responses and genome-wide molecular data.

Graphical Abstract: http://cancerres.aacrjournals.org/content/canres/79/25/6227/F1.large.jpg

See related commentary by Spratt and Speers, p. 6076

1Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada. 2Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada. 3Vector Institute, Toronto, Ontario, Canada. 4Department of Radiation Oncology, University of Toronto, Toronto, Ontario, Canada. 5Institute of Medical Sciences, University of Toronto, Toronto, Ontario, Canada. 6Department of Translational Hematology Oncology Research, Cleveland, Ohio. 7Department of Radiation Oncology, Cleveland Clinic, Cleveland, Ohio. 8Department of Computer Science, University of Toronto, Toronto, Ontario, Canada. 9Ontario Institute of Cancer Research, Toronto, Ontario, Canada.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

V.S.K. Manem and M. Lambie contributed equally to this article.

Corresponding Authors: Scott V. Bratman, Princess Margaret Cancer Centre, University Health Network, 101 College Street, MaRS/PMCRT, 13-305, Toronto, ON M5G1L7, Canada. Phone: 416-634-7077; Fax: 416-946-6561; E-mail: [email protected]; and Benjamin Haibe-Kains, [email protected]

Cancer Res 2019;79:6227–37

doi: 10.1158/0008-5472.CAN-19-0179


Introduction

Radiotherapy is routinely used as curative therapy for patients with cancer. Recent technologic advances have considerably augmented the physical precision of radiotherapy, resulting in improved cure rates and less toxicity (1–3). Biologically motivated improvements (such as the addition of radiosensitizing drugs) to radiotherapy delivery have not seen such dramatic improvements despite the known differences in radiation efficacy that exist among patients with a particular tumor type. This is due in part to a lack of predictive biomarkers on which to stratify patients. Instead, the stratification of patients to different radiotherapy-containing regimens continues to be based primarily on clinical variables such as tumor stage.

The biological determinants of cellular response to radiation, referred to as radioresponse, are complex and include both genomically-based cell-intrinsic and external microenvironmental factors (4, 5). Intrinsic radiosensitivity varies among individual tumors of the same type with implications for optimal radiotherapy dosing and curability. Measurement of intrinsic radiosensitivity in molecularly-characterized cancer cell lines could provide the radiogenomic data necessary to develop radioresponse predictors. However, despite decades of research there remains no clinically utilized radiosensitivity biomarker discovered from cell culture radiogenomic studies. Reasons for this include the need for clonogenic assays when measuring intrinsic radiosensitivity in vitro, which are cumbersome and not amenable to large screens (6, 7). Furthermore, radiosensitivity varies with dose in a complex and tumor-specific manner, rendering measurements at multiple dose levels a necessity.

Most short-term cytotoxicity assays amenable to high-throughput analysis of drug response have endpoints at 72 hours. These assays are inappropriate for measuring radiosensitivity because of the delayed cellular death by mitotic catastrophe that often occurs following radiation exposure (8). To address this limitation, an extended (9-day) viability assay was developed as a surrogate for clonogenic survival that is amenable to high-throughput processing (9). This assay was recently applied to 533 cancer cell lines with multiple radiation dose levels (10), becoming the largest radioresponse dataset published by a significant margin. This increase in scale of radioresponse data could lead to robust predictive biomarkers. However, full utilization by the research community requires sophisticated analysis tools that can appropriately model cellular response to radiation and seamlessly integrate associated molecular and pharmacogenomic profiles of cell lines.

In this study, we performed a preclinical assessment of intrinsic radiosensitivity using large-scale radiogenomic datasets. We sought to (i) model dose–response data using the linear-quadratic (LQ) model (11); (ii) integrate the modeled radioresponse profiles with transcriptomic data to determine pathway- and tissue-specific determinants of radioresponse; (iii) identify mutations associated with radioresponse; (iv) estimate radioresponse under hypoxic conditions; and (v) identify classes of drugs with cytotoxic effects that correlate with radioresponse. To facilitate these and other future analyses, we developed RadioGx, a new computational toolbox enabling comparative and integrative analysis of radiogenomic datasets. Our work provides a framework for future hypothesis generation and preclinical assessments of radioresponse using appropriate biological assays and indicators.

Materials and Methods

Curation of dose–response, transcriptomic, and mutation data

Supplementary Table S1 presents the sensitivity and transcriptomic datasets that are used in this study. Supplementary Table S2 presents the functionality of the RadioGx package. Cell line genomic studies often lack standardized identifiers. To overcome this, we assigned a unique identifier to each cell line and radiation treatment. Within RadioGx we implemented a RadioSet (RSet), a data container storing radiation dose–response and molecular data along with experimental metadata (detailed structure provided in Supplementary Fig. S1). The RSet also enables efficient implementation of curated annotations and molecular features for cell lines, which facilitates comparisons between different datasets. We have implemented a set of functions that enables users to analyze radiogenomic datasets. One of the primary functions, downloadRSet, allows users to download an RSet object. We have also incorporated a function, linearQuadraticModel, which fits radiation cell survival data using the standard radiobiological formalism, the LQ model. This function uses a normal error distribution by default, but users can instead choose a Cauchy distribution. For a dataset supplied by the user, the function fits the LQ model and returns the radiobiological parameters alpha and beta along with the goodness-of-fit. To extract summary features from the fitted curve, we have implemented computeAUC, which computes the area under the survival curve (AUC); computeSF2, which returns the fraction of cells that survive a radiation dose of 2 Gy; and computeD10, which returns the radiation dose at which only 10% of cells survive.
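The fit and the three curve summaries described above can be sketched in a few lines. This is a minimal Python illustration, not the RadioGx R implementation: the function names (fit_lq, surviving_fraction, area_under_curve, dose_at_10_percent) are hypothetical, the fit is a simple log-linearized least squares rather than the package's normal or Cauchy error model, the AUC dose range is chosen by the caller, and the dose–response values are synthetic.

```python
import math

def fit_lq(doses, survival):
    """Least-squares fit of -ln(S) = alpha*D + beta*D**2 (no intercept)."""
    y = [-math.log(s) for s in survival]
    s11 = sum(d ** 2 for d in doses)
    s12 = sum(d ** 3 for d in doses)
    s22 = sum(d ** 4 for d in doses)
    b1 = sum(d * yi for d, yi in zip(doses, y))
    b2 = sum(d ** 2 * yi for d, yi in zip(doses, y))
    det = s11 * s22 - s12 ** 2  # non-zero when at least two distinct non-zero doses exist
    alpha = (b1 * s22 - b2 * s12) / det
    beta = (s11 * b2 - s12 * b1) / det
    return alpha, beta

def surviving_fraction(alpha, beta, dose):
    """LQ survival S = exp(-(alpha*D + beta*D**2))."""
    return math.exp(-(alpha * dose + beta * dose ** 2))

def area_under_curve(alpha, beta, max_dose, steps=1000):
    """Trapezoidal area under the fitted survival curve on [0, max_dose]."""
    h = max_dose / steps
    ys = [surviving_fraction(alpha, beta, i * h) for i in range(steps + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

def dose_at_10_percent(alpha, beta):
    """Dose D solving alpha*D + beta*D**2 = ln(10), i.e. 10% survival."""
    target = math.log(10.0)
    if beta == 0:
        return target / alpha
    return (-alpha + math.sqrt(alpha ** 2 + 4 * beta * target)) / (2 * beta)

# Synthetic, noise-free data (illustrative values, not taken from the study).
doses = [1.0, 2.0, 4.0, 6.0, 8.0]
survival = [surviving_fraction(0.3, 0.03, d) for d in doses]
alpha, beta = fit_lq(doses, survival)
sf2 = surviving_fraction(alpha, beta, 2.0)   # survival at 2 Gy
d10 = dose_at_10_percent(alpha, beta)        # dose giving 10% survival
auc = area_under_curve(alpha, beta, max_dose=8.0)
```

Because the log-transformed model is linear in alpha and beta, the fit reduces to a 2-by-2 system of normal equations; a real analysis with noisy data would more likely fit on the survival scale with an explicit error model.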

Radiobiologic model

Radiobiologic modeling is used to allow comparisons of various clinically relevant radiotherapy treatment regimens. The most common formulation in current clinical practice is the LQ model (11), which assumes that there are 2 components to cell killing induced by radiation: one that is proportional to dose (linear, α) and another that is proportional to the square of the dose (quadratic, β). The LQ model describes the fraction of cells that survived (S) a uniform dose D (Gy); the survival fraction of cells after irradiating with an acute dose D is given by:

$$S = \exp\left(-\alpha D - \beta D^{2}\right). \tag{A}$$

The ratio α/β varies by the cell population or tissue that is being irradiated, and reflects the response to different fractionation schemes. Cell populations or tissues with a high α/β value are less sensitive to the effects of fractionation than those with a low value.
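The dependence on the α/β ratio can be made concrete with a small calculation. The sketch below assumes the usual textbook extension of the LQ model to n equal fractions with complete repair between fractions and no repopulation, so that the total log cell kill is n(αd + βd²); this extension, the function names, and the parameter values are illustrative assumptions, not part of the study.

```python
def log_cell_kill(alpha, beta, dose_per_fraction, n_fractions):
    """-ln(S) for n equal fractions, assuming full repair between fractions."""
    return n_fractions * (alpha * dose_per_fraction + beta * dose_per_fraction ** 2)

def fractionation_sparing(alpha, beta, total_dose, n_fractions):
    """Ratio of log cell kill for a single dose versus the same dose split into fractions."""
    single = log_cell_kill(alpha, beta, total_dose, 1)
    split = log_cell_kill(alpha, beta, total_dose / n_fractions, n_fractions)
    return single / split

# Same alpha, different beta: alpha/beta = 10 Gy versus 3 Gy (hypothetical values).
high_ratio = fractionation_sparing(alpha=0.3, beta=0.03, total_dose=10.0, n_fractions=5)
low_ratio = fractionation_sparing(alpha=0.3, beta=0.1, total_dose=10.0, n_fractions=5)
```

Under these assumptions, splitting 10 Gy into five fractions reduces the log cell kill by a smaller factor for the high α/β population than for the low α/β population, which is the sense in which high α/β tissues are less sensitive to fractionation.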

Mutation analysis

Mutation data were obtained through the Cancer Cell Line Encyclopedia (CCLE) data portal (DepMap 18Q3 data release). The mutation data used in the analysis are incorporated into the RadioSet as one of the molecular profiles. Ensembl Variant Effect Predictor (VEP; Ensembl release 96 – April 2019) was used to annotate predicted functional impact of mutations. Impact ratings were categorized into "High" and "Not High," with the latter category encompassing "Modifier," "Moderate," and "Low" impact mutations (12). For genes with multiple mutations, the highest impact mutation was considered. Radiation response AUC values were compared between the 3 groups: (i) cell lines with VEP-High mutations; (ii) cell lines with mutations that were not VEP-High; and (iii) cell lines with no mutations (wildtype, WT). A Wilcoxon U test compared the VEP-High and WT groups. Cohen d values to assess effect size were calculated for each individual gene using the effsize package (v0.7.4). For thick forest plot creation, we used the metaviz package (v0.3.0) with default parameters.
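The group comparison described above can be illustrated for a single gene. This is a standard-library Python sketch, not the R code used in the study: the function names and AUC values are hypothetical, the p-value uses a normal approximation without tie or continuity correction, and Cohen d is computed with a pooled standard deviation, which is assumed here to match the effsize default.

```python
import math
from statistics import mean, variance

def mann_whitney_u(x, y):
    """U statistic for x (ties count 0.5) and a two-sided normal-approximation p-value."""
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2.0))

def cohen_d(x, y):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)

# Hypothetical AUC values for one gene: VEP-High mutant versus wildtype cell lines.
auc_high = [0.20, 0.30, 0.25, 0.35]
auc_wt = [0.50, 0.60, 0.55, 0.65]
u_stat, p_value = mann_whitney_u(auc_high, auc_wt)
effect = cohen_d(auc_high, auc_wt)
```

A negative effect size here means lower AUC (greater radiosensitivity) in the mutant group; with groups this small an exact test would be preferable to the normal approximation.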

Radiobiological modeling of hypoxia

The LQ model can also be used to model the effect of hypoxia. Hypoxia is a hallmark of many solid malignant tumors and influences tumor progression, therapy resistance, development of metastases, clinical behavior, and response to conventional treatments like radiotherapy. The survival fraction of cells due to a given radiotherapy dose is given by Eq. A under well-oxygenated, or normoxic, conditions. However, the surviving fraction of cells may vary depending on the oxygen concentration in the tumor, as cells in the hypoxic region are considered to be more resistant to radiation therapy. This hypoxic effect can be incorporated into the LQ model using the "oxygen enhancement ratio (OER)," which can be normalized to yield the "oxygen modification factor (OMF)" (13). OMF is defined as follows:

$$\mathrm{OMF} = \frac{\mathrm{OER}(O_2)}{\mathrm{OER}_m} = \frac{1}{\mathrm{OER}_m}\,\frac{\mathrm{OER}_m \cdot O_2 + K_m}{O_2 + K_m} \tag{B}$$

where O2 is the oxygen concentration in the system in mm Hg, Km = 3 mm Hg is the oxygen concentration at which half of the ratio is achieved, and OERm = 3 is the maximum value, reached under well-oxygenated conditions. Therefore, the LQ model given in Eq. A can be modified to include oxygen concentration as follows:

$$S = \exp\left(-\alpha\,\mathrm{OMF}\,D - \beta\,(\mathrm{OMF}\,D)^{2}\right). \tag{C}$$

In general, the OER can be a function of radiation dose. Some studies have suggested that the maximal oxygen enhancement varies in the range of 2.5 to 3 with differences in radiation dosage (14–16). This can be included in the revised LQ model by considering different OERs for the parameters α and β, that is, OERα and OERβ. However, because we consider the normalized OER (or, OMF), the introduction of these separate terms will not produce a significant difference in the final survival fraction. Thus, we assume OERα = OERβ in our mathematical framework. We assume that the system is moderately hypoxic, that is, approximately 5 mm Hg for this study.
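As a worked illustration of the hypoxia correction, the following Python sketch evaluates the OMF and the oxygen-modified survival using the constants stated in the text (Km = 3 mm Hg, OERm = 3) and a single OER for both LQ parameters. The function names and the alpha and beta values are hypothetical.

```python
import math

K_M = 3.0      # mm Hg, from the text
OER_MAX = 3.0  # maximum OER, from the text

def oxygen_enhancement_ratio(po2, oer_max=OER_MAX, k_m=K_M):
    """OER as a function of oxygen concentration (mm Hg)."""
    return (oer_max * po2 + k_m) / (po2 + k_m)

def oxygen_modification_factor(po2, oer_max=OER_MAX, k_m=K_M):
    """OER normalized by its maximum, so OMF lies between 1/oer_max and 1."""
    return oxygen_enhancement_ratio(po2, oer_max, k_m) / oer_max

def hypoxic_survival(alpha, beta, dose, po2):
    """LQ survival with the dose scaled by the OMF."""
    effective_dose = oxygen_modification_factor(po2) * dose
    return math.exp(-alpha * effective_dose - beta * effective_dose ** 2)

omf_moderate = oxygen_modification_factor(5.0)   # moderately hypoxic, 5 mm Hg
omf_anoxic = oxygen_modification_factor(0.0)     # no oxygen
s_hypoxic = hypoxic_survival(alpha=0.3, beta=0.03, dose=2.0, po2=5.0)
s_unmodified = math.exp(-0.3 * 2.0 - 0.03 * 2.0 ** 2)  # OMF = 1
```

At 5 mm Hg the OER is 18/8 = 2.25, so the OMF is 0.75 and a 2 Gy dose acts like 1.5 Gy, giving a higher surviving fraction than the unmodified LQ prediction.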

Association with drug response and pharmacologic enrichment analysis

We used the CTRPv2 dataset (545 drugs) in the PharmacoGx package (v1.10.3; ref. 17) to compute the association between radioresponse and drug response (defined by the AUC of the Hill function). We also performed pharmacologic enrichment analysis, an adaptation of the gene set enrichment analysis (GSEA) methodology. For this, we computed the correlation of radioresponse with each drug response, with each pharmacologic class playing the role of a gene set. Similar to the GSEA method, a running sum is calculated, starting with the first compound-level statistic and ending with the last. The sum is increased if a compound-level statistic belongs to the pharmacologic class of interest; otherwise, the sum is decreased. The enrichment score of the pharmacologic class of interest is defined as the maximum deviation from zero of the running sum (Supplementary Fig. S2; ref. 18).
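The running-sum calculation can be sketched as follows. This Python example uses unweighted steps (each hit adds 1/N_hit and each miss subtracts 1/N_miss), which is an assumption made for simplicity; the text does not state the step weights, and GSEA typically weights hits by the ranking statistic. The function name and compound labels are hypothetical.

```python
def enrichment_score(ranked_compounds, pharmacologic_class):
    """Maximum signed deviation from zero of an unweighted running sum."""
    hits = [c in pharmacologic_class for c in ranked_compounds]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("class must contain some, but not all, ranked compounds")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Compounds ordered by their correlation with radioresponse (highest first).
ranked = ["drug_a", "drug_b", "drug_c", "drug_d", "drug_e", "drug_f"]
es_top = enrichment_score(ranked, {"drug_a", "drug_b"})      # class clustered at the top
es_bottom = enrichment_score(ranked, {"drug_e", "drug_f"})   # class clustered at the bottom
```

A class whose members cluster at the top of the ranking yields a positive score and one clustered at the bottom yields a negative score; significance would still need a permutation null, which is omitted here.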

Pathway analysis

Pathway enrichment analysis on the gene expression data was carried out using GSEA (19) with pathways defined by QIAGEN's Ingenuity Pathway Analysis (IPA; QIAGEN Redwood City, www.qiagen.com/ingenuity). Genes were ranked based on the coefficient of correlation between their expression and the radioresponse indicator of interest (AUC or SF2). GSEA was then used to compute the enrichment score for each pathway, with statistical significance calculated using a permutation test (10,000 permutations) as implemented in the piano package (20). Nominal P values obtained for each pathway were corrected for multiple testing using the false discovery rate approach (FDR; ref. 21).
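The multiple-testing step can be illustrated with a short sketch. It assumes the FDR correction is the Benjamini–Hochberg step-up procedure, which the text does not spell out; the function name and the P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

nominal = [0.01, 0.04, 0.03, 0.005]   # hypothetical per-pathway P values
fdr = benjamini_hochberg(nominal)
significant = [q < 0.05 for q in fdr]
```

Each adjusted value is the smallest FDR level at which that pathway would be called significant, so a threshold such as FDR < 5% is applied directly to the adjusted values.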

Research reproducibility

RadioGx is implemented in R and is freely available from the Comprehensive R Archive Network (CRAN) at cran.r-project.org/web/packages/RadioGx/. The code, documentation, and detailed tutorial describing how to run our pipeline and reproduce our analysis results are open-source and publicly available through the RadioGx GitHub repository (https://github.com/bhklab/RadioGx-analysis). A virtual machine reproducing the full software environment is available on Code Ocean (DOI: 10.1101/449793). Our study complies with the guidelines outlined in refs. 22–24. All the data are available in the form of RSet objects with associated digital object identifiers (DOI).

Results

The RadioGx platform

To realize the full potential of large-scale radiogenomics datasets for robust biomarker discovery, we developed the RadioGx software package (Supplementary Fig. S1). RadioGx represents the first computational toolbox that integrates radioresponse data with radiobiologic modeling and molecular data from hundreds of cancer cell lines. Within RadioGx, datasets are standardized with comprehensive cell line annotations including the type of radioresponse assay (i.e., clonogenic assay and 9-day viability assay) and indicators used to generate dose–response data (i.e., SF2 and AUC). RadioGx enables fitting of dose–response data using established radiobiologic models, quality control in order to investigate the consistency and biological plausibility of radioresponse assays and indicators, and integration of these data with other data types and radioresponse models.

Modeling radiation response within RadioGx

Multiple dose–response measurements from the same cell line can be incorporated into established radiobiological models to predict the effect of specific perturbations (e.g., radiotherapy fraction size or hypoxia) on radioresponse. Within RadioGx, we applied the commonly used LQ model to fit 9-day viability assay data for 533 cancer cell lines (Fig. 1A; ref. 10). The LQ model goodness-of-fit was high for the majority of cell lines (median R² = 0.958; Supplementary Fig. S3A and S3B). For 498/533 (93%) of cell lines, the model fit the experimental data reasonably well (R² ≥ 0.6); these cell lines were retained for subsequent analyses.
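The goodness-of-fit filter can be sketched as below. The example computes the coefficient of determination on the survival scale, which is an assumption (the text does not say whether R² was computed on raw or log-transformed survival), and the function names and cell line labels are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def retain_well_fit(r2_by_cell_line, threshold=0.6):
    """Keep only cell lines whose LQ fit meets the R-squared threshold."""
    return {name: r2 for name, r2 in r2_by_cell_line.items() if r2 >= threshold}

perfect = r_squared([1.0, 0.5, 0.2], [1.0, 0.5, 0.2])
kept = retain_well_fit({"cell_line_A": 0.95, "cell_line_B": 0.40, "cell_line_C": 0.60})
```

The threshold of 0.6 mirrors the cutoff reported in the text; everything else in the sketch is illustrative.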


Using the LQ model for each cell line, we calculated AUC as a summary radioresponse indicator that is independent of a specific dose level. As expected, a range of radioresponse profiles was seen (Fig. 1B). We next compared AUC and dose-specific survival data (SF2, SF4, SF6, and SF8) from the 9-day viability assay with clonogenic survival data generated by Yard and colleagues for a subset of cell lines (Fig. 1C; Supplementary Fig. S4A–S4E; ref. 10). We observed high Pearson correlation (R ≥ 0.8) for AUC (n = 15), SF2 (n = 12), SF4 (n = 15), and SF6 (n = 15), but SF8 showed only moderate correlation (R = 0.64; n = 11), consistent with prior observations suggesting poor reproducibility of survival assays following high radiation doses (25). Taken together, the 9-day viability assay provides a robust surrogate for clonogenic survival at a range of radiation doses. Moreover, the LQ model within RadioGx allows for characterization of radioresponse and derivation of radioresponse indicators for the vast majority of cancer cell lines.

Comparison of radioresponse indicators

Summary indicators of radioresponse are useful for preclinical investigations. As radioresponse data within RadioGx have been fit to the LQ model, one could describe radioresponse through imputed survival across a range of dose levels (AUC) or at a specific dose level (e.g., SF2). There is currently no consensus regarding the optimal indicator for use across studies, with both AUC and SF2 frequently used (26–29). The use of SF2 as a radioresponse indicator has been bolstered by clinical observations that local tumor control following radiotherapy may be associated with SF2 measured from ex vivo tumor cells (30). Moreover, SF2 is thought to differentiate between radiosensitive and radioresistant cell types (31). However, there is currently insufficient evidence to support the routine use of SF2 or AUC when probing the molecular determinants of radioresponse.

We compared AUC and SF2 across all cell lines within RadioGx. The values were well correlated (R = 0.92; 95% CI, 0.90–0.93; P = 2.2e−16; Fig. 2A); the weakest correlations were observed among the most radioresistant cell lines, where cell death at higher doses likely contributes to the AUC value but by definition has no bearing on SF2 (Fig. 2B). We then asked whether the biological processes that govern these 2 radioresponse indicators are the same. To achieve this, we correlated the basal level gene expression data from the CCLE (32) with the radioresponse indicators (SF2 and AUC), and performed GSEA on the gene list ranked based on correlation estimates. For FDR <5%, 77 transcriptional pathways were enriched using AUC as the radioresponse indicator, out of which, 41 and 36 pathways were positively and negatively correlated with AUC, respectively (Supplementary File 1; Supplementary Fig. S5). Similarly, using SF2 as the radioresponse indicator, only 38 pathways were enriched, out of which, 19 were positively correlated with the SF2 value. All but 3 pathways enriched using SF2 were enriched using AUC (Fig. 2C and D).
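The ranking step that feeds GSEA can be sketched as follows. This is a minimal illustration, assuming a plain Pearson correlation between each gene's expression and the radioresponse indicator; the gene names and all numbers are invented for the example and are not CCLE or RadioGx data, and the enrichment step itself is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rank_genes(expression, indicator):
    """Return (gene, r) pairs sorted from most positive to most negative r."""
    scores = {gene: pearson(values, indicator) for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical AUC values for five cell lines and toy expression profiles
auc = [1.8, 2.1, 2.4, 2.9, 3.3]
expression = {
    "GENE_A": [2.0, 2.5, 3.1, 3.9, 4.4],   # rises with AUC
    "GENE_B": [5.0, 4.1, 3.8, 2.9, 2.2],   # falls with AUC
    "GENE_C": [3.0, 3.4, 2.9, 3.3, 3.0],   # weak relationship
}
ranked = rank_genes(expression, auc)
for gene, r in ranked:
    print(f"{gene}: r = {r:+.2f}")
```

Genes at the top and bottom of such a list drive the positive and negative enrichment calls, respectively; repeating the ranking with SF2 in place of AUC gives the second list compared in the text.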

The 17 pathways that were significantly correlated with radioresponse using the AUC indicator but not the SF2 indicator included biological processes known to impact radioresponse, suggesting stronger relevance for AUC. For instance, the NRF2-mediated oxidative stress response pathway was positively associated with AUC but not with SF2 (Supplementary File 1). In conditions of oxidative stress, such as following radiation, degradation of NRF2 is prevented, leading to its stabilization and translocation into the nucleus, where it activates expression of a wide variety of downstream antioxidant targets (33); this pathway has previously been described as contributing to intrinsic radioresistance (9, 34). In addition, progression through the cell cycle following radiation response is a known factor in determining cell survival vs. cell death via mitotic catastrophe. Three pathways directly related to cell-cycle progression [(i) cell cycle: G2–M DNA damage checkpoint regulation; (ii) cell cycle: G1–S checkpoint regulation; and (iii) cell-cycle control of chromosomal replication] were all seen exclusively when using AUC as the radioresponse indicator. Thus, as compared with SF2, AUC was able to capture more gene expression pathways putatively correlated with radioresponse. Taken together, our analyses reveal AUC and SF2 as related radioresponse indicators with AUC providing

Figure 1.

Fitting of dose–response data to the LQ model and concordance of radiation response across assays. A, LQ model fit using RadioGx on the SNU-245 cholangiocarcinoma cell line (black) and SK-ES-1 Ewing sarcoma cell line (gray). The LQ model describes the fraction of cells predicted to survive (y-axis) a uniform radiation dose (x-axis) and is characterized by α and β components for each cell line. For SNU-245 and SK-ES-1, α = 0.14 Gy⁻¹, β = 0 Gy⁻² and α = 0.45 Gy⁻¹, β = 0.02 Gy⁻², respectively. Solid curves indicate the model fit and points denote experimental data (10). B, Histogram of AUC values calculated using the computeAUC function in RadioGx. C, Correlation (Pearson R with SD) of radioresponse results produced by the 9-day viability assay and the standard clonogenic assay according to the following indicators: SF2, SF4, SF6, SF8, and AUC. Primary data were obtained from Yard and colleagues (10).

Manem et al.

Cancer Res; 79(24) December 15, 2019 Cancer Research6230


for a more comprehensive characterization of the biological processes underpinning radioresponse. As a result of these findings, we exclusively used AUC as the radioresponse indicator for subsequent analyses.

Radiobiological modeling to estimate impact of DNA repair on survival

The LQ model can be used to estimate the dependence of cellular survival on radiotherapy fraction size and DNA repair. The α and β values produced by the LQ model allow for comparisons among distinct cell lines or tumors, and in clinical practice the α/β ratio is used to predict cellular response to different radiotherapy fractionation schemes. Using the LQ model, we derived the α/β ratio for cancer cell lines within RadioGx. A wide range of α/β values were observed (Fig. 3A; median = 10.14; interquartile range = 4.49–28.07). As expected, the α component was strongly anticorrelated with AUC, whereas the β component displayed no significant association with AUC (Fig. 3B). This result indicates that for the cell line data contained within RadioGx, dependence of cellular survival on radiotherapy fraction size is a distinct parameter that describes radioresponse and should therefore be considered alongside radiosensitivity (e.g., AUC or SF2) in preclinical investigations.
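To make the role of the alpha/beta ratio in fractionation concrete, the sketch below applies the standard LQ-derived biologically effective dose, BED = n * d * (1 + d / (alpha/beta)), to two schedules. The formula is the textbook LQ result rather than anything specific to RadioGx; the two schedules are hypothetical, and the alpha/beta values are the interquartile bounds reported above.

```python
def biologically_effective_dose(n_fractions, dose_per_fraction_gy, alpha_beta_gy):
    """BED (Gy) for n fractions of size d under the LQ model."""
    d = dose_per_fraction_gy
    return n_fractions * d * (1.0 + d / alpha_beta_gy)

# Hypothetical schedules: conventional (30 x 2 Gy) and hypofractionated (5 x 7 Gy)
schedules = {"30 x 2 Gy": (30, 2.0), "5 x 7 Gy": (5, 7.0)}
# Interquartile bounds of alpha/beta quoted in the text (Gy)
for alpha_beta in (4.49, 28.07):
    for label, (n, d) in schedules.items():
        bed = biologically_effective_dose(n, d, alpha_beta)
        print(f"alpha/beta = {alpha_beta:5.2f} Gy, {label}: BED = {bed:.1f} Gy")
```

With these illustrative numbers the two schedules are nearly equivalent at the low alpha/beta bound, whereas the conventional schedule delivers the larger BED at the high bound, which is why the ratio matters when comparing fraction sizes.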

To examine the biological factors that underlie the differences between α, β, and AUC, we identified transcriptional pathways that were significantly associated with each radioresponse metric. For FDR <5%, we found 14 pathways commonly associated with all 3 metrics (Fig. 3C; Supplementary File 2). Supporting the biological relevance of these pathways, several known components of DNA damage response, signaling, and repair were represented among the 14 common pathways. For instance, pathways related to mismatch repair in eukaryotes, role of BRCA1 in DNA damage response, and cell-cycle control of chromosomal replication were each present. These results, which are consistent with fundamental tenets of radiobiology, suggest that analysis of large cell line resources within RadioGx could be performed to generate novel hypotheses and could contribute to preclinical biomarker discovery.
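The FDR threshold used for these pathway calls can be illustrated with a short sketch. It assumes the Benjamini–Hochberg step-up procedure (ref. 21) is the adjustment in use; the pathway names and P values are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical pathway-level P values
pathways = {"pathway_1": 0.010, "pathway_2": 0.040, "pathway_3": 0.030, "pathway_4": 0.005}
q_values = benjamini_hochberg(list(pathways.values()))
for name, q in zip(pathways, q_values):
    status = "significant" if q < 0.05 else "not significant"
    print(f"{name}: q = {q:.3f} ({status} at FDR < 5%)")
```

A pathway counted as common to all 3 metrics would need to pass this threshold separately for AUC, α, and β.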

Mutation analysis of a positive control radiation damage-related gene set

High fidelity DNA repair is critical for cellular survival following exposure to ionizing radiation. Among the various DNA repair processes, nonhomologous end joining (NHEJ) is the most important determinant of radiation response (35). Loss-of-function mutations within NHEJ pathway components confer radiosensitivity. We therefore used mutations in 19 genes implicated in NHEJ (Supplementary Table S3) as positive controls in the RadioGx datasets. Mutations were annotated according to predicted functional impact using VEP (12), and genes without predicted high functional impact ("VEP High") within our subset

Figure 2.

Concordance of SF2 and AUC. A, Correlation between the radioresponse indicators, SF2 and AUC, across 498 cell lines. B, Pearson correlation (with SD) between SF2 and AUC across 498 cell lines based on tertiles. C, Venn diagram illustrating the transcriptional pathways associated with radioresponse using SF2 or AUC as the response indicator. D, FDR for each transcriptional pathway from C illustrating greater levels of statistical significance among pathways specific to AUC.


of cell lines were excluded. There was a strong overall association between "VEP High" mutations and radioresponse (Fig. 4A). Associations on an individual gene basis were limited by small sample size (Fig. 4B; Supplementary Fig. S6); however, "VEP High" mutations in RAD50 were significantly associated with radioresponse. Overall, these results show that at the mutation level, the radioresponse data contained within RadioGx reflect known clinical radiosensitivity trends. Thus, the positive control analysis supports the robustness of the datasets contained within RadioGx and lends support for further studies that utilize these data.
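The per-gene effect sizes summarized in Fig. 4B are Cohen d values. A minimal sketch of that statistic follows, assuming the common pooled-standard-deviation definition; the AUC values for the two groups are invented, and the weighting and confidence-interval calculations behind the forest plot are not reproduced.

```python
import math
import statistics

def cohen_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical AUC values: lines with a high-impact mutation vs. wild type
mutant_auc = [1.9, 2.1, 2.0, 1.8]
wild_type_auc = [2.6, 2.9, 2.4, 2.7, 2.8]
print(f"Cohen d = {cohen_d(mutant_auc, wild_type_auc):.2f}")
```

In this toy setup a negative d means the mutant group has the lower AUC, that is, greater radiosensitivity. With only a handful of mutant lines per gene the estimate is unstable, which is the sample-size limitation noted above.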

Modeling the effects of hypoxia on radioresponse

By integrating radioresponse and molecular data, RadioGx is meant to enable new biological insights and predictions. To further demonstrate the utility of RadioGx for this purpose, we next extended the radiobiological modeling to incorporate the putative effects of oxygen availability in the tumor microenvironment on radioresponse (13).

Molecular oxygen is necessary to mediate the indirect effects of ionizing radiation to exert cell kill. Thus, cells become more resistant to radiation under oxygen-deficient conditions. We derived adjusted AUC values for the cancer cell lines within RadioGx at a range of oxygen partial pressures. As expected, reduced oxygen partial pressure resulted in a predicted increase in AUC (Fig. 5A). Cell lines from distinct cancer histologies displayed consistent increases in AUC under hypoxic conditions (P < 2.2e−16 for all, Wilcoxon test), but the magnitude of this increase differed between histologies (Fig. 5B). The largest and smallest median differences in AUC were observed for cancer cell lines from the breast and large intestine, respectively. These differences reflect a nonlinear relationship between oxygen availability and radioresponse that is dependent on α/β.
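One common way to fold oxygen into the LQ model is to scale alpha and beta by an oxygen enhancement ratio (OER) that depends on oxygen partial pressure. The sketch below uses that approach purely as an illustration: the hyperbolic OER form, the constants (maximum OER of 3, K of 3 mmHg), the alpha and beta values, and the 0 to 10 Gy range are all assumptions for this example, and the code is not the paper's Eq. C.

```python
import math

def oxygen_modifier(po2_mmhg, max_oer=3.0, k_mmhg=3.0):
    """Relative radiosensitivity at a given oxygen partial pressure.

    Hyperbolic OER form with assumed constants; ranges from 1/max_oer
    under anoxia toward 1 in the fully oxic limit.
    """
    oer = (max_oer * po2_mmhg + k_mmhg) / (po2_mmhg + k_mmhg)
    return oer / max_oer

def hypoxic_auc(alpha, beta, po2_mmhg, max_dose_gy=10.0, steps=1000):
    """Area under the oxygen-adjusted LQ survival curve (trapezoidal rule)."""
    f = oxygen_modifier(po2_mmhg)
    a, b = alpha * f, beta * f ** 2   # alpha scales with f, beta with f squared
    width = max_dose_gy / steps

    def sf(dose):
        return math.exp(-a * dose - b * dose ** 2)

    return sum(0.5 * (sf(i * width) + sf((i + 1) * width)) * width for i in range(steps))

# Illustrative alpha (Gy^-1) and beta (Gy^-2); pressures follow those named in Fig. 5
for po2 in (160.0, 10.0, 5.0):
    print(f"pO2 = {po2:5.1f} mmHg: AUC = {hypoxic_auc(0.3, 0.03, po2):.2f}")
```

Because alpha and beta are scaled by different powers of the modifier, the size of the AUC shift depends on the starting alpha/beta, which is one way to picture the nonlinear dependence described above.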

Next, we evaluated the univariate association of gene expression levels measured under normoxic conditions with AUC values under normoxic and hypoxic conditions. For an FDR <5%, the numbers of genes that were significantly associated

Figure 3.

Distinct biological underpinnings of α/β derived from the LQ model. A, Histogram of α/β (Gy) values obtained from the LQ model across all cell lines. B, Pearson correlations (with SD) between AUC and the α and β components of the LQ model. C, Transcriptional pathways that are significantly associated with AUC, α, and/or β.

Figure 4.

Examining somatic mutations in positive control gene set for impact on radiation response. A, Boxplots (Tukey) displaying the radiation response of cell lines harboring mutations in at least one of the 14 positive control genes. Wilcoxon-U P values were determined by comparing "VEP High" and "WT" groups. B, Thick forest plot showing the Cohen d effect size of the impact of each individual mutation as well as the summary (aggregate of all 14 genes), with 95% confidence interval. Height of the error bars is proportional to the weight of the impact, which is influenced by cell line number within the "VEP High" mutation group.


with radioresponse were 1,825 and 2,395 under normoxic and hypoxic conditions, respectively (Supplementary File 3). Moreover, 1,375 genes were negatively associated with radioresponse under normoxic conditions but positively associated with radioresponse under hypoxic conditions, and 471 genes were positively associated with radioresponse under normoxic conditions but negatively associated with radioresponse under hypoxic conditions (Supplementary Fig. S7). In keeping with these effects, we observed large changes in the ranking of strength of correlation of specific genes with radioresponse under oxic and hypoxic conditions (Fig. 5C). The gene with the greatest change, PPM1A, has been implicated in the regulation of cellular stress response and has previously been shown to have hypoxia-specific activity (36). WDR70, a gene with known roles in DNA double-strand break repair (37, 38), also displayed a large change in this analysis (Fig. 5C). One might hypothesize based on our results that WDR70 could have previously uncharacterized hypoxia-specific activities and/or expression; these findings warrant further investigation.

Tissue specificity of radioresponse and repair

It is known that distinct tissues and tumor types respond differently to ionizing radiation exposure. Intrinsic radiosensitivity has been suggested as a major contributing factor to this differential response (10). We used RadioGx to interrogate radioresponse within tissue types (Fig. 6).

To examine the biological factors that may underlie suspected differences in radioresponse between tissue types, we identified 281 transcriptional pathways that were significantly associated with radioresponse within at least one tissue type (Fig. 6A; Supplementary File 4). Of these 281 pathways, 123 were statistically significant only in one tissue type (Supplementary Fig. S8). Overall, there were more statistically significant pathway associations with radiosensitivity than radioresistance (total across all tissue types: 437 and 226, respectively). Remarkably, we did not find any transcriptional pathways that were statistically significantly associated with radioresponse across all tissue types. We also observed variable α/β values among the tissue types within RadioGx (Fig. 6B), suggesting heterogeneity of DNA repair and dependence on radiotherapy fraction size.

Androgen receptor (AR) expression has emerged as a mediator of radioresistance in breast (10) and prostate cancer (39), but its effect on radioresponse in other tissue types is poorly understood. We used RadioGx to interrogate AR expression and its association with radioresponse across tissue types. As expected, among the available tissue types, breast cancer cell lines expressed AR at the highest levels and displayed the highest concordance between AR expression and radioresponse (Supplementary Fig. S9A). Notably, other tissue types including soft tissue, kidney, urinary tract, and stomach cancer showed similar concordance between AR expression and radioresponse (Supplementary Fig. S9B). Of these, soft tissue cell lines demonstrated the highest AR expression. Interestingly, AR expression has been found to be detectable in clinical samples (40) and is associated with radioresistance in rhabdomyosarcoma (41).

Common dependencies of therapeutic effects among radiotherapy and drugs

Datasets within RadioGx are standardized with regard to cell line annotations such that integrated analyses using other existing datasets can be easily conducted. For instance, our previously published tool, PharmacoGx (17), contains pharmacogenomic data from multiple studies and enables meta-analysis of pharmacogenomic data. We wished to identify categories of drugs with cytotoxic effects that correlate with radioresponse, so we interrogated RadioGx to compare cellular responses to ionizing radiation and chemotherapeutic agents (n = 545 distinct drugs). Drug responses were obtained from 480 cancer cell lines from the CTRPv2 pharmacogenomic dataset (Supplementary Table S1) that were in common between the datasets. We computed the correlation between drug response and radiation response across the cancer cell lines (Supplementary Fig. S10) and then classified drugs according to pharmacologic categories (i.e., by cellular targets and/or mechanisms of action). Drugs targeting the cytoskeleton, DNA replication, and mitosis displayed the strongest correlations with radioresponse (FDR < 5%; Fig. 7). Thymidylate

Figure 5.

Integrative analysis of radiobiologic model with transcriptomic data and prediction of radioresponse under hypoxia. A, Hypothetical illustration of cancer cell surviving fraction according to dose and oxygen partial pressure, as modeled using RadioGx. Solid curves are modeled using Eq. C (Materials and Methods). The computed AUC values are 2.41, 2.71, and 2.97 for normoxia (160 mmHg), hypoxic condition with 10 mmHg, and hypoxic condition with 5 mmHg, respectively. B, Changes in AUC by tissue type (with minimum of 15 cell lines within RadioGx) under normoxic (160 mmHg) or hypoxic (5 mmHg) conditions, ordered according to median AUC under normoxia. Tukey boxplots are shown. C, The difference in ranks is shown between the strength of univariate association of each gene with AUC under normoxic (160 mmHg) vs. hypoxic (5 mmHg) conditions across cancer cell lines within RadioGx.


synthetase inhibitors such as the known radiosensitizing drug fluorouracil also displayed cytotoxic effects that correlated with radioresponse but did not reach statistical significance. In addition to these anticipated and largely confirmatory findings, we also observed unexpected negative associations between radioresponse and cytotoxic effects of drugs targeting numerous cell signaling pathways (i.e., PI3K signaling, ERK/MAPK signaling, WNT signaling, EGFR signaling, ABL signaling), although these were not statistically significant.
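The drug comparison described in this section can be pictured with a small sketch: restrict each drug's response to the cell lines it shares with the radiation data, correlate the two, then summarize by pharmacologic class. Everything here is hypothetical, including the cell line and drug names, the values, the use of Pearson correlation, and the simple per-class mean, which stands in for the pharmacologic enrichment analysis of Fig. 7 rather than reproducing it.

```python
import math
from collections import defaultdict

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical responses keyed by standardized cell line name
radiation_auc = {"line_1": 1.9, "line_2": 2.4, "line_3": 2.8, "line_4": 3.1}
drug_auc = {
    "drug_x": {"line_1": 0.30, "line_2": 0.45, "line_3": 0.60, "line_4": 0.70, "line_9": 0.50},
    "drug_y": {"line_1": 0.80, "line_2": 0.65, "line_3": 0.50, "line_4": 0.35},
}
drug_class = {"drug_x": "mitosis", "drug_y": "EGFR signaling"}

by_class = defaultdict(list)
for drug, response in drug_auc.items():
    shared = sorted(set(response) & set(radiation_auc))   # lines present in both datasets
    r = pearson([response[c] for c in shared], [radiation_auc[c] for c in shared])
    by_class[drug_class[drug]].append(r)

class_mean_r = {cls: sum(rs) / len(rs) for cls, rs in by_class.items()}
for cls, r in class_mean_r.items():
    print(f"{cls}: mean r = {r:+.2f}")
```

The intersection step is where standardized cell line annotations matter: a line named differently in the two datasets (like the unmatched line_9 here) simply drops out of the comparison.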

Discussion

To date, the paradigm of precision medicine has primarily been applied to advanced incurable cancers. For early stage curable cancers for which radiotherapy is used with curative intent, there remains a need for more precise biologically guided radiotherapy delivery. For instance, there are currently no clinically implemented molecular biomarkers that are predictive of radioresponse. This also extends to predictive insights

Figure 6.

Tissue specificity of molecular determinants of radioresponse. A, The tissue types (columns) represented by a minimum of 15 cancer cell lines were considered for analysis. A total of 281 pathways are depicted (rows) and are annotated by function. Colors designate pathways significantly associated with AUC (FDR < 5%). B, Heterogeneity of α/β ratios across cancer cell lines derived from distinct tissue types ordered according to median values. In the violin plot, the white dot represents the median, the thick gray bar in the center represents the interquartile range, and the thin gray bar represents the rest of the distribution.


into the response of tumors to other therapeutic agents that may be administered in combination with radiotherapy. Although molecular diagnostic tools are making their way into clinical practice in other settings, the lack of equivalent molecular indicators in the field of radiobiology has impeded translation in this domain (1, 7, 42).

Recently, large radioresponse and genomic datasets have been generated from hundreds of cancer cell lines, providing an opportunity to address this unmet need. We have developed RadioGx, an open-source software package that enables users to perform integrative analysis of radiogenomic datasets for preclinical evaluation of radioresponse determinants. RadioGx standardizes published nomenclature and annotations between datasets and integrates dose–response and molecular data. Because the RadioGx platform is developed on cell-based model systems, it suffers from certain limitations. Although cell lines have been derived from primary tumor patient samples, they evolved under different conditions temporally and may have accumulated culture-dependent genomic alterations. More importantly, these cell cultures may lack the endogenous hierarchical structure of tumors (comprising stem cells, progenitor cells, differentiated cells, etc.) along with microenvironment components such as immune cells and other stromal components, all of which also play important roles in mediating tumor response to radiation therapy. We aimed to address many of these issues by using high quality data from large cell line factories with rigorous quality controls and by focusing on intrinsic radiosensitivity as an outcome.

We used RadioGx to evaluate the appropriateness of the 9-day viability assay for assessing radioresponse, the robustness of distinct radioresponse indicators, and the utility of applying established radiobiologic models to the data for novel hypothesis generation. We confirmed the findings from Yard and colleagues (10) that the 9-day viability assay, which is amenable to high-throughput processing and analysis, largely recapitulates the results of the more tedious gold standard clonogenic assay. We also were able to show that the radiation response data matched known biomarkers of response. We note that some prior putative intrinsic radiosensitivity gene expression signatures that were generated using cell line clonogenic survival data have failed validation using independent sets of cancer cell lines (27, 43), highlighting the need for robust reproducible methodologies for future studies. Moreover, we found that AUC derived from the LQ model might provide a more complete characterization of the biological processes underpinning radioresponse as compared with the dose-specific SF2 indicator, particularly for relatively radioresistant cell types. Based on our findings, we suggest that AUC should be the radioresponse indicator of choice for preclinical studies. Although we found that the LQ model fit the radioresponse data for the vast majority of cancer cell lines within RadioGx, a small subset was not amenable to LQ modeling. Although their use within the LQ framework should be approached with caution, it remains unclear whether these samples might reflect a distinct entity of tumors that do not abide by the LQ formalism and may instead abide by other useful models.

A major hurdle in the development of large-scale radioresponse datasets has been the technical and throughput challenges associated with the clonogenic assay. We demonstrated how existing data within RadioGx can be used to generate hypotheses and make predictions to inform future investigations. For instance, recognizing a dearth of large-scale radioresponse data under hypoxic conditions, we integrated radiobiological modeling with gene expression data from RadioGx, which allowed us to predict radioresponse under hypoxic conditions. Our findings suggest that the change in radioresponse under hypoxia is tissue-specific and that certain genes are either differentially associated with radioresponse under normoxic and hypoxic conditions or may have expression levels or activity that are regulated by oxygen tension; these hypotheses generated using RadioGx could be tested experimentally in future studies. In addition, by combining RadioGx with an existing pharmacogenomics analysis platform, we uncovered drugs with cytotoxic effects that are correlated or anticorrelated with radioresponse, suggestive of genomic/transcriptomic dependencies related to their mechanisms of action. We were able to confirm drug classes with therapeutic effects that overlap with ionizing radiation (e.g., mitotic inhibitors); moreover, this analysis proposed novel hypotheses regarding possible anticorrelated therapeutic effects with drugs targeting a number of cellular signaling pathways such as ABL and EGFR. Future studies may seek to examine whether members of these drug classes may make rational combination therapies with radiation as a result of reduced additive toxicity.

Predictive biomarkers of radioresponse could be applied across multiple cancer types (pan-cancer) or via a tissue-specific approach. Although conserved cellular processes are activated by radiation (44), different cell types have variable

Figure 7.

Identification of drugs and pharmacologic classes with cytotoxic effects on cancer cell lines that correlate with radioresponse. Pharmacologic enrichment analysis using radiation AUC as the radioresponse indicator. Pharmacologic classes with statistically significant associations with radioresponse in cancer cell lines are indicated.


downstream responses and rates of survival (10). One factor that may have contributed to past decisions to focus on pan-cancer analysis is the limited amount of available cell-line radiosensitivity data; for instance, multiple putative radiosensitivity signatures have used radiation response data obtained from the NCI-60 cell lines (45–47). We showed that there is considerable variation in pathways associated with radiation response across tissue types within RadioGx, which supports a tissue-specific approach (48, 49). We envision that future investigations into tissue-specific radiation response biomarkers will be facilitated by the larger data sets curated in RadioGx, which will only expand with the inclusion of future data sets.

In summary, this study demonstrates the impact of combining radiogenomic datasets with established radiobiological models and other existing pharmacogenomic data. Future applications of RadioGx may include generation of biomarkers for intrinsic radiosensitivity and selection of novel combination therapies for preclinical testing. Thus, we envision that RadioGx will help to accelerate preclinical radiotherapeutic discovery pipelines and guide the selection of appropriate biological endpoints.

Disclosure of Potential Conflicts of Interest

M.E. Abazeed reports receiving a commercial research grant from Bayer AG and Siemens Healthcare and has received speakers bureau honoraria from Bayer AG. S.V. Bratman reports receiving a commercial research grant from Nektar Therapeutics and is a co-inventor on a patent licensed to Roche Molecular Diagnostics. No potential conflicts of interest were disclosed by the other authors.

Authors' Contributions

Conception and design: V.S.K. Manem, M. Lambie, M.E. Abazeed, B. Haibe-Kains, S.V. Bratman
Development of methodology: V.S.K. Manem, M. Lambie, M.E. Abazeed, B. Haibe-Kains
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M.E. Abazeed, B. Haibe-Kains
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): V.S.K. Manem, M. Lambie, I. Smith, M. Freeman, M. Koritzinsky, M.E. Abazeed, B. Haibe-Kains
Writing, review, and/or revision of the manuscript: V.S.K. Manem, M. Lambie, P. Smirnov, M. Koritzinsky, M.E. Abazeed, B. Haibe-Kains, S.V. Bratman
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): I. Smith, P. Smirnov, V. Kofia, B. Haibe-Kains
Study supervision: B. Haibe-Kains, S.V. Bratman
Other (contributed to the development of the accompanying RadioGx Software Package): P. Smirnov

Acknowledgments

This work was supported by a grant from the V Foundation for Cancer Research (V2018-010) and from the Canadian Institutes of Health Research (PJT-162185). V.S.K. Manem was supported by the Terry Fox Research Institute. V.S.K. Manem and M. Freeman were supported by the Canadian Institutes of Health Research. P. Smirnov was supported by Genome Canada and the Ontario Research Funds. S.V. Bratman and B. Haibe-Kains are supported by the Gattuso-Slaight Personalized Cancer Medicine Fund at the Princess Margaret Cancer Centre. M. Lambie was supported by a fellowship from STARS21. We also gratefully acknowledge the support from the Princess Margaret Cancer Foundation and the Princess Margaret Cancer Center Head & Neck Translational Program, with philanthropic funds from the Wharton Family, Joe's Team, and Gordon Tozer.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received January 15, 2019; revised July 3, 2019; accepted September 17, 2019; published first September 26, 2019.

References

1. Baumann M, Krause M, Overgaard J, Debus J, Bentzen SM, Daartz J, et al. Radiation oncology in the era of precision medicine. Nat Rev Cancer 2016;16:234–49.

2. Verellen D, De Ridder M, Linthout N, Tournel K, Soete G, Storme G. Innovations in image-guided radiotherapy. Nat Rev Cancer 2007;71–71.

3. Bernier J, Hall EJ, Giaccia A. Radiation oncology: a century of achievements. Nat Rev Cancer 2004;4:737–47.

4. Steel GG, McMillan TJ, Peacock JH. The radiobiology of human cells and tissues. In vitro radiosensitivity. The picture has changed in the 1980s. Int J Radiat Biol 1989;56:525–37.

5. Bentzen SM, Overgaard J. Patient-to-patient variability in the expression of radiation-induced normal tissue injury. Semin Radiat Oncol 1994;4:68–80.

6. Yard B, Chie EK, Adams DJ, Peacock C, Abazeed ME. Radiotherapy in the era of precision medicine. Semin Radiat Oncol 2015;25:227–36.

7. Bristow RG, Alexander B, Baumann M, Bratman SV, Brown JM, Camphausen K, et al. Combining precision radiotherapy with molecular targeting and immunomodulatory agents: a guideline by the American Society for Radiation Oncology. Lancet Oncol 2018;19:e240–51.

8. Brown JM, Wouters BG. Apoptosis, p53, and tumor cell sensitivity to anticancer agents. Cancer Res 1999;59:1391–9.

9. Abazeed ME, Adams DJ, Hurov KE, Tamayo P, Creighton CJ, Sonkin D, et al. Integrative radiogenomic profiling of squamous cell lung cancer. Cancer Res 2013;73:6289–98.

10. Yard BD, Adams DJ, Chie EK, Tamayo P, Battaglia JS, Gopal P, et al. A genetic basis for the variation in the vulnerability of cancer to DNA damage. Nat Commun 2016;7:11428.

11. Dale RG. The application of the linear-quadratic dose-effect equation to fractionated and protracted radiotherapy. Br J Radiol 1985;58:515–28.

12. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The Ensembl Variant Effect Predictor. Genome Biol 2016;17:122.

13. Daşu A, Toma-Daşu I, Karlsson M. The effects of hypoxia on the theoretical modelling of tumour control probability. Acta Oncol 2005;44:563–71.

14. Palcic B, Skarsgard LD. Reduced oxygen enhancement ratio at low doses of ionizing radiation. Radiat Res 1984;100:328–39.

15. Skarsgard LD, Harrison I. Dose dependence of the oxygen enhancement ratio (OER) in radiation inactivation of Chinese hamster V79-171 cells. Radiat Res 1991;127:243–7.

16. Freyer JP, Jarrett K, Carpenter S, Raju MR. Oxygen enhancement ratio as a function of dose and cell cycle phase for radiation-resistant and sensitive CHO cells. Radiat Res 1991;127:297–307.

17. Smirnov P, Safikhani Z, El-Hachem N, Wang D, She A, Olsen C, et al. PharmacoGx: an R package for analysis of large pharmacogenomic datasets. Bioinformatics 2016;32:1244–6.

18. Seashore-Ludlow B, Rees MG, Cheah JH, Cokol M, Price EV, Coletti ME, et al. Harnessing connectivity in a large-scale small-molecule sensitivity dataset. Cancer Discov 2015;5:1210–23.

19. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102:15545–50.

20. Väremo L, Nielsen J, Nookaew I. Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods. Nucleic Acids Res 2013;41:4378–91.

21. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B Stat Methodol 1995;57:289–300.

22. Sandve GK, Nekrutenko A, Taylor J, Hovig E. Ten simple rules for reproducible computational research. PLoS Comput Biol 2013;9:e1003285.


23. Gentleman R. Reproducible research: a bioinformatics case study. Stat Appl Genet Mol Biol 2005;4:Article2.

24. Stroup DF, Berlin JA, Morton SC, Olkin I, David Williamson G, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. JAMA 2000;283:2008–12.

25. Nuryadi E, Mayang Permata TB, Komatsu S, Oike T, Nakano T. Inter-assay precision of clonogenic assays for radiosensitivity in cancer cell line A549. Oncotarget 2018;9:13706–12.

26. De Jong MC, Ten Hoeve JJ, Grénman R, Wessels LF, Kerkhoven R, Te Riele H, et al. Pretreatment microRNA expression impacting on epithelial-to-mesenchymal transition predicts intrinsic radiosensitivity in head and neck cancer cell lines and patients. Clin Cancer Res 2015;21:5630–8.

27. Hall JS, Iype R, Senra J, Taylor J, Armenoult L, Oguejiofor K, et al. Investigation of radiosensitivity gene signatures in cancer cell lines. PLoS One 2014;9:e86329.

28. Torres-Roca JF, Eschrich S, Zhao H, Bloom G, Sung J, McCarthy S, et al. Prediction of radiation sensitivity using a gene expression classifier. Cancer Res 2005;65:7169–76.

29. Deacon J, Peckham MJ, Steel GG. The radioresponsiveness of human tumours and the initial slope of the cell survival curve. Radiother Oncol 1984;2:317–23.

30. Torres-Roca JF, Stevens CW. Predicting response to clinical radiotherapy: past, present, and future directions. Cancer Control 2008;15:151–6.

31. Fertil B, Malaise EP. Intrinsic radiosensitivity of human cell lines is correlated with radioresponsiveness of human tumors: analysis of 101 published survival curves. Int J Radiat Oncol Biol Phys 1985;11:1699–707.

32. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012;483:603–7.

33. Espinosa-Diez C, Miguel V, Mennerich D, Kietzmann T, Sánchez-Pérez P, Cadenas S, et al. Antioxidant responses and cellular adjustments to oxidative stress. Redox Biol 2015;6:183–97.

34. Singh A, Bodas M, Wakabayashi N, Bunz F, Biswal S. Gain of Nrf2 function in non-small-cell lung cancer cells confers radioresistance. Antioxid Redox Signal 2010;13:1627–37.

35. Branzei D, Foiani M. Regulation of DNA repair throughout the cell cycle. Nat Rev Mol Cell Biol 2008;9:297–308.

36. Heikkinen PT, Nummela M, Leivonen S-K, Westermarck J, Hill CS, Kähäri V-M, et al. Hypoxia-activated Smad3-specific dephosphorylation by PP2A. J Biol Chem 2010;285:3740–9.

37. Guo L-D, Wang D, Yang F, Liang Y-J, Yang X-Q, Qin Y-Y, et al. [Functional analysis of DNA damage repair factor WDR70 and its mutation in ovarian cancer]. Sichuan Da Xue Xue Bao Yi Xue Ban 2016;47:501–6.

38. Zeng M, Ren L, Mizuno K, Nestoras K, Wang H, Tang Z, et al. CRL4(Wdr70) regulates H2B monoubiquitination and facilitates Exo1-dependent resection. Nat Commun 2016;7:11364.

39. Spratt DE, Evans MJ, Davis BJ, Doran MG, Lee MX, Shah N, et al. Androgen receptor upregulation mediates radioresistance after ionizing radiation. Cancer Res 2015;75:4688–96.

40. Ingram DR, Dillon LM, Lev DC, Lazar A, Demicco EG, Eisenberg BL, et al. Estrogen receptor alpha and androgen receptor are commonly expressed in well-differentiated liposarcoma. BMC Clin Pathol 2014;14:42.

41. Giannattasio S, Megiorni F, Di Nisio V, Del Fattore A, Fontanella R, Camero S, et al. Testosterone-mediated activation of androgenic signalling sustains in vitro the transformed and radioresistant phenotype of rhabdomyosarcoma cell lines. J Endocrinol Invest 2019;42:183–97.

42. Bibault J-E, Fumagalli I, Ferté C, Chargari C, Soria J-C, Deutsch E. Personalized radiation therapy and biomarker-driven treatment strategies: a systematic review. Cancer Metastasis Rev 2013;32:479–92.

43. Bratman SV, Milosevic MF, Liu F-F, Haibe-Kains B. Genomic biomarkers for precision radiation medicine. Lancet Oncol 2017;18:e238.

44. Maier P, Hartmann L, Wenz F, Herskind C. Cellular pathways in response to ionizing radiation and their targetability for tumor radiosensitization. Int J Mol Sci 2016;17.

45. Amundson SA, Do KT, Vinikoor LC, Lee RA, Koch-Paiz CA, Ahn J, et al. Integrating global gene expression and radiation survival parameters across the 60 cell lines of the National Cancer Institute Anticancer Drug Screen. Cancer Res 2008;68:415–24.

46. Eschrich SA, Pramana J, Zhang H, Zhao H, Boulware D, Lee JH, et al. A gene expression model of intrinsic tumor radiosensitivity: prediction of response and prognosis after chemoradiation. Int J Radiat Oncol Biol Phys 2009;75:489–96.

47. Kim HS, Kim SC, Kim SJ, Park CH, Jeung H-C, Kim YB, et al. Identification of a radiosensitivity signature using integrative metaanalysis of published microarray data for NCI-60 cancer cells. BMC Genomics 2012;13:348.

48. Zhao SG, Chang SL, Spratt DE, Erho N, Yu M, Ashab HA-D, et al. Development and validation of a 24-gene predictor of response to postoperative radiotherapy in prostate cancer: a matched, retrospective analysis. Lancet Oncol 2016;17:1612–20.

49. Speers C, Zhao S, Liu M, Bartelink H, Pierce LJ, Feng FY. Development and validation of a novel radiosensitivity signature in human breast cancer. Clin Cancer Res 2015;21:3667–77.


CANCER IMMUNOLOGY RESEARCH | RESEARCH ARTICLE

pVACtools: A Computational Toolkit to Identify and Visualize Cancer Neoantigens

Jasreet Hundal1, Susanna Kiwala1, Joshua McMichael1, Christopher A. Miller1,2,3, Huiming Xia1, Alexander T. Wollam1, Connor J. Liu1, Sidi Zhao1, Yang-Yang Feng1, Aaron P. Graubert1, Amber Z. Wollam1, Jonas Neichin1, Megan Neveau1, Jason Walker1, William E. Gillanders3,4, Elaine R. Mardis5, Obi L. Griffith1,2,3,6, and Malachi Griffith1,2,3,6

ABSTRACT

Identification of neoantigens is a critical step in predicting response to checkpoint blockade therapy and design of personalized cancer vaccines. This is a cross-disciplinary challenge, involving genomics, proteomics, immunology, and computational approaches. We have built a computational framework called pVACtools that, when paired with a well-established genomics pipeline, produces an end-to-end solution for neoantigen characterization. pVACtools supports identification of altered peptides from different mechanisms, including point mutations, in-frame and frameshift insertions and deletions, and gene fusions. Prediction of peptide:MHC binding is accomplished by supporting an ensemble of MHC Class I and II binding algorithms within a framework designed to facilitate the incorporation of additional algorithms. Prioritization of predicted peptides occurs by integrating diverse data, including mutant allele expression, peptide binding affinities, and determination of whether a mutation is clonal or subclonal. Interactive visualization via a Web interface allows clinical users to efficiently generate, review, and interpret results, selecting candidate peptides for individual patient vaccine designs. Additional modules support design choices needed for competing vaccine delivery approaches. One such module optimizes peptide ordering to minimize junctional epitopes in DNA vector vaccines. Downstream analysis commands for synthetic long peptide vaccines are available to assess candidates for factors that influence peptide synthesis. All of the aforementioned steps are executed via a modular workflow consisting of tools for neoantigen prediction from somatic alterations (pVACseq and pVACfuse), prioritization, and selection using a graphical Web-based interface (pVACviz), and design of DNA vector–based vaccines (pVACvector) and synthetic long peptide vaccines. pVACtools is available at http://www.pvactools.org.

Introduction

The increasing use of cancer immunotherapies has spurred interest in identifying and characterizing predicted neoantigens encoded by a tumor's genome. The facility and precision of computational tools for predicting neoantigens have become increasingly important (1), and several resources aiding in these efforts have been published (2–4). Typically, these tools start with a list of somatic variants [in Variant Call Format (VCF) or other formats such as Browser Extensible Data (BED)] with annotated protein changes and predict the strongest MHC-binding peptides (8–11-mer for Class I MHC and 13–25-mer for Class II) using one or more prediction algorithms (5–7). The predicted neoantigens are then filtered and ranked based on defined metrics, including sequencing read coverage, variant allele fraction (VAF), gene expression, and differential binding compared with the wild-type peptide (agretopicity index score; ref. 8). However, of the prediction tools overviewed in Supplementary Table S1 [Vaxrank (3), MuPeXI (2), CloudNeo (4), FRED2 (9), Epi-Seq (8), and ProTECT (10)], most lack key functionality, including predicting neoantigens from gene fusions, aiding optimized vaccine design for DNA cassette vaccines, and explicitly incorporating nearby germline or somatic alterations into the candidate neoantigens (11). In addition, none of these tools offer an intuitive graphical user interface for visualizing and efficiently selecting the most promising candidates, a key feature for facilitating involvement of clinicians and other researchers in the process of neoantigen prediction and evaluation.

To address these limitations, we created pVACtools, a comprehensive and extensible toolkit for computational identification, selection, prioritization, and visualization of neoantigens, which facilitates each of the major components of neoantigen identification and prioritization. This computational framework can be used to identify neoantigens from a variety of somatic alterations, including gene fusions and insertion/deletion frameshift variants, both of which potentially create strong immunogenic neoantigens (12–14). pVACtools can facilitate both MHC Class I and II predictions and provides an interactive display of predicted neoantigens for review by the end user.

1McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri. 2Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, Missouri. 3Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri. 4Department of Surgery, Washington University School of Medicine, St. Louis, Missouri. 5Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, Ohio. 6Department of Genetics, Washington University School of Medicine, St. Louis, Missouri.

Note: Supplementary data for this article are available at Cancer Immunology Research Online (http://cancerimmunolres.aacrjournals.org/).

J. Hundal and S. Kiwala contributed equally to and are co-first authors of this article.

Corresponding Authors: Malachi Griffith, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO 63108. Phone: 314-286-1274; E-mail: [email protected]; and Obi L. Griffith. Phone: 314-747-9248; E-mail: [email protected]

Cancer Immunol Res 2020;8:409–20

doi: 10.1158/2326-6066.CIR-19-0401

©2020 American Association for Cancer Research.


Materials and Methods

The Cancer Genome Atlas data preprocessing

Aligned (build GRCh38) tumor and normal BAMs from BWA (ref. 15; version 0.7.12-r1039) as well as somatic variant calls from VarScan2 (refs. 16, 17; in VCF format) were downloaded from the Genomic Data Commons (GDC, https://gdc.cancer.gov/). Because the GDC does not provide germline variant calls for The Cancer Genome Atlas (TCGA) data, we used GATK's (18) HaplotypeCaller to perform germline variant calling using default parameters. These calls were refined using VariantRecalibrator in accordance with GATK Best Practices (19). Somatic and germline missense variant calls from each sample were then combined using GATK's CombineVariants, and the variants were subsequently phased using GATK's ReadBackedPhasing algorithm.

Phased somatic VCF files were annotated with RNA depth and expression information using VAtools (http://vatools.org). We restricted our analysis to “PASS” variants in these VCFs, as these are higher confidence than the raw set, and the variants were annotated using the “--pick” option in VEP (Ensembl version 88).

Existing in silico HLA typing information was obtained from The Cancer Immunome Atlas (TCIA) database (20).

Neoantigen prediction

The VEP-annotated VCF files were then analyzed with pVACseq using all eight Class I prediction algorithms for neoantigen peptide lengths 8–11. The current MHC Class I algorithms supported by pVACseq are NetMHCpan (21), NetMHC (7, 21), NetMHCcons (22), PickPocket (23), SMM (24), SMMPMBEC (25), MHCflurry (26), and MHCnuggets (27). The four MHC Class II algorithms that are supported are NetMHCIIpan (28), SMMalign (29), NNalign (30), and MHCnuggets (31). For the demonstration analysis, we limited our prediction to MHC Class I alleles because HLA typing information from TCIA was available only for these, though binding predictions for Class II alleles can also be generated using pVACtools.

Ranking of neoantigens

To help prioritize neoantigens, a rank is assigned to all neoantigens that pass initial filters. Each of the following four criteria is assigned a rank-ordered value (where the best is assigned rank = 1):

B = Rank of binding affinity
F = Rank of fold change between mutant and wild-type alleles
M = Rank of mutant allele expression, calculated as (rank of gene expression × rank of mutant allele RNA variant allele fraction)
D = Rank of DNA variant allele fraction

A final ranking is based on a score obtained from combining these values: B + F + (M × 2) + (D/2).

Whereas agretopicity is considered in ranking, we do not recommend filtering candidates solely based on mutant versus wild-type binding value (default fold change value is 0). A previous study on the binding properties of known neoantigen candidates showed that most changes reside in T-cell receptor contact regions and not in anchor residues (32). To help end users make an informed decision while selecting candidates, we also provide the position of the variant in the final report. We do not rely on hard rules about “anchor positions” (e.g., position 2 and C-terminal) because these may vary by MHC allele and the peptide length. As we gain more empirical results and validation data, we hope that future versions of pVACtools may include more options for prioritizing based on these factors.

The rank is not meant to be a definitive metric of peptide suitability for vaccines, but was designed to be a useful first step in the peptide selection process.
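The combined score described above can be illustrated with a short Python sketch. This is not the pVACtools implementation: the field names, the use of dense ranks, the literal reading of M as a product of two ranks, and the reading that a lower combined score is better (which follows from the "best = rank 1" convention) are all assumptions made for illustration.

```python
from typing import Dict, List


def dense_rank(values: List[float], higher_is_better: bool) -> List[int]:
    """Return dense ranks in which the best value receives rank 1."""
    ordered = sorted(set(values), reverse=higher_is_better)
    position = {value: index + 1 for index, value in enumerate(ordered)}
    return [position[value] for value in values]


def priority_scores(candidates: List[Dict[str, float]]) -> List[float]:
    """Combine per-criterion ranks as B + F + (M x 2) + (D / 2)."""
    # B: a lower predicted IC50 means stronger binding, so lower is better.
    b = dense_rank([c["ic50_nm"] for c in candidates], higher_is_better=False)
    # F: a higher wild-type/mutant fold change is better.
    f = dense_rank([c["fold_change"] for c in candidates], higher_is_better=True)
    gene = dense_rank([c["gene_expression"] for c in candidates], higher_is_better=True)
    rna = dense_rank([c["rna_vaf"] for c in candidates], higher_is_better=True)
    # D: a higher DNA variant allele fraction is better.
    d = dense_rank([c["dna_vaf"] for c in candidates], higher_is_better=True)
    scores = []
    for i in range(len(candidates)):
        # M follows the text literally: rank of gene expression x rank of RNA VAF.
        m = gene[i] * rna[i]
        scores.append(b[i] + f[i] + 2 * m + d[i] / 2)
    return scores


# Hypothetical candidates; the numbers are invented for the example.
candidates = [
    {"ic50_nm": 50.0, "fold_change": 10.0, "gene_expression": 30.0, "rna_vaf": 0.5, "dna_vaf": 0.4},
    {"ic50_nm": 400.0, "fold_change": 1.0, "gene_expression": 5.0, "rna_vaf": 0.1, "dna_vaf": 0.2},
]
scores = priority_scores(candidates)
```

With these inputs the first candidate is best on every criterion and receives the smaller combined score. Doubling M and halving D weights expression evidence more heavily than DNA allele fraction, as in the formula.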

Implementation

pVACtools is written in Python 3. The individual tools are implemented as separate command line entry points that can be run using the “pvacseq,” “pvacfuse,” “pvacvector,” “pvacapi,” and “pvacviz” commands for each respective tool. pVACapi is required to run pVACviz, so the “pvacapi” and “pvacviz” commands both need to be executed, in separate terminals.

The code test suite is implemented using the Python unittest framework, and GitHub integration tests are run using travis-ci (https://travis-ci.org). Code changes are integrated using GitHub pull requests (https://github.com/griffithlab/pVACtools/pulls). Feature additions, user requests, and bug reports are managed using GitHub issue tracking (https://github.com/griffithlab/pVACtools/issues). User documentation is written using the reStructuredText markup language and the Sphinx documentation framework (https://www.sphinx-doc.org/en/master). Documentation is hosted on Read the Docs (https://readthedocs.org) and can be viewed at http://www.pvactools.org.

Multithreading and parallelization

As many prediction algorithms are CPU intensive, pVACseq, pVACfuse, and pVACvector support using multiple cores to improve runtime. Using this feature, calls to the Immune Epitope Database (IEDB) and other prediction algorithms are made in parallel over a user-defined number of processes.

The pymp-pypi package was used to add support for parallel processing. The number of processes is controlled by the --n-threads parameter.
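The idea of fanning prediction calls out over a user-defined number of workers can be sketched as follows. This example deliberately swaps in the standard-library `concurrent.futures.ThreadPoolExecutor` rather than pymp, and `predict_binding` is a hypothetical stand-in that returns a placeholder value instead of calling any real prediction algorithm. It should be read as an illustration of the pattern, not of how pVACtools is implemented.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Job = Tuple[str, str]  # (HLA allele, peptide)


def predict_binding(job: Job) -> Tuple[str, str, float]:
    """Hypothetical stand-in for one prediction call (for example, one IEDB request)."""
    allele, peptide = job
    # Placeholder score so the example runs offline; this is not a real prediction.
    return allele, peptide, float(len(peptide) * 100)


def run_predictions(jobs: List[Job], n_threads: int) -> List[Tuple[str, str, float]]:
    """Run all jobs over n_threads workers, returning results in input order."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(predict_binding, jobs))


jobs = [("HLA-A*02:01", "SIINFEKL"), ("HLA-A*02:01", "KLGGALQAK")]
results = run_predictions(jobs, n_threads=2)
```

Because `map` preserves input order, results can be matched back to the allele and peptide combinations that produced them regardless of which worker finished first.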

pVACseq

For pVACseq, the pyvcf package is first used for parsing the input VCF file and extracting information about the supported missense, in-frame insertion, in-frame deletion, and frameshift variants into TSV format.

This output is then used to determine the wild-type peptide sequence by extracting a region around the somatic variant according to the --peptide-sequence-length specified by the user. The variant's amino acid change is incorporated in this peptide sequence to determine the mutant peptide sequence. For frameshift variants, the new downstream protein sequence calculated by VEP is reported from the variant position onward. The number of downstream amino acids to include is controlled by the --downstream-sequence-length parameter. If a phased VCF with proximal variants is provided, proximal missense variants that are in phase with the somatic variant of interest are incorporated into the mutant and wild-type peptide sequences, as appropriate. The mutant and wild-type sequences are stored in a FASTA file. The FASTA file is then submitted to the individual prediction algorithms for binding affinity predictions. For algorithms included in the IEDB, we use either the IEDB API or a standalone installation, if an installation path is provided by the user (--iedb-install-directory). The mhcflurry and mhcnuggets packages are used to run the MHCflurry (26) and MHCnuggets (27) prediction algorithms, respectively.
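For a single missense variant, the windowing step described above can be sketched in a few lines of Python. The function names, the centered-window convention, and the example protein are assumptions made for illustration. The sketch ignores frameshifts, indels, and proximal variants, which the tool handles using VEP annotations.

```python
from typing import List, Tuple


def peptide_window(protein: str, position: int, alt: str, length: int) -> Tuple[str, str, int]:
    """Return (wild_type, mutant, offset) for a 1-based missense position.

    `length` plays the role of a peptide sequence length; `offset` is the
    0-based index of the variant residue inside the returned windows.
    """
    index = position - 1
    flank = (length - 1) // 2
    start = max(0, index - flank)              # clip at the N-terminus
    end = min(len(protein), index + flank + 1)  # clip at the C-terminus
    wild_type = protein[start:end]
    offset = index - start
    mutant = wild_type[:offset] + alt + wild_type[offset + 1:]
    return wild_type, mutant, offset


def overlapping_kmers(sequence: str, offset: int, k: int) -> List[str]:
    """List every k-mer of `sequence` that contains the residue at `offset`."""
    first = max(0, offset - k + 1)
    last = min(offset, len(sequence) - k)
    return [sequence[i:i + k] for i in range(first, last + 1)]


# Hypothetical protein with an R-to-W substitution at position 10.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
wild_type, mutant, offset = peptide_window(protein, position=10, alt="W", length=9)
kmers = overlapping_kmers(mutant, offset, k=8)
```

Only the k-mers that contain the altered residue are candidate neoantigens. Each has a wild-type counterpart at the same register, which is what makes a mutant versus wild-type comparison possible later.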

The predicted mutant antigens are then parsed into a TSV reportformat, and, for each mutant antigen, the closest wild-type antigen is

Hundal et al.

Cancer Immunol Res; 8(3) March 2020 CANCER IMMUNOLOGY RESEARCH410


determined and reported. Predictions for each mutant antigen/neoantigen from multiple algorithms are aggregated into the “best” (lowest) and median binding scores. The resulting TSV is processed through multiple filtering steps (as described below in Results: Comparison of filtering criteria): (i) Binding filter: this filter selects the strongest binding candidates based on the mutant binding score and the fold change (wild-type score/mutant score). Depending on the --top-score-metric parameter setting, this filter is applied to either the median score across all chosen prediction algorithms (default) or the best (lowest) score among the chosen prediction algorithms. (ii) Coverage filter: this filter accepts VAF and coverage information from the tumor DNA, tumor RNA, and normal DNA if these values are available in the input VCF. (iii) Transcript-support-level (TSL) filter: this filter evaluates each transcript's support level if this information was provided by VEP in the VCF. (iv) Top-score filter: this filter picks the top mutant peptide for each variant using the binding affinity as the determining factor. This filter is implemented to select only the best candidate from among multiple candidates that could result from a single variant due to different peptide lengths, variant registers, transcript sequences, and HLA alleles. The result of these filtering steps is reported in a filtered report TSV. The remaining neoantigens are then annotated with cleavage site and stability predictions by NetChop and NetMHCstabPan, respectively, and a relative rank (as described in the Materials and Methods: Ranking of neoantigens) is assigned. The rank-ordered final output is reported in a condensed file. The pandas package is used for data management while filtering and ranking the neoantigen candidates.
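The aggregation and the binding and top-score filters can be approximated by the following sketch. It is a simplified illustration under stated assumptions, not the pVACseq code: the function names, the dictionary layout, and the strict "less than" comparison are hypothetical, and the fold change is computed from the same aggregate (median or lowest) of the wild-type scores. The 500 nmol/L threshold and fold change of 0 mirror defaults mentioned in the text.

```python
from statistics import median
from typing import Dict, List


def aggregate(scores: Dict[str, float]) -> Dict[str, float]:
    """Collapse per-algorithm IC50 predictions into lowest (best) and median scores."""
    values = list(scores.values())
    return {"lowest": min(values), "median": median(values)}


def passes_binding_filter(
    mt_scores: Dict[str, float],
    wt_scores: Dict[str, float],
    metric: str = "median",
    threshold: float = 500.0,
    min_fold_change: float = 0.0,
) -> bool:
    """Keep a candidate whose mutant score and fold change meet both cutoffs."""
    mt = aggregate(mt_scores)[metric]
    wt = aggregate(wt_scores)[metric]
    fold_change = wt / mt  # wild-type score / mutant score
    return mt < threshold and fold_change >= min_fold_change


def top_per_variant(rows: List[Dict]) -> List[Dict]:
    """Keep only the lowest-scoring (strongest binding) peptide for each variant."""
    best: Dict[str, Dict] = {}
    for row in rows:
        key = row["variant"]
        if key not in best or row["score"] < best[key]["score"]:
            best[key] = row
    return list(best.values())


# Invented predictions (nmol/L) for one mutant peptide and its wild-type match.
mt = {"NetMHCpan": 120.0, "MHCflurry": 300.0, "SMM": 900.0}
wt = {"NetMHCpan": 2400.0, "MHCflurry": 3000.0, "SMM": 4500.0}
rows = [
    {"variant": "v1", "peptide": "AAA", "score": 300.0},
    {"variant": "v1", "peptide": "BBB", "score": 120.0},
    {"variant": "v2", "peptide": "CCC", "score": 450.0},
]
kept = top_per_variant(rows)
```

In this example the median metric is the more conservative choice for inclusion: a single optimistic algorithm (120 nmol/L) cannot by itself carry a candidate past a 200 nmol/L threshold, whereas the lowest metric would let it through.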

pVACvector

When running pVACvector with a pVACseq output file, the original input VCF must also be provided (--input-vcf parameter). The VCF is used to extract a larger peptide sequence around the target neoantigen (length determined by the --input-n-mer parameter). Alternatively, a list of target peptide sequences can be provided in a FASTA file. The set of peptide sequences is then combined in all possible pairs, and an ordering of peptides for the vector is produced as follows:

To determine the optimal order of peptide–spacer–peptide combinations, binding predictions are made for all peptide registers overlapping the junction. A directed graph is then constructed, with nodes defined as target peptides, and edges representing junctions. The score of each edge is defined as the lowest binding score of its junctional peptides (a conservative metric). Edges with scores below the threshold are removed, and if heuristics indicate that a valid graph may exist, a simulated annealing procedure is used to identify a path through the nodes that maximizes junction scores (preserving the weakest overall predicted binding for junctional epitopes). If no valid ordering is found, additional “spacer” amino acids are added to each junction, binding affinities are recalculated, and a new graph is constructed and tested, setting edge weights equal to that of the best performing (highest binding score) peptide–junction–spacer combination.
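The graph step can be made concrete with a small Python sketch. Note the substitution: instead of simulated annealing, this example enumerates every ordering exhaustively, which is only feasible for a handful of peptides. The junction scores are invented, and summing edge scores is one plausible reading of "maximizes junction scores" that is assumed here.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (left peptide, right peptide)


def best_ordering(
    peptides: List[str],
    junction_scores: Dict[Edge, float],
    threshold: float,
) -> Optional[List[str]]:
    """Return the ordering with the highest total junction score, or None.

    Each junction score is the lowest (strongest) predicted IC50 among the
    peptides spanning that junction, so higher values mean weaker binding.
    """
    # Remove edges whose strongest junctional binder falls below the threshold.
    edges = {edge: score for edge, score in junction_scores.items() if score >= threshold}
    best_path: Optional[List[str]] = None
    best_total = float("-inf")
    for order in permutations(peptides):
        pairs = list(zip(order, order[1:]))
        if all(pair in edges for pair in pairs):
            total = sum(edges[pair] for pair in pairs)
            if total > best_total:
                best_path, best_total = list(order), total
    return best_path


# Hypothetical junction scores in nmol/L for three target peptides.
junction_scores = {
    ("A", "B"): 800.0, ("B", "A"): 300.0,
    ("A", "C"): 2000.0, ("C", "A"): 600.0,
    ("B", "C"): 1500.0, ("C", "B"): 450.0,
}
order = best_ordering(["A", "B", "C"], junction_scores, threshold=500.0)
no_order = best_ordering(["A", "B", "C"], junction_scores, threshold=1000.0)
```

At the stricter threshold no ordering survives. That is the situation in which the procedure described above would add spacer amino acids and rebuild the graph.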

The spacers used for pVACvector are set by the user with the --spacers parameter. This parameter defaults to None, AAY, HHHH, GGS, GPGPG, HHAA, AAL, HH, HHC, HHH, HHHD, HHL, and HHHC, where None is the placeholder for testing junctions without a spacer sequence. Spacers are tested iteratively, starting with the first spacer in the list and adding subsequent spacers if no valid path is found. If the user wishes to specify a custom spacer such as the furin cleavage site (e.g., R-X-(R/K)-R), they can do so (e.g., --spacers RKKR).

If no result is found after testing with the full set of spacers, the ends of “problematic” peptides, where all junctions contain at least one well-binding epitope, will be clipped by removing one amino acid at a time and then repeating the above binding and graph-building process. This clipping may be repeated up to the number of times specified in the --max-clip-length parameter.

It may be necessary to explore the parameter space when running pVACvector. As binding predictions for some sites vary substantially across algorithms, the most conservative settings may result in no valid paths, often due to one “outlier” prediction. Carefully selecting which predictors to run may help ameliorate this issue as well. In general, setting a higher binding threshold (e.g., 1,000 nmol/L) and using the median binding value (--top-score-metric median) will lead to a greater possibility of a design, whereas more conservative settings of 500 nmol/L and lowest/best binding value (--top-score-metric lowest) will give more confidence that there are no junctional neoepitopes.

Our current recommendation is to run pVACvector several different ways and choose the path resulting from the most conservative set of parameters that produces a vector design containing all candidate neoantigens.

pVACapi and pVACviz

pVACapi is implemented using the Python libraries Flask and Bokeh. The pVACviz client is written in TypeScript using the Angular Web application framework, the Clarity UI component library, and the ngrx library for managing application state.

Data availability

Data from 100 cases each of melanoma, hepatocellular carcinoma, and lung squamous cell carcinoma were obtained from TCGA (33) and downloaded via the GDC. These data can be accessed under dbGaP study accession phs000178. Data for demonstration and analysis of fusion neoantigens were downloaded from the GitHub repo for Integrate (https://github.com/ChrisMaherLab/INTEGRATE-Vis/tree/master/example).

Software availability

The pVACtools codebase is hosted publicly on GitHub at https://github.com/griffithlab/pVACtools and https://github.com/griffithlab/BGA-interface-projects (pVACviz). User documentation is available at http://www.pvactools.org. This project is licensed under the Non-Profit Open Software License version 3.0 (NPOSL-3.0, https://opensource.org/licenses/NPOSL-3.0). pVACtools has been packaged and uploaded to PyPI under the “pvactools” package name and can be installed on Linux systems by running the “pip install pvactools” command. Installation requires a Python 3.5 environment, which can be emulated by using Conda. Versioned Docker images are available on DockerHub (https://hub.docker.com/r/griffithlab/pvactools/). Releases are also made available on GitHub (https://github.com/griffithlab/pVACtools/releases).

Example pipeline for creation of pVACtools input files

pVACtools was designed to support a standard VCF variant file format and thus should be compatible with many existing variant calling pipelines. However, as a reference, we provide the following description of our current somatic and expression analysis pipeline (manuscript in preparation), which has been implemented using Docker, Common Workflow Language (CWL; ref. 34), and Cromwell (35). The pipeline consists of workflows for alignment of exome DNA and RNA sequencing (RNA-seq) data, somatic and germline variant detection, RNA-seq expression estimation, and HLA typing.

This pipeline begins with raw patient tumor exome and cDNA capture (36) or RNA-seq data and leads to the production of annotated VCFs for neoantigen identification and prioritization with pVACtools.


Our pipeline consists of several modular components: DNA alignment, HLA typing, germline and somatic variant detection, variant annotation, and RNA-seq analysis. More specifically, we use BWA-MEM (15) for aligning the patient's tumor and normal exome data. For germline variant calling, the output BAM then undergoes merging [merging of separate instrument data; Samtools (37) Merge], query name sorting (Picard SortSam), duplicate marking (Picard MarkDuplicates), position sorting (Picard SortSam), and base quality recalibration (GATK BaseRecalibrator). GATK's HaplotypeCaller (18) is used for germline variant calling, and the output variants are annotated using VEP (38) and filtered for coding sequence variants.

For somatic variant calling, the pipeline is consistent with the above except that the variant calling step combines the output of four variant detection algorithms: Mutect2 (39), Strelka (40), VarScan (16, 17), and Pindel (41). The combined variants are normalized using GATK's LeftAlignAndTrimVariants to left-align the indels and trim common bases. Vt (42) is used to split multiallelic variants. Several filters, such as gnomAD allele frequency (maximum population allele frequency), percentage of mapq0 reads, and pass-only variants, are applied prior to annotation of the VCF using VEP. We use a combination of custom and standard plugins for VEP annotation (parameters: --format VCF --plugin Downstream --plugin Wildtype --symbol --term SO --transcript_version --tsl --coding_only --flag_pick --hgvs). Variant coverage is assessed using bam-readcount (https://github.com/genome/bam-readcount) for both the tumor and normal DNA exome data, and this information is also annotated into the VCF output using VAtools (http://vatools.org).

Our pipeline also generates a phased VCF file by combining both the somatic and germline variants and running the sorted combined variants through GATK ReadBackedPhasing.

For RNA-seq data, the pipeline first trims adapter sequences using Flexbar (43) and aligns the patient's tumor RNA-seq data using HISAT2 (44). Two different methods, Stringtie (45) and Kallisto (46), are employed for evaluating both the transcript and gene expression values. In addition, coverage support for variants in RNA-seq data is determined with bam-readcount. This information is added to the VCF using VAtools and serves as an input for neoantigen prioritization with pVACtools.

Optionally, our pipeline can also run HLA typing in silico using OptiType (47) when clinical HLA typing is not available.

We aim to support various versions of these pipelines (via the use of standardized file formats) and are also actively developing an end-to-end public CWL workflow starting from sequence reads to pVACseq output (https://github.com/genome/analysis-workflows/blob/master/definitions/pipelines/immuno.cwl).

Results

The pVACtools workflow (Fig. 1) is divided into modular components that can be run independently. The main tools in the workflow are: (i) pVACseq: a significantly enhanced and reengineered version of our previous tool (48) for identifying and prioritizing neoantigens from a variety of tumor-specific alterations, (ii) pVACfuse: a tool for detecting neoantigens resulting from gene fusions, (iii) pVACviz: a graphical user interface Web client for process management, visualization, and selection of results from pVACseq, (iv) pVACvector: a tool for optimizing design of neoantigens and nucleotide spacers in a DNA vector that prevents high-affinity junctional neoantigens, and (v) pVACapi: an OpenAPI HTTP REST interface to the pVACtools suite.

Figure 1.

Overview of pVACtools workflow. The pVACtools workflow is highly modularized and is divided into flexible components that can be run independently. The main tools under the workflow include pVACseq for identifying and prioritizing neoantigens from a variety of somatic alterations (red inset box); pVACfuse (green) for detecting neoantigens resulting from gene fusions; pVACviz (blue) for process management, visualization, and selection of results; and pVACvector (orange) for optimizing design of neoantigens and nucleotide spacers in a DNA vector. All of these tools interact via the pVACapi (purple), an OpenAPI HTTP REST interface to the pVACtools suite.

pVACseqpVACseq (48) has been reimplemented in Python3 and extended to

include many new features since our initial report of its release.pVACseq no longer requires a custom input format for variants, andnow uses a standard VCF file annotated with VEP (38). In our ownneoantigen identification pipeline, this VCF is obtained by mergingresults frommultiple somatic variant callers and RNA expression tools(as described inMaterials andMethods: Example pipeline for creationof pVACtools input files). Information that is not natively available inthe VCF output from somatic variant callers (such as coverage andVAFs for RNA and DNA, as well as gene and transcript expressionvalues) can be added to the VCF using VAtools (http://vatools.org), asuite of accessory routines that we created to accompany pVACtools.pVACseq queries these features directly from the VCF, enablingprioritization and filtering of neoantigen candidates based on sequencecoverage and expression information. In addition, pVACseq nowmakes use of phasing information taking into account variants prox-imal to somatic variants of interest. Because proximal variants canchange the peptide sequence and affect neoantigen-binding predic-tions, this is important for ensuring that the selected neoantigenscorrectly represent the individual's genome (11).Wehave also expand-ed the supported variant types for neoantigen predictions to includein-frame indels and frameshift variants. These capabilities expand thepotential number of targetable neoantigens several-fold in manytumors (refs. 12, 14; Supplementary Fig. S1).

To prioritize neoantigens, pVACseq now offers support for eightdifferent MHC Class I antigen prediction algorithms and four MHCClass II prediction algorithms. The tool does this in part byleveraging the IEDB (49) and their suite of six different MHCClass I prediction algorithms, as well as three MHC Class IIalgorithms (as described in Materials and Methods: Neoantigenprediction). pVACseq supports local installation of these tools forhigh-throughput users, access through a docker container (https://hub.docker.com/r/griffithlab/pvactools), and ready-to-go accessvia the IEDB RESTful Web interface. In addition, pVACseq nowcontains an extensible framework for supporting new neoantigenprediction algorithms that has been used to add support fortwo new non-IEDB algorithms—MHCflurry (26) and MHCnuggets(27). By creating a framework that integrates many tools, we allowfor (i) a broader ensemble approach than IEDB and (ii) a systemthat other users can leverage to develop improved ensemble rank-ing, or to integrate proprietary or not-yet-public prediction soft-ware. This framework enables non-informatics expert users topredict neoantigens from sequence variant data sets.

Once neoantigens have been predicted, binding affinity values from the selected prediction algorithms are aggregated into a median binding score and a report file is generated. Finally, a rank is calculated that can be used to prioritize the predicted neoantigens. The rank takes into account gene expression, sequence read coverage, binding affinity predictions, and agretopicity (as described in the Materials and Methods: Ranking of neoantigens). The pVACseq rank allows users to pick the exemplar peptide for each variant from a large number of lengths, registers, alternative transcripts, and HLA alleles. In addition to applying strict binding affinity cutoffs, the pipeline also offers support for MHC allele-specific cutoffs (50). The allele-specific binding filter incorporates allele-specific thresholds from IEDB for the 38 most common HLA-A and HLA-B alleles, representative of the nine major supertypes, instead of the 500 nmol/L default cutoff. Because these allele-specific thresholds only include cutoffs for HLA-A and HLA-B alleles, we recommend evaluating binding predictions for HLA-C alleles separately with a less stringent cutoff (such as 1,000 nmol/L) and merging results when picking final candidates.
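The aggregation and cutoff logic described above can be illustrated with a minimal Python sketch. It is not the pVACseq implementation: the function names are hypothetical, the two entries in the threshold table are placeholder values rather than the actual IEDB thresholds, and applying the 1,000 nmol/L HLA-C cutoff only in allele-specific mode is our reading of the recommendation.

```python
from statistics import median

DEFAULT_CUTOFF_NM = 500.0   # default binding cutoff named in the text
HLA_C_CUTOFF_NM = 1000.0    # less stringent HLA-C cutoff suggested in the text

# Placeholder values for illustration only; the real allele-specific
# thresholds are supplied by IEDB for 38 HLA-A and HLA-B alleles.
ALLELE_SPECIFIC_CUTOFFS_NM = {"HLA-A*02:01": 255.0, "HLA-B*07:02": 687.0}


def median_binding_score(ic50_by_algorithm):
    """Median IC50 (nmol/L) across the algorithms that returned a prediction."""
    return median(ic50_by_algorithm.values())


def cutoff_for(allele, allele_specific=False):
    """Choose the IC50 cutoff for one allele (assumed scheme, see lead-in)."""
    if allele_specific:
        if allele in ALLELE_SPECIFIC_CUTOFFS_NM:
            return ALLELE_SPECIFIC_CUTOFFS_NM[allele]
        if allele.startswith("HLA-C"):
            return HLA_C_CUTOFF_NM
    return DEFAULT_CUTOFF_NM


def passes_binding_filter(ic50_by_algorithm, allele, allele_specific=False):
    """True when the median IC50 is at or below the chosen cutoff."""
    score = median_binding_score(ic50_by_algorithm)
    return score <= cutoff_for(allele, allele_specific)
```

With three hypothetical predictions of 100, 300, and 900 nmol/L, the median is 300 nmol/L, so the peptide passes the 500 nmol/L default but would fail a stricter allele-specific threshold such as the 255 nmol/L placeholder.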

We also offer cleavage position predictions via optional processing through NetChop (51) as well as stability predictions made by NetMHCstabPan (52).

pVACfuse

Previous studies show that the novel protein sequences produced by known driver gene fusions frequently produce neoantigen candidates (53). pVACfuse provides support for predicting neoantigens from such gene fusions. One possible input to pVACfuse is a set of annotated fusion files from AGFusion (54), thus enabling support for a variety of fusion callers such as STAR-Fusion (55), FusionCatcher (56), JAFFA (57), and TopHat-Fusion (58). pVACfuse also supports annotated BEDPE format from any fusion caller, which can, for example, be generated using INTEGRATE-Neo (53). These variants are then assessed for presence of fusion neoantigens using predictions from any of the pVACseq-supported binding prediction algorithms.

pVACviz and pVACapi

Implementing cancer vaccines in a clinical setting requires multidisciplinary teams, which may not include informatics experts. To support this growing community of users, we developed pVACviz, which is a browser-based user interface that assists in launching, managing, reviewing, and visualizing the results of pVACtools processes. Instead of interacting with the tools via terminal/shell commands, the pVACviz client provides a modern Web-based user experience. Users complete a pVACseq (Fig. 2) process setup form that provides helpful documentation and suggests valid values for inputs. The client also provides views showing ongoing processes, their logs, and interim data files to aid in managing and troubleshooting. After a process has completed, users may examine the results as a filtered data table or as a scatterplot visualization, allowing them to curate results and save them as a CSV file for further analysis. Extensive documentation of the visualization interface can be found in the online documentation (https://pvactools.readthedocs.io/en/latest/pvacviz.html).

To support informatics groups that want to incorporate or build upon the pVACtools features, we developed pVACapi, which provides an HTTP REST interface to the pVACtools suite. It provides the API that pVACviz uses to interact with the pVACtools suite. Advanced users could develop their own user interfaces or use the API to control multiple pVACtools installations remotely over an HTTP network.

pVACvector

Once a list of neoantigen candidates has been prioritized and selected, the pVACvector utility can be used to aid in the construction of DNA-based cancer vaccines. The input is either the output file from pVACseq or a FASTA file containing peptide sequences.

pVACtools: Toolkit for Tumor Neoantigen Characterization

AACRJournals.org Cancer Immunol Res; 8(3) March 2020 413


pVACvector returns a neoantigen sequence ordering that minimizes the effects of junctional peptides (which may create novel antigens) between the sequences (Fig. 3). This is accomplished by using the core pVACseq module to predict the binding scores for each junctional peptide and by modifying junctions with spacer (59, 60) amino acid sequences, or by trimming the ends of the peptides in order to reduce reactivity. The final vaccine ordering is achieved through a simulated annealing procedure (61) that returns a near-optimal solution, when one exists (details can be found in the Materials and Methods: Implementation).
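To make the ordering step concrete, the sketch below runs a generic simulated annealing search over peptide orderings in Python. It is an illustration under stated assumptions, not the pVACvector implementation: `junction_penalty` is a hypothetical user-supplied function (for example, a penalty for any junction predicted to bind strongly), the ordering is treated as linear rather than circular, and spacer insertion and end trimming are left out.

```python
import math
import random


def ordering_cost(order, junction_penalty):
    """Sum of junction penalties over adjacent peptide pairs."""
    return sum(junction_penalty(a, b) for a, b in zip(order, order[1:]))


def anneal_order(peptides, junction_penalty, steps=5000,
                 t_start=1.0, t_end=1e-3, seed=0):
    """Search for a low-cost ordering by swapping two peptides per step."""
    rng = random.Random(seed)
    current = list(peptides)
    current_cost = ordering_cost(current, junction_penalty)
    best, best_cost = current[:], current_cost
    if len(current) < 2:
        return best, best_cost
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i, j = rng.sample(range(len(current)), 2)
        candidate = current[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        cand_cost = ordering_cost(candidate, junction_penalty)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse orderings with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
    return best, best_cost
```

A caller would supply a penalty derived from junctional binding predictions, for instance 1 for every junction with a predicted IC50 below 500 nmol/L and 0 otherwise, so that a returned cost of 0 corresponds to an ordering with no strong-binding junctional epitopes.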

User community

pVACtools has been used to predict and prioritize neoantigens for several immunotherapy studies (62–64) and cancer vaccine clinical trials (e.g., NCT02348320, NCT03121677, NCT03122106, NCT03092453, and NCT03532217). We also have a large external user community that has been actively evaluating and using these packages for their neoantigen analysis and has aided in subsequent refinement of pVACtools through extensive feedback. The original “pvacseq” package has been downloaded over 48,000 times from PyPI, the “pvactools” package has been downloaded over 34,000 times, and the “pvactools” docker image has been pulled over 37,000 times.

Figure 2.

pVACviz GUI client. pVACtools offers a browser-based graphical user interface, pVACviz, which provides an intuitive means to launch pipeline processes, monitor their execution, and analyze, export, or archive their results. To launch a process, users navigate to the Start Page (A) and complete a form containing all of the relevant inputs and settings for a pVACseq process. Each form field includes help text and provides typeahead completion where applicable. For instance, the alleles field provides a typeahead dropdown menu that filters available alleles to display only those alleles that match the text that the user has entered. Once a process is launched, a user may monitor its progress on the Manage Page (B), which lists all running, stopped, and completed processes. The Details Page (C) shows a process’s current log, attributes, and any results files as well as provides controls for stopping, restarting, exporting, and archiving the process. The results of pipeline processes may be analyzed on the Visualize Page (D), which displays a customizable scatterplot of a file's rows. The x- and y-axes may be set to any column in the result set, and filters may be applied to values in any column. In addition, points may be selected on the scatter plot or data grid (not visible in this figure) for further analysis or export as CSV files.

Figure 3.

An example pVACvector output showing the optimum arrangement of candidate neoantigens for a DNA vector–based vaccine design. The figure depicts a circularized DNA insert carrying 10 encoded neoantigenic peptide sequences to be synthesized and encoded/cloned into a DNA plasmid. DNA sequences encoding each peptide are ordered (with use of spacer sequences where needed) to ensure there are no strong-binding junctional epitopes. Each neoantigenic peptide candidate is shown in blue, green, red, orange, purple, and brown. To minimize junctional epitope affinity, pVACvector adds spacer sequences where needed. These are depicted in black, along with the binding affinity value of the junctional epitope. Labels represent the gene name and amino acid change for each candidate.

Hundal et al.

Cancer Immunol Res; 8(3) March 2020 CANCER IMMUNOLOGY RESEARCH414

Analysis of TCGA data using pVACtools

To demonstrate the utility and performance of the pVACtools package, we downloaded exome sequencing and RNA-seq data from TCGA (33) from 100 cases each of melanoma, hepatocellular carcinoma, and lung squamous cell carcinoma and used patient-specific MHC Class I alleles (Supplementary Fig. S2) to determine neoantigen candidates for each patient. There were a total of 64,422 VEP-annotated variants reported across 300 samples, with an average of 214 variants per sample. Of these, 61,486 were single nucleotide variants (SNV), 479 were in-frame insertions and deletions, and 2,465 were frameshift mutations (Supplementary Fig. S1). We used this annotated list of variants as input to the pVACseq component of pVACtools to predict neoantigenic peptides. pVACseq reported 14,599,993 unfiltered neoantigen candidates. The original version of pVACseq (20) reported 10,284,467 neoantigens. This demonstrated that, by extending support for additional variant types as well as prediction algorithms (due to support for additional alleles), we produced 42% more raw candidate neoantigens.

After applying our default median binding affinity cutoff of 500 nmol/L across all eight MHC Class I prediction algorithms, there were 96,235 predicted strong-binding neoantigens, derived from 34,552 somatic variants (32,788 missense SNVs, 1,603 frameshift variants, and 131 in-frame indels). This set of strong binders was further reduced by filtering out mutant peptides with median predicted binding affinities (across all prediction algorithms) greater than that of the corresponding wild-type peptide (i.e., mutant/wild-type binding affinity fold change >1), resulting in 70,628 neoantigens from 28,588 variants (26,880 SNVs, 1,583 frameshift, and 125 in-frame indels).

This set was subsequently filtered on exome sequencing coverage using our recommended defaults. Applying a VAF cutoff of >25% in tumors and <2% in normal samples, with at least 10× tumor coverage and at least 5× normal coverage, yielded 10,730 neoantigens from 4,891 associated variants (4,826 SNVs, 56 frameshift, and 9 in-frame indels), with an average of 36 neoantigens predicted per case. Because RNA-seq data also were available, the filtering criteria included RNA-based coverage filters (tumor RNA VAF >25% and tumor RNA coverage >10×) as well as a gene expression filter (FPKM >1). To condense the results even further, only the top ranked neoantigen was selected per variant (“top-score filter”) across all alleles, lengths, and registers (position of amino acid mutation within peptide sequence), resulting in 4,891 total neoantigens with an average of 16 neoantigens per case. This list was then processed with pVACvector to determine the optimum arrangement of the predicted high-quality neoantigens for a DNA vector–based vaccine design (Fig. 3).
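The top-score filter can be pictured as a per-variant reduction. The Python sketch below is a simplified illustration, not the pVACseq code: the record fields `variant`, `peptide`, and `rank` are hypothetical, and a smaller `rank` is assumed to mean a better candidate.

```python
def top_score_filter(candidates):
    """Keep only the best-ranked candidate for each variant.

    Each candidate is a dict with hypothetical keys "variant", "peptide",
    and "rank" (1 = best). Candidates for the same variant may differ in
    allele, peptide length, or register.
    """
    best = {}
    for cand in candidates:
        key = cand["variant"]
        if key not in best or cand["rank"] < best[key]["rank"]:
            best[key] = cand
    return list(best.values())
```

Because exactly one candidate survives per variant, the number of neoantigens after this step equals the number of variants that passed the earlier filters, which is consistent with the 4,891 neoantigens from 4,891 variants reported above.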

Correlations and bias among binding prediction tools

Because we offer support for as many as eight different Class I binding prediction tools, we assessed agreement in binding affinity predictions (IC50) between these algorithms from a random subset of 100,000 neoantigen peptides from our TCGA analysis, described above (Fig. 4). The highest correlation was observed between the two algorithms that are based on a stabilization matrix method (SMM): SMM and SMMPMBEC. The next best correlation was observed between NetMHC and MHCflurry, possibly due to both being allele-specific predictors employing neural network-based models. Overall, the correlation between prediction algorithms is low (mean correlation of 0.388 and range of 0.18–0.89 between all pairwise comparisons of algorithms).
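For readers who want to reproduce this kind of comparison on their own predictions, the following Python sketch computes pairwise Spearman correlations using only the standard library. It is a generic illustration, not the analysis code used here: the input layout (a dict mapping each algorithm name to a list of IC50 values for the same peptides in the same order) is an assumption, and constant input vectors are not handled.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def pairwise_spearman(predictions):
    """Correlation for every unordered pair of algorithms."""
    return {(a, b): spearman(predictions[a], predictions[b])
            for a, b in combinations(sorted(predictions), 2)}
```

Eight algorithms give 28 unordered pairs, so the mean and range quoted above would each summarize 28 correlation values.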

We also evaluated if there were any biases among the algorithms to predict strong-binding epitopes (i.e., binding affinity ≤500 nmol/L) by plotting the number of predicted peptides based on their respective strong-predicting algorithms (Fig. 5; Supplementary Fig. S3). We found that MHCnuggets (v2.2) predicted the highest number of strong-binding candidates alone (Fig. 5A). Of the total number of strong-binding candidates predicted, 64.7% of these candidates were predicted by a single algorithm (any one of the eight algorithms), 35.2% were predicted as strong-binders by two to seven algorithms, and only 1.8% of the strong-binding candidates were predicted as strong binders by the combination of all eight algorithms (Supplementary Fig. S3). In fact, as shown in Fig. 6, even if one (or more) algorithms predict a peptide to be a strong binder, often another algorithm will disagree by a large margin, in some cases predicting that same peptide as a very weak binder. This remarkable lack of agreement underscores the potential value of an ensemble approach that considers multiple algorithms.
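The agreement tally behind this kind of summary can be sketched as follows. This is an illustrative Python example rather than the code used for Fig. 5: the input layout (peptide mapped to a dict of per-algorithm IC50 values) and the function names are assumptions.

```python
from collections import Counter


def exclusive_intersections(ic50_table, cutoff=500.0):
    """Count peptides by the exact set of algorithms calling them strong binders.

    ic50_table maps peptide -> {algorithm name: IC50 in nmol/L}.
    Peptides that no algorithm calls a strong binder are skipped.
    """
    counts = Counter()
    for scores in ic50_table.values():
        callers = frozenset(a for a, v in scores.items() if v <= cutoff)
        if callers:
            counts[callers] += 1
    return counts


def agreement_fractions(counts):
    """Fraction of strong-binding peptides by number of agreeing algorithms."""
    total = sum(counts.values())
    by_size = Counter()
    for callers, n in counts.items():
        by_size[len(callers)] += n
    return {size: n / total for size, n in sorted(by_size.items())}
```

Each key of the first result corresponds to one column of an UpSet-style plot (an exclusive intersection), and the second result collapses those columns by how many algorithms agree.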

Next we determined if the number of human HLA alleles supported by these eight algorithms differed. As shown (Supplementary Fig. S4), MHCnuggets supports the highest number of human HLA alleles.

Comparison of filtering criteria

Because pVACtools offers a multitude of ways to filter a list of predicted neoantigens, we evaluated and compared the effect of each filter for selecting high-quality neoantigen candidates. There were 14,599,993 unfiltered neoantigen candidates predicted by pVACseq resulting from 64,422 VEP-annotated variants reported across 300 samples.

We first compared the effect of running pVACtools with the commonly used standard binding score cutoff of ≤500 nmol/L (parameters: -b 500) to the newly added allele-specific score filter (parameters: -a). These cutoffs were, by default, applied to the “median” binding score of all prediction algorithms. In total, 96,235 neoantigens (average 320.8 per patient) were predicted using the 500 nmol/L binding score cutoff compared with 94,068 neoantigens (average 313.6 per patient) using allele-specific filters. About 79% of neoantigens were shared between the two sets.

Figure 4.

Correlation between prediction algorithms. Spearman correlation between prediction values from all eight Class I prediction algorithms (MHCflurry, MHCnuggets, NetMHC, NetMHCcons, NetMHCpan, PickPocket, SMM, and SMMPMBEC) generated from a random subsample of 100,000 peptides.

We also further narrowed these sets to include only those predictions where the default “median” predicted binding affinities (≤500 nmol/L) were lower than each corresponding median wild-type peptide affinity (parameters: -b 500 -c 1; i.e., a binding affinity mutant/wild-type ratio or differential agretopicity value indicating that the mutant version of the peptide is a stronger binder; ref. 21). Using the agretopicity value filter, 70,628 neoantigens (average 235.43 per patient) were predicted versus the previously reported 96,235 neoantigens without this filter.
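A small Python sketch may help make the agretopicity criterion concrete. It is an illustration under assumptions, not the pVACseq code: the fold change is defined here as wild-type IC50 divided by mutant IC50 (so values above 1 mean the mutant peptide is the stronger binder), the strict comparison follows the wording "lower than" in the text, and the field names are hypothetical.

```python
def fold_change(wt_ic50, mt_ic50):
    """Wild-type IC50 divided by mutant IC50 (assumed definition)."""
    return wt_ic50 / mt_ic50


def passes_agretopicity(candidate, min_fold_change=1.0):
    """True when the mutant peptide binds more strongly than the wild type."""
    fc = fold_change(candidate["wt_ic50"], candidate["mt_ic50"])
    return fc > min_fold_change


def filter_candidates(candidates, binding_cutoff=500.0, min_fold_change=1.0):
    """Apply the binding cutoff first, then the agretopicity criterion."""
    return [c for c in candidates
            if c["mt_ic50"] <= binding_cutoff
            and passes_agretopicity(c, min_fold_change)]
```

For example, a mutant peptide at 100 nmol/L whose wild-type counterpart is at 400 nmol/L has a fold change of 4 and is kept, whereas a mutant at 300 nmol/L with a wild type at 150 nmol/L passes the binding cutoff but is removed by the agretopicity filter.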

Figure 5.

Intersection of peptide sequences predicted by different algorithms (MHCflurry, MHCnuggets, NetMHC, NetMHCcons, NetMHCpan, PickPocket, SMM, and SMMPMBEC). The y-axis displays the number of overlapping unique neoantigenic peptides predicted for each combination of algorithm depicted on the x-axis. Each collection of connected circles shows the set contained in an exclusive intersection (i.e., the identity of each algorithm), whereas the light gray circles represent the algorithm(s) that does not participate in this exclusive intersection. A, Upset plot for the top 20 algorithm combinations ranked by the number of peptides predicted to be a good binder (mutant IC50 score <500 nmol/L). The combination of all eight algorithms (highlighted orange) ranks eighth highest. B, Upset plot for algorithm combinations where at least six algorithms agree on predicting a peptide to be a good binder (mutant IC50 score <500 nmol/L). The combination of all eight algorithms (highlighted orange) ranks the highest.


We also evaluated the effect of the agretopicity value filter when applied to the set of peptides filtered on “lowest” binding score of ≤500 nmol/L (parameters: -b 500 -c 1 -m lowest). This filter limits peptides to those where at least one of the algorithms predicts a strong binder, instead of calculating a median score and requiring that to meet the 500 nmol/L threshold (Fig. 6). Using the lowest binding score filter resulted in an 11-fold increase in the number of candidates predicted (827,423 candidates, average 2,758.08 per patient). This approach may be useful for finding candidates in tumors with low mutation burden (but may also presumably lead to a higher rate of false positives).
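The difference between the two score metrics can be shown with a short Python sketch. The function names are hypothetical and the code only mirrors the behaviour described in the text for the "median" and "lowest" settings.

```python
from statistics import median


def summary_score(ic50_by_algorithm, metric="median"):
    """Collapse per-algorithm IC50 values (nmol/L) into one score."""
    values = list(ic50_by_algorithm.values())
    if metric == "median":
        return median(values)
    if metric == "lowest":
        return min(values)
    raise ValueError(f"unknown metric: {metric!r}")


def is_strong_binder(ic50_by_algorithm, cutoff=500.0, metric="median"):
    """Compare the chosen summary score against the binding cutoff."""
    return summary_score(ic50_by_algorithm, metric) <= cutoff
```

A peptide scored at 120, 2,500, and 4,000 nmol/L by three algorithms fails under the median metric (2,500 nmol/L) but passes under the lowest metric (120 nmol/L), which is why the lowest setting admits many more candidates.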

Using the median (default) binding affinity filtering criteria, we next applied coverage and expression-based filters. First, we filtered using the recommended defaults, i.e., greater than 5× normal DNA coverage, less than 2% normal VAF, greater than 10× tumor RNA and DNA coverage, and greater than 25% tumor RNA and DNA VAF, along with FPKM >1 for transcript level expression (parameters: --normal-cov 5 --tdna-cov 10 --trna-cov 10 --normal-vaf 0.02 --tdna-vaf 0.25 --trna-vaf 0.25 --expn-val 1). A total of 10,730 neoantigens were shortlisted across all samples with an average of 35 neoantigens per case. We then compared this set with slightly more stringent criteria using tumor DNA and RNA VAF of 40% (parameters: --tdna-vaf 0.40 --trna-vaf 0.40). This shortened our list of predicted neoantigens to 4,073 candidates with an average of 13 candidates per patient.
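The coverage and expression criteria listed above translate into a simple predicate. The Python sketch below is an illustration, not the pVACseq filter: the threshold names echo the command-line parameters quoted in the text, the candidate field names are hypothetical, and the strict inequalities follow the "greater than" and "less than" wording, so the tool itself may treat boundary values differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    normal_cov: int = 5       # --normal-cov
    tdna_cov: int = 10        # --tdna-cov
    trna_cov: int = 10        # --trna-cov
    normal_vaf: float = 0.02  # --normal-vaf
    tdna_vaf: float = 0.25    # --tdna-vaf
    trna_vaf: float = 0.25    # --trna-vaf
    expn_val: float = 1.0     # --expn-val (FPKM)


def passes_coverage_filters(cand, t=Thresholds()):
    """True when a candidate meets every coverage, VAF, and expression cutoff."""
    return (
        cand["normal_depth"] > t.normal_cov
        and cand["normal_vaf"] < t.normal_vaf
        and cand["tumor_dna_depth"] > t.tdna_cov
        and cand["tumor_dna_vaf"] > t.tdna_vaf
        and cand["tumor_rna_depth"] > t.trna_cov
        and cand["tumor_rna_vaf"] > t.trna_vaf
        and cand["fpkm"] > t.expn_val
    )
```

Passing `Thresholds(tdna_vaf=0.40, trna_vaf=0.40)` reproduces the more stringent comparison described above, under which a candidate with a tumor DNA VAF of 31% would no longer pass.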

Demonstration of neoantigen analysis using pVACfuse

To demonstrate the potential of neoantigens resulting from gene fusions, we analyzed TCGA prostate cancer RNA-seq data from 302 patients. This dataset was previously used as a demonstration set for the fusion neoantigen prediction supported by INTEGRATE-Neo (22). We wanted to assess the difference (if any) in neoantigen candidates reported by INTEGRATE-Neo using the one MHC Class I prediction algorithm it supports (NetMHC) versus an ensemble of eight Class I prediction algorithms supported in pVACfuse.

Using 1,619 gene fusions across 302 samples as input, pVACfuse reported 2,104 strong-binding neoantigens (binding affinity ≤500 nmol/L) resulting from 739 gene fusions. On average, there were about seven neoantigens per sample resulting from an average of two fusions per case. This is an 8-fold increase in the number of strong-binding neoantigens predicted by pVACfuse versus those reported by INTEGRATE-Neo alone, which reported 261 neoantigens across 210 fusions.

Validation of pVACtools results

To assess the real-world validity of pVACtools results, we reanalyzed raw sequencing data from published studies that had performed immunologic validation of candidate neoantigens (Supplementary Tables S2 and S3). Analysis of a patient with melanoma (pt3713; ref. 65) showed that pVACtools was able to recapitulate eight (80%) of the positively validated peptides in the filtered list of neoantigens using our suggested filtering methodology (median binding predicted IC50 <500 nmol/L, >5× DNA and RNA coverage, >10% DNA and RNA VAF, >1 TPM gene expression). Two of the positively validated peptides were filtered out due to low RNA depth and VAF values but had median binding affinity predictions of 13.707 and 4.799 nmol/L. We were also able to eliminate 68% of negatively validated peptides using these filters. In an additional analysis of a patient with chronic lymphocytic leukemia (pt5002; ref. 66), we were able to recapitulate two (100%) of the positively validated peptides in our filtered list of neoantigens using our suggested filtering methodology (median binding predicted IC50 <500 nmol/L, >5× DNA coverage, >10% DNA VAF). No RNA data were available for this sample. We were also able to eliminate 57% of negatively validated peptides using these filters.

Figure 6.

Overall distribution of binding affinity scores (nmol/L) for peptides where at least one of the algorithms predicted a strong binder. Out of 14,599,993 peptides predicted, 126,648 peptides were predicted to be strong binders by at least one of the following algorithms: MHCflurry, MHCnuggets, NetMHC, NetMHCcons, NetMHCpan, PickPocket, SMM, or SMMPMBEC. To define the set of peptides that were strong binders according to at least one algorithm, HLA allele subtype–specific thresholds were applied when available; otherwise, the default cutoff binding affinity of 500 nmol/L was used. The peptides were further filtered using the default coverage-based filters (normal coverage >5× and VAF <2%, tumor DNA coverage >10× and VAF >25%, tumor RNA coverage >10× and VAF >25%, transcript expression >1 FPKM). Peptides with predicted mutant IC50 scores lower than their respective cutoff scores are highlighted in orange, whereas those higher are plotted in gray. The median mutant IC50 scores of each algorithm's prediction (purple line) and the 500 nmol/L binding affinity cutoff (red dotted line) are indicated for reference.

Discussion

As reported from our demonstration analysis, a typical tumor has too many possible neoantigen candidates to be practical for a vaccine. There is therefore a critical need for a tool that takes in the input from a tumor sequencing analysis pipeline and reports a filtered and prioritized list of neoantigens. pVACtools enables a streamlined, accurate, and user-friendly analysis of neoantigenic peptides from NGS cancer datasets. This suite offers a complete and easily configurable end-to-end analysis, starting from somatic variants and gene fusions (pVACseq and pVACfuse, respectively), through filtering, prioritization, and visualization of candidates (pVACviz), and determining the best arrangement of candidates for a DNA vector vaccine (pVACvector). By supporting additional classes of variants as well as gene fusions, we increase the number of predicted neoantigens, which may be critical for low mutational burden tumors. Finally, by extending support for multiple binding prediction algorithms, we allow for a consensus approach. The need for this integrated approach is made abundantly clear by the disagreement between the candidate neoantigens identified by these algorithms, as observed in our demonstration analyses.

To support a wider range of variant events not easily representable in VCF format such as alternatively spliced transcripts that may create tumor antigens (67), upcoming versions of pVACtools will add a new tool, pVACbind, which can be used to execute our prediction pipeline from sequences contained in a FASTA input file. Another major planned feature is the addition of reference proteome similarity assessment of predicted peptides in order to exclude peptides that are not true neoantigens because they occur in other parts of the proteome. Upcoming versions will also include the calculation of various metrics for peptide manufacturability for use in synthetic long peptide-based vaccines. In addition, we are planning on adding binding affinity percentile ranks to our prediction outputs for those prediction algorithms that support it. This would also allow us to support additional IEDB prediction algorithms, such as Combinatorial Library (68) and Sturniolo (69). The IEDB software added support for various Class II epitope lengths starting with version 2.22. We plan on updating pVACtools in an upcoming version to take advantage of this new functionality. Lastly, we are working on a CWL immunotherapy pipeline that will execute all steps starting with alignment, somatic and germline variant calling, HLA-typing, and RNA-seq analysis, including fusion calling, followed by running pVACseq and pVACfuse. This will support users that would like a full end-to-end solution not supported by pVACtools itself.

The results from pVACtools analyses are already being used in dozens of cancer immunology studies, including studying the relationship between tumor mutation burden and neoantigen load to predict response to immunotherapies (64, 70, 71), and the design of cancer vaccines in ongoing clinical trials (e.g., NCT02348320, NCT03121677, NCT03122106, NCT03092453, and NCT03532217). Whether the predicted neoantigens will induce T-cell responses remains an open question for the field (72–74). Pipelines like pVACtools are meant to assist these groups to obtain a prioritized set of neoantigen candidates from among a myriad of possible epitope predictions. Pursuant to that, they should perform additional functional validations according to the standards in this field (73). Whereas it may not be possible to evaluate a de novo response to all predicted candidate neoantigens prior to vaccination, we highly recommend adding a similar validation step as part of the clinical workflow to determine those candidates that have a preexisting immune response. Validation data from the studies currently using pVACtools will be incorporated into our toolkit as they become available, and we hope that over time this will improve the prioritization process. We anticipate that pVACtools will make such analyses more robust, reproducible, and facile as these efforts continue.

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Authors’ Contributions

Conception and design: J. Hundal, S. Kiwala, J. McMichael, W.E. Gillanders, E.R. Mardis, O.L. Griffith, M. Griffith
Development of methodology: J. Hundal, S. Kiwala, J. McMichael, C.A. Miller, Y.-Y. Feng, A.P. Graubert, A.Z. Wollam, J. Neichin, O.L. Griffith, M. Griffith
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J. Hundal, C.J. Liu
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J. Hundal, H. Xia, C.J. Liu, S. Zhao, M. Griffith
Writing, review, and/or revision of the manuscript: J. Hundal, S. Kiwala, C.A. Miller, H. Xia, C.J. Liu, W.E. Gillanders, E.R. Mardis, O.L. Griffith, M. Griffith
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J. Hundal, S. Kiwala, J. McMichael, A.T. Wollam, Y.-Y. Feng, A.Z. Wollam, M. Neveau, J. Walker
Study supervision: J. Walker, E.R. Mardis, O.L. Griffith, M. Griffith

Acknowledgments

We thank the patients and their families for donation of their samples and participation in clinical trials. We also thank our growing user community for testing the software and providing useful input, critical bug reports, as well as suggestions for improvement and new features. We are grateful to Drs. Robert Schreiber, Gavin Dunn, and Beatriz Carreno for their expertise and guidance on foundational work on cancer immunology using neoantigens and suggestions on improving the software. O.L. Griffith was supported by the NCI of the NIH under award numbers U01CA209936, U01CA231844, and U24CA237719. M. Griffith was supported by the National Human Genome Research Institute (NHGRI) of the NIH under award number R00HG007940 and the V Foundation for Cancer Research under award number V2018-007.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received May 29, 2019; revised October 6, 2019; accepted December 30, 2019; published first January 6, 2020.

References

1. Liu XS, Shirley Liu X, Mardis ER. Applications of immunogenomics to cancer. Cell 2017;168:600–12.

2. Bjerregaard A-M, Nielsen M, Hadrup SR, Szallasi Z, Eklund AC. MuPeXI: prediction of neo-epitopes from tumor sequencing data. Cancer Immunol Immunother 2017;66:1123–30.

3. Rubinsteyn A, Hodes I, Kodysh J, Hammerbacher J. Vaxrank: a computational tool for designing personalized cancer vaccines. BioRxiv 142919 [Preprint]. 2017. Available from: http://dx.doi.org/10.1101/142919.

4. Bais P, Namburi S, Gatti DM, Zhang X, Chuang JH. CloudNeo: a cloud pipeline for identifying patient-specific tumor neoantigens. Bioinformatics 2017;33:3110–2.

5. Andreatta M, Nielsen M. Gapped sequence alignment using artificial neural networks: application to the MHC class I system. Bioinformatics 2016;32:511–7.

6. Jurtz V, Paul S, Andreatta M, Marcatili P, Peters B, Nielsen M. NetMHCpan-4.0: improved peptide–MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J Immunol 2017;199:3360–8.


7. NielsenM, Lundegaard C,Worning P, Lauemøller SL, Lamberth K, Buus S, et al.Reliable prediction of T-cell epitopes using neural networks with novel sequencerepresentations. Protein Sci 2003;12:1007–17.

8. Duan F, Duitama J, Al Seesi S, Ayres CM, Corcelli SA, Pawashe AP, et al.Genomic and bioinformatic profiling of mutational neoepitopes revealsnew rules to predict anticancer immunogenicity. J Exp Med 2014;211:2231–48.

9. Schubert B, Walzer M, Brachvogel H-P, Szolek A, Mohr C, Kohlbacher O.FRED 2: an immunoinformatics framework for Python. Bioinformatics 2016;32:2044–6.

10. Rao AA, Madejska AA, Pfeil J, Paten B, Salama SR, Haussler D. ProTECT –

prediction of T-cell epitopes for cancer therapy. BioRxiv 696526 [Preprint].2019. Available from: https://doi.org/10.1101/696526.

11. Hundal J, Kiwala S, Feng Y-Y, Liu CJ, Govindan R, Chapman WC, et al.Accounting for proximal variants improves neoantigen prediction. Nat Genet2019;51:175–9.

12. Turajlic S, Litchfield K, Xu H, Rosenthal R, McGranahan N, Reading JL, et al.Insertion-and-deletion-derived tumor-specific neoantigens and the immuno-genic phenotype: a pan-cancer analysis. Lancet Oncol 2017;18:1009–21.

13. Zamora AE, Crawford JC, Allen EK, Guo X-ZJ, Bakke J, Carter RA, et al.Pediatric patients with acute lymphoblastic leukemia generate abundantand functional neoantigen-specific CD8þ T cell responses. Sci Transl Med2019;11. pii: eaat8549.

14. Yang W, Lee K-W, Srivastava RM, Kuo F, Krishna C, Chowell D, et al.Immunogenic neoantigens derived from gene fusions stimulate T cell responses.Nat Med 2019;25:767–75.

15. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheelertransform. Bioinformatics 2009;25:1754–60.

16. Koboldt DC, Larson DE, Wilson RK. Using VarScan 2 for germline variantcalling and somatic mutation detection. Curr Protoc Bioinformatics 2013;44:15.4.1–17.

17. Koboldt DC, ZhangQ, LarsonDE, ShenD,McLellanMD, Lin L, et al. VarScan 2:Somatic mutation and copy number alteration discovery in cancer by exomesequencing. Genome Res 2012;22:568–76.

18. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. Aframework for variation discovery and genotyping using next-generation DNAsequencing data. Nat Genet 2011;43:491–8.

19. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, et al. From FastQ data to high confidence variant calls: theGenome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics2013;43:11.10.1–33.

20. Charoentong P, Finotello F, Angelova M, Mayer C, Efremova M, Rieder D, et al.Pan-cancer immunogenomic analyses reveal genotype-immunophenotype rela-tionships and predictors of response to checkpoint blockade. Cell Rep 2017;18:248–62.

21. Hoof I, Peters B, Sidney J, Pedersen LE, Sette A, Lund O, et al. NetMHCpan, amethod for MHC class I binding prediction beyond humans. Immunogenetics2009;61:1–13.

22. Karosiene E, Lundegaard C, Lund O, Nielsen M. NetMHCcons: a consensusmethod for the major histocompatibility complex class I predictions. Immu-nogenetics 2011;64:177–86.

23. Zhang H, Lund O, Nielsen M. The PickPocket method for predicting binding specificities for receptors based on receptor pocket similarities: application to MHC-peptide binding. Bioinformatics 2009;25:1293–9.

24. Peters B, Sette A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinformatics 2005;6:132.

25. Kim Y, Sidney J, Pinilla C, Sette A, Peters B. Derivation of an amino acid similarity matrix for peptide:MHC binding and its application as a Bayesian prior. BMC Bioinformatics 2009;10:394.

26. O’Donnell TJ, Rubinsteyn A, Bonsack M, Riemer AB, Laserson U, Hammerbacher J. MHCflurry: open-source class I MHC binding affinity prediction. Cell Syst 2018;7:129–32.e4.

27. Bhattacharya R, Sivakumar A, Tokheim C, Guthrie VB, Anagnostou V, Velculescu VE, et al. Evaluation of machine learning methods to predict peptide binding to MHC class I proteins. BioRxiv 154757 [Preprint]. 2017. Available from: https://doi.org/10.1101/154757.

28. Nielsen M, Lundegaard C, Blicher T, Peters B, Sette A, Justesen S, et al. Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan. PLoS Comput Biol 2008;4:e1000107.

29. Nielsen M, Lundegaard C, Lund O. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method. BMC Bioinformatics 2007;8:238.

30. Nielsen M, Lund O. NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction. BMC Bioinformatics 2009;10:1–10.

31. Shao XM, Bhattacharya R, Huang J, Sivakumar IKA, Tokheim C, Zheng L, et al. High-throughput prediction of MHC class I and class II neoantigens with MHCnuggets. Cancer Immunol Res 2020;8:396–408.

32. Fritsch EF, Rajasagi M, Ott PA, Brusic V, Hacohen N, Wu CJ. HLA-binding properties of tumor neoepitopes in humans. Cancer Immunol Res 2014;2:522–9.

33. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 2013;45:1113–20.

34. Amstutz P, Crusoe MR, Tijanić N, Chapman B, Chilton J, Heuer M, et al. Common Workflow Language, v1.0. 2016. Available from: http://dx.doi.org/10.6084/m9.figshare.3115156.v2.

35. Voss K, Gentry J, Van der Auwera G. Full-stack genomics pipelining with GATK4 + WDL + Cromwell [version 1; not peer reviewed]. 2017. Available from: http://dx.doi.org/10.7490/f1000research.1114631.1.

36. Cabanski CR, Magrini V, Griffith M, Griffith OL, McGrath S, Zhang J, et al. cDNA hybrid capture improves transcriptome analysis on low-input and archived samples. J Mol Diagn 2014;16:440–51.

37. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics 2009;25:2078–9.

38. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The Ensembl Variant Effect Predictor. Genome Biol 2016;17:122.

39. Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 2013;31:213–9.

40. Saunders CT, Wong WSW, Swamy S, Becq J, Murray LJ, Cheetham RK. Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics 2012;28:1811–7.

41. Ye K, Schulz MH, Long Q, Apweiler R, Ning Z. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 2009;25:2865–71.

42. Tan A, Abecasis GR, Kang HM. Unified representation of genetic variants. Bioinformatics 2015;31:2202–4.

43. Roehr JT, Dieterich C, Reinert K. Flexbar 3.0 - SIMD and multicore parallelization. Bioinformatics 2017;33:2941–2.

44. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods 2015;12:357–60.

45. Pertea M, Pertea GM, Antonescu CM, Chang T-C, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 2015;33:290–5.

46. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 2016;34:525–7.

47. Szolek A, Schubert B, Mohr C, Sturm M, Feldhahn M, Kohlbacher O. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 2014;30:3310–6.

48. Hundal J, Carreno BM, Petti AA, Linette GP, Griffith OL, Mardis ER, et al. pVAC-Seq: a genome-guided in silico approach to identifying tumor neoantigens. Genome Med 2016;8:11.

49. Vita R, Zarebski L, Greenbaum JA, Emami H, Hoof I, Salimi N, et al. The Immune Epitope Database 2.0. Nucleic Acids Res 2009;38:D854–62.

50. Paul S, Weiskopf D, Angelo MA, Sidney J, Peters B, Sette A. HLA class I alleles are associated with peptide-binding repertoires of different size, affinity, and immunogenicity. J Immunol 2013;191:5831–9.

51. Kesmir C, Nussbaum AK, Schild H, Detours V, Brunak S. Prediction of proteasome cleavage motifs by neural networks. Protein Eng 2002;15:287–96.

52. Rasmussen M, Fenoy E, Harndahl M. Pan-specific prediction of peptide–MHC class I complex stability, a correlate of T cell immunogenicity. J Immunol 2016;197:1517–24.

53. Zhang J, Mardis ER, Maher CA. INTEGRATE-neo: a pipeline for personalized gene fusion neoantigen discovery. Bioinformatics 2017;33:555–7.

54. Murphy C, Elemento O. AGFusion: annotate and visualize gene fusions. BioRxiv 080903 [Preprint]. 2016. Available from: https://doi.org/10.1101/080903.

55. Haas BJ, Dobin A, Stransky N, Li B, Yang X, Tickle T, et al. STAR-Fusion: fast and accurate fusion transcript detection from RNA-Seq. BioRxiv 120295 [Preprint]. 2017. Available from: https://doi.org/10.1101/120295.

56. Nicorici D, Şatalan M, Edgren H, Kangaspeska S, Murumägi A, Kallioniemi O, et al. FusionCatcher – a tool for finding somatic fusion genes in paired-end RNA-sequencing data. BioRxiv 011650 [Preprint]. 2014. Available from: https://doi.org/10.1101/011650.

57. Davidson NM, Majewski IJ, Oshlack A. JAFFA: high sensitivity transcriptome-focused fusion gene detection. Genome Med 2015;7:43.

58. Kim D, Salzberg SL. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol 2011;12:R72.

59. Schubert B, Kohlbacher O. Designing string-of-beads vaccines with optimal spacers. Genome Med 2016;8:9.

60. Negahdaripour M, Nezafat N, Eslami M, Ghoshoon MB, Shoolian E, Najafipour S, et al. Structural vaccinology considerations for in silico designing of a multi-epitope vaccine. Infect Genet Evol 2018;58:96–109.

61. Kirkpatrick S, Gelatt CD Jr, Vecchi MP. Optimization by simulated annealing. Science 1983;220:671–80.

62. Miller A, Asmann Y, Cattaneo L, Braggio E, Keats J, Auclair D, et al. High somatic mutation and neoantigen burden are correlated with decreased progression-free survival in multiple myeloma. Blood Cancer J 2017;7:e612.

63. Balachandran VP, Łuksza M, Zhao JN, Makarov V, Moral JA, Remark R, et al. Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer. Nature 2017;551:512.

64. Formenti SC, Rudqvist N-P, Golden E, Cooper B, Wennerberg E, Lhuillier C, et al. Radiotherapy induces responses of lung cancer to CTLA-4 blockade. Nat Med 2018;24:1845.

65. Prickett TD, Crystal JS, Cohen CJ, Pasetto A, Parkhurst MR, Gartner JJ, et al. Durable complete response from metastatic melanoma after transfer of autologous T cells recognizing 10 mutated tumor antigens. Cancer Immunol Res 2016;4:669–78.

66. Rajasagi M, Shukla SA, Fritsch EF, Keskin DB, DeLuca D, Carmona E, et al. Systematic identification of personal tumor-specific neoantigens in chronic lymphocytic leukemia. Blood 2014;124:453–62.

67. Kahles A, Lehmann K-V, Toussaint NC, Hüser M, Stark SG, Sachsenberg T, et al. Comprehensive analysis of alternative splicing across tumors from 8,705 patients. Cancer Cell 2018;34:211–24.e6.

68. Sidney J, Peters B, Frahm N, Brander C, Sette A. HLA class I supertypes: a revised and updated classification. BMC Immunol 2008;9:1.

69. Sturniolo T, Bono E, Ding J, Raddrizzani L, Tuereci O, Sahin U, et al. Generation of tissue-specific and promiscuous HLA ligand databases using DNA microarrays and virtual HLA class II matrices. Nat Biotechnol 1999;17:555–61.

70. Johanns TM, Miller CA, Dorward IG, Tsien C, Chang E, Perry A, et al. Immunogenomics of hypermutated glioblastoma: a patient with germline POLE deficiency treated with checkpoint blockade immunotherapy. Cancer Discov 2016;6:1230–6.

71. Lauss M, Donia M, Harbst K, Andersen R, Mitra S, Rosengren F, et al. Mutational and putative neoantigen load predict clinical benefit of adoptive T cell therapy in melanoma. Nat Commun 2017;8:1738.

72. Linette GP, Carreno BM. Neoantigen vaccines pass the immunogenicity test. Trends Mol Med 2017;869–71.

73. Vitiello A, Zanetti M. Neoantigen prediction and the need for validation. Nat Biotechnol 2017;35:815–7.

74. The problem with neoantigen prediction. Nat Biotechnol 2017;35:97.
