On the Road to Genomic Predictive Medicine An Interim Analysis Richard Simon Chief, Biometric Research Branch National Cancer Institute.


How I got involved in genomics

• In the late 1990s genomic data was for me the most exciting scientific data of our generation

– Analysis of that data shouldn’t be left to amateurs

• We had a great cadre of statisticians involved in clinical trials and we know how to do reliable clinical trials, but the drugs are often disappointing

– Statisticians should be involved in basic research, pre-clinical target discovery and policy

• Biomedical leaders were looking to computer scientists and physicists for help, not to statisticians

• Statisticians were viewed as useful for testing hypotheses and computing p values, not for discovery

• Many statisticians tend to see themselves as methods developers, not as scientists focused on a subject matter area

Imatinib chronology

• 1960 – Philadelphia chromosome described (P Nowell)

• 1973 – Ph characterized as translocation of ABL on chromosome 9 with BCR on chromosome 22 (J Rowley)

• 1986 – BCR-ABL fusion gene characterized as constitutively activated kinase (D Baltimore)

Imatinib chronology

• 1988-1995 – CIBA-GEIGY develops kinase inhibitors (A Matter, N Lydon, J Zimmermann, E Buchdunger)

• 1996 – B Druker (Dana Farber -> Oregon) screens ex-vivo tumors and normal lymphocytes against compounds provided by Novartis and convinces the company to sponsor clinical trials in CML in spite of only 5000 cases/yr in US

• Success depended on collaboration between industry and academia

• Delayed development resulted from reluctance of field to accept hypothesis that kinases can be selectively inhibited or that inhibiting a single gene could be very effective

• Industry involvement dependent on vision of a small leadership group in one company

• Clinical translation dependent on vision of one oncologist

• Success depends on serendipity

• Academic medicine (NIH) is a bottom-up system not optimized for risk taking or exploiting scientific leads for translating basic research to clinical products or for mounting large cooperative programs for overcoming bottlenecks in translation

• Academic medicine is very dependent on industry but industry has its own constraints

Predictive Medicine

• Germline genetics

– GWAS

– 23andMe

• Tumor genomics

– The Cancer Genome Atlas (TCGA)

Ioannidis et al. JNCI 102:846 (2010)

• 56 GWAS

• 92 statistically significant associations between cancer phenotype and genetic variant

• Median OR = 1.22

• IQR OR = 1.15 – 1.36

AR = RR × Pr[disease | test −] ≤ RR × disease prevalence

AR = absolute risk of disease for subject with high risk allele

RR = relative risk of disease for subject with high risk allele

Relative Risk of Disease    Pr[Disease]    Pr[Disease | test +]

1.3                         0.01           0.013

1.3                         0.10           0.13
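The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration of the bound AR ≤ RR × prevalence stated above; the function name `absolute_risk_upper_bound` is hypothetical, and the bound assumes RR ≥ 1.

```python
def absolute_risk_upper_bound(relative_risk, prevalence):
    """Upper bound on Pr[disease | test +].

    Uses AR = RR * Pr[disease | test -] <= RR * prevalence,
    which holds when RR >= 1 (the test-negative risk cannot exceed prevalence).
    """
    if relative_risk < 1:
        raise ValueError("the bound assumes relative_risk >= 1")
    if not 0 <= prevalence <= 1:
        raise ValueError("prevalence must be a probability")
    # A probability can never exceed 1, so cap the bound there.
    return min(1.0, relative_risk * prevalence)


# Reproduce the two rows of the table.
for rr, prev in [(1.3, 0.01), (1.3, 0.10)]:
    bound = absolute_risk_upper_bound(rr, prev)
    print(f"RR={rr}  Pr[Disease]={prev:.2f}  Pr[Disease | test +] <= {bound:.3f}")
```

Even at a disease prevalence of 10%, a relative risk of 1.3 moves the absolute risk by at most three percentage points, which is why such variants say little about an individual.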

• Cancers of a given histologic diagnosis are genomically heterogeneous

• Cancers are mostly caused by somatic mutations not genetic polymorphisms

• Most of the information about the disease is in the tumor genome, not the germ-line genome

Biomarkers for Early Detection

• Because of the long time between first mutation and clinical diagnosis of human solid tumors, there would seem to be great opportunity for early detection

• Phase II trials of early detection have used samples from patients at diagnosis

• Effective detection must have long lead time and high specificity for tumors which will evolve to be life threatening

Biomarkers for Informing Treatment Selection

• Prognostic biomarkers

– Measured before treatment to indicate long-term outcome for patients untreated or receiving standard treatment

• To identify which patients have excellent prognosis on conservative treatment

• Predictive biomarkers

– Measured before treatment to identify who is likely or unlikely to benefit from a particular treatment

Prognostic Markers

• Vast literature on prognostic markers

• Very few used in practice

• Most studies motivated by desire to learn about disease biology

• Broad selection of cases

• Little focus on intended use

• Little focus on analytical validation of assay

Validation of Biomarkers

• Analytical validity

– Measures what it is supposed to

– Reproducible

• Clinical validity

– Correlates with something clinically

• Clinical utility

– Is actionable

– Measuring marker leads to action that benefits patient

– Requires clarity on intended use

If you don’t know where you are going, you might not get there

Yogi Berra

Prognostic Markers

• Oncotype DX: Which patients with node negative, ER positive breast cancer who are receiving tamoxifen will have such good prognosis that they do not need cytotoxic chemotherapy?

• Analysis focused on whether marker identifies such a subset, not on statistical significance

B-14 Results: Relapse-Free Survival

[Figure: relapse-free survival versus time (yrs, 0 to 16) by recurrence score (RS) risk group: Low Risk (RS < 18), 338 pts; Intermediate Risk (RS 18 - 30), 149 pts; High Risk (RS ≥ 31), 181 pts; p<0.0001]

Paik et al, SABCS 2003

Major problems with prognostic studies of gene expression signatures

• Inadequate focus on intended use

– Cases selected based on availability of specimens rather than for relevance to intended use

– Heterogeneous sample of patients with mixed stages and treatments. Attempt to disentangle effects using regression modeling

– Overemphasis on statistical significance and hazard ratios.

• Over-fitting data

For p>n problems

• Fit of a model to the same data used to develop it is no evidence of prediction accuracy for independent data

Simulation    Training     Validation

1             p=7.0e-05    p=0.70

2             p=4.2e-07    p=0.54

3             p=2.4e-13    p=0.60

4             p=1.3e-10    p=0.89

5             p=1.8e-13    p=0.36

6             p=5.5e-11    p=0.81

7             p=3.2e-09    p=0.46

8             p=1.8e-07    p=0.61

9             p=1.1e-07    p=0.49

10            p=4.3e-09    p=0.09

Validation of Prognostic Model

• Completely independent validation dataset

• Splitting dataset into training and testing sets

• Cross-validation

• Partition data set D into K equal parts D1,D2,...,DK

• First training set T1=D-D1

• Develop completely specified prognostic model M1 using only data T1

• Compute prognostic score for cases in D1

• Develop model M2 using only T2 and then score cases in D2

• Repeat for ... TK -> MK -> DK

• Group patients into risk groups (e.g. 2 or more) based on their cross-validated scores

• Calculate Kaplan-Meier survival curve for each risk-group
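The partition, develop, score loop above can be sketched as follows. This is a minimal illustration, not the procedure used in any particular study: `cross_validated_scores`, `risk_groups` and the `develop_model` callable are hypothetical names, and the median split into two risk groups is an assumed choice.

```python
import random


def cross_validated_scores(cases, develop_model, k=5, seed=0):
    """Return one cross-validated score per case.

    develop_model(training_cases) must return a function that scores a single
    case. It is called once per fold, on training data only.
    """
    order = list(range(len(cases)))
    random.Random(seed).shuffle(order)
    folds = [order[j::k] for j in range(k)]  # D1..DK, near-equal sizes
    scores = [None] * len(cases)
    for held_out in folds:
        held = set(held_out)
        training = [cases[i] for i in order if i not in held]  # Tj = D - Dj
        model = develop_model(training)  # Mj, built without seeing Dj
        for i in held_out:
            scores[i] = model(cases[i])
    return scores


def risk_groups(scores):
    """Assumed rule: split at the median score into 'low' and 'high' risk."""
    cut = sorted(scores)[len(scores) // 2]
    return ["high" if s >= cut else "low" for s in scores]


# Toy usage: each case is a single marker value and the "model" centers the
# marker on the training mean (a stand-in for a real prognostic model).
def toy_develop_model(training):
    center = sum(training) / len(training)
    return lambda case: case - center


marker = [0.2, 1.5, -0.7, 0.9, 2.1, -1.3, 0.0, 0.4, 1.1, -0.5]
cv_scores = cross_validated_scores(marker, toy_develop_model, k=5)
print(risk_groups(cv_scores))
```

A Kaplan-Meier curve would then be computed separately for the cases in each risk group, using only these cross-validated group labels.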

Complete Cross-Validation

• Cross-validation simulates the process of separately developing a model on one set of data and predicting for a test set of data not used in developing the model

– All aspects of the model development process must be repeated for each loop of the cross-validation

• Feature selection

• Tuning parameter optimization

Prediction on Simulated Null Data

Simon et al. J Nat Cancer Inst 95:14, 2003

Generation of Gene Expression Profiles

• 20 specimens (Pi is the expression profile for specimen i)

• Log-ratio measurements on 6000 genes

• Pi ~ MVN(0, I_6000)

• Can we distinguish between the first 10 specimens (Class 1) and the last 10 (Class 2)?

Prediction Method

• Compound covariate predictor built from the log-ratios of the 10 most differentially expressed genes.

[Figure: distribution of the number of misclassifications (0 to 20) for three approaches: no cross-validation (resubstitution method), cross-validation after gene selection, and cross-validation prior to gene selection]
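The null-data experiment can be approximated with the standard library alone. The sketch below is an illustration under assumptions, not the code behind the cited paper: it uses leave-one-out cross-validation, weights each selected gene by its two-sample t statistic, and classifies at the midpoint of the two class mean scores. All function names are hypothetical.

```python
import random


def t_statistic(x1, x2):
    """Pooled-variance two-sample t statistic."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss = sum((v - m1) ** 2 for v in x1) + sum((v - m2) ** 2 for v in x2)
    return (m1 - m2) / ((ss / (n1 + n2 - 2)) * (1 / n1 + 1 / n2)) ** 0.5


def weights(data, labels, idx, gene_ids):
    """(gene, t) pairs computed from the specimens in idx only."""
    c1 = [i for i in idx if labels[i] == 1]
    c2 = [i for i in idx if labels[i] == 2]
    return [(g, t_statistic([data[g][i] for i in c1], [data[g][i] for i in c2]))
            for g in gene_ids]


def select_genes(data, labels, idx, n_top):
    """The n_top most differentially expressed genes among specimens idx."""
    w = weights(data, labels, idx, range(len(data)))
    return sorted(w, key=lambda gt: abs(gt[1]), reverse=True)[:n_top]


def classify(data, labels, genes, train_idx, i):
    """Compound covariate score, thresholded at the midpoint of class means."""
    def score(k):
        return sum(t * data[g][k] for g, t in genes)
    s1 = [score(k) for k in train_idx if labels[k] == 1]
    s2 = [score(k) for k in train_idx if labels[k] == 2]
    threshold = (sum(s1) / len(s1) + sum(s2) / len(s2)) / 2
    return 1 if score(i) > threshold else 2


def null_simulation(n_genes=6000, n_specimens=20, n_top=10, seed=0):
    """Count misclassifications on pure-noise data under three schemes."""
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_specimens)] for _ in range(n_genes)]
    labels = [1] * (n_specimens // 2) + [2] * (n_specimens - n_specimens // 2)
    everyone = list(range(n_specimens))

    full_genes = select_genes(data, labels, everyone, n_top)
    resub = sum(classify(data, labels, full_genes, everyone, i) != labels[i]
                for i in everyone)

    after = prior = 0
    for i in everyone:
        train = [k for k in everyone if k != i]
        # Incomplete CV: genes were chosen using all specimens, including i.
        refit = weights(data, labels, train, [g for g, _ in full_genes])
        after += classify(data, labels, refit, train, i) != labels[i]
        # Complete CV: gene selection is repeated inside the loop.
        fold_genes = select_genes(data, labels, train, n_top)
        prior += classify(data, labels, fold_genes, train, i) != labels[i]

    return {"resubstitution": resub,
            "cv_after_gene_selection": after,
            "cv_prior_to_gene_selection": prior}


# Fewer genes than the 6000 on the slide, only to keep the run short.
print(null_simulation(n_genes=2000, seed=1))
```

Because the classes are indistinguishable by construction, only the scheme that repeats gene selection inside each loop should be expected to report an error count near chance.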

Cross Validation

• The cross-validated estimate of misclassification error is an estimate of the prediction error for the model fit by applying the specified algorithm to the full dataset

• Statistical significance of the difference in survival among risk groups is usually not the point

• But to evaluate significance, the log-rank test cannot be used for cross-validated Kaplan-Meier curves because the survival times are not independent

• Statistical significance can be properly evaluated by approximating the null distribution of the cross-validated log-rank statistic

• Permute the survival times and repeat the entire cross-validation procedure to generate new cross-validated K-M curves for low risk and high risk groups

– Compute log-rank statistic for the curves

• Repeat for many sets of permutations
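A minimal sketch of this permutation procedure follows. The names `logrank_statistic`, `permutation_p_value` and `toy_cv_risk_groups` are hypothetical; the toy risk-group function is a leave-one-out stand-in for a real, fully cross-validated model, and survival times are permuted together with their censoring indicators as an assumed convention.

```python
import random


def logrank_statistic(times, events, groups):
    """Two-group log-rank chi-square statistic (groups coded 0/1)."""
    observed = expected = variance = 0.0
    for t in sorted({tt for tt, e in zip(times, events) if e}):
        at_risk = [g for tt, g in zip(times, groups) if tt >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for tt, e in zip(times, events) if e and tt == t)
        d1 = sum(1 for tt, e, g in zip(times, events, groups)
                 if e and tt == t and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance of d1 at this event time
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def permutation_p_value(features, times, events, cv_risk_groups,
                        n_perm=200, seed=0):
    """Null distribution of the cross-validated log-rank statistic.

    cv_risk_groups(features, times, events) must rerun the ENTIRE
    cross-validation and return a 0/1 risk group per subject.
    """
    rng = random.Random(seed)
    observed = logrank_statistic(times, events,
                                 cv_risk_groups(features, times, events))
    outcomes = list(zip(times, events))
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(outcomes)  # break any link between features and outcome
        pt = [t for t, _ in outcomes]
        pe = [e for _, e in outcomes]
        stat = logrank_statistic(pt, pe, cv_risk_groups(features, pt, pe))
        exceed += stat >= observed
    return (exceed + 1) / (n_perm + 1)


def toy_cv_risk_groups(features, times, events):
    """Leave-one-out stand-in: the direction of risk is learned without case i."""
    n, groups = len(features), []
    for i in range(n):
        train = [k for k in range(n) if k != i]
        mf = sum(features[k] for k in train) / len(train)
        mt = sum(times[k] for k in train) / len(train)
        cov = sum((features[k] - mf) * (times[k] - mt) for k in train)
        # High risk if the case lies on the side associated with short survival.
        groups.append(1 if (features[i] > mf) == (cov < 0) else 0)
    return groups


rng = random.Random(1)
features = [rng.gauss(0, 1) for _ in range(20)]
times = [rng.expovariate(1.0) for _ in range(20)]
events = [1] * 20
print(permutation_p_value(features, times, events, toy_cv_risk_groups, n_perm=100))
```

The p value is the proportion of permuted datasets whose cross-validated log-rank statistic is at least as large as the observed one, so it accounts for the dependence introduced by cross-validation.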

Predictive Biomarkers

• Single gene or protein measurement

– ER protein expression

– HER2 amplification

– EGFR mutation

– KRAS mutation

– BRAF V600E mutation

– ALK translocation

• Index or classifier that summarizes expression levels of multiple genes

The standard approach to designing phase III clinical trials is based on two assumptions

• Qualitative treatment by subset interactions are unlikely

• “Costs” of over-treatment are less than “costs” of under-treatment

• Cancers of a primary site often represent a heterogeneous group of diverse molecular diseases which vary fundamentally with regard to

– the oncogenic mutations that cause them

– their responsiveness to specific drugs

• Most new cancer drugs are very expensive

– the aspirin paradigm on which current clinical trial dogma is based can be a roadblock to progress

Standard Clinical Trial Approaches

• Have led to widespread over-treatment of patients with drugs from which few benefit

• Are neither scientifically well founded nor economically sustainable for future cancer therapeutics

• Neither current practices of subset analysis nor current practices of ignoring subset analysis are effective for evaluating treatments and informing physicians in heterogeneous diseases

• How can we develop new drugs in a manner more consistent with modern tumor biology and obtain reliable information about what regimens work for what kinds of patients?

Targeted (Enrichment) Design

[Diagram: Using phase II data, develop predictor of response to new drug. Patient predicted responsive: New Drug vs Control. Patient predicted non-responsive: Off Study.]

Evaluating the Efficiency of Targeted Design

• Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004; Correction and supplement 12:3229, 2006

• Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.

• http://brb.nci.nih.gov

• Relative efficiency of targeted design depends on

– proportion of patients test positive

– effectiveness of new drug (compared to control) for test negative patients

• When less than half of patients are test positive and the drug has little or no benefit for test negative patients, the targeted design requires dramatically fewer randomized patients than the standard design in which the marker is not used

• Companion diagnostic test with intended use of identifying patients who have disease subtype for which the drug is proven effective
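The size of the efficiency gain can be illustrated with a simplified approximation that is not taken from the slides: assume an equal-variance outcome, a treatment effect in the unselected population equal to the prevalence-weighted average of the effects in test positive and test negative patients, and required sample size inversely proportional to the squared effect. The function names and parameters below are hypothetical; the papers cited above give the full treatment.

```python
def randomized_patient_ratio(prop_positive, effect_pos, effect_neg=0.0):
    """Approximate (standard design n) / (targeted design n), randomized patients.

    Assumed model: overall effect = weighted average of subgroup effects,
    and required n is proportional to 1 / effect**2.
    """
    if not 0 < prop_positive <= 1:
        raise ValueError("prop_positive must be in (0, 1]")
    overall = prop_positive * effect_pos + (1 - prop_positive) * effect_neg
    if overall <= 0:
        raise ValueError("overall effect must be positive")
    return (effect_pos / overall) ** 2


def screened_patient_ratio(prop_positive, effect_pos, effect_neg=0.0):
    """Same comparison counting patients screened for the targeted design."""
    # The targeted design must screen n_targeted / prop_positive patients.
    return prop_positive * randomized_patient_ratio(prop_positive, effect_pos, effect_neg)


for prop in (0.5, 0.25):
    for neg in (0.0, 0.5):
        print(f"test+ {prop:.0%}, effect in test- = {neg} x effect in test+: "
              f"randomized ratio = {randomized_patient_ratio(prop, 1.0, neg):.2f}, "
              f"screened ratio = {screened_patient_ratio(prop, 1.0, neg):.2f}")
```

Under these assumptions, with a quarter of patients test positive and no benefit in test negative patients, the standard design needs about 16 times as many randomized patients; the advantage shrinks quickly as the drug becomes effective in test negative patients.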

Stratification Design for New Drug Development with Companion Diagnostic

Fallback Analysis Plan

• Compare the new drug to the control overall for all patients ignoring the classifier.

– If p_overall ≤ 0.01 claim effectiveness for the eligible population as a whole

• Otherwise perform a single subset analysis evaluating the new drug in the classifier + patients

– If p_subset ≤ 0.04 claim effectiveness for the classifier + patients.

• The test in the subset is not dependent on finding an overall significant finding or a significant interaction

• The trial is sized for powering both tests

• The validity of the analysis does not depend on stratifying the randomization by the test
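The decision rule of the fallback plan can be written out as a short function. This is a sketch of the rule as stated above; the function name and the returned strings are illustrative only.

```python
def fallback_analysis(p_overall, p_subset, alpha_overall=0.01, alpha_subset=0.04):
    """Apply the two-step fallback rule.

    The two thresholds sum to 0.05, so the overall type I error of the
    trial is held at the 0.05 level.
    """
    if p_overall <= alpha_overall:
        return "effective in the eligible population as a whole"
    # The subset test is reached only when the overall test is not significant.
    if p_subset <= alpha_subset:
        return "effective in classifier + patients"
    return "no claim of effectiveness"


print(fallback_analysis(p_overall=0.005, p_subset=0.20))
print(fallback_analysis(p_overall=0.08, p_subset=0.03))
print(fallback_analysis(p_overall=0.08, p_subset=0.06))
```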

δ+ =treatment effect in test + patients

δ− = treatment effect in test - patients

Two-point priors for δ+ and δ− with values {0,δ*}

Pr[δ+ =δ− =0] =p00

Pr[δ− =0 |δ+ =δ*] =r1

Pr[δ+ =0 |δ− =δ*] =r2

Strong confidence in test: Small r2 and large r1

Weak confidence in test: Small r2 and small r1

p00 selected to control type I error rates
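The three quantities p00, r1 and r2 determine the full joint prior over the four combinations of δ+ and δ−. The sketch below derives it directly from the definitions above, assuming r1 < 1 and r2 < 1; the function name and dictionary keys are hypothetical.

```python
def joint_prior(p00, r1, r2):
    """Joint two-point prior implied by p00, r1 and r2.

    With c = Pr[both effects = delta*], the definitions give
      Pr[delta+ = delta*, delta- = 0] = c * r1 / (1 - r1)
      Pr[delta+ = 0, delta- = delta*] = c * r2 / (1 - r2)
    and the four probabilities must sum to 1.
    """
    if not (0 <= p00 < 1 and 0 <= r1 < 1 and 0 <= r2 < 1):
        raise ValueError("requires 0 <= p00 < 1, 0 <= r1 < 1, 0 <= r2 < 1")
    odds1 = r1 / (1 - r1)
    odds2 = r2 / (1 - r2)
    both = (1 - p00) / (1 + odds1 + odds2)
    return {
        "neither": p00,
        "test_positive_only": odds1 * both,
        "test_negative_only": odds2 * both,
        "both": both,
    }


# Strong confidence in the test: large r1, small r2.
print(joint_prior(p00=0.5, r1=0.9, r2=0.05))
# Weak confidence in the test: small r1, small r2.
print(joint_prior(p00=0.5, r1=0.2, r2=0.05))
```

With strong confidence in the test, most of the prior mass for an effective drug falls on benefit in test + patients only; with weak confidence it shifts toward benefit in both groups.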

The Objectives of a Phase III Clinical Trial

• Test the strong null hypothesis that the new treatment E is uniformly ineffective relative to a control C while preserving the type I error of the study

• If the null hypothesis is rejected, develop an internally validated labeling indication for informing physicians in their decisions about which patients they treat with the drug.

– Not a hypothesis testing problem

The keys to developing effective drugs

• The target of the drug must be central to the progression of the disease

• Drug should be selective for the target so that it can be administered at a concentration that totally shuts down the de-regulated pathway

• Need a test that identifies the patients who have disease driven by de-regulation of the target

Tumors can contain large numbers of genetic alterations

• Passenger mutations

– Occur at rate of non-synonymous mutations

• Driver mutations

– Occur more frequently than non-synonymous mutations and presumably have a functional role in oncogenesis and pathogenesis of the tumor

– Determined from sequencing of many tumors of a histological type

• Founder Mutations

• Originating mutation

Extend Previous Methods to Allow

• Background mutation rate to vary among tumors

• Background mutation rate to depend on sequence context of mutation

Founder mutations

• Mathematical modeling studies indicate that 2-4 rate-limiting events occurring at normal mammalian mutation rates can account for age-incidence statistics for many types of human solid tumors

• These rate-limiting events may correspond to the founder mutations of a tumor

• They may be rate-limiting because they occur when the tumor is restricted to small anatomic compartments prior to the occurrence of genome destabilizing mutations

• They may permit the tumor to grow to a size in which acquisition of subsequent mutations is not rate-limiting

• Additional driver mutations occur that aid tumor invasion and metastatic dissemination

Founder mutations may be of special importance

• They exist in all sub-clones of the tumor and so all tumor cells may be susceptible to founder mutation targeted treatment

• Subsequent mutations develop in the context of the founder mutations and may be viable only in that context, rendering the tumors “addicted” to the early mutations

Closing

• Germ-line genomics has so far had a limited impact on individual risk prediction in oncology and in understanding the nature of oncogenesis

• Tumor genomics is revolutionizing our understanding of cancer and is providing important opportunities to identify key molecular targets and for improving therapeutic decision making

Closing

• Treatment of broad populations with regimens that do not benefit most patients is increasingly neither necessary nor economically sustainable

• The established molecular heterogeneity of cancer requires the use of new approaches to clinical trial design and analysis

• Developments in high dimensional assays and NGS have stimulated many areas of biostatistics and placed greater emphasis on discovery and prediction

To Meet the Challenges and Opportunities Available to Impact on Human Disease, Biostatistics Should

• Continue to broaden identity beyond probabilistic inference and methods development

– Reward statistical scientists, not just statistical mathematicians

• Embrace information technology

– Include the informatics end of bioinformatics

• Transcend hypothesis testing to include discovery, prediction and decision making

Joi Ito – Director, MIT Media Lab

• In the old days, the world didn’t change very much, so once you became a plumber, you didn’t really need to learn that much more about plumbing. Today you have to keep learning and learning is somewhat of a childlike behavior. We want the Media Lab to be more like kindergarten and less like a lumber mill.

• “Prediction is difficult; particularly the future.” – Niels Bohr

Acknowledgements

• Boris Freidlin

• Wenyu Jiang

• Stella Karuri

• Aboubakar Maitournam

• Michael Radmacher

• Jyothi Subramanian

• Ahrim Youn

• Yingdong Zhao
