BreastPRS is a Gene Expression Assay that Stratifies Intermediate-Risk Oncotype DX Patients into High or Low-Risk for Disease Recurrence

PRECLINICAL STUDY


Timothy M. D’Alfonso • Ryan K. van Laar • Linda T. Vahdat • Wasay Hussain •

Rachel Flinchum • Nathan Brown • Linda Saint John • Sandra J. Shin

Received: 2 May 2013 / Accepted: 8 June 2013 / Published online: 18 June 2013

© Springer Science+Business Media New York 2013

Abstract Molecular prognostic assays, such as Oncotype

DX, are increasingly incorporated into the management of

patients with invasive breast carcinoma. BreastPRS is a new

molecular assay developed and validated from a meta-

analysis of publicly available genomic datasets. We

applied the assay to matched fresh-frozen (FF) and formalin-

fixed paraffin-embedded (FFPE) tumor samples to translate

the assay to FFPE. A linear relationship of the BreastPRS

prognostic score was observed between tissue preservation

formats. BreastPRS recurrence scores were compared with

Oncotype DX recurrence scores from 246 patients with

invasive breast carcinoma and known Oncotype DX results.

Using this series, a 120-gene Oncotype DX approximation

algorithm was trained to predict Oncotype DX risk groups

and then applied to series of untreated, node-negative,

estrogen receptor (ER)-positive patients from previously

published studies with known clinical outcomes. Correlation

of recurrence score and risk group between Oncotype DX

and BreastPRS was statistically significant (P < 0.0001). 59

of 260 (23 %) patients from four previously published

studies were classified as intermediate-risk when the

120-gene Oncotype DX approximation algorithm was

applied. BreastPRS reclassified the 59 patients into binary

risk groups (high- vs. low-risk). 23 (39 %) patients were

classified as low-risk and 36 (61 %) as high-risk (P = 0.029,

HR: 3.64, 95 % CI: 1.40–9.50). At 10 years from diagnosis,

the low-risk group had a 90 % recurrence-free survival

(RFS) rate compared to 60 % for the high-risk group.

BreastPRS recurrence score is comparable with Oncotype

DX and can reclassify Oncotype DX intermediate-risk

patients into two groups with significant differences in RFS.

Further studies are needed to validate these findings.

Keywords BreastPRS • Oncotype DX • Breast cancer

recurrence • Microarray

Introduction

Clinicians are increasingly incorporating and utilizing

genomic information of patients’ breast tumors via multi-

gene prognostic signatures to guide treatment recommen-

dations. These molecular assays, combined with traditional

clinical and pathologic variables, are used to determine the

risk of cancer recurrence and the benefits of adding che-

motherapy to a patient’s treatment regimen. A number of

prognostic and predictive multigene assays are commer-

cially available for this purpose, of which the Oncotype DX

is currently the most popular for luminal subtypes of node-

negative (N0) breast cancer patients.

BreastPRS (Signal Genetics) is a new molecular char-

acterization assay developed and validated from a meta-

analysis of publicly available genomic datasets. BreastPRS

is unique in that the 200 genes utilized in its algorithm

Electronic supplementary material The online version of this article (doi:10.1007/s10549-013-2604-0) contains supplementary material, which is available to authorized users.

T. M. D’Alfonso (✉) • S. J. Shin

Department of Pathology and Laboratory Medicine,

New York-Presbyterian Hospital, Weill Cornell Medical

College, 525 East 68th Street, Starr 1031E, New York,

NY 10065, USA

e-mail: [email protected]

R. K. van Laar • W. Hussain • R. Flinchum • N. Brown • L. S. John

Signal Genetics, LLC, New York, NY, USA

L. T. Vahdat

Department of Hematology/Oncology, Weill Cornell Medical

College, New York, NY, USA


Breast Cancer Res Treat (2013) 139:705–715

DOI 10.1007/s10549-013-2604-0


(validated in a large series of breast cancer patients) were

significantly associated with RFS, independent of tradi-

tional prognostic variables including age, tumor size, ER

status, tumor grade, and nodal involvement [1]. In contrast

to Oncotype DX, BreastPRS is a binary assay which

stratifies patients into low- and high-risk groups.

In this study, we sought to (i) translate the previously

published 200-gene prognostic signature from fresh frozen

(FF) to formalin-fixed paraffin-embedded (FFPE) tissue,

(ii) compare the BreastPRS prognostic index to the Onc-

otype DX assay using FFPE patient specimens analyzed by

both methods and correlate recurrence scores with clinicopathologic

features, and (iii) use publicly available

whole genome profiles from series of untreated ER+ N0

patients to investigate the ability of BreastPRS to reclassify

Oncotype DX intermediate-risk patients into binary risk

categories (high- vs. low-risk) with clinically significant

differences in outcome. The ultimate goal was to assist

clinicians with decision making for patients whose tumors

are classified as intermediate-risk by Oncotype DX.

Materials and methods

Translation of BreastPRS from FF to FFPE tissue

The 200-gene prognosis signature within BreastPRS was

originally developed from gene expression data generated

from FF breast cancer tissue. In order to translate this signature

for use with FFPE tissue, RNA from FF and FFPE portions of

the same 35 individual breast tumors was obtained from a

commercial tissue repository (BioServe, Beltsville, MD, USA)

(Table 1). Pre-isolated RNA from the FF portion of each tumor

was supplied by BioServe and hybridized to Affymetrix U133

GeneChips according to manufacturer recommendations. For

the FFPE counterparts, RNA was isolated and amplified from

five 10-µm sections of each specimen using the Ovation FFPE

WTA System (NuGen Inc., San Carlos, CA, USA). A minimum

tumor cell content of >50 % was verified by the supplier.

Amplified cDNA was fragmented, labeled, and hybridized to a

Human Genome U133 Plus 2.0 GeneChip according to man-

ufacturer recommendations.

As an additional FF to FFPE validation series, a 20-patient

series of matched FF and FFPE breast cancer data was

downloaded from National Center for Biotechnology Infor-

mation (NCBI) Gene Expression Omnibus (GEO) [2]. In both

series, the prognostic signature was applied to each genomic

profile (averaged across technical replicates where present)

and the resulting risk scores were compared between tissue

preservation methods. Passing and Bablok regression, a pro-

cedure with no special assumptions regarding the distribution

of the samples and the measurement errors, was used to assess

linearity and consistency [3]. The regression equation from

this analysis was then used to adjust the low-/high-risk

threshold for FFPE tissue, to maintain the previously pub-

lished characteristics of the signature [1].
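
The Passing and Bablok procedure described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the software used in the study; the function name `passing_bablok` and the toy scores are hypothetical, and confidence intervals and the linearity test are omitted.

```python
from statistics import median


def passing_bablok(x, y):
    """Estimate slope and intercept with the Passing-Bablok shifted median."""
    n = len(x)
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            dy = y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                slopes.append(float("inf") if dy > 0 else float("-inf"))
                continue
            s = dy / dx
            if s == -1:
                continue  # slopes of exactly -1 are excluded by the method
            slopes.append(s)
    slopes.sort()
    n_slopes = len(slopes)
    # Offset K (number of slopes below -1) shifts the median.
    k = sum(1 for s in slopes if s < -1)
    if n_slopes % 2 == 1:
        slope = slopes[(n_slopes + 1) // 2 + k - 1]
    else:
        slope = 0.5 * (slopes[n_slopes // 2 + k - 1] + slopes[n_slopes // 2 + k])
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical paired scores lying exactly on the line y = 2x + 1.
ff_scores = [-1.0, -0.5, 0.0, 0.5, 1.0]
ffpe_scores = [2 * v + 1 for v in ff_scores]
slope, intercept = passing_bablok(ff_scores, ffpe_scores)
```

Because the estimate is built from medians of pairwise slopes, a few outlying pairs have limited influence, which is why the method suits method-comparison data such as paired FF and FFPE scores.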

Gene expression profiling of FFPE breast tumors

previously analyzed with Oncotype DX

Two hundred eighty-four patients with consecutively

diagnosed invasive breast carcinoma and known Oncotype

DX recurrence scores performed as part of their routine

clinical care were identified and retrieved from pathology

files at Weill Cornell Medical College. Unstained FFPE

slides from representative tumor blocks were used for gene

array analysis. RNA was isolated after manual microdis-

section from tissue slides using Prelude FFPE RNA Iso-

lation Module, part no. 1410-50 (NuGen Inc., San Carlos,

CA, USA). The isolated RNA was converted to cDNA,

amplified, and then hybridized to Affymetrix U133 Plus 2.0

Whole Genome microarrays. Normalized gene expression

profiles were generated using the MAS5 probe summarization

algorithm and annotated with the Bioconductor “AnnotationData”

package, release 2.11 [4]. For quality-

control purposes, gene chips with fewer than 25 % pro-

beset detection were excluded from further analysis.

Creation of a novel gene expression signature to predict

a patient’s Oncotype DX risk group from microarray

data

After excluding the 21 genes used by the proprietary

Genomic Health recurrence score algorithm, a new signa-

ture was developed to predict the Oncotype DX risk group

of a breast cancer specimen using Affymetrix U133 Plus

2.0 data. To create the signature, the significance of asso-

ciation between each Affymetrix probe and Oncotype DX

risk group (low, intermediate, or high) was determined,

using an F test. A strict P value threshold of 1 × 10⁻⁶, chosen to

reduce the number of false-positive probes, was used to

select the most discriminatory genes whose association

with Oncotype DX risk group in the training series was

visualized with hierarchical clustering. In order to apply

the gene signature to archival breast cancer specimens with

outcome data, the gene signature was used to train a

diagonal linear discriminant analysis (LDA) classifier

using partial cross validation [5]. The trained algorithm,

which will be referred to as the Oncotype DX approxi-

mation algorithm, was applied to gene expression profiles

assembled from public repositories (NCBI GEO) from

patients from four previously published studies [6–9] and

classified each patient as low, intermediate, or high Oncotype

DX risk group. Only those patients with ER+, N0

disease who did not receive adjuvant therapy were included

in this analysis. Those predicted to be intermediate-risk


were reanalyzed using the BreastPRS prognostic signature

and the reclassified high- and low-risk groups were com-

pared to known outcomes. A flow diagram of the study

design is shown in Fig. 1.
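
To make the classifier step concrete, the following is a minimal sketch of a diagonal LDA under stated assumptions: equal class priors, a pooled per-feature variance, and no cross-validation. The function names `train_dlda` and `predict_dlda` and the two-feature toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def train_dlda(samples, labels):
    """Fit class means and a pooled per-feature variance (diagonal covariance)."""
    n_features = len(samples[0])
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)
    means = {
        label: [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        for label, rows in by_class.items()
    }
    dof = len(samples) - len(by_class)
    pooled_var = []
    for j in range(n_features):
        ss = sum(
            (r[j] - means[label][j]) ** 2
            for label, rows in by_class.items()
            for r in rows
        )
        pooled_var.append(ss / dof)
    return means, pooled_var


def predict_dlda(model, row):
    """Assign the class whose mean is nearest in variance-scaled distance."""
    means, pooled_var = model

    def distance(label):
        return sum(
            (v - m) ** 2 / s for v, m, s in zip(row, means[label], pooled_var)
        )

    return min(means, key=distance)


# Hypothetical expression values for two probes across three risk groups.
train_x = [
    [1.0, 5.0], [1.2, 5.1], [0.8, 4.9],
    [2.0, 4.0], [2.2, 4.1], [1.8, 3.9],
    [3.0, 3.0], [3.2, 3.1], [2.8, 2.9],
]
train_y = ["low"] * 3 + ["intermediate"] * 3 + ["high"] * 3
model = train_dlda(train_x, train_y)
predicted = predict_dlda(model, [2.9, 3.0])
```

Ignoring covariances between genes keeps the number of estimated parameters small, which is the usual reason for preferring the diagonal variant when genes far outnumber samples.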

Data management and statistical analysis

Statistical analyses were performed using Microsoft Excel,

R and Medcalc (version 12.3.0). All t tests were two sided

and P values below 0.05 were considered statistically

significant. To compare differences in RFS between risk

groups on a univariate level, Kaplan–Meier analysis and

log-rank testing were performed. To evaluate the differ-

ences between risk groups on a multivariate level, Cox

proportional hazards regression analysis was used, includ-

ing tumor grade, size, and BreastPRS risk group.
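
As an illustration of the Kaplan–Meier estimate mentioned above, the sketch below computes the product-limit survival curve with the Python standard library. It is not the MedCalc or R procedure used in the study; the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  -- follow-up time per patient
    events -- 1 if recurrence was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in years; 1 = recurrence, 0 = censored.
curve = kaplan_meier([2, 4, 4, 6, 8], [1, 1, 0, 1, 0])
```

Censored patients contribute to the number at risk until their follow-up ends but never lower the curve themselves.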

Results were analyzed using the Signal Genetics

ResultsPX platform and R (www.r-project.org). The

Table 1 Patient demographics from breast cancer series used in this study

| Characteristic | Fresh frozen to FFPE series | Weill Cornell Oncotype DX series | Archival ER+, node-negative series |
|---|---|---|---|
| Number | 55 | 246 | 260 |
| Age (mean years) | 60 (NA: 20) | 57 | 49 (NA: 34) |
| Histologic grade | | | |
| 1 | – | 43 | 54 |
| 2 | – | 116 | 150 |
| 3 | – | 50 | 42 |
| Lobular (CT/PT) | – | 32/5 | NA |
| NA | – | 0 | 14 |
| T size | | | |
| T1 | – | 214 | 146 |
| T2 | – | 31 | 110 |
| T3 | – | 1 | 3 |
| NA | | | |
| Ax. lymph node status | | | |
| Positive | – | 37 | 0 |
| Negative | – | 209 | 260 |
| ER status | | | |
| Positive | 32 | 242 | 260 |
| Negative | 20 | 3 | 0 |
| NA | 3 | 1 | 0 |
| Stage | | | |
| I | 1 | 215 | 146 |
| II | 28 | 31 | 110 |
| III | 0 | 0 | 3 |
| NA | 26 | 0 | 1 |
| Recurrence | | | |
| Yes | – | – | 56 |
| No | – | – | 204 |
| Follow-up (median years) | – | – | 9.9 |
| Gene expression repository IDs | ArrayExpress: E-TABM-108 | GSE47109 | NCBI GEO: GSE11121, GSE4922, GSE6352, GSE7390 |
| Gene expression platform | Affymetrix GeneChip U133 Plus 2.0 (100); Illumina HumanRef-8 Expression BeadChip (90) | Affymetrix GeneChip U133 Plus 2.0 | Affymetrix GeneChip U133 Plus 2.0 |

FFPE formalin-fixed paraffin embedded, ER estrogen receptor, NA not available, Ax axillary, CT classical type, PT pleomorphic type


BreastPRS prognostic score was compared with known

clinicopathologic variables (age, tumor size, nodal status,

ER/PR/HER2 status, Ki-67, lymphovascular invasion) and

Oncotype DX recurrence score for each patient. Data from

the 284 patient training series have been deposited at NCBI

GEO under accession GSE47109.

Results

Patient demographics from the series used in this study are

summarized in Table 1.

Translation of BreastPRS from FF to FFPE tissue

The previously published 200-gene prognosis score was

calculated for 35 matched pairs of microarray profiles for

FF and FFPE breast cancer specimens which passed RNA

and GeneChip quality metrics, as described. The r² and

intraclass correlation coefficients for paired measurements

were 0.67 and 0.90, respectively. Passing and Bablok

regression analysis revealed the relationship between tissue

types exhibited no significant deviation from linearity

(P = 0.91), as shown in Fig. 2a. When the regression

equation [FFPE score = -0.265 + 1.051 × FF score] is

applied to the previously determined FF-based classifica-

tion threshold for high-/low-risk (i.e., >-0.38 = high-

risk), the FFPE threshold was determined to be >-0.63.

This adjustment ensures that the performance of the high-/

low-risk groups is consistent with those previously pub-

lished for FF tissue, for example in terms of recurrence-free

and overall survival.

In order to further verify the threshold adjustment,

microarray data from a published series of 20 paired FF

and FFPE breast cancer specimens were analyzed [2]

(Fig. 2b). These specimens were profiled (with technical

replication) using the Illumina Whole-Genome DASL

Assay which contains 24,526 transcripts. Probe mapping

identified 169 of the 200 (85 %) genes in the BreastPRS

prognostic signature, therefore the complete algorithm

could not be applied. The prognostic signature was calcu-

lated on this subset and Passing and Bablok regression

showed the relationship between tissue preservation

methods to be [FFPE score = 1.038 × FF score - 0.154].

In general, both datasets revealed a linear relationship

between the BreastPRS prognostic score when measured in

paired FF and FFPE specimens. Regression analysis

showed that an adjustment to the classification threshold of

-0.246 was necessary to perform BreastPRS using FFPE

tissue and maintain the previously published performance

characteristics of the low- and high-risk groups. Perform-

ing a similar comparison on a different patient series pro-

filed on a different platform (Illumina) showed an

adjustment of a similar magnitude would be necessary if

BreastPRS was to be performed on this system.

For comparison purposes with our own dataset (Af-

fymetrix), we computed a 169-gene version of the prognostic

score and applied it to the 35 pairs of FF and FFPE specimens

in this validation series (Fig. 2c). Regression analysis of this

subset algorithm gave a relationship

similar to that observed for the Mittempergher/Illumina

series: FFPE score = -0.214 + 1.005 × FF score.

Comparison of BreastPRS prognostic indices to official

Oncotype DX recurrence scores

Of the 284 cases of invasive carcinoma with known Onco-

type DX recurrence scores, 246 cases passed all RNA and

Fig. 1 Flow diagram

summarizing analytical aspects

of validation study



chip-related quality controls. The two prognostic algorithms

were compared using whole genome profiles from these

cases (Fig. 3a). Both metrics were designed to be continu-

ously associated with risk of disease recurrence, reflected in

the interclass correlation coefficient of 0.73 (95 % CI:

0.65–0.79), indicating a moderately strong positive rela-

tionship between the two metrics. Figure 3b shows a poly-

nomial regression line fitted to the dataset, suggesting the

spread of risk scores may be greater in BreastPRS as com-

pared to Oncotype DX, where more scores are compressed in

the low-to-middle range distribution. For Oncotype DX,

low-risk = <18, intermediate-risk = 18–30, and high-

risk = >30. BreastPRS classified patients as only high- or

low-risk, based on the threshold of high-risk = 34 or greater.

Of the 30 high-risk Oncotype DX cases, 27 (90 %) were

classified as high-risk by BreastPRS (Table 2). Ninety-five

low-risk Oncotype DX cases (76 %) were classified as low-

risk by BreastPRS. Interestingly, a majority (60 %) of cases

classified as intermediate-risk by Oncotype DX were clas-

sified as low-risk by BreastPRS, a group which has previ-

ously been shown to not benefit from adjuvant chemotherapy

treatment [10].
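
The risk-group cut-offs and the cross-tabulation reported in Table 2 can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the counts are those published in Table 2, and the P value uses the closed-form upper tail of the chi-square distribution, which holds only for exactly two degrees of freedom.

```python
import math


def oncotype_risk_group(recurrence_score):
    """Apply the cut-offs quoted in the text: <18 low, 18-30 intermediate, >30 high."""
    if recurrence_score < 18:
        return "low"
    if recurrence_score <= 30:
        return "intermediate"
    return "high"


# Rows: Oncotype DX risk group; columns: BreastPRS (low-risk, high-risk), from Table 2.
TABLE_2 = {
    "low": (95, 30),
    "intermediate": (55, 36),
    "high": (3, 27),
}


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, dof


chi2, dof = chi_square_independence(TABLE_2)
# For 2 degrees of freedom the chi-square upper tail equals exp(-x / 2).
p_value = math.exp(-chi2 / 2)
```

Under these assumptions the statistic is large enough that the resulting P value falls below 0.0001, in line with the value reported for Table 2.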

Identification of Oncotype DX intermediate-risk

patients in public gene expression profile data

repository

After identifying a novel set of genes able to hierarchically

cluster Affymetrix profiles into three groups which corre-

spond to Oncotype DX low-, intermediate-, and high-risk

groups, an Oncotype DX approximation algorithm was

created to predict Oncotype DX risk groups of genomic

profiles from other datasets. Partial cross validation of the

120-gene Oncotype DX approximation algorithm on the

Oncotype DX training series reproduced accuracy of the

hierarchical clustering in terms of risk group assignment and

official Oncotype DX classification (Fig. 4). Overall, 72 %

of patients were predicted correctly by the 120-gene algo-

rithm with a mean cross-validation sensitivity of 68 %,

specificity of 84 %, positive predictive value of 69 %, and

negative predictive value of 85 %. Variation between the

official and predicted Oncotype DX risk groups may be

attributable to intratumoral heterogeneity, differences in

measuring gene expression using qPCR versus microarray,

or the general reproducibility of the 21-gene signature itself.

Fig. 2 Passing and Bablok Regression of the BreastPRS prognostic

score calculated on paired FF and FFPE specimens. a Affymetrix

200-gene signature in 35 pairs of FF/FFPE RNA. FFPE score =

-0.265 + 1.051 × FF score. b Illumina 169-gene signature in 20

pairs of FF/FFPE RNA. FFPE score = 1.038 × FF score - 0.154.

c Affymetrix 169-gene signature in same series of patients as a. FFPE

score = -0.214 + 1.005 × FF score. FF fresh frozen, FFPE formalin-fixed

paraffin-embedded, RNA ribonucleic acid



For comparison purposes, a separate cross-validation exer-

cise was performed using the 16 genes used by Genomic

Health to perform the Oncotype DX assay (after renormal-

izing to the 5 Oncotype normalization genes) as published

[11]. With this method, only an additional 3 % of patients

analyzed were classified into the same risk group as the

official Oncotype DX risk group (data not shown), a statis-

tically insignificant difference, as determined by a t test of

proportions (P = 0.63). Therefore, when a tumor is profiled

using the Affymetrix platform, it appears that using the 21

gene (patented) Oncotype DX signature does not reproduce

the commercial qPCR assay with significantly greater

accuracy than using a novel 120-gene Oncotype DX

approximation algorithm.

Within the 120-gene Oncotype DX approximation

algorithm predicted risk groups, it was the intermediate-risk

group which differed most between the official 21-gene

PCR method risk groups and those predicted by the

120-gene approximation algorithm. Of the ‘‘true’’ inter-

mediate-risk patients, 73 % were classified as intermediate-

risk by the 120-gene approximation algorithm. For gene set

comparison purposes, the Affymetrix implementation of the

‘‘official’’ 21-gene signature was also evaluated. Using the

21-gene set, 69 % of true intermediate-risk patients were

classified as such. This suggests that the 120-gene Oncotype

DX approximation algorithm is more accurate at identifying

‘‘true’’ Oncotype DX intermediate-risk patients in previ-

ously published breast cancer gene expression datasets.

Performing an additional analysis incorporating gene reselection

in each loop of the cross-validation procedure did

not improve classification accuracy (data not shown).

Application of the 120-gene Oncotype DX

approximation algorithm to expression profiles

from historical datasets to classify patients

into predicted Oncotype DX risk groups

The 120-gene Oncotype DX approximation algorithm was

applied to the whole genome expression profiles of 260

ER+, N0, untreated (i.e., no Tamoxifen or chemotherapy)

breast cancer patients from four previously published

studies [6–9], with recurrence-related outcome data avail-

able. Application of the 120-gene Oncotype DX approxi-

mation algorithm classified 169 (65 %) as low-risk, 59

(23 %) as intermediate-risk, and 32 (12 %) as high-risk.

Kaplan–Meier analysis of these risk groups approached

statistical significance in 15-year RFS (log-rank test

P value 0.088) (Fig. 5a). In a multivariate analysis, the

difference in RFS between the high-risk and low-risk

groups was significant (P = 0.043), but not between the

high- and intermediate-risk groups (P = 0.55).

Because this series of patients was compiled from four

historical datasets, each with its own patient selection

criteria, the proportion of patients in each risk group was not

expected to resemble that observed in the general population

or target demographic of the Oncotype DX assay. Despite

this, the proportions observed are similar to those reported in

Fig. 3 a Scatter plot of 246 FFPE breast cancer specimens analyzed

by both Oncotype DX and BreastPRS. Cases are colored according to

the risk group of each assay. Both prognosis scores were created to be

continuous markers of worsening prognosis with higher scores

correlating with higher rates of outcome and poor overall survival.

b Polynomial regression fitted line exploring the relationship between

the Oncotype DX recurrence score and BreastPRS prognosis index.

This plot suggests that the spread of risk scores may be greater in

BreastPRS as compared to Oncotype DX, where more scores are

compressed in the low-to-middle range distribution. OT Oncotype

DX, FFPE formalin-fixed paraffin-embedded

Table 2 Direct comparison of risk group stratification by Oncotype DX (21-gene qPCR) and BreastPRS (200-gene Affymetrix signature)

| Oncotype DX risk | Total | BreastPRS low-risk | BreastPRS high-risk |
|---|---|---|---|
| Low | 125 | 95 (76 %) | 30 (24 %) |
| Intermediate | 91 | 55 (60 %) | 36 (40 %) |
| High | 30 | 3 (10 %) | 27 (90 %) |
| Total | 246 | 153 | 93 |

χ² P value for risk group association <0.0001



the NSABP B-14 trial, i.e., low-risk (51 %), intermediate-

risk (22 %), and high-risk (27 %) [11]. When the 200-gene

BreastPRS algorithm is applied to expression profiles from

this set of patients (Fig. 5b), 129 (49.6 %) were classified as

low-risk and 131 (50.4 %) were classified as high-risk with a

significant difference in 15-year RFS (log-rank test

P value = 0.0001). In a multivariate model with grade and

tumor size, the BreastPRS signature remains statistically

significant (P < 0.0001) with a hazard ratio of 3.94 (95 %

CI: 1.99–7.78) as shown in Table 3.

BreastPRS was then applied to the 169 cases classified

as low-risk by the 120-gene Oncotype DX approximation

Fig. 4 Hierarchical clustering of the 120-gene Oncotype DX approximation algorithm identified by comparing whole genome profiles of

patients’ samples previously analyzed by Oncotype DX



algorithm to investigate the inconsistent risk group

assignments of cases from the Weill Cornell series that

were classified as low-risk by Oncotype DX and high-risk

by BreastPRS (24 % of low-risk cases, Table 2). Sixty-

three of the 169 (37 %) low-risk cases were classified as

high-risk by BreastPRS. Kaplan–Meier analysis was

Fig. 5 Kaplan–Meier analyses of a all ER+, node-negative,

untreated archival patients stratified by 120-gene Oncotype DX

approximation algorithm risk groups (n = 260) P = 0.088 and b by

200-gene BreastPRS algorithm risk groups (n = 260) P = 0.0001,

HR: 3.00 (95 % CI: 1.77–5.08). c Archival ER+, node-negative,

untreated breast cancer patients predicted to be Oncotype DX

intermediate-risk by the 120-gene approximation signature (from a),

reclassified as high- or low-risk by BreastPRS (n = 59) P = 0.029,

HR: 3.64 (95 % CI: 1.40–9.50). ER estrogen receptor, HR hazard

ratio, CI confidence interval



performed on these 63 and compared with the remaining

106 that were classified as low-risk by both methods. The

63 cases classified as high-risk by BreastPRS experienced

significantly shorter RFS compared to the cases classified

as low-risk by both methods [hazard ratio: 2.96 (95 % CI:

1.39–6.28), P = 0.0010]. At 10 years from diagnosis, the

chance of recurrence is 25 % lower in the low-risk group

compared to the high-risk group (90 vs. 65 %) and 15 %

lower at 15 years from diagnosis.

Subgroup analysis of Oncotype DX intermediate-risk

patients by BreastPRS

To test the hypothesis that BreastPRS is able to reclassify

Oncotype DX intermediate-risk patients as high- or low-

risk, the 59 (ER+, N0, untreated) intermediate-risk patients

identified in the retrospective series were analyzed by

BreastPRS (Fig. 5c). Twenty-three (39 %) patients were

classified as low-risk and 36 (61 %) as high-risk. The

hazard ratio of a high-risk classification was 3.64 (95 % CI:

1.40 to 9.50) and log-rank testing indicated that the results

were statistically significant (P = 0.029). At 10 years from

diagnosis, the low-risk group had a 90 % RFS rate, com-

pared to 60 % for the high-risk group. When adjusted for

tumor grade and size in a multivariate Cox proportional

hazards model (Table 3), BreastPRS was the closest vari-

able to achieving statistical significance (P = 0.055, HR:

3.58, 95 % CI: 0.97–13.14). These data indicate that

BreastPRS is able to reclassify Oncotype DX intermediate-

risk patients as high- or low-risk for disease recurrence,

independent of clinical variables of size and tumor grade.
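
The two-group log-rank comparison used for this subgroup analysis can be sketched as follows. This is a minimal standard-library illustration, not the study's software; the function name `logrank_test` and the follow-up times are hypothetical, and the P value relies on the closed-form tail of a chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical recurrence times in years; every patient has an observed event.
high_risk_times = [1, 2, 3, 4, 5]
low_risk_times = [10, 11, 12, 13, 14]
chi2, p_value = logrank_test(high_risk_times, [1] * 5, low_risk_times, [1] * 5)
```

The test compares observed and expected events in one group at every event time, so it uses the whole follow-up period rather than a single time point such as 10 years.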

Discussion

While current guidelines recommend consideration of

chemotherapy for the majority of patients with invasive

breast carcinoma [12], most patients with small, ER+

tumors will not gain additional benefit from adding adju-

vant chemotherapy to Tamoxifen, and can likely be spared

the toxicities of the former. Clinicians have traditionally

relied on clinical and pathologic factors including tumor

size, axillary lymph node status, histologic grade, and

hormone receptor status when assessing the need for

adjuvant chemotherapy in patients with early breast cancer.

Recent advances in gene expression profiling and micro-

array technology have led to a greater understanding of the

biology of breast cancer at the molecular level. This

coincides with advances in imaging techniques that have

led to an increase in the detection of smaller invasive

cancers. Clinicians are increasingly incorporating genomic

data from patients’ tumors via multigene prognostic sig-

natures to gather additional information, particularly the

risk of distant recurrence, to aid in deciding as to whether

Table 3 Multivariate analysis of BreastPRS and the 120-gene Oncotype DX approximation algorithm (Cox proportional hazards ratio and P value)

| Description | Covariate | P value | HR | 95 % CI of HR |
|---|---|---|---|---|
| NCBI GEO archival series with BreastPRS (n = 260) | Grade 2 vs. 1 | 0.274 | 1.64 | 0.68–3.93 |
| | Grade 3 vs. 1 | 0.825 | 1.12 | 0.40–3.13 |
| | Size T2 vs. T1 | 0.011 | 2.15 | 1.19–3.85 |
| | Size T3 vs. T1 | 0.005 | 5.97 | 1.74–20.51 |
| | BreastPRS high- vs. low-risk | <0.001 | 3.94 | 1.99–7.78 |
| NCBI GEO archival series with 120-gene Oncotype DX approximation algorithm (n = 260) | Grade 2 vs. 1 | 0.224 | 1.73 | 0.7154–4.22 |
| | Grade 3 vs. 1 | 0.593 | 1.34 | 0.45–4.03 |
| | Size T2 vs. T1 | 0.037 | 1.85 | 1.04–3.29 |
| | Size T3 vs. T1 | <0.001 | 13.08 | 3.71–46.08 |
| | 120-gene Oncotype approximation high- vs. intermediate-risk | 0.545 | 1.32 | 0.54–3.23 |
| | 120-gene Oncotype approximation high- vs. low-risk | 0.043 | 2.49 | 0.16–0.97 |
| 120-gene predicted Oncotype intermediate-risk patients with size and grade information (n = 59) | Grade 2 vs. 1 | 0.926 | 0.90 | 0.11–7.37 |
| | Grade 3 vs. 1 | 0.834 | 0.77 | 0.07–8.40 |
| | Size T2 vs. T1 | 0.087 | 2.73 | 0.86–8.63 |
| | BreastPRS high- vs. low-risk | 0.055 | 3.58 | 0.97–13.14 |

NCBI GEO National Center for Biotechnology Information Gene Expression Omnibus, LDA linear discriminant analysis, HR hazard ratio, CI confidence interval



or not to add chemotherapy to a patient’s treatment regi-

men. The 21-gene Oncotype DX RT-PCR test (Genomic

Health, Redwood City, CA, USA) is currently the most

widely used breast cancer prognostic assay, largely due to

its performance on routinely prepared FFPE tissue blocks.

The assay was developed and validated from large series of

patients from National Surgical Adjuvant Breast and

Bowel Project (NSABP) trials, as well as other independent

studies and has been shown to be both prognostic of breast

cancer recurrence and predictive of chemotherapy benefit

[10, 11, 13, 14]. The Oncotype DX assay stratifies patients

into low-, intermediate-, and high-risk groups, corre-

sponding to risk of recurrence at 10 years in ER+, N0

patients treated with Tamoxifen. High-risk patients have

been shown to benefit from chemotherapy. The benefit of

adjuvant chemotherapy in the intermediate-risk group is

uncertain and is currently under investigation [15, 16].

BreastPRS is a 200-gene microarray-based prognostic

signature that was generated and validated on multiple

independent series of breast cancer patients using publicly

available gene expression profiles [1]. In a prior study, the

BreastPRS algorithm was applied to expression profiles of

1,016 patients, and separated them into risk groups with

significant differences in recurrence-free and overall sur-

vival [1]. In untreated, N0 patients, the sensitivity and

specificity of the assay for predicting RFS were 88 and

44 %, respectively, with positive and negative predictive

values of 30.5 and 92 %, respectively. In this study, we

compared the BreastPRS and Oncotype DX assays using

FFPE tissue from a series of patients treated at our institution,

as well as publicly available gene expression

profiles from the GEO, a public repository maintained by

the NCBI (http://www.ncbi.nlm.nih.gov/geo/). GEO serves

as a central database for high-throughput microarray and

next-generation sequencing data that is submitted by

researchers and is freely available to the public.

A major strength of Oncotype DX is the ability to per-

form the assay on FFPE tissue, as preserving tumor sam-

ples as FF tissue is not practical for routine clinical care at

most institutions. Moreover, the ability of an assay to be

performed on FFPE tissue allows retrospective study of

large cohorts of patients for validation. The BreastPRS

algorithm was originally developed from expression pro-

files of FF tumor tissue. The aim of this study was to

determine whether the BreastPRS algorithm could be

translated to FFPE and this was accomplished using mat-

ched pairs of FF and FFPE tissue from the same tumors.

Here we show that a linear relationship exists between

BreastPRS scores generated from FF and FFPE tissue with

an intraclass correlation coefficient of 0.90, indicating

strong positive agreement. Overall the FFPE specimens

resulted in lower BreastPRS scores when compared to the

paired FF specimen, therefore linear regression analysis

was used to adjust the threshold for high-/low-risk group

classification accordingly.

Next, we compared the Oncotype DX and BreastPRS

algorithms on a series of patients with known Oncotype DX

results performed as a part of their clinical care. We found

significant correlation between the prognostic metrics gen-

erated by Oncotype DX and BreastPRS when applied to

genomic profiles from the studied patients, with 90 % of

Oncotype DX high-risk cases predicted to be also high-risk

by BreastPRS. Among low-risk Oncotype DX patients, 24 %

were reclassified as high-risk by BreastPRS, a change in

classification that would have important prognostic and

treatment implications. Because outcome data were not

available for the Weill Cornell series, we investigated this

inconsistency in patients from the archival ER+, untreated,

N0 series used in this study. BreastPRS was applied to

tumors classified as low-risk by the 120-gene Oncotype DX

approximation algorithm. Patients in this group that were

reclassified as high-risk by BreastPRS showed significantly

shorter RFS compared with those classified as low-risk by

both assays. This finding raises the possibility that there is a

subset of patients classified as low-risk by Oncotype DX who

may benefit from chemotherapy. Certainly, this observation

warrants further investigation and confirmation by other

independent studies.

Finally, a retrospective head-to-head comparison of BreastPRS and Oncotype DX was performed by applying a novel 120-gene Oncotype DX approximation signature, trained on commercial Oncotype DX results, to previously published series of untreated, N0, ER+ patients with outcome data. BreastPRS yielded a smaller log-rank test P value and fewer recurrences in the low-risk group than the microarray-based 120-gene approximation of Oncotype DX. Subgroup analysis of the retrospective cases classified as Oncotype DX intermediate-risk was then performed. BreastPRS was able to reclassify these patients as either high- or low-risk for recurrence, with the reclassified groups showing a statistically significant difference in outcome (hazard ratio 3.64, P = 0.029). We found that 61 % of those classified as intermediate-risk using the 120-gene Oncotype DX approximation algorithm were classified as high-risk by BreastPRS and would likely benefit from adjuvant chemotherapy. It will be interesting to see whether the results of the TAILORx trial mirror these findings.

The gene lists used by BreastPRS and Oncotype DX overlap by only two genes [Cyclin B1 (CCNB1) and Ki67 (MKI67)]. Despite this, DAVID gene ontology analysis of both gene lists reveals a number of gene families common to the two sets (Supplementary Table). Genes involved in cell cycle regulation and apoptosis are significantly represented in both sets; however, BreastPRS also contains genes involved in metabolism, intracellular organization, hormone receptor binding, migration, and immune function. Conversely, Oncotype DX contains genes involved in metal binding and tissue development, categories not significantly represented in BreastPRS.

The submission and use of GEO data sets (http://www.ncbi.nlm.nih.gov/geo/) by researchers continues to increase, and these data sets represent a powerful tool for performing meta-analyses on large numbers of samples. The use of GEO data sets is likely to increase as the availability of FFPE tissue from large trials with long-term follow-up becomes depleted. Using a combination of FFPE material and expression profiles from GEO, we have shown that the 200-gene BreastPRS prognosis algorithm is comparable to the 21-gene Oncotype DX assay and effectively separates Oncotype DX intermediate-risk cases into binary categories with significant differences in RFS. Additional validation studies are forthcoming with the goal of commercializing BreastPRS as a stand-alone breast cancer prognostic assay that can be performed on FFPE tissue.

Disclosures TMD declares no conflict of interest. RKVL is the

Head of Bioinformatics and New Product Development for Signal

Genetics and owns stock in the company. LTV declares no conflict of

interest. WH, RF, NB, and LSJ are employees of Signal Genetics. SJS

is a paid consultant of Signal Genetics.


