PRECLINICAL STUDY
BreastPRS is a gene expression assay that stratifies intermediate-risk Oncotype DX patients into high- or low-risk for disease recurrence
Timothy M. D’Alfonso • Ryan K. van Laar • Linda T. Vahdat • Wasay Hussain •
Rachel Flinchum • Nathan Brown • Linda Saint John • Sandra J. Shin
Received: 2 May 2013 / Accepted: 8 June 2013 / Published online: 18 June 2013
© Springer Science+Business Media New York 2013
Abstract Molecular prognostic assays, such as Oncotype
DX, are increasingly incorporated into the management of
patients with invasive breast carcinoma. BreastPRS is a new
molecular assay developed and validated from a meta-analysis
of publicly available genomic datasets. We
applied the assay to matched fresh-frozen (FF) and formalin-
fixed paraffin-embedded (FFPE) tumor samples to translate
the assay to FFPE. A linear relationship of the BreastPRS
prognostic score was observed between tissue preservation
formats. BreastPRS recurrence scores were compared with
Oncotype DX recurrence scores from 246 patients with
invasive breast carcinoma and known Oncotype DX results.
Using this series, a 120-gene Oncotype DX approximation
algorithm was trained to predict Oncotype DX risk groups
and then applied to series of untreated, node-negative,
estrogen receptor (ER)-positive patients from previously
published studies with known clinical outcomes. Correlation
of recurrence score and risk group between Oncotype DX
and BreastPRS was statistically significant (P < 0.0001). 59
of 260 (23 %) patients from four previously published
studies were classified as intermediate-risk when the
120-gene Oncotype DX approximation algorithm was
applied. BreastPRS reclassified the 59 patients into binary
risk groups (high- vs. low-risk). 23 (39 %) patients were
classified as low-risk and 36 (61 %) as high-risk (P = 0.029,
HR: 3.64, 95 % CI: 1.40–9.50). At 10 years from diagnosis,
the low-risk group had a 90 % recurrence-free survival
(RFS) rate compared to 60 % for the high-risk group.
BreastPRS recurrence score is comparable with Oncotype
DX and can reclassify Oncotype DX intermediate-risk
patients into two groups with significant differences in RFS.
Further studies are needed to validate these findings.
Keywords BreastPRS · Oncotype DX · Breast cancer
recurrence · Microarray
Introduction
Clinicians are increasingly incorporating and utilizing
genomic information of patients’ breast tumors via multi-
gene prognostic signatures to guide treatment recommen-
dations. These molecular assays, combined with traditional
clinical and pathologic variables, are used to determine the
risk of cancer recurrence and the benefits of adding che-
motherapy to a patient’s treatment regimen. A number of
prognostic and predictive multigene assays are commer-
cially available for this purpose, of which the Oncotype DX
is currently the most popular for luminal subtypes of node-
negative (N0) breast cancer patients.
Electronic supplementary material The online version of this article (doi:10.1007/s10549-013-2604-0) contains supplementary material, which is available to authorized users.
T. M. D’Alfonso (✉) · S. J. Shin
Department of Pathology and Laboratory Medicine,
New York-Presbyterian Hospital, Weill Cornell Medical
College, 525 East 68th Street, Starr 1031E, New York,
NY 10065, USA
e-mail: [email protected]
R. K. van Laar · W. Hussain · R. Flinchum · N. Brown · L. S. John
Signal Genetics, LLC, New York, NY, USA
L. T. Vahdat
Department of Hematology/Oncology, Weill Cornell Medical
College, New York, NY, USA
Breast Cancer Res Treat (2013) 139:705–715
DOI 10.1007/s10549-013-2604-0

BreastPRS (Signal Genetics) is a new molecular characterization
assay developed and validated from a meta-analysis
of publicly available genomic datasets. BreastPRS
is unique in that the 200 genes utilized in its algorithm
(validated in a large series of breast cancer patients) were
significantly associated with RFS, independent of tradi-
tional prognostic variables including age, tumor size, ER
status, tumor grade, and nodal involvement [1]. In contrast
to Oncotype DX, BreastPRS is a binary assay which
stratifies patients into low- and high-risk groups.
In this study, we sought to (i) translate the previously
published 200-gene prognostic signature from fresh frozen
(FF) to formalin-fixed paraffin-embedded (FFPE) tissue,
(ii) compare the BreastPRS prognostic index to the Oncotype
DX assay using FFPE patient specimens analyzed by
both methods and correlate recurrence scores with
clinicopathologic features, and (iii) use publicly available
whole genome profiles from series of untreated ER+ N0
patients to investigate the ability of BreastPRS to reclassify
Oncotype DX intermediate-risk patients into binary risk
categories (high- vs. low-risk) with clinically significant
differences in outcome. The ultimate goal was to assist
clinicians with decision making for patients whose tumors
are classified as intermediate-risk by Oncotype DX.
Materials and methods
Translation of BreastPRS from FF to FFPE tissue
The 200-gene prognosis signature within BreastPRS was
originally developed from gene expression data generated
from FF breast cancer tissue. In order to translate this signature
for use with FFPE tissue, RNA from FF and FFPE portions of
the same 35 individual breast tumors was obtained from a
commercial tissue repository (BioServe, Beltsville, MD, USA)
(Table 1). Pre-isolated RNA from the FF portion of each tumor
was supplied by BioServe and hybridized to Affymetrix U133
GeneChips according to manufacturer recommendations. For
the FFPE counterparts, RNA was isolated and amplified from
five 10-μm sections of each specimen using the Ovation FFPE
WTA System (NuGen Inc., San Carlos, CA, USA). A minimum
tumor cell content of >50 % was verified by the supplier.
Amplified cDNA was fragmented, labeled, and hybridized to a
Human Genome U133 Plus 2.0 GeneChip according to man-
ufacturer recommendations.
As an additional FF to FFPE validation series, a 20-patient
series of matched FF and FFPE breast cancer data was
downloaded from National Center for Biotechnology Infor-
mation (NCBI) Gene Expression Omnibus (GEO) [2]. In both
series, the prognostic signature was applied to each genomic
profile (averaged across technical replicates where present)
and the resulting risk scores were compared between tissue
preservation methods. Passing and Bablok regression, a pro-
cedure with no special assumptions regarding the distribution
of the samples and the measurement errors, was used to assess
linearity and consistency [3]. The regression equation from
this analysis was then used to adjust the low-/high-risk
threshold for FFPE tissue, to maintain the previously pub-
lished characteristics of the signature [1].
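The Passing and Bablok fit described above can be sketched in a few lines. This is a minimal illustration that assumes the standard shifted-median formulation of the method; it is not the routine the authors ran, and the function name `passing_bablok` and the paired scores are hypothetical.

```python
from itertools import combinations
from statistics import median


def passing_bablok(x, y):
    """Return (intercept, slope) of a Passing-Bablok style fit of y on x."""
    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # identical points carry no slope information
        if dx == 0:
            slopes.append(float("inf") if dy > 0 else float("-inf"))
            continue
        s = dy / dx
        if s == -1:
            continue  # slopes of exactly -1 are discarded by the method
        slopes.append(s)
    slopes.sort()
    n = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset for the shifted median
    if n % 2 == 1:
        slope = slopes[(n + 1) // 2 + k - 1]
    else:
        slope = 0.5 * (slopes[n // 2 + k - 1] + slopes[n // 2 + k])
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope


# Hypothetical paired risk scores (FF on x, FFPE on y), for illustration only.
ff_scores = [-1.2, -0.8, -0.4, 0.0, 0.5, 0.9]
ffpe_scores = [-1.5, -1.1, -0.7, -0.3, 0.3, 0.7]
intercept, slope = passing_bablok(ff_scores, ffpe_scores)
print(f"FFPE score = {intercept:.3f} + {slope:.3f} x FF score")
```

Because the slope is a median of pairwise slopes, a few outlying pairs have little influence on the fitted line, which is one reason the method suits comparisons between measurement formats.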
Gene expression profiling of FFPE breast tumors
previously analyzed with Oncotype DX
Two hundred eighty-four patients with consecutively
diagnosed invasive breast carcinoma and known Oncotype
DX recurrence scores performed as part of their routine
clinical care were identified and retrieved from pathology
files at Weill Cornell Medical College. Unstained FFPE
slides from representative tumor blocks were used for gene
array analysis. RNA was isolated after manual microdis-
section from tissue slides using Prelude FFPE RNA Iso-
lation Module, part no. 1410-50 (NuGen Inc., San Carlos,
CA, USA). The isolated RNA was converted to cDNA,
amplified, and then hybridized to Affymetrix U133 Plus 2.0
Whole Genome microarrays. Normalized gene expression
profiles were generated using the MAS5 probe summarization
algorithm and annotated with the Bioconductor “AnnotationData”
package, release 2.11 [4]. For quality-control purposes, gene
chips with fewer than 25 % probeset detection were excluded
from further analysis.
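The quality-control rule can be illustrated with a short sketch. It assumes that "probeset detection" means the fraction of probesets receiving a MAS5 Present ("P") call, which the text does not state explicitly; the function names are hypothetical.

```python
def detection_fraction(calls):
    """Fraction of probesets with a Present ('P') detection call."""
    if not calls:
        raise ValueError("no detection calls supplied")
    return sum(1 for c in calls if c == "P") / len(calls)


def passes_qc(calls, min_fraction=0.25):
    """Keep a chip unless fewer than min_fraction of probesets are detected."""
    return detection_fraction(calls) >= min_fraction


# Hypothetical chip with 3 Present, 1 Marginal, and 6 Absent calls.
chip_calls = ["P", "P", "P", "M", "A", "A", "A", "A", "A", "A"]
print(passes_qc(chip_calls))
```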
Creation of a novel gene expression signature to predict
a patient’s Oncotype DX risk group from microarray
data
After excluding the 21 genes used by the proprietary
Genomic Health recurrence score algorithm, a new signa-
ture was developed to predict the Oncotype DX risk group
of a breast cancer specimen using Affymetrix U133 Plus
2.0 data. To create the signature, the significance of asso-
ciation between each Affymetrix probe and Oncotype DX
risk group (low, intermediate, or high) was determined,
using an F test. To reduce the number of false positive probes, a
strict P value threshold of 1 × 10⁻⁶ was used to select the most
discriminatory genes, and their association with Oncotype DX risk
group in the training series was visualized with hierarchical
clustering. In order to apply
the gene signature to archival breast cancer specimens with
outcome data, the gene signature was used to train a
diagonal Linear Discriminant Analysis classifier (LDA)
using partial cross validation [5]. The trained algorithm,
which will be referred to as the Oncotype DX approxi-
mation algorithm, was applied to gene expression profiles
assembled from public repositories (NCBI GEO) from
patients from four previously published studies [6–9] and
classified each patient as low, intermediate, or high Oncotype
DX risk group. Only those patients with ER+, N0
disease who did not receive adjuvant therapy were included
in this analysis. Those predicted to be intermediate-risk
were reanalyzed using the BreastPRS prognostic signature
and the reclassified high- and low-risk groups were com-
pared to known outcomes. A flow diagram of the study
design is shown in Fig. 1.
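The classifier step can be sketched as follows. This is a minimal diagonal linear discriminant analysis written from the textbook definition (class means, one pooled variance per gene, no covariance terms); it is not the authors' implementation, and the function names and two-gene toy profiles are hypothetical.

```python
import math


def train_dlda(samples, labels):
    """Fit class means, class priors, and pooled per-feature variances."""
    classes = sorted(set(labels))
    n, p = len(samples), len(samples[0])
    means, priors = {}, {}
    for c in classes:
        rows = [s for s, l in zip(samples, labels) if l == c]
        means[c] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
        priors[c] = len(rows) / n
    pooled = []
    for j in range(p):
        ss = sum((s[j] - means[l][j]) ** 2 for s, l in zip(samples, labels))
        pooled.append(max(ss / (n - len(classes)), 1e-12))  # avoid zero variance
    return {"means": means, "priors": priors, "var": pooled}


def predict_dlda(model, sample):
    """Assign the class with the smallest variance-scaled distance."""
    def score(c):
        d = sum((x - m) ** 2 / v
                for x, m, v in zip(sample, model["means"][c], model["var"]))
        return d - 2.0 * math.log(model["priors"][c])
    return min(model["means"], key=score)


# Hypothetical two-gene expression profiles with known risk groups.
train_x = [[1.0, 5.0], [1.2, 5.1], [0.8, 4.9],
           [3.0, 3.0], [3.1, 2.9], [2.9, 3.1],
           [5.0, 1.0], [5.2, 0.9], [4.8, 1.1]]
train_y = ["low"] * 3 + ["intermediate"] * 3 + ["high"] * 3
model = train_dlda(train_x, train_y)
print(predict_dlda(model, [3.2, 2.8]))
```

Ignoring gene-to-gene covariance keeps the number of estimated parameters small, which is the usual motivation for the diagonal variant when there are far more probes than patients.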
Data management and statistical analysis
Statistical analyses were performed using Microsoft Excel,
R and Medcalc (version 12.3.0). All t tests were two sided
and P values below 0.05 were considered statistically
significant. To compare differences in RFS between risk
groups on a univariate level, Kaplan–Meier analysis and
log-rank testing were performed. To evaluate the differ-
ences between risk groups on a multivariate level, Cox
proportional hazards regression analysis was used, includ-
ing tumor grade, size, and BreastPRS risk group.
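The Kaplan–Meier estimate used for the univariate comparisons can be sketched as below. This is the product-limit estimator written from its textbook definition, not the MedCalc or R routine used in the study; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where an event occurred.

    times  -- follow-up time per patient
    events -- 1 if recurrence was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in years; 0 marks a censored patient.
for t, s in kaplan_meier([2.0, 3.5, 5.0, 8.0, 10.0], [1, 0, 1, 0, 0]):
    print(t, round(s, 3))
```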
Results were analyzed using the Signal Genetics
ResultsPX platform and R (www.r-project.org). The
BreastPRS prognostic score was compared with known
clinicopathologic variables (age, tumor size, nodal status,
ER/PR/HER2 status, Ki-67, lymphovascular invasion) and
Oncotype DX recurrence score for each patient. Data from
the 284-patient training series have been deposited at NCBI
GEO under accession GSE47109.

Table 1 Patient demographics from breast cancer series used in this study

| Characteristic | Fresh frozen to FFPE series | Weill Cornell Oncotype DX series | Archival ER+, node-negative series |
|---|---|---|---|
| Number | 55 | 246 | 260 |
| Age (mean years) | 60 (NA: 20) | 57 | 49 (NA: 34) |
| Histologic grade | | | |
| 1 | – | 43 | 54 |
| 2 | – | 116 | 150 |
| 3 | – | 50 | 42 |
| Lobular (CT/PT) | – | 32/5 | NA |
| NA | – | 0 | 14 |
| T Size | | | |
| T1 | – | 214 | 146 |
| T2 | – | 31 | 110 |
| T3 | – | 1 | 3 |
| NA | | | |
| Ax. lymph node status | | | |
| Positive | – | 37 | 0 |
| Negative | – | 209 | 260 |
| ER status | | | |
| Positive | 32 | 242 | 260 |
| Negative | 20 | 3 | 0 |
| NA | 3 | 1 | 0 |
| Stage | | | |
| I | 1 | 215 | 146 |
| II | 28 | 31 | 110 |
| III | 0 | 0 | 3 |
| NA | 26 | 0 | 1 |
| Recurrence | | | |
| Yes | – | – | 56 |
| No | – | – | 204 |
| Follow-up (median years) | – | – | 9.9 |
| Gene expression repository IDs | ArrayExpress: E-TABM-108 | GSE47109 | NCBI GEO: GSE11121, GSE4922, GSE6352, GSE7390 |
| Gene expression platform | Affymetrix GeneChip U133 Plus 2.0 (100); Illumina HumanRef-8 Expression BeadChip (90) | Affymetrix GeneChip U133 Plus 2.0 | Affymetrix GeneChip U133 Plus 2.0 |

FFPE formalin-fixed paraffin embedded, ER estrogen receptor, NA not available, Ax axillary, CT classical type, PT pleomorphic type
Results
Patient demographics from the series used in this study are
summarized in Table 1.
Translation of BreastPRS from FF to FFPE tissue
The previously published 200-gene prognosis score was
calculated for 35 matched pairs of microarray profiles for
FF and FFPE breast cancer specimens which passed RNA
and GeneChip quality metrics, as described. The r² and
intraclass correlation coefficients for paired measurements
were 0.67 and 0.90, respectively. Passing and Bablok
regression analysis revealed that the relationship between tissue
types exhibited no significant deviation from linearity
(P = 0.91), as shown in Fig. 2a. When the regression
equation [FFPE score = -0.265 + 1.051 × FF score] was
applied to the previously determined FF-based classification
threshold for high-/low-risk (i.e., > -0.38 = high-risk),
the FFPE threshold was determined to be > -0.63.
This adjustment ensures that the performance of the high-/
low-risk groups is consistent with those previously pub-
lished for FF tissue, for example in terms of recurrence-free
and overall survival.
In order to further verify the threshold adjustment,
microarray data from a published series of 20 paired FF
and FFPE breast cancer specimens were analyzed [2]
(Fig. 2b). These specimens were profiled (with technical
replication) using the Illumina Whole-Genome DASL
Assay which contains 24,526 transcripts. Probe mapping
identified 169 of the 200 (85 %) genes in the BreastPRS
prognostic signature; therefore, the complete algorithm
could not be applied. The prognostic signature was calculated
on this subset and Passing and Bablok regression
showed the relationship between tissue preservation
methods to be [FFPE score = 1.038 × FF score - 0.154].
In general, both datasets revealed a linear relationship
between the BreastPRS prognostic score when measured in
paired FF and FFPE specimens. Regression analysis
showed that an adjustment to the classification threshold of
-0.246 was necessary to perform BreastPRS using FFPE
tissue and maintain the previously published performance
characteristics of the low- and high-risk groups. Perform-
ing a similar comparison on a different patient series pro-
filed on a different platform (Illumina) showed an
adjustment of a similar magnitude would be necessary if
BreastPRS was to be performed on this system.
For comparison purposes with our own dataset (Affymetrix),
we computed a 169-gene version of the prognostic
score and applied it to the 35 pairs of FF and FFPE specimens
in this validation series (Fig. 2c). Regression analysis of this
subset algorithm yielded a relationship similar to that
observed for the Mittempergher/Illumina
series: FFPE score = -0.214 + 1.005 × FF score.
Fig. 1 Flow diagram summarizing analytical aspects of validation study

Comparison of BreastPRS prognostic indices to official
Oncotype DX recurrence scores
Of the 284 cases of invasive carcinoma with known Oncotype
DX recurrence scores, 246 cases passed all RNA and
chip-related quality controls. The two prognostic algorithms
were compared using whole genome profiles from these
cases (Fig. 3a). Both metrics were designed to be continu-
ously associated with risk of disease recurrence, reflected in
the interclass correlation coefficient of 0.73 (95 % CI:
0.65–0.79), indicating a moderately strong positive rela-
tionship between the two metrics. Figure 3b shows a poly-
nomial regression line fitted to the dataset, suggesting the
spread of risk scores may be greater in BreastPRS as com-
pared to Oncotype DX, where more scores are compressed in
the low-to-middle range distribution. For Oncotype DX,
low-risk = <18, intermediate-risk = 18–30, and high-risk
= >30. BreastPRS classified patients as only high- or
low-risk, based on the threshold of high-risk = 34 or greater.
Of the 30 high-risk Oncotype DX cases, 27 (90 %) were
classified as high-risk by BreastPRS (Table 2). Ninety-five
low-risk Oncotype DX cases (76 %) were classified as low-
risk by BreastPRS. Interestingly, a majority (60 %) of cases
classified as intermediate-risk by Oncotype DX were clas-
sified as low-risk by BreastPRS, a group which has previ-
ously been shown to not benefit from adjuvant chemotherapy
treatment [10].
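The concordance percentages and the association test reported in Table 2 can be rechecked from the published counts. The sketch below is an illustrative recomputation, not the authors' analysis; it uses the fact that a 3 × 2 table has 2 degrees of freedom, for which the chi-square upper tail probability is exactly exp(-x/2). Variable and function names are hypothetical.

```python
import math

# Counts from Table 2: Oncotype DX risk group -> (BreastPRS low, BreastPRS high).
table2 = {
    "low": (95, 30),
    "intermediate": (55, 36),
    "high": (3, 27),
}


def row_percentages(table):
    """Percentage of each Oncotype DX group falling in each BreastPRS group."""
    return {
        group: tuple(round(100.0 * c / sum(counts)) for c in counts)
        for group, counts in table.items()
    }


def chi_square_statistic(table):
    """Pearson chi-square statistic for a rows x 2 contingency table."""
    rows = list(table.values())
    n = sum(sum(r) for r in rows)
    col_totals = [sum(r[j] for r in rows) for j in range(2)]
    stat = 0.0
    for r in rows:
        row_total = sum(r)
        for j, observed in enumerate(r):
            expected = row_total * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


stat = chi_square_statistic(table2)
# With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-stat / 2.0)
print(row_percentages(table2), round(stat, 2), p_value)
```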
Identification of Oncotype DX intermediate-risk
patients in public gene expression profile data
repository
After identifying a novel set of genes able to hierarchically
cluster Affymetrix profiles into three groups which correspond
to Oncotype DX low-, intermediate-, and high-risk
groups, an Oncotype DX approximation algorithm was
created to predict Oncotype DX risk groups of genomic
profiles from other datasets. Partial cross validation of the
120-gene Oncotype DX approximation algorithm on the
Oncotype DX training series reproduced accuracy of the
hierarchical clustering in terms of risk group assignment and
official Oncotype DX classification (Fig. 4). Overall, 72 %
of patients were predicted correctly by the 120-gene algo-
rithm with a mean cross-validation sensitivity of 68 %,
specificity of 84 %, positive predictive value of 69 %, and
negative predictive value of 85 %. Variation between the
official and predicted Oncotype DX risk groups may be
attributable to intratumoral heterogeneity, differences in
measuring gene expression using qPCR versus microarray,
or the general reproducibility of the 21-gene signature itself.
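For a three-group classifier, sensitivity, specificity, and predictive values are usually computed one group at a time against the other two and then averaged. The sketch below shows that calculation under the assumption that the reported means were obtained this way, which the text does not confirm; the confusion matrix is hypothetical and does not reproduce the study's counts.

```python
def one_vs_rest_metrics(confusion, labels):
    """Per-class metrics from a square confusion matrix.

    confusion[i][j] is the number of cases with true class i predicted as j.
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp
        fp = sum(row[i] for row in confusion) - tp
        tn = total - tp - fn - fp
        metrics[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        }
    return metrics


def mean_metric(metrics, name):
    """Unweighted mean of one metric across classes."""
    return sum(m[name] for m in metrics.values()) / len(metrics)


# Hypothetical counts: rows are official groups, columns are predicted groups.
labels = ["low", "intermediate", "high"]
confusion = [[40, 8, 2],
             [10, 30, 10],
             [1, 6, 13]]
per_class = one_vs_rest_metrics(confusion, labels)
accuracy = sum(confusion[i][i] for i in range(3)) / sum(map(sum, confusion))
print(round(accuracy, 3), round(mean_metric(per_class, "sensitivity"), 3))
```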
Fig. 2 Passing and Bablok regression of the BreastPRS prognostic
score calculated on paired FF and FFPE specimens. a Affymetrix
200-gene signature in 35 pairs of FF/FFPE RNA. FFPE score =
-0.265 + 1.051 × FF score. b Illumina 169-gene signature in 20
pairs of FF/FFPE RNA. FFPE score = 1.038 × FF score - 0.154.
c Affymetrix 169-gene signature in same series of patients as a. FFPE
score = -0.214 + 1.005 × FF score. FF fresh frozen, FFPE
formalin-fixed paraffin-embedded, RNA ribonucleic acid
For comparison purposes, a separate cross-validation exer-
cise was performed using the 16 genes used by Genomic
Health to perform the Oncotype DX assay (after renormal-
izing to the 5 Oncotype normalization genes) as published
[11]. With this method, only an additional 3 % of patients
analyzed were classified into the same risk group as the
official Oncotype DX risk group (data not shown), a difference
that was not statistically significant, as determined by a t test of
proportions (P = 0.63). Therefore, when a tumor is profiled
using the Affymetrix platform, it appears that using the
21-gene (patented) Oncotype DX signature does not reproduce
the commercial qPCR assay with significantly greater
accuracy than using a novel 120-gene Oncotype DX
approximation algorithm.
Within the 120-gene Oncotype DX approximation
algorithm predicted risk groups, it was the intermediate-risk
group which differed most between the official 21-gene
PCR method risk groups and those predicted by the
120-gene approximation algorithm. Of the “true” intermediate-risk
patients, 73 % were classified as intermediate-risk
by the 120-gene approximation algorithm. For gene set
comparison purposes, the Affymetrix implementation of the
“official” 21-gene signature was also evaluated. Using the
21-gene set, 69 % of true intermediate-risk patients were
classified as such. This suggests that the 120-gene Oncotype
DX approximation algorithm is more accurate at identifying
“true” Oncotype DX intermediate-risk patients in previously
published breast cancer gene expression datasets.
Performing an additional analysis incorporating gene reselection
in each loop of the cross-validation procedure did
not improve classification accuracy (data not shown).
Application of the 120-gene Oncotype DX
approximation algorithm to expression profiles
from historical datasets to classify patients
into predicted Oncotype DX risk groups
The 120-gene Oncotype DX approximation algorithm was
applied to the whole genome expression profiles of 260
ER+, N0, untreated (i.e., no Tamoxifen or chemotherapy)
breast cancer patients from four previously published
studies [6–9], with recurrence-related outcome data avail-
able. Application of the 120-gene Oncotype DX approxi-
mation algorithm classified 169 (65 %) as low-risk, 59
(23 %) as intermediate-risk, and 32 (12 %) as high-risk.
Kaplan–Meier analysis of these risk groups approached
statistical significance in 15-year RFS (log-rank test
P value 0.088) (Fig. 5a). In a multivariate analysis, the
difference in RFS between the high-risk and low-risk
groups was significant (P = 0.043), but not between the
high- and intermediate-risk groups (P = 0.55).
Fig. 3 a Scatter plot of 246 FFPE breast cancer specimens analyzed
by both Oncotype DX and BreastPRS. Cases are colored according to
the risk group of each assay. Both prognosis scores were created to be
continuous markers of worsening prognosis with higher scores
correlating with higher rates of outcome and poor overall survival.
b Polynomial regression fitted line exploring the relationship between
the Oncotype DX recurrence score and BreastPRS prognosis index.
This plot suggests that the spread of risk scores may be greater in
BreastPRS as compared to Oncotype DX, where more scores are
compressed in the low-to-middle range distribution. OT Oncotype
DX, FFPE formalin-fixed paraffin-embedded

Table 2 Direct comparison of risk group stratification by Oncotype
DX (21-gene qPCR) and BreastPRS (200-gene Affymetrix signature)

| Oncotype DX Risk | Total | BreastPRS Low-risk | BreastPRS High-risk |
|---|---|---|---|
| Low | 125 | 95 (76 %) | 30 (24 %) |
| Intermediate | 91 | 55 (60 %) | 36 (40 %) |
| High | 30 | 3 (10 %) | 27 (90 %) |
| Total | 246 | 153 | 93 |

χ² P value for risk group association <0.0001

Because this series of patients was compiled from four
historical datasets, each with its own patient selection
criteria, the proportion of patients in each risk group was not
expected to resemble that observed in the general population
or target demographic of the Oncotype DX assay. Despite
this, the proportions observed are similar to those reported in
the NSABP B-14 trial, i.e., low-risk (51 %), intermediate-risk
(22 %), and high-risk (27 %) [11]. When the 200-gene
BreastPRS algorithm is applied to expression profiles from
this set of patients (Fig. 5b), 129 (49.6 %) were classified as
low-risk and 131 (50.4 %) were classified as high-risk with a
significant difference in 15-year RFS (log-rank test
P value = 0.0001). In a multivariate model with grade and
tumor size, the BreastPRS signature remains statistically
significant (P < 0.0001) with a hazard ratio of 3.94 (95 %
CI: 1.99–7.78) as shown in Table 3.
Fig. 4 Hierarchical clustering of the 120-gene Oncotype DX approximation algorithm identified by comparing whole genome profiles of
patients’ samples previously analyzed by Oncotype DX

Fig. 5 Kaplan–Meier analyses of a all ER+, node-negative,
untreated archival patients stratified by 120-gene Oncotype DX
approximation algorithm risk groups (n = 260) P = 0.088 and b by
200-gene BreastPRS algorithm risk groups (n = 260) P = 0.0001,
HR: 3.00 (95 % CI: 1.77–5.08). c Archival ER+, node-negative,
untreated breast cancer patients predicted to be Oncotype DX
intermediate-risk by the 120-gene approximation signature (from a),
reclassified as high- or low-risk by BreastPRS (n = 59) P = 0.029,
HR: 3.64 (95 % CI: 1.40–9.50). ER estrogen receptor, HR hazard
ratio, CI confidence interval

BreastPRS was then applied to the 169 cases classified
as low-risk by the 120-gene Oncotype DX approximation
algorithm to investigate the inconsistent risk group
assignments of cases from the Weill Cornell series that
were classified as low-risk by Oncotype DX and high-risk
by BreastPRS (24 % of low-risk cases, Table 2). Sixty-three
of the 169 (37 %) low-risk cases were classified as
high-risk by BreastPRS. Kaplan–Meier analysis was
performed on these 63 and compared with the remaining
106 that were classified as low-risk by both methods. The
63 cases classified as high-risk by BreastPRS experienced
significantly shorter RFS compared to the cases classified
as low-risk by both methods [hazard ratio: 2.96 (95 % CI:
1.39–6.28), P = 0.0010]. At 10 years from diagnosis, the
chance of recurrence was 25 percentage points lower in the
low-risk group than in the high-risk group (RFS 90 vs. 65 %)
and 15 percentage points lower at 15 years from diagnosis.
Subgroup analysis of Oncotype DX intermediate-risk
patients by BreastPRS
To test the hypothesis that BreastPRS is able to reclassify
Oncotype DX intermediate-risk patients as high- or low-risk,
the 59 (ER+, N0, untreated) intermediate-risk patients
identified in the retrospective series were analyzed by
BreastPRS (Fig. 5c). Twenty-three (39 %) patients were
classified as low-risk and 36 (61 %) as high-risk. The
hazard ratio of a high-risk classification was 3.64 (95 % CI:
1.40 to 9.50) and log-rank testing indicated that the results
were statistically significant (P = 0.029). At 10 years from
diagnosis, the low-risk group had a 90 % RFS rate, com-
pared to 60 % for the high-risk group. When adjusted for
tumor grade and size in a multivariate Cox proportional
hazards model (Table 3), BreastPRS was the closest vari-
able to achieving statistical significance (P = 0.055, HR:
3.58, 95 % CI: 0.97–13.14). These data indicate that
BreastPRS is able to reclassify Oncotype DX intermediate-
risk patients as high- or low-risk for disease recurrence,
independent of clinical variables of size and tumor grade.
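The multivariate result can be cross-checked from the published hazard ratio and confidence interval alone. The sketch below assumes the interval is a symmetric Wald interval on the log scale, which is the usual output of a Cox model but is not stated in the text; the function name is hypothetical, and because the published figures are rounded the recovered P value is only approximate.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-HR standard error, z statistic, and two-sided P value."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Values reported for BreastPRS high- vs. low-risk among the 59 patients.
se, z, p = wald_p_from_ci(3.58, 0.97, 13.14)
print(round(se, 3), round(z, 3), round(p, 3))

# On the log scale the HR should sit midway between the interval bounds.
print(round(math.sqrt(0.97 * 13.14), 2))
```

The recovered P value is close to the reported 0.055. The univariate P = 0.029 comes from a log-rank test, so it is not expected to be reproduced by this calculation.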
Table 3 Multivariate analysis of BreastPRS and the 120-gene
Oncotype DX approximation algorithm (Cox proportional hazards ratio and P value)

| Description | Covariate | P value | HR | 95 % CI of HR |
|---|---|---|---|---|
| NCBI GEO archival series with BreastPRS (n = 260) | Grade 2 vs. 1 | 0.274 | 1.64 | 0.68–3.93 |
| | Grade 3 vs. 1 | 0.825 | 1.12 | 0.40–3.13 |
| | Size T2 vs. T1 | 0.011 | 2.15 | 1.19–3.85 |
| | Size T3 vs. T1 | 0.005 | 5.97 | 1.74–20.51 |
| | BreastPRS high- vs. low-risk | <0.001 | 3.94 | 1.99–7.78 |
| NCBI GEO archival series with 120-gene Oncotype DX approximation algorithm (n = 260) | Grade 2 vs. 1 | 0.224 | 1.73 | 0.7154–4.22 |
| | Grade 3 vs. 1 | 0.593 | 1.34 | 0.45–4.03 |
| | Size T2 vs. T1 | 0.037 | 1.85 | 1.04–3.29 |
| | Size T3 vs. T1 | <0.001 | 13.08 | 3.71–46.08 |
| | 120-gene Oncotype approximation high- vs. intermediate-risk | 0.545 | 1.32 | 0.54–3.23 |
| | 120-gene Oncotype approximation high- vs. low-risk | 0.043 | 2.49 | 0.16–0.97 |
| 120-gene predicted Oncotype intermediate-risk patients with size and grade information (n = 59) | Grade 2 vs. 1 | 0.926 | 0.90 | 0.11–7.37 |
| | Grade 3 vs. 1 | 0.834 | 0.77 | 0.07–8.40 |
| | Size T2 vs. T1 | 0.087 | 2.73 | 0.86–8.63 |
| | BreastPRS high- vs. low-risk | 0.055 | 3.58 | 0.97–13.14 |

NCBI GEO National Center for Biotechnology Information Gene Expression Omnibus, LDA linear discriminant analysis, HR hazard ratio, CI confidence interval

Discussion
While current guidelines recommend consideration of
chemotherapy for the majority of patients with invasive
breast carcinoma [12], most patients with small, ER+
tumors will not gain additional benefit from adding adjuvant
chemotherapy to Tamoxifen, and can likely be spared
the toxicities of the former. Clinicians have traditionally
relied on clinical and pathologic factors including tumor
size, axillary lymph node status, histologic grade, and
hormone receptor status when assessing the need for
adjuvant chemotherapy in patients with early breast cancer.
Recent advances in gene expression profiling and microarray
technology have led to a greater understanding of the
biology of breast cancer at the molecular level. This
coincides with advances in imaging techniques that have
led to an increase in the detection of smaller invasive
cancers. Clinicians are increasingly incorporating genomic
data from patients’ tumors via multigene prognostic signatures
to gather additional information, particularly the
risk of distant recurrence, to aid in deciding as to whether
or not to add chemotherapy to a patient’s treatment regimen.
The 21-gene Oncotype DX RT-PCR test (Genomic
Health, Redwood City, CA, USA) is currently the most
widely used breast cancer prognostic assay, largely due to
its performance on routinely prepared FFPE tissue blocks.
The assay was developed and validated from large series of
patients from National Surgical Adjuvant Breast and
Bowel Project (NSABP) trials, as well as other independent
studies and has been shown to be both prognostic of breast
cancer recurrence and predictive of chemotherapy benefit
[10, 11, 13, 14]. The Oncotype DX assay stratifies patients
into low-, intermediate-, and high-risk groups, corresponding
to risk of recurrence at 10 years in ER+, N0
patients treated with Tamoxifen. High-risk patients have
been shown to benefit from chemotherapy. The benefit of
adjuvant chemotherapy in the intermediate-risk group is
uncertain and is currently under investigation [15, 16].
BreastPRS is a 200-gene microarray-based prognostic
signature that was generated and validated on multiple
independent series of breast cancer patients using publicly
available gene expression profiles [1]. In a prior study, the
BreastPRS algorithm was applied to expression profiles of
1,016 patients, and separated them into risk groups with
significant differences in recurrence-free and overall sur-
vival [1]. In untreated, N0 patients, the sensitivity and
specificity of the assay for predicting RFS were 88 and
44 %, respectively, with positive and negative predictive
values of 30.5 and 92 %, respectively. In this study, we
compared the BreastPRS and Oncotype DX assays using
FFPE tissue from a series of patients treated at our institution,
as well as publicly available gene expression
profiles from the GEO, a public repository maintained by
the NCBI (http://www.ncbi.nlm.nih.gov/geo/). GEO serves
as a central database for high-throughput microarray and
next-generation sequencing data that is submitted by
researchers and is freely available to the public.
A major strength of Oncotype DX is the ability to per-
form the assay on FFPE tissue, as preserving tumor sam-
ples as FF tissue is not practical for routine clinical care at
most institutions. Moreover, the ability of an assay to be
performed on FFPE tissue allows retrospective study of
large cohorts of patients for validation. The BreastPRS
algorithm was originally developed from expression pro-
files of FF tumor tissue. The aim of this study was to
determine whether the BreastPRS algorithm could be
translated to FFPE and this was accomplished using mat-
ched pairs of FF and FFPE tissue from the same tumors.
Here we show that a linear relationship exists between
BreastPRS scores generated from FF and FFPE tissue with
an intraclass correlation coefficient of 0.90, indicating
strong positive agreement. Overall, the FFPE specimens
resulted in lower BreastPRS scores when compared to the
paired FF specimens; therefore, linear regression analysis
was used to adjust the threshold for high-/low-risk group
classification accordingly.
Next, we compared the Oncotype DX and BreastPRS
algorithms on a series of patients with known Oncotype DX
results performed as a part of their clinical care. We found
significant correlation between the prognostic metrics gen-
erated by Oncotype DX and BreastPRS when applied to
genomic profiles from the studied patients, with 90 % of
Oncotype DX high-risk cases predicted to be also high-risk
by BreastPRS. Among low-risk Oncotype DX patients, 24 %
were reclassified as high-risk by BreastPRS, a change in
classification that would have important prognostic and
treatment implications. Because outcome data were not
available for the Weill Cornell series, we investigated this
inconsistency in patients from the archival ER+, untreated,
N0 series used in this study. BreastPRS was applied to
tumors classified as low-risk by the 120-gene Oncotype DX
approximation algorithm. Patients in this group that were
reclassified as high-risk by BreastPRS showed significantly
shorter RFS compared with those classified as low-risk by
both assays. This finding raises the possibility that there is a
subset of patients classified as low-risk by Oncotype DX who
may benefit from chemotherapy. Certainly, this observation
warrants further investigation and confirmation by other
independent studies.
Finally, a retrospective head-to-head comparison of
BreastPRS and Oncotype DX was performed by applying a
novel 120-gene Oncotype DX approximation signature,
trained on commercial Oncotype DX results, to previously
published series of untreated, N0, ER+ patients with outcome
data. BreastPRS yielded a smaller log-rank test P value
and fewer recurrences in the
low-risk group, as compared to the microarray-based
120-gene approximation of Oncotype DX. Subgroup
analysis of the retrospective cases classified as Oncotype
DX intermediate-risk was then performed. BreastPRS was
able to reclassify these patients as either high- or low-risk
for recurrence, with the reclassified groups having a statistically
significant difference in outcome (hazard ratio 3.64,
P = 0.029). We found that 61 % of those classified as
intermediate-risk using the 120-gene Oncotype DX
approximation algorithm were classified as high-risk by
BreastPRS and would likely benefit from adjuvant chemotherapy.
It will be interesting to see whether the results
of the TAILORx trial mirror these findings.
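For readers unfamiliar with the log-rank test referred to above, the following is a minimal two-group sketch in plain Python. It is not the software used in this study, and the follow-up times and event indicators are hypothetical. The P value uses the chi-square distribution with one degree of freedom, whose upper tail equals erfc(sqrt(x / 2)).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 (recurrence) or 0 (censored).

    Returns (chi_square, p_value) with one degree of freedom.
    """
    data = [(t, e, "a") for t, e in zip(times_a, events_a)]
    data += [(t, e, "b") for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group a
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == "a")
        n_b = sum(1 for ti, _, g in data if ti >= t and g == "b")
        n = n_a + n_b
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == "a")
        d = sum(1 for ti, e, _ in data if ti == t and e)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group a.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_square = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical data: years to recurrence (event = 1) or last follow-up (0).
high_times, high_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
low_times, low_events = [10, 10, 10, 10, 10], [0, 0, 0, 0, 0]

chi_square, p_value = logrank_test(high_times, high_events,
                                   low_times, low_events)
print(chi_square, p_value)
```

A hazard ratio such as the one reported above would additionally require a proportional hazards model, which this sketch does not attempt.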
The gene lists used by BreastPRS and Oncotype DX
overlap by two genes only [Cyclin B1 (CCNB1) and Ki67
(MKI67)]. Despite this, DAVID gene ontology analysis of
both gene lists reveals a number of common gene families
between the two sets (Supplementary Table). Genes
involved in cell cycle regulation and apoptosis are significantly
represented in both sets; however, BreastPRS also
contains genes involved in metabolism, intracellular
organization, hormone receptor binding, migration, and
immune function. Conversely, Oncotype DX contains genes
involved in metal binding and tissue development, categories
not significantly represented in BreastPRS.
The submission and use of GEO data sets by researchers
continue to increase, and these data sets represent a powerful
tool for performing meta-analyses using large volumes of
samples. The use of GEO data sets is likely to increase as
FFPE tissue from large trials with long-term
follow-up becomes depleted. Using a combination of
FFPE material and expression profiles from GEO, we have
shown that the 200-gene BreastPRS prognosis algorithm is
comparable to the 21-gene Oncotype DX assay and
effectively separates Oncotype DX intermediate-risk cases
into binary categories with significant differences in RFS.
Additional validation studies are forthcoming with the goal
of commercializing BreastPRS as a stand-alone breast
cancer prognostic assay that can be performed on FFPE
tissue.
Disclosures TMD declares no conflict of interest. RKVL is the
Head of Bioinformatics and New Product Development for Signal
Genetics and owns stock in the company. LTV declares no conflict of
interest. WH, RF, NB, and LSJ are employees of Signal Genetics. SJS
is a paid consultant of Signal Genetics.
References
1. Van Laar RK (2011) Design and multiseries validation of a web-
based gene expression assay for predicting breast cancer recur-
rence and patient survival. J Mol Diagn 13(3):297–304
2. Mittempergher L, de Ronde JJ, Nieuwland M et al (2011) Gene
expression profiles from formalin fixed paraffin embedded breast
cancer tissue are largely comparable to fresh frozen matched
tissue. PLoS One 6:e17163
3. Passing H, Bablok W (1983) A new biometrical procedure for
testing the equality of measurements from two different analyti-
cal methods. Application of linear regression procedures for
method comparison studies in clinical chemistry, part I. J Clin
Chem Clin Biochem 21:709–720
4. Gentleman RC, Carey VJ, Bates DM et al (2004) Bioconductor:
open software development for computational biology and bio-
informatics. Genome Biol 5:R80
5. Dudoit S, Fridlyand J, Speed T (2002) Comparison of discrimi-
nation methods for the classification of tumors using gene
expression data. J Am Stat Assoc 97:77–87
6. Schmidt M, Böhm D, von Törne C et al (2008) The humoral
immune system has a key prognostic impact in node-negative
breast cancer. Cancer Res 68:5405–5413
7. Ivshina AV, George J, Senko O et al (2006) Genetic reclassifi-
cation of histologic grade delineates new clinical subtypes of
breast cancer. Cancer Res 66:10292–10301
8. Loi S, Haibe-Kains B, Desmedt C et al (2007) Definition of
clinically distinct molecular subtypes in estrogen receptor-posi-
tive breast carcinomas through genomic grade. J Clin Oncol
25:1239–1246
9. Desmedt C, Piette F, Loi S et al (2007) Strong time dependence
of the 76-gene prognostic signature for node-negative breast
cancer patients in the TRANSBIG multicenter independent val-
idation series. Clin Cancer Res 13:3207–3214
10. Paik S, Tang G, Shak S et al (2006) Gene expression and benefit
of chemotherapy in women with node-negative, estrogen recep-
tor-positive breast cancer. J Clin Oncol 24:3726–3734
11. Paik S, Shak S, Tang G et al (2004) A multigene assay to predict
recurrence of tamoxifen-treated, node-negative breast cancer.
N Engl J Med 351:2817–2826
12. Ma XJ, Patel R, Wang X et al (2006) Molecular classification of
human cancers using a 92-gene real-time quantitative polymerase
chain reaction assay. Arch Pathol Lab Med 130:465–473
13. Habel LA, Shak S, Jacobs MK et al (2006) A population-based
study of tumor gene expression and risk of breast cancer death
among lymph node-negative patients. Breast Cancer Res 8:R25
14. Albain KS, Barlow WE, Shak S et al (2010) Prognostic and
predictive value of the 21-gene recurrence score assay in post-
menopausal women with node-positive, oestrogen-receptor-
positive breast cancer on chemotherapy: a retrospective analysis
of a randomised trial. Lancet Oncol 11:55–65
15. Zujewski JA, Kamin L (2008) Trial assessing individualized
options for treatment for breast cancer: the TAILORx trial. Future
Oncol 4:603–610
16. Sparano JA (2006) TAILORx: trial assigning individualized
options for treatment (Rx). Clin Breast Cancer 7:347–350