Diagnosing Intratumor Heterogeneity in Breast Cancer with Single-Cell Genome Sequencing
Yong (Tony) Wang, PhD
Nick Navin’s Lab, Department of Genetics, UT MD Anderson Cancer Center
Healthcare Seminar, January 15, 2015
Aug 14, 2015
Standard NGS vs. Single Cell Sequencing
Owens, Nature, 2012
Applications of Single Cell Sequencing
Wang et al., Molecular Cell, 2015
Timeline of Single Cell Sequencing Milestones
Exponential Growth of Single Cell Sequencing
[Figure: number of publications (y-axis, 0 to 30) by year (x-axis, 2009 to 2015)]
Publications by Field and Applications
Cancer 24%
Developmental 18%
Computational 15%
Method 15%
Microbiology 10%
Neurobiology 6%
Immunology 5%
Mosaicism 4%
Misc 3%
Diagnosing Intratumor Heterogeneity
Intertumor vs. Intratumor heterogeneity
Burrell et al., Nature, 2013
[Figure labels: Intertumour heterogeneity; Intratumour heterogeneity; Subclone 1, Subclone 2, Subclone 3; Intercellular genetic and non-genetic heterogeneity]
Tumor Evolution Models
Navin and Hicks, Mol. Oncol, 2011
1. Tumor heterogeneity confounds the clinical diagnosis and basic research of cancer
2. The extent of clonal diversity and models for tumor evolution are poorly understood in human breast cancer
3. Standard sequencing methods are limited to reporting the average signal of a complex population of tumor cells
Resolving Intratumor Heterogeneity
1. Deep-sequencing (Zainal et al., Cell, 2012)
2. Spatial sampling (Gerlinger et al., NEJM, 2012)
3. Single cell sequencing: The goal of this project is to develop a single cell sequencing method to study intratumor heterogeneity and genome evolution in breast cancer
Single Nucleus Sequencing (SNS)
Navin et al. 2011 Nature
Developing Single-Cell Sequencing Methods at Base-pair Resolution
G1/0 cells: 25.58% (11/43) have all 22 chromosomes amplified
G2/M cells: 45.34% (39/86) have all 22 chromosomes amplified
Whole-Genome or Exome Single-Cell Sequencing
Experimental Strategy:
(1) Double the input DNA (4N) to decrease allelic dropout
(2) Minimize false-positive events by limiting the amplification reaction
[Figure: NUC-SEQ; labels: Phi29, NEB, Library (NEB)]
Wang et al. (2014) Nature
Leung et al. (2015) under review
Monoclonal Copy Number in SK-BR-3
• Copy number profiling with SNS at 220kb resolution of 50 cells shows that amplifications of MET, MYC, ERBB2, BCAS1 and a deletion in DCC were present in all single cells
• Single cell copy number profiles show a very high correlation (R² = 0.91)
Deep-Sequencing of the SK-BR-3 Population (SKP)
• Coverage depth: 51X, coverage breadth: 90.4%
• SNVs, CNAs and SVs were detected
• 409 nonsynonymous SNVs were identified, including many mutations in cancer genes (CDH1, DBC1, BCR, ETV1, PASK, PRCC)
Coverage of Single Cells from SK-BR-3 Cell Line
• DOP-PCR WGA methods (SNS) achieve low coverage breadth, even when sequenced at high coverage depth
• Multiple-displacement-amplification (MDA) can achieve high coverage breadth in single cells, similar to standard genome sequencing
          SKP      SK1      SK2      SNS
Depth     51X      66X      56X      1-2X
Breadth   90.4%    87.1%    80.3%    10%
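As a minimal sketch of how the two metrics in the table could be computed, assuming depth means the average number of reads per base and breadth means the fraction of bases covered by at least one read. The function name, the `min_reads` threshold, and the toy depth values are illustrative assumptions, not taken from the talk.

```python
def coverage_metrics(depths, min_reads=1):
    """Return (mean depth, breadth) for a list of per-base read depths.

    Assumption: a base counts as covered when it has at least `min_reads` reads.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d >= min_reads)
    return mean_depth, covered / len(depths)


# Hypothetical per-base depths for a tiny 10 bp region
toy_depths = [0, 0, 4, 6, 10, 0, 8, 12, 0, 0]
depth, breadth = coverage_metrics(toy_depths)
print(f"depth = {depth:.1f}X, breadth = {breadth:.0%}")
```

The toy region illustrates the point made for SNS: a sample can have reasonable mean depth while half of its bases receive no reads at all.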
Lorenz Curve of Coverage Depth Uniformity
• Lorenz curves show coverage uniformity (‘evenness’) in single cell data
• Nuc-Seq provides very uniform coverage compared to SNS (previous method)
• Phi29-based coverage uniformity is similar to MALBAC (Zong et al 2012, Science)
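A Lorenz curve of this kind can be sketched as follows, assuming the input is a list of per-base (or per-bin) read depths. The Gini coefficient is added here only as a convenient one-number summary of the curve; it is not mentioned on the slides, and the function names are hypothetical.

```python
def lorenz_curve(depths):
    """Points (cumulative fraction of bases, cumulative fraction of reads),
    with bases sorted from lowest to highest depth."""
    ordered = sorted(depths)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no reads to distribute")
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, d in enumerate(ordered, start=1):
        running += d
        points.append((i / n, running / total))
    return points


def gini(depths):
    """Gini coefficient of the depths: 0 means perfectly uniform coverage."""
    pts = lorenz_curve(depths)
    # Trapezoidal area under the Lorenz curve
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1 - 2 * area


print(gini([5, 5, 5, 5]))    # uniform coverage lies on the diagonal
print(gini([0, 0, 0, 10]))   # all reads piled on one base
```

Perfectly even coverage follows the diagonal, and the further the curve sags below it, the less uniform the amplification.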
Calculation of Error Rates
Allelic Dropout Rate (ADR): population A/B, single cell A/A. NUC-SEQ: 9.73%
False-Positive Rate (FPR): population A/A, single cell A/B. NUC-SEQ: 1.24e-6
False-Negative Coverage (FNC): population A/A, single cell not covered (X). NUC-SEQ: 5.6%
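The three error classes on this slide can be illustrated with a small sketch that compares single-cell genotype calls against population calls. The definitions below are my reading of the slide diagrams, not the talk's exact formulas. A population-heterozygous site called homozygous counts as dropout, a population-homozygous site called heterozygous counts as a false positive, and an uncovered site counts toward false-negative coverage. The choice of denominators is likewise an assumption, and the genotype dictionaries are hypothetical.

```python
def is_het(genotype):
    """True when a genotype string such as 'AB' carries two different alleles."""
    return genotype is not None and len(set(genotype)) > 1


def error_rates(population, single_cell):
    """population / single_cell: dicts mapping site -> genotype ('AA', 'AB'),
    with None in single_cell meaning the site had no coverage."""
    adr_num = adr_den = fpr_num = fpr_den = uncovered = 0
    for site, pop_gt in population.items():
        cell_gt = single_cell.get(site)
        if cell_gt is None:
            uncovered += 1          # counted toward FNC only
            continue
        if is_het(pop_gt):
            adr_den += 1
            adr_num += not is_het(cell_gt)   # one allele was lost
        else:
            fpr_den += 1
            fpr_num += is_het(cell_gt)       # a new allele appeared
    nan = float("nan")
    return {
        "ADR": adr_num / adr_den if adr_den else nan,
        "FPR": fpr_num / fpr_den if fpr_den else nan,
        "FNC": uncovered / len(population) if population else nan,
    }


# Hypothetical calls at eight sites
pop = {1: "AB", 2: "AB", 3: "AA", 4: "AA", 5: "AA", 6: "AB", 7: "AA", 8: "AA"}
cell = {1: "AB", 2: "AA", 3: "AA", 4: "AB", 5: None, 6: "AB", 7: "AA", 8: "AA"}
print(error_rates(pop, cell))
```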
Single-Cell Sequencing of an ER+ Breast Tumor
1. To delineate clonal diversity of the tumor and identify subpopulations
2. To trace the evolution of copy number alteration and point mutations during tumor growth
Experimental Design
Method    Population        Single Cell Whole Genome            Exome      SNS
Samples   BCN      BCT      BC1      BC2      BC3      BC4      59 cells   50 cells
Breadth   89.9%    90.0%    73.4%    78.3%    89.0%    82.5%    93.0%      10%
Depth     54X      46X      43X      35X      49X      60X      47X        1-4X
• ER+/PR+/Her2-
• 53-year-old patient
• Grade II invasive ductal carcinoma
Deep-Sequencing of the ER+ Tumor Population
• High coverage breadth (90%) and depth (46X)
• A total of 4,162 somatic SNVs in the tumor cell population.
• 12 nonsynonymous mutations, which were validated by exome sequencing (66X).
• Several nonsynonymous mutations occurred in cancer genes, including PIK3CA, CASP3, FBN2 and PPP2R5E.
Neighbor-Joining Tree of 50 Single Cell CN Profiles
• A neighbor-joining tree was constructed from segmented copy number profiles of 50 single cells sequenced with SNS
• The profiles are highly similar, representing a single clonal subpopulation in the tumor (mean R² = 0.89) and a single homogeneous population of normal diploid cells
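A mean pairwise correlation such as the one quoted here could be computed along these lines, assuming the reported value is the squared Pearson correlation averaged over all pairs of segmented copy number profiles. That reading is an assumption. The toy profiles are hypothetical copy number values per genomic bin.

```python
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length copy number profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_pairwise_r2(profiles):
    """Average squared correlation over every pair of cells."""
    return mean(pearson_r(p, q) ** 2 for p, q in combinations(profiles, 2))


# Hypothetical profiles: three tumor cells over six bins
cells = [
    [2, 2, 4, 4, 1, 2],
    [2, 2, 4, 5, 1, 2],
    [2, 3, 4, 4, 1, 2],
]
print(round(mean_pairwise_r2(cells), 3))
```

A value close to 1 across all pairs is what supports calling the tumor cells a single clonal subpopulation.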
ER+ Single Cell Whole-Genome Sequencing
• High coverage breadth (80.79%) and depth (46.75X)
• 12 clonal nonsynonymous mutations and 32 subclonal mutations
• Many of the subclonal mutations occurred in intergenic or intronic regions
• However, two subclonal mutations (MARCH11 and CABP2) were found in coding regions
ER+ Single Cell Exome Sequencing
• 47 single tumor cells and 12 normal cells, coverage depth 47X and coverage breadth 93%
• The 17 clonal mutations were present in many of the single tumor cells
• 22 new subclonal mutations were identified that were not detected by population sequencing
• In contrast, only a single subclonal mutation was detected among the 12 normal cells
Investigating Clonal Diversity in a Triple-Negative Breast Tumor by Single-Cell Sequencing
Experimental Design
• ER-/PR-/Her2-
• grade III invasive ductal carcinoma
• 66-year-old woman
• no chemotherapy or hormonal therapy before lumpectomy
• no metastatic lesions detected
• Nuclei were flow-sorted from the aneuploid G2/M peak (6N), the diploid G2/M peak (4N), the hypoploid peak and from matched normal tissue for population sequencing, single cell CN profiling and single cell exome sequencing
Deep-Sequencing of the TNBC Tumor Population
• We performed population sequencing of the bulk tumor and matched normal tissue at high coverage depths (72X and 74X) and identified 374 nonsynonymous mutations.
• A number of mutations occurred in cancer genes.
• No evidence of a TP53 mutation in this patient.
• There is a point mutation in PTEN.
• Copy number profiling identified many chromosomal deletions, in addition to a focal amplification on chromosome 19p13.2.
Single Cell Copy Number Profiling
• We flow-sorted 50 single cells for single-cell copy number profiling at 220kb resolution using SNS.
• Neighbor-joining was used to reconstruct a tree, revealing two distinct subpopulations of tumor cells (A and H) in addition to the normal diploid cells (D).
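For readers unfamiliar with neighbor-joining, the sketch below implements the textbook algorithm on a precomputed distance matrix. The slides do not state which distance metric or software was used, so the function, the internal node names, and the toy five-taxon matrix are all illustrative assumptions.

```python
def neighbor_joining(names, dist):
    """Return (node, neighbor, branch_length) edges of an unrooted NJ tree.

    names: leaf labels; dist: symmetric distance matrix as a list of lists.
    """
    # Dict-of-dicts copy so nodes can be added and retired by label
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Choose the pair that minimizes the Q criterion
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to the new internal node
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
        # Distances from the new node to every remaining node
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (a, b):
                d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Hypothetical distances between five cells
labels = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for edge in neighbor_joining(labels, matrix):
    print(edge)
```

In practice the distance matrix would be derived from the segmented copy number profiles, for example as a Euclidean or correlation-based distance between cells.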
Single Cell Exome Sequencing
• 16 single tumor cells and 16 single normal cells were used for exome sequencing with Nuc-Seq.
• The 374 clonal nonsynonymous mutations detected by bulk sequencing were found in the majority of the single tumor cells.
• We also identified 145 subclonal nonsynonymous mutations that were not detected by bulk tumor sequencing.
• Hierarchical clustering showed that many of the subclonal mutations occurred exclusively in one subpopulation (H, A1 or A2).
Mutation Validation With Single-Molecule Targeted Deep-Sequencing
Single Molecule Deep Sequencing of Bulk Tumor Tissues
Schmitt et al., PNAS, 2012
[Figure: workflow panels A to H; labels: 12 randomized base tag, fixed sequence, 13 cycles of PCR amplification, hybrid custom capture; values shown: 3.4e-5 and 3.8e-10]
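The idea behind tag-based single-molecule sequencing can be sketched as grouping reads by their random tag and calling a per-position consensus within each group, so that PCR and sequencing errors carried by only a few copies are voted out. This sketch covers only that single-strand consensus step. The family-size and agreement thresholds, the function name, and the toy reads are assumptions for illustration and are not the parameters used in the talk.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads, min_family=3, min_agree=0.9):
    """reads: iterable of (tag, sequence) with equal-length sequences per tag.

    Returns {tag: consensus}, writing 'N' where agreement is below min_agree.
    """
    families = defaultdict(list)
    for tag, seq in reads:
        families[tag].append(seq)

    result = {}
    for tag, seqs in families.items():
        if len(seqs) < min_family:
            continue  # too few copies of this molecule to trust a consensus
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(column) >= min_agree else "N")
        result[tag] = "".join(bases)
    return result


# Hypothetical reads: short tags stand in for the 12-base random tag
toy_reads = [
    ("T1", "ACGT"), ("T1", "ACGT"), ("T1", "ACTT"),
    ("T2", "ACGT"),
    ("T3", "GGGG"), ("T3", "GGGG"), ("T3", "GGGG"),
]
print(consensus_by_tag(toy_reads))
print(consensus_by_tag(toy_reads, min_agree=0.6))
```

Collapsing reads this way is also why the single-molecule depth is far lower than the raw coverage depth: many raw reads fold into one consensus per original molecule.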
Validation of ER+ Single Cell Mutations
• Raw coverage depth is 116,952X and single molecule depth is 5,695X.
• Validated 94.44% (17/18) of the clonal mutations, 90.47% (19/21) of the subclonal mutations, and 19.40% (26/134) of the de novo mutations (p < 0.01), suggesting that many of these mutations are real biological variants in the tumor mass.
• Clonal mutations occurred at high frequencies (mean = 0.4212), while subclonal mutations were less prevalent (mean = 0.0895), and the de novo mutations showed the lowest frequencies (mean = 0.0195) in the tumor mass.
Validation of TNBC Single Cell Mutations
• Raw coverage depth is 118,743X and single molecule depth is 6,634X.
• Validated 99.73% (374/375) of the clonal mutations, 64.83% (94/145) of the subclonal mutations and 26.99% (152/563) of the de novo mutations (p < 0.01).
• The clonal mutations showed high frequencies (mean = 0.4457), while the subclonal mutations were less prevalent (mean = 0.050) and the de novo mutations showed the lowest frequencies (mean = 0.00047) in the tumor mass.
Investigating Mutation Rates and Clonal Evolution
Mathematical Modeling of Mutation Rates
• We used the single cell mutation frequencies and designed a stochastic birth-and-death branching tree process that uses experimental parameters for cell birth rates (Ki-67 staining), cell death rates (caspase-3 staining), and total tumor cell numbers (flow-sorting cell counts)
• The simulation was run for a series of mutation rates, 1,000 times for each mutation rate, and the average distributions were compared to the single cell data
• Our data suggest that the ER+ breast tumor had a mutation rate similar to error rates reported for normal cells, while the TNBC tumor had a mutation rate 13.3X higher than normal cells.
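A heavily simplified sketch of such a birth-and-death branching simulation is given below. It assumes discrete time steps, per-step birth and death probabilities, and at most one new mutation per daughter cell. These are simplifications of my own, and the parameter values are hypothetical rather than the measured Ki-67 or caspase-3 rates.

```python
import random
from collections import Counter


def simulate_tumor(birth, death, mutation_rate, target_cells,
                   seed=0, max_steps=10_000):
    """Grow a tumor from one cell; each cell is the set of mutation ids it carries."""
    if birth < 0 or death < 0 or birth + death > 1:
        raise ValueError("birth and death must be probabilities summing to at most 1")
    rng = random.Random(seed)
    cells = [frozenset()]
    next_mut = 0
    for _step in range(max_steps):
        if not cells or len(cells) >= target_cells:
            break  # extinct, or large enough
        survivors = []
        for muts in cells:
            u = rng.random()
            if u < death:
                continue                      # cell dies
            if u < death + birth:             # cell divides into two daughters
                for _daughter in range(2):
                    if rng.random() < mutation_rate:
                        survivors.append(muts | {next_mut})
                        next_mut += 1
                    else:
                        survivors.append(muts)
            else:
                survivors.append(muts)        # cell rests this step
        cells = survivors
    return cells


def mutation_frequencies(cells):
    """Fraction of cells carrying each mutation."""
    counts = Counter(m for muts in cells for m in muts)
    return {m: c / len(cells) for m, c in counts.items()}


# Hypothetical parameters; repeat over seeds and mutation rates to compare with data
cells = simulate_tumor(birth=0.3, death=0.1, mutation_rate=0.05,
                       target_cells=500, seed=1)
freqs = mutation_frequencies(cells)
print(len(cells), len(freqs))
```

Following the procedure described on the slide, one would repeat the run many times per candidate mutation rate (1,000 in the talk), average the resulting frequency distributions, and keep the rate whose distribution best matches the single cell data.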
Punctuated Evolution of Copy Number
The single cell copy number profiles are highly similar, suggesting that copy number rearrangements occurred early in punctuated bursts of evolution, followed by stable clonal expansions to form the tumor masses.
Gradual Evolution of Point Mutations
• In both patients we observed a large number of intermediate tumor cells with subclonal and de novo mutations that were not detected by sequencing the bulk tumor en masse.
• These data suggest that point mutations evolved gradually over long periods of time, generating extensive clonal diversity.
Genome Evolution Model in the Two Breast Tumors
Summary
1. Developed a single cell whole genome and exome sequencing method that can achieve high coverage data with low error rates.
2. Standard bulk sequencing of an ER+ tumor and a TNBC tumor revealed few somatic mutations, while single cell sequencing identified hundreds of additional genomic mutations.
3. Single-cell sequencing can guide therapeutic targeting towards mutations that are present in the majority of tumor cells, or alternatively towards therapies that target each subpopulation independently.
4. Single cell copy number data suggest that aneuploidy evolved early in tumor progression and remained stable as the tumor mass expanded.
5. Point mutations evolved gradually and continuously over extended periods of time, generating extensive clonal diversity.
6. Future work will determine if measuring the amount of intratumor heterogeneity can predict patient survival or response to chemotherapy in the clinic.
Single Cell Sequencing in the Clinic of Cancer Care
1. Non-invasive monitoring of tumor cells in the blood or bodily fluids (e.g., residual disease).
2. Measuring the extent of intratumor heterogeneity and determining if it is predictive for survival or response to chemotherapy.
3. Early detection of tumor cells in clinical samples (e.g., blood).
4. Analysis of scarce clinical samples to obtain high quality genomic data (fine-needle aspirates, circulating tumor cells).
Navin & Hicks 2011
Acknowledgements
Funding Agencies
T.C. Hsu and Alice-Reynolds Kleberg Foundation, Texas STARS
Center for Genetics & Genomics
Nick Navin
Marco Leung
Jill Waters
MDA Sequencing Core
Erika Thompson, Khadan Kahnov
Louis Ramagli, Hongli Tan
Clinical Collaborators
Funda Meric-Bernstam
Hong Zhang
Statistical Collaborators
Ken Chen, Han Liang
Paul Scheet, Selina Vattathil
Rui Zhao, Franziska Michor
Anna Unruh
Emi Sei
Alexander Davis
Navin Laboratory
Thank You!
Single Cell Isolation Methods
Amplification Methods
Improved Amplification Efficiency of G2/M vs. G1/0 Cells
The improved amplification efficiency can be shown using panels of 22 chromosome-specific primer pairs for PCR.
In G1/0 single cells, only 25.58% (11/43) of the cells show full amplification of the PCR products, compared with 45.34% (39/86) of G2/M cells.
Protein Damaging Subclonal and De Novo Mutations
1. SIFT - based on sequence homology and the physical properties of amino acids
2. PolyPhen - via analysis of multiple sequence alignments and protein 3D-structures
3. This plot shows that many of these de novo mutations are predicted to damage the structures of the proteins encoded by the mutated genes
Duplex Sequencing Metrics
Clustered Heatmap of 50 ER+ Single Cell CN Profiles
Clustered heatmap of segmented single cell copy number profiles shows that all single tumor cells are highly clonal, sharing amplifications of chromosome 1q, 5, 8, 10, 15, 16p, 19, 20, 21 and whole chromosome deletion of 1p, 6, 9, 13 and 18
Two Disparate Molecular Clocks Operate in the Tumor
[Figure panels: Copy Number Evolution; Mutational Evolution (point mutations and indels)]
Mutation Spectrum Not Significantly Different
Kolmogorov-Smirnov test, p = 0.31
KS test, p = 0.14
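The slide does not say which samples were compared, so the following is only a generic sketch of the two-sample Kolmogorov-Smirnov statistic, which is the largest gap between two empirical cumulative distributions. The function name and toy values are hypothetical. A p-value would normally come from a statistics library, for example `scipy.stats.ks_2samp`.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: maximum distance between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)   # fraction of a that is <= v
        cdf_b = bisect_right(b, v) / len(b)   # fraction of b that is <= v
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical values for two groups of mutations
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

A small D, and a correspondingly large p-value, means the two distributions cannot be distinguished, which is the conclusion drawn on this slide.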
Coverage Performance of Whole-Genome SCS
          SKP      SK1      SK2
Depth     51X      66X      56X
Breadth   90.4%    87.1%    80.3%
• High coverage breadth and depth were achieved for whole-genome single cell sequencing of two SK-BR-3 cells (SK1, SK2)
• Single cell coverage is less uniform and correlates with GC content