Page 1
1 of 27
2012-10-29, Nikolas Pontikos, Automatic Analysis of Flow Cytometry Data
Analysis Methods in Flow Cytometry:
Can a Computer Do Better than a Human?
Nikolas Pontikos
Sackler Lecture Theatre, Level 7, Monday 29th October 2012
PhD Student, Todd Lab
Page 2
What is Flow Cytometry?
© 1998-2012 Abcam plc. All rights reserved
Event     Forward Scatter   Side Scatter   CD4   CD127   CD45RA   CD25
1         2110              309            103   254     4        70
2         1565              252            57    278     341      59
...       ...               ...            ...   ...     ...      ...
110,992   964               256            78    199     9        345
[Figure: Side Scatter against Forward Scatter, one point per event]
110,992 points
1 point = 1 event = 1 cell
Page 3
Gating on Forward and Side Scatter
[Figure: Side Scatter (granularity) against Forward Scatter (cell size). Labelled populations: Lymphocytes (CD4+ Lymphocytes, CD8+ Lymphocytes), Granulocytes, Neutrophils. Annotation: Lymphocyte Gate]
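A gate on forward and side scatter amounts to keeping the events that fall inside a region of the plot. A minimal sketch, assuming a rectangular gate with made-up bounds (the slide does not give the gate coordinates, and a real lymphocyte gate is usually drawn as a polygon or ellipse); `rectangle_gate` is a hypothetical helper:

```python
def rectangle_gate(events, fsc_range, ssc_range):
    """Keep the events whose (FSC, SSC) pair lies inside a rectangular gate."""
    fsc_lo, fsc_hi = fsc_range
    ssc_lo, ssc_hi = ssc_range
    return [
        e for e in events
        if fsc_lo <= e["fsc"] <= fsc_hi and ssc_lo <= e["ssc"] <= ssc_hi
    ]

# Forward and side scatter of the three events listed in the event table.
events = [
    {"event": 1, "fsc": 2110, "ssc": 309},
    {"event": 2, "fsc": 1565, "ssc": 252},
    {"event": 110992, "fsc": 964, "ssc": 256},
]

# Hypothetical gate bounds, chosen only for illustration.
gated = rectangle_gate(events, fsc_range=(1000, 2500), ssc_range=(100, 400))
print(len(gated), "of", len(events), "events fall inside the gate")
```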
Page 4
Manual Gating of Cell Phenotypes
% of CD25+ Naive Cells
% of Memory Cells
Page 5
My Work Follows from Manually Gated Data from this Paper
IL2RA gene codes for CD25
[Figure: Log10 CD45RA Intensity against Log10 CD25 Intensity, with regions labelled Naive CD25-, Naive CD25+, Memory CD25-, Memory CD25+ and Memory]
IL2RA associated with T1D
Page 6
Evaluation of Gating: Association and Repeatability of Cell Phenotypes
repeatability of cell phenotypes.
✦ 180 individuals (matched for IL2RA genotype, age and sex).
✦ 15 individuals recalled up to 6 months later.
Total of 195 samples.
association of cell phenotypes with:
• IL2RA SNPs (rs12722495, rs2104286 and rs11594656)
• age
• sex
Page 7
Automatic Gating on CD25
[Figure: Log10 CD45RA Intensity against Log10 CD25 Intensity, showing the CD25 Gate, the CD45RA+ (Naive) Gate and the CD45RA- (Memory) Gate; regions labelled Naive CD25-, Naive CD25+, Memory CD25-, Memory CD25+ and Memory]
% of CD25+ Naive Cells over total Naive Cells
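The phenotype is a ratio of counts: cells above both the CD45RA (naive) gate and the CD25 gate, divided by all cells above the CD45RA gate. A minimal sketch, with hypothetical gate positions and toy intensities (none of the numbers come from the slides); `percent_cd25pos_naive` is an illustrative name:

```python
def percent_cd25pos_naive(cells, cd25_gate, naive_gate):
    """Percentage of naive (CD45RA+) cells that are also CD25+.

    `cells` is a list of (log10 CD25, log10 CD45RA) pairs.
    """
    naive = [(cd25, cd45ra) for cd25, cd45ra in cells if cd45ra > naive_gate]
    if not naive:
        return float("nan")  # no naive cells, so the percentage is undefined
    positive = sum(1 for cd25, _ in naive if cd25 > cd25_gate)
    return 100.0 * positive / len(naive)

# Toy (CD25, CD45RA) log10 intensities for six cells.
cells = [(0.2, 1.6), (0.9, 1.7), (1.1, 1.5), (0.3, 0.4), (1.2, 0.3), (0.1, 1.8)]
pct = percent_cd25pos_naive(cells, cd25_gate=0.8, naive_gate=1.0)
print(f"{pct:.1f}% of naive cells are CD25+")
```

Four of the six toy cells are naive and two of those are CD25+, so the sketch reports 50%.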
Page 8
Automatic Gating on CD25: Defining Threshold
Defining threshold above which cells are CD25 positive:
automatic gating: 95th percentile of auto gated blank beads
manual gating: manually gated blank beads + isotype control + judgment call
Automatic Gating: Only one CD25 gate for all samples per day.
[Figure: CD25+ Gate Over Time. CD25+ Threshold (y axis, 7 to 12) by date (x axis, Mar to Sep) for auto.beads and manual gating]
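The automatic threshold is a single percentile computation on the day's blank-bead events. A minimal sketch, assuming the bead events have already been isolated and using simulated bead intensities in place of real data; the linear-interpolation `percentile` helper is one common definition, not necessarily the one used in the talk:

```python
import random

def percentile(values, q):
    """q-th percentile (0-100), interpolating linearly between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty data")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Simulated blank-bead CD25 intensities for one day (illustrative only).
rng = random.Random(1)
beads = [rng.gauss(0.5, 0.1) for _ in range(5000)]

# One threshold per day, applied to every sample acquired that day.
threshold = percentile(beads, 95)
sample_cd25 = [0.3, 0.55, 0.7, 0.9]
is_cd25_positive = [x > threshold for x in sample_cd25]
print(round(threshold, 3), is_cd25_positive)
```

Because the threshold depends only on the beads, every sample from the same day is cut at the same CD25 value, with no per-sample judgment call.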
Page 9
Percentage of Naive CD25+ Cell Phenotype:
Association
Auto Gating: SNP and Sex Effect Detected
[Figure: association estimates (y axis, -0.6 to 0.4) for rs12722495, rs2104286, rs11594656, Age/10 and Male, under auto.beads and manual gating]
Page 10
Percentage of Naive CD25+ Cell Phenotype:
Repeatability
[Figure: CD25+ Naive % on Day 2 against CD25+ Naive % on Day 1 for the recalled individuals, under auto.beads and manual gating]
Method       R2
auto.beads   0.797
manual       0.598
15 recalled individuals (a, b, c, d, ..., o, p)
Auto Gating: Better Repeatability Than Manual
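Repeatability is summarised here by R2 between the Day 1 and Day 2 values of the recalled individuals. A minimal sketch, assuming R2 is the squared Pearson correlation (the slide does not say how it was computed) and using invented percentages:

```python
from math import fsum

def r_squared(day1, day2):
    """Squared Pearson correlation between paired measurements."""
    n = len(day1)
    if n != len(day2) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean1, mean2 = fsum(day1) / n, fsum(day2) / n
    sxy = fsum((a - mean1) * (b - mean2) for a, b in zip(day1, day2))
    sxx = fsum((a - mean1) ** 2 for a in day1)
    syy = fsum((b - mean2) ** 2 for b in day2)
    return sxy ** 2 / (sxx * syy)

# Invented CD25+ naive percentages for five recalled individuals.
day1 = [8.0, 12.0, 15.0, 20.0, 24.0]
day2 = [9.0, 11.0, 16.0, 19.0, 25.0]
r2 = r_squared(day1, day2)
print(f"R2 = {r2:.3f}")
```

A value near 1 means an individual's phenotype on the second visit is well predicted by the first; a gating method that adds noise pulls the value down.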
Page 11
Automatic Gating on CD45RA
[Figure: Log10 CD45RA Intensity against Log10 CD25 Intensity, showing the CD25 Gate, the CD45RA+ (Naive) Gate and the CD45RA- (Memory) Gate; regions labelled Naive CD25-, Naive CD25+, Memory CD25-, Memory CD25+ and Memory]
% of Memory Cells over total Non T Regs
Page 12
Cells which Transition from Naive to Memory
Lose Expression of CD45RA
Given usual bimodal distribution of CD45RA:
manual gates: identify peaks, remove middle bit
[Figure: density of Log10 CD45RA Intensity, with the Memory and Naive peaks and the Memory Gate and Naive Gate marked]
Page 13
Automatic Gating on CD45RA:
Fitting Mixtures of Two Distributions
Fit a mixture of two Gaussian (mm) distributions.
[Figure: density of Log10 CD45RA Intensity with the fitted mm curve]
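A two-component Gaussian mixture is usually fitted by expectation-maximisation (EM). The sketch below is an assumption about how such a fit can be done, not the talk's implementation: it uses simulated intensities, a crude quartile-based start, and a fixed number of iterations instead of a convergence test. `fit_two_gaussians` is a hypothetical name.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def fit_two_gaussians(xs, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weights, means, sds) with the lower-mean component first.
    """
    n = len(xs)
    ordered = sorted(xs)
    # Crude start: quartiles as means, overall spread as both sds.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    centre = sum(xs) / n
    spread = math.sqrt(sum((x - centre) ** 2 for x in xs) / n)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability that each point is in component 1.
        resp = []
        for x in xs:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.5)
        # M-step: re-estimate each component from its weighted points.
        n1 = sum(resp)
        n0 = n - n1
        means = [
            sum((1 - r) * x for r, x in zip(resp, xs)) / n0,
            sum(r * x for r, x in zip(resp, xs)) / n1,
        ]
        var0 = sum((1 - r) * (x - means[0]) ** 2 for r, x in zip(resp, xs)) / n0
        var1 = sum(r * (x - means[1]) ** 2 for r, x in zip(resp, xs)) / n1
        sds = [max(1e-6, math.sqrt(var0)), max(1e-6, math.sqrt(var1))]
        weights = [n0 / n, n1 / n]
    order = sorted((0, 1), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [sds[k] for k in order])

# Simulated log10 CD45RA intensities: 600 "memory" and 400 "naive" cells.
rng = random.Random(0)
xs = ([rng.gauss(0.6, 0.15) for _ in range(600)]
      + [rng.gauss(1.5, 0.12) for _ in range(400)])
weights, means, sds = fit_two_gaussians(xs)
print([round(m, 2) for m in means], [round(w, 2) for w in weights])
```

On well-separated peaks like these the fit recovers the simulated means and proportions; skewed or overlapping peaks are where a plain Gaussian mixture can drift.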
Page 14
Automatic Gating on CD45RA:
Fitting Mixtures of Two Distributions
Fit a mixture of two Gaussian (mm) distributions.
[Figure: density of Log10 CD45RA Intensity with the fitted mm curve, the mm posterior, the 95% level, and the resulting Memory Gates and Naive Gates]
Define the gates by choosing thresholds at which the posterior probability of group membership exceeds 95%.
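Given fitted mixture parameters, the 95% rule can be applied by scanning the intensity axis between the two component means. A minimal sketch with hypothetical parameter values (not taken from the talk); `posterior_gates` is an illustrative helper, and the lower-mean component is assumed to be the memory peak:

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_upper(x, weights, means, sds):
    """Posterior probability that x belongs to the higher-mean component."""
    p_low = weights[0] * normal_pdf(x, means[0], sds[0])
    p_high = weights[1] * normal_pdf(x, means[1], sds[1])
    return p_high / (p_low + p_high)

def posterior_gates(weights, means, sds, level=0.95, steps=10000):
    """Scan between the two means for the posterior cut points.

    Returns (memory_upper, naive_lower): cells below memory_upper are called
    memory, cells above naive_lower are called naive, and cells in between
    are left ungated. If a level is never reached, the corresponding mean
    is returned as a fallback.
    """
    lo, hi = means
    memory_upper, naive_lower = lo, hi
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        p = posterior_upper(x, weights, means, sds)
        if 1 - p >= level:
            memory_upper = x   # still confidently memory
        if p >= level:
            naive_lower = x    # first point that is confidently naive
            break
    return memory_upper, naive_lower

# Hypothetical fitted parameters: (memory, naive) on the log10 CD45RA scale.
weights, means, sds = (0.6, 0.4), (0.6, 1.5), (0.15, 0.12)
memory_upper, naive_lower = posterior_gates(weights, means, sds)
print(round(memory_upper, 3), round(naive_lower, 3))
```

The gap between the two cut points plays the role of the "middle bit" that manual gating removes: cells there cannot be assigned to either peak with 95% confidence.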
Page 15
Automatic Gating on CD45RA:
Fitting Mixtures of Two Distributions
Fit a mixture of two semi-parametric symmetric distributions (sp.mm)
[Figure: density of Log10 CD45RA Intensity with the fitted sp.mm curve, the sp.mm posterior, the 95% level, and the resulting Memory Gates and Naive Gates]
Page 16
Percentage of Memory T Cell Phenotype
[Figure: density of Log10 CD45RA Intensity with Memory Gates and Naive Gates from mm, sp.mm and manual gating]
Method   %Memory
manual   66
sp.mm    66
mm       59
Page 17
Percentage of Memory T Cells Phenotype:
Association
Auto Gating: mm finds no age association
[Figure: association estimates (y axis, -0.2 to 0.3) for rs12722495, rs2104286, rs11594656, Age/10 and Male, under sp.mm, mm and manual gating]
Page 18
Percentage of Memory T Cells Phenotype:
Repeatability
15 recalled individuals (a, b, c, d, ..., o, p)
Auto Gating: Repeatability Compromised by d
[Figure: Memory % on Day 2 against Memory % on Day 1 for the recalled individuals, under sp.mm, mm and manual gating]
Method   R2
sp.mm    0.404
mm       0.139
manual   0.768
Page 19
Problem with mm: outliers
Individual d looks completely different on day two.
[Figure: Memory % on Day 2 against Memory % on Day 1 for the recalled individuals, repeated from the previous slide; R2: sp.mm 0.404, mm 0.139, manual 0.768]
Page 20
Other Samples From The Same Day
[Figure: three samples from the same day, one of them Individual d. Each panel shows the density of Log10 CD45RA Intensity and Log10 CD45RA Intensity against Log10 CD25 Intensity, with manual gates, sp.mm gates, mm gates and the CD25 Gate]
%Memory   panel 1   panel 2   panel 3
manual    52        34        57
sp.mm     57        40        35
mm        42        62        8
Page 21
A First Approach: Averaging Over Gate Positions
Averaging Gate Positions from Samples on Same Day:
[Figure: mmGating applied to each of three samples from the same day (density of Log10 CD45RA Intensity), and the same samples with the resulting learned.mm gates]
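Averaging gate positions over the samples acquired on one day can be sketched as a group-by followed by a mean. The sample identifiers, dates and gate values below are invented, and the use of a plain arithmetic mean is an assumption about what "averaging" means here:

```python
from collections import defaultdict
from statistics import mean

def average_gates_by_day(sample_gates):
    """Average per-sample gate positions over all samples acquired on one day.

    `sample_gates` maps sample id -> (day, memory_upper, naive_lower).
    Returns day -> (mean memory_upper, mean naive_lower).
    """
    by_day = defaultdict(list)
    for day, memory_upper, naive_lower in sample_gates.values():
        by_day[day].append((memory_upper, naive_lower))
    return {
        day: (mean(g[0] for g in gates), mean(g[1] for g in gates))
        for day, gates in by_day.items()
    }

# Invented per-sample mm gates; sample "d" sits far from the others on its day.
sample_gates = {
    "a": ("2012-05-01", 0.90, 1.20),
    "b": ("2012-05-01", 0.95, 1.25),
    "d": ("2012-05-01", 0.40, 0.70),
    "e": ("2012-05-02", 1.00, 1.30),
}
day_gates = average_gates_by_day(sample_gates)

# Every sample is then gated with its day's averaged positions.
for sample_id, (day, _, _) in sample_gates.items():
    print(sample_id, tuple(round(g, 3) for g in day_gates[day]))
```

The averaged gate pulls an unusual sample back towards its neighbours, at the cost of letting that sample shift the gate used for everyone else on the same day.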
Page 22
Percentage of Memory T Cells Phenotype:
Association
Closer Agreement to Manual
[Figure: association estimates (y axis, -0.2 to 0.3) for rs12722495, rs2104286, rs11594656, Age/10 and Male, under learned.mm and manual gating]
Page 23
Percentage of Memory T Cells Phenotype:
Repeatability
Improved Repeatability: better than sp.mm
[Figure: Memory % on Day 2 against Memory % on Day 1 for the recalled individuals, shown for learned.mm, mm and manual gating and, for comparison, for sp.mm, mm and manual gating]
Method       R2
learned.mm   0.666
sp.mm        0.404
mm           0.139
manual       0.768
Page 24
A Smarter Approach: Hierarchical Mixture Model
Parameters fit to one sample influence parameters fitted to other samples
[Figure: three samples (density of Log10 CD45RA Intensity), each with its own Parameter Estimation step]
Page 25
Can a Computer Do Better than a Human?
In picking a more consistent threshold: Yes, as seen in the case of CD25 thresholding.
More complex gating on CD45RA: Not yet, mainly because of outliers.
Page 26
Future
Moving Away from Manual Gating:
- Full Auto Gating on Known/Defined Subsets.
- Automatic Gating of Unknown Subsets.
- Development of Automatic Pipeline.
Dealing with outliers:
- Probabilistic Cell Phenotypes.
- Outlier Detection and Reporting of Anomalies.
Better Model Fitting:
- Different Types of Distributions (skewed distributions).
- Hierarchical Approach (Bayesian Mixture Models).
Page 27
Acknowledgments
Calliope Dendrou
Vincent Plagnol
Linda Wicker
John Todd
Stats Group:
Jason Cooper
Hui Guo
Xin Yang
Immunologists:
Tony Cutler
Ricardo Ferreira
Marcin Pekalski
Supervisor: Chris Wallace
Second Supervisor: Anna Petrunkina-Harrison