Challenges in Process Comparison Studies
Seth Clark, Merck and Co., Inc.
Acknowledgements: Robert Capen, Dave Christopher, Phil Bennett, Robert Hards, Xiaoyu Chen, Edith Senderak, Randy Henrickson
Key Issues
• Biologics and small molecules pose different challenges in process comparison studies
• The biologic comparison problem is often poorly defined
• Strategies are needed for addressing risks from process variability early in the product life cycle, when experience is limited
Biologic Process Comparison Problem
• Biological products such as monoclonal antibodies have complex bioprocesses to derive, purify, and formulate the "drug substance" (DS) and "drug product" (DP)
• The process definition established for Phase I clinical supplies may have to be changed for Phase III supplies, for example:
– Scale-up change: 500 L fermenter to 5000 L fermenter
– Change of manufacturing site
– Removal of an additional impurity for marketing advantage
– Change of resin manufacturer to a more reliable source
[Process flow diagram: cells and medium feed Fermentation; buffers and resins feed Separation & Purification; Filtration and Formulation then yield the DS and DP]
Comparison Exercise
ICH Q5E: "The goal of the comparability exercise is to ensure the quality, safety and efficacy of drug product produced by a changed manufacturing process, through collection and evaluation of the relevant data to determine whether there might be any adverse impact on the drug product due to the manufacturing process changes."
Comparison decision
[Decision flowchart: if there is no meaningful change in CQAs or important analytical QAs, and scientific justification supports an analytical-only comparison, conclude Comparable; if there is a meaningful change, assess preclinical animal and/or clinical S/E; no meaningful change there also yields Comparable, while a meaningful change yields Not Comparable]
What about QbD?
[Diagram: the knowledge space links the X space (critical process parameters and material attributes) through models to the Y space (critical quality attributes), which links to the Z space (clinical safety/efficacy, S/E); the DS must lie in an acceptable quality constraint region that links to safety, efficacy, etc., and to acceptable clinical S/E. S/E = f(CQAs) + e = f(g(CPP)) + e. Models? Complete?]
QbD relates process parameters (CPPs) to CQAs, which drive S/E in the clinic
Risks and Appropriate Test
                          Truth: Comparable         Truth: Not Comparable
Conclude Comparable       Correct                   Consumer risk (mostly)
Conclude Not Comparable   Producer risk (mostly)    Correct
Ha: Comparable analytically. Action: support the scientific argument with evidence for comparable CQAs.
H0: Not comparable analytically. Action: examine with scientific judgment; determine if preclinical/clinical studies are needed to determine comparability.
• Hypotheses are of an equivalence type of test
• Process mean and variance are both important
• Study design and "sample size" need to be addressed
• Meaningful differences are often not clear
• Difficulty defining meaningful differences and the need to demonstrate "highly similar" imply statistically meaningful differences may also warrant further evaluation
• Non-comparability can result from "improvement"
Specification Setting
[Figure: CQA scale with LSL and USL mapped to ~clinical safety/efficacy (S/E); f(CQAs) = S/E ??]
• In many cases for biologics, an explicit f linking CQA to S/E is unknown; usually there is only a qualitative link between CQA and S/E
• It is difficult to establish such an f for biologics
• Specs correspond to this link and are refined and supported with clinical experience and data on process capability and stability
Process and Spec Life Cycle
[Timeline figure: CQA release results over time from preclinical through the Phase I study, Phase III study, and commercial production, showing URL/LRL (later USL/LSL) limits, preclinical/animal data, clinical trial data, and processes 1-4 across process development. Annotations: preliminary specs and process 1 identified; upper spec revised based on clinical S/E; process revised to lower the mean; process revised again but not tested in the clinic (analytical comparison only); process 3 in commercial production with further post-approval changes; design space in effect]
Sample Size Problem
• "Wide format"
• Unbalanced (N old process > N new process)
• Process variation, N = # lots
– Usually more of a concern
– Independence of lots
– What drives # lots available?
1. Needs of the clinical program
2. Time, resources, and funding available
3. Rules of thumb
– Minimum 3 lots/process for release
– 3 lots/process or fewer for stability
– 1-2 for forced degradation (2 previous vs. 1 new)
• DF for estimating assay variation
– Usually less of a concern
• Multiple stability testing results are available
• Assay qualification/validation data sets are available
More about # of Lots
Same source DS lot!
“…batches are not independent. This could be the case if the manufacturer does not shut down, clean out, and restart the manufacturing process from scratch for each of the validation batches.” Peterson (2008)
"Three consecutive successful batches has become the de facto industry practice, although this number is not specified in the FDA guidance documents" Schneider et al. (2006)
DP Lot       DS Lot
L00528578    07-001004
L00528579    07-001007
L00518510    07-001013
L00518511    07-001013
L00518542    07-001013
[Stability plot: CQA result (roughly 50-70) vs. month (0-9) for the lots above]
Stability Concerns
• A constrained-intercept, multiple-temperature model gives more precise lot release means and good estimates of assay plus sampling variation
• Similar sample size problems
• Differences in lot variation are generally not tested, given the limited # of lots
Long-term stability: evaluate differences in slope between processes. Forced degradation: evaluate differences in the derivative curve.
[Plot: CQA/week vs. week (0-4) for two processes; the blue process shows an improvement in rate, so it is not comparable]
Y = (β0 + Lot) + (β1 + Lot·Temp + Temp)·f(Months) + eTest + eResidual
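The slope comparison for long-term stability can be sketched with a simple per-process least-squares fit. This is a deliberate simplification of the constrained-intercept mixed model above, and the data are invented for illustration:

```python
# Minimal sketch of the long-term stability slope comparison: fit a
# straight line to each process's stability data and difference the
# slopes.  This is plain least squares per process, NOT the slide's
# constrained-intercept mixed model; the data below are invented.
import numpy as np

def degradation_slope(months, results):
    """Least-squares slope of CQA result vs. time (units per month)."""
    return np.polyfit(months, results, 1)[0]

months = np.array([0.0, 3.0, 6.0, 9.0])
process_old = np.array([10.0, 9.4, 8.8, 8.2])  # loses ~0.2 / month
process_new = np.array([10.0, 9.7, 9.4, 9.1])  # loses ~0.1 / month

slope_diff = degradation_slope(months, process_old) - degradation_slope(months, process_new)
```

A formal comparison would attach a standard error to `slope_diff` (e.g., from the mixed model) before declaring the rates different.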
Methods and Practicalities
• Methods used
– Comparable to data range
– Conforms to control limit
• Tolerance limits
• 3-sigma limits
• Multivariate process control
– Difference test
– Equivalence test
• Not practical
– Process variance comparison
– Large # lots late in development, prior to commercial
Comparable to data range: Y2(N2) ≤ Y1(N1) and Y2(1) ≥ Y1(1) (order statistics: the new-process max and min lie within the historical max and min)
Conforms to control limit: Y2(N2) ≤ Ȳ1 + k·SE and Y2(1) ≥ Ȳ1 − k·SE
Equivalence test: Ȳ1 − Ȳ2 + t·SEdiff < Δ and Ȳ1 − Ȳ2 − t·SEdiff > −Δ
Difference test: Ȳ1 − Ȳ2 + t·SEdiff < 0 or Ȳ1 − Ȳ2 − t·SEdiff > 0
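The equivalence criterion above can be sketched as follows. This is a pooled-variance, two-sample version; the lot data, Δ, and the critical value `t_crit` (which in practice comes from a t table or software for the appropriate df and alpha) are illustrative:

```python
# Sketch of the equivalence (TOST-style) criterion: conclude analytical
# comparability only if the shifted confidence bounds on the mean
# difference fall strictly inside (-delta, +delta).  Pooled-variance,
# two-sample setting; names and data are illustrative.
import math
from statistics import mean, stdev

def pooled_se_diff(x1, x2):
    """Standard error of the difference in means (pooled variance)."""
    n1, n2 = len(x1), len(x2)
    sp2 = ((n1 - 1) * stdev(x1) ** 2 + (n2 - 1) * stdev(x2) ** 2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))

def equivalent(x1, x2, delta, t_crit):
    d = mean(x1) - mean(x2)
    se = pooled_se_diff(x1, x2)
    return (d + t_crit * se < delta) and (d - t_crit * se > -delta)

lots_old = [0.00, 0.10, -0.10, 0.05, -0.05]
lots_new = [0.02, 0.12, -0.08, 0.07, -0.03]
```

With a tight Δ the same data fail the criterion, which is the sense in which "meaningful difference" drives the conclusion.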
Methods and Practicalities
[Operating-characteristic plot: beta vs. alpha for each method (comparable to 3-sigma range, comparable to data range, comparable to tolerance range, difference test, equivalence test); symbols show N historical lots (3-10) compared to N2 = 3 new lots. Settings: LSL = −1, mean = 0, USL = 1, Delta = 0.25, assay variance = 2 × lot variance, total SD = 0.19. Alpha = Pr(test concludes analytically comparable when not) = Pr(consumer risk); Beta = Pr(test concludes not analytically comparable when it is) = Pr(producer risk)]
Defining a Risk Based Meaningful Difference
[Figure: a starting process and three candidate process changes plotted against the Cpk = C boundary; change 1 is not meaningful, change 2 is meaningful, change 3 is borderline meaningful]
The risk level of meaningful differences is fine-tuned through Cpk or Cpu.
Cpk = min( (μ − LRL) / (3σ), (URL − μ) / (3σ) )
LRL = lower release limit, URL = upper release limit, μ = process mean, σ = process standard deviation
Cpu = ( ln(URL) − ln(μ) ) / (3σ)
[Boundary plots over the key quality characteristic: the Cpu = C boundary in (mean, RSD) space runs between (0, 0) and (URL, 0); the Cpk = C boundary in (mean, SD) space has endpoints (LRL, 0) and (URL, 0) and peak ((URL + LRL)/2, (URL − LRL)/(6Cpk))]
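The two capability indices used here are simple to compute; a minimal sketch with illustrative numbers:

```python
# Minimal sketch of the capability indices on this slide; the example
# values in the tests are illustrative, not from the presentation.
import math

def cpk(mu, sigma, lrl, url):
    """Two-sided capability of the process vs. the release limits."""
    return min((mu - lrl) / (3.0 * sigma), (url - mu) / (3.0 * sigma))

def cpu_log(mu, sigma_log, url):
    """One-sided upper capability on the log scale, as in the Cpu above."""
    return (math.log(url) - math.log(mu)) / (3.0 * sigma_log)
```

For a centered process, Cpk = 2 corresponds to limits 6 standard deviations from the mean, the "6-sigma" shorthand used later in the deck.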
Defining a Risk Based Meaningful Difference
Underlying assumption: we are starting with a process that already has acceptable risk.
[Figure: a starting process and two candidate changes against the Cpu = C boundary (between (0, 0) and (URL, 0)) and the Cpk = C boundary (endpoints (LRL, 0) and (URL, 0), peak ((URL + LRL)/2, (URL − LRL)/(6Cpk))); one change is a clear meaningful change, the other is questionable ("meaningful change?")]
Two-sided meaningful change
• Simplifying assumptions
– Process 1 is in control with good capability (true Cpk > C) with respect to the meaningful change window (L, U)
– Process 1 is approximately centered in the meaningful change window
– Process distributions are normal with the same process variance σ²
• Equivalence test on the process distribution mean difference, risk-based in terms of Cpk:
H0: |μ1 − μ2| ≥ Δ versus HA: |μ1 − μ2| < Δ, with Δ = ((U − L)/2)·[1 − C/Cpk]
The power of this test at Δ for unbalanced n1, n2 gives the sample size calculation:
n1·n2 / (n1 + n2) ≥ (t(n1+n2−2, 1−α) + t(n1+n2−2, 1−β/2))² · [1 / (3(Cpk − C))]²
Sample size is driven by the type I and II risks α and β and by Cpk − C, the process risk relative to the maximum tolerable risk.
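The inequality above is easily solved by search. A sketch, assuming SciPy is available for the t quantiles (the function and parameter names are mine); with C = 1 it recovers the deck's examples of 3 historical lots at a 3-sigma effect size and 13 at a 2-sigma effect size, both against 3 new lots:

```python
# Sketch of the two-sided sample-size rule: smallest historical n1,
# given n2 new lots, satisfying
#   n1*n2/(n1+n2) >= ((t_{df,1-a} + t_{df,1-b/2}) / (3*(Cpk - C)))**2
# with df = n1 + n2 - 2.  Function and parameter names are mine.
from scipy.stats import t

def min_historical_lots(n2, cpk, c, alpha=0.05, beta=0.20, n_max=500):
    for n1 in range(2, n_max + 1):
        df = n1 + n2 - 2
        t_sum = t.ppf(1 - alpha, df) + t.ppf(1 - beta / 2.0, df)
        if n1 * n2 / (n1 + n2) >= (t_sum / (3.0 * (cpk - c))) ** 2:
            return n1
    return None  # not achievable within n_max
```

Note df depends on n1, so the search re-evaluates the t quantiles at every step rather than solving in closed form.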
Two-sided meaningful change sample sizes
• A comparison of 3 batches to 3 batches requires a 3-sigma effect size
• A 2-sigma effect size requires a 13-batch historical database compared to 3 new batches
• A 1-sigma effect size requires a 70-batch historical database compared to 10 new batches (not shown)
Effect size = process capability in # sigmas vs. maximum tolerable capability in # sigmas
[Plot: required historical n1 vs. new n2 (2-11) for (C, Cpk) = (1, 1.33), (1, 1.67), (1, 2)]
One-sided (upper) meaningful change
• Similar simplifying assumptions as in the two-sided evaluation
– The meaningful change window is now (0, U)
• Test on the process distribution mean difference, risk-based in terms of Cpk:
Linear: H0: μ2 − μ1 ≥ Δ versus HA: μ2 − μ1 < Δ, with Δ = (U/2)·[1 − C/Cpk]
Ratio: Δ = 2·[1 − C/Cpk]
The sample size at α and 1 − β for unbalanced n1, n2:
n1·n2 / (n1 + n2) ≥ [ (t(n1+n2−2, 1−α) + t(n1+n2−2, 1−β)) / (3(Cpk − C)) ]²
Sample size is driven by the type I and II risks α and β and by Cpk − C, the process risk relative to the maximum tolerable risk.
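The one-sided search differs from the two-sided one only in not halving beta. A sketch, assuming SciPy is available (names are mine); it recovers the deck's one-sided examples of 3 and 6 historical lots against 3 new lots:

```python
# One-sided analogue of the sample-size search: beta is NOT halved here
# (t_{df,1-b} rather than t_{df,1-b/2}).  Names are mine.
from scipy.stats import t

def min_historical_lots_one_sided(n2, cpk, c, alpha=0.05, beta=0.20, n_max=500):
    for n1 in range(2, n_max + 1):
        df = n1 + n2 - 2
        t_sum = t.ppf(1 - alpha, df) + t.ppf(1 - beta, df)
        if n1 * n2 / (n1 + n2) >= (t_sum / (3.0 * (cpk - c))) ** 2:
            return n1
    return None  # not achievable within n_max
```

Because only one tail is protected, the one-sided requirement at a 2-sigma effect size (6 historical lots) is roughly half the two-sided one (13).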
One-sided meaningful change sample sizes
• A comparison of 3 batches to 3 batches requires a 3-sigma effect size
• A 2-sigma effect size requires a 6-batch historical database compared to 3 new batches
• A 1-sigma effect size requires a 20-batch historical database compared to 10 new batches (not shown)
Effect size = process capability in # sigmas vs. maximum tolerable capability in # sigmas
[Plot: required historical n1 vs. new n2 (2-11) for (C, Cpk) = (1, 1.33), (1, 1.67), (1, 2)]
Study Design Issues
Designs for highly variable assays: what is a better design?
[Design diagrams: in the first design, process 1 lots (P1L1 ... P1Lk) and process 2 lots (P2L1 ... P2Lk) are tested in separate assay runs (run 1 ... run na for each process); in the alternative design, each run contains lots from both processes (P1L1 with P2L1, P1L2 with P2L2, ..., P1Lk with P2Lk), so process comparisons are made within runs]
Sample size with control of assay variation
[Plot: equivalence test power (0.4 to 1.0) vs. number of historical N1 lots (5 to 30) for designs labeled by (P1 runs/lot, P2 runs/lot, same runs Y/N): 1,1,N; 1,1,Y; 1,4,N; 4,1,N; 4,4,N; 4,4,Y, with the "tested in same runs" designs marked. Settings: comparisons to N2 = 3 new lots; LSL = −1, mean = 0, USL = 1, Delta = 0.25, run variance = 2 × lot variance, replicate variance = lot variance, total SD = 0.15]
Summary
• There are many challenges in process comparison for biologics, chief among them the number of lots available to evaluate the change
• For a risk-based mean shift comparison, process capability needs to be at least a 4- or 5-sigma process within meaningful change windows, such as within release limits
• Careful design of method testing and use of stability information can reduce sample size requirements
• If this is not achievable, the test/criteria need to be less powerful (increased producer risk), for example by "flagging" any observed difference to protect consumer risk
• Flagged changes need to be assessed scientifically to determine analytical comparability
Backup
References
• ICH Q5E: Comparability of Biotechnological/Biological Products Subject to Changes in their Manufacturing Process
• Peterson, J. (2008), "A Bayesian Approach to the ICH Q8 Definition of Design Space," Journal of Biopharmaceutical Statistics, 18: 959-975
• Schneider, R., Huhn, G., and Cini, P. (2006), "Aligning PAT, Validation, and Post-Validation Process Improvement," Process Analytical Technology Insider Magazine, April
• Chow, Shein-Chung, and Liu, Jen-pei (2009), Design and Analysis of Bioavailability and Bioequivalence Studies, CRC Press
• Pearn and Chen (1999), "Making Decisions in Assessing Process Capability Index Cpk"
Defining a Risk Based Meaningful Difference
[Figure: a starting process and three candidate changes against the Cpk = C boundary (endpoints (LRL, 0) and (URL, 0), peak ((URL + LRL)/2, (URL − LRL)/(6Cpk))); change 1 is not meaningful, change 2 is meaningful, change 3 is borderline meaningful]
The risk level of meaningful differences is fine-tuned through Cpk or Cpm.
Cpk = min( (μ − LRL) / (3σ), (URL − μ) / (3σ) )
LRL = lower release limit, URL = upper release limit, μ = process mean, σ = process standard deviation
Cpm = ( (URL − LRL) / (6σ) ) / √(1 + ((μ − T)/σ)²), where T is the target value
[Boundary plot: the Cpm = C boundary with endpoints (LRL, 0) and (URL, 0) and peak ((URL + LRL)/2, (URL − LRL)/(6Cpk))]
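The Cpm index above penalizes a mean that drifts off target even when the spread is unchanged; a minimal sketch with illustrative values:

```python
# Sketch of the Cpm (Taguchi-style) index above, which divides the
# spread-based capability by a penalty for distance of the mean from
# the target T.  Example values in the tests are illustrative.
import math

def cpm(mu, sigma, lrl, url, target):
    return ((url - lrl) / (6.0 * sigma)) / math.sqrt(1.0 + ((mu - target) / sigma) ** 2)
```

Shifting the mean one sigma off target cuts Cpm by a factor of √2 even though Cpk may barely move, which is why the Cpm boundary flags off-target changes more aggressively.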
Test Cpk?
Assume process 1 is in control and has good capability (true Cpk>1) with respect to the release limits.
Suppose process 2 is considered comparable to process 1 if . That is we want to test
How many lots are needed to have 80% power assuming they are measured with high precision (e.g., precision negligible) with alpha=0.05?
Pearn and Chen (1999), “Making Decisions in Assessing Process Capability Index Cpk”
The power and critical value follow from the sampling distribution of the estimated Cpk (a noncentral-t-based expression in n and C; see Pearn and Chen, 1999) for testing H0: Cpk ≤ C versus HA: Cpk > C.
[Power curve annotations: high power supports evidence for comparable CQAs; otherwise examine with scientific judgment]
Power of the Cpk test:

alpha   Cpk2   K sigmas mean from limits   N    Power
0.05    1.33   4                           49   0.80
0.05    1.67   5                           17   0.82
0.05    1.33   4                           10   0.23
0.05    1.33   4                           5    0.13
0.05    1.33   4                           3    0.09
0.05    1.67   5                           10   0.54
0.05    1.67   5                           5    0.25
0.05    1.67   5                           3    0.13
Examine further with scientific judgment
Comparability to Range Method
[Diagram: historical process 1 lot means (P1L1 through P1L6) spanning a range, with new process 2 lot means (P2L1, P2L2, P2L3) falling inside that range; what is the process distribution?]
1. Determine the subset of all historical lots, with sufficient data, that is representative of the historical lot distribution
2. The list of historical true lot means defines the historical distribution
3. The new process (P2) has significant evidence of comparability if the range of its true lot means can be shown to be within the range of the historical true lot means plus the meaningful difference
4. If a meaningful difference is not defined, set
HA:
H0:
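Steps 2-3 above reduce to a simple range check; a minimal sketch (names are mine, and the default delta = 0 is my reading of the case in step 4 where no meaningful difference is defined):

```python
# Sketch of the comparability-to-range check in steps 2-3 above: the
# new process passes if every one of its lot means falls within the
# historical range widened by the meaningful difference delta.
# delta = 0.0 (my assumption) covers the case where no meaningful
# difference has been defined.
def comparable_to_range(historical_means, new_means, delta=0.0):
    lo = min(historical_means) - delta
    hi = max(historical_means) + delta
    return all(lo <= m <= hi for m in new_means)
```

In practice the inputs would be estimated true lot means (e.g., model-based means from the stability analysis), not raw single results.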