Challenges in Process Comparison Studies
Seth Clark, Merck and Co., Inc.
Acknowledgements: Robert Capen, Dave Christopher, Phil Bennett,
Robert Hards, Xiaoyu Chen, Edith Senderak, Randy Henrickson
Key Issues
• Biologics pose different challenges than small molecules in process comparison studies
• The biologic comparison problem is often poorly defined
• Strategies are needed for addressing risks associated with process variability early in the product life cycle, with limited experience
Biologic Process Comparison Problem
• Biological products such as monoclonal antibodies have complex bioprocesses to derive, purify, and formulate the "drug substance" (DS) and "drug product" (DP)
  [Process flow diagram: Cells + Medium → Fermentation → Separation & Purification (Buffers, Resins) → Filtration (Buffers) → DS → Formulation → DP]
• The process definition established for Phase I clinical supplies may have to be changed for Phase III supplies (for example):
  – Scale-up change: 500 L fermenter to 5000 L fermenter
  – Change of manufacturing site
  – Removal of an additional impurity for marketing advantage
  – Change of resin manufacturer to a more reliable source
Comparison Exercise
ICH Q5E:
"The goal of the comparability exercise is to ensure the quality, safety and efficacy of drug product produced by a changed manufacturing process, through collection and evaluation of the relevant data to determine whether there might be any adverse impact on the drug product due to the manufacturing process changes."
[Decision flowchart: Comparison decision → Is there scientific justification for an analytical-only comparison? If yes, check for a meaningful change in CQAs or important analytical QAs; if none is found → Comparable. If analytical justification is lacking, or a meaningful analytical change is found, check for a meaningful change in preclinical animal and/or clinical S/E: none → Comparable; otherwise → Not Comparable.]
What about QbD?
[Diagram: X space (critical process parameters and material attributes) → models → Y space (critical quality attributes) → models? → Z space (clinical safety/efficacy, S/E). Within the knowledge space, an acceptable quality constraint region links to acceptable clinical S/E. Are the models complete?]

S/E = f(CQAs) + e = f(g(CPPs)) + e

QbD relates process parameters (CPPs) to CQAs, which drive S/E in the clinic.
Risks and Appropriate Test
                          Truth: Comparable         Truth: Not Comparable
Conclude Comparable       Correct                   Consumer risk (mostly)
Conclude Not Comparable   Producer risk (mostly)    Correct

H0: Not comparable analytically
  Action: Examine with scientific judgment; determine if preclinical/clinical studies are needed to determine comparability
Ha: Comparable analytically
  Action: Support the scientific argument with evidence for comparable CQAs

• Hypotheses are of an equivalence type of test
• Process mean and variance are both important
• Study design and "sample size" need to be addressed
• Meaningful differences are often not clear
• The difficulty of defining meaningful differences & the need to demonstrate "highly similar" imply that statistically meaningful differences may also warrant further evaluation
Non-comparability can result from “improvement”
Specification Setting
[Figure: a CQA distribution shown against lower/upper specification limits (LSL, USL) and lower/upper release limits (LRL, URL), with an uncertain link f(CQAs) ≈ S/E to clinical safety/efficacy.]
• In many cases for biologics, an explicit f linking CQAs to S/E is unknown
• There is usually a qualitative link between CQAs and S/E
• It is difficult to establish such an f for biologics
• Specs correspond to this link and are refined & supported with clinical experience and data on process capability and stability
Process and Spec Life Cycle
[Figure: timeline from preclinical through Phase I, Phase III, and commercial production, showing CQA release limits (USL, LSL) and processes 1-4, with a design space in effect from Phase III onward.]

1. Preliminary specs and process 1 identified (preclinical/animal data)
2. Upper spec revised based on clinical S/E; process revised to lower the mean (Phase I study)
3. Process revised again but not tested in the clinic (analytical comparison only) (Phase III study)
4. Process 3 in commercial production, with further post-approval changes
Sample Size Problem
• "Wide format"
• Unbalanced (N old process > N new process)
• Process variation, N = # lots
  – Usually the greater concern
  – Independence of lots
  – What drives the # of lots available?
    1. Needs of the clinical program
    2. Time, resources, and funding available
    3. Rules of thumb
       – Minimum 3 lots/process for release
       – 3 lots/process or fewer for stability
       – 1-2 for forced degradation (2 previous vs 1 new)
• DF for estimating assay variation
  – Usually the lesser concern
    • Multiple stability testing results are available
    • Assay qualification/validation data sets are available
More about # of Lots
DP Lot       DS Lot
L00528578    07-001004
L00528579    07-001007
L00518510    07-001013  (same source DS lot)
L00518511    07-001013  (same source DS lot)
L00518542    07-001013  (same source DS lot)
"Three consecutive successful batches has become the de facto industry practice, although this number is not specified in the FDA guidance documents" — Schneider et al. (2006)

"…batches are not independent. This could be the case if the manufacturer does not shut down, clean out, and restart the manufacturing process from scratch for each of the validation batches." — Peterson (2008)
Stability Concerns
Forced degradation: evaluate differences in slope between processes, or differences in the derivative curve ∂CQA/∂week.
Long-term stability: [figure: degradation trends by process; the new (blue) process shows an improvement in rate → not comparable].

Model: Y = (α + Lot) + (β1 + Lot×Temp + Temp)·f(Months) + e_Test + e_Residual

• A constrained-intercept, multiple-temperature model gives more precise lot release means and good estimates of assay + sample variation
• Similar sample size problems apply
• Generally don't test for differences in lot variation, given the limited # of lots
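As a minimal illustration of the long-term stability comparison (not the full mixed model above), the degradation rate for each process can be estimated as an ordinary least-squares slope; the series below are hypothetical.

```python
from statistics import mean

def slope(months, y):
    """Ordinary least-squares slope of y on months: the degradation rate."""
    mx, my = mean(months), mean(y)
    num = sum((x - mx) * (v - my) for x, v in zip(months, y))
    den = sum((x - mx) ** 2 for x in months)
    return num / den

# Hypothetical long-term stability series for one CQA (e.g., % purity)
months = [0, 3, 6, 9, 12]
old_process = [100.0, 99.1, 98.0, 97.2, 96.1]
new_process = [100.0, 99.6, 99.1, 98.5, 98.1]  # slower loss: rate "improved"

print(round(slope(months, old_process), 3))  # -0.323 per month
print(round(slope(months, new_process), 3))  # -0.163 per month
```

A rate difference of this kind, even when it is an improvement, is exactly what the slide flags as potentially not comparable.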
Methods and Practicalities
• Methods used
  – Comparable to data range (new-lot extremes within the historical range):
    Y2(N2) ≤ Y1(N1) and Y2(1) ≥ Y1(1)
  – Conforms to control limit
    • Tolerance limits
    • 3-sigma limits: Y2(N2) ≤ Ȳ1 + k·SE and Y2(1) ≥ Ȳ1 − k·SE
    • Multivariate process control
  – Difference test:
    Ȳ1 − Ȳ2 + t·SE_diff < 0 or Ȳ1 − Ȳ2 − t·SE_diff > 0
  – Equivalence test:
    Ȳ1 − Ȳ2 + t·SE_diff < Δ and Ȳ1 − Ȳ2 − t·SE_diff > −Δ
• Not practical
  – Process variance comparison
  – Large # of lots late in development, prior to commercial
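Of the methods listed, the equivalence test is the one framed around the desired risk control. A sketch under simplifying assumptions (pooled variance across processes, and a z quantile standing in for the t quantile; all numbers hypothetical):

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def equivalence_comparable(y1, y2, delta, alpha=0.05):
    """TOST-style equivalence test on lot means: conclude 'analytically
    comparable' only if the (1 - 2*alpha) confidence interval for the mean
    difference lies entirely inside (-delta, +delta)."""
    n1, n2 = len(y1), len(y2)
    sp2 = ((n1 - 1) * stdev(y1) ** 2 + (n2 - 1) * stdev(y2) ** 2) / (n1 + n2 - 2)
    se = sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    z = NormalDist().inv_cdf(1 - alpha)  # z in place of t_{n1+n2-2, 1-alpha}
    d = mean(y1) - mean(y2)
    return (d + z * se < delta) and (d - z * se > -delta)

hist = [0.02, -0.05, 0.11, -0.08, 0.04, -0.01, 0.06]  # 7 historical lot means
new = [0.00, 0.05, -0.03]                             # 3 new-process lot means
print(equivalence_comparable(hist, new, delta=0.25))  # True: CI inside +/-0.25
```

Shifting the new-process lots upward by ~0.3 flips the conclusion to False, since the lower confidence bound then falls outside −Δ.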
Methods and Practicalities
[Figure: operating characteristics of the methods; symbols are N historical lots compared to N2 = 3 new lots. Settings: LSL = −1, mean = 0, USL = 1, Δ = 0.25, assay variance = 2 × lot variance, total SD = 0.19.]

Alpha = Pr(test concludes analytically comparable when it is not) = consumer risk
Beta = Pr(test concludes not analytically comparable when it is) = producer risk
Defining a Risk Based Meaningful Difference
[Figure: process mean μ of the key quality characteristic (x-axis, from LRL to URL) vs. RSD (y-axis), showing the Cpk ≥ C and Cpu ≥ C boundaries. The starting process sits at ((URL + LRL)/2, (URL − LRL)/(6·Cpk)). Example changes: (1) not meaningful, (2) meaningful, (3) borderline meaningful.]

Cpk = min( (μ − LRL)/(3σ), (URL − μ)/(3σ) )
Cpu = (ln(URL) − ln(μ)) / (3σ)   (log scale, upper limit only)

The risk level of meaningful differences is fine-tuned through Cpk or Cpu.

LRL = lower release limit; URL = upper release limit; μ = process mean; σ² = process variance
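The Cpk ≥ C boundary can be checked directly for any candidate process. A sketch, with illustrative release limits and C = 4/3 (a 4-sigma process) chosen here as the boundary:

```python
def cpk(mu, sigma, lrl, url):
    """Cpk = min((mu - LRL) / (3*sigma), (URL - mu) / (3*sigma))."""
    return min((mu - lrl) / (3 * sigma), (url - mu) / (3 * sigma))

def change_is_meaningful(mu, sigma, lrl=-1.0, url=1.0, c=4/3):
    """A changed process is flagged as a meaningful change when it falls
    outside the Cpk >= C region (limits and C are illustrative)."""
    return cpk(mu, sigma, lrl, url) < c

print(change_is_meaningful(0.0, 1/6))  # centered 6-sigma process: False
print(change_is_meaningful(0.5, 1/6))  # same spread, mean shifted toward URL: True
```

The same mean shift with a tighter spread may stay inside the boundary, which is why the criterion is stated on (μ, σ) jointly rather than on the mean alone.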
Defining a Risk Based Meaningful Difference
[Figure: the same Cpk ≥ C and Cpu ≥ C boundary plot, with the starting process at ((URL + LRL)/2, (URL − LRL)/(6·Cpk)). Example changes: (1) meaningful change, (2) meaningful change?]

Underlying assumption: we are starting with a process that already has acceptable risk.
Two-sided meaningful change
• Simplifying assumptions
  – Process 1 is in control with good capability (true Cpk > C) with respect to the meaningful change window (L, U)
  – Process 1 is approximately centered in the meaningful change window
  – Process distributions are normal with the same process variance σ²
• Equivalence test on the process distribution mean difference
  H0: |μ1 − μ2| ≥ Δ
  HA: |μ1 − μ2| < Δ

Risk-based Δ in terms of Cpk:
  Δ = ((U − L)/2) · (1 − C/Cpk)

The power of this test at μ1 − μ2 = 0 for unbalanced n gives the sample size calculation:
  n1·n2/(n1 + n2) ≥ [ (t(n1+n2−2, 1−α) + t(n1+n2−2, 1−β/2)) / (3(Cpk − C)) ]²

Sample size is driven by the type I and II risks and by Cpk − C, the process risk relative to the maximum risk.
Two-sided meaningful change sample sizes
[Figure: required historical vs. new batch counts by effect size.]
• A comparison of 3 batches to 3 batches requires a 3-sigma effect size
• A 2-sigma effect size requires a 13-batch historical database compared to 3 new batches
• A 1-sigma effect size requires a 70-batch historical database compared to 10 new batches (not shown)
Effect size = process capability in # sigmas vs. maximum tolerable capability in # sigmas
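The two-sided sample-size inequality can be solved numerically for the number of new lots. A sketch using a z-approximation to the t quantiles (so counts come out slightly smaller than the exact t-based values); alpha, beta, and C are illustrative:

```python
from statistics import NormalDist

def min_new_lots(n_hist, cpk, c, alpha=0.05, beta=0.20):
    """Smallest number of new lots n2 such that
    n_hist*n2/(n_hist+n2) >= ((z_{1-alpha} + z_{1-beta/2}) / (3*(cpk - c)))**2,
    a z-approximation to the t-based two-sided bound."""
    z = NormalDist().inv_cdf
    bound = ((z(1 - alpha) + z(1 - beta / 2)) / (3 * (cpk - c))) ** 2
    for n2 in range(2, 10_000):
        if n_hist * n2 / (n_hist + n2) >= bound:
            return n2
    return None

# 1-sigma effect size, i.e. 3*(Cpk - C) = 1, against a 70-lot history:
print(min_new_lots(70, cpk=1.0, c=2/3))  # -> 10, matching the example above
```

The same function with a 2-sigma effect size (Cpk − C = 2/3) and a 13-lot history returns 3 new lots, in line with the quoted batch counts.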
One-sided (upper) meaningful change
• Similar simplifying assumptions as the two-sided evaluation
  – The meaningful change window is now (0, U)
• Test on the process distribution mean difference
  Linear:  H0: μ2 − μ1 ≥ Δ   vs.   HA: μ2 − μ1 < Δ
    Risk-based Δ in terms of Cpk:  Δ = (U/2) · (1 − C/Cpk)
  Ratio:   H0: μ2/μ1 ≥ Δ   vs.   HA: μ2/μ1 < Δ
    Risk-based Δ in terms of Cpk:  Δ = 2^(1 − C/Cpk)

The sample size at μ2 − μ1 = 0 or μ2/μ1 = 1 for unbalanced n:
  n1·n2/(n1 + n2) ≥ [ (t(n1+n2−2, 1−α) + t(n1+n2−2, 1−β)) / (3(Cpk − C)) ]²

Sample size is driven by the type I and II risks and by Cpk − C, the process risk relative to the maximum risk.
One-sided meaningful change sample sizes
[Figure: required historical vs. new batch counts by effect size.]
• A comparison of 3 batches to 3 batches requires a 3-sigma effect size
• A 2-sigma effect size requires a 6-batch historical database compared to 3 new batches
• A 1-sigma effect size requires a 20-batch historical database compared to 10 new batches (not shown)
Effect size = process capability in # sigmas vs. maximum tolerable capability in # sigmas
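The one-sided bound can be checked the same way; a z-approximation sketch (note 1 − β rather than 1 − β/2 in the one-sided case), verifying the batch counts quoted above:

```python
from statistics import NormalDist

def meets_one_sided_bound(n1, n2, effect_sigmas, alpha=0.05, beta=0.20):
    """True if n1*n2/(n1+n2) >= ((z_{1-alpha} + z_{1-beta}) / effect_sigmas)**2,
    where effect_sigmas = 3*(Cpk - C)."""
    z = NormalDist().inv_cdf
    bound = ((z(1 - alpha) + z(1 - beta)) / effect_sigmas) ** 2
    return n1 * n2 / (n1 + n2) >= bound

print(meets_one_sided_bound(6, 3, 2.0))    # 2-sigma effect: 6 historical vs 3 new
print(meets_one_sided_bound(20, 10, 1.0))  # 1-sigma effect: 20 historical vs 10 new
```

Both checks return True; dropping to 3 historical vs 3 new at a 1-sigma effect fails the bound, which is why only the 3-sigma effect is detectable at that size.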
Study Design Issues
Designs for highly variable assays: which is the better design?
[Diagram comparing two designs:
Design 1 — each assay run contains lots from only one process: Runs 1…na test P1L1…P1Lk (Process 1 + assay), and separate runs test P2L1…P2Lk (Process 2 + assay).
Design 2 — each assay run pairs lots from both processes: Run 1 tests P1L1 and P2L1, Run 2 tests P1L2 and P2L2, …, Run na tests P1Lk and P2Lk.]
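Why testing both processes in the same runs helps can be seen from a simple variance model. A sketch under the strong assumptions of one lot per process per run and independent run, lot, and replicate effects (the variance components are illustrative):

```python
def var_mean_diff(k, lot_var, run_var, rep_var, paired_runs):
    """Variance of the difference in process means with k lots per process.
    When both processes are tested in the same assay runs (paired_runs=True),
    the run effect is common to both processes and cancels out of the difference."""
    per_process = lot_var + rep_var + (0.0 if paired_runs else run_var)
    return 2.0 * per_process / k

# Highly variable assay: run variance = 2 x lot variance
print(var_mean_diff(3, 1.0, 2.0, 1.0, paired_runs=False))  # separate runs (≈ 2.67)
print(var_mean_diff(3, 1.0, 2.0, 1.0, paired_runs=True))   # shared runs (≈ 1.33)
```

Under these assumptions the paired design halves the comparison variance, which is the point of the slide that follows.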
Sample size with control of assay variation
[Figure: operating characteristics when both processes are tested in the same runs; comparisons to N2 = 3 new lots. Settings: LSL = −1, mean = 0, USL = 1, Δ = 0.25, run variance = 2 × lot variance, rep variance = lot variance, total SD = 0.15.]
Summary
• There are many challenges in process comparison for biologics, chief among them the number of lots available to evaluate the change
• For a risk-based mean shift comparison, process capability needs to be at least a 4- or 5-sigma process within meaningful change windows, such as within release limits
• Careful design of method testing and use of stability information can reduce sample size requirements
• If this is not achievable, the test/criteria need to be less powerful (increased producer risk), such as by "flagging" any observed difference to protect against consumer risk
• Flagged changes need to be assessed scientifically to determine analytical comparability
Backup
References
• ICH Q5E: Comparability of Biotechnological/Biological Products Subject to Changes in their Manufacturing Process
• Peterson, J. (2008), "A Bayesian Approach to the ICH Q8 Definition of Design Space," Journal of Biopharmaceutical Statistics, 18: 959-975
• Schneider, R., Huhn, G., Cini, P. (2006), "Aligning PAT, validation, and post-validation process improvement," Process Analytical Technology Insider Magazine, April
• Chow, Shein-Chung, and Liu, Jen-pei (2009), Design and Analysis of Bioavailability and Bioequivalence Studies, CRC Press
• Pearn and Chen (1999), "Making Decisions in Assessing Process Capability Index Cpk"
Defining a Risk Based Meaningful Difference
[Figure: the Cpk ≥ C and Cpm ≥ C boundary plots, with the starting process at ((URL + LRL)/2, (URL − LRL)/(6·Cpk)). Example changes: (1) change not meaningful, (2) change meaningful, (3) change borderline meaningful.]

Cpk = min( (μ − LRL)/(3σ), (URL − μ)/(3σ) )
Cpm = (URL − LRL) / ( 6σ·√(1 + ((μ − T)/σ)²) )

The risk level of meaningful differences is fine-tuned through Cpk or Cpm.

LRL = lower release limit; URL = upper release limit; μ = process mean; σ² = process variance
Test Cpk?
Assume process 1 is in control and has good capability (true Cpk > 1) with respect to the release limits. Suppose process 2 is considered comparable to process 1 if Cpk,2 > 1. That is, we want to test
  H0: Cpk,2 ≤ 1   → examine with scientific judgment
  HA: Cpk,2 > 1   → evidence for comparable CQAs

How many lots are needed to have 80% power, assuming the lots are measured with high precision (e.g., assay precision negligible), with alpha = 0.05?

  Critical value = √(2/(n−1)) · [Γ((n−1)/2) / Γ((n−2)/2)] · t(n−1, 1−α; δ = 3C√n) / (3√n)
  Power = 1 − Pr[ the noncentral t statistic t(n−1; δ = 3·Cpk·√n) falls below the correspondingly rescaled critical value ]

Pearn and Chen (1999), "Making Decisions in Assessing Process Capability Index Cpk"
Power
Assume process 1 is in control and has good capability (true Cpk > 1) with respect to the release limits. Suppose process 2 is considered comparable to process 1 if Cpk,2 > 1. That is, we want to test
  H0: Cpk,2 ≤ 1   → examine further with scientific judgment
  HA: Cpk,2 > 1   → evidence for comparable CQAs

alpha   Cpk,2   K sigmas mean from limits   N    Power
0.05    1.33    4                           49   0.80
0.05    1.67    5                           17   0.82
0.05    1.33    4                           10   0.23
0.05    1.33    4                           5    0.13
0.05    1.33    4                           3    0.09
0.05    1.67    5                           10   0.54
0.05    1.67    5                           5    0.25
0.05    1.67    5                           3    0.13
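These power values can be reproduced approximately by Monte Carlo without the noncentral t machinery: simulate the Cpk estimate under H0 to get the critical value, then again under the alternative. A sketch, assuming a centered normal process with limits fixed at ±1 and a modest simulation count:

```python
import random
from statistics import mean, stdev

def cpk_hat(xs, lsl=-1.0, usl=1.0):
    """Sample estimate of Cpk from observed values xs."""
    m, s = mean(xs), stdev(xs)
    return min((m - lsl) / (3 * s), (usl - m) / (3 * s))

def mc_power(n, cpk_alt, cpk_null=1.0, alpha=0.05, sims=20_000, seed=1):
    """Estimate the power of 'conclude comparable if Cpk-hat exceeds the
    1-alpha point of its H0 sampling distribution' by simulation, for a
    centered process (sigma = 1/(3*Cpk) with limits at +/-1)."""
    rng = random.Random(seed)
    def draw(true_cpk):
        sigma = 1.0 / (3.0 * true_cpk)
        return cpk_hat([rng.gauss(0.0, sigma) for _ in range(n)])
    null = sorted(draw(cpk_null) for _ in range(sims))
    crit = null[int((1 - alpha) * sims)]  # ~95th percentile under H0
    return sum(draw(cpk_alt) > crit for _ in range(sims)) / sims

print(mc_power(49, 1.33))  # close to the 0.80 reported above for N = 49
```

The poor power at small N (0.09 with 3 lots at a true 4-sigma process) is the quantitative core of the sample size problem.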
Comparability to Range Method
[Figure: historical lot means P1L1…P1L6 and new lot means P2L1…P2L3 plotted against the (unknown) process distribution.]

H0: μ2(N2) ≥ μ1(N1) + Δ or μ2(1) ≤ μ1(1) − Δ
HA: μ2(N2) < μ1(N1) + Δ and μ2(1) > μ1(1) − Δ

1. Determine the subset of all historical lots that is representative of the historical lot distribution and has sufficient data
2. The list of historical true lot means defines our historical distribution
3. The new process (P2) has significant evidence of comparability if the range of true lot means for the new process can be shown to be within the range of the historical true lot means + the meaningful difference
4. If a meaningful difference is not defined, set Δ = 0
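Steps 3-4 reduce to a simple containment check on lot means. A sketch with hypothetical lot means (in practice the "true" lot means must be estimated, which this ignores):

```python
def range_comparable(hist_means, new_means, delta=0.0):
    """Range method: support comparability when every new-process lot mean lies
    within the historical range widened by the meaningful difference delta
    (delta = 0 when no meaningful difference is defined)."""
    return (max(new_means) < max(hist_means) + delta
            and min(new_means) > min(hist_means) - delta)

hist = [98.2, 99.0, 99.5, 100.1, 100.8, 101.2]  # P1 lot means (hypothetical)
new = [99.1, 99.8, 100.5]                       # P2 lot means (hypothetical)
print(range_comparable(hist, new))              # True: new lots inside the range
print(range_comparable(hist, [101.5, 99.8, 100.5]))  # False: one lot above the range
```

With few historical lots the observed range understates the true process spread, which is why a nonzero Δ, when one can be defined, makes the criterion less punishing.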