Mark Varney
Statistics Program Manager
Abbott Quality and Regulatory
Abbott Park, IL
• Statistics mentioned 15 times
• “statistical”
• “statistics”
• “statistically”
• “statistician” – as a suggested team member
• Clear that FDA expects more statistical thinking in validation
• Some statisticians asked to be a team member may not be familiar with Quality Assurance applications and jargon
• Acceptance Sampling
• Statistical Process Control (SPC)
• Process Capability
Process Validation: The collection and evaluation of data, from the process design stage through commercial production, which establishes scientific evidence that a process is capable of consistently delivering quality product.
The new Guidance specifies a lifecycle approach:
• Stage 1 – Process Design
• Statistically designed experiments (DOE)
• Stage 2 – Process Qualification
• Design of facility and equipment/utilities qualification
• Process Performance Qualification (PPQ)
• SPC; Variance components; Acceptance Sampling; CUDAL, etc.
• Number of lots required is no longer specified as three
• Must complete Stage 2 before commercial distribution
• Stage 3 – Continued Process Verification (CPV)
• Continual assurance the process is operating in a state of control
• Data trending, SPC, Acceptance Sampling, etc.
• Guidance recommends scrutiny of intra- and inter-batch variation
Requirements: Process performance to consistently meet attributes related to identity, strength, quality, purity, and potency
Statistical confidence required may be based on…
• Risk
• Scientific knowledge
• Criticality of attribute (AQL, etc.)
• Prior / historical knowledge
• Stage 1 knowledge
• Revalidation
• Is test an abuse test?
Provide statistical confidence that…
1. A high percent of the population is within specification
2. A population parameter is within specification
- Mean; Standard Deviation; RSD; Cpk/Ppk
3. A standard test (UDU, Dissolution, etc.) will pass
“90% confidence at least 99% of population meets spec”
“90% confidence nonconformance rate <1%”
“90% confidence for 99% reliability”
Common statistical methods
• Continuous data: Normal Tolerance Interval
• Discrete data: High Confidence Binomial Sampling Plan
Which sampling plan provides more confidence?
1. n=90, accept=0, reject=1 99% confidence ≥95% conforming
2. n=300, accept=0, reject=1 95% confidence ≥99% conforming
3. n=2300, accept=0, reject=1 90% confidence ≥99.9% conforming
What you want to be confident of is usually more important than how confident you want to be
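For a zero-acceptance (accept=0) plan, the confidence statement follows from the binomial probability of seeing no failures in n units: confidence = 1 − cⁿ, where c is the claimed conforming rate. A minimal sketch that checks the three plans listed above; the function name is illustrative and not from the source.

```python
def zero_accept_confidence(n: int, conforming: float) -> float:
    """Confidence that at least `conforming` of the population conforms,
    given n units tested with zero failures (binomial model)."""
    return 1.0 - conforming ** n

# (sample size, claimed conforming rate) for the three plans on the slide
plans = [(90, 0.95), (300, 0.99), (2300, 0.999)]
for n, c in plans:
    conf = zero_accept_confidence(n, c)
    print(f"n={n}: {conf:.1%} confidence that >= {c:.1%} conform")
```

All three plans accept on zero failures; they differ in what they demonstrate (95%, 99%, or 99.9% conforming) as much as in the confidence level attached to it.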
• Phrase “high degree of assurance” mentioned four times
• “…the PPQ study needs to be completed successfully and a high degree of assurance in the process achieved before commercial distribution of a product.”
ICH Q7A GMP for APIs:
• “A documented program that provides a high degree of assurance that a specific process, method, or system will consistently produce a result meeting pre-determined acceptance criteria.”
• Suggest 90% or 95% confidence is acceptable
• This confidence is more related to Type II error and Power
• Although α=0.05 / 95% confidence is common for Type I error, it is not as common for power, where 80% and 90% are also common.
Variables data: Normal Tolerance Interval*
• Example: Show with 90% confidence that at least 99.6% of powdered drug fill weights meet spec of 505-535mg.
• Test n=50 bottles; 1 every 5 minutes for 4 hrs
• Acceptance criterion: 90% confidence ≥99.6% meet spec
• Variables data with average, s.d.: use tolerance interval method
  Mean ± ks must be within specification limits
• Why 99.6%? Production AQL is 0.4% for fill weight.
*other methods may be used, such as variables sampling; may give lower Type I error
[Figure: Filler Validation Run, capability charts. I Chart: X̄=519.31, UCL=529.10, LCL=509.52. Moving Range Chart: MR̄=3.68, UCL=12.03, LCL=0. Last 50 Observations. Capability Histogram with specifications LSL 505, USL 535. Normal Prob Plot: AD 0.464, P 0.245. Capability Plot: Within StDev 3.26386, Cp 1.53, Cpk 1.46; Overall StDev 3.24963, Pp 1.54, Ppk 1.47.]
• Process is in statistical control, normality not rejected
• 90% confidence / 99.6% coverage tolerance interval: $\bar{x} \pm ks = 519.31 \pm 3.35(3.25) = (508.4,\ 530.2)$
• Pass: Tolerance interval lies within spec of 505 - 535
• We can be 90% confident ≥99.6% of containers meet spec
• Will be able to pass in-process 0.4% AQL sampling
• If process is stable, 90% confidence ≥95% of lots will pass
• Engineer friendly: tables or software can be used
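Where tables or dedicated software are not at hand, the two-sided k-factor can be approximated. The sketch below uses Howe's approximation with a Wilson-Hilferty approximation for the chi-square quantile, so it needs only the Python standard library. Both approximations are substitutions made here for illustration, not the method behind the table; the function names are hypothetical. It reproduces the fill-weight example (n=50, mean 519.31, s 3.25, spec 505 to 535).

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile_wh(q: float, df: int) -> float:
    """Wilson-Hilferty approximation to the chi-square quantile (assumption:
    adequate for moderate df; exact software values differ slightly)."""
    z = NormalDist().inv_cdf(q)
    return df * (1 - 2 / (9 * df) + z * sqrt(2 / (9 * df))) ** 3

def k_two_sided(n: int, coverage: float, confidence: float) -> float:
    """Howe's approximation to the two-sided normal tolerance k-factor."""
    z = NormalDist().inv_cdf((1 + coverage) / 2)
    chi2 = chi2_quantile_wh(1 - confidence, n - 1)
    return sqrt((n - 1) * (1 + 1 / n) * z * z / chi2)

# Fill-weight example from the slide
mean, sd, n = 519.31, 3.25, 50
lsl, usl = 505.0, 535.0

k = k_two_sided(n, coverage=0.996, confidence=0.90)
lower, upper = mean - k * sd, mean + k * sd
passed = lsl <= lower and upper <= usl
print(f"k={k:.2f}, interval=({lower:.1f}, {upper:.1f}), pass={passed}")
```

The approximate k agrees with the tabulated 3.35 to two decimals for this case; for small n an exact routine is the safer choice.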
Two-Sided Normal Tolerance Limit k-Factors

         90% Confidence, % Coverage       95% Confidence, % Coverage
   n     95%    99%    99.6%  99.9%       95%    99%    99.6%  99.9%
  20     2.56   3.37   3.76   4.30        2.75   3.62   4.04   4.61
  30     2.41   3.17   3.55   4.05        2.55   3.35   3.74   4.28
  40     2.33   3.07   3.43   3.92        2.45   3.21   3.59   4.10
  50     2.28   3.00   3.35   3.83        2.38   3.13   3.49   3.99
  60     2.25   2.96   3.30   3.77        2.33   3.07   3.43   3.92
  70     2.22   2.92   3.27   3.73        2.30   3.02   3.38   3.86
  80     2.20   2.89   3.24   3.70        2.27   2.99   3.34   3.81
  90     2.19   2.87   3.21   3.67        2.25   2.96   3.31   3.78
 100     2.17   2.85   3.19   3.65        2.23   2.93   3.28   3.75
Mean ± 3.35s must be within spec limits
One-Sided Normal Tolerance Limit k-Factors

         90% Confidence, % Coverage       95% Confidence, % Coverage
   n     95%    99%    99.6%  99.9%       95%    99%    99.6%  99.9%
  20     2.21   3.05   3.46   4.01        2.40   3.30   3.73   4.32
  30     2.08   2.88   3.27   3.79        2.22   3.06   3.47   4.02
  40     2.01   2.79   3.17   3.68        2.13   2.94   3.33   3.87
  50     1.97   2.73   3.11   3.60        2.07   2.86   3.25   3.77
  60     1.93   2.69   3.06   3.55        2.02   2.81   3.19   3.70
  70     1.91   2.66   3.02   3.51        1.99   2.77   3.14   3.64
  80     1.89   2.64   3.00   3.48        1.96   2.73   3.10   3.60
  90     1.87   2.62   2.97   3.46        1.94   2.71   3.07   3.57
 100     1.86   2.60   2.96   3.44        1.93   2.68   3.05   3.54
• Need to check for normality to use normal tolerance interval
• Process quality data is often rounded
• Or data is “granular”
• Most normality tests will interpret rounding as non-normality
• Example: n=100 from N(100, 1.5²)
  Unrounded    Rounded
  100.071      100
   98.238       98
   99.122       99
   99.190       99
  100.623      101
  …, n=100     …, n=100
[Figure: Normal Probability Plot of Unrounded. Mean 99.78, StDev 1.502, N 100, AD 0.379, P-Value 0.400.]
Unrounded data: normality not rejected by Anderson-Darling test
[Figure: Histogram, n=100, N(100, 1.5^2) Rounded to 0 Decimals. Mean 99.84, StDev 1.536, N 100.]
• Rounding data causes most normality tests to fail
• SAS 9.2 Proc Univariate Tests:

Unrounded Data
Test                  --Statistic---     -----p Value------
Shapiro-Wilk          W      0.98874     Pr < W       0.5643
Kolmogorov-Smirnov    D      0.063184    Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.06213     Pr > W-Sq   >0.2500
Anderson-Darling      A-Sq   0.378299    Pr > A-Sq   >0.2500
OK

Rounded Data (to whole numbers)
Test                  --Statistic---     -----p Value------
Shapiro-Wilk          W      0.956808    Pr < W       0.0024
Kolmogorov-Smirnov    D      0.148507    Pr > D      <0.0100
Cramer-von Mises      W-Sq   0.358655    Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq   1.926991    Pr > A-Sq   <0.0050
Reject normality
• Two normality tests not substantially affected by granularity
• Ryan-Joiner test (Minitab 16)
• Omnibus skewness/kurtosis test
[Figure: Probability Plot of Rounded and Ryan-Joiner Test. Mean 99.84, StDev 1.536, N 100, RJ 0.999, P-Value >0.100.]
For more information, see Seier, E. “Comparison of Tests for Univariate Normality.”
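The effect of rounding on a skewness/kurtosis-based test can be explored with a short simulation. The sketch below uses the Jarque-Bera statistic as a simple stand-in for an omnibus skewness/kurtosis test; that substitution, the seed, and the function name are assumptions made here, and the slide does not say which omnibus test was meant. Jarque-Bera p-values rely on a large-sample chi-square approximation, so treat them as indicative only at n=100.

```python
import math
import random

def skew_kurt_test(xs):
    """Jarque-Bera statistic and p-value from sample skewness and excess kurtosis."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    ex_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + ex_kurt ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square survival function with 2 df
    return skew, ex_kurt, jb, p_value

rng = random.Random(2012)  # arbitrary seed, for repeatability only
unrounded = [rng.gauss(100.0, 1.5) for _ in range(100)]
rounded = [float(round(x)) for x in unrounded]

res_unrounded = skew_kurt_test(unrounded)
res_rounded = skew_kurt_test(rounded)
for label, (skew, kurt, jb, p) in (("unrounded", res_unrounded), ("rounded", res_rounded)):
    print(f"{label}: skew={skew:.3f}, excess kurtosis={kurt:.3f}, JB={jb:.3f}, p={p:.3f}")
```

Rounding changes the moments only slightly, which is why moment-based tests tend to be less sensitive to granularity than tests built on the empirical distribution function.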
• Usual 2-sided normal tolerance interval controls both tails
• This can present a problem for an uncentered process
[Figure: 2-Sided Normal Tol Int for 99% Coverage Will Fail. Both tails controlled to 0.5%, half of the non-coverage. Off-center distribution between LSL and USL, with 0.6% labeled in one tail; axis 94 to 105.]
Example: Removal Torque, Spec = 5.0 – 10.0 in-lbs
95% conf / 99% coverage tolerance interval: (4.85, 8.62) FAILS
[Figure: Torque: 95% Confidence / 99% Coverage Tolerance Interval, plotted against spec limits 5 and 10. Statistics: N 30, Mean 6.738, StDev 0.561. Normal tolerance interval: Lower 4.852, Upper 8.623. Normality Test: AD 0.281, P-Value 0.617. Normal Probability Plot.]
• Can use estimation for proportion conforming
• Also called bilateral conformance proportion
• Reduce probability of failing for uncentered processes
• Similar method used by ANSI Z1.9 for routine production sampling
Let Y be the quality characteristic with specification [A, B].
The bilateral conformance proportion is Pr(A ≤ Y ≤ B).
Pass if the upper C.I. for the nonconforming proportion, 1 − Pr(A ≤ Y ≤ B), is below the acceptance value (usually the AQL).
Estimation for Conformance Proportion for Removal Torque:
95% confidence ≥ 99.07% conforming: PASS
Torque: No Transformation (Normal Distribution)
Sample Size = 30
Average = 6.737513
Standard Deviation = 0.561204
Skewness = 0.47
Excess Kurtosis = 0.72
Test of Fit: p-value = 0.2808
(SK All) Decision = Pass
(SK Spec) Decision = Pass
LSL = 5    USL = 10
Pp = 1.48
Ppk = 1.03
Est. % In Spec. = 99.901940%
With 95% confidence more than 99% of the values are between 4.8576399 and 8.6173854
With 95% confidence more than 99.0666% of the values are in spec.
Lee, H., and Liao, C. “Estimation for Conformance Proportions in a Normal Variance Components Model.” Journal of Quality Technology, Jan. 2012.
Taylor, W. Distribution Analyzer, version 1.2.
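The point estimates in the removal-torque output can be reproduced from the reported mean and standard deviation under a normal model. This is only a sketch of the point estimates: the 95% lower confidence bound on the conformance proportion (99.07%) needs the method in the cited paper and is not reproduced here. Variable names are illustrative.

```python
from statistics import NormalDist

# Removal torque summary statistics from the output above
mean, sd = 6.737513, 0.561204
lsl, usl = 5.0, 10.0

dist = NormalDist(mean, sd)
pct_in_spec = dist.cdf(usl) - dist.cdf(lsl)      # estimated Pr(LSL <= Y <= USL)
pp = (usl - lsl) / (6 * sd)                      # overall spread vs. spec width
ppk = min(usl - mean, mean - lsl) / (3 * sd)     # distance to the nearer limit

print(f"Est. % in spec = {pct_in_spec:.4%}, Pp = {pp:.2f}, Ppk = {ppk:.2f}")
```

The estimate is dominated by the lower tail, since the mean sits about 3.1 standard deviations above the LSL and about 5.8 below the USL.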
• Overall process tolerance limits may be constructed to take between-lot variation into account
• Example: Impurities
           n    Mean    StDev    Min     Max
  Lot 1    20   0.047   0.0113   0.018   0.069
  Lot 2    20   0.054   0.0055   0.046   0.065
  Lot 3    20   0.050   0.0087   0.035   0.068
• Approx. 90% confidence / 95% coverage tolerance interval¹, based on a noncentral t factor with k−1 degrees of freedom (k = number of lots): 0.12
  Appears conservative
• Usual 90/95% tolerance interval for all data combined: 0.07
1 Krishnamoorthy, K. and Mathew, T. “One-Sided Tolerance Limits in Balanced and Unbalanced One-Way
Random Models Based on Generalized Confidence Intervals.” Technometrics, Vol. 46, No. 1, Feb. 2004.
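The "all data combined" figure and the size of the between-lot component can be checked from the lot summaries alone. A minimal sketch, assuming the combined-data limit is a one-sided upper tolerance limit and taking k = 1.93 from the one-sided k-factor table in this deck (n=60, 90% confidence, 95% coverage). The variance components use the usual balanced one-way ANOVA estimators; the Krishnamoorthy and Mathew limit itself is not reproduced.

```python
from math import sqrt

# (n, mean, sd) per lot, from the impurities table
lots = [(20, 0.047, 0.0113), (20, 0.054, 0.0055), (20, 0.050, 0.0087)]

N = sum(n for n, _, _ in lots)
num_lots = len(lots)
grand_mean = sum(n * m for n, m, _ in lots) / N
ss_within = sum((n - 1) * s ** 2 for n, _, s in lots)
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in lots)

# Ignoring lot structure: sd of all 60 results combined
sd_all = sqrt((ss_within + ss_between) / (N - 1))
k = 1.93  # one-sided k-factor, n=60, 90% confidence / 95% coverage (table value)
upper_limit = grand_mean + k * sd_all

# Balanced one-way ANOVA estimates of the variance components
ms_within = ss_within / (N - num_lots)
ms_between = ss_between / (num_lots - 1)
sigma_w = sqrt(ms_within)
sigma_b = sqrt(max(0.0, (ms_between - ms_within) / lots[0][0]))

print(f"combined-data upper limit = {upper_limit:.3f}")
print(f"sigma_w = {sigma_w:.4f}, sigma_b = {sigma_b:.4f}")
```

With only three lots the between-lot component rests on two degrees of freedom, which is why a limit that accounts for it comes out much wider than the combined-data limit.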
• AQL = “Acceptance Quality Limit”
• The quality level that would usually (95% of the time) be accepted by the sampling plan
• RQL = “Rejection Quality Limit”
• The quality level that will usually (90% of the time) be rejected by the sampling plan
• Also called LTPD (Lot Tolerance Percent Defective)
• Also called LQ (Limiting Quality)
AQL : Pr(accept)=0.95
RQL : Pr(accept)=0.10
[Figure: AQL and RQL (LTPD) for n=50, a=1. Operating characteristic curve; x-axis Percent Defective (0 to 10), y-axis 0.0 to 1.0. AQL = 0.72%, RQL = 7.56%.]
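The AQL and RQL of any single sampling plan can be found by solving the binomial acceptance probability for the nonconforming rate that gives Pr(accept) = 0.95 and 0.10. A minimal sketch for the n=50, a=1 plan, using bisection; function names are illustrative.

```python
from math import comb

def p_accept(p: float, n: int, a: int) -> float:
    """Binomial probability of acceptance: at most `a` nonconforming in n."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(a + 1))

def quality_level(target: float, n: int, a: int) -> float:
    """Nonconforming rate at which Pr(accept) equals `target` (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if p_accept(mid, n, a) > target:
            lo = mid  # acceptance still too likely: move toward worse quality
        else:
            hi = mid
    return (lo + hi) / 2

aql = quality_level(0.95, n=50, a=1)
rql = quality_level(0.10, n=50, a=1)
print(f"AQL = {aql:.2%}, RQL = {rql:.2%}")
```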
Can be cast as a hypothesis test or confidence interval
For routine acceptance sampling…
H0: p ≤ Assigned AQL
H1: p > Assigned AQL
α=0.05, “accept” lot if H0 not rejected
But for validation…
H0: p > Assigned AQL
H1: p ≤ Assigned AQL or desired performance level
α = 1 − confidence; i.e., 1 − 0.90 = 0.10 for 90% confidence
Pass validation if H0 rejected
Product Attribute     Typical Assigned AQLs
Critical              0.04%, 0.065%, 0.1%, 0.15%, 0.25%
Major Functional      0.25%, 0.4%, 0.65%, 1.0%
Minor Functional      0.65%, 1.0%, 1.5%
Cosmetic Visual       1.5%, 2.5%, 4%, 6.5%
• For validation, suggest 90% confidence that process ≤ assigned AQL
• Why 90%?
• Traditional probability used for RQL/LTPD/LQ
• This means for validation the assigned AQL is treated as an RQL
• If nonconforming rate is at AQL, will fail validation 90% of the time
• Selection of the AQL more important than confidence selected
• Much tighter than ANSI Z1.4/Z1.9 tightened (10-20% confidence)
               90% Confidence      95% Confidence
Conforming       n       a           n       a
99.9%          2300      0         3000      0
               3890      1         4745      1
               5320      2         6295      2
99.6%           575      0          750      0
                970      1         1185      1
               1330      2         1575      2
99.0%           230      0          300      0
                390      1          475      1
                530      2          630      2
97.5%            90      0          120      0
                155      1          190      1
                210      2          250      2
95.0%            45      0           29      0
                 77      1           45      1
                105      2           60      2
Attributes data is binomial pass/fail data
Example: n=230, a=0 provides
90% confidence ≥ 99% conforming;
90% confidence ≤1% nonconforming
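Sample sizes of this kind come from the binomial distribution: for a given acceptance number a, find the smallest n for which the chance of seeing a or fewer nonconforming units is at most 1 − confidence when the true rate sits exactly at the limit. A minimal sketch with illustrative function names; exact minimum sizes can come out slightly below published table entries when those are rounded to convenient numbers.

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """Pr(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(x + 1))

def min_sample_size(conforming: float, confidence: float, accept: int) -> int:
    """Smallest n such that passing with <= `accept` failures demonstrates
    `conforming` with the stated confidence."""
    p = 1.0 - conforming
    n = accept + 1
    while binom_cdf(accept, n, p) > 1.0 - confidence:
        n += 1
    return n

for a in (0, 1, 2):
    n = min_sample_size(conforming=0.99, confidence=0.90, accept=a)
    print(f"90% confidence / 99% conforming, a={a}: n >= {n}")
```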
• Production assigned AQL is 1.0%
• AQL = “Acceptance Quality Limit”
• Assigned based on risk assessment
• If process is better than AQL, almost all mfg lots will be accepted
• Validation: Show with 90% confidence that the process produces ≤1.0% nonconforming units
• Multi-head filler; we know data are non-normal
• 90% confidence ≥99% are in spec
• Medical devices: 90% confidence for 99% “reliability”
• Assures that future AQL production sampling can be passed
• If process is at the AQL, ~95% of lots will pass AQL sampling
• Attributes plans: 90% confidence ≤1.0% nonconforming
Sampling Plan                                     RQL (0.10)
n=230, acc=0, rej=1                               1.0%
n=390, acc=1, rej=2                               1.0%
Z1.4 normal: n=80, acc=2
Z1.4 tightened: n=80, acc=1
n1=250, a1=0, r1=2; n2=250, a2=1, r2=2            1.0%
• If the validation sampling plan passes…
• We have 90% confidence the nonconforming rate is ≤1.0%
• ANSI Z1.4 plans provide far less than 90% confidence
• Normal sampling: typically about 5% confidence
• Tightened: typically about 15% confidence
• Note: RQL=“Rejection Quality Limit”
• Also called LTPD (Lot Tolerance Pct Defective) or LQ (Limiting Quality)
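The confidence a plan provides at a 1.0% nonconforming rate is one minus its probability of acceptance at that rate. A minimal sketch comparing the two single validation plans with the two Z1.4 plans listed above (n=80 with acc=2 and acc=1); the dictionary labels are illustrative.

```python
from math import comb

def p_accept(p: float, n: int, a: int) -> float:
    """Binomial probability of acceptance: at most `a` nonconforming in n."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(a + 1))

rate = 0.01  # nonconforming rate the validation must rule out
plans = {
    "validation n=230, acc=0": (230, 0),
    "validation n=390, acc=1": (390, 1),
    "Z1.4 normal n=80, acc=2": (80, 2),
    "Z1.4 tightened n=80, acc=1": (80, 1),
}
confidence = {name: 1.0 - p_accept(rate, n, a) for name, (n, a) in plans.items()}
for name, conf in confidence.items():
    print(f"{name}: {conf:.1%} confidence that nonconforming rate <= 1.0%")
```

For these two Z1.4 plans the result is in line with the "about 5%" and "about 15%" figures quoted above; other code letters and AQLs give somewhat different values.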
Attribute type and comment

AQL attributes (fill volume; tablet defects; extraneous matter, etc.)
  ≥90% confidence that
  • Nonconformance rate ≤ assigned AQL

Non-AQL attributes (Dissolution / UDU / Batch Assay; other tests)
  ≥90% confidence that…
  • USP test will be met ≥95% of the time
  • ≥99% of results in spec (critical)
  • ≥95% of results in spec (non-critical)

Statistical Parameters (Mean / sigma / RSD(CV); Cpk, Ppk)
  ≥90% confidence that…
  • Mean / sigma / RSD in spec
  • Ppk ≥1.0, 1.33 or related to % coverage

No within batch variation expected (pH of a solution; label copy text)
  Statistics not usually necessary
  • May consider 3X-10X testing
  • Assess between lot variation
• Show confidence interval for parameter in spec
• Example: API mean potency; spec of 98.0-102.0
• n=30 test results (3 from each of 10 drums)
• 95% C.I. for mean is traditional
• 95% C.I. for mean is 100.26 – 100.54; pass.
• Also need to analyze data across drums!
[Figure: Summary for API, histogram with 95% confidence intervals for mean and median. Anderson-Darling Normality Test: A-Squared 0.70, P-Value 0.059. Mean 100.40, StDev 0.37, Variance 0.14, Skewness 0.440565, Kurtosis -0.997480, N 30. Minimum 99.84, 1st Quartile 100.08, Median 100.29, 3rd Quartile 100.75, Maximum 101.13. 95% Confidence Interval for Mean: 100.26 to 100.54; for Median: 100.14 to 100.62; for StDev: 0.30 to 0.50.]
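The confidence interval for the API mean potency follows from the usual t-interval. A minimal sketch using the summary statistics reported for the example (mean 100.40, StDev 0.37, n=30); the t critical value is hard-coded from a standard t table because the Python standard library has no t quantile function. It treats the 30 results as a simple random sample, which is why drum-to-drum differences still need a separate look.

```python
from math import sqrt

mean, sd, n = 100.40, 0.37, 30
lsl, usl = 98.0, 102.0

t_975_29 = 2.045  # two-sided 95% critical value, 29 df (standard t table)
half_width = t_975_29 * sd / sqrt(n)
ci = (mean - half_width, mean + half_width)
in_spec = lsl <= ci[0] and ci[1] <= usl

print(f"95% C.I. for mean: {ci[0]:.2f} to {ci[1]:.2f}; pass={in_spec}")
```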
Measures process capability of meeting the specifications
$$P_{pk} = \min\left(\frac{USL - \bar{x}}{3\sigma_{LT}},\ \frac{\bar{x} - LSL}{3\sigma_{LT}}\right)$$
$\sigma_{LT}$ is the long-term sd (usual formula; includes variation over time);
Cpk uses a short-term estimate of sd
[Figure: Example distributions on an axis from 95 to 105 with LSL and USL marked, labeled Ppk=2.0, Ppk=1.0, Ppk=1.0, Ppk=1.33, Ppk=0.9.]
Ppk     Percent Nonconforming
0.6     7.2%
0.7     3.6%
0.8     1.6%
0.9     0.69%
1.0     0.27%
1.1     0.10%
1.2     0.03%
1.3     0.01%
1.5     0.0007%
2.0     0.0000002%
Assumptions:
• In statistical control
• Normally distributed
• Centered in spec
• If 1-sided: half % shown
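Under the stated assumptions (normal, in control, centered in a two-sided spec), the percent nonconforming is twice the normal tail area beyond 3·Ppk standard deviations. A minimal sketch that regenerates the table; the function name is illustrative.

```python
from statistics import NormalDist

def pct_nonconforming(ppk: float) -> float:
    """Percent outside a two-sided spec for a centered, normal, stable process."""
    return 2 * NormalDist().cdf(-3 * ppk) * 100

for ppk in (0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 2.0):
    print(f"Ppk={ppk:.1f}: {pct_nonconforming(ppk):.7f}% nonconforming")
```

For a one-sided specification, drop the factor of 2.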
• Provide 90% confidence process Ppk≥1.3
• n=15 assay tests were obtained across each of 3 PPQ lots
• No significant difference in mean/variance in the 3 lots; pool data?
[Figure: Process Capability of Assay Result (using 90.0% confidence). Process Data: LSL 95, Target *, USL 105, Sample Mean 101.298, Sample N 45, StDev (Within) 0.637147, StDev (Overall) 0.760421. Observed, Exp. Within, and Exp. Overall Performance: % < LSL 0.00, % > USL 0.00, % Total 0.00. Potential (Within) Capability: Cp 2.62 (Lower CL 2.24), CPL 3.30, CPU 1.94, Cpk 1.94 (Lower CL 1.66). Overall Capability: Pp 2.19 (Lower CL 1.88), PPL 2.76, PPU 1.62, Ppk 1.62 (Lower CL 1.39), Cpm *.]
• Pass: lower 90% confidence limit for Ppk is 1.39, which meets the Ppk ≥ 1.3 requirement
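The lower confidence limit for Ppk can be approximated with a widely used normal-approximation formula, LCL = Ppk − z·sqrt(1/(9n) + Ppk²/(2(n−1))). The sketch below applies it to the assay example; it is offered as a check under the same assumptions as the software output (pooled data, normal, stable process) and is not claimed to be the exact routine the software uses.

```python
from math import sqrt
from statistics import NormalDist

# Assay example: pooled data from 3 PPQ lots
mean, sd_overall, n = 101.298, 0.760421, 45
lsl, usl = 95.0, 105.0
required_ppk = 1.3

ppk = min(usl - mean, mean - lsl) / (3 * sd_overall)
z = NormalDist().inv_cdf(0.90)  # one-sided 90% confidence
lcl = ppk - z * sqrt(1 / (9 * n) + ppk ** 2 / (2 * (n - 1)))

print(f"Ppk = {ppk:.2f}, lower 90% limit = {lcl:.2f}, pass = {lcl >= required_ppk}")
```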
[Figure: Batch 1 to Batch 5 and total process variation, axis 95 to 105.] Process in Classical Statistical Control: Common Cause Variation Only
[Figure: Batch 1 to Batch 5 and total process variation, axis 95 to 105.] Variance Components Model: Intra = within batch, σ_w; Inter = between batch, σ_b
[Figure: Batch 1 to Batch 5 and total process over time (shown as "?"), axis 95 to 105.] Process not in Statistical Control: Special Cause Variation
Ppk if Process is Not in Statistical Control
• Use of Ppk is controversial if process not in statistical control¹
• “Ppk has no meaningful interpretation”
• “statistical properties are not determinable”
• “a waste of engineering and management effort”
• Note: between-batch variation means not in classic statistical control
• If variance components model holds, estimate Ppk with σ_w and σ_b
• Usual confidence intervals from standards/software not applicable
• Confidence interval must take degrees of freedom for σ_b into account
• Difficulty in proving variance components model assumptions with small number of lots
¹ Montgomery, Introduction to Statistical Quality Control, 6th edition, p. 363
Potential problems with Ppk over multiple lots
• Usual Ppk confidence interval assumes normal distribution and process stable / in statistical control
• Any changes/trends within or between lots invalidates assumption
• Often differences in mean between batches
• Usual Ppk C.I. does not consider variance components
• Example: 30 samples from each of 5 lots
• (30-1)x5 = 145 degrees of freedom for within lot variation
• (5-1) = 4 degrees of freedom for between lot variation
• ASTM reference E2281 does not address this
• Alternative: Show Ppk for each lot meets requirement; overall confidence depends on the number of lots:
• 3 lots: ~87% overall confidence median process Ppk meets spec
• 4 lots: ~94% confidence
• 5 lots: ~97% confidence
Or calculate a modified Ppk based on variance components C.I.
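The lot-by-lot confidence figures above (about 87%, 94%, and 97% for 3, 4, and 5 lots) are consistent with a simple sign-test argument: if the median lot Ppk did not meet the requirement, each independent lot would have at most a 50% chance of meeting it, so all k lots meeting it would happen with probability at most 0.5ᵏ. A minimal sketch of that reading, which is an interpretation offered here and not stated in the source:

```python
def confidence_median_meets(lots_passing: int) -> float:
    """Confidence that the median lot Ppk meets the requirement when every
    one of `lots_passing` independent lots individually meets it."""
    return 1.0 - 0.5 ** lots_passing

for lots in (3, 4, 5):
    print(f"{lots} lots: {confidence_median_meets(lots):.1%}")
```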
Example: Ppk if process not in statistical control
[Figure: Process Capability for Validation. Xbar Chart (5 samples): X̿=100.037, UCL=100.342, LCL=99.733. S Chart: S̄=0.5509, UCL=0.7689, LCL=0.3330. Last 5 Subgroups. Capability Histogram with specifications LSL 95, USL 105. Normal Prob Plot: AD 0.512, P 0.192. Capability Plot: Within StDev 0.5557, Cp 3.00, Cpk 2.98, PPM 0.00; Overall StDev 0.9321, Pp 1.79, Ppk 1.77, Cpm *, PPM 0.08.]
Ppk=1.77; lower 95% C.I. for Ppk using Minitab is 1.60.
But should PPQ pass? Scientific understanding of trend?
Plot your data!
• Example: Uniformity of Dosage Units (Content Uniformity)
• Requirement: Pass USP <905> Uniformity of Dosage Units
• ≥90% confidence USP test would be passed ≥95% of the time (coverage)
• See Bergum¹ for specifics to determine acceptance criteria
• Why 90% confidence? Comparable to RQL probability.
• Why 95% coverage? Comparable to AQL probability for single test.
• Bayesian approach also available²
¹ Bergum, J. and Li, H. “Acceptance Limits for the New ICH USP 29 Content-Uniformity Test.” Pharmaceutical Technology, Oct 2, 2007.
² Leblond, D., and Mockus, L. “Posterior Probability of Passing a Compendial Test.” Presented at Bayes-Pharma 2012, Aachen, Germany.