Impact of Stability on Setting and Meeting Specifications in an Uncertain World
William R. Porter
MBSW 2012

Setting and Meeting Specifications
Statistical approaches to setting specifications and devising methods to meet them have been developed in many different contexts, including widget manufacturing, biologics manufacturing, classical stability trial design and measurement uncertainty.

Impact of Stability on Specs MBSW May 2012 Copyright 2012 W. R. Porter

Widget Manufacturing
Classical industrial quality control was developed for the durable goods industries, especially for the manufacture of machined items.
Parts had to mate together without customization.
The primary source of product variability was the manufacturing process itself.
Measurement tools were highly precise and accurate…
• …as shown by Gage R&R (repeatability & reproducibility) studies.

Biologics Manufacturing
The three most important things required for developing a biologics product have been: analytical methods, analytical methods and analytical methods.
• (i.e., location, location and location, just as in real estate investment, but scale [dispersion] is even more important.)
• The major source of variability in product performance was traceable to the methods used to monitor quality, particularly bioassays.

Classical Stability Trial Design
Stability trial design was formalized in ICH guidelines Q1A, Q1B, Q1C, Q1D and especially Q1E.
ICH Q1E Appendix B suggests designs and methods for data analysis and interpretation.
• Focus is on studies with only a few batches using ANCOVA.
• Methods to evaluate batch variation are unconvincing and established by fiat.

Measurement Uncertainty
All measurements are uncertain; there are none that are not uncertain.
All measurements are wrong, but some are useful.
• (with apologies to G. E. P. Box)
Only a quantitative estimate of uncertainty distinguishes useful data from worthless garbage.
• In Bayesian terms, numbers without informative prior distributions are worthless. Useful measurements have informative priors.
Beginning in the 1990s, mainly in Europe, efforts to formalize evaluation of measurement uncertainty were undertaken.
• GUM: Guide to the expression of uncertainty in measurement (1995, 2008).
• EURACHEM/CITAC Guide: Use of uncertainty information in compliance assessment (2007).

Measurement Uncertainty (2)
The pharmaceutical industry has been slow to incorporate accepted international practices for quality control of chemical measurements.
Producers must set tighter specifications than required by customers to allow for measurement uncertainty.
Formal uncertainty studies as part of method validation need improvement.
This should not be a problem, as methods for assessing measurement uncertainty are now well established; we just have to get on with it.

Specification Uncertainty
Many specifications are digital; that is, they are expressed as decimal numbers with a specified number of significant digits.
Digital specifications also imply the expected maximum uncertainty in measurements used to confirm compliance with specifications.
Frequently, the number of significant digits in written specifications is TOO SMALL to meet actual quality expectations.
• ICH impurity guidelines are at least one order of magnitude too imprecise.
• Round-off errors result.

Customer Expectations
We know what patients expect. What P(failure) is acceptable?
Regulators expect that any sample, selected for the convenience of the regulators from normal supply channels, will meet specifications for purity and potency at all times within the stated shelf life of the drug product for all batches.
The testing can be performed by any qualified laboratory using any qualified equipment and reagents by any trained personnel following validated SOPs.

Sampling
What distinguishes the approaches used for widgets, biologics and classical stability trials is how they:
Address the extent to which measurement uncertainty contributes to overall variation, and
Address within-batch and between-batch sampling as sources of variation.
• Sampling has been described as “The Mother Lode” of all errors.
McConnell J, Nunnally BK, McGarvey B. Sampling: The ‘Mother Lode’ of All Errors. J. Validation Technol. 18(1); 45-49 (2012).

Convenience Sampling
Since any sample from any batch (not just ‘random’ samples) must meet specifications during the entire shelf life, we need to know:
What is the variability of samples selected within batches?
What is the variability of samples selected between batches?
How confident are we of our estimates for these components of variance?
How does measurement uncertainty affect our estimates?

Batch Variation in Widgets
Traditional approaches to industrial quality control, as promoted by Walter A. Shewhart, W. Edwards Deming, Joseph M. Juran and their followers, rely on control charts.
Typically, a minimum of ~30 batches is thought to be needed to demonstrate that a process is in statistical control, but Donald J. Wheeler suggests that as few as 10 batches may be adequate.
• Measurement uncertainty is small compared to batch variation.
• QbD initiatives are based largely on methods devised for controlling the quality of widgets.
Formal random sampling plans are well-defined.
Experimental (non-continuous) data can be handled using the ANOM (Analysis of Means) graphical method.

Example Widget Batch Data

Batch 1   Batch 2   Batch 3   Batch 4   Batch 5
250       310       250       340       250
260       330       230       270       240
230       280       220       300       270
270       360       260       320       290

1-WAY ANOVA
Source of Variation   SS      df   MS       F      P-value   F crit
Between Batches       19830    4   4957.5   7.89   0.0012    3.06
Within Batches         9425   15    628.3
Total                 29255   19

Wheeler DJ. Advanced Topics in Statistical Process Control. Knoxville TN: SPC Press. p. 374 (1995).
The example in the reference is analyzed graphically using Analysis of Means.
Batches 2 & 4 were detectably higher than the Grand Average.
Batch 3 was detectably lower than the Grand Average.
Note that ANOM control limits are tighter than Average & Range chart limits for continuous processes.

Now That’s Odd…
The data represent grams of coating per sample aliquot.
Note that each value ends in ‘0’.
The laboratory weighed the recovered coating using a balance readable only to the nearest 0.01 kilogram.
The smallest range within batches is 0.04 kg.
The data granularity is too coarse; within-batch variation is on the same order of magnitude as measurement uncertainty.
The measurement tool is not capable of providing sufficient accuracy and precision.

Batch Variation in Biologics
Assay variation is the dominant factor in controlling the quality of biologics, and batch variation takes a back seat.
Biologics are typically compared, batch by batch, to a certified reference standard.
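
Stepping back briefly to the widget batch data: the one-way ANOVA table and the concern about balance resolution can be checked with a few lines of arithmetic. The sketch below uses only the Python standard library. The batch values are transcribed from the table; the method-of-moments variance components and the "steps per within-batch SD" ratio are illustrative additions, not something the slides compute, and the names (`one_way_anova`, `resolution_g`) are hypothetical.

```python
# One-way ANOVA on the widget coating data (grams per aliquot).
batches = {
    "Batch 1": [250, 260, 230, 270],
    "Batch 2": [310, 330, 280, 360],
    "Batch 3": [250, 230, 220, 260],
    "Batch 4": [340, 270, 300, 320],
    "Batch 5": [250, 240, 270, 290],
}


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova(list(batches.values()))

# Assumption: balanced design, so the between-batch variance component can be
# estimated by the method of moments (truncated at zero).
n_per_batch = 4
var_within = result["ms_within"]
var_between = max(0.0, (result["ms_between"] - result["ms_within"]) / n_per_batch)

# Granularity check: how many balance steps fit into one within-batch SD?
resolution_g = 10  # 0.01 kg readability, as stated on the slide
within_sd = var_within ** 0.5
steps_per_sd = within_sd / resolution_g
smallest_range = min(max(g) - min(g) for g in batches.values())

print(result)
print(f"between-batch variance ~ {var_between:.0f}, within-batch SD ~ {within_sd:.1f} g")
print(f"balance steps per within-batch SD ~ {steps_per_sd:.1f}, smallest range = {smallest_range} g")
```

With only a handful of balance steps spanning one within-batch standard deviation, the within-batch mean square cannot be separated cleanly from rounding error, which is the point the slide makes.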
From personal experience, variation between testing laboratories participating in round-robin certification of new global standards grossly exceeds in-house assay variability.
• Batch variation is swamped by measurement uncertainty.
• ICH recognized the need for a separate guideline (Q5C).
Sampling plans are poorly defined.
• Biologics traditionally were homogeneous solutions, so sampling was not considered to be an issue.

Example Bioassay Data
Table 4. FSH, LH and HCG immunoactivity in different HMG preparations.
Immunoactivity in IU/vial (relative SD in %, n = 5).

Product                           Batch no.   FSH           LH            HCG
Pergonal                          0331206B    58.77 (2.2)   13.49 (3.6)    3.39 (1.7)
Humegon                           43905119    65.12 (1.7)    5.77 (1.0)    6.86 (1.8)
2nd WHO standard (FSH 54/LH 46)   -           77.72 (5.0)    7.39 (2.4)    7.22 (4.8)
4th WHO standard (FSH 72/LH 70)   -           86.14 (5.3)    3.82 (1.8)   10.10 (5.1)
Menopur                           32509       74.17 (1.9)    0.29 (5.2)    9.61 (2.3)
Menopur                           32307       73.44 (3.9)    0.48 (1.7)    9.05 (3.3)
Menopur                           34104       82.62 (1.3)    0.39 (3.1)   11.06 (1.8)

The relative standard deviation is expressed as a percentage and is obtained by multiplying the standard deviation (per batch) by 100 and dividing this value by the average (per batch).
Wolfenson C, Groisman J, Couto AS, Hedenfalk M, Cortvrindt RG, Smitz JE, Jespersen S. Batch-to-batch consistency of human-derived gonadotrophin preparations compared with recombinant preparations. Reprod Biomed Online. 2005 Apr;10(4):442-54.

Now That’s Odd…
Two generations of the WHO standard were tested.
The 2nd generation standard is certified to contain FSH 54 IU/ampoule and LH 46 IU/ampoule.
The 4th generation standard is certified to contain FSH 71.9 (69.0-74.9 [95% fiducial limits]) IU/ampoule and LH 70.2 (61.7-80.0 [95% fiducial limits]) IU/ampoule.
The values reported are in comparison with the standards provided by the test kit vendor.
The reported values for LH differ grossly from the certified values; the FSH values are systematically high.
The reported values are overly precise; the last two digits in the four-digit reported values are meaningless.
The ‘uncertainty’ reported is the within-day, within-analyst repeatability, and does not include inter-day, inter-analyst or, most importantly, interlaboratory uncertainty.
There is insufficient evidence that any of the batches are different.

Batch Variation in Stability Trials
Conventional experimental design (ICH Q1E) relies on a small number of batches.
The minimum is 3 batches, and who even does more than the minimum?
• Not enough batches are studied to demonstrate that the process is in a state of statistical control, using the widget-making approach to quality control.
• Recall that at least 10, and preferably 30, batches are required for control charts.
Measurement uncertainty is a substantial component of variance, even for small molecule drugs.
Sampling issues are not addressed in the guidance.

Example Stability Batch Data
[Figure: “Batch Variation”. % Potency (91 to 101) versus Time, months (0 to 24), for Batches 1 to 6, with annotations “Outlier?” and “Outlying batch?”.]
Subbarao N, Huynh-Ba K. Evaluation of Stability Data. In: Huynh-Ba K (Ed) Handbook of Stability Testing in Pharmaceutical Development. New York: Springer 266-267 (2009).

Now That’s Odd…
Batch 4 is much less stable than the other batches tested.
But there is no reason to doubt that Batch 4 is just as representative of the process as the other batches.
Could the initial potency for Batch 4 have been mis-measured?
The 9-month result for Batch 1 seems unusually low.
But, after the fact, there is no way to determine if this low result is a laboratory error (e.g., due to inadequate sample preparation). We must include it.

Batch Variation & Measurements
In widget-making, measurement uncertainty, as demonstrated by Gage R&R studies, is typically a minor component of total variance.
Batch variation is easy to detect and measure.
In biologics manufacture, measurement uncertainty, as demonstrated by interlaboratory testing, is typically the dominant component of total variance.
Batch variation is difficult to detect and measure.
Conventional small molecule stability trials occupy a middle ground.
Batch variation is handled by crude conventions.

Components of Variance
The relative importance of different sources of variation in the measured quality of drug products must be rigorously assessed in order to define realistic release limits.
Obtaining sufficient data is a challenge.
R&D (non-GMP) data and GMP data may need to be combined to increase the reliability of our assessment.

Measurement Variation
Measurement uncertainty can be affected by:
Within operators, within equipment, within labs, within days (repeatability).
Between operators, between equipment, between days (intermediate precision).
Between labs (interlaboratory precision).
Between dosage forms/strengths (sample preparation, excipient interactions).
• Interactions with other components of measurement uncertainty.

Initial Product Variation
Initial uncertainty can be affected by:
Manufacturing sites.
Manufacturing scale (equipment variation).
Dosage strength.
Packaging.
Between batches.
Homogeneity within batches.
Interactions between all of the above.

Stability (Time-Dependent) Variation
Degradation rate uncertainty can be affected by:
Overall “average” rate of degradation.
Interaction with:
• Manufacturing sites.
• Manufacturing scale (equipment variation).
• Dosage strength.
• Packaging.
• Environmental excursions (temperature, humidity).
• Between batches.
• Within batches.
• Interactions between all of the above.

Estimating Uncertainties
Not all of the factors enumerated in the previous slides will have equal weight.
We need to distinguish between “the vital few and the useful many” (Juran).
• Pareto’s principle: 80% of the variation in product quality is caused by 20% of the quality-impacting factors.
Designed experiments and observational studies can provide insight.

But Wait! There’s More!
What About Supply Chain Control?
Control of storage conditions during manufacture and distribution (e.g., maintaining cold conditions for temperature-sensitive products) has been a major recent concern.
Proper design of stress degradation experiments during product development and monitoring of conditions during distribution can address these issues.

Unaddressed Supply Chain Issues
What about mail-order pharmacies?
Regulatory expectations are that the supply chain ends when the patient takes the drug, and not before then.
Medications are dispensed with storage instructions.
The U.S.
Post Office specifically states that control of temperature is NOT provided, and that the shipper is responsible for protection against temperature extremes.
What about military deployments, where troops are issued 180-day supplies of medications?

Using Stress Experiments
In order to study uncertainty due to stability issues, we need ways to SHRINK TIME, because the issue is what will happen at the end of shelf life.
Properly designed stress degradation studies can map the temperature×humidity design space.
• In many cases, results equivalent to two years under ‘normal’ storage can be obtained in weeks.

Error Budget
Given a set of final specifications, these must be narrowed by amounts sufficient to account for:
Measurement uncertainty (guard banding).
Stability-related changes in quality metrics, especially within-batch and between-batch uncertainty at end of shelf life.
Whatever remains are the narrow limits that define the release specifications.

Fixed vs. Random Effects
We tend to do at least an adequate job of designing experiments to study the effects of ‘fixed’ factors on product quality targets.
That is, we can estimate bias [location].
We tend to do a poor job of designing experiments to study the effects of ‘random’ factors on variation of product quality.
That is, we do not do enough to estimate uncertainty or variation [scale] of measurements, sampling or degradation.

Degrees of Freedom
[Figure, left panel: 95% Confidence Limits for Mean (μ), in standard deviation units, versus sample size (0 to 50). Governs design for fixed effects (bias).]
For n = 3, C.L. is 2.2 times that for n = ∞ (120% larger).
For n = 4, C.L. is 62% larger than for n = ∞.
[Figure, right panel: 95% Confidence Limits for Standard Deviation (σ), in standard deviation units, versus sample size (0 to 50). Governs design for random effects (uncertainty).]
For n = 7, C.L. is 2.2 times that for n = ∞ (120% larger).
For n = 14, C.L. is 62% larger than for n = ∞.

Insurance
When estimates of a component of variance are based on limited data, then:
Consider meta-analysis combining data from non-GMP development studies with GMP data to increase degrees of freedom.
Inflate the estimated variance component by a factor that accounts for the uncertainty of its estimation, to compensate for low degrees of freedom.

OOS and OOT
Failure to adequately estimate measurement uncertainty and stability variation will result in many OOS (out-of-specification) and OOT (out-of-trend) results.
Either measurement uncertainty is underestimated, or
Presumed shelf life is too long.
Thus release specifications are too wide.
If a process is under control, OOS and OOT results will be rare; contrapositively, if OOS and OOT results are not rare, the process is not under control.

Root Causes for Failure
Have all sources of variation been studied enough to obtain a reliable estimate of their magnitude?
HAVE YOU IDENTIFIED THE VITAL FEW?
Are factors in experimental designs assumed to be fixed really random instead?
ARE YOU ESTIMATING LOCATION OR SCALE?
Are release specifications based on sufficient degrees of freedom for EACH factor?
DO YOU REALLY HAVE ENOUGH DATA?
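
To make the degrees-of-freedom question concrete, the sketch below computes how far the upper 95% confidence limit for a standard deviation sits above the sample estimate for a given sample size, using the usual chi-square interval for a normal sample. It uses only the standard library: the chi-square CDF is evaluated with a series for the incomplete gamma function and inverted by bisection. The function names are hypothetical. Squaring the ratio gives one possible variance inflation factor of the kind the Insurance slide mentions; that choice is an assumption, not necessarily the author's.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(p, df):
    """Invert the CDF by bisection (adequate for a sketch, not optimized)."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def sd_upper_limit_ratio(n, confidence=0.95):
    """Upper two-sided confidence limit for sigma, divided by the sample SD."""
    df = n - 1
    alpha = 1.0 - confidence
    return math.sqrt(df / chi2_quantile(alpha / 2.0, df))


for n in (3, 7, 14, 30):
    ratio = sd_upper_limit_ratio(n)
    print(f"n = {n:2d}: upper limit = {ratio:.2f} x s, variance inflation = {ratio ** 2:.2f}")
```

For n = 7 the upper limit is about 2.2 times the sample standard deviation, and for n = 14 about 1.6 times, so a three-batch study leaves a between-batch variance component very poorly determined.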
But Really, Who Cares?
In a QbD world, we should aim to produce consistent product with minimum variance.
“While conformance to specifications is important, the fundamental concept that some processes are predictable, while others are not, makes the issue of conformity to specifications an issue which cannot be addressed directly. If a process is predictable, then its conformity or nonconformity will also be predictable. If a process is unpredictable, then its conformity will be unpredictable, and anything we say about the process will amount to little more than wishes and hopes.”
• Wheeler DJ. Advanced Topics in Statistical Process Control: The Power of Shewhart’s Charts, Knoxville, TN: SPC Press, p. 187 (1995).
If we can reduce process variation, reduce product degradation variation and reduce measurement uncertainty to low enough levels through QbD, then setting release limits becomes moot.

Discussion
σ → σ → σ → σ → σ → σ
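
As a closing illustration of the error budget described earlier, the sketch below narrows a pair of shelf-life specifications into release limits. It is a minimal example under stated assumptions, not the author's method: potency is assumed only to decrease over time, the measurement and stability uncertainties are assumed independent (so they combine in quadrature), a coverage factor k = 2 is assumed, and every number is hypothetical.

```python
import math


def release_limits(lower_spec, upper_spec, expected_loss,
                   u_measurement, u_stability, k=2.0):
    """Narrow shelf-life specifications into release limits (illustrative only).

    expected_loss  expected potency loss over the shelf life (same units as specs)
    u_measurement  standard uncertainty of the assay
    u_stability    standard uncertainty of the predicted loss (batch and rate variation)
    k              coverage factor applied to the combined uncertainty (assumed)
    """
    # Lower side: allow for degradation plus combined uncertainty.
    guard_lower = k * math.sqrt(u_measurement ** 2 + u_stability ** 2)
    # Upper side: potency does not rise, so only the measurement guard band applies.
    guard_upper = k * u_measurement
    lower_release = lower_spec + expected_loss + guard_lower
    upper_release = upper_spec - guard_upper
    if lower_release >= upper_release:
        raise ValueError("error budget exceeds the specification width")
    return lower_release, upper_release


# Hypothetical example: 90-110% label claim, 3% expected loss over shelf life.
low, high = release_limits(90.0, 110.0, expected_loss=3.0,
                           u_measurement=0.8, u_stability=0.6)
print(f"release limits: {low:.1f}% to {high:.1f}%")
```

The `ValueError` branch is the interesting case: when the uncertainties consume the whole specification width, no release window exists, and the remedy is to reduce variation rather than to adjust limits.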