Session 3: Statistical and Alternative Approaches to Setting Specifications
Setting specifications with limited data is always a problem, yet we aim to assure potency from the first tox lot to the last commercial lot.
Regulators share our concerns and also seek assurance that we do not miss something important about our product due to the relatively high “noise” typical of bioassays. Likewise, we seek to avoid wasting time investigating results that are a consequence of normal bioassay variation and do not represent issues of product quality. From early to late development, product and assay knowledge is incrementally acquired, leading to a specification range that balances the risk of passing “bad” lots with the risk of failing “good” lots. We present a variety of approaches that help us manage this risk and tips for avoiding common pitfalls. We recommend a practical yet comprehensive way of mining and querying data that employs modern tools and emphasizes training.
• Extensive experience in development of novel innovative biologics
– Wide range including antibodies, hormones, cytokines, clotting factors, enzymes, fusion proteins, conjugates, aptamers, and vaccines
– 16 products currently marketed
• Diversity of products that must all be well-characterized
Any of these biologics or vaccines could have a mechanism of action (MOA) involving more than one biological interaction or function, so multiple bioassays may be needed. Testing is required at release, on stability, and for process characterization.
• Better understanding of the basis for bioassay specification ranges and minimizing the two main types of risks
• Awareness of the approaches applied for early and late bioassay specification ranges and limits and how to mine/use all the data to make better informed decisions
– How to use charts, graphs and number crunching … and apply some of the art gleaned from corporate, industry and regulatory experience (to enhance quantitative assessments)
1. What is a bioassay?
2. What is unique about bioassay specifications?
3. Balancing the Two Risks (Type 1 and Type 2 errors)
– Protection against Type 1 and examples
– Protection against Type 2 and examples
4. A comprehensive strategy that defines what is done early and late
Disclaimers:
• I am in R&D (not commercial), so emphasis is on clinical trial material
– But we do our work in a cGMP laboratory
• I am not a statistician
[Diagram labels: Concentration (mg/mL); Bioassay (cell based); “immunochemical” properties; physico-chemical/other testing]
These assays share common issues, designs and data analysis…
In some cases these terms are synonymous (e.g. for bind/block MOA)
[Slide table: example test panel]
Release (quantitative or specific criteria): Physical Properties 1, Physical Properties 2, Identity 1, Identity 2, Purity 1, Purity 2, Purity 3, Purity 4, …
Additional (semi quantitative, report results): Purity 5, Purity 6, Purity 7, Purity 8, Purity 9, …
Characterization (data display): e.g. mass spectrometry, e.g. biophysical, e.g. immunological, e.g. SPR, …
• 21 CFR 600.3(s) defines potency as “the specific ability or capacity of the product, as indicated by appropriate laboratory tests or by adequately controlled clinical data obtained through the administration of the product in the manner intended, to effect a given result”
– Does not define a “potency test”
– For percent relative potency (%RP) either an animal, cell or immunological assay may be possible, depending on MOA and availability of cell lines and reagents
“There is no single test that can adequately measure those product attributes that predict clinical efficacy. Manufacturers demonstrate clinical effectiveness by ‘substantial evidence’, i.e., evidence that the product will have the effect it purports or is represented to have under the conditions of use prescribed, recommended, or suggested in the labeling or proposed labeling thereof (section 505(d) of the FDC Act)”.
Duff and Hooper, 2011, Expert Opin. Ther. Targets, 15, p. 160
• Units are in percent relative potency (%RP) – directly vs. a reference standard
• They take a long time to develop (assay and data analysis)
– Ranges unduly influenced by method variation (vs other methods)
• They are wide (50-200%, 50-150%, 70-130%, 80-125%, 80-120%)
– This is due to (1) the necessary complexity of the assay procedure; and (2) inherent variability from use of immunological reagents and the biology of living cells
• One size does not fit all
– Bioassays cover a range of types (animal, cell based, immunoassay)
• Within each type, there are many modes (e.g. of ELISA configurations)
– Purpose of test (typically %RP, but may be toxicity)
• Unusual clinical dosing increments may be a factor
– If the degree of similarity needs more assurance
• Common usages of risk terms
– Type 1 error= “producer” risk, “false alarm”, “presumed guilt”, failing a good lot
– Type 2 error= “consumer” risk, “false security”, “presumed innocence”, passing a bad lot
• Both risks are undesirable and need our attention
– Reducing one typically increases the other
– Some actions we take have controllable aspects that reduce both risks simultaneously
• The Trade-Offs
Type 1: Product specifications too narrow
– Good batches appear inferior and require time and effort to investigate and may be failed (wasted)
– Patients in clinic don’t get drugs in time if unnecessary delays to supply chain
Type 2: Product specifications too wide
– Bad batches could be passed and compromise efficacy in clinical trials
– Patients in treatment group would not gain benefit during clinical trials if bad batches were unwittingly used
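To make the trade-off concrete, here is a minimal sketch that computes both risks for a single test at several specification widths. It assumes normally distributed results; the assay SD of 10 %RP, the "good" lot at 100 %RP and the "bad" lot at 60 %RP are hypothetical illustration values, not figures from this talk.

```python
from statistics import NormalDist

def prob_within(true_rp, lsl, usl, sd):
    """Probability that one reported result falls inside [lsl, usl]."""
    dist = NormalDist(mu=true_rp, sigma=sd)
    return dist.cdf(usl) - dist.cdf(lsl)

def risks(lsl, usl, sd=10.0, good_rp=100.0, bad_rp=60.0):
    """Return (Type 1, Type 2) risk for a single test.

    sd, good_rp and bad_rp are hypothetical illustration values.
    """
    type1 = 1.0 - prob_within(good_rp, lsl, usl, sd)  # failing a good lot
    type2 = prob_within(bad_rp, lsl, usl, sd)         # passing a bad lot
    return type1, type2

for lsl, usl in [(50, 150), (70, 130), (80, 120)]:
    t1, t2 = risks(lsl, usl)
    print(f"{lsl}-{usl}%: Type 1 = {t1:.4f}, Type 2 = {t2:.4f}")
```

Under these assumptions, narrowing the range raises the Type 1 risk while lowering the Type 2 risk, which is the behavior described above.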
• Optimize method during development
– Confirm relevant response (MOA) is being measured, dose response is appropriate, etc.
– Understand / address sources of variability
• Data Reduction similar to that recommended in USP
– Normalize signal, free 4PL fit, outlier analysis, parallelism, reduced 4PL, calculation of %RP and relative potency
• Establish unique Assay Validity Criteria in test method (and data to collect/trend)
– Avoids unnecessary investigations when unknown random errors occur and gives key information that is useful for understanding potential changes in reagents / cells
Assay validity criteria (columns on slide: Reference, Sample, Control Sample)
Curve is typical: visually similar to image in method
Root Mean Square or R2: ≤ 25 or 0.98 (listed twice)
Effective Assay Response: ≥ 0.5 signal to noise
Percent Relative Potency: 80-120%
Slope Ratio: 0.7-1.3 (or tighter based on statistical analysis, ~ +/- 3 SD)
Upper Asymptote Ratio: 0.8-1.20
Effective Asymptote Ratio: 0.8-1.20
– Sample results not calculated if criteria not achieved
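The gating rule above (no sample result unless validity criteria are met) can be sketched as follows. This is an illustration under stated assumptions, not the method's actual implementation: the 4PL parameterization, the function names, and the idea of reading %RP from the EC50 ratio of parallel (reduced 4PL) curves are assumptions; only the three ratio ranges come from the criteria listed above.

```python
def four_pl(x, lower, upper, ec50, slope):
    """Four-parameter logistic response at concentration x (x > 0)."""
    return lower + (upper - lower) / (1.0 + (x / ec50) ** (-slope))

# Ratio ranges taken from the validity criteria above.
CRITERIA = {
    "slope_ratio": (0.7, 1.3),
    "upper_asymptote_ratio": (0.8, 1.2),
    "effective_asymptote_ratio": (0.8, 1.2),
}

def failed_criteria(metrics):
    """Names of the criteria whose value falls outside its range."""
    return [name for name, (low, high) in CRITERIA.items()
            if not low <= metrics[name] <= high]

def percent_relative_potency(metrics, ec50_reference, ec50_sample):
    """%RP from the EC50 ratio, or None when any criterion fails.

    Assumes parallel curves, so the EC50 shift alone carries potency.
    """
    if failed_criteria(metrics):
        return None
    return 100.0 * ec50_reference / ec50_sample

ok = {"slope_ratio": 1.05, "upper_asymptote_ratio": 0.95,
      "effective_asymptote_ratio": 1.0}
print(percent_relative_potency(ok, ec50_reference=10.0, ec50_sample=8.0))
```

A sample that needs less material than the reference to reach the half-maximal response (lower EC50) comes out above 100 %RP.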
• For this example, a sample prepared at the lowest possible dilution is compared to a different type of reference sample (toxin, at very high dilution). Response ratios are calculated. The sample has been shown to be safe preclinically and in another relevant qualitative cytotoxicity assay, so the goal is to choose a dilution of sample where all results are below a threshold (1.0) and apply in a quality control environment to confirm more quantitatively the absence of toxicity in samples.
[Figure: two histograms with fitted normal curves, horizontal axis 0.8 to 1.8, with 2σ, 3σ and 4σ limits marked]
Qualification Data (1-2 weeks): n=16, Mean 1.423, StDev 0.1658
Stability, Lot Release Data (1-2 years): n=27, Mean 1.299, StDev 0.2205
First confirm this is not due to a change in material
• What happened? What did we learn?
– Initial spec range (≥1.0) was unrealistic and arbitrary; there was more noise in the assay (even after further optimization)
– 2σ was too tight even under the ideal qualification period; 4σ was too broad since it would not have stimulated enough investigations; 3σ was just right to “catch” assays like this, but not typically precise assays
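The 2σ, 3σ and 4σ comparison can be reproduced from the summary statistics reported for the two data sets. A minimal sketch, assuming the limits are simply mean ± kσ on the reported mean and standard deviation (the function name is hypothetical):

```python
def sigma_limits(mean, sd, k):
    """Lower and upper limits at mean -/+ k standard deviations."""
    return mean - k * sd, mean + k * sd

# Summary statistics as reported for the two histograms.
datasets = {
    "qualification (n=16)": (1.423, 0.1658),
    "stability and lot release (n=27)": (1.299, 0.2205),
}

for label, (mean, sd) in datasets.items():
    for k in (2, 3, 4):
        low, high = sigma_limits(mean, sd, k)
        print(f"{label}: mean +/- {k} sigma = {low:.3f} to {high:.3f}")
```

Comparing the lower limits from the short qualification period with those from the longer data set shows how much the apparent range widens once routine variability is included.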
• Understand method early in development and calculate “method” capability index (MCI) approach
– Method is a process itself and typically the primary source of variation in test results
• Collect wide array of assay performance data as early as possible
• Apply method validity requirements as soon as possible
– Assure that platform range can be achieved
• Platform= 50-150%
• Method must be tighter or at least capable of this range using MCI approach
• Study “mock potency” during qualification beyond 100% RP level
• Use assay (and process /stability) data collected throughout clinical development in order to establish acceptable ranges for late stage / ICH method validation
– By this time method vs process variability will be known
– Continue to refine assay validity requirements so that appropriate tools are available for commercial production
– Base specification range on combination of assay + process + stability data
• strive for ~80-120%
• Relationship of standard error / failure rates to Method Capability Index (MCI) and Justification of 1.0 Target Value
• A Chart Showing how MCI Translates to needed method Precision and Accuracy
• Design of Experiments to minimize bias to achieve necessary precision/accuracy
– ELISA case study given
• Control Charting shows results over time
– Well behaved ELISA case study
[Figure: distributions of assay results relative to the lower and upper specification limits (LSL, USL) at three MCI values]
MCI=0.67 → <4.5% assay failure rate
MCI=1.0 → <0.3% assay failure rate
MCI=1.33 → <0.01% assay failure rate
Rates based on the assumption that the normal distribution is centered at target (no significant assay bias)
Kun Zhang, Informa Life Sciences’ 9th Annual Biological Assays Conference, London UK, Nov 3-4, 2010
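These failure rates can be checked with a short calculation. The sketch below assumes MCI is defined like a process capability index, MCI = (USL − LSL) / (6σ), with results normally distributed and centered at target; that definition is an assumption made here for illustration, although it does reproduce the three rates quoted above.

```python
from math import erfc, sqrt

def failure_rate(mci):
    """Two-sided probability that a result falls outside the limits.

    Assumes MCI = (USL - LSL) / (6 * sigma) and a centered normal
    distribution, so each limit sits 3 * MCI standard deviations away.
    """
    z = 3.0 * mci
    return erfc(z / sqrt(2.0))  # equals 2 * (1 - Phi(z))

for mci in (0.67, 1.0, 1.33):
    print(f"MCI = {mci}: failure rate = {100 * failure_rate(mci):.4f}%")
```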
[Figure: chart with %CV (precision) on the horizontal axis, both axes running 0% to 18%, with curves for specifications of 50%-150%, 70%-130%, 75%-125% and 80-120%]
Method is designed and developed to have an MCI value of 1.0. This corresponds to >99% chance that any assay result will be within specification limits.
• Shows the range of specifications possible based on precision and accuracy
This evaluation should be done at key milestones during product development to assure specs are supported... fuller understanding of factors will take time and road testing
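A rough version of this evaluation can be scripted. The sketch below assumes MCI = (USL − LSL) / (6σ) for an unbiased method and, when bias is present, uses the distance from the biased mean to the nearer limit (a Cpk-style adjustment). Both definitions, and treating the SD in %RP units as the %CV for results near 100 %RP, are assumptions for illustration.

```python
def max_cv(lsl, usl, bias=0.0, mci=1.0, target=100.0):
    """Largest assay SD (in %RP, roughly %CV near 100 %RP) that still
    achieves the requested MCI for a given specification and bias."""
    center = target + bias
    nearest_limit_distance = min(usl - center, center - lsl)
    return nearest_limit_distance / (3.0 * mci)

for lsl, usl in [(50, 150), (70, 130), (75, 125), (80, 120)]:
    unbiased = max_cv(lsl, usl)
    biased = max_cv(lsl, usl, bias=5.0)
    print(f"{lsl}-{usl}%: max CV = {unbiased:.1f}% (no bias), "
          f"{biased:.1f}% (5% bias)")
```

The design choice here is to solve for precision given a range, which supports asking at each milestone whether the current method can carry the proposed specification.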
• Each point an independently derived assay result (replicates of individual dilution series)
• Can be used to balance effort (throughput, FTEs) vs benefit (precision)
Orange = mean +/- 3σ, blue = mean +/- 2σ
Grand Mean = 100.4%, n=278, several years of data
Life of method shows typical pattern
One method developer very precise in one lab with well controlled reagents and instrumentation
During validation / transfer the “real” precision is realized
Reagent change causes temporary problems
Root cause of change identified, resolved, back to “real” precision
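A control chart of this kind needs only limits from historical results and a rule for flagging new ones. A minimal sketch, assuming limits are set from a baseline period and using hypothetical %RP values (not data from the chart):

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Mean with +/- 2 sigma (warning) and +/- 3 sigma (action) limits."""
    m, s = mean(baseline), stdev(baseline)
    return {"mean": m,
            "warning": (m - 2 * s, m + 2 * s),
            "action": (m - 3 * s, m + 3 * s)}

def classify(result, limits):
    """Label a new result against the warning and action limits."""
    low3, high3 = limits["action"]
    low2, high2 = limits["warning"]
    if not low3 <= result <= high3:
        return "action"
    if not low2 <= result <= high2:
        return "warning"
    return "ok"

baseline = [98.0, 102.0, 100.0, 99.0, 101.0]  # hypothetical %RP results
limits = control_limits(baseline)
for new_result in (100.0, 104.0, 106.0):
    print(new_result, classify(new_result, limits))
```

In practice the baseline would be recalculated only after a deliberate review, so that a drifting assay does not quietly widen its own limits.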
[Figure: curves for 1 test and 4 tests plotted against test sample true potency (%RP, 0% to 100%), vertical axis 0.00 to 1.00, for a lower specification limit of 50% where assay CV=16%; annotation: “if sample were actually 50%RP”]
In theory a sample at ~50%RP would pass a 50-150% specification half the time in a single test. However, in practice further testing when a result is outside of assay variability (3σ) greatly reduces this probability. In addition, more tests on stability (e.g. T0, 3, 6, 9, 12… mo) further reduce the probability of missing detection of a sub- (or super-) potent lot.
• This is just one way we reduce this risk….
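The effect of repeated testing can be illustrated with a simplified model. The sketch assumes normally distributed results with SD equal to 16% of the true potency, independent tests, and that a lot escapes detection only if every test passes; the actual rules behind the 1-test and 4-test curves are not stated here, so treat this as an approximation.

```python
from statistics import NormalDist

def pass_probability(true_rp, lsl=50.0, usl=150.0, cv=0.16, n_tests=1):
    """Probability that all n_tests independent results fall in spec.

    true_rp must be positive; the SD is taken as cv * true_rp.
    """
    dist = NormalDist(mu=true_rp, sigma=cv * true_rp)
    single = dist.cdf(usl) - dist.cdf(lsl)
    return single ** n_tests

for true_rp in (40.0, 50.0, 60.0, 80.0):
    one = pass_probability(true_rp)
    four = pass_probability(true_rp, n_tests=4)
    print(f"true potency {true_rp:.0f} %RP: "
          f"1 test = {one:.3f}, 4 tests = {four:.3f}")
```

Under this model a lot sitting exactly at the lower limit passes one test half the time but passes four in a row only about 6% of the time.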
• A host of bioassay data is queried and monitored routinely: We do not rely on a single parameter
– Specific numerical criteria for samples are also included in “pass” criteria and must be achieved before %RP is reported
• Parallelism (slope ratio, tolerance intervals)
• Upper asymptote ratio
• Effective asymptote ratio
– Visual inspection of curves, EC50 monitored, and raw signals
• Specs tighter than platform applied for safety tests and when higher confidence in similarity required
• A stable reference standard monitored closely assures no “drift” that could influence %RP
• Every batch is also tested by a large number of orthogonal (physico-chemical) methods which
– Instances where bioassays see changes that others don’t are exceptionally rare!
In most cases, specifications can be tightened over time using method, process & stability data
• Type 1 Risk is managed throughout development by understanding/improving the assay and collecting key data
– Tighten over time but driven by data / capabilities
• Type 2 Risk is managed by (1) the margin between the method capabilities and the platform specification range, plus (2) a plethora of other bioassay data collected, plus (3) other data from physico-chemical methods
• Bioassays are complicated and (consequently) have high variability – but not an excuse for sloppiness!
– High quality drugs in clinical development and for commercial use are a common goal we all have (manufacturers, regulators, and all of the public [our families])
• Bioassays demand more data (than just %RP) to assure our clinical drug candidates are safe and efficacious
• A documented system of data collection and management is invaluable for managing risks
– Facile control charting tools are invaluable
• Both risks (Type 1 and Type 2) are always present and need to be balanced and actively managed over time
– Product specification is a key part of this determination
• Bioassays are key to assure product biological activity in vitro, but only a part of a comprehensive overall control strategy where well characterized structural analysis provides assurance that the product will have the appropriate potency in human clinical trials
• Bioassay and Impurity Group
– Kun Zhang, Aparna Deora, Jim DuMontelle, many others in the Bioassay and Impurity Testing Group
• Biostatistics
– Andrew Rugaiganisa, Brad Evans, Greg Steeno